Anders Sandberg has written a “final report” released simultaneously with the announcement of FHI’s closure. The abstract and an excerpt follow.
...Normally manifestos are written first, and then hopefully stimulate actors to implement their vision. This document is the reverse.
Just read this in the Guardian.
The title is: "‘Eugenics on steroids’: the toxic and contested legacy of Oxford’s Future of Humanity Institute"
The sub-headline states: "Nick Bostrom’s centre for studying existential risk warned about AI but also gave rise to cultish ideas such as effective altruism."
The tone of the rest of the article is similar. Very disappointing from the Guardian, which would typically align with EA thinking on many topics. But EA is probably an easy target today. It's useful to be aware of the uphill struggle we face, even among liberals, to ensure that EA gets at least a fair hearing.
Sharing for info only. Obviously I don't agree with the article.
A crucial consideration in assessing the risks of advanced AI is the moral value we place on "unaligned" AIs—systems that do not share human preferences—which could emerge if we fail to make enough progress on technical alignment.
In this post I'll consider three potential...
So e.g. if I thought humans were utilitarians primarily because it is simple to express in concepts that humans and AIs share, then I would agree with you. But in fact I feel like it is pretty important that humans feel pleasure and pain, and have empathy, to explain why some humans are utilitarians. (Mostly I think the "true explanation" will have to appeal to more than simplicity, and the additional features this "true explanation" will appeal to are very likely to differ between humans and AIs.)
Thanks for trying to better understand my views. I apprecia...
I think some of the AI safety policy community has over-indexed on the visual model of the "Overton Window" and under-indexed on alternatives like the "ratchet effect," "poisoning the well," "clown attacks," and other models where proposing radical changes can make you, your allies, and your ideas look unreasonable.
I'm not familiar with much systematic empirical evidence on either side, but it seems to me that the more effective actors in the DC establishment are much more in the habit of looking for small wins that are both good in themselves and shrink the size of the ask for their ideal policy, rather than pushing for their ideal vision and then making concessions. Possibly an ideal ecosystem has both strategies. But at least some versions of "Overton Window-moving" strategies, as executed in practice, may do more harm than good: the negative effect of associating their "side" with unreasonable-sounding ideas, in the minds of very bandwidth-constrained policymakers who lean heavily on signals of credibility and consensus when quickly evaluating policy options, can outweigh the positive effects of increasing the odds of ideal policy and improving the framing for non-ideal but pretty good policies.
In theory, the Overton Window model is just a description of what ideas are taken seriously, so it can indeed accommodate backfire effects where you argue for an idea "outside the window" and this actually makes the window narrower. But I think the visual imagery of "windows" actually struggles to accommodate this -- when was the last time you tried to open a window and accidentally closed it instead? -- and as a result, people who rely on this model are more likely to underrate these kinds of consequences.
Would be interested in empirical evidence on this question (ideally actual...
Author: Laura Keen, Senior U.S. Program Manager at GiveDirectly
NOTE: This post is specific to GiveDirectly’s work in the U.S., which is run by dedicated U.S. staff and funded by U.S.-restricted donations. Donations to GiveDirectly only fund our international work unless expressly given...
GPT-5 training is probably starting around now. It seems very unlikely that GPT-5 will cause the end of the world. But it’s hard to be sure. I would guess that GPT-5 is more likely to kill me than an asteroid, a supervolcano, a plane crash or a brain tumor. We can predict...
- What is the risk level below which you'd be OK with unpausing AI?
I think approximately 1 in 10,000 chance of extinction for each new GPT would be acceptable given the benefits of AI. This is approximately my guess for GPT-5, so I think if we could release that model and then pause, I'd be okay with that.
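To make the "1 in 10,000 per new GPT" threshold concrete, here is a minimal sketch (my own illustrative arithmetic, not from the comment) of how such a per-model risk compounds across successive releases, assuming each release carries an independent risk of that size:

```python
def cumulative_risk(p_per_model: float, n_models: int) -> float:
    """Probability that at least one of n independent model releases
    causes extinction, given a per-release probability p_per_model."""
    return 1.0 - (1.0 - p_per_model) ** n_models

# The commenter's stated acceptable per-model risk.
p = 1e-4

for n in (1, 10, 100):
    print(f"{n:>3} releases: cumulative risk = {cumulative_risk(p, n):.4%}")
```

Under these (strong) independence assumptions, ten such releases accumulate roughly a 0.1% total risk and a hundred roughly 1%, which is one way to see why someone might accept a single release at this threshold while still wanting a pause afterward.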
A major consideration here is the use of AI to mitigate other x-risks. Some of Toby Ord's x-risk estimates:
As advanced machine learning systems become increasingly widespread, the question of how to make them safe is also gaining attention. Within this debate, the term "open source" is frequently brought up. Some claim that open sourcing models could increase societal risks, while others insist that open sourcing is the only way to ensure that the development and deployment of these "artificial intelligence," or "AI," systems goes well. Yet despite "open source" being central to debates about "AI" governance, only one group has released cutting-edge "AI" that can be considered Open Source.
The term Open Source was first used to describe software in 1998, and was coined by Christine Peterson to describe the principles that would guide the development of the Netscape...
Effective today, I’ve left Open Philanthropy and joined the Carnegie Endowment for International Peace[1] as a Visiting Scholar. At Carnegie, I will analyze and write about topics relevant to AI risk reduction. In the short term, I will focus on (a) what AI capabilities...
This is a linkpost for Imitation Learning is Probably Existentially Safe by Michael Cohen and Marcus Hutter.
...Concerns about extinction risk from AI vary among experts in the field. But AI encompasses a very broad category of algorithms. Perhaps some algorithms would
I agree with the title and basic thesis of this article but I find its argumentation weak.
...First, we’ll offer a simple argument that a sufficiently advanced supervised learning algorithm, trained to imitate humans, would very likely not gain total control over humanity (to the point of making everyone defenseless) and then cause or allow human extinction from that position.
No human has ever gained total control over humanity. It would be a very basic mistake to think anyone ever has. Moreover, if they did so, very few humans would accept human extinction. A
Yudkowsky's comments at his sister's wedding seem surprisingly relevant here:
...