In March of this year, 30,000 people, including leading AI figures like Yoshua Bengio and Stuart Russell, signed an open letter calling on AI labs to pause the training of AI systems more powerful than GPT-4 for at least six months. While it seems unlikely that this letter will succeed in pausing the development of AI, it did draw substantial attention to slowing AI as a strategy for reducing existential risk.
While initial work has been done on this topic (this sequence links to some relevant work), many areas of uncertainty remain. I’ve asked a group of participants to discuss and debate various aspects of the value of advocating for a pause on AI development here on the EA Forum, in a format loosely inspired by Cato Unbound.
- On September 16, we will launch with three posts:
- David Manheim will share a post giving an overview of what a pause would include, how a pause would work, and some possible concrete steps forward
- Nora Belrose will share a post outlining some of the risks of a pause
- Thomas Larsen will post a concrete policy proposal
- After this, we will release one post per day, each from a different author
- Many of the participants will also be commenting on each other’s work
Responses from Forum users are encouraged; you can share your own posts on this topic or comment on the posts from participants. You’ll be able to find the posts by looking at this tag (remember that you can subscribe to tags to be notified of new posts).
I think it is unlikely that this debate will result in a consensus agreement, but I hope that it will clarify the space of policy options, why those options may be beneficial or harmful, and what future work is needed.
People who have agreed to participate
These are in random order, and they’re participating as individuals, not representing any institution:
- David Manheim (ALTER)
- Matthew Barnett (Epoch AI)
- Zach Stein-Perlman (AI Impacts)
- Holly Elmore (AI pause advocate)
- Buck Shlegeris (Redwood Research)
- Anonymous researcher (Major AI lab)
- Anonymous professor (Anonymous University)
- Rob Bensinger (Machine Intelligence Research Institute)
- Nora Belrose (EleutherAI)
- Thomas Larsen (Center for AI Policy)
- Quintin Pope (Oregon State University)
Scott Alexander will be writing a summary/conclusion of the debate at the end.
Thanks to Lizka Vaintrob, JP Addison, and Jessica McCurdy for help organizing this, and Lizka (+ Midjourney) for the picture.
So, do any of these not exist in some form proving the technology is real?
Do any of these lack real-world data on their drawbacks?
Take human genetic engineering, for example: we have never tried it, but when we edit other mammals, errors are common, and we know the enormous cost of birth defects. And because we already edit other mammals, we know with certainty that this isn’t just a possibility down the road; it’s real, since the same tooling that works on rats will work on humans.
Do any of these, if you researched the technology and developed it to its full potential, allow you to invade every country on Earth and/or threaten to kill every living person within that country’s borders?
Note that last paragraph. This is not me being edgy. If exactly one nation had a large nuclear arsenal, it could do exactly that. Once other powers started to get nukes in the 1950s, every nation had to get its own, or have trusted friends with nukes and a treaty protecting it.
AGI technology, if developed to its full potential and kept exclusive to one superpower, would allow the same.
This means that, under any multilateral agreement to stop AGI research over dangers that haven’t yet been shown to exist, each party is throwing away all of the benefits, like becoming incredibly wealthy, becoming immortal, and having almost limitless automated military hardware.
There is so much incentive to cheat that it seems like a non-starter.
For an agreement to even be possible, I think there would need to be evidence that:
(1) It’s too dangerous to even experiment with AGI systems because one could “break out and kill everyone”. Break out to where? What computers can host it? Where are they? How does the AGI stop humans from cutting the power or bombing the racks of interconnected H100s?
(2) There’s no point in building models, benchmarking them, finding the ones that are both safe to use and AGI, and using them, because they will all betray you.
It’s possible that (1) and (2) are true facts about reality, but there is no actual evidence for them.