In March of this year, 30,000 people, including leading AI figures like Yoshua Bengio and Stuart Russell, signed a letter calling on AI labs to pause for at least six months the training of AI systems more powerful than GPT-4. While it seems unlikely that this letter will succeed in pausing the development of AI, it did draw substantial attention to slowing AI as a strategy for reducing existential risk.
While initial work has been done on this topic (this sequence links to some relevant work), many areas of uncertainty remain. I’ve asked a group of participants to discuss and debate, on the EA Forum, various aspects of the value of advocating for a pause on the development of AI, in a format loosely inspired by Cato Unbound.
- On September 16, we will launch with three posts:
- David Manheim will share a post giving an overview of what a pause would include, how a pause would work, and some possible concrete steps forward
- Nora Belrose will share a post outlining some of the risks of a pause
- Thomas Larsen will post a concrete policy proposal
- After this, we will release one post per day, each from a different author
- Many of the participants will also be commenting on each other’s work
Responses from Forum users are encouraged; you can share your own posts on this topic or comment on the posts from participants. You’ll be able to find the posts by looking at this tag (remember that you can subscribe to tags to be notified of new posts).
I think it is unlikely that this debate will result in a consensus agreement, but I hope that it will clarify the space of policy options, why those options may be beneficial or harmful, and what future work is needed.
People who have agreed to participate
These are in random order, and they’re participating as individuals, not representing any institution:
- David Manheim (ALTER)
- Matthew Barnett (Epoch AI)
- Zach Stein-Perlman (AI Impacts)
- Holly Elmore (AI pause advocate)
- Buck Shlegeris (Redwood Research)
- Anonymous researcher (Major AI lab)
- Anonymous professor (Anonymous University)
- Rob Bensinger (Machine Intelligence Research Institute)
- Nora Belrose (EleutherAI)
- Thomas Larsen (Center for AI Policy)
- Quintin Pope (Oregon State University)
Scott Alexander will be writing a summary/conclusion of the debate at the end.
Thanks to Lizka Vaintrob, JP Addison, and Jessica McCurdy for help organizing this, and Lizka (+ Midjourney) for the picture.
Regardless of whether a pause has benefits, can someone succinctly explain how a pause could even be possible? Nuclear arms pauses never meaningfully happened: the SALT treaties came decades later and didn't meaningfully reduce the "nuclear apocalypse" doom risk, at least for citizens living in targeted countries. (Maybe they reduced worldwide risk.)
A nuclear bombardment is a quantifiable, object-level thing. You can use a special tool to draw circles on a paper map of your own cities and wargame out where the enemy will allocate their missiles if their goal is maximum destruction. Multiply out a series of probability estimates (that each warhead reaches its target and detonates at full yield) and add up the estimated damage.
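To make that kind of calculation concrete, here's a minimal sketch of the arithmetic, with entirely hypothetical targets, probabilities, and damage figures:

```python
# Illustrative only: expected-damage wargaming with hypothetical numbers.
# Each entry: (target, P(warhead reaches target), P(full yield), damage if it does).
targets = [
    ("City A", 0.8, 0.90, 1_000_000),
    ("City B", 0.7, 0.90,   600_000),
    ("City C", 0.9, 0.85,   300_000),
]

expected_damage = sum(
    p_reach * p_yield * damage
    for _, p_reach, p_yield, damage in targets
)
print(f"Expected damage: {expected_damage:,.0f}")  # 1,327,500 with these numbers
```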
The "we must pause" threat of AI doom I have seen so far doesn't have any data of any threat. And a frequent threat model is the "treacherous turn" - you wouldn't be able to quantify how risky the current AI systems you have at any point or their capacity for damage in numbers because the AI systems are hiding their motives and capabilities until they see a route to victory.
With no data of a threat how could any policymakers agree to a pause?
The other examples of human technologies we did agree to not pursue all exist and the threats are concrete. The damage from nerve gas, CFCs, bio warfare, and human genetic engineering we have data on.
As a side note, the treacherous-turn threat model has an obvious technical mitigation: use the least compute on any task by selecting, from a pool of models, the most efficient one that can do the job. This selects against models with the cognitive capacity for sedition or hidden capabilities, as they would need more memory and compute to run.
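A minimal sketch of what that selection rule might look like, assuming a pool of models with known per-task compute costs and some `passes_task` evaluation (the names, costs, and check are all hypothetical):

```python
# Illustrative only: pick the cheapest model in the pool that can do the job,
# leaving no slack compute for hidden capabilities. All values are hypothetical.

model_pool = [
    {"name": "small-model",  "compute_per_task": 1.0},
    {"name": "medium-model", "compute_per_task": 5.0},
    {"name": "large-model",  "compute_per_task": 25.0},
]

def passes_task(model: dict, task: str) -> bool:
    """Placeholder: in practice, run the model against a held-out evaluation
    for this task and check it meets the required quality bar."""
    return True  # stub so the sketch runs; every model "passes" here

def select_model(task: str) -> dict:
    # Walk the pool from cheapest to most expensive and stop at the first
    # model that is good enough, so no task gets more compute than it needs.
    for model in sorted(model_pool, key=lambda m: m["compute_per_task"]):
        if passes_task(model, task):
            return model
    raise RuntimeError("No model in the pool can handle this task")

print(select_model("summarise this report")["name"])  # -> small-model
```

The design choice is simply that the cheapest adequate model wins, so any extra capacity a model carries around makes it less likely to be selected.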
My proposal is to engineer powerful and reliable AI immediately, as fast as feasible. If this really is the endgame - whoever wins the race owns the planet, if not the accessible universe - then spending and effort should be proportional. It's the only way.
You deal with dangerous, out-of-control AIs by tasking your reliable models with destroying them.
The core of your approach is to subdivide and validate all the subtasks. No single model is manufacturing the drones used to do this by itself; it's thousands of temporary instances. You filter the information used t...