In March of this year, 30,000 people, including leading AI figures like Yoshua Bengio and Stuart Russell, signed a letter calling on AI labs to pause the training of AI systems. While it seems unlikely that this letter will succeed in pausing the development of AI, it did draw substantial attention to slowing AI as a strategy for reducing existential risk.
While initial work has been done on this topic (this sequence links to some relevant work), many areas of uncertainty remain. I’ve asked a group of participants to discuss and debate various aspects of the value of advocating for a pause on the development of AI on the EA Forum, in a format loosely inspired by Cato Unbound.
- On September 16, we will launch with three posts:
  - David Manheim will share a post giving an overview of what a pause would include, how a pause would work, and some possible concrete steps forward
  - Nora Belrose will post outlining some of the risks of a pause
  - Thomas Larsen will post a concrete policy proposal
- After this, we will release one post per day, each from a different author
- Many of the participants will also be commenting on each other’s work
Responses from Forum users are encouraged; you can share your own posts on this topic or comment on the posts from participants. You’ll be able to find the posts by looking at this tag (remember that you can subscribe to tags to be notified of new posts).
I think it is unlikely that this debate will result in a consensus agreement, but I hope that it will clarify the space of policy options, why those options may be beneficial or harmful, and what future work is needed.
People who have agreed to participate
These are in random order, and they’re participating as individuals, not representing any institution:
- David Manheim (ALTER)
- Matthew Barnett (Epoch AI)
- Zach Stein-Perlman (AI Impacts)
- Holly Elmore (AI pause advocate)
- Buck Shlegeris (Redwood Research)
- Anonymous researcher (Major AI lab)
- Anonymous professor (Anonymous University)
- Rob Bensinger (Machine Intelligence Research Institute)
- Nora Belrose (EleutherAI)
- Thomas Larsen (Center for AI Policy)
- Quintin Pope (Oregon State University)
Scott Alexander will be writing a summary/conclusion of the debate at the end.
Thanks to Lizka Vaintrob, JP Addison, and Jessica McCurdy for help organizing this, and Lizka (+ Midjourney) for the picture.
My proposal is to engineer powerful and reliable AI immediately, as fast as feasible. If this is the true endgame, where whoever wins the race owns the planet, if not the accessible universe, then spending and effort should be proportional. It's the only way.
You deal with dangerous, out-of-control AIs by tasking your reliable models with destroying them.
The core of the approach is to subdivide and validate all the subtasks. No single model manufactures the drones used to do this by itself; the work is spread across thousands of temporary instances. You filter the information that reaches the combat solvers, which decide how to task each drone to destroy the enemy, so that any begging from the enemy is never processed. You design the killer drones with many low-level interlocks to prevent obvious misuse, and their controllers might run conventional software, so they cannot be talked out of carrying out the mission because they can't understand language.
The general concept is that if 99 percent of the drones are "safe" in this way, then even if escaped models are smart, they just can't win.
Or, in more concrete terms, I am saying that a simple, reliable combat solver is not going to be much worse than a more complex one; that is, superintelligence saturates. Simple and reliable hypersonic stealth drones are still almost as good as whatever a superintelligence cooks up, and so on. It's an assumption about available utility relative to compute.