This post is an attempt to summarize the crucial considerations for an AI pause and AI pause advocacy. It also promotes the International PauseAI protest on 21 October, the biggest AI protest ever, held in 7 countries on the same day. I aim to present an unbiased view, though obviously I may not have succeeded.
You can check out the EA Forum event page for the protest here.
Seven Crucial Considerations for Pausing AI
Under the default scenario, is the risk from AI acceptable?
- Is alignment going well?
- How likely is alignment by default?
- How hard is the alignment problem?
- What is an acceptable level of risk given the potential benefits of AGI?
This is the only question where I'm confident enough to say the answer is clearly "no". The remaining question is whether a pause would decrease the risk.
How dangerous is hardware overhang in a pause?
- How much AI progress comes from additional spending vs. hardware improvement vs. algorithmic progress?
- Could hardware overhang cause a fast takeoff? Is a fast takeoff substantially more dangerous than a slow one?
- Would multipolar scenarios be more likely due to hardware overhang? Are multipolar or unipolar scenarios more dangerous?
- Is it possible to pause hardware progress as well?
How much does AI capability progress help alignment?
- Is fine-tuning-based safety such as RLHF making progress on fundamental alignment?
- How far are we from mechanistic interpretability on frontier models? On models that may constitute superintelligence?
- How long would it take to figure out agent foundations? Is this necessary for fundamentally solving alignment?
- Does governance get harder as we get closer to AGI because more people are more invested in the development of AI? Or easier, because the danger becomes more apparent?
- Is it possible to allow better models to be trained in some labs / a CERN for AI, while keeping them from being deployed / stolen by hackers?
Can we pause algorithmic capabilities research while allowing alignment research?
- Is it possible to set up an institution that makes this judgement reliably? Is it as simple as disallowing some papers from NeurIPS, ICML, etc.?
- How much alignment research is dual-use and how bad are the capability gains?
Is there a good, plausible political mechanism through which to pause?
- Is there a stable enough entity to administer and enforce a global pause? Would such an entity present its own risks or is this just another entry in the history of restraint?
- Could the US use its power over the GPU supply chain to unilaterally impose a pause on the world?
- What should be the criteria for ending a pause, if any? Is an indefinite pause that gets ended in an unplanned way worse than a pause with arbitrary criteria for ending?
- Should we create TAI even if we have definitely solved the alignment problem?
Does open-source AI make an effective pause impossible?
- How long would it take for open-source to create TAI?
- Can we monitor small or distributed compute clusters?
- How long until a single consumer device can train TAI given hardware and algorithmic progress (assuming we can't pause algorithmic progress)?
- What is the level of capabilities that should not be open-sourced? Is it lower than the criteria for pausing SOTA models?
A meta consideration that depends on the answers to the previous questions:
Are the benefits of more time for research and regulation greater than the costs of a pause?
- How far are we from TAI on the default trajectory?
- How useful is additional time with current AI for alignment research?
- Will alignment ever be "solved"? How would we know when it's safe to proceed?
If alignment research just needs more time to progress and we're currently on the cusp of AGI, then it might be worth risking hardware overhang in order to gain that time. If alignment research progress depends upon having advanced AI then hardware overhang could mean we miss the most crucial time for alignment research.
Six Crucial Considerations for Pause Protests
While it is important to consider whether a pause would be good, this does not necessarily determine whether pause advocacy and protests are good. For example, while a pause might be bad due to hardware overhang, it might be good to call for a pause, because it will force companies to try harder on safety. Or conversely, a pause might be good because it gives us time to do alignment research, but pause advocacy might make companies hostile to x-risk concerns.
Will pause advocacy create polarization around the issue of AI safety?
- Could protests make AI a culture war / partisan issue? Is this inevitably going to happen anyway?
- Does polarization create stagnation in policy?
- Is it possible for an activist movement around AI to maintain high epistemic standards? How damaging would an activist movement without high standards be?
Will pause advocacy harm other AI safety efforts?
- Do protests spend political capital that could be used to gain inside influence in labs and government?
- Does a non-EA-branded PauseAI movement consisting mostly of EAs (at least for now) spend the political capital of EA?
- Does advocacy make it easier for policy makers to push useful AI regulation by widening the Overton Window and providing a public mandate / social license to act? Or does it make it harder by making a pause look like a policy pushed by extreme radicals?
- Will pause advocacy negatively affect the reputation of AI Safety as a research field, or show that AI Safety experts really believe in x-risk ("if you actually think we're going to die, why aren't you protesting on the street?")?
How effective is advocacy generally?
- Is AI fundamentally different from other issues with social movements?
- Does the research reliably show that advocacy does work?
- Does EA’s reputation (elitism, tech bros) make it harder to advocate for a pause?
- If some AI protest movement will inevitably arise over the next few years, how valuable is it to be the founder?
Is pause advocacy more robust to safety-washing than other efforts?
- Can anyone else call out regulation that claims to make AI safer but doesn't reduce x-risk? Can activists do this while recognizing regulation that is actually good?
- Is communicating a specific policy ask worse than communicating a real understanding of the problem?
Is pause advocacy worth the time of the AI Safety community (assuming it is useful)?
- Will employers discriminate against protestors when hiring? Will they know who has protested?
- Does significant, sympathetic media attention make it a good use of time to attend a protest?
Another meta consideration:
Is it bad to apply consequentialist reasoning to public advocacy?
- Is it good (and also more effective from a consequentialist perspective) to just say the thing you believe without weighing the pros and cons of honest communication?
Perhaps PauseAI protests are not really about advocating for a pause so much as communicating the severity of the risk of AI, in which case it might be better for AI activists to focus on a different message. On the other hand, perhaps "Pause AI" is the most accurate way to convey our beliefs when we are trying to send a message to the entire world.
What you cannot do is not decide. You will either attend the protest or not.
The International PauseAI protest will be held on 21 October in 7 countries.
Thanks to Gideon Futerman for feedback on this post.