
This post is part of AI Pause Debate Week. Please see this sequence for other posts in the debate.

An AI Moratorium of some sort has been discussed, but details matter - it’s not particularly meaningful to agree or disagree with a policy that has no details. A discussion requires concrete claims. 

To start, I see three key questions, namely:

  1. What does a moratorium include? 
  2. When and how would a pause work? 
  3. What are the concrete steps forward?

Before answering those, I want to provide a very short introduction and propose what is in or out of bounds for a discussion.

There seems to be a strong consensus that future artificial intelligence could be very bad. There is significant uncertainty and dispute about many of the details - how bad it could be, and when the different risks would materialize. Pausing or stopping AI progress is anywhere from completely unreasonable to obviously necessary, depending on those risks and the difficulty of avoiding them - but eliminating those uncertainties is a different discussion, and for now, I think we should agree to take the disputes and uncertainties about the risks as a given. We will need to debate and make decisions under uncertainty. So the question of whether to stop, and how to do so, depends on the details of the proposal - but these details seem absent from most of the discussion. For that reason, I want to lay out a few of the places where I think clarification is needed, including not just what a moratorium would include and exclude, but also concrete next steps for getting there.

Getting to a final proposal means facing a few uncomfortable policy constraints that I’d also like to suggest be agreed on for this discussion. An immediate, temporary pause isn’t currently possible to monitor, much less enforce, even if it were likely that some or most parties would agree. Similarly, a single company or country announcing a unilateral halt to building advanced models is not credible without assurances, is likely to be ineffective at addressing the broader race dynamics, and would differentially advantage the least responsible actors. For these reasons, the type of moratorium I think worth discussing is a multilateral agreement centered on countries and international corporations, one which addresses both current and unclear future risks. But, as I will conclude, much needs to happen more rapidly than that - international oversight should not be an excuse for inaction.

What Does a Moratorium Include?

There is at least widespread agreement on many things that aren’t and wouldn’t be included. Current systems aren’t going to be withdrawn - any ban would be targeted to systems more dangerous than those that exist. We’re not talking about banning academic research using current models, and no ban would stop research to make future systems safer, assuming that the research itself does not involve building dangerous systems. Similarly, systems that reasonably demonstrate that they are low risk would be allowed, though how that safety is shown is unclear.


Next, certain parts of the proposal are contentious - but not all of it. Most critics of a moratorium[1] agree that we should not and cannot afford to build dangerous systems - they simply disagree about where the line belongs. Should we allow arbitrary plugins? Should we ban open-sourcing models? When do we need to stop? The answers are debated. And while these all seem worrying to me, the debate makes sense - there are many irreducible uncertainties, we have a global community with differing views, and actual diplomatic solutions will require people who disagree to come to some agreement.

As should be clear from my views on the need to negotiate answers, I’m not planning to dictate exactly what I think we need to ban. However, there are things that are clearly on the far side of the line that we need to draw. Models that are directly dangerous, like self-driving AIs that target pedestrians, would already be illegal to use and should be illegal to build as well. The Chemical Weapons Convention already bans the development of new chemical weapons - but using AI to do so is already possible. The equivalent use of AI to create bioweapons is plausibly not far away, and such AI is likely already an (unenforceable) violation of the BWC. Making models that violate treaty obligations restricting the development of weapons is already illegal in some jurisdictions, but unenforceable rules do nothing. And looking forward, unacceptable model development includes, at the very least, models that pose a risk of being “black ball” technologies, in Bostrom’s framing - and the set of dangerous technologies that AI could enable will grow rather than shrink over time.

Given that we need lines, standards should be developed that categorize future models into unacceptable, high-risk, or low-risk, much as the EU AI Act does for applications. This would map to models that are banned, need approval, or can be deployed. The question is which models belong in each category, and how that decision is reached - something that can and will be debated. Figuring out which models pose unacceptable risks is important, and non-obvious, so caution is warranted - and some of that caution should be applied to unpredictable capability gains by larger models, which may be dangerous by default.
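
To illustrate the shape of such a standard (not its substance), here is a minimal sketch of the three-tier categorization expressed as a rule. The attributes and the compute threshold below are hypothetical placeholders, not proposed criteria.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTier(Enum):
    UNACCEPTABLE = "banned"          # development and deployment prohibited
    HIGH_RISK = "approval_required"  # needs review before deployment
    LOW_RISK = "deployable"          # can be deployed under standard rules


@dataclass
class ModelProfile:
    training_compute_flop: float       # estimated total training compute
    enables_weapon_design: bool        # e.g. chem/bio design capability
    autonomous_general_reasoning: bool


def classify(model: ModelProfile,
             compute_review_threshold: float = 1e25) -> RiskTier:
    """Map a model profile to a risk tier.

    The threshold and attributes are hypothetical stand-ins for whatever
    criteria a standards body would actually negotiate.
    """
    if model.enables_weapon_design:
        return RiskTier.UNACCEPTABLE
    if (model.training_compute_flop >= compute_review_threshold
            or model.autonomous_general_reasoning):
        return RiskTier.HIGH_RISK
    return RiskTier.LOW_RISK
```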

When and How Do We Stop?

Timing is critical, and there are some things that we need to ban, which can and should be banned today. However, I don’t think there’s a concrete proposal to temporarily or permanently pause that I could support - we don’t have clear criteria, we don’t have the buy-in from the relevant actors that is needed to make this work, and we don’t have a reasonable way to monitor, much less enforce, any agreement. Yes, companies could voluntarily pause AI development for 6 months, which could be a valuable signal. (Or would be, if we didn’t think it would be a smokescreen for “keep doing everything and delay releases slightly.”) But such temporary actions are neither necessary for actual progress on building governance, nor sufficient to reduce the risks from advanced AI, so it seems more like a distraction - albeit one that successfully started to move the Overton window - than a serious proposal.

And acting too soon is costly, being too conservative in what is allowed is costly, regulating and banning progress in areas that have the potential to deliver tremendous benefits is costly. These considerations are important. But AI is an emergency - the costs of any action are very large, and inaction and allowing unrestricted development is as much of a choice as the most extreme response - any decision, including the decision to move slowly on regulation or banning potential risks, needs to be justified, not excused.

We absolutely cannot delay responding. If there is a fire alarm for AI risk, at least one such alarm has been ringing for months. Just as a fire in the basement won’t yet burn people in the attic, the AI that exists today does not pose immediate existential risks[2] to humanity - but it’s doing significant damage already, and if you ignore the growing risks, further damage quickly becomes unavoidable. We should ask model developers to be responsible, but voluntary compliance rarely works, and in a race scenario self-governance is clearly insufficient. If we agree that we’re fighting a fire, turning on the sink isn’t the way to respond, and if we aren’t yet in agreement about the fire, we presumably still want something akin to a sprinkler system. For AI, the current risks and likely future risks mean we need concrete action to create a governance regime at least capable of banning systems. Because if we don’t build future risk mitigation plans now, by the time there is agreement about the risks it will be too late to respond.

And any broad moratorium needs a governance mechanism for making decisions - a fixed policy will be quickly outpaced by technology changes. We should already be building governance bodies empowered to make decisions and capable of enforcing them, along with a priori agreement that they will need to put restrictions on dangerous development in place. And while the moratorium on each type of dangerous model development would be indefinite, it would not permanently ban all future AI technologies. Instead, we expect that AI safety experts will participate in this governance and, over time, build a consensus that the safety of certain types of systems is assured enough to permit relaxation of the rules in that domain. As above, I think the details need to be negotiated, and this is what global governance experts, working with domain experts, are good at.

What Are the Concrete Steps Forward?

Immediate action is needed on three fronts at the national level, at least by major countries and ideally everywhere. First, countries should be building capacity to monitor usage of AI systems and enforce current and future rules. Second, they need to provide clarity on the legal authorities for banning or restricting development of models that break existing laws. And third, they should commit to a concrete timeline for creating an international moratorium, including a governance regime able to restrict dangerous models.

On these different fronts, there are a number of steps needed immediately, many of which will both concretely reduce harms of extant models, and help enable or set the stage for later governance and response.

Monitoring AI Systems Now

First, to enable both near-term and longer-term regulation, some forms of ongoing monitoring and reporting should be required of every group developing, deploying, or using large-scale AI systems - say, larger than GPT-3. Countries might require public reporting of models, or reporting to government agencies tasked with this. And again, this is true even if you think larger risks are far away.

Registration of model training, model development, and intended applications for larger and more capable models would include details about the training data, the methods used to reduce bias, monitor usage, and prevent misuse, and a statement of the expected capabilities in advance. It seems reasonable to expect countries to monitor the development and deployment of at least any AI systems larger than GPT-3, as well as models in domains expected to have significant misuse or other risks. For example, models used for fully autonomous general reasoning, such as AutoGPT, should have specific monitoring and human oversight requirements. And there is plenty of legal room for this type of requirement - for example, if a company wants to market a model to consumers, or use it inside companies that make customer-facing decisions, this is a consumer protection issue. And again, even ignoring near-certain future risks, the harms have been known for quite a while - well before the models that were breaking laws and harming people were being called AI.
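
As a purely illustrative sketch of what such a registration filing might contain - every field name and value below is a hypothetical placeholder rather than a reference to any existing registry:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrainingRunRegistration:
    """Hypothetical registration filed before a large training run."""
    developer: str
    intended_applications: List[str]
    training_compute_flop: float              # estimated total compute
    training_data_summary: str                # provenance and licensing of data
    expected_capabilities: List[str]          # stated in advance of training
    bias_mitigation_methods: List[str] = field(default_factory=list)
    misuse_prevention_measures: List[str] = field(default_factory=list)
    human_oversight_plan: str = ""            # e.g. required for autonomous agents


# Example filing with purely illustrative values.
example = TrainingRunRegistration(
    developer="ExampleLab",
    intended_applications=["customer support automation"],
    training_compute_flop=5e24,
    training_data_summary="licensed web text; no personal data",
    expected_capabilities=["long-horizon task planning"],
    human_oversight_plan="human approval required before external actions",
)
```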

Enforcing Extant Laws

Second, some things are already illegal, and those laws need to be enforced - AI is not an excuse. Regulation and prosecution of AI misuse in the near term is both obviously needed, and important for making it clear that governments aren’t going to give companies free rein to make harmful or risky decisions. There are already places where restrictions are needed, and there are - and certainly will be - AI models that should be banned. We’re well on our way towards having widespread public and political understanding of this - most of the public agrees that some rules are needed.

I claimed above that the idea of banning some AI models is not controversial. Lots of things are illegal, and banning AI that does illegal things is just clarifying that the status quo won’t be swept away by technology. I even think that there is near-universal agreement that rules and limits are needed, perhaps excepting the most strident defenders of accelerationism. We do not allow bad actors to build bioweapons, saying that good biotech will beat bad biotech - and we cannot afford to allow companies to build dangerous AI systems, and say that good AI will defend against it.

Further, regardless of your views on future AI risks, there are risks today. A proof-of-concept for using AI to create chemical weapons already exists. And countries already have a treaty obligation to stop any misuse of AI for biological weapons[3] - and this seems frighteningly plausible. Similarly, regulators can already see that stock market manipulation, inciting violence, election interference, and many other illegal acts are enabled by current AI models, and “the AI did it” is an abdication of responsibility for foreseeable misuse, so should not act as a get-out-of-jail-free card. Clarifying the responsibility of AI model developers, application developers, and users when models are used to break laws is necessary to ensure that everyone knows what their responsibilities are[4]. It would also highlight a critical gap - we don’t have the capacity to investigate misuse and enforce the laws that should apply to AI. This is unacceptable and must be addressed immediately, both because of the harms happening now and because there are larger risks on the horizon.

Plan for Future Governance and Policy

Governments must not wait for international consensus about how to mitigate risks. The types of misuse we see today are largely untraceable, because we don’t have any way to track who is building or deploying AI systems. Laws are being broken, and people are being hurt, and we as a society can’t respond because we don’t have the tools. But these tools and regulations are already technically and legally feasible. We need regulation and enforcement already, and lacking that, it is critical to at least build infrastructure enabling it as quickly as possible. 

And to digress slightly, there is a debate about the extent to which we are prepared for automation and job loss. This is a policy debate that is inextricably tied to other decisions about AI risks, but it is not directly relevant to the class of large-scale risks we are most concerned with. Similarly, there are intellectual property rights and other issues related to AI which policymakers will continue to debate. Because policy is messy, these will be part of policy formulation to address AI in the near term, likely involving taxation, regulation of what is allowed, and other measures. For the purpose of the current debate, we should agree that addressing these issues will be part of any domestic debate on the impacts of AI and the responses to them - and then both appreciate the need for action, and clearly state that even solving the problems created by near-human or human-level AGI does not address the key risks of those and more advanced systems.

Moving beyond current needs, and as a way to ensure that domestic policy doesn’t get stuck dealing with immediate economic, equity, and political issues, I think we should push for an ambitious intermediate goal: promoting the adoption of international standards for high-risk future models. To that end, I would call for every country to pass laws today that will trigger a full ban, starting in 2025, on deploying or training AI systems larger than GPT-4 which have not been reviewed by an international regulatory body with authority to reject applications, pending international governance regimes with mandatory review provisions for potentially dangerous applications and models. This isn’t helpful for the most obvious immediate risks and economic impacts of AI - and for exactly that reason, it’s critical as a way to ensure the tremendous future risks aren’t ignored.
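
As a minimal sketch of the trigger logic such a law might encode - the GPT-4-scale compute figure and the review flag below are hypothetical placeholders, since the proposal above doesn’t specify either:

```python
from datetime import date


def deployment_permitted(training_compute_flop: float,
                         reviewed_by_international_body: bool,
                         today: date,
                         gpt4_scale_flop: float = 2e25,          # placeholder estimate
                         trigger_date: date = date(2025, 1, 1)) -> bool:
    """Hypothetical trigger logic for the proposed law.

    Before the trigger date nothing changes; after it, any system above the
    (placeholder) GPT-4-scale compute threshold requires review by an
    international regulatory body before training or deployment.
    """
    if today < trigger_date:
        return True
    if training_compute_flop <= gpt4_scale_flop:
        return True
    return reviewed_by_international_body
```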

Most of these steps are possible today, and many could even be announced quickly - perhaps at the UK Summit, though it unfortunately seems like we’re not on track for countries to make such commitments and take concrete actions that quickly.

Conclusion

To conclude: yes, we need to stop certain uses of AI and future, riskier systems; yes, steps need to be taken sooner rather than later, because the most extreme risks are increasing; and yes, there are concrete things for governments to do, some of which are likely to build towards a governance regime that includes a moratorium or its equivalent. And many helpful and concrete steps can start today, and are needed to address immediate harms - even if the necessary moratorium on dangerous uses and model development, and the accompanying governance regimes, will take time to negotiate.

Notes

  1. ^

    Including OpenAI’s Sutskever and Altman, Gary Marcus, Anthropic, and even LeCun, who publicly says that no one should build dangerous AI - but excluding Andreessen, who is opposed to any restrictions, and instead suggests we fight misuse of AI with more AI.

  2. ^

    That said, we shouldn’t ignore the harms AI is doing now - and restricting already illegal or harmful uses is certainly more than justified, and I applaud work being done by AI ethicists and other groups in that direction. The people in the basement should be saved, which is sufficient justification for many of the proposed policies - but I think milquetoast policy papers don’t help that cause either, and these abuses need more drastic responses.

  3. ^

    The obligation for the Biological Weapons Convention is clear that countries are in violation if they allow bioweapons to be developed, even if the state party itself was not involved. The requirements for the chemical weapons convention are less broad, so AI used for chemical weapons development is not directly banned by the treaty - though it certainly seems worth considering how to prevent it. And as noted, it is currently technically and bureaucratically impossible to monitor or ban such uses.

  4. ^

    In the interim, joint-and-several liability for developers, application providers, and users for misuse, copyright violation, and illegal discrimination would be a useful initial band-aid; among other things, it gives companies a motive to help craft regulation with clear rules about what each party needs to do to ensure it will not be financially liable for a given use or misuse.

Comments

I appreciate the way this post adds a lot of clarity and detail to pause proposals. Thanks for writing it and thanks also to the debate organisers.

However, I think you’re equivocating kind of unhelpfully between LLM development - which would presumably be affected by a pause - and special-purpose model development (e.g. the linked MegaSyn example), which would probably not be. This matters because of the claim that AI is currently an emergency and that harms are currently occurring. For a pause to prevent these harms, they would have to come from cutting-edge LLMs, but I’m not aware of any compelling examples of such.

Tracking compute is required for both. These models provide sufficient reason to track compute, and to ensure that other abuses are not occurring - which is why I think it's relevant.

Thanks - this is clarifying. I think my confusion was down to not understanding the remit of the pause you're proposing. How about we carry on the discussion in the other comment on this?

Hmm, when my friends talk about a government-enforced pause, they most often mean a limit on training compute for LLMs. (Maybe you don't think that's "compelling"? Seems at least as compelling as other versions of "pause" to me.)

Ah, seems like my original comment was unclear. I was objecting to the conjunction between (a) AI systems already have the potential to cause harm (as evidenced by the Weapons of Math Destruction, Nature, and MegaSyn links) and (b) a pause in frontier AI would reduce harms. The potential harms due to (a) wouldn't be at all mitigated by (b) so I think it's either a bit confused or misleading to link them in this article. Does that clarify my objection?

In general I'm quite uncomfortable with the equivocation I see in a lot of places between "current models are actually causing concrete harm X" and "future models could have the potential to cause harm X" (as well as points in between, and interchanging special-purpose and general AI). I think these equivocations particularly harm the debate on open-sourcing, which I value and which feels especially under threat right now.

It’s the other way around – comprehensive enforcement of laws to prevent current harms also prevents “frontier models” from getting developed and deployed. See my comment.

It’s unethical to ignore the harms of uses of open-source models (see laundering of authors’ works, or training on and generation of CSAM).

Harms there need to be prevented too. Both from the perspective of not hurting people in society now, and from the perspective of preventing the build up of risk.

Also, this raises the question whether “open-source” models are even “open-source” in the way software is: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4543807

Glad you also mentioned preventing harmful uses of AI with existing laws.

Law list:

  • copyright infringement from copying creative works into datasets for training models used to compete in the original authors’ markets.

  • violating the EU Digital Single Market Directive’s stipulations on Text and Data Mining since it “prejudices the legitimate interests of the rightholder or which conflicts with the normal exploitation of his work or other subject-matter.”

  • violating GDPR/CCPA by not complying with citizens’ requests for access to personal data being processed, and to erase that data.

  • CSAM collected in image and text datasets (as well as synthetic CSAM generated in outputs).

  • Biometrics collected (eg. faces in images).

That’s just on data laundering. Didn’t get into employment law, product liability, or environmental regulations.

Moving beyond current needs, and as a way to ensure that domestic policy doesn’t get stuck dealing with immediate economic, equity, and political issues, I think we should push for an ambitious intermediate goal: promoting the adoption of international standards for high-risk future models. To that end, I would call for every country to pass laws today that will trigger a full ban, starting in 2025, on deploying or training AI systems larger than GPT-4 which have not been reviewed by an international regulatory body with authority to reject applications, pending international governance regimes with mandatory review provisions for potentially dangerous applications and models. This isn’t helpful for the most obvious immediate risks and economic impacts of AI - and for exactly that reason, it’s critical as a way to ensure the tremendous future risks aren’t ignored.

I strongly agree with that.

You don't talk much about compute caps as a lever elsewhere in the text, so I'm going to paste some passages I wrote on why I'm excited about compute-related interventions to slow down AI. (My summary on slowing AI is available on the database for AI governance researchers – if anyone is planning to work on this topic but doesn't have access to that database, feel free to email me and I can give you access to a copy.)

Compute seems particularly suited for governance measures: it’s quantifiable, can’t be used by multiple actors at once, and we can restrict access to it. None of these three factors apply to software (so it’s unfortunate that software progress plays a more significant role for AI timelines than compute increases). Monitoring compute access is currently difficult because compute is easy to transport, and we don’t know where much of it is. Still, we could help set up a database, demand reporting from sellers, and shift compute use from physical access to cloud computing or data center access (centralizing access helps with monitoring). The ideal target state for compute governance might be some kind of “moving bright line” of maximum compute allowances for training runs. (A static cap might be too difficult to enforce because compute costs to circumvent the cap will fall over time.) The regulation could be flexible so labs with a proven safety mindset can receive authorization to go beyond the cap. More ambitiously, there’s the idea of 

  • hardware-enabled governance mechanisms (previous terminology: “on-chip measures”). These are tamper-proof mechanisms on chips (or on the larger hardware components of compute clusters) that would allow for actions like communicating information about a chip’s location or its past activity, remote shutdown, or restricting the chip’s communication with other chips (limiting the size of a training run it could be involved in). Hardware-enabled mechanisms don’t yet exist in a tamper-proof way, but NVIDIA has chips that illustrate the concept. I’m particularly excited about hardware-enabled governance mechanisms because they’re the only idea related to slowing AI progress that could (combined with an ambitious regulatory framework) address the problem as a whole, instead of just giving us a small degree of temporary slowdown. (Hardware-enabled mechanisms would also continue to be helpful after the first aligned TAI is developed  – it’s not like coordination challenges will automatically go away at the point when an aligned AI is first developed.) Widespread implementation of such mechanisms is several years away even in a best-case scenario, so it seems crucial to get started.
    • Onni Arne and Lennart Heim have been looking into hardware-enabled governance mechanisms. (My sense from talking to them is that when it comes to monitoring and auditing of compute, they see the most promise in measures that show a chip's past activity, "proof of non-training.")  Yonadav Shavit also works on compute governance and seems like a great person to talk to about this.
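
To make the “moving bright line” idea above slightly more concrete, here’s a minimal sketch under the assumption that the cap is tightened in step with algorithmic efficiency gains, so that falling compute costs don’t erode it; the 2x/year figure is purely a placeholder.

```python
def allowed_training_flop(base_cap_flop: float,
                          years_since_policy: float,
                          annual_efficiency_gain: float = 2.0) -> float:
    """Hypothetical 'moving bright line' for a training-compute cap.

    The physical-compute cap shrinks as algorithmic efficiency grows, keeping
    the effective capability ceiling roughly constant. The 2x/year efficiency
    figure is a placeholder assumption, not an empirical estimate.
    """
    return base_cap_flop / (annual_efficiency_gain ** years_since_policy)


# Example: a 1e25 FLOP cap set at policy start falls to ~1.25e24 FLOP after
# three years under the assumed 2x/year efficiency improvement.
cap_year_3 = allowed_training_flop(1e25, 3.0)
```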

And here's an unfortunate caveat about how compute governance may not be sufficient to avoid an AI catastrophe:

Software progress vs. compute: I’m mostly writing my piece based on the assumption that software progress and compute growth are both important levers (with software progress being the stronger one). However, there’s a view on which algorithmic improvements are a lot jumpier than Ajeya Cotra assumes in her “2020 compute training requirements” framework. If so, and if we’re already in a compute overhang (in the sense that it’s realistic to assume that new discoveries could get us to TAI with current levels of compute), it could be tremendously important to prevent algorithmic exploration by creative ML researchers, even at lower-than-cutting-edge levels of compute. (Also, the scaling hypothesis would likely be false or at least incomplete in that particular world, and compute restrictions would matter less since building TAI would mainly require software breakthroughs.) In short, if the road to TAI is mostly through algorithmic breakthroughs, we might be in a pretty bad situation in terms of not having available promising interventions to slow down progress.

But there might still be some things to do to slow progress a little bit, such as improving information security to prevent leakage of insights from leading labs, and export controls on model weights. 

I think that these are good points for technical discussions of how to implement rules, and thanks for bringing them up - but I don't think it really makes sense to focus on this if the question is whether or not to regulate AI or have a moratorium.

Thank you for your carefully thought-through essay on AI governance. Given your success as a forecaster of geopolitical events, could you sketch out for us how we might implement AI governance on, for example, Iran, North Korea, and Russia? You mention sensors on chips to report problematic behavior, etc. However, badly behaving nations might develop their own fabs. We could follow the examples of attacks on Iran's nuclear weapons technologies. But would overt/covert military actions risk missing the creation of a "black ball" on the one hand, or escalation into global nuclear/chemical/biological conflict? 

These are difficult problems, but thankfully not the ones we need to deal with immediately. None of Iran, Russia, and North Korea are chip producers, nor are they particularly close to SOTA in ML - if there is on-chip monitoring for manufacturers, and cloud compute has restrictions, there is little chance they accelerate. And we stopped Iran and North Korea from getting nukes for decades, so export controls are definitely a useful mechanism. In addition, the incentive for nuclear states or otherwise dangerous rogue actors to develop AGI as a strategic asset is lessened if it isn't needed for the balance of power - so a global moratorium makes these states less likely to feel a need to keep up in order to stay in power.

That said, a moratorium isn't a permanent solution to the proliferation of dangerous tech, even if the regime were to end up being permanent. As with nuclear weapons, we expect to raise the costs of violating norms to be prohibitively high, and we can delay things for quite a long time; but if we don't have further progress on safety, and we remain or become convinced that unaligned ASI is an existential threat, we would need to continually reassess how strong sanctions and enforcement need to be to prevent existential catastrophe. But if we get a moratorium in non-rogue states, thankfully, we don't need to answer these questions this decade, or maybe even the next.
