Cross-posted from my website.

Prior discussion: niplav's shortform (2025); Planning for Extreme AI Risks (2025) by Joshua Clymer

A frontier AI company (any one, I don't care which) should close shop and make an announcement along the lines of:

Powerful AI could end the human race. We are too worried that we don't know how to make this technology safe. We have decided to shut down because we don't want to be responsible for building the thing that kills us all.

A common refrain among safety-conscious AI developers: "it doesn't matter if we stop building dangerous AI, because someone else will just build it instead." Is that really true, though? If a multi-hundred-billion-dollar company comes out and says "We've concluded that our product is horribly dangerous, nobody knows how to make it safe, and there's too high a risk that it leads to human extinction", this won't raise any eyebrows? This has no chance of spurring policy-makers into action?

Shutting down would make people say, holy shit, they are serious about this extinction risk thing. Shutting down sends a strong signal to governments that they should pay serious attention to AI x-risk.

It also encourages other companies to take safety more seriously. Right now, at least three AI companies have said something like, "maybe we'd prefer to slow down and pay more attention to safety, but then the other companies will plow ahead recklessly." If one company decides not to plow ahead recklessly, and actually stops building existentially dangerous technology, that sends a hard-to-ignore message that coordination might be possible.

If a frontier AI company shuts down, will that work? Will companies work together to slow down? Will we get sane AI regulations as a direct result of the shutdown? Probably not. It won't singlehandedly solve all the coordination problems. But it's still a better idea than the current strategy of "race ahead while doing a dash of safety research on the side", which is even less likely to work. By AI companies' own admission, competitive pressures don't allow them to slow down. Why would things change in the future? How are they going to align AI if they have to move at maximum speed? Even if they slow down somewhat, what if alignment is hard[1], and they can't slow down by enough to properly solve the problem?

Counterpoint: If the most safety-conscious company shuts down, then it can't do any more safety research.

I expect shutting down would be worth the tradeoff: companies' safety research isn't doing much to reduce AI takeover risk. But perhaps instead of shutting down, an AI company could reallocate 100% of its budget to some combination of safety research + global coordination to make AI development safer, and do just those things until it runs out of money. Think of how much more safety work they could do if they dedicated all their resources to the problem!

(Some might argue that AI companies need to build frontier models so they have something on which to do safety research. That argument doesn't make much sense when you think about it. There are a lot of kinds of research that don't require frontier models,[2] they can do plenty of research on the models that already exist, and they can make deals with other companies to get access to their latest models.)

What if investors sue the company?

It is my understanding that a self-induced shutdown would be legal for Anthropic (which is a public benefit corporation). I'm not sure about OpenAI—it's a for-profit now, but it's still owned in large part by a nonprofit that's allegedly obligated to put the benefit of humanity first.

More importantly, "we have to risk killing everyone because otherwise our investors might sue us" is not a serious position. I almost can't think of a worse excuse.

Some people might believe that a safety-minded AI company should shut down under some circumstances, but not now. My question then is: Under what conditions should they shut down? How will we know when those conditions are met? And how do we know that they'll follow through?


  1. It probably is.

  2. Safety-minded AI companies treat alignment as an engineering problem, or treat philosophical problems as easy. There are critical aspects of the problem that can't be solved by engineering (or that aren't legible). You can work on those other aspects even if you don't have frontier models.

Comments (6)

During the OpenAI board fiasco, we saw a large number of employees exert pressure on OpenAI to re-form under Sam Altman, suggesting labour power in AI labs is real and effective. Companies do have a hard time shutting themselves down for no reason, but have a much easier time scapegoating labour unions and strikes. And it’s relatively inexpensive to build a labour union, and easy to do when talent is scarce. Just sayin’

This seems like a very coarse take to me.

Shutting down one of these companies might cost, say, a trillion dollars in lost investor/employee value. And I think that the really risky 'frontier AI' might only be a portion of their work.

I could very much see an argument for them to stop a lot of the key frontier work and then move to more conservative engineering efforts, for instance. I think there's a wide range of work to do around AI development and AI safety, and it seems easy for me to imagine large changes in direction that could still have some market value but much less risk.

Other than OpenAI or Anthropic, I don't see an AI company shutting down being taken remotely seriously by anyone outside a very small number of people who understand how that lab was performing, most of whom already have their own very strong views on AI safety. To most people, it just says "loss-making entities coming up with elaborate excuses for AI bubble starting to burst" (or "I told you Elon was full of shit" or even "look, a European/Chinese lab cannot compete with the US AI innovation because their governments have forced them to think about alignment too much, let's not make the mistake of regulating...")

OpenAI and Anthropic doing it would at least cut through to the average person/policymaker. But even if you believe they're sincerely mission driven and owe their stockholders nothing, OpenAI and Anthropic are not IPOing this year because they believe that shutting down is the correct course of action.

While some people might notice that the companies are making losses, others will notice that they are valued in the billions! I am not sure if a shutdown would actually be the right move, but it would definitely send a message.

Why would they need to shut down completely? It seems like it would work just as well to say "We won't build any new models, but we'll keep serving the ones we already have." Then they wouldn't be accelerating towards riskier models, it wouldn't be AS bad for them financially (so it might have a better chance of happening) and they can still do safety research on the models they already have.

I think the resulting backlash would be extremely intense and polarize very many important people (tech CEOs, AI researchers, pro-business politicians, etc etc) against AI safety and taking AI x-risk seriously (see: Sam Altman's firing had a major negative impact on the tech industry's view of AI safety for the following year or so, and this would be a far bigger deal). This is not necessarily a decisive factor from a pure safety perspective, but could be a very important one.
