
(Published in TIME on March 29.)

 

An open letter published today calls for “all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4.”

This 6-month moratorium would be better than no moratorium. I have respect for everyone who stepped up and signed it. It’s an improvement on the margin.

I refrained from signing because I think the letter is understating the seriousness of the situation and asking for too little to solve it.

The key issue is not “human-competitive” intelligence (as the open letter puts it); it’s what happens after AI gets to smarter-than-human intelligence. Key thresholds there may not be obvious, we definitely can’t calculate in advance what happens when, and it currently seems imaginable that a research lab would cross critical lines without noticing.

Many researchers steeped in these issues, including myself, expect that the most likely result of building a superhumanly smart AI, under anything remotely like the current circumstances, is that literally everyone on Earth will die. Not as in “maybe possibly some remote chance,” but as in “that is the obvious thing that would happen.” It’s not that you can’t, in principle, survive creating something much smarter than you; it’s that it would require precision and preparation and new scientific insights, and probably not having AI systems composed of giant inscrutable arrays of fractional numbers.

Without that precision and preparation, the most likely outcome is AI that does not do what we want, and does not care for us nor for sentient life in general. That kind of caring is something that could in principle be imbued into an AI but we are not ready and do not currently know how.

Absent that caring, we get “the AI does not love you, nor does it hate you, and you are made of atoms it can use for something else.”

The likely result of humanity facing down an opposed superhuman intelligence is a total loss. Valid metaphors include “a 10-year-old trying to play chess against Stockfish 15,” “the 11th century trying to fight the 21st century,” and “Australopithecus trying to fight Homo sapiens.”

To visualize a hostile superhuman AI, don’t imagine a lifeless book-smart thinker dwelling inside the internet and sending ill-intentioned emails. Visualize an entire alien civilization, thinking at millions of times human speeds, initially confined to computers—in a world of creatures that are, from its perspective, very stupid and very slow. A sufficiently intelligent AI won’t stay confined to computers for long. In today’s world you can email DNA strings to laboratories that will produce proteins on demand, allowing an AI initially confined to the internet to build artificial life forms or bootstrap straight to postbiological molecular manufacturing.

If somebody builds a too-powerful AI, under present conditions, I expect that every single member of the human species and all biological life on Earth dies shortly thereafter.

There’s no proposed plan for how we could do any such thing and survive. OpenAI’s openly declared intention is to make some future AI do our AI alignment homework. Just hearing that this is the plan ought to be enough to get any sensible person to panic. The other leading AI lab, DeepMind, has no plan at all.

An aside: None of this danger depends on whether or not AIs are or can be conscious; it’s intrinsic to the notion of powerful cognitive systems that optimize hard and calculate outputs that meet sufficiently complicated outcome criteria. With that said, I’d be remiss in my moral duties as a human if I didn’t also mention that we have no idea how to determine whether AI systems are aware of themselves—since we have no idea how to decode anything that goes on in the giant inscrutable arrays—and therefore we may at some point inadvertently create digital minds which are truly conscious and ought to have rights and shouldn’t be owned.

The rule that most people aware of these issues would have endorsed 50 years earlier, was that if an AI system can speak fluently and says it’s self-aware and demands human rights, that ought to be a hard stop on people just casually owning that AI and using it past that point. We already blew past that old line in the sand. And that was probably correct; I agree that current AIs are probably just imitating talk of self-awareness from their training data. But I mark that, with how little insight we have into these systems’ internals, we do not actually know.

If that’s our state of ignorance for GPT-4, and GPT-5 is the same size of giant capability step as from GPT-3 to GPT-4, I think we’ll no longer be able to justifiably say “probably not self-aware” if we let people make GPT-5s. It’ll just be “I don’t know; nobody knows.” If you can’t be sure whether you’re creating a self-aware AI, this is alarming not just because of the moral implications of the “self-aware” part, but because being unsure means you have no idea what you are doing and that is dangerous and you should stop.


On Feb. 7, Satya Nadella, CEO of Microsoft, publicly gloated that the new Bing would make Google “come out and show that they can dance.” “I want people to know that we made them dance,” he said.

This is not how the CEO of Microsoft talks in a sane world. It shows an overwhelming gap between how seriously we are taking the problem, and how seriously we needed to take the problem starting 30 years ago.

We are not going to bridge that gap in six months.

It took more than 60 years from when the notion of Artificial Intelligence was first proposed and studied for us to reach today’s capabilities. Solving safety of superhuman intelligence—not perfect safety, safety in the sense of “not killing literally everyone”—could very reasonably take at least half that long. And the thing about trying this with superhuman intelligence is that if you get that wrong on the first try, you do not get to learn from your mistakes, because you are dead. Humanity does not learn from the mistake and dust itself off and try again, as in other challenges we’ve overcome in our history, because we are all gone.

Trying to get anything right on the first really critical try is an extraordinary ask, in science and in engineering. We are not coming in with anything like the approach that would be required to do it successfully. If we held anything in the nascent field of Artificial General Intelligence to the lesser standards of engineering rigor that apply to a bridge meant to carry a couple of thousand cars, the entire field would be shut down tomorrow.

We are not prepared. We are not on course to be prepared in any reasonable time window. There is no plan. Progress in AI capabilities is running vastly, vastly ahead of progress in AI alignment or even progress in understanding what the hell is going on inside those systems. If we actually do this, we are all going to die.

Many researchers working on these systems think that we’re plunging toward a catastrophe, with more of them daring to say it in private than in public; but they think that they can’t unilaterally stop the forward plunge, that others will go on even if they personally quit their jobs. And so they all think they might as well keep going. This is a stupid state of affairs, and an undignified way for Earth to die, and the rest of humanity ought to step in at this point and help the industry solve its collective action problem.


Some of my friends have recently reported to me that when people outside the AI industry hear about extinction risk from Artificial General Intelligence for the first time, their reaction is “maybe we should not build AGI, then.”

Hearing this gave me a tiny flash of hope, because it’s a simpler, more sensible, and frankly saner reaction than I’ve been hearing over the last 20 years of trying to get anyone in the industry to take things seriously. Anyone talking that sanely deserves to hear how bad the situation actually is, and not be told that a six-month moratorium is going to fix it.

On March 16, my partner sent me this email. (She later gave me permission to excerpt it here.)

“Nina lost a tooth! In the usual way that children do, not out of carelessness! Seeing GPT4 blow away those standardized tests on the same day that Nina hit a childhood milestone brought an emotional surge that swept me off my feet for a minute. It’s all going too fast. I worry that sharing this will heighten your own grief, but I’d rather be known to you than for each of us to suffer alone.”

When the insider conversation is about the grief of seeing your daughter lose her first tooth, and thinking she’s not going to get a chance to grow up, I believe we are past the point of playing political chess about a six-month moratorium.

If there was a plan for Earth to survive, if only we passed a six-month moratorium, I would back that plan. There isn’t any such plan.

Here’s what would actually need to be done:

The moratorium on new large training runs needs to be indefinite and worldwide. There can be no exceptions, including for governments or militaries. If the policy starts with the U.S., then China needs to see that the U.S. is not seeking an advantage but rather trying to prevent a horrifically dangerous technology which can have no true owner and which will kill everyone in the U.S. and in China and on Earth. If I had infinite freedom to write laws, I might carve out a single exception for AIs being trained solely to solve problems in biology and biotechnology, not trained on text from the internet, and not to the level where they start talking or planning; but if that was remotely complicating the issue I would immediately jettison that proposal and say to just shut it all down.

Shut down all the large GPU clusters (the large computer farms where the most powerful AIs are refined). Shut down all the large training runs. Put a ceiling on how much computing power anyone is allowed to use in training an AI system, and move it downward over the coming years to compensate for more efficient training algorithms. No exceptions for governments and militaries. Make immediate multinational agreements to prevent the prohibited activities from moving elsewhere. Track all GPUs sold. If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.

Frame nothing as a conflict between national interests; make it clear that anyone talking of arms races is a fool. That we all live or die as one, in this, is not a policy but a fact of nature. Make it explicit in international diplomacy that preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange, and that allied nuclear countries are willing to run some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs.

That’s the kind of policy change that would cause my partner and me to hold each other, and say to each other that a miracle happened, and now there’s a chance that maybe Nina will live. The sane people hearing about this for the first time and sensibly saying “maybe we should not” deserve to hear, honestly, what it would take to have that happen. And when your policy ask is that large, the only way it goes through is if policymakers realize that if they conduct business as usual, and do what’s politically easy, that means their own kids are going to die too.

Shut it all down.

We are not ready. We are not on track to be significantly readier in the foreseeable future. If we go ahead on this everyone will die, including children who did not choose this and did not do anything wrong.

Shut it down.

 


 

 Addendum, March 30: 

The great political writers who also aspired to be good human beings, from George Orwell on the left to Robert Heinlein on the right, taught me to acknowledge in my writing that politics rests on force.

George Orwell considered it a tactic of totalitarianism, that bullet-riddled bodies and mass graves were often described in vague euphemisms; that in this way brutal policies gained public support without their prices being justified, by hiding those prices.

Robert Heinlein thought it beneath a citizen's dignity to pretend that, if they bore no gun, they were morally superior to the police officers and soldiers who bore guns to defend their law and their peace; Heinlein, both metaphorically and literally, thought that if you eat meat—and he was not a vegetarian—you ought to be willing to visit a farm and try personally slaughtering a chicken.

When you pass a law, it means that people who defy the law go to jail; and if they try to escape jail they'll be shot.  When you advocate an international treaty, if you want that treaty to be effective, it may mean sanctions that will starve families, or a shooting war that kills people outright.

To threaten these things, but end up not having to do them, is not very morally distinct—I would say—from doing them.  I admit this puts me more on the Heinlein than on the Orwell side of things.  Orwell, I think, probably considers it very morally different if you have a society with a tax system and most people pay the taxes and very few actually go to jail.  Orwell is more sensitive to the count of actual dead bodies—or people impoverished by taxation or regulation, where Orwell acknowledges and cares when that actually happens.  Orwell, I think, has a point.  But I also think Heinlein has a point.  I claim that makes me a centrist.

Either way, neither Heinlein nor Orwell thought that laws and treaties and wars were never worth it.  They just wanted us to be honest about the cost.

Every person who pretends to be a libertarian—I cannot see them even pretending to be liberals—who quoted my call for law and treaty as a call for "violence", because I was frank in writing about the cost, ought to be ashamed of themselves for punishing compliance with Orwell and Heinlein's rule.

You can argue that the treaty and law I proposed is not worth its cost in force; my being frank about that cost is intended to help honest arguers make that counterargument.

To pretend that calling for treaty and law is VIOLENCE!! is hysteria.  It doesn't just punish compliance with the Heinlein/Orwell protocol, it plays into the widespread depiction of libertarians as hysterical.  (To be clear, a lot of libertarians—and socialists, and centrists, and whoever—are in fact hysterical, especially on Twitter.)  It may even encourage actual terrorism.

But is it not "violence", if in the end you need guns and airstrikes to enforce the law and treaty?  And here I answer: there's an actually important distinction between lawful force and unlawful force, which is not always of itself the distinction between Right and Wrong, but which is a real and important distinction.  The common and ordinary usage of the word "violence" often points to that distinction.  When somebody says "I do not endorse the use of violence" they do not, in common usage and common sense, mean, "I don't think people should be allowed to punch a mugger attacking them" or even "Ban all taxation."

Which, again, is not to say that all lawful force is good and all unlawful force is bad.  You can make a case for John Brown (of John Brown's Body).

But in fact I don't endorse shooting somebody on a city council who's enforcing NIMBY regulations.

I think NIMBY laws are wrong.  I think it's important to admit that law is ultimately backed by force.

But lawful force.  And yes, that matters.  That's why it's harmful to society if you shoot the city councilor—

—and a misuse of language if the shooter then says, "They were being violent!"


Addendum, March 31: 

Sometimes—even when you say something whose intended reading is immediately obvious to any reader who hasn't seen it before—it's possible to tell people to see something in writing that isn't there, and then they see it.

My TIME piece did not suggest nuclear strikes against countries that refuse to sign on to a global agreement against large AI training runs.  It said that, if a non-signatory country is building a datacenter that might kill everyone on Earth, you should be willing to preemptively destroy that datacenter; the intended reading is that you should do this even if the non-signatory country is a nuclear power and even if they try to threaten nuclear retaliation for the strike.  This is what is meant by "Make it explicit... that allied nuclear countries are willing to run some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs."

I'd hope that would be clear from any plain reading, if you haven't previously been lied to about what it says.  It does not say, "Be willing to use nuclear weapons" to reduce the risk of training runs.  It says, "Be willing to run some risk of nuclear exchange" [initiated by the other country] to reduce the risk of training runs.

The taboo against first use of nuclear weapons continues to make sense to me.  I don't see why we'd need to throw that away in the course of adding "first use of GPU farms" to the forbidden list.

I further note:  Among the reasons to spell this all out, is that it's important to be explicit, in advance, about things that will cause your own country / allied countries to use military force.  Lack of clarity about this is how World War I and World War II both started.

If (say) the UK, USA, and China come to believe that large GPU runs run some risk of utterly annihilating their own populations and all of humanity, they would not deem it in their own interests to allow Russia to proceed with building a large GPU farm even if it were a true and certain fact that Russia would retaliate with nuclear weapons to the destruction of that GPU farm.  In this case—unless I'm really missing something about how this game is and ought to be played—you really want all the Allied countries to make it very clear, well in advance, that this is what they believe and this is how they will act.  This would be true even in a world where it was, in reality, factually false that the large GPU farm ran a risk of destroying humanity.  It would still be extremely important that the Allies be very explicit about what they believed and how they'd act as a result.  You would not want Russia believing that the Allies would back down from destroying the GPU farm given a credible commitment by Russia to nuke in reply to any conventional attack, and the Allies in fact believing that the danger to humanity meant they had to airstrike the GPU farm anyways.

So if I'd meant "Be willing to employ first use of nuclear weapons against a country for refusing to sign the agreement," or even "Use nukes to destroy rogue datacenters, instead of conventional weapons, for some unimaginable reason," I'd have said that, in words, very clearly, because you do not want to be vague about that sort of thing.

It is not what I meant, and there'd be no reason to say it, and the TIME piece plainly does not say it; and if somebody else told you I said that, update how much you trust them about anything else.

 

So long as I'm clarifying things:  I do not dispute those critics who have noted that most international agreements, e.g., nuclear non-proliferation, bind only their signatories.  I agree that an alliance which declares its intent to strike a non-signatory country for dangerous behavior is extraordinary; though precedents would include Israel's airstrike on Iraq's unfinished Osirak reactor in 1981 (without which Iraq might well have possessed nuclear weapons at the time it invaded Kuwait—the later US misbehavior around Iraq does not change this earlier historical point).

My TIME piece does not say, "Hey, this problem ought to be solvable by totally conventional normal means, let's go use conventional treaties and diplomacy to solve it."  It says, "If anyone anywhere builds a sufficiently powerful AI, under anything remotely like present conditions, everyone will die.  Here is what we'd have to do to prevent that."

And no, I do not expect that policy proposal to be adopted, in real life, now that we've come to this.  I spent the last twenty years trying to have there be options that were Not This, not because I dislike this ultimate last resort... though it is horrible... but because I don't expect we actually have that resort.  This is not what I expect to happen, now that we've been reduced to this last resort.  I expect that we all die.  That is why I tried so hard to have things not end up here.

But if one day a lot of people woke up and decided that they didn't want to die, it seems to me that this is something extraordinary that a coalition of nuclear countries could decide to do, and maybe we wouldn't die.

If all the countries on Earth had to voluntarily sign on, it would not be an imaginable or viable plan even then; there's extraordinary, and then there's impossible.  Which is why I tried to spell out that, if the allied countries were willing to behave in the extraordinary way of "be willing to airstrike a GPU farm built by a non-signatory country" and "be willing to run a risk of nuclear retaliation from a nuclear non-signatory country", maybe those allied countries could decide to just-not-die even if Russia refused to be part of the coalition.

Comments (3)

(Meta note: The TIME piece was previously discussed here. I've cross-posted the contents because the TIME version is paywalled in some countries, and is plastered with ads. This version adds some clarifying notes that Eliezer wrote on Twitter regarding the article.)

I am not aware of any international treaties which sanction the use of force against a non-signatory nation except for those circumstances under which one of the signatory nations is first attacked by a non-signatory nation (e.g. collective defense agreements such as NATO). Your counterexample of the Israeli airstrike on the Osirak reactor is not a precedent as it was not a lawful use of force according to international law and was not sanctioned by any treaty. I agree that the Israeli government made the right decision in orchestrating the attack, but it is important to point out the differences between that and what you are suggesting.

Ultimately, to quibble about whether your suggestion is an "act of violence" or not misses the point. What you suggest would be an unprecedented sanctioning of force. I believe the introduction of such an agreement would be very incendiary and would set a bad precedent. Note that no such agreement was signed in order to prevent nuclear proliferation. Many experts were very worried that nuclear weapons would proliferate much further than they ultimately did. Sometimes force was used, but always with a lighter hand than "let's sign a treaty to bomb anyone we think has a reactor."

Before going too deep into the "should we air strike data centres" issue, I wonder if anyone out there has good numbers about the current availability of hardware for LLM training.

Assuming that the US/NATO is committed to shutting down AI development, how much impact does a serious restriction on chip production/distribution have on the ability of a foreign actor to train advanced LLMs? 

I suspect there are enough old GPUs out there that can be repurposed into training centres, but how much more difficult would it be if little or no new hardware were coming in?

And for those old GPUs inside consumer machines or crypto farms, is it possible to cripple their LLM training capability through software modifications?

Assuming that Microsoft and Nvidia/AMD are onboard, I think it should be possible to push a modification to the firmware of almost every GPU installed inside Windows machines that are connected to the internet (that...should be almost everything). If software modification can prevent GPUs/whatever from being used effectively in LLM training runs, this will hopefully take most existing GPU stocks (and all newly manufactured GPUs) out of the equation for at least some time.