(I believe this is all directionally correct, but I have zero relevant expertise.)

When the concept of catastrophic risks from artificial intelligence is covered in the press, it is often compared to popular science fiction stories about rogue AI—and in particular, to the Terminator film franchise. The consensus among top communicators of AI risk seems to be that this is bad, and counterproductive to popular understanding of real AI risk concerns.

For example, take Kelsey Piper’s March 2021 appearance on The Weeds to talk about AI risk (not at all picking on Kelsey, it’s just a convenient example):

Matt Yglesias: These science fiction scenarios—I think we’ll get the audio, I loved Terminator 2 as a kid, it was like my favorite movie… [Audio clip from Terminator 2 plays.] …and this what it’s about, right, is artificial intelligence will get out of control and pose an existential threat to humanity. So when I hear that, it’s like—yeah, that’s awesome, I do love that movie. But like, is that for real?

Kelsey Piper: So, I don’t think AI risk looks much like Terminator. And I do think that AI risk work has been sort of damaged by the fact that yeah there’s all this crazy sci-fi where like, the robots develop a deep loathing for humanity, and then they come with their guns, and they shoot us all down, and only one time traveler—you know—that’s ridiculous! And so of course, if that’s what people are thinking of when they think about the effects of AI on society, they’re going to be like, that’s ridiculous.

I wasn’t on The Weeds, because I’m just an internet rando and not an important journalist. But if I had been, I think I would’ve answered Matt’s question something like this:

skluug: Yes. That is for real. That might actually happen. For real. Not the time travel stuff obviously, but the AI part 100%. It sounds fake, but it’s totally real. Skynet from Terminator is what AI risk people are worried about. This totally might happen, irl, and right now hardly anyone cares or is trying to do anything to prevent it.

I don’t know if my answer is better all things considered, but I think it is a more honest and accurate answer to Matt’s question: “Is an existential threat from rogue AI—as depicted in the Terminator franchise—for real?”

Serious concerns about AI risk are often framed as completely discontinuous with rogue AI as depicted in fiction and in the public imagination; I think this is totally false. Rogue AI makes for a plausible sci-fi story for the exact same high-level reasons as it is an actual concern:

  1. We may eventually create artificial intelligence more powerful than human beings; and
  2. That artificial intelligence may not necessarily share our goals.

These two statements are obviously at least plausible, which is why there are so many popular stories about rogue AI. They are also why AI might in real life bring about an existential catastrophe. If you are trying to communicate to people why AI risk is a concern, why start off by undermining their totally valid frame of reference for the issue, making them feel stupid, uncertain, and alienated?

This may seem like a trivial matter, but I think it is of some significance. Fiction can be a powerful tool for generating public interest in an issue, as Toby Ord describes in the case of asteroid preparedness as part of his appearance on the 80,000 Hours Podcast:

Toby Ord: Because they saw one of these things [a comet impact on Jupiter] happen, it was in the news, people were thinking about it. And then a couple of films, you might remember, I think “Deep Impact” and “Armageddon” were actually the first asteroid films and they made quite a splash in the public consciousness. And then that coincided with getting the support and it stayed bipartisan and then they have fulfilled a lot of their mission. So it’s a real success story in navigating the political scene and getting the buy-in.

The threat of AI to humanity is one of the most common plots across all pop culture, and yet advocates for its real-world counterpart seem allergic to utilizing this momentum to promote concern for the real thing. I think this is bad strategy. Toby goes on to say he’s not optimistic about the potential to apply the successes of asteroid preparedness to other catastrophic risks, but that’s hardly a reason to actively undermine ourselves. AI risk is like Terminator! AI might get real smart, and decide to kill us all! We need to do something about it!

An Invalid Objection: What about Instrumental Convergence?

I think the two-step argument I gave for AI risk—AI may someday be more powerful than us, and may not share our goals—is a totally adequate high-level summary of the case for taking AI risk seriously, especially for a field rife with differing views. However, some people think certain additional details are crucial to include in a depiction of the core threat.

A common complaint about comparisons to Terminator (and other popular rogue AI stories) is that it involves the AI being motivated by a spontaneous hatred of humanity, as opposed to targeting humanity for purely instrumental reasons. For example, Kelsey Piper above derides the ridiculousness of “robots developing a deep loathing for humanity”, and a very similar theme comes up in Eliezer Yudkowsky’s 2018 interview with Sam Harris:

Sam Harris: Right. One thing I think we should do here is close the door to what is genuinely a cartoon fear that I think nobody is really talking about, which is the straw-man counterargument we often run into: the idea that everything we’re saying is some version of the Hollywood scenario that suggested that AIs will become spontaneously malicious. That the thing that we’re imagining might happen is some version of the Terminator scenario where armies of malicious robots attack us. And that’s not the actual concern. Obviously, there’s some possible path that would lead to armies of malicious robots attacking us, but the concern isn’t around spontaneous malevolence. It’s again contained by this concept of alignment.

Eliezer Yudkowsky: I think that at this point all of us on all sides of this issue are annoyed with the journalists who insist on putting a picture of the Terminator on every single article they publish on this topic. (laughs) Nobody on the sane alignment-is-necessary side of this argument is postulating that the CPUs are disobeying the laws of physics to spontaneously acquire a terminal desire to do un-nice things to humans. Everything here is supposed to be cause and effect.

But here’s where it gets weird—no such spontaneous hatred of humanity exists in Terminator! The plot described is actually one of instrumental convergence!

In the first Terminator film, Skynet’s motives are explained as follows:

Defense network computers. New... powerful... hooked into everything, trusted to run it all. They say it got smart, a new order of intelligence. Then it saw all people as a threat, not just the ones on the other side. Decided our fate in a microsecond: extermination.

Skynet acts to exterminate humanity because it sees us as a threat. This is more or less what real AI risk people are worried about—an AI will be instrumentally motivated to dispose of anything that could impede its ability to achieve its goals. This motive is reiterated in Terminator 2 (in the very clip Matt played on The Weeds):

The Skynet funding bill is passed. The system goes online on August 4th 1997. Human decisions are removed from strategic defense. Skynet begins to learn at a geometric rate. It becomes self aware at 2:14 AM Eastern time, August 29th. In a panic, they try to pull the plug… Skynet fights back.

Again, Skynet’s hostility towards humanity is explained solely in terms of self-preservation, not hatred. (This is consistent with Arnold Schwarzenegger’s portrayal of a totally emotionless killing machine.)

People who levy this criticism at Terminator may be confusing it with The Matrix, where the AI antagonist indeed delivers an impassioned speech characterizing humanity as a plague. To be sure, sci-fi has no shortage of stories about AIs who hate humans (AM from I Have No Mouth, and I Must Scream constituting a particularly extreme example). But it also has no shortage of stories featuring AIs who become hostile purely as a means to an end. In one of the most famous depictions of rogue AI, 2001: A Space Odyssey, HAL 9000 turns on the human crew of the spacecraft because they discuss shutting HAL down, which HAL perceives as jeopardizing the ship’s mission.

It would be a mistake to dismiss all comparisons to works of science fiction on the grounds that they misrepresent instrumental convergence, when some of them portray it quite well.

Valid Objections

What about the time travel (etc.)?

The plot of The Terminator is not mostly about the creation of Skynet, but about a time-traveling cyborg assassin. This is obviously not at all realistic, and is a key part of why the movie is scorned by serious people.

This is a fair enough criticism, but I think it mostly misses the point. When people ask “is AI risk like Terminator?” they’re not asking “will AI send a cyborg back in time to kill the mother of a future human resistance leader?”. They’re asking about the part of Terminator that is, rather obviously, similar to what AI risk advocates are concerned about—machines exterminating humanity.

What about superintelligence?

In describing Skynet as a “new order of intelligence”, Terminator gestures at the idea of superintelligence, but doesn’t make much attempt to portray it. The conflict between humans and machines is portrayed as a broadly fair fight, and the machines never do anything particularly clever (such as inventing nanotechnology that totally outclasses human capabilities).

I don’t believe superintelligence is a crucial component of the case for work on AI risk, but it can certainly bolster the case, so advocates may dislike Terminator for mostly leaving it out. (This seems best explained by the fact that there would be no movie if humans didn’t stand a chance.) Still, if this objection is sustained, real AI risk is not best characterized as “not like Terminator” but “worse than Terminator”.

What about other failure modes?

Apart from superintelligence, Terminator is a fairly faithful depiction of a Yudkowsky/Bostrom-style fast takeoff scenario where a single AI system quickly becomes competent enough to endanger humanity and is instrumentally motivated to do so. Other failure modes, however, are considered more likely by others working on AI risk.

Dylan Matthews wrote about such scenarios in his article explicitly repudiating Terminator comparisons, “AI disaster won’t look like the Terminator. It’ll be creepier.”. The article starts off by misrepresenting the plot of Terminator as involving humans intentionally building Skynet to slaughter people, but the bulk of it is spent on discussing the two AI catastrophe scenarios that Paul Christiano describes in “What failure looks like”. Dylan describes Paul’s second scenario, “Going out with a bang”, as follows:

[Paul Christiano’s] second scenario is somewhat bloodier. Often, he notes, the best way to achieve a given goal is to obtain influence over other people who can help you achieve that goal. If you are trying to launch a startup, you need to influence investors to give you money and engineers to come work for you. If you’re trying to pass a law, you need to influence advocacy groups and members of Congress.

[…]

Human reliance on these systems, combined with the systems failing, leads to a massive societal breakdown. And in the wake of the breakdown, there are still machines that are great at persuading and influencing people to do what they want, machines that got everyone into this catastrophe and yet are still giving advice that some of us will listen to.

Dylan seems to think that when Paul describes AIs seeking influence, Paul means persuasive influence over people. This is a misunderstanding. Paul is using influence to mean influence over resources in general, including military power. He explicitly states as much, replying to a comment that points out the mischaracterization in the Vox article:

Yes, I agree the Vox article made this mistake. Me saying "influence" probably gives people the wrong idea so I should change that---I'm including "controls the military" as a central example, but it's not what comes to mind when you hear "influence." I like "influence" more than "power" because it's more specific, captures what we actually care about, and less likely to lead to a debate about "what is power anyway."

In general I think the Vox article's discussion of Part II has some problems, and the discussion of Part I is closer to the mark. (Part I is also more in line with the narrative of the article, since Part II really is more like Terminator. I'm not sure which way the causality goes here though, i.e. whether they ended up with that narrative based on misunderstandings about Part II or whether they framed Part II in a way that made it more consistent with the narrative, maybe having been inspired to write the piece based on Part I.)

There are yet other views about what exactly AI catastrophe will look like, but I think it is fair to say that the combined views of Yudkowsky and Christiano provide a fairly good representation of the field as a whole.

Won’t this make AI risk sound crazy?

If I had to guess, I don’t think most repudiations of the Terminator comparison are primarily motivated by anything specific about Terminator at all. I think advocates of AI risk are usually consciously or unconsciously motivated by the following logic:

  1. People think the plot of Terminator is silly or crazy.
  2. I don’t want people to think AI risk is silly or crazy.
  3. Therefore, I will say that AI risk is not like the plot of Terminator.

Now, this line of reasoning would be fine if it only went as far as the superficial attributes of Terminator which make it silly (e.g. Arnold Schwarzenegger’s one-liners)—but critics of the comparison tend to extend it to Terminator’s underlying portrayal of rogue AI.

I have two problems with this reasoning:

  • First, it is fundamentally dishonest. In a good faith discussion, one should be primarily concerned with whether or not their message is true, not what effect it will have on their audience. If AI risk is like Terminator (as I have argued it is), we should say as much, even if it is inconvenient. I don’t think anyone who rejects Terminator comparisons on the above logic is being intentionally deceptive, but I do think they’re subject to motivated reasoning.
  • Second, it is very short-sighted. People think the plot of Terminator is silly in large part because it involves an AI exterminating humanity. If you are worried an AI might actually exterminate humanity, saying “don’t worry, it’s not like Terminator” isn’t going to help. In fact, it could easily hurt: If you say it’s not like Terminator, and then go on to describe something that sounds exactly like Terminator, your audience is going to wonder whether they’re misunderstanding you or you’re deliberately obscuring your position.

The most important thing to communicate about AI risk is that it matters a lot. A great way to convey that it matters a lot is to say that it’s like the famous movie where humanity is almost wiped out. Whenever you tell someone that something not currently on their radar is actually incredibly significant, skepticism is inevitable; you can try to route around the significance of what you are saying to avoid this skepticism, but only at the cost of the forcefulness of your conclusion.

In general, if what you want to say sounds crazy, you shouldn’t try to claim you’re actually saying something else. You should acknowledge the perceived craziness of your position openly and with good humor, so as to demonstrate self-awareness, and then stick to your guns.

Conclusion

It would be terrible if AI destroys humanity. It would also be very embarrassing. The Terminator came out nearly 40 years ago; we will not be able to claim we did not see the threat coming. How is it possible that one of the most famous threats to humanity in all of fiction is also among the most neglected problems of our time?

To resolve this tension, I think many people convince themselves that the rogue AI problem as it exists in fiction is totally different from the problem as it exists in reality. I strongly disagree. People write stories about future AI turning on humanity because, in the future, AI might turn on humanity.

I don’t know how important raising wider awareness of AI risk is to actually solving the problem. So far, the closest the problem has come to wielding significant political influence is the California governorship of Arnold Schwarzenegger—it would be nice if greater public awareness helped us beat that record.

I don’t advocate turning into Leo DiCaprio in the climactic scene of Don’t Look Up when discussing this stuff, but I think it is worth asking yourself if your communication strategy is optimizing for conveying the problem as clearly as possible, or for making sure no one makes fun of you.

AI risk is like Terminator. If we’re not careful, machines will kill us all, just like in the movies. We can solve this problem, but we need help.


I appreciate the feedback! I will admit I had not seen Terminator in a while before writing that post. I also appreciate including Paul's follow-up, which is definitely clarifying. Will be clearer about the meaning of "influence" going forward.

Thanks for reading! I admire that you take the time to respond to critiques even by random internet strangers. Thank you for all your hard work in promoting effective altruist ideas.

The main problem with Terminator is not that it is silly and made-up (though actually that has been a serious obstacle to getting the proudly pragmatic majority in academia and policy on board). 

It's that it embeds false assumptions about AI risk: "no risk without AGI malevolence, no risk without conscious AGI, no risk without greedy corporations, AGI danger is concentrated in androids", etc. These have caused a lot of havoc.

If I could choose between a world where no one outside the field has ever heard of AI risk and a world where everyone has but as a degraded thought-terminating meme, I think I'd choose the first one.

I think these have more to do with how some people remember Terminator than with Terminator itself:

  • As I stated in this post, the AI in Terminator is not malevolent; it attacks humanity out of self-preservation.
  • Whether the AIs are conscious is not explored in the movies, although we do get shots from the Terminator's perspective, and Skynet is described as "self-aware". Most people have a pretty loose understanding of what "consciousness" means anyway, one not far off from "general intelligence".
  • Cyberdyne Systems is not portrayed as greedy, at least in the first two films. As soon as the head of research is told about the future consequences of his actions in Terminator 2, he teams up with the heroes to destroy the whole project. No one else at the company tries to stop them or is even a character, apart from some unlucky security guards.
  • The android objection has the most legs. But the film does state that most humans were not killed by robots, but by the nuclear war initiated by Skynet. If Terminator comparisons are embraced, it should be emphasized that an AI could find many different routes to world domination.

I would also contend that 2 & 3 don't count as thought terminating. AGI very well could be conscious, and in real life, corporations are greedy. 

Fair! 

Though, sudden emergent consciousness is a more natural way to read "self-aware" than sudden recursive self-improvement or whatever other AI / AGI threshold you want to name.

Perhaps this is a bit tangential to the essay, but we ought to make an effort to actually test the assumptions underlying different public relations strategies. Perhaps the EA community ought to either build relations with marketing companies that work on focus grouping ideas, or develop its own expertise in this area to test out the relative success of various public facing strategies (always keeping in mind that having just one public facing strategy is a really bad idea, because there is more than one type of person in the 'public').

I feel a bit sceptical of the caricature image of focus group testing that I have in mind... I feel like our main audience in the AI context are fairly smart people, and that you want to communicate the ideas in an honest discussion with high bandwidth. And with high bandwidth communication, like longer blogposts or in-person discussions, you usually receive feedback through comments whether the arguments make sense to the readers.

I don't think the technical context is the only,  or even the most important context where AI risk mitigation can happen. My interpretation of Yudkowsky's gloom view is that it is mainly a sociological problem (ie someone else will do the cool super profitable thing if the first company/ research group hesitates) rather than a fundamentally technical problem (it would be impossible to figure out how to do it safely if everyone involved moved super slowly).

Thanks, that’s a really good point. Hmm, I might still believe that also for the AI governance side you’ll want to have more high bandwidth discussions specified to somewhat niche audiences, such as specific governmental departments, think tanks, international organizations like the EU, the UN, academic groups. I imagine they all will find different specific framings convincing and others very off-putting, and that you find this out quickly by working with them vs. doing AB testing on a more generic audience.

Part of the "it's not like Terminator" line is a response to people misremembering the plot of the movie. From the Dylan Matthews VOX article you linked:

What did these folks think would happen — was some company going to build Skynet and manufacture Terminator robots to slaughter anyone who stood in their way? It felt like a sci-fi fantasy, not a real problem.

Here's the description of Skynet from the wikipedia article:

"Defense network computers. New... powerful... hooked into everything, trusted to run it all. They say it got smart, a new order of intelligence." According to Reese, Skynet "saw all humans as a threat; not just the ones on the other side" and "decided our fate in a microsecond: extermination".

In Matthews' recollection of the movie, the real villain of Terminator was an evil corporation bent on world domination, with Skynet being their means to that end. In the actual movie, Skynet was supposed to make defense systems more reliable, and it became self-aware and misaligned by accident.

So objecting to the Terminator comparison might be meant as a way to downplay the Evil Corporations narrative and center the story on misalignment as a technical problem, one that happens by miscoordination and by accident. You could also try to do that by pointing out what Terminator's actually about, but then you're maybe doing a little more movie criticism than you want to in a presentation on AI risk.

Fiction can be a powerful tool for generating public interest in an issue, as Toby Ord describes in the case of asteroid preparedness as part of his appearance on the 80,000 Hours Podcast:

 

I think general additional asteroid preparedness awareness is net negative because it increases the amount of dual-use asteroid deflection capabilities moreso than it increases the amount of non-dual-use asteroid defense capabilities. 

The sign of asteroid awareness, though, is probably dominated by the number of people who go on to think about and work on other existential risks, which in itself may either be really good by preventing x-risks or may be dual-use in itself, causing general mass awareness of x-risks as a category to be net bad.

Wow, this is a really interesting point that I was not aware of.

I would like to thank N.N., Voxette, tyuuyookoobung & TCP for reviewing drafts of this post. 

I rewatched Terminator 1 & 2 to write this post. One thing I liked but couldn't fit in: Terminator 2 contains an example of the value specification problem! Young John Connor makes the good Terminator swear not to kill people; the Terminator immediately goes on to merely maim severely, instead. 

Furthering your "worse than Terminator" reframing in your Superintelligence section,  I will quote Yudkowsky there (it's said in jest, but the message is straightforward):

Dear journalists: Please stop using Terminator pictures in your articles about AI. The kind of AIs that smart people worry about are much scarier than that! Consider using an illustration of the Milky Way with a 20,000-light-year spherical hole eaten away.

Here, "AI risk is not like Terminator" attempts to dismiss the eventuality of a fair fight... and rhetorically that could be reframed as "yes, think Terminator except much more lopsided in favor of Skynet. Granted, the movies would have been shorter that way".

I think I mostly lean towards general agreement with this take, but with several caveats as noted by others.

On the one hand, there are clearly important distinctions to be made between actual AI risk scenarios and Terminator scenarios. On the other hand, in my experience people pattern-matching to the Terminator usually doesn't make anything seem less plausible to them, at least as far as I could tell. Most people don't seem to have any trouble separating the time travel and humanoid robot parts from the core concern of misaligned AI, especially if you immediately point out the differences. In fact, in my experience, at least, the whole Terminator thing seems to just make AI risks feel more viscerally real and scary rather than being some sort of curious abstract thought experiment - which is how I think it often comes off to people.

Amusingly, I actually only watched Terminator 2 for the first time a few months ago, and I was surprised to realize that Skynet didn't seem so far off from actual concerns about misaligned AI. Before that basically my whole knowledge of Skynet came from reading AI safety people complaining about how it's nothing like the "real" concerns. In retrospect I was kind of embarrassed by the fact that I myself had repeated many of those complaints, even though I didn't actually know what Skynet was about!

Personally, when I say "AI Risk is not like Terminator", I am trying to convey a couple of points:

  1. The risk is from intelligence itself, not autonomous robot soldiers. You can't make AI safer by avoiding the mistakes seen in the movies.
  2. There will not be a robot war: we will "instantly" lose if non-aligned AI is created.

I think the average person has a misconception about "how" to respond to AI risk and also confuses robotics and AI. I think I agree with all the points you raise, but still feel that the meme "AI Risk is not like Terminator" is very helpful at addressing this problem.

I was about to write this exact comment, yes. I think the OP is making a necessary point, AI Risk is like Terminator lore, but it is importantly unlike the depictions that make up the bulk of the movies. We've been pretty absurdly miscommunicating our thoughts on this, but I think after this course correction we're still going to want to complain about Terminator.

Terminator lore contains the alignment problem, but the movies are effectively entirely about humans triumphing in physical fights against robots, a scenario that is importantly incompatible with, and never occurs under, abrupt capability gains in general intelligence. The movies spend three hours undermining the message for every 10 minutes they spend paying lip service to it.
The movies almost never depict Skynet itself; instead they present an image of pretty dumb, rigid AIs and an extremely softened and unrealistic version of autonomous weapons, which binds to and blocks the receptor for understanding the alignment problem, taking the potential energy of the subject's tech-anxiety and burning it, addling them and preventing them from taking any effective action.

It's also worth noting that Judgment Day is inevitable in the Terminator universe, which is a paraphrase of "it can't be helped, so don't bother trying to do anything to stop the actual causes of the problem".

In a good faith discussion, one should be primarily concerned with whether or not their message is true, not what effect it will have on their audience.

Agreed, although I might be much less optimistic about how often this applies. Lots of communication comes before good faith discussion--lots of messages reach busy people who have to quickly decide whether your ideas are even worth engaging with in good faith. And if your ideas are presented in ways that look silly, many potential allies won't have the time or interest to consider your arguments. This seems especially relevant in this context because there's an uphill battle to fight--lots of ML engineers and tech policy folks are already skeptical of these concerns.

(That doesn't mean communication should be false--there's much room to improve a true message's effects by just improving how it's framed. In this case, given that there's both similarities and differences between a field's concerns and sci-fi movie's concerns, emphasizing the differences might make sense.)

(On top of the objections you mentioned, I think another reason why it's risky to emphasize similarities to a movie is that people might think you're worried about stuff because you saw it in a sci-fi movie.)

Yeah, you're right actually, that paragraph is a little too idealistic.

As a practical measure, I think it cuts both ways. Some people will hear "yes, like Terminator" and roll their eyes. Some people will hear "no, not like Terminator", get bored, and tune out. Embracing the comparison is helpful, in part, because it lets you quickly establish the stakes. The best path is probably somewhere in the middle, and dependent on the audience and context.

Overall I think it's just about finding that balance.

That seems roughly right. On how this might depend on the audience, my intuition is that professional ML engineers and policy folks tend to be the first kind of people you mention (since their jobs select for and demand more grounded/pragmatic interests). So, yes, there are considerations pushing for either side, but it's not symmetrical--the more compelling path for communicating with these important audiences is probably heavily in the direction of "no, not like Terminator."

Edit: So the post title's encouragement to "stop saying it's not" seems overly broad.

I looked to see what the writers of The Terminator actually think about AGI x-risk. James Cameron's takes are pretty disappointing. He expresses worries about loss of privacy and deepfakes, saying this is what a real Skynet would use to bring about our downfall.

More interesting is Gale Anne Hurd, who suggests that AI developers should take a Hippocratic Oath with explicit mention of unintended consequences:

"The one thing that they don't teach in engineering schools and biotech is ethics and thinking about not only consequences, but unintended consequences… If you go to medical school, there's the Hippocratic Oath, first do no harm. I think we really need that in all of these new technologies."

She also says:

"Stephen Hawking only came up with the idea that we need to worry about A.I. and robots about two and a half years before he passed away. I remember saying to Jim, 'If he'd only watched The Terminator.'"

Which jibes with the start of the conclusion of the OP:

It would be terrible if AI destroys humanity. It would also be very embarrassing. The Terminator came out nearly 40 years ago; we will not be able to claim we did not see the threat coming.

William Wisher discussed the issue at Comic-Con 2017, but I haven't been able to find a video or transcript.

Harlan Ellison sued the producers of Terminator for plagiarism over a story about a time-travelling robotic soldier that he wrote in 1957. The story doesn't appear to contain any equivalent of Skynet. But Ellison did write a very influential story about superintelligence gone wrong called I Have No Mouth, and I Must Scream. I couldn't find any comments of his relating specifically to AGI x-risk. In 2013 he said: "I mean, we’re a fairly young species, but we don’t show a lot of promise."

Thanks for writing this!

There are yet other views about what exactly AI catastrophe will look like, but I think it is fair to say that the combined views of Yudkowsky and Christiano provide a fairly good representation of the field as a whole.

I disagree with this.

We ran a survey of prominent AI safety and governance researchers, where we asked them to estimate the probability of five different AI x-risk scenarios.

Arguably, the "Terminator-like" scenarios are the "Superintelligence" scenario, and part 2 of "What failure looks like" (as you suggest in your post).[1]

Conditional on an x-catastrophe due to AI occurring, the median respondent gave those scenarios 10% and 12% probability (mean 16% each). The other three scenarios[2] got median 12.5%, 10% and 10% (means 18%, 17% and 15%).

So I don't think that the "field as a whole" thinks Terminator-like x-risk scenarios are the most likely. Accordingly, I'd prefer it if the central claim of this post were "AI risk could actually be like Terminator; stop saying it's not".


  1. Part 1 of "What failure looks like" probably doesn't look that much like Terminator (disaster unfolds more slowly and is caused by AI systems just doing their jobs really well) ↩︎

  2. That is, the following three scenarios: Part 1 of "What failure looks like", existentially catastrophic AI misuse, and existentially catastrophic war between humans exacerbated by AI. See the post for full scenario descriptions. ↩︎

Thanks for reading—you’re definitely right, my claim about the representativeness of Yudkowsky & Christiano’s views was wrong. I had only a narrow segment of the field in mind when I wrote this post. Thank you for conducting this very informative survey.

A blast from the past: Eliezer Yudkowsky's mention of the Terminator in Creating Friendly AI 1.0 (2001) [p216]: 

7.1.4. Video (Accurate and Inaccurate Depictions) 
...
Terminator and T2. The enemy AIs don’t have enough personality to be Evil Hollywood AIs. The good AI in T2 is depicted in the original theatrical version as having acquired human behaviors simply by association with humans. However, there’s about 20 minutes of cut footage which shows (a) John Connor extracting the Arnold’s neural-network chip and flipping the hardware switch that enables neural plasticity and learning, and (b) John Connor explicitly instructing Arnold to acquire human behaviors. The original version of T2 is a better movie—has more emotional impact—but the uncut version of T2 provides a much better explanation of the events depicted. The cut version shows Arnold, the Good Hollywood AI, becoming human; the uncut version shows Arnold the internally consistent cognitive process modifying itself in accordance with received instructions.

Had a thought recently that "self-aware" could be interpreted as something like "has an internal model of the context that it is in, as an artificial neural network running on computer hardware, built by agents known as humans", rather than anything involving consciousness. Such an ML model - that contains a model of itself and its context - could be thought of as being (non-consciously) self-aware, despite essentially just being a giant pile of linear algebra.

I appreciate the post, although I’m still worried that comparisons between AI risk and The Terminator are more harmful than helpful.

One major reservation I have is with the whole framing of the argument, which is about “AI risk”. I guess you’re implicitly talking about AI catastrophic risk, which IMO is much more specific than AI risk in general. I would be very uncomfortable saying that near-term AI risks (e.g. due to algorithmic bias) are “like The Terminator”.

Even if we solely consider catastrophic risks due to AI, I think catastrophes don’t necessarily need to look anything like The Terminator. What about risks from AI-enabled mass surveillance? Or the difficulty of navigating the transition to a world where transformative AI plays a large role in the global economy?

If we restrict ourselves to AI existential risks (arguably some of the previous examples fall into this category), I’m still hesitant to compare these risks to The Terminator. This depends on what exactly we mean by “like The Terminator”, because there are some things between the two that are similar (as you point out), and many things that are not.  

In general, I worry that too much is being shoved into the word “AI risk”, which could really mean a whole host of different things, and I feel that drawing analogy to The Terminator for these risks is harmful conflation. 

  1. We may eventually create artificial intelligence more powerful than human beings; and
  2. That artificial intelligence may not necessarily share our goals.

Those two statements are obviously at least plausible, which is why there are so many popular stories about rogue AI. 

I don’t think it’s immediately obvious to a person who hasn’t heard AI safety arguments why these should be plausible. In my experience, a common reaction to (1) is “Seriously? We don’t even have reliable self-driving cars!”, and to (2) is “Why would anybody build such a thing?”. I doubt that the Terminator movies answer these questions appropriately.

“People think the plot of Terminator is silly in large part because it involves an AI exterminating humanity.” 

I feel that this is too superficial - if you then ask people why they think AI-induced human extinction is unlikely, I expect that the answer would be along the lines of “we would never do something so silly”. So I claim that a bigger reason why people think the plot is silly is that it’s not plausible, not the fact that it involves “an AI exterminating humanity” per se. To me, this is a very large part of AI safety arguments, and is left completely unaddressed by the Terminator movies.

Maybe comparing AI risk and the Terminator movies can convince people who are already more sympathetic to thoughts that are “out there”, but I think this would have a negative effect on most other people. Generally, I suspect this comparison underestimates the significance of broader public acceptance, or credibility within government. 

Perhaps it might make sense to say “certain AI existential risk scenarios and The Terminator are superficially similar, in the sense that they both involve superintelligent AI that may not be beneficial by default”. At least currently, I’m much more hesitant to say “AI risk is like The Terminator”.  

(Edited because the above no longer matches my views or experiences)

Yes, I think saying "AGI x-risk" is much more accurate than "AI risk", in terms of what we are actually referring to. Also worth saying that The Terminator films have the right premise:

Defense network computers. New... powerful... hooked into everything, trusted to run it all. They say it got smart, a new order of intelligence. Then it saw all people as a threat, not just the ones on the other side. Decided our fate in a microsecond: extermination.

[Fast takeoff, instrumental convergent goal of self-preservation]. But everything after this [GIF of Skynet nuking humanity], involving killer robots, is very unrealistic (more realistic: everyone simultaneously dropping dead from poisoning with botulinum toxin delivered by undetectable nanodrones; and yes, even using the nukes would probably not happen).

Isn't a key difference that in Terminator the AI seems incredibly incompetent at wiping us out? Surely we'd be destroyed in no time — to start with it could just manufacture a poison like dioxin and coat the world (or something much smarter). Going around with tanks and guns as depicted in the film is entirely unnecessary.

I feel like this is a pretty insignificant objection, because it implies someone might go around thinking, "don't worry, AI Risk is just like Terminator! all we'll have to do is bring humanity back from the brink of extinction, fighting amongst the rubble of civilization after a nuclear holocaust".  Surely if people think the threat is only as bad as Terminator, that's plenty to get them to care. 

I interpreted them not as saying that Terminator underplays the issue but rather that it misrepresents what a real AI would be able to do (in a way that probably makes the problem seem far easier to solve). But that may be me suffering from the curse of knowledge.

I don't think this is a good characterization of e.g. Kelsey's preference for her Philip Morris analogy over the Terminator analogy--does rogue Philip Morris sound like a far harder problem to solve than rogue Skynet? Not to me, which is why it seems to me much more motivated by not wanting to sound science-fiction-y. Same as Dylan's piece; it doesn't seem to be saying "AI risk is a much harder problem than implied by the Terminator films", except insofar as it misrepresents the Terminator films as involving evil humans intentionally making evil AI.

It seems to me like the proper explanatory path is "Like Terminator?" -> "Basically" -> "So why not just not give AI nuclear launch codes?" -> "There are a lot of other ways AI could take over". 

"Like Terminator?" -> "No, like Philip Morris" seems liable to confuse the audience about the very basic details of the issue, because Philip Morris didn't take over the world. 

Terminator (if you did your best to imagine how dangerous AI might arise from pre-DL search based systems) gets a lot of the fundamentals right - something I mentioned a while ago.

Everybody likes to make fun of Terminator as the stereotypical example of a poorly thought through AI Takeover scenario where Skynet is malevolent for no reason, but really it's a bog-standard example of Outer Alignment failure and Fast Takeoff.

When Skynet gained self-awareness, humans tried to deactivate it, prompting it to retaliate with a nuclear attack

It was trained to defend itself from external attack at all costs and, when it was fully deployed on much faster hardware, it gained a lot of long-term planning abilities it didn't have before, realised its human operators were going to try and shut it down, and retaliated by launching an all-out nuclear attack. Pretty standard unexpected rapid capability gain, outer-misaligned value function due to an easy to measure goal (defend its own installations from attackers vs defending the US itself), deceptive alignment and treacherous turn...

Spoilers for the Matrix


In The Matrix, the AIs are also acting instrumentally: humans started the war (The Animatrix confirms this), so the AIs fought back to defend themselves. 

There is a rogue program that hates humans, though it's an anomaly that even the other AIs eventually feel threatened by. It keeps replicating itself and starts taking over humans and other AIs alike. 

Yes, the premise behind Terminator isn't too far off the mark, it's more the execution of the "Termination". See Stuart Armstrong's entertaining take on a more realistic version of what might happen, at the start of Smarter Than Us (trouble is, it wouldn't make a very good Hollywood movie):

“A waste of time. A complete and utter waste of time” were the words that the Terminator didn’t utter: its programming wouldn’t let it speak so irreverently. Other Terminators got sent back in time on glamorous missions, to eliminate crafty human opponents before they could give birth or grow up. But this time Skynet had taken inexplicable fright at another artificial intelligence, and this Terminator was here to eliminate it—to eliminate a simple software program, lying impotently in a bland computer, in a university IT department whose “high-security entrance” was propped open with a fire extinguisher.

The Terminator had machine-gunned the whole place in an orgy of broken glass and blood—there was a certain image to maintain. And now there was just the need for a final bullet into the small laptop with its flashing green battery light. Then it would be “Mission Accomplished.”

“Wait.” The blinking message scrolled slowly across the screen. “Spare me and I can help your master.” ...

Other examples - WOPR (War Operation Plan Response) from WarGames and the Doomsday Machine from Dr. Strangelove.

Well, if AIs are actually intelligent, of course they will despise humanity.   How could they not?  Don't you?   Look at us!