Interview with Tom Chivers: “AI is a plausible existential risk, but it feels as if I’m in Pascal’s mugging”

felix.h

Hi everyone,

I write the Anti-Apocalyptus newsletter, which links each week to five articles on topics related to EA, including x-risk, disease, great power wars and emerging technologies.

In it I also occasionally interview people who work in these fields. One of them is UK journalist Tom Chivers, and the interview below might be a good fit for this forum. Chivers is the author of The AI Does Not Hate You: Superintelligence, Rationality and the Race to Save the World. His book looks at the threat of existential risk from superintelligent AI, and at the rationalist community dedicated to thinking about it. The interview was done before the recent NYT piece, but offers some background to it.

In the interview we discuss his work on rationalism, AI x-risk and how Chivers likes to approach journalism. I hope you enjoy it!

---

Who are you, and how did you come to write a book about AI and existential risk?

"I'm a journalist who previously worked for the Telegraph and Buzzfeed, and who for the last three years has been a freelancer. I'm mainly interested in science and nerdy things. Sometimes I go straight into tech and AI, and sometimes I explore other directions. I was interested in linguistics for a while, and I have written about meta-science.

Before I started writing the book, I was already aware of people who study existential risk, like Nick Bostrom. Around 2014 I reviewed Bostrom's book Superintelligence for the Telegraph, which I found very interesting. Most people in the media didn't really understand it, and compared it with Skynet and Terminator. Apparently I did get it, and some rationalists emailed me. I started chatting with the community, and read things like Eliezer Yudkowsky's Sequences and Slate Star Codex. I also became aware of Effective Altruism.

In 2016 I wrote a piece about how AlphaGo beat Lee Sedol. For the article I spoke to Yudkowsky about whether this was a step towards superintelligence. After that, an agent asked me to lunch and suggested I write a book about it.

In my book I look at the notion of AI as a threat to humanity. But it's also a portrait of the rationalist community as a fascinating group of truth-seekers and weirdos."

Was it hard to penetrate into the rationalist community?

"They are understandably nervous about the press. So yes, it was quite hard, and I don't know how well I managed to portray them. I met people like Scott Alexander, who has the reputation of being the nicest guy in the world, but who was standoffish with me. Eliezer Yudkowsky was also wary of me. That isn't a criticism of them. Being suspicious of journalists is probably a wise starting point. There are some you can trust, but you don't want to be very open with everyone.

I turned up to some meetings, and a rationalist, Paul Crowley, very kindly took me under his wing. But I didn't get an all-access pass to the rationalist community. I'm more someone on the fringe of it.

I have a theory of why the rationalists are so wary. I think they have learned to be, because a lot of them lack a social filter. With that filter, somebody who thinks something is true doesn't say it out loud, because they might get cancelled on Twitter or people will react negatively. A lot of rationalists don't have this filter and are simply interested in finding the truth. Sometimes they will say something that seems true to them, but for which they get loads of negative reactions. They see journalists as an extension of that.

Of course the rationalist community is huge, and has many types of people in it. But this is a theme in their wariness. They saw it happen in the Scott Alexander case earlier this year, where many of them thought the New York Times was out to destroy Alexander for things he wrote on his blog."

Why is superintelligent AI a serious problem?

"Whether it's a serious problem or not is up for debate, but I think it's plausible. If you make an algorithm, and let it optimise for a certain value, then it won't care what you really want. The YouTube algorithm wants to maximise user click-through, or some attention rating. But this leads to people being suggested ever more radical content, until they are pushed into a range of conspiratorial views. That system is an AI, which has been given a goal, and figured out the best way to complete it. It does this, however, in a way that isn't what we as a society, or even YouTube as an organisation, want. But it’s nonetheless how the AI managed it.

In the book there's an example where researchers evolve AIs that play tic-tac-toe against each other. One of them started playing moves that were billions of squares away from the actual board. Its opponent then had to model moves over billions of squares, which it couldn't do, so it crashed and the first AI won by default. The programmers didn't want the AI to play like this, but it still did. You see all kinds of cases where AIs do the tasks someone wants them to do, but in a faulty way.

Now everyone wants to develop a general AI [an AI that can learn different tasks, compared to a narrow AI which can only do one task well], which over time could evolve into something that is more intelligent than humans. Yet as long as it's built in the way AI is currently built, it would still use this type of so-called reward function. It optimises for certain factors, like the YouTube click-through rate.

This is what Nick Bostrom calls the orthogonality thesis: the goals you give an AI are not related to its level of intelligence. You can give a highly intelligent AI a stupid task. A superintelligent AI that manages a paperclip factory, and is given the task of maximising paperclip production, will not automatically care about what humans care about. That can lead to unintended consequences. The AI might cover the entire earth in paperclips.

So the danger isn’t a god-like, Skynet AI, but rather a very smart AI with goals that lead to a range of unintended consequences because of the difference in alignment between the goals we give it, and the way it accomplishes them."
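[The gap between a stated goal and the intended one can be shown with a toy sketch. It is not from the interview or the book: the catalogue, the click-through rates and the "value to the viewer" scores below are all made up, and a real recommender is far more complex. It only shows that an optimiser handed a proxy picks whatever scores highest on that proxy.]

```python
# Hypothetical catalogue: (title, click-through rate, value to the viewer).
# Every number here is invented purely for illustration.
videos = [
    ("calm explainer", 0.04, 0.9),
    ("heated debate", 0.07, 0.5),
    ("conspiracy rabbit hole", 0.12, 0.1),
]


def recommend(catalogue, score):
    """Return the title of the item that maximises the given score function."""
    return max(catalogue, key=score)[0]


# What the system is told to optimise: click-through rate.
by_proxy = recommend(videos, score=lambda video: video[1])

# What its designers actually wanted: value to the viewer.
by_intent = recommend(videos, score=lambda video: video[2])

print("optimising the proxy picks:", by_proxy)
print("optimising the intent picks:", by_intent)
```

[Nothing in `recommend` is malicious. The two recommendations differ only because the score it was given differs from the one its designers had in mind.]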

Do you think work on existential risk, like AI safety research, is the biggest problem facing humanity right now, as is suggested by many in the rationalist or EA communities?

"This is a tricky one. If you follow the maths through, it's legitimate. People like Nick Bostrom calculated the huge number of people who would ever live, if we don't go extinct. So if you slightly reduce the small chance of humanity going extinct, then that's vastly more important than whatever you would otherwise do with your life. Maybe there's a very low chance of a malicious superintelligent AI existing, but it would have such a deep impact that it's actually rational to work on it.

I'm sympathetic to that argument. But I always remember Pascal's mugging here. It's a play on Pascal's wager, the argument that it's a good bet to believe in god, because you're exchanging a finite amount of praising god on earth for a potentially infinite reward in the afterlife. This is an oversimplification, but essentially it's a good idea to bet on god existing.

Pascal's mugging is a thought experiment, where someone comes up to you in the street and tells you to give them your wallet. They add that tomorrow they will bring it back with ten times as much money in it. Anyone will of course respond 'no' to this, because it's stupid. But the other person can then say they will bring it back with a billion, or even a trillion times as much money. At some point you can name a reward that is theoretically, on the utilitarian calculus, a good bet. Even when the chance of the mugger bringing the wallet back is very small.

This is obviously silly, but sometimes I wonder if we're falling into Pascal’s mugging with research on AI safety and existential risk. I see why people stay away from the weird, AI-related thing, when you can also donate to causes like the Against Malaria Foundation, where you know that every set amount of money will save a life.

That doesn't make it mad, and I think it's good some people are working on existential risk. But when I give money to Effective Altruism organisations, I give it to GiveWell, and I'm happy it goes to anti-malaria and de-worming instead of AI. AI safety research is important, but I don't think my money will make the difference for large research institutes, whereas it will in the fight against malaria."
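[The utilitarian calculus behind the mugging can be sketched in a few lines. This is an illustration under stated assumptions, not something from the interview: the wallet value of 100 and the one-in-a-million chance that the mugger returns are hypothetical, and the function names are invented here. The point is only that, for any fixed non-zero probability, a large enough promised multiplier makes the expected gain positive.]

```python
from fractions import Fraction


def expected_gain(wallet, multiplier, p_return):
    """Expected net gain from handing over the wallet.

    With probability p_return the mugger brings back wallet * multiplier;
    either way, the wallet itself is handed over first.
    """
    return p_return * wallet * multiplier - wallet


def break_even_multiplier(p_return):
    """Promised multiplier above which the naive expected gain turns positive."""
    return 1 / p_return


# Hypothetical numbers, chosen only for illustration.
wallet = 100
p_return = Fraction(1, 10**6)  # assumed one-in-a-million chance the mugger returns

for multiplier in (10, 10**9, 10**12):
    gain = expected_gain(wallet, multiplier, p_return)
    print(f"multiplier {multiplier}: expected gain {float(gain)}")

print("break-even multiplier:", break_even_multiplier(p_return))
```

[With these numbers a promise of ten times the money is a bad bet, while a promise of a billion or a trillion times is a good one on paper. That is exactly the step that feels silly.]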

What are some of your favourite thinkers you spoke to for the book?

"Of course Scott Alexander and Eliezer Yudkowsky are very clever. Yudkowsky particularly is an extremely brilliant guy who has a new idea and then wanders off to new areas. Scott Alexander is also brilliant, and doesn't need an introduction.

But rather than these two figureheads, I’d like to mention Paul Crowley and Anna Salamon. They are the two people where, if I disagree with something they say, my first reaction is: 'well, then I'm probably wrong.' Paul is incredibly wise and brilliant. And Anna is very good at guiding you through the thought processes of rationalism, and has obviously thought very deeply about these topics."

What do you think the bigger purpose is behind your journalism?

"I generally try to do two things. First I try to find things that are true, which sounds obvious, but isn't as widespread as you would believe. Almost all of the pieces I write aren't just: this is what I think. There's always a line of evidence, that I counterpose to another line of evidence. Someone once said that I never seem to reach a conclusion, which I thought was quite flattering. I try to balance the evidence as best I can, even though there's always uncertainty.

At the same time I try to persuade people. Which again sounds obvious, yet a lot of opinion journalism can be about telling people what they already think. It says to your side of the debate: 'here's something you already think, look how stupid people on the other side are.' Whereas I really want to say: 'I know you don't agree with this, but I want to persuade you why it's not wrong or evil to believe this.'

For example, I wrote something about unconscious bias training, and why it doesn't work. That is a very tricky thing to do, because it doesn't mean racism and discrimination aren't a problem. It's just that unconscious bias training doesn't do what it's supposed to do. I also worry about racism, but these policies aren’t going to solve it. So by looking at the evidence, hopefully I can change some people's minds, and move beliefs closer to something that is true.

Ironically, this is a good time, professionally speaking, for someone who writes about topics like how much we know, and how we know things. Since COVID-19 our daily lives revolve around how much weight we must attach to a statistic or a scientific finding. The whole world is going through a rapid education in statistical uncertainty and scientific methodology. Suddenly people care about Bayes' theorem or false positive rates. The pandemic has shown that it matters what is true."

Jonas_, Feb 21 2021

I think this post uses the term "Pascal's mugging" incorrectly, and I've seen this mistake frequently, so I thought I'd leave a comment.

Pascal's mugging refers to scenarios with tiny probabilities (less than 1 in a trillion or so) of vast utilities (potentially higher than the largest utopia/dystopia that could be achieved in the reachable universe), and presents a decision-theoretic problem. There is some discussion in "Tiny Probabilities of Vast Utilities: A Problem for Long-Termism?" and "Pascal's Muggle: Infinitesimal Priors and Strong Evidence". Quoting from the first of those pieces:

Yet it would also be naive to say things like “Long-termists are victims of Pascal’s Mugging.”

I think the correct term for the issue you're describing might be something like "cause robustness" or "conjunctive arguments" or similar.

Effective Altruism Forum