Let’s think about slowing down AI

Katja_Grace

Lizka

I'm curating this post. (See also the comments on LessWrong.)

Some sections that stood out to me (turns out it's lots of sections!):

- Restraint is not radical (and subsections, especially "Extremely valuable technologies") — great section.
- Restraint is not terrorism, usually — great section. I really appreciated the list of types of work that could be part of slowing down AI and think more people should read the list.
- The complicated race/anti-race — the example that started, "For example, here’s a situation which I think sounds intuitively like a you-should-race world, but where in the first model above, you should actually go as slowly as possible" was really useful (and the spreadsheet is great).
- Caution is cooperative. This was an interesting argument that I had somehow not seen or thought about before: "It could be that people in control of AI capabilities would respond negatively to AI safety people pushing for slower progress. But that should be called ‘we might get punished’ not ‘we shouldn’t defect’. ‘Defection’ has moral connotations that are not due. [...] On top of all that, I worry that highlighting the narrative that wanting more cautious progress is defection is further destructive, because it makes it more likely that AI capabilities people see AI safety people as thinking of themselves as betraying AI researchers, if anyone engages in any such efforts. Which makes the efforts more aggressive. Like, if every time you see friends, you refer to it as ‘cheating on my partner’, your partner may reasonably feel hurt by your continual desire to see friends, even though the activity itself is innocuous."
- ‘We’ are not the US, ‘we’ are not the AI safety community — I've had conversations related to this that were pretty confused, and really appreciate seeing this written out.
  - Some excerpts:
    - "The starkest appearance of error along these lines to me is in writing off the slowing of AI as inherently destructive of relations between the AI safety community and other AI researchers. If we grant that such activity would be seen as a betrayal (which seems unreasonable to me, but maybe), surely it could only be a betrayal if carried out by the AI safety community. There are quite a lot of people who aren’t in the AI safety community and have a stake in this, so maybe some of them could do something. It seems like a huge oversight to give up on all slowing of AI progress because you are only considering affordances available to the AI Safety Community."
    - "I more weakly suspect some related mental shortcut is misshaping the discussion of arms races in general. The thought that something is a ‘race’ seems much stickier than alternatives, even if the true incentives don’t really make it a race. Like, against the laws of game theory, people sort of expect the enemy to try to believe falsehoods, because it will better contribute to their racing. And this feels like realism. The uncertain details of billions of people one barely knows about, with all manner of interests and relationships, just really wants to form itself into an ‘us’ and a ‘them’ in zero-sum battle. This is a mental shortcut that could really kill us."
  - Somewhat relatedly, I think EA should taboo "EA should"
- Convincing people doesn't seem that hard — seems to hit & provide evidence for a position on a real crux. (As a side note, "[modern AI systems] are random connections jiggled in [a] gainful direction unfathomably many times, just as mysterious to their makers" is a great way to describe it.)
  - See also discussion on LW, e.g. this comment (and this resource)
- Technological choice is not luddism - I've seen the argument made (or the heuristic evoked) and appreciate this note
- Cheems mindset/can’t do attitude — I hadn't heard this named, I think (except in narrower cases like learned helplessness etc.), and intuitively agree with its application here.

Miles_Brundage

Noting that in addition to the LW discussion linked below, there's also some discussion on an earlier EA Forum post here: https://forum.effectivealtruism.org/posts/sFemFbiFTntgtQDbD/katja-grace-let-s-think-about-slowing-down-ai

Ramiro

There's also something like an optics problem... at least for outsiders (by which I mean most people, including myself), when an AI developer voices concerns over AI safety / ethics and then develops an application without having those issues solved, I feel tempted to conclude that either it's a case of insincerity (and talking about AI safety is a case of ethics washing, or of attracting talent without increasing compensation)... or people are willingly courting doom.

Sharmake

I disagree with the thrust of this post (that we should slow down AI), but I do agree with the object-level arguments, and thus I think it's worthy of curation despite my slightly opposing AI slowdowns.

To quote Rohin Shah on LW:

1. It makes it easier for a future misaligned AI to take over by increasing overhangs, both via compute progress and algorithmic efficiency progress. (This is basically the same sort of argument as "Every 18 months, the minimum IQ necessary to destroy the world drops by one point.")

2. Such strategies are likely to disproportionately penalize safety-conscious actors.

(As a concrete example of (2), if you build public support, maybe the public calls for compute restrictions on AGI companies and this ends up binding the companies with AGI safety teams but not the various AI companies that are skeptical of “AGI” and “AI x-risk” and say they are just building powerful AI tools without calling it AGI.)

For me personally there's a third reason, which is that (to first approximation) I have a limited amount of resources and it seems better to spend that on the "use good alignment techniques" plan rather than the "try to not build AGI" plan. But that's specific to me.

Sharmake

I'd like to ask a few questions about slowing down AGI as they may turn out to be cruxes for me.

1. How popular/unpopular is AI slowdown? Ideally, we'd get AI slowdown/AI progress/Neutral as choices in a poll. I also ideally would like different framings of the problem, to test how much framing affects people's choices. But I do want at least a poll on how popular/unpopular AI slowdown is.
2. How much does the government want AI to be slowed down? Is Trevor's story correct that the US government is not willing to countenance AI slowdown, and that wanting to speed it up is instead the norm in interacting with the government?
3. How much will AI companies lobby against AI slowdown? Because if this is a repeat of the fossil fuel situation where AI is viewed by the public as extremely good, I probably would not support too much object work in AI governance, and instead go meta. In other words, I'd be doing more meta things. But if AI companies support AI slowdown or at least do not oppose it, then things could be okay, depending on the answers to 1 and 2.

Geofrey Junior Waako

AI is a potentially destructive phenomenon if it is not well managed and, as with all historical disruptors, Africa has always been severely affected by such developments. We are still on the journey of adapting to, adopting, and learning the current state-of-the-art technologies, and yet AI is growing at a jet-like speed. I believe that more sensitization in Africa is a viable stepping stone to addressing the challenges that come with Artificial Intelligence.

Denis

I've just been reading this post as part of the BlueDot AI Safety Fundamentals training.

I am very sympathetic to the thinking behind this, and at the same time very conscious of the challenges.

It's a feeling that I get over and over as I learn more about AI Safety - that there are things that the world "should" be doing, that the world would be doing if it were a well-run corporation with a competent CEO (or a benevolent dictator ...), but that these things are unlikely to happen in the real world because we have such difficulties agreeing on principles and then deciding quickly on tangible actions (unless they involve using very expensive weapons to attack people, when we tend to decide much faster ...).

I know this article will cause some great discussions in our next BlueDot cohort meetings, and the 330+ upvotes clearly show you've really got a lot of people thinking about this.

But I would like to comment on another aspect of this post: It is absolutely beautifully written.

It makes the points in crystal-clear, simple language. The structure is logical. There are countless examples to ensure each point is well understood. If I were to disagree with anything, it would be very easy to pinpoint exactly what I'm disagreeing with, because it is so well-structured.

But more than that, even, it's just a joy to read, with even the occasional joke to keep the readers on their toes. I never imagined I'd reach the end of a 45-minute read and almost wish it were longer.

Most posts on the EA Forum are very well written. The standards for clarity and coherence and precision are very high. But a few, like this one, are just beautiful, and make me wish that they could be shared with a much broader audience, beyond the EA community.

I know that great writing takes work, so thank you for this post!

DPiepgrass

So there's these people we call the "AGI alignment community". This privileges "alignment" as the intervention of choice.

I propose calling it the "AGI c-risk community" instead (c-risk = catastrophic risk), or "AGI risk community" for short. [Edit: on second thought: "AGI safety"]

Jeroen Willems🔸

This is a great and interesting post! Thanks for sharing. I thought Scott's arguments were really convincing but you updated me away from them. Some small notes:

Under 'Convincing people doesn't seem that hard': "I don’t remember ever having any trouble discussing AI risk with random strangers." We have a wildly different experience! I feel like every time I try to explain it to friends or family they think I'm crazy. They don't believe it at all. But perhaps I'm just really bad at explaining it. This is why I'm pretty pessimistic that it's easy to convince people. I still don't want to give up on it though.

"I arrogantly think I could write a broadly compelling and accessible case for AI risk" Please do this! I would love to see it. We need more easily accessible introductions to AI risk. If it can help me become good at explaining the issue, that would be amazing.

A question that's perhaps a little less relevant: I think Scott made a metaphor once that AI safety folks shouldn't be like "climate activists" fighting against "fossil fuel companies" (AI capabilities folks). If coordination is possible, what would be a good metaphor? Are there other industries with capabilities and safety people working together?

Let’s think about slowing down AI

Averting doom by not building the doom machine

Quick clarifications

Why not slow down AI? Why not consider it?

The mundanity of the proposal

Restraint is not radical

Sucky technologies

Extremely valuable technologies

Restraint is not terrorism, usually

Coordination is not miraculous world government, usually

The arms race model and its alternatives

The arms race

The suicide race

The safety-or-suicide race

The complicated race/anti-race

Other equilibria and other games

Being friends with risk-takers

Caution is cooperative

‘We’ are not the US, ‘we’ are not the AI safety community

Notes on tractability

Convincing people doesn’t seem that hard

Do you need to convince everyone?

Buying time is big

Delay is probably finite by default

Obstruction doesn’t need discernment

Safety from speed, clout from complicity

Moods and philosophies, heuristics and attitudes

Technological choice is not luddism

Non-AGI visions of near-term thriving

Robust priors vs. specific galaxy-brained models

Cheems mindset/can’t do attitude

Conclusion

Acknowledgements

Notes