On Twitter, Katja Grace wrote:

I think people should think more about trying to slow down AI progress, if they believe it's going to destroy the world soon. I know people have like eighteen reasons to dismiss this idea out of hand, but I dispute them.

The introduction to the post is below. Do read the whole thing.

Consider reading alongside:

Averting doom by not building the doom machine

If you fear that someone will build a machine that will seize control of the world and annihilate humanity, then one kind of response is to try to build further machines that will seize control of the world even earlier without destroying it, forestalling the ruinous machine’s conquest. An alternative or complementary kind of response is to try to avert such machines being built at all, at least while the degree of their apocalyptic tendencies is ambiguous. 

The latter approach seems to me like the kind of basic and obvious thing worthy of at least consideration, and also in its favor, fits nicely in the genre ‘stuff that it isn’t that hard to imagine happening in the real world’. Yet my impression is that for people worried about extinction risk from artificial intelligence, strategies under the heading ‘actively slow down AI progress’ have historically been dismissed and ignored (though ‘don’t actively speed up AI progress’ is popular).

The conversation near me over the years has felt a bit like this: 

Some people: AI might kill everyone. We should design a godlike super-AI of perfect goodness to prevent that.

Others: wow that sounds extremely ambitious

Some people: yeah but it’s very important and also we are extremely smart so idk it could work

[Work on it for a decade and a half]

Some people: ok that’s pretty hard, we give up

Others: oh huh shouldn’t we maybe try to stop the building of this dangerous AI? 

Some people: hmm, that would involve coordinating numerous people—we may be arrogant enough to think that we might build a god-machine that can take over the world and remake it as a paradise, but we aren’t delusional

This seems like an error to me. (And lately, to a bunch of other people.) 

I don’t have a strong view on whether anything in the space of ‘try to slow down some AI research’ should be done. But I think a) the naive first-pass guess should be a strong ‘probably’, and b) a decent amount of thinking should happen before writing off everything in this large space of interventions. Whereas customarily the tentative answer seems to be, ‘of course not’ and then the topic seems to be avoided for further thinking. (At least in my experience—the AI safety community is large, and for most things I say here, different experiences are probably had in different bits of it.)

Maybe my strongest view is that one shouldn’t apply such different standards of ambition to these different classes of intervention. Like: yes, there appear to be substantial difficulties in slowing down AI progress to good effect. But in technical alignment, mountainous challenges are met with enthusiasm for mountainous efforts. And it is very non-obvious that the scale of difficulty here is much larger than that involved in designing acceptably safe versions of machines capable of taking over the world before anyone else in the world designs dangerous versions. 

I’ve been talking about this with people over the past many months, and have accumulated an abundance of reasons for not trying to slow down AI, most of which I’d like to argue about at least a bit. My impression is that arguing in real life has coincided with people moving toward my views.



Thanks for writing this, Katja, and Peter for sharing.

Agree with a lot of the specific points, though I found the title/thesis somewhat incongruent with the content. 

The various instances (?) of "slowing down AI" that you talk about seem pretty different in nature to me, and not all seem like they really are about slowing down AI in the sense that I/my colleagues might construe it.

Reducing compute investment, gating compute access in some way, getting people to switch from capabilities work to safety work, increasing restraint on deployment, coordinating on best practices, raising the alarm, etc. seem to be pretty diverse things. And many are happening already. Some could reasonably count as "slowing down" to me, while others don't (I'd maybe define slowing down AI as something like "reducing the rate of improvement in peak and/or average AI capabilities in the world, regardless of whether or how they are deployed").

I agree that there is relatively little work on slowing AI in the above sense and that there are some misconceptions about it, but many of the arguments you make address misconceptions about the feasibility and value of AI policy generally, not slowing down specifically.

I think restraining deployment would reduce the rate of improvement in peak AI capability in the world, via reduced funding. Do you think otherwise? How does that work?

Is it 1. restraining deployment won't reduce funding, or 2. restraining deployment would reduce funding, but reduced funding won't reduce the rate of improvement, or 3. restraining deployment would reduce funding, and reduced funding would reduce the rate of improvement, but still the whole argument doesn't work for reason X? 

A few aspects of my model: 

- Compute cost reduction is important for driving AI capabilities forward (among other things), and historically is mostly driven by things other than deployment of the most powerful systems (general semiconductor progress/learning curves, spillover from videogame-related investments, deployment of systems other than the most powerful ones, e.g. machine translation, speech recognition, etc.). This may be changing as the share of NVIDIA's datacenter revenue increases and more companies deploy powerful LMs, but for a long time this was the case.

- Other drivers of AI progress, such as investment in new algorithms via hiring people at top labs, algorithmic progress enabled by more compute, and increasing the amount of money spent on compute, are only somewhat tied to deployment of the most powerful models. See e.g. all of DeepMind's value provided to Google via non-cutting-edge (or fairly domain-specific) things like speech synthesis, as well as claims that it is a long-term investment rather than something which requires immediate revenue. Again, these things may change over time, but they have been true for some time and still have at least some truth.

- Restraint is not all or nothing, e.g. given deployment of some system, it can be deployed more or less safely, there can be more or less alignment across companies on best practices, etc. And on the current margin, doing better w.r.t. safety is mostly bottlenecked by good ideas and people to execute on those ideas, rather than adjusting the "safety vs. speed" knob (though that's relevant to an extent, too). Given that situation, I think there is a lot of marginal additional restraint to be done without preventing deployment or otherwise significantly compromising lab interests (again, this could change eventually but right now I see plenty of opportunity to do "restraint-y" things that don't entail stopping all deployment). 

Is it your sense that A.I. researchers are genuinely working towards building an A.I. meant to "seize control", but benevolently? My first reaction is that that sounds extremely dangerous and sinister.