
Crossposted from the Global Priorities Project

Co-written by Owen Cotton-Barratt and Toby Ord

There are several different kinds of artificial general intelligence (AGI) which might be developed, and there are different scenarios which could play out after one of them reaches a roughly human level of ability across a wide range of tasks. We shall discuss some of the implications we can see for these different scenarios, and what that might tell us about how we should act today.

A key difference between different types of post-AGI scenario is the ‘speed of takeoff’. This could be thought of as the time between first reaching a near human-level artificial intelligence and reaching one that far exceeds our capacities in almost all areas (or reaching a world where almost all economically productive work is done by artificial intelligences). In fast takeoff scenarios, this might happen over a scale of months, weeks, or days. In slow takeoff scenarios, it might take years or decades. There has been considerable discussion about which speed of takeoff is more likely, but less discussion about which is more desirable and what that implies.

Are slow takeoffs more desirable?

There are a few reasons to think that we’re more likely to get a good outcome in a slow takeoff scenario.

First, safety work today has an issue of nearsightedness. Since we don’t know quite what form artificial intelligence will eventually take, specific work today may end up being of no help on the problem we eventually face. In a slow takeoff scenario, there would be a period of time in which AGI safety researchers had a much better idea of the nature of the threat and were able to optimise their work accordingly. This could make their work several times more valuable.

Second, and perhaps more crucially, in a slow takeoff the concerns about AGI safety are likely to spread much more widely through society. It is easy to imagine this producing widespread societal support at a level at or exceeding that for work on climate change, because the issue would be seen to be imminent. This could translate to much more work on securing a good outcome -- perhaps hundreds of times the total which had previously been done. Although there are some benefits to having work done serially rather than in parallel, these are likely to be overwhelmed by the sheer quantity of extra high-quality work which would attack the problem. Furthermore, the slower the takeoff, the more of this additional work can also be done serially.

A third key factor is that a slow takeoff seems more likely to lead to a highly multipolar scenario. If AGI has been developed commercially, the creators are likely to license out copies for various applications. Moreover, a slow takeoff could give competitors enough time to bring alternatives up to speed.

We don’t think it’s clear whether multipolar outcomes are overall a good thing, but we note that they have some advantages. In the short term they are likely to preserve something closer to the existing balance of power, which gives more time for work to ensure a safe future. They are additionally less sensitive to the prospect of a treacherous turn or of any single-point failure mode in an AGI.

Strategic implications

If we think that there will be much more time for safety work in slow takeoff scenarios, there seem to be two main implications:

First, when there is any chance to influence matters, we should generally push towards slow takeoff scenarios. They are likely to have much more safety work done, and this is a large factor which could easily outweigh our other information about the relative desirability of the scenarios.

Second, we should generally focus safety research today on fast takeoff scenarios. Since there will be much less safety work in total in these scenarios, extra work is likely to have a much larger marginal effect. This can be seen as hedging against a fast takeoff even if we think it is undesirable.
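To illustrate this marginal-effect argument concretely (a minimal sketch with made-up numbers, not figures from the post), suppose the value of safety work grows only logarithmically with the total amount done. Then the same extra unit of work is worth far more in the scenario where much less total work gets done:

```python
import math

def marginal_value(total_work, extra=1.0):
    """Value of `extra` units of safety work, assuming (purely for
    illustration) that value grows logarithmically with total work done."""
    return math.log(total_work + extra) - math.log(total_work)

# Hypothetical totals: a fast takeoff leaves only the work done in advance
# (say 100 units), while a slow takeoff lets society add a hundred times more.
fast_total, slow_total = 100, 10_000

print(marginal_value(fast_total))  # ~0.00995
print(marginal_value(slow_total))  # ~0.0001
```

Under these assumptions, an extra unit of work today is roughly a hundred times more valuable if a fast takeoff occurs, which is the sense in which early work hedges against that scenario.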

Overall it seems to us that the AGI safety community has internalised the second point, and sensibly focused on work addressing fast takeoff scenarios. It is less clear that we have appropriately weighed the first point. Either of these points could be strengthened or outweighed by a better understanding of the relevant scenarios.

For example, it seems that neuromorphic AGI would be much harder to understand and control than an AGI with a much clearer internal architecture. So conditional on a fast takeoff, it would be bad if the AGI were neuromorphic. People concerned with AGI safety have argued against a neuromorphic approach on these grounds. However, precisely because it is opaque, neuromorphic AGI may be less able to perform fast recursive self-improvement, and this would decrease the chance of a fast takeoff. Given how much better a slow takeoff appears, we should perhaps prefer neuromorphic approaches.

In general, the AGI safety community focuses much of its attention on recursive self-improvement approaches to designing a highly intelligent system. We think that this makes sense in as much as it draws attention to the dangers of fast takeoff scenarios and hedges against being in one, but we would want to take care not to promote the approach to those considering designing an AGI. Drawing attention to the power of recursive self-improvement could end up being self-defeating if it encourages people to design such systems, producing a faster takeoff.

In conclusion, it seems that when doing direct technical safety work, it may be reasonable to condition on a fast takeoff, as that is the scenario where our early work matters most. When choosing strategic direction, however, it is a mistake to condition on a fast takeoff, precisely because our decisions may affect the probability of a fast takeoff.

Thanks to Daniel Dewey for conversations and comments.

Comments (4)

“Second, we should generally focus safety research today on fast takeoff scenarios. Since there will be much less safety work in total in these scenarios, extra work is likely to have a much larger marginal effect.”

Does this assumption depend on how pessimistic/optimistic one is about our chances of achieving alignment in different take-off scenarios, i.e. what our position on a curve something like this is expected to be for a given takeoff scenario?

I think you get an adjustment from that, but that it should be modest. None of the arguments we have so far about how difficult to expect the problem to be seem very robust, so I think it's appropriate to have a somewhat broad prior over possible difficulties.

I think the picture you link to is plausible if the horizontal axis is interpreted as a log scale. But this changes the calculation of marginal impact quite a lot, so that you probably get more marginal impact towards the left than in the middle of the curve. (I think it's conceivable to end up with well-founded beliefs that look like that curve on a linear scale, but that this requires (a) very good understanding of what the problem actually is, & (b) justified confidence that you have the correct understanding.)
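To make this concrete, here is a minimal numerical sketch (the curve is arbitrary and stands in for the linked picture, which is not reproduced here): if the probability of a good outcome is logistic in the logarithm of total safety work, the marginal impact per unit of work is highest well to the left of the curve rather than at its midpoint:

```python
import math

def p_good_outcome(work, midpoint=1000.0, steepness=1.0):
    """Hypothetical success curve: logistic in log(total safety work).
    The shape and parameters are arbitrary, purely for illustration."""
    return 1.0 / (1.0 + math.exp(-steepness * (math.log(work) - math.log(midpoint))))

def marginal_impact(work, extra=1.0):
    """Gain in success probability from one extra unit of work."""
    return p_good_outcome(work + extra) - p_good_outcome(work)

for total in (10, 100, 1_000, 10_000):
    print(total, marginal_impact(total))
# Marginal impact per unit of work falls as total work grows: each unit is
# a larger proportional step when little work has been done so far.
```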

It's great you're thinking about these issues.

I agree that AGI safety is plausibly the dominating consideration regarding takeoff speed. Thus, whether one wants a faster or slower takeoff depends on whether one wants safe AGI (which is not a completely trivial question, http://foundational-research.org/robots-ai-intelligence-explosion/#Would_a_human_inspired_AI_or_rogue_AI_cause_more_suffering , though I think it's likely safe AI is better for most human values).

And yes, neuromorphic AGI seems likely to be safer, both because it may be associated with a slow takeoff and because we understand how humans work, how to balance power with them, and so on. Arbitrary AGIs with alien motivational and behavioral systems are more unpredictable. In the long run, if you want goal preservation, you probably need AGI that's different from the human brain, but goal preservation is arguably less of a concern in the short run; knowledge of how to do goal preservation will come with greater intelligence. In any case, neuromorphic AGIs are much more likely to have human-like values than arbitrary AGIs. We don't worry that much about goal preservation with subsequent generations of humans because they're pretty similar to us (though old conservatives are often upset with the moral degeneration of society caused by young people).

I agree that multipolar power dynamics could be bad relative to a quick monopoly by one group, because they might lead to arms races and conflict. On the other hand, they might allow for more representation by different parties.

Overall, I think the odds of a fast takeoff are sufficiently low that I'm not convinced it makes sense to focus on fast-takeoff work (even if some such exploration is worthwhile). There may be important first-mover advantages to shaping how society approaches slow takeoffs, and if slow takeoff is sufficiently probable, those may dominate in impact. In any case, the fast-slow distinction is not binary, and maybe the best place to focus is on scenarios where human-level AI takes over on a time scale of a few years. (Timescales of months, days, or hours strike me as pretty improbable, unless, say, Skynet gets control of nuclear weapons.)

Thanks, good comments.

Which kind of work it's better to focus on depends on the relative leverage you think you have in either case, combined with the likelihoods of the different scenarios. I plan to try a more quantitative analysis, which investigates what ranges of empirical beliefs about these factors correspond to which kind of work being best now. We could then try to gather some data on estimates (and variance in estimates) of these key values.
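A very rough sketch of how such an analysis might be set up (all names and numbers here are hypothetical placeholders for the estimates to be gathered, not anything from the discussion above): compare the expected impact of focusing current work on each scenario, given a probability of fast takeoff and the relative marginal leverage of work in each case.

```python
def better_focus(p_fast, leverage_fast, leverage_slow):
    """Compare the expected impact of focusing today's safety work on
    fast- vs slow-takeoff scenarios. All inputs are placeholder estimates."""
    ev_fast = p_fast * leverage_fast        # pays off only if takeoff is fast
    ev_slow = (1 - p_fast) * leverage_slow  # pays off only if takeoff is slow
    return "fast-takeoff work" if ev_fast > ev_slow else "slow-takeoff work"

# Example: even a 20% chance of fast takeoff favours fast-takeoff work if
# marginal leverage there is ten times higher (because far less total
# safety work would be done in that scenario).
print(better_focus(p_fast=0.2, leverage_fast=10.0, leverage_slow=1.0))
```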
