You can see a full transcript of this conversation on our website.
Summary
We spoke with Adam Gleave on August 27, 2019. Here is a brief summary of that conversation:
- Gleave gives a number of reasons why it’s worth working on AI safety:
- It seems like the AI research community currently isn’t paying enough attention to building safe, reliable systems.
- There are several unsolved technical problems that could plausibly occur in AI systems without much advance notice.
- A few additional people working on safety may be extremely high leverage, especially if they can push the rest of the AI research community to pay more attention to important problems.
- Gleave thinks there’s a ~10% chance that AI safety is very hard in the way that MIRI would argue, a ~20-30% chance that AI safety will almost certainly be solved by default, and a remaining ~60-70% chance that what we’re working on actually has some impact.
- Here are the reasons for Gleave’s beliefs, weighted by how much they factor into his holistic viewpoint:
- 40%: The traditional arguments for risks from AI are unconvincing:
- Traditional arguments often make an unexplained leap from having superintelligent AIs to superintelligent AIs being catastrophically bad.
- It’s unlikely that AI systems not designed from mathematical principles are going to inherently be unsafe.
- The arguments are long chains of heuristic reasoning, with little empirical validation.
- Outside view: most fears about technology have been misplaced.
- 20%: The AI research community will solve the AI safety problem naturally.
- 20%: AI researchers will be more interested in AI safety when the problems are nearer.
- 10%: The hard, MIRI version of the AI safety problem is not very compelling.
- 10%: AI safety problems that seem hard now will be easier to solve once we have more sophisticated ML.
- Fast takeoff, defined as “GDP will double in 6 months before it doubles in 24 months”, is plausible, though Gleave still leans towards slow takeoff.
- Gleave thinks discontinuous progress in AI is extremely unlikely:
- There is unlikely to be a sudden important insight dropped into place, since AI has empirically progressed more by accumulation of a large bag of tricks and compute.
- There isn’t going to be a sudden influx of compute in the near future, since well-funded organizations are currently already spending billions of dollars to optimize it.
- If we train impressive systems, we will likely train other systems beforehand that are almost as capable.
- Given discontinuous progress, the most likely story is that we combine many narrow AI systems in a way where the integrated whole is much better than half of them.
- Gleave guesses a ~10-20% chance that AGI technology will only be a small difference away from current techniques, and a ~50% chance that AGI technology will be easily comprehensible to current AI researchers:
- There are fairly serious roadblocks in current techniques right now, e.g. memory, transfer learning, Sim2Real, sample inefficiency.
- Deep learning is slowing down compared to 2012 – 2013:
- Much of the new progress is going to different domains, e.g. deep RL instead of supervised deep learning.
- Computationally expensive algorithms will likely hit limits without new insights.
- Though it seems possible that in fact progress will come from more computationally efficient algorithms.
- Outside view, we’ve had lots of different techniques for AI over time, so it would be surprising if the current one is the right one for AGI.
- Pushing in the other direction, towards current techniques getting to AGI: from an economic point of view, there is a lot of money going into companies whose current mission is to build AGI.
- Conditional on advanced AI technology being created, Gleave gives a 60-70% chance that it will pose a significant risk of harm without additional safety efforts.
- Gleave thinks that in the best case, we drive this chance down to 10-20%, and in the median case, to 30-40%. A lot of his uncertainty comes from how difficult the problem is.
- Gleave thinks he could see evidence that could push him in either direction in terms of how likely AI is to be safe:
- Evidence that would cause Gleave to think AI is less likely to be safe:
- Evidence that thorny but speculative technical problems, like inner optimizers, exist.
- Seeing more arms race dynamics, e.g. between U.S. and China.
- Seeing major catastrophes involving AI, though they would also cause people to pay more attention to risks from AI.
- Hearing more solid arguments for AI risk.
- Evidence that would cause Gleave to think AI is more likely to be safe:
- Seeing AI researchers spontaneously focus on relevant problems.
- Getting evidence that AGI was going to take longer to develop.
- Gleave is concerned that he doesn’t understand why members of the safety community come to widely different conclusions when it comes to AI safety.
- Gleave thinks a potentially important question is the extent to which we can successfully influence field building within AI safety.
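In Takeoff speeds, Christiano operationalises a slow takeoff as one where there is a complete 4 year interval in which world output doubles before the first 1 year interval in which world output doubles, and a fast takeoff as one where there isn't a complete 4 year interval before the first 1 year interval.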
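
The doubling-time definitions of slow and fast takeoff can be checked mechanically against an output series. Below is a minimal sketch in Python. It assumes output is sampled at evenly spaced intervals, and it reads "complete" as meaning that the slow doubling interval has already ended by the time the first fast doubling interval starts. The function names and the sample series are illustrative assumptions and do not come from Takeoff speeds or from the conversation.

```python
from typing import Optional, Sequence


def first_doubling_start(output: Sequence[float], window: int) -> Optional[int]:
    """Return the first index i where output at least doubles over `window`
    samples (output[i + window] >= 2 * output[i]), or None if it never does."""
    for i in range(len(output) - window):
        if output[i + window] >= 2 * output[i]:
            return i
    return None


def classify_takeoff(output: Sequence[float],
                     slow_window: int = 4,
                     fast_window: int = 1) -> str:
    """Classify a series as "slow", "fast", or "undetermined".

    Windows are measured in samples. With yearly data, the defaults match
    the 4 year / 1 year operationalisation.
    """
    slow_start = first_doubling_start(output, slow_window)
    fast_start = first_doubling_start(output, fast_window)

    if fast_start is None:
        # No fast doubling observed yet. If a slow doubling has already
        # completed, any later fast doubling must come after it.
        return "slow" if slow_start is not None else "undetermined"

    # Assumption: the slow interval must end no later than the start of
    # the first fast interval.
    if slow_start is not None and slow_start + slow_window <= fast_start:
        return "slow"
    return "fast"


# Made-up yearly output figures, for illustration only.
gradual = [100, 120, 150, 190, 250, 350, 520, 1100]
abrupt = [100, 103, 106, 109, 112, 115, 240]
flat = [100, 103, 106, 109, 112]

print(classify_takeoff(gradual))
print(classify_takeoff(abrupt))
print(classify_takeoff(flat))
```

With quarterly samples, the "6 months before 24 months" variant used in the summary would correspond to `slow_window=8` and `fast_window=2` under the same assumptions.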