This is a linkpost for https://aiimpacts.org/conversation-with-adam-gleave/
You can see a full transcript of this conversation on our website.
Summary
We spoke with Adam Gleave on August 27, 2019. Here is a brief summary of that conversation:
- Gleave gives a number of reasons why it’s worth working on AI safety:
- It seems like the AI research community currently isn’t paying enough attention to building safe, reliable systems.
- There are several unsolved technical problems that could plausibly occur in AI systems without much advance notice.
- A few additional people working on safety may be extremely high leverage, especially if they can push the rest of the AI research community to pay more attention to important problems.
- Gleave thinks there’s a ~10% chance that AI safety is very hard in the way that MIRI would argue, a ~20-30% chance that AI safety will basically be solved by default, and a remaining ~60-70% chance that what we’re working on actually has some impact.
- Here are the reasons for Gleave’s beliefs, weighted by how much they factor into his holistic viewpoint:
- 40%: The traditional arguments for risks from AI are unconvincing:
- Traditional arguments often make an unexplained leap from having superintelligent AIs to superintelligent AIs being catastrophically bad.
- It’s unlikely that AI systems not designed from mathematical principles are going to inherently be unsafe.
- These arguments are long chains of heuristic reasoning, with little empirical validation.
- Outside view: most fears about technology have been misplaced.
- 20%: The AI research community will solve the AI safety problem naturally.
- 20%: AI researchers will be more interested in AI safety when the problems are nearer.
- 10%: The hard, MIRI version of the AI safety problem is not very compelling.
- 10%: AI safety problems that seem hard now will be easier to solve once we have more sophisticated ML.
- Fast takeoff, defined as "GDP will double in 6 months before it doubles in 24 months", is plausible, though Gleave still leans towards slow takeoff.
- Gleave thinks discontinuous progress in AI is extremely unlikely:
- There is unlikely to be a sudden important insight that drops into place, since AI has empirically progressed more by accumulating lots of bags of tricks and compute.
- There isn’t going to be a sudden influx of compute in the near future, since well-funded organizations are already spending billions of dollars on it.
- If we train impressive systems, we will likely train other systems beforehand that are almost as capable.
- Given discontinuous progress, the most likely story is that we combine many narrow AI systems in a way where the integrated whole is much better than the sum of its parts.
- Gleave guesses a ~10-20% chance that AGI technology will be only a small step beyond current techniques, and a ~50% chance that AGI technology will be easily comprehensible to current AI researchers:
- There are fairly serious roadblocks in current techniques right now, e.g. memory, transfer learning, Sim2Real, sample inefficiency.
- Deep learning is slowing down compared to 2012 – 2013:
- Much of the new progress is going to different domains, e.g. deep RL instead of supervised deep learning.
- Computationally expensive algorithms will likely hit limits without new insights.
- Though it seems possible that in fact progress will come from more computationally efficient algorithms.
- Outside view: we’ve had lots of different techniques for AI over time, so it would be surprising if the current one is the right one for AGI.
- Pushing in the other direction, towards current techniques getting to AGI: from an economic point of view, there is a lot of money going into companies whose current mission is to build AGI.
- Conditional on advanced AI technology being created, Gleave gives a 60-70% chance that it will pose a significant risk of harm without additional safety efforts.
- Gleave thinks that in the best case we drive this risk down to 10-20%, and in the median case down to 30-40%. A lot of his uncertainty comes from how difficult the problem is.
- Gleave could see evidence that would push him in either direction on how likely AI is to be safe:
- Evidence that would cause Gleave to think AI is less likely to be safe:
- Evidence that thorny but speculative technical problems, like inner optimizers, exist.
- Seeing more arms race dynamics, e.g. between U.S. and China.
- Seeing major catastrophes involving AI, though they would also cause people to pay more attention to risks from AI.
- Hearing more solid arguments for AI risk.
- Evidence that would cause Gleave to think AI is more likely to be safe:
- Seeing AI researchers spontaneously focus on relevant problems.
- Getting evidence that AGI was going to take longer to develop.
- Gleave is concerned that he doesn’t understand why members of the safety community come to widely different conclusions when it comes to AI safety.
- Gleave thinks a potentially important question is the extent to which we can successfully influence field building within AI safety.
I'm confused about this point. Did Adam Gleave explicitly say that he thinks discontinuous progress is "extremely unlikely" (or something to this effect)?
From the transcript I get a sense of a less confident estimate being made:
Does anyone have the original version of this? The transcript says that Adam is (perhaps incorrectly) paraphrasing Paul Christiano.
I think I have some intuition about what this is getting at, but I don't think this statement is precisely correct (surely GDP has to double in 24 months before it doubles in six months, as long as there is at least 24 months of data).
Christiano operationalises a slow takeoff as

> There will be a complete 4 year interval in which world output doubles, before the first 1 year interval in which world output doubles.

in Takeoff speeds, and a fast takeoff as one where there isn't a complete 4 year interval before the first 1 year interval.
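To make that operationalisation concrete, here is a minimal sketch (my own illustration, not from the conversation or from Christiano's post, with made-up growth numbers and an assumed reading of "before" as the 4-year doubling interval ending no later than the start of the first 1-year doubling interval), applied to a toy yearly GDP series:

```python
# Minimal sketch of the "Takeoff speeds" operationalisation on toy data.
# Slow takeoff: a complete 4-year interval in which output doubles finishes
# before the first 1-year interval in which output doubles begins.
# (This reading of "before", and all numbers below, are my own assumptions.)

def first_doubling_end(gdp, window):
    """Earliest index t such that gdp[t] >= 2 * gdp[t - window], or None."""
    for t in range(window, len(gdp)):
        if gdp[t] >= 2 * gdp[t - window]:
            return t
    return None

def is_slow_takeoff(gdp):
    """gdp: list of yearly world output values."""
    four_year_end = first_doubling_end(gdp, window=4)
    one_year_end = first_doubling_end(gdp, window=1)
    if one_year_end is None:
        return True  # output never doubles within a single year
    one_year_start = one_year_end - 1
    # Slow takeoff iff some complete 4-year doubling interval ends no later
    # than the start of the first 1-year doubling interval.
    return four_year_end is not None and four_year_end <= one_year_start

# Steady ~20%/year growth: output doubles over 4 years, never within 1 year.
steady = [100 * 1.20 ** t for t in range(20)]

# Abrupt jump: ~3%/year growth, then output suddenly grows 2.5x per year.
abrupt = [100 * 1.03 ** t for t in range(10)] + \
         [100 * 1.03 ** 9 * 2.5 ** k for k in range(1, 6)]

print(is_slow_takeoff(steady))  # True  -- a 4-year doubling completes first
print(is_slow_takeoff(abrupt))  # False -- the first doubling happens within one year
```

This also illustrates the point in the comment above: in the abrupt trajectory a 4-year doubling interval does complete (the one containing the jump), but it does not finish before the first 1-year doubling interval begins, so under this reading the definition still classifies it as fast.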