Last year, several researchers at AI Impacts (primarily Robert Long and I) interviewed prominent researchers inside and outside of the AI safety field who are relatively optimistic about advanced AI being developed safely. These interviews were originally intended to focus narrowly on reasons for optimism, but we ended up covering a variety of topics, including AGI timelines, the likelihood of current techniques leading to AGI, and what the right things to do in AI safety are right now.
We talked to Ernest Davis, Paul Christiano, Rohin Shah, Adam Gleave, and Robin Hanson.
Here are some more general things I personally found noteworthy while conducting these interviews. For interview-specific summaries, check out our Interviews Page.
Relative optimism in AI often comes from the belief that AGI will be developed gradually, and problems will be fixed as they are found rather than neglected.
All of the researchers we talked to seemed to believe in non-discontinuous takeoff.
Rohin gave 'problems will likely be fixed as they come up' as his primary reason for optimism, and Adam and Paul both mentioned it as a reason.
Relatedly, both Rohin and Paul said one thing that could update their views was gaining information about how institutions relevant to AI will handle AI safety problems, potentially by seeing them solve relevant problems, or by looking at historical examples.
I think this is a pretty big crux around the optimism view; my impression is that MIRI researchers generally think that 1) the development of human-level AI will likely be fast and potentially discontinuous and 2) people will be incentivized to hack around and redeploy AI when they encounter problems. See Likelihood of discontinuous progress around the development of AGI for more on 1). I think 2) could be a fruitful avenue for research; in particular, it might be interesting to look at recent examples of people in technology, particularly ML, correcting software issues, perhaps in cases where doing so went against their short-term profit incentives. Adam said he thought the AI research community wasn't paying enough attention to building safe, reliable systems.
Many of the arguments I heard around relative optimism weren't based on inside-view technical arguments.
This isn't that surprising in hindsight, but it seems interesting to me that though we interviewed largely technical researchers, a lot of their reasoning wasn't based particularly on inside-view technical knowledge of the safety problems. See the interviews for more evidence of this, but here's a small sample of the not-particularly-technical claims made by interviewees:
- AI researchers are likely to stop and correct broken systems rather than hack around and redeploy them.
- AI has progressed and will continue to progress via an accumulation of lots of small things rather than via a sudden important insight.
- Many technical problems feel intractably hard in the way that AI safety feels now, and still get solved within ~10 years.
- Evolution baked very little into humans; babies learn almost everything from their experiences in the world.
My instinct when thinking about AGI is to defer largely to safety researchers, but these reasons felt noteworthy to me in that they seemed like questions that were perhaps better answered by economists or sociologists (or, for the last claim, neuroscientists) than by safety researchers. I really appreciated Robin's efforts to operationalize and analyze the second claim above.
(Of course, many of the claims were also more specific to machine learning and AI safety.)
There are lots of calls for individuals with views around AI risk to engage with each other and understand the reasoning behind fundamental disagreements.
This is especially true around views that MIRI have, which many optimistic researchers reported not having a good understanding of.
This isn't particularly surprising, but there was a strong, universal, and unprompted theme that there wasn't enough engagement around AI safety arguments. Adam and Rohin both said they had a much worse understanding than they would like of others' viewpoints.
Robin and Paul both pointed to some existing but meaningfully unfinished debate in the space.
--- By Asya Bergal
One minor quibble with this post's language, rather than any of its actual claims: The title includes the phrase "safety by default", and the terms "optimism" and "optimist" are repeatedly applied to these researchers or their views. The title is reasonable in a sense, as these interviews were partially/mostly about whether AI would be "safe by default", or why we might believe that it would be, or why these researchers believe that that's likely. And the use of "optimism"/"optimist" is reasonable in a sense, as these researchers were discussing why they're relatively optimistic, compared to something like the "typical MIRI view".
But it seems potentially misleading to use those phrases here without emphasising (or at least mentioning) that at least some of these researchers think there's a greater than 1% chance of extinction or other existential catastrophe as a result of AI. E.g., the statement "Rohin reported an unusually large (90%) chance that AI systems will be safe without additional intervention" implies a 10% credence that that won't be the case (and Paul and Adam seem to share very roughly similar views, based on Rohin's summaries). Relevant quote from The Precipice:
And in this case, the stakes are far greater (meaning no offence to Isidor Rabi).
My guess would be that a decent portion of people who (a) were more used to something like the FHI/80k/Oxford views, and less used to the MIRI/Bay Area views, and (b) read this without having read the interviews in great detail, might think that these researchers believe something like "The chance things go wrong is too small to be worth anyone else worrying about." Which doesn't seem accurate, at least for Rohin, Paul, and Adam.
To be clear: I don't think you're intending to convey that message. And I definitely wouldn't want to try to shut down any statements about AI that don't sound like "this is a huge deal, everyone get in here now!" I'm just a bit concerned about posts accidentally conveying an overly optimistic/sanguine message when that wasn't actually their intent, and when it wasn't supported by the arguments/evidence provided.
(Something informing this comment is my past experience reading a bunch of cognitive science work on how misinformation spreads and can be sticky. Some discussion here, and a particularly relevant paper here.)