tldr: I conducted a series of interviews with 11 AI researchers to discuss AI safety, which are located here: TRANSCRIPTION LINK. If you are interested in doing outreach with AI researchers, I highly recommend taking a look!
[Cross-posted to LessWrong.]
I recently conducted a series of interviews with 11 AI researchers, wherein I laid out some reasons to be concerned about long-term risks from AI.
These semi-structured interviews were 40-60 minutes long and conducted on Zoom. Interviewees were cold-emailed, were paid for their participation, and agreed that I may share their anonymized transcripts.
Six of the interviews were with researchers who had papers accepted at NeurIPS or ICML in 2021. Five of the interviews were with researchers who were informally categorized as “particularly useful to talk to about their opinions about safety” (generally more senior researchers at specific organizations).
I’m attaching the raw transcripts from these 11 interviews, at the following link. I’ve also included the approximate script I was following, post-interview resources I sent to interviewees, and informal interview notes in the associated “README” doc. Ideally I’d have some analysis too, and hopefully will in the future. However, I think it’s useful, particularly for people who plan to start similar projects, to read through a couple of these interviews, to get an intuitive feel for what conversations with established AI researchers can feel like.
Note: I also interviewed 86 researchers for a more complete academic, under-IRB study (whose transcripts won’t be released publicly), whose results will be posted about separately on LessWrong once I finish analyzing the data. There will be substantially more analysis and details in that release; this is just to get some transcripts out quickly. As such, I won't be replying to a lot of requests for details here.
Thanks to Sam Huang, Angelica Belo, and Kitt Morjanova, who helped clean up the transcripts!
- I worry that sometimes young EAs think of researchers like a crowd of young, influenceable proto-EAs. One of my models about community-building in general is that there are many types of people, some of whom will be markedly more sympathetic to AI safety arguments than others, and that saying the same things that would convince an EA to someone whose values don’t align will not be fruitful. A second model is that older people who are established in their careers will have more formalized world models and will be more resistant to change. This means that changing one’s mind requires much more of a dialogue and integration of ideas into a world model than with younger people. The thing I want to say overall: I think changing minds takes more careful, individual-focused or individual-type-focused effort than would be expected initially.
- I think one’s attitude as an interviewer matters a lot for outcomes. Like in therapy, which is also about changing beliefs and behaviors, I think the relationship between the two people substantially influences openness to discussion, separate from the persuasiveness of the arguments. I also suspect interviewers might have to be decently “in-group” to have these conversations with interviewees. However, I expect that that in-group-ness could take many forms: college students working under a professor in their school (I hear this works for the AltProtein space), graduate students (faculty frequently do report their research being guided by their graduate students) or colleagues. In any case, I think the following probably helped my case as an interviewer: I typically come across as noticeably friendly (also AFAB), decently-versed in AI and safety arguments, and with status markers. (Though this was not a university-associated project, I’m a postdoc at Stanford who did some AI work at UC Berkeley).
- I’d be very excited if someone wants to take my place and continue on this project with one-on-one discussions with researchers. There’s a good chance I won’t have time to do this anymore, but I’m still very enthusiastic about this project. If you think you’d be a good fit, please reach out (email@example.com)! Things that help: friendliness, as much familiarity with AI arguments as you can manage (I also learned a lot about gaps in my models through these interviews), conscientiousness, and some way to get access to researchers. Extroversion would have helped me a lot too, but can’t have everything.
Thanks, I think this is really valuable.
This is great work, I think it's really valuable to get a better sense of what AI researchers think of AI safety.
Often when I ask people in AI safety what they think AI researchers think of AGI and alignment arguments, they don't have a clear idea and just default to some variation on "I'm not sure they've thought about it much". Yet as these transcripts show, many AI researchers are well aware of AI risk arguments (in my anecdotal experience, many have read at least part of Superintelligence) and have more nuanced views. So I'm worried that AI safety is insular w.r.t. mainstream AI researchers' thoughts on AGI. These are people who in many cases have spent their working lives thinking about AGI, so their thoughts are highly valuable, and this work goes some way to reversing that insularity.
A nice followup direction to take this would be to get a list of common arguments used by AI researchers to be less worried about AI safety (or about working on capabilities, which is separate), counterarguments, and possible counter-counter arguments. Do you plan to touch on this kind of thing in your further work with the 86 researchers?
Indeed! I've actually found that in most of my interviews people haven't thought about the 50+ year future much or heard of AI alignment, given that my large sample is researchers who had papers at NeurIPS or ICML. (The five researchers who were individually selected here had thought about AI alignment uncommonly much, which didn't particularly surprise me given how they were selected.)
Yes, with the note that the arguments brought forth are generally less carefully thought-through than the ones shown in the individually-selected population, due to the larger population. But you can get a sense for some of the types of arguments in the six transcripts from NeurIPS / ICML researchers, though I wouldn't say it's fully representative.
I've been finding "A Bird's Eye View of the ML Field [Pragmatic AI Safety #2]" to have a lot of content that would likely be interesting to the audience reading these transcripts. For example, the incentives section rhymes with the type of things interviewees would sometimes say. I think the post generally captures and analyzes a lot of the flavor / contextualizes what it was like to talk to researchers.
Two authors gave me permission to publish their transcripts non-anonymously! Thus:
- Interview with Michael L. Littman
- Interview with David Duvenaud
I am currently pursuing a couple of projects that are intended to appeal to the sensibilities of AI researchers who aren't in the alignment community already. This has already been very useful for informing the communications and messaging I would use for those. I can see myself referring back to this often, when pursuing other field building activities too. Thanks a lot for publishing this!
With all the “AI psychology” posts on here and Twitter, I thought this was going to be “interviews with AIs that are researchers” not “interviews with humans researching AI”. This is probably more valuable!
Fantastic work. And thank you for transcribing!
I would find this a bit more readable if you compiled everything into one big pdf.
Thanks for doing this! The interviews are really interesting to read; the CEO example seems like something which perhaps should gain more prominence as a way to introduce AI-knowledgeable audiences to risks about alignment.
This is fantastic, thank you!
Is there a summary of the main insights/common threads from the interviews?
Still in progress as always, but this talk covers a lot of it! https://forum.effectivealtruism.org/posts/q49obZkQujkYmnFWY/vael-gates-risks-from-advanced-ai-june-2022
(Unfortunately the part about the insights isn't transcribed: the first 20m is introduction, the next ~30m is the description you want, and the last 10m is questions.)
This is such an incredibly useful resource, Vael! Thank you so much for your hard work on this project.
I really hope this project continues to go strong!