Survey on AI existential risk scenarios

Sam Clarke; jonasschuett; ac

RobBensinger

Fascinating results! I really appreciate the level of thought and precision you all put into the survey questions.

Were there any strong correlations between which of the five scenarios respondents considered more likely?

Sam Clarke

Thanks Rob, interesting question. Here are the correlation coefficients between pairs of scenarios (sorted from max to min):

So it looks like there are only weak correlations between some scenarios.

It's worth bearing in mind that we asked respondents not to give an estimate for any scenarios they'd thought about for less than 1 hour. The correlations could be stronger if we didn't have this requirement.

MichaelA🔸

Thanks for sharing these interesting findings and this write-up! This seems like quite a useful survey to have done, for similar reasons to why I think/hope work like "Clarifying some key hypotheses in AI alignment", "Crucial questions for longtermists", "Database of existential risk estimates", and much of the prior work you cite would be useful.

People could also check out that database for a sense of how these respondents' views compare to some other quantitative estimates that have been publicly made (which is of course only a subset of all relevant views from all relevant people).

I've also now added to my database of existential risk estimates the six estimates indicated by the sentence "If you take the median response for each scenario and compare them, those (conditional) probabilities are fairly similar (between 10% and 12.5% for the five given scenarios, and 20% for “other scenarios”)", in the "Conditional existential-risk-level estimates".

jonasschuett

Thanks :)

Maxime Riché 🔸

I am confused by this survey. Taken at face value, working on improving Cooperation would be only half as impactful as working on hard AI alignment (looking only at the importance of the problem). And working on partial/naive alignment would be as impactful as working on AI alignment (again looking only at importance).
Does that make sense?

(I make a bunch of assumptions to come up with these values. The starting point is the likelihood of the 5-6 X-risk scenarios. Then I associate each scenario with a field (AI alignment, naive AI alignment, Cooperation) that reduces its likelihood. Then I produce the values above, and they stay similar even if I assume a 2-step model where some scenarios happen before others. Google sheet)
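
A minimal sketch of the kind of aggregation described above, not a reproduction of the linked sheet. The probabilities are placeholders rather than the survey's per-scenario results, the scenario-to-field mapping is a hypothetical one chosen only for illustration, and the name `importance_by_field` is made up.

```python
from collections import defaultdict

# Placeholder values for P(scenario | existential catastrophe).
# Illustrative only: these are NOT the survey's per-scenario results.
scenario_probability = {
    "Superintelligence": 0.15,
    "WFLL part 2": 0.15,
    "WFLL part 1": 0.15,
    "War": 0.15,
    "Misuse": 0.15,
    "Other": 0.25,
}

# Hypothetical one-to-one mapping from each scenario to the field assumed
# to reduce it. The actual mapping in the linked sheet may differ.
scenario_field = {
    "Superintelligence": "AI alignment",
    "WFLL part 2": "AI alignment",
    "WFLL part 1": "naive AI alignment",
    "War": "Cooperation",
    "Misuse": "Cooperation",
    "Other": "unassigned",
}

def importance_by_field(probabilities, mapping):
    """Sum scenario probabilities per field as a crude importance proxy."""
    totals = defaultdict(float)
    for scenario, p in probabilities.items():
        totals[mapping[scenario]] += p
    return dict(totals)

importance = importance_by_field(scenario_probability, scenario_field)
for field, total in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{field}: {total:.2f}")
```

Because each scenario is credited to exactly one field, a sketch like this cannot represent an intervention that reduces several scenarios at once.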

Sam Clarke

Thanks for your comment!

I doubt that it's reasonable to draw these kinds of implications from the survey results, for a few reasons:

- respondents were very uncertain
- there's overlap between the scenarios
- there's no 1-1 mapping between "fields" and risk scenarios (e.g. I'd strongly bet that improved cooperation of certain kinds would make both catastrophic misalignment and war less likely) (though maybe your model tries to account for this, I didn't look at it)

A broader point: I think making importance comparisons (between interventions) on the level of abstraction of "improving cooperation", "hard AI alignment" and "partial/naive alignment" doesn't make much sense. I expect comparing specific plans/interventions to be much more useful.

Maxime Riché 🔸

Thanks for your response!

Yet I am still not convinced that my reading doesn't make sense. Here are some comments:

"respondents were very uncertain"
This uncertainty is a reason to diversify your portfolio of interventions for reducing X-risks, and also a reason to improve such estimates (of P(Nth scenario|X-risk)). But it doesn't seem to be a strong reason to discard the conclusion of the survey (it would be, if we had more reliable information elsewhere).

"there's overlap between the scenarios"
I am unsure, but it seems that the overlaps are not that big overall. In particular, the overlap between {1,2,3} and {4,5} doesn't seem huge, using the numbering below. (I also wonder whether these overlaps illustrate that you could reduce X-risks using a broader range of interventions than just "AI alignment" and "AI governance".)

1. The “Superintelligence” scenario (Bostrom, 2014)
2. Part 2 of “What failure looks like” (Christiano, 2019)
3. Part 1 of “What failure looks like” (Christiano, 2019)
4. War (Dafoe, 2018)
5. Misuse (Karnofsky, 2016)
6. Other existential catastrophe scenarios.

"no 1-1 mapping between "fields" and risk scenarios"
Sure, this would benefit from having a more precise model.

"Priority comparison of interventions is better than high-level comparisons"
Right, though high-level comparisons are so much cheaper to do that it seems worth staying at that level for now.

The point I am especially curious about is the following:
- Does this survey suggest that the importance of working on "Technical AI alignment", "AI governance", "Cooperative AI" and "Misuse limitation" is within one order of magnitude (OOM) across all four?
By importance here I mean importance as in 80k's ITN framework, not overall priority, which should also account for neglectedness, tractability and object-level interventions.

NunoSempere

"I'd strongly bet that improved cooperation of certain kinds would make both catastrophic misalignment and war less likely"

I dislike the usage of "strongly bet" here, given that a literal bet here seems hard to arrive at. See here: <https://nunosempere.com/blog/2023/03/02/metaphorical-bets> for some background.

Sam Clarke

Thanks for this. I won't use "bet" in this context in the future.

ben.smith

Are there similar surveys being conducted of (1) AI researchers in general, (2) technology experts in general, and (3) voters?

The first group is relevant because they are the ones currently building AGI; the second group is relevant because they are the ones who will implement it; the third group is relevant because they set up incentives for policymakers.

I am a social science researcher, and if there is no work being done on this, I would consider research on it myself.

Sam Clarke

Re (1): see "When Will AI Exceed Human Performance? Evidence from AI Experts" (2016) and the 2022 updated version. These surveys don't ask about x-risk scenarios in detail, but do ask about the overall probability of very bad outcomes and other relevant factors.

Re (1) and (3), you might be interested in various bits of research that GovAI has done on the American public and AI researchers.

You also might want to get in touch with Noemi Dreksler, who is working on surveys at GovAI.

1. We will not look at any responses from now on; this is intended just to show what questions were asked, and in case any readers are interested in thinking through their own responses.
2. AI existential risk scenarios are sometimes called threat models.
3. Bostrom describes many scenarios in the book “Superintelligence”. We think that this scenario is the one that most people remember from the book, but nonetheless, we think it was probably a mistake to refer to this particular scenario by this name.
4. Likewise, the mean responses for the five given scenarios are all between 15% and 18%, and the mean response for “other scenarios” was 25%.
5. Other similar results: 77% of respondents assigned ≤5% (conditional) probability to at least one scenario; 51% of respondents assigned ≤5% (conditional) probability to at least two scenarios.
6. For another way of interpreting this, consider that if respondents were evenly split into six completely “polarised” camps, each of which put 100% probability on one option and 0% on the others, then the mean absolute deviation for each scenario would be ~28%.
7. As per footnote 3, the particular scenario we are referring to here is not the only scenario described in “Superintelligence”.
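
The ~28% figure for six fully polarised camps can be checked directly. This is a minimal sketch, assuming the camps are equally sized and taking "mean absolute deviation" to mean the average absolute distance from the mean response for a scenario; the function name `polarised_mad` is illustrative.

```python
from fractions import Fraction

def polarised_mad(n_camps):
    """Mean absolute deviation for one scenario when respondents are split
    evenly into n_camps camps, each putting 100% on a different option."""
    share = Fraction(1, n_camps)  # fraction of respondents in each camp
    mean = share * 1              # only one camp assigns 100% to this scenario
    # One camp sits at 1; all the other camps sit at 0.
    return share * abs(1 - mean) + (1 - share) * abs(0 - mean)

mad = polarised_mad(6)
print(mad, float(mad))  # 5/18, about 0.278
```

With six camps, one sixth of respondents are 5/6 away from the mean and five sixths are 1/6 away, giving 10/36 = 5/18, or roughly 28%.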
