The median superforecaster gave a 0.38% risk of extinction due to AI by 2100, while the median AI domain expert gave a 3.9% risk of extinction.

Tetlock's previous results show that domain experts are not very good at making predictions, and that superforecasters are significantly better. We should all revise our views on AI x-risk.

Comments



I am a bit worried about a narrative of "the forecasters think x-risk is low" when I know a bunch of excellent forecasters who have much higher AI x-risk probabilities. 

For example, Samotsvety (who afaict have an excellent forecasting track record on domain-relevant questions) gave some estimates here (on September 8, 2022):

A few of the headline aggregate forecasts are:

  1. 25% chance of misaligned AI takeover by 2100, barring pre-APS-AI catastrophe
  2. 81% chance of Transformative AI (TAI) by 2100, barring pre-TAI catastrophe
  3. 32% chance of AGI being developed in the next 20 years

Conversely, the median estimate of all domain-level experts is probably lower than the 3.9% presented here. The sampling of experts is non-random: people who are already concerned about AI risk are more likely to take part in the voluntary survey. The sample here had ~40% of experts attending at least one EA meetup, which is not at all typical for AI experts as a group.

This could also be true of previous surveys, like the 2022 AI Impacts survey, which had a response rate of only 17%. I reckon that if you added in the other 83% of experts, the median estimate would drop by a fair margin.

"when I know a bunch of excellent forecasters..."

Perhaps your sampling techniques are better than Tetlock's, then.

The Samotsvety track record does straightforwardly look better than what I expect the median superforecaster's track record to be (which I think is ~99th percentile in either the original Tetlock studies or on GJO), especially on AI. Though perhaps Tetlock's team also selected for better forecasters than the median superforecaster? It's unclear to me.

Last I checked, Tetlock's result on the efficacy of superforecasters vs. domain experts wasn't apples-to-apples: it was comparing individual domain expert forecasts vs. superforecaster forecasts that had been aggregated.

As this post explains, the main study that people cite when saying that "superforecasters are better than experts" comes from a competition where the aggregation methods for the two groups were different (the Good Judgment Project's aggregation algorithm versus a prediction market with low liquidity, for the amateur forecasters and the experts respectively). Prediction markets for forecasters and experts had similar performance.
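To make the aggregation point concrete, here is a minimal sketch (with invented probabilities and an assumed extremization factor; it does not reproduce the actual GJP algorithm or any real survey data) showing how the same set of individual forecasts can yield quite different headline numbers depending on how they are pooled:

```python
import numpy as np

# Hypothetical individual forecasts (probabilities) for a single question.
# These numbers are made up purely for illustration; they are not from the
# XPT or any GJP dataset.
forecasts = np.array([0.02, 0.03, 0.05, 0.08, 0.15])

# Two simple summaries of the same forecasts.
simple_mean = forecasts.mean()
median = float(np.median(forecasts))

# Extremized mean of log-odds, similar in spirit to (but not the same as)
# the Good Judgment Project's aggregation; the extremization factor `a`
# is an assumed illustrative value.
a = 2.5
log_odds = np.log(forecasts / (1 - forecasts))
extremized = 1.0 / (1.0 + np.exp(-a * log_odds.mean()))

print(f"simple mean:     {simple_mean:.3f}")
print(f"median:          {median:.3f}")
print(f"extremized pool: {extremized:.3f}")
```

With these made-up inputs, the pooled figure ranges from about 6.6% (simple mean) through 5% (median) down to well under 0.1% (extremized log-odds pool), which is why comparing an aggregated pool against individual or differently aggregated forecasts is not apples-to-apples.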

Teddy - this is an important and fascinating paper, and I'd highly recommend that EAs read it.

I'm genuinely baffled about why the superforecasters are giving such an _extremely_ low risk of extinction compared to the AI domain experts, and I'd value any suggestions from others about this.

I don't think the superforecaster estimates being so low is a strong reason to significantly revise our views on x-risk until we have better insight into the huge discrepancy in risk estimates.

One possible explanation for the disparity is the sampling of participants: 42% of the domain experts had attended EA meetups, whereas only 9% of the superforecasters had (page 9 of the report). This could have caused a systematic shift in opinion.

Another explanation: anchoring bias. The general public changed their estimates of x-risk by six orders of magnitude, from 5% to 1 in 15 million, when the question was phrased differently (page 29). Presumably at least some of this effect would persist for experts as well. Participants were given a list of previous predictions of AI x-risk, which were mostly around the 5% range (page 132). I propose that the domain experts anchored to this value, whereas the superforecasters were more willing to deviate.
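As a quick arithmetic check on the "six orders of magnitude" figure: 1 in 15 million is roughly $6.7\times 10^{-8}$, so the ratio between the two phrasings is

$$\frac{0.05}{1/(1.5\times 10^{7})} = 0.05 \times 1.5\times 10^{7} = 7.5\times 10^{5} \approx 10^{5.9},$$

i.e. just under six orders of magnitude.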

titotal - thanks for these helpful observations. Both sound plausible!

There's a giant financial and status incentive for AI safety workers to inflate the dangers. It's also more likely that someone becomes an AI safety expert if they overestimate the risk.

This study wasn't recruiting AI safety workers? Rather, it had AI domain experts, many of whom appeared not to have thought about AI x-risk much more than I'd expect the median AI researcher to have. [EDIT 2023/07/16: I'm less sure that this is true.]

There was a follow-up study with both superforecasters and people who have thought about or worked in AI safety (or adjacent fields). I was involved as a participant. That study had some more (though arguably still limited) engagement between the two camps, and I think there was more constructive dialogue and more useful updating in comparison.

John - yes, it is plausible that there could be selection effects, such that only people with a P(doom) over 1% even bother becoming AI safety researchers. 

But this cuts both ways: any 'forecasting experts' who think the P(doom) is over 1% might have already become AI safety researchers, rather than remaining general forecasting experts.

Also, I'm a bit baffled by this narrative that there are 'giant financial and status incentives' for AI safety researchers to inflate the dangers. 

If somebody wanted to become rich and famous, becoming an AI safety researcher wouldn't even make the Top 1000 list of good career strategies.

The entirety of their job security and status in society depends on the risk being high. You don't view that as a strong incentive to create the impression that the risk is high?

To explain my disagree-vote: this kind of explanation isn't a good one in isolation.

I could also say it benefits AI developers to downplay[1] risk, as that means their profits and status will be high, and society will have a more positive view of them as people who are developing fantastic technologies rather than raising existential risks.

And what makes this a bad explanation is that it is so easy to vary. Like above, you can flip the sign. I can also easily swap out the area for any other existential risk (e.g. nuclear war or climate change), and the argument would run exactly the same.

Of course, I think motivated reasoning is something that exists and may play a role in explaining the gap between superforecasters and experts in this survey. But on the whole I don't find it convincing without further evidence.

  1. ^

    consciously or not

I wouldn't expect a lot of scarcity mindset, because there's a lot of generically in-demand talent and experience among AI x-risk orgs. Status may be a more reasonable question, but job security doesn't really make sense.
