The closest thing I could find was the Metaculus Ragnarök Question Series, but I'm not sure how to interpret it because:
- The answers seem inconsistent (e.g. a 1% chance of >95% of humans being killed by 2100, but a 2% chance of humans going extinct by 2100, even though extinction implies that >95% of humans are killed - see the first sketch below this list). Maybe this isn't all that problematic, but I'm not sure
- The incentives for accuracy seem weird. These questions don't resolve until 2100, and if there is a catastrophe, nobody will care about their Brier score. Again, this might not be a problem, but I'm not sure
- The 'community prediction' (the median) was much higher than the 'Metaculus prediction' (some weighted combination of each user's prediction - the second sketch below shows how these two aggregates can diverge). Is that because more accurate forecasters were less worried about existential risk, or because whatever makes someone a good near-term forecaster also leads them to underestimate existential risk?
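To make the first bullet concrete: extinction by 2100 entails that >95% of humans are killed by 2100, so a coherent set of forecasts should give the extinction question a probability no higher than the >95%-killed question. A minimal sketch, using just the numbers quoted above (not live Metaculus data):

```python
# Coherence check for nested events: extinction is a special case of
# ">95% of humans killed", so P(extinction) <= P(>95% killed) must hold.
p_95_killed = 0.01  # quoted forecast: >95% of humans killed by 2100
p_extinct = 0.02    # quoted forecast: humans extinct by 2100

if p_extinct > p_95_killed:
    print("Incoherent: the stricter event is rated more likely "
          f"({p_extinct:.0%} vs {p_95_killed:.0%})")
```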
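On the median-vs-weighted point: I don't know the actual weighting Metaculus uses, so the numbers and weights below are made up purely to illustrate how an unweighted median and a weighted average of the same forecasts can come apart:

```python
import numpy as np

# Hypothetical individual forecasts of P(catastrophe), plus made-up weights
# standing in for whatever track-record/recency weighting Metaculus applies.
forecasts = np.array([0.02, 0.03, 0.05, 0.10, 0.15])
weights = np.array([0.35, 0.30, 0.20, 0.10, 0.05])  # heavier weight = better assumed track record

community_prediction = np.median(forecasts)               # unweighted median: 0.05
metaculus_style = np.average(forecasts, weights=weights)  # weighted mean: ~0.044

print(f"median={community_prediction:.3f}, weighted={metaculus_style:.3f}")
```

If the weighting happens to favour forecasters who are less worried, the aggregate falls below the median even though no individual forecast changed - which is exactly the ambiguity I can't resolve from the outside.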
Related: here's a database of existential risk estimates, and here's a list of AI-risk prediction market question suggestions.
I wonder if questions around existential risk would be better estimated by a smaller group of forecasters, rather than by a prediction market or something like Metaculus (for the above reasons, among others).
I'll speak for the consensus when I say I think there's no clear way to decide whether this is correct without actually doing it - and the outcome would depend a lot on how much engagement the superforecasters already had with these ideas. (If I got to pick the 5 superforecasters, even excluding myself, I could guarantee the result came out closer either to FHI's viewpoints or to Will's.) Even if we picked from a "fair" reference class, if I could have them spend 2 weeks at FHI talking to people there, I think a reasonable proportion would be convinced - though perhaps that's less a function of neutrally updating towards correct ideas than of the emergence of consensus in groups.
Lastly, I have tremendous respect for Will, but I don't know that he's calibrated particularly well to make a prediction like this. (Not that I know he isn't - I just don't have any reason to think he's spent much time working on this skillset.)