All of Javier Prieto's Comments + Replies

How accurate are Open Phil's predictions?

We do track whether predictions have a positive ("good thing will happen") or negative ("bad thing will happen") framing, so testing for optimism/pessimism bias is definitely possible. However, only 2% of predictions have a negative framing, so our sample is too small to say anything conclusive about this yet.

Enriching our database with base rates and categories would be fantastic, but my hunch is that given the nature and phrasing of our questions this would be impossible to do at scale. I'm much more bullish on per-predictor analyses and that's more or less what we're doing with the individual dashboards.

How accurate are Open Phil's predictions?

Very good point!

I see a few ways of assessing "global overconfidence":

  1. Lump all predictions into two bins (under and over 50%) and check that the lower point is above the diagonal and the upper one is below the diagonal. I just did this and the points are where you'd expect if we were overconfident, but the 90% credible intervals still overlap with the diagonal, so pooling the bins in this way provides only weak evidence of overconfidence.
  2. Calculating the OC score as defined by Metaculus (scroll down to the bottom of the page and click the (+) sign next ...
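The two-bin check in item 1 can be sketched roughly as follows. This is only an illustration, not the analysis actually run at Open Phil: the uniform Beta(1, 1) prior, the Monte Carlo interval, the choice to skip predictions at exactly 50%, and the function names `credible_interval` and `two_bin_check` are all assumptions made for the example.

```python
import random

def credible_interval(hits, n, level=0.90, draws=20000, seed=0):
    """Equal-tailed interval for a hit rate, estimated by sampling the
    Beta(1 + hits, 1 + misses) posterior (uniform prior is an assumption)."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(1 + hits, 1 + n - hits) for _ in range(draws))
    lo = samples[int((1 - level) / 2 * draws)]
    hi = samples[int((1 + level) / 2 * draws) - 1]
    return lo, hi

def two_bin_check(predictions):
    """predictions: iterable of (stated_probability, outcome), outcome in {0, 1}."""
    bins = {"under_50": [], "over_50": []}
    for p, y in predictions:
        if p < 0.5:
            bins["under_50"].append((p, y))
        elif p > 0.5:
            bins["over_50"].append((p, y))
        # Predictions at exactly 50% are skipped (an assumption of this sketch).
    report = {}
    for name, items in bins.items():
        n = len(items)
        if n == 0:
            continue
        mean_forecast = sum(p for p, _ in items) / n
        hits = sum(y for _, y in items)
        lo, hi = credible_interval(hits, n)
        report[name] = {
            "mean_forecast": mean_forecast,
            "hit_rate": hits / n,
            "ci": (lo, hi),
            # True when the interval still overlaps the diagonal.
            "diagonal_inside_ci": lo <= mean_forecast <= hi,
        }
    return report

# Made-up toy data, not Open Phil's predictions.
toy = [(0.2, 1)] * 40 + [(0.2, 0)] * 60 + [(0.8, 1)] * 60 + [(0.8, 0)] * 40
print(two_bin_check(toy))
```

Overconfidence would show up as a hit rate above the mean forecast in the lower bin and below it in the upper bin; the evidence is weak whenever `diagonal_inside_ci` is true for either bin.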
3 Charles Dillon 2mo
One thing to note here is it is plausible that your errors are not symmetric in expectation, if there's some bias towards phrasing questions one way or another (this could be something like frequently asking "will [event] happen" where optimism might cause you to be too high in general, for example). This might mean assuming linearity could be wrong. This is probably easier for you to tell since you can see the underlying data.
I haven't seen a rigorous analysis of this, but I like looking at the slope, and I expect that it's best to include each resolved prediction as a separate data point. So there would be 743 data points, each with a y value of either 0 or 1.
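A minimal sketch of the slope idea, treating each resolved prediction as its own data point with a y value of 0 or 1. Ordinary least squares is an assumption here (the comment does not say which fit to use, and a logistic fit would be a reasonable alternative), and `calibration_slope` is a hypothetical helper name.

```python
def calibration_slope(forecasts, outcomes):
    """Least-squares line of outcome (0 or 1) on stated probability.

    Returns (slope, intercept). A slope near 1 with an intercept near 0 is
    what perfect calibration would give; a slope below 1 points towards
    overconfidence under the linearity assumption.
    """
    n = len(forecasts)
    mean_x = sum(forecasts) / n
    mean_y = sum(outcomes) / n
    sxx = sum((x - mean_x) ** 2 for x in forecasts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(forecasts, outcomes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up toy data: 20% forecasts resolve yes 40% of the time,
# 80% forecasts resolve yes 60% of the time.
forecasts = [0.2] * 10 + [0.8] * 10
outcomes = [1] * 4 + [0] * 6 + [1] * 6 + [0] * 4
slope, intercept = calibration_slope(forecasts, outcomes)
print(slope, intercept)
```

If errors are not symmetric, as raised above, a single straight line can hide that, so the slope is best read alongside the binned calibration plot.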
How accurate are Open Phil's predictions?


  1. We're currently providing calibration and accuracy stats to our grant investigators through our Salesforce app in the hopes that they'll find that feedback useful and actionable.
  2. I'm not sure and I'd have to defer to decision-makers at OP. My model of them is that predictions are just one piece of evidence they look at.
How accurate are Open Phil's predictions?

Interesting, thanks for sharing that trick!

Our forecasting questions are indeed maximally uncertain in some absolute sense because our base rate is ~50%, but it may also be the case that they're particularly uncertain to the person making the prediction as you suggest.

How accurate are Open Phil's predictions?

A related issue is that people may be more comfortable making predictions about less important aspects of the project, since the consequences of being wrong are lower

I'm actually concerned about the same thing but for exactly the opposite reason: because the consequences of being wrong (a hit to one's Brier score) are the same regardless of a prediction's importance, people might allocate the same time and effort to every prediction, including the more important ones that perhaps warrant closer examination.

We're currently trialing some...

Data on forecasting accuracy across different time horizons and levels of forecaster experience

Thanks for doing these analyses!

I recently had to dive into the Metaculus data for a report I'm writing and I produced the following plot along the way. I'm posting it here because it didn't make it into the final report, but I felt it was worth sharing anyway.

Each dot is the Brier score of one community prediction on a non-ambiguously resolved question, plotted against time horizon (i.e. time remaining until resolution when the prediction was made). There are up to 101 predictions per question for the reasons you describe in the post. The...
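For readers who want to reproduce something similar, here is a rough sketch of computing Brier scores and averaging them by time horizon. The record layout `(probability, outcome, days_until_resolution)`, the 30-day bucket width, and the function names are assumptions for illustration only; they are not the Metaculus data format or the code behind the plot.

```python
from collections import defaultdict

def brier(probability, outcome):
    """Brier score of a single binary prediction: squared error against 0 or 1."""
    return (probability - outcome) ** 2

def mean_brier_by_horizon(records, bucket_days=30):
    """Average Brier score per time-horizon bucket.

    records: iterable of (probability, outcome, days_until_resolution).
    Returns {bucket_start_in_days: mean Brier score}, ordered by horizon.
    """
    buckets = defaultdict(list)
    for probability, outcome, days in records:
        buckets[int(days // bucket_days)].append(brier(probability, outcome))
    return {
        index * bucket_days: sum(scores) / len(scores)
        for index, scores in sorted(buckets.items())
    }

# Made-up toy records, not Metaculus data.
toy_records = [(0.9, 1, 5), (0.6, 1, 10), (0.5, 0, 45)]
print(mean_brier_by_horizon(toy_records))
```

Lower is better: a constant 50% forecast scores 0.25 on every question, which is a handy baseline when reading such a plot.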

1 Charles Dillon 9mo
Nice graph, thanks!