
The following is Metaculus's year-in-review CEO letter written by Gaia Dempsey.
 

2022 was a year of progress, growth, and change at Metaculus.

We matured and grew as an organization.

  • Mission Update: In the spring, we expanded the scope of our mission, creating a solid foundation for our work. Specifically, our mission is to build epistemic infrastructure that enables the global community to model, understand, predict, and navigate the world’s most important and complex challenges. We also clearly defined the three primary ways we enact our mission, namely: providing forecasts as a public service, fostering a global forecasting community, and supporting forecasting research.
  • Pro Forecaster Program: In response to a sharp increase in global uncertainty with the outbreak of the war in Ukraine, we designed new operational procedures including an infohazard review process, and recruited over 30 candidates into an exceptionally talented Pro Forecasting team.
  • Core Funding: In the summer, we secured a $5.5M core funding grant from Open Philanthropy to expand our work.
  • Becoming a Public Benefit Corporation: Shortly after, we committed to providing three specific public benefits in our charter, reflecting our mission statement, as part of the process of becoming a Public Benefit Corporation.
  • Growing Our Team: With a mandate to expand our platform, programs, and operations, we upgraded our hiring processes, thanks to great resource recommendations from Jim Savage and Aaron Hamlin, and became laser-focused on recruiting. Over the course of the year we doubled our overall headcount, building four strong new teams: leadership, AI forecasting, engineering, and design (some engineering start dates are in early 2023).

We launched over a dozen forecasting tournaments and generated over 1000 aggregate forecasts.

  • Expansion of Forecasting Programs: Over the last year, we expanded our existing partnerships and began collaborations with a number of fantastic new partners. In total, we launched 13 new forecasting tournaments offering a total of $77,500 in prize money, with a number of programs delivering policy-relevant predictions on public health, biosecurity, and nuclear risk. Outside of these core themes (each of which I’ll touch on later), Forecasting Our World In Data was a standout launch: a project that “probes the long-term future, delivering predictions on topics like global investment in AI, world life expectancy, CO2 emissions, and more on time horizons from one to 100 years.” Another favorite was our final launch of the year, the Sagan Tournament, focused on all things space-related, from technology to scientific discovery to governance.
  • Ukraine War Response: After accurately predicting the war in Ukraine, the Metaculus community continued to closely monitor the conflict, responding rapidly to new developments in a forecasting tournament launched just 48 hours after the invasion. With nuclear security expert Peter Scoblic, we deployed the Red Lines in Ukraine project as an early-warning system gauging the likelihood Russia would make use of nuclear weapons.
  • Biosecurity & Public Health: In 2022 we increased our impact in public health and biosecurity, running the $25,000 Biosecurity Tournament with the Institute for Progress and Guarding Against Pandemics, as well as the Real-Time Pandemic Decision Making tournament with UVA’s Biocomplexity Institute to aid COVID computational modeling efforts. In May, when clusters of Mpox virus infections were observed across multiple countries, our community responded immediately through our Mpox Series, contributing ensemble predictions and providing a crucial health tool for assessing widespread community transmission when data were sparse.

We grew our forecasting community and connected with a wider network.

We upgraded our infrastructure and shipped new features.

  • Upgraded Tech Stack: In 2022, we rewrote nearly the entire Metaculus application, modernizing the Metaculus tech stack to support our 2023 product roadmap. We have a number of enhancements and new features in store, big and small, and we can’t wait to share them with the community.
  • Question Groups & Fan Graphs: We released question groups and fan graphs, enabling the grouping of related questions and the visualization of their forecasts in series.

  • Private Forecasting Spaces & Language Localization: To support partner projects, including internationally, we developed private forecasting spaces that enable confidential forecasting for a group or organization of any size, as well as the ability to translate the Metaculus interface into any language.
  • Tournament Scoring & Leaderboards: We updated our tournament scoring and leaderboard systems to bring greater rigor and clarity to how we reward and incentivize forecasting skill.

We collaborated on research and published reports on the biggest potential risks of 2022.

  • Nuclear Escalation in Ukraine: After the invasion of Ukraine, policymakers and the public became increasingly concerned about the prospect of nuclear escalation. We recruited a team of Metaculus Pro Forecasters to give their judgments on key questions along with their rationales, all of which were compiled in a full nuclear risk report.
  • Predicting the Omicron BA.1 Wave: In partnership with the University of Virginia’s Biocomplexity Institute and the Virginia Department of Health, we co-authored a paper demonstrating that combining Metaculus’s COVID-19 Omicron variant forecasting ensembles with computational models produces accurate, robust, and valuable real-time forecasts.
  • Mpox Rapid Information Aggregation: When an unexpected number of Mpox cases were reported in early May 2022, we organized a rapid forecasting response to gauge the potential scope and impact of the outbreak. Working with our research partner Tom McAndrew at Lehigh University, we co-authored a paper on the efficacy of rapid human judgment forecasting, which was published (in record time) in The Lancet Digital Health.
  • Forecasting the US-China AI & Nuclear Landscape: With our partners at the Institute for Security and Technology, we launched an initiative evaluating intervention points in the US-China nuclear relationship, with a special focus on the integration of AI into nuclear command, control, and communications systems. The resulting report combines insights from nuclear and policy subject matter experts and Metaculus Pro Forecasters.

I’m extremely proud of what we’ve accomplished as a team, and I’m deeply grateful for the support of our partner organizations and the forecasting community. If you’re excited by what we’re doing and would like to get in touch, please do feel free to grab some time with me, or shoot our team a note.

Onward,

Gaia Dempsey
CEO, Metaculus

Comments (2)



Thanks for this update. I like Metaculus and have started forecasting more in 2022.

Something I would enjoy seeing is the ability to have a very quick UI for creating private questions, similar to what Nathan proposes here: https://www.super-linear.org/prize?recordId=recYHpvvGFmiFq9tS

Here is what I imagine this could look like: 

  1. I pull up https://quick.metaculus.com 
  2. I see only an empty command line, and when I type in, e.g., "Will Putin still be President of Russia at the end of 2023? 1y", it instantly creates a private binary question that closes at the end of today and sets the resolve date to one year from today. The question title is filled out as a random combination of the words from the question text. It just fills in a dot for background info, resolution criteria, and fine print.
  3. The format could be "[question text]? [resolve date]" where the question mark serves as the indicator for the end of the question text, and the resolve date part can interpret things like "1w", "1y", "eoy", "5d"
  4. In a perfect world, this would also integrate with Alfred on my mac so that it becomes extremely easy and quick to create a new private question
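
As a rough illustration of the input format in step 3, here is a minimal Python sketch of how such a line could be parsed. The names `QuickQuestion`, `parse_resolve_date`, and `parse_quick_question` are hypothetical and not part of any Metaculus API. The sketch assumes that "eoy" means December 31 of the current year, that the last question mark ends the question text, and that the question closes today, as in step 2.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class QuickQuestion:
    """Hypothetical container for a quickly created private binary question."""
    text: str
    close_date: date
    resolve_date: date


# Matches relative offsets such as "5d", "1w", or "1y" (days, weeks, years).
_OFFSET = re.compile(r"^(\d+)([dwy])$")


def parse_resolve_date(token: str, today: date) -> date:
    """Turn a shorthand like '1w', '1y', 'eoy', or '5d' into a concrete date."""
    token = token.strip().lower()
    if token == "eoy":
        # Assumption: "eoy" means the last day of the current calendar year.
        return date(today.year, 12, 31)
    match = _OFFSET.match(token)
    if match is None:
        raise ValueError(f"unrecognized resolve date: {token!r}")
    amount, unit = int(match.group(1)), match.group(2)
    if unit == "d":
        return today + timedelta(days=amount)
    if unit == "w":
        return today + timedelta(weeks=amount)
    # Years: keep the same month and day, falling back to Feb 28 for Feb 29.
    try:
        return today.replace(year=today.year + amount)
    except ValueError:
        return today.replace(year=today.year + amount, day=28)


def parse_quick_question(line: str, today: date) -> QuickQuestion:
    """Parse '[question text]? [resolve date]' into a QuickQuestion."""
    # The last question mark marks the end of the question text.
    text, sep, rest = line.rpartition("?")
    if not sep or not text.strip():
        raise ValueError("expected a question ending in '?'")
    if not rest.strip():
        raise ValueError("expected a resolve date after the question")
    return QuickQuestion(
        text=text.strip() + "?",
        close_date=today,  # closes at the end of today, per step 2
        resolve_date=parse_resolve_date(rest, today),
    )
```

Passing `today` in explicitly, rather than calling `date.today()` inside the parser, keeps the behavior easy to test.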

This feature would allow me to quickly create a binary private forecast when I am thinking of it and improve my calibration over time. 

Also, I have a question about the calibration plot that Metaculus creates on my profile: does it take into account only the last forecast I made on a question, or does it somehow integrate all the forecasts I made on that question over time?

It's a great idea, and I like how you've fleshed it out. I'll pass this along to our Product team. 

For your calibration plot, you can actually use the 'evaluated at' dropdown and watch your plot adjust on the fly. 

 
