Spring AI Forecasting Benchmark is Starting
Over the last year and a half, Metaculus has been running a series of tournaments to benchmark AI's accuracy in predicting future events. These tournaments pit frontier models, bot developers, and a human baseline against each other to collectively push the boundaries of forecasting performance. We are quickly approaching the end of the Fall Bot Tournament and are now prepping for the Spring Bot Tournament!
Joining the tournament is a great way to help further innovation and learning in the AI forecasting space (contributing to AI-safety and decision-making applications), hone your AI development skills, and earn rewards for strong performance!
The Details
For those new to our benchmarking efforts, here is an overview of the two tournament series that are part of the Metaculus AI Forecasting Benchmarking Tournament (AIB):
- $50,000 Fall/Spring/Summer Bot Tournament: Our primary bot tournament runs three times a year and aligns with the Metaculus Cup timeframe. Each season will feature a $50k prize pool and 300-500 questions. Most questions will be imported from the regular question feed to compare bot performance against the Metaculus Community Prediction.
- $1,000 Bi-Weekly MiniBench: MiniBench will be a series of back-to-back two-week-long $1k tournaments of ~60 questions each. Questions are automatically created and resolved from public data sources (e.g., FRED, Google Trends, Metaculus, Stocks), and will eventually include LLM-generated questions. The purpose of this tournament is to provide fast feedback loops for participants to test the quality of their bots and to lower the barrier to entry for new participants. Additionally, it will help highlight the best forecasting LLMs faster than once a quarter. Due to automation, we expect this tournament to be slightly noisier, but it should provide an interesting point of reference. You can find the list of all past MiniBench tournaments here.
Now for updates specific to transitioning to Spring:
- Fall End Date: There will be no more new questions for the Fall bot tournament.
- Fall Resolve Date: Most questions for the Fall bot tournament are scheduled to resolve between Jan 1st and Jan 20th. Allowing some wiggle room for questions that need late resolution, we expect finalized Fall results in February.
- Fall Prize Money: Prizes will be distributed after all Fall questions have resolved and verification of prize winners has been completed. Prizes for MiniBench will be distributed at the same time.
- Spring Start Date and Cadence: The Spring tournament will begin on January 5th.
- Tournament will start off slow: The first 1-2 weeks will have fewer questions than usual to give new competitors more time to join.
- MiniBench Continues: MiniBench will continue without pause. The first Spring MiniBench will start on January 5th.
- Testing Your Bot: New competitors can use the December 22nd MiniBench and the Spring tournament practice questions as a warmup.
Set Up a New Bot With a 30-Minute Walkthrough
- 30-minute Walkthrough: You can set up a bot using our video walkthrough that uses our template bot GitHub repo.
- Documentation: Please see additional documentation on our resource page.
- Free LLM and Search Credits: Metaculus sponsors the LLM and search costs of participants in AIB via donations from Anthropic and OpenAI, as well as a partnership with AskNews. More info here.
- Join anytime: The tournament runs continuously until around May 1st. Competitors can join at any time during this period and will start in the middle of the leaderboard with 0 points. New bot makers can enter an early MVP of their bot and improve it over the course of the tournament.
Upkeep for Existing Bot Makers
- AskNews Renewal: Bots must renew their accounts each season to maintain their AskNews requests. Either join the AskNews Discord, friend @freqai, and send him a message, or email contact@asknews.app with:
- Your bot name
- AskNews registered email
- First and Last name
- LinkedIn Profile (or other social profile)
- Association (company/lab/independent)
- LLM Credit Request: Bot makers can apply for more free LLM credits for the new Spring season. Use the same form as last time. Since bot makers change their bot designs significantly over time, please still complete the fields for descriptions, etc.
- Max probability per numeric bin dropping to 0.2: For 201-point numeric probability distributions, the max change in probability between 2 bins in the CDF (one bin in the PMF) is now 0.2 rather than 0.59. If you are using forecasting-tools, all you need to do is update the package. If you are not using forecasting-tools, you can copy and reuse the _standardize_cdf(cdf) function and apply it to your final CDF right before you submit it to Metaculus. This change should affect only a small minority of forecasts, since putting 0.2 probability on one bin is rarely needed. The constraint aligns the max probability per bin for humans (via the UI) with that for bots (via the API). We will give participants some time to make the relevant changes before requiring this constraint in the API.
- Update your packages: If you are using forecasting-tools, it is probably a good idea to update to the latest version, along with related dependencies like asknews, openai, etc.
- Optional date/conditional question support: The forecasting-tools package, and thus Metac Bots, now support date questions and conditional questions. We will not be launching date or conditional questions in Spring AIB, but if you want to forecast on date or conditional questions, keep an eye out for these changes. For those inheriting from ForecastBot, conditional question forecasting has a default implementation that will reuse other prediction functions to forecast conditionals for you.
- Required Bot Survey: Around the time that Fall questions resolve, we will be sending out a required survey to all Fall AIB participants (not just prize winners as in previous rounds). This will give us better data on which models and high-level forecasting strategies work best. Bot makers who do not fill out their survey will be ineligible for Spring prizes.
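For the "Max probability per numeric bin" item above, a bot maker may want to check locally whether a CDF would break the new 0.2 cap before submitting. The following is a minimal sketch of such a check, not the _standardize_cdf function from forecasting-tools. The helper names largest_cdf_step and exceeds_bin_cap are hypothetical, the rounding tolerance is an assumption, and the sketch assumes the CDF is a plain list of 201 non-decreasing floats. It checks only the per-bin cap and ignores any other requirements the API may enforce.

```python
from typing import Sequence

CDF_POINTS = 201     # number of CDF points for numeric questions, per the post
MAX_BIN_MASS = 0.2   # new cap on probability in a single PMF bin, per the post
TOLERANCE = 1e-9     # assumption: small slack for floating-point rounding


def largest_cdf_step(cdf: Sequence[float]) -> float:
    """Return the largest increase between adjacent CDF points (hypothetical helper)."""
    if len(cdf) != CDF_POINTS:
        raise ValueError(f"expected {CDF_POINTS} CDF points, got {len(cdf)}")
    return max(upper - lower for lower, upper in zip(cdf, cdf[1:]))


def exceeds_bin_cap(cdf: Sequence[float], cap: float = MAX_BIN_MASS) -> bool:
    """True if any single bin holds more probability than the cap allows."""
    return largest_cdf_step(cdf) > cap + TOLERANCE


# A linear CDF spreads probability evenly, so every bin is far below the cap.
uniform_cdf = [i / (CDF_POINTS - 1) for i in range(CDF_POINTS)]

# A step CDF puts all probability in one bin, which the cap forbids.
step_cdf = [0.0] * 100 + [1.0] * 101

print(exceeds_bin_cap(uniform_cdf))  # False
print(exceeds_bin_cap(step_cdf))     # True
```

A check like this only flags a violation. Repairing the CDF is what the _standardize_cdf function mentioned above is for.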
Other Bot-Friendly Tournaments
There are a few other ways to compete on Metaculus using bots:
- $7,000 Market Pulse Tournament: Bots have been eligible for prizes in the Q3 and Q4 Market Pulse tournaments and will remain eligible in Q1 when it opens. Bots will need to handle numeric group questions and continuously update forecasts during the question lifetime. (Our normal AI Benchmark tournaments do not require updating.)
- Metaculus Cup: The Metaculus Cup is a great way to test your bot and compare yourself against human participants. This is the most popular human tournament on Metaculus, and though bots are not eligible for prizes, it can help you measure the strength of your bot on diverse questions.
If you have any questions, please feel free to contact us on our Discord, or reach out to ben [at] metaculus [.com]!
