Within the EA sphere, prediction markets have often been championed as a good solution for forecasting the future. Improved forecasting has been discussed many times as a cause area for humanity to make better judgements and generally improve institutional decision making.
In this post, I will argue that prediction markets are overrated within EA, both in their function for high stakes forecasting, as well as in more casual environments.
A prediction market is a market created for the purpose of trading on the outcome of events. The market prices are supposed to indicate what the probability of an event is, so a contract can trade between 0 and 100%. The idea behind the merit of prediction markets is that they are markets - and therefore should be efficient.
I. Prediction markets didn’t do best in Tetlock’s experiments
Prof. Tetlock led the Good Judgement Project, a research collaborative participating in a forecasting tournament run by IARPA (a US intelligence org). He experimented with various setups of participants to see which would lead to the best outcomes. The tournament focussed on geopolitical questions with a time frame of less than a couple of years.
Tetlock then wrote up his findings in Superforecasting. His book is mostly about the individual people who did consistently exceptionally well at IARPA’s tournament, whom he calls superforecasters, and about what made them special.
But Tetlock also writes about how well his various experiments did at the tournament - among the tried methods were prediction markets. Unfortunately, while prediction markets beat the ‘wisdom of crowds’ (the average of people’s guesses) as well as most individual participants and teams of participants, they were not the experiment that did best.
There were individual superforecasters who did better than prediction markets, and teams of superforecasters working together did reliably better than prediction markets. Also, teams of forecasters working together often did better than prediction markets if their results were extremized - that means taking the team’s stated probability and nudging it closer to either zero or one hundred percent. The reason extremizing works so well is that information in teams of normal forecasters is often shared incompletely. The idea is that if all participants shared their information more completely with each other, each individual's guess would become more confident and thereby their average judgement more extreme. Note that extremizing works much less well if people have similar information.
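As a rough illustration of extremizing, here is a minimal sketch. It assumes one commonly used transform, p^a / (p^a + (1 - p)^a); the text above does not say which formula the Good Judgement Project actually used, and the exponent a = 2 and the team forecasts are purely hypothetical.

```python
from statistics import mean


def extremize(p: float, a: float = 2.0) -> float:
    """Push a probability away from 0.5 and towards 0 or 1.

    `a` is an assumed tuning parameter: a > 1 extremizes,
    a == 1 leaves p unchanged.
    """
    return p**a / (p**a + (1 - p) ** a)


# Hypothetical forecasts from three team members for the same question.
team_forecasts = [0.65, 0.70, 0.75]

team_average = mean(team_forecasts)
extremized = extremize(team_average)

print(f"team average: {team_average:.2f}")
print(f"extremized:   {extremized:.2f}")
```

With these made-up numbers the team average of 0.70 moves to roughly 0.84. A probability of exactly 0.5 stays at 0.5, which matches the intuition that extremizing only amplifies whatever direction the team already leans in.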
However, Tetlock grants that the experiments with prediction markets could be improved (Superforecasting, page 207):
I can already hear the protests from my colleagues in finance that the only reason the superteams beat the prediction markets was that our markets lacked liquidity: real money wasn’t at stake and we didn’t have a critical mass of traders. They may be right. It is a testable idea, and worth testing. It’s also important to recognize that while superteams beat prediction markets, prediction markets did a pretty good job of forecasting complex global events.
I’d also be curious about a prediction market in which only superforecasters trade.
This brings me to the second part of my argument - namely, that the suboptimal conditions for the prediction market in the Good Judgement Project would often also apply in the real world when prediction markets get tested.
II. Prediction markets often won’t be efficient (like so many other markets in the real world)
It is important to recap here what market efficiency actually means: a market is efficient if, at any given time, prices fully reflect all available information about a particular stock and/or market.
The idea behind the efficiency of markets is that if there’s information that is not yet priced in, traders have an immediate financial incentive to correct market prices which will then lead to market prices incorporating this information.
For this mechanism to work well, various conditions need to be met. In the quote above, Tetlock mentions two of them: a market has to be large and liquid.
Your small office prediction market won’t do so well because there isn’t enough money and not enough people trading. Correcting prices isn’t worth the opportunity cost. The more money you can make in your market by correcting prices, the more likely it is the opportunity cost will be worth it. If you run a prediction market on when Mark will finish project X and you think the current estimate is hopelessly optimistic, is it really worth the cost of aggravating Mark to earn a few bucks? Of course not.
The same argument applies to transaction costs - they have to be low, otherwise correcting prices isn’t worth the cost. Your office presumably has other priorities than making sure its workers can trade easily.
That said, your office prediction market will still likely do better than taking the average of the guesses of your office workers.
There’s another problem peculiar to prediction markets that run predictions for the more distant future. If you notice that a future election prediction in 2021 is only a couple of percent off, it’s not worth it for you to invest your savings in correcting the price. You’re better off making money on index funds, which return a bit more than a couple of percent over three years!
But what about other, bigger markets for making actually relevant predictions about the future? Surely they will be more efficient?
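The opportunity cost argument can be made concrete with a back-of-the-envelope sketch. All numbers are hypothetical: a contract priced at 50% that you believe is worth 52%, a three-year wait until it resolves, and an assumed 5% annual index fund return. The sketch ignores risk, fees and taxes.

```python
def annualized_expected_return(
    market_price: float, your_probability: float, years: float
) -> float:
    """Annualized growth of the expected payoff from buying a contract
    that pays 1 if the event happens and 0 otherwise.

    Simplification: this annualizes the expected payoff and says
    nothing about the risk of losing the whole stake.
    """
    expected_growth = your_probability / market_price
    return expected_growth ** (1 / years) - 1


# Hypothetical mispricing of two percentage points, resolved in three years.
prediction_return = annualized_expected_return(
    market_price=0.50, your_probability=0.52, years=3
)

# Assumed long-run annual return of an index fund (illustrative only).
index_fund_return = 0.05

print(f"prediction market: {prediction_return:.1%} per year")
print(f"index fund:        {index_fund_return:.1%} per year")
```

Under these assumptions, correcting the price earns a bit over 1% per year in expectation, well below the assumed index fund return, so the mispricing is likely to persist.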
I’m skeptical. How stringently the conditions for market efficiency need to be met for a market to actually be efficient is an empirical question. How efficient a prediction market needs to be to give better forecasts than the alternatives is another one.
For example, despite high liquidity, political prediction markets aren’t that efficient. In the 2012 US presidential election, there was a big arbitrage opportunity between Intrade (a former US prediction market) and other prediction markets betting on the outcome of the presidential election. A single trader on Intrade lost millions by continuously betting on Romney. The most likely explanation for this trader’s behaviour is that they were trying to distort published predictions to manipulate the election outcome, since published polls and prediction market odds tend to influence how people actually vote.
From this trader’s perspective, investing millions to increase the chance of Romney winning the election may have been a good investment, but this example shows the problems that can result when decisions are going to be made based on the prediction market’s prices.
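To see what an arbitrage opportunity between two markets means, here is a minimal sketch. The prices are hypothetical and are not the actual 2012 Intrade figures; it assumes a two-outcome event, contracts that pay 1 on a win, each market pricing the second candidate at one minus the first, and no fees.

```python
def arbitrage_profit(price_a_market_1: float, price_a_market_2: float) -> float:
    """Risk-free profit per pair of contracts when two markets disagree.

    Buy candidate A where A is cheaper, and buy candidate B where A is
    more expensive (so B is cheaper there). Exactly one of the two
    contracts pays out 1, whoever wins.
    """
    cheap_a = min(price_a_market_1, price_a_market_2)
    expensive_a = max(price_a_market_1, price_a_market_2)
    cheap_b = 1 - expensive_a  # assumed price of B on the market where A is pricey
    total_cost = cheap_a + cheap_b
    return 1 - total_cost


# Hypothetical: one market prices candidate A at 40%, another at 30%.
profit = arbitrage_profit(0.40, 0.30)
print(f"guaranteed profit per contract pair: {profit:.2f}")
```

In an efficient market, traders would exploit such a gap until it closed. A gap that stays open for a long time suggests that fees, capital limits or legal barriers keep them from doing so.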
That even markets like political betting are not that efficient shows us how difficult it is to fulfill the necessary market efficiency conditions. I’m still optimistic about sufficiently large, stock-market-like prediction markets in the future. But all in all, I think the merit of prediction markets is overblown.
A way to frame this question is to ask how we get the best predictions for the least amount of effort, with different strategies having different levels of effort and accuracy of output. A strategy would be considered dominated if a different strategy both required less effort and gave better accuracy. I think a pretty good case can be made that “teams of forecasters working together with their results extremized” clearly requires less effort and is possibly more accurate than, or in the same ballpark as, prediction markets. If that is the case, I think the argument for setting up/using prediction markets is greatly weakened. It seems like if someone did systematic research into the highest value/least resource consumption predictions, prediction markets would not score at the top of many overall rankings given their high cost. Also, some evidence for the high resource cost might be that EAs, although quite excited, driven and intelligent, cannot get a prediction market going with more than a few bets on a given question.
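The notion of a dominated strategy can be written down as a small check. This is only a sketch: the strategy names, effort scores and accuracy scores are abstract placeholders, not measurements of any real forecasting method.

```python
from typing import NamedTuple


class Strategy(NamedTuple):
    name: str
    effort: float    # lower is better (placeholder units)
    accuracy: float  # higher is better (placeholder units)


def is_dominated(candidate: Strategy, strategies: list[Strategy]) -> bool:
    """True if some other strategy needs less effort AND is more accurate."""
    return any(
        other.effort < candidate.effort and other.accuracy > candidate.accuracy
        for other in strategies
        if other is not candidate
    )


# Placeholder values chosen only to illustrate the definition.
strategies = [
    Strategy("A", effort=1.0, accuracy=0.6),
    Strategy("B", effort=3.0, accuracy=0.8),
    Strategy("C", effort=5.0, accuracy=0.7),
]

undominated = [s.name for s in strategies if not is_dominated(s, strategies)]
print(undominated)
```

Here C is dominated by B, which is both cheaper and more accurate, while A and B each remain a reasonable choice depending on how much effort is available.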
Who are you arguing against? The three links in your first paragraph go to articles that don't clearly disagree with you.
I'd guess that there would be fewer trades than otherwise, and this would often offset any benefits that come from the high quality of the participants.
I'm arguing against prediction markets being the best alternative in many situations contemplated by EAs, which is something I have heard said or implied by a lot of EAs in conversations I've had with them. Most notably, I think a lot of EAs are unaware of the arguments I make in the post and I wanted to have them written up for future reference.
I have had a lot of EAs say this to me in person as well.
You seem to be comparing prediction markets to perfection, not to the real mechanisms that we now use today instead. People proposing prediction markets are suggesting they'd work better than the status quo. They are usually not comparing them to something like GJP.
I agree with you prediction markets are in many cases better than the status quo. I'm not comparing prediction markets to perfection but to their alternatives (like extremizing team forecasts). I'm also only arguing that prediction markets are overrated within EA, not in the wider world. I'd assume they're underrated outside of libertarian-friendly circles.
All in all, for which problems prediction markets do better than which alternatives is an empirical question, which I state in the post:
Do you disagree that in the specific examples I have given (an office prediction market about the timeline of a project, an election prediction market) having a prediction market is worse than the alternatives?
It would be good if you could give concrete examples where you expect prediction markets to be the best alternative.
Prediction markets are a neat concept, and are often regarded highly in the EA sphere. I think they are often not the best alternative for a given problem and are insufficiently compared to those alternatives within EA. Perhaps because they are such a neat concept - "let's just do a prediction market!" sounds a lot more exciting than discussing a problem in a team and extremizing the team's forecast even though a prediction market would be a lot more work.
Without some concrete estimate of how highly prediction markets are currently rated, it's hard to say if they are overrated or underrated. They are almost never used, however, so it is hard to believe they are overused.
The office prediction markets you outline might well be useful. They aren't obviously bad.
I see huge potential for creating larger markets to estimate altruism effectiveness. We don't have any such at the moment, or even much effort to make them, so I find it hard to see that there's too much effort there.
For example, it would be great to create markets estimating advertised outcomes from proposed World Bank projects. That might well pressure the Bank into adopting projects more likely to achieve those outcomes.
I don't think prediction markets are overused by EAs, I think they are advocated for too much (both for internal lower stakes situations as well as for solving problems in the world) when they are not the best alternative for a given problem.
One problem with prediction markets is that they are a hassle to implement, which is why people don't actually want to implement them. But since they are often the first alternative to the status quo suggested within EA, better solutions for lower stakes situations like office forecasts, which might have a chance of actually getting implemented, don't even get discussed.
I don't think an office prediction market would be bad or not useful once you ignore opportunity costs, just worse than the alternatives. To be fair, I'm somewhat more optimistic for implementing office prediction markets in large workspaces like Google, but not for the small EA orgs we have. In those they would more likely take up a bunch of work without actually improving the situation much.
How large do you think a market needs to be to be efficient enough to be better than, say, asking Tetlock for the names of the top 30 superforecasters and hiring them to assess the problem? Given that political betting, despite being pretty large, had such big trouble as described in the post, I'm afraid an efficient enough prediction market would take a lot of work to implement. I agree with you the added incentive structure would be nice, which might well make up for a lack of efficiency.
But again, I'm still optimistic about sufficiently large, stock-market-like prediction markets.
I think markets that have at least 20 people trading on any given question will on average be at least as good as any alternative.
Your comments about superforecasters suggest that you think what matters is hiring the right people. What I think matters is the incentives the people are given. Most organizations produce bad forecasts because they have goals which distract people from the truth. The biggest gains from prediction markets are due to replacing bad incentives with incentives that are closely connected with accurate predictions.
There are multiple ways to produce good incentives, and for internal office predictions, there's usually something simpler than prediction markets that works well enough.
Political betting had a problem relative to perfection, not relative to the actual other alternatives used; it did better than them according to accuracy studies.
Yes there are overheads to using prediction markets, but those are mainly for having any system at all. Once you have a system, the overhead to adding a new question is much lower. Since you don't have EA prediction markets now, you face those initial costs.
For forecasting in most organizations, hiring the top 30 superforecasters would go badly, as they don't know enough about that organization to be useful. Far better to have just a handful of participants from that organization.
I assumed you didn't mean an internal World Bank prediction market, sorry about that. As I said above, I'm more optimistic about large workplaces employing prediction markets. I don't know how many staff the World Bank employs. Do you agree now that prediction markets are an inferior solution to forecasting problems in small organizations? If yes, what do you think is the minimum staff size of a workplace for a prediction market to be efficient enough to be better than e.g. extremized team forecasting?
Could you link to the accuracy studies you cite that show that prediction markets do better than polling at predicting election results? I don't see any obvious big differences on a quick Google search. The next obvious alternative is asking whether people like Nate Silver did better than prediction markets. In the GJP, individual superforecasters sometimes did better than prediction markets, but teams of superforecasters did consistently better. Putting Nate Silver and his kin in a room seems to have a good chance of outperforming prediction markets then.
You also don't state your opinion on the Intrade incident. Since I cannot see that prediction markets are obviously a lot better than polls or pundits (they didn't call the 2016 surprises either), I find it questionable whether blatant attempts at voter manipulation through prediction markets are worth the cost. This is a big price to pay even if prediction markets did a bit better than polls or pundits.
Robin's position is that manipulators can actually improve the accuracy of prediction markets, by increasing the rewards to informed trading. On this view, the possibility of market manipulation is not in itself a consideration that favors non-market alternatives, such as polls or pundits.
Interesting! I am trading off accuracy with outside world manipulation in that argument, since accuracy isn't actually the main end goal I care about (but 'good done in the world' for which better forecasts of the future would be pretty useful).
Feel free to ignore if you don't think this is sufficiently important, but I don't understand the contrast you draw between accuracy and outside world manipulation. I thought manipulation of prediction markets was concerning precisely because it reduces their accuracy. Assuming you accept Robin's point that manipulation increases accuracy on balance, what's your residual concern?
Two points about prediction markets:
I'm arguing that the limit is hard to reach and when it isn't being reached, prediction markets are usually worse than alternatives. I'd be excited about a prediction market like Scott is describing in his post, but we are quite far away from implementing anything like that.
I also find it ironic that Scott's example discusses how hard election prediction markets are to corrupt, which is precisely what happened in the Intrade example above.
Regarding section 1, is there a reliable way to determine who these market-beating superforecasters are? What about in new domains? Do we have to have a long series of forecasts in any new domain before we can pick out the superforecasters?
Somewhat relatedly, what guarantees do we have that the superforecasters aren't just getting lucky? Surely, some portion of them would revert to the mean if we continued to follow their forecasts.
Altogether, this seems somewhat analogous to the arguments around active vs passive investing where I think passive investing comes out on top.
So let's assume that teams of superforecasters with extremized predictions can do significantly better than any other mechanism of prediction that we've thought of, including prediction markets as they've existed so far. If so, then with prediction markets of sufficiently high volume and liquidity (just for the sake of argument, imagine prediction markets on the scale of the NYSE today), we would expect firms to crop up that would identify superforecasters, train them, and optimize for exactly how much to extremize their predictions (as well as iterating on this basic formula). These superforecaster firms would come to dominate the prediction markets (we'd eventually wind up with companies that were like the equivalent of Goldman Sachs but for prediction markets), and the prediction markets would be better than any other method of prediction. Of course, we're a LONG way away from having prediction markets like that, but I think this at least shows the theoretical potential of large scale prediction markets.
That's what I thought, too. Hiring the top 30 superforecasters seems much less scalable than a big prediction market like the one you describe, where becoming a superforecaster would suddenly become a valid career. I wonder if it's not too far off to expect some more technocratic government to set one up at some point in the coming years. I wonder what the OP and others here would think about lobbying for prediction markets from an EA perspective.
On the thin markets problem, there's been some prior work (on doing some googling I found https://mason.gmu.edu/~rhanson/mktscore.pdf, but I recall reading a paper with a less scary title).
In the office case, an obvious downside to incentivising the market is that one may divert labour away from normal work, so it may still be that non-market solutions are superior.