Summary
Impact markets (which encourage retrospective funding, especially if they allow the resale of impact) have a severe downside risk: they can incentivize risky, likely net-negative projects, because they allow people to profit if they cause positive impact while inflicting no cost on them if they cause negative impact. This risk is hard to mitigate.
Impact markets are therefore themselves such a risky project. To avoid the resulting conflicts of interest, work to establish impact markets should only ever be funded prospectively (never retrospectively).
The risk
Suppose the certificates of a risky project are traded on an impact market. If the project ends up being beneficial, the market allows the people who own the certificates to profit. But if the project ends up being harmful, the market does not inflict a cost on them. The certificates of a project that ended up being extremely harmful are worth as much as the certificates of a project that ended up being neutral, namely nothing. Therefore, even if everyone believes that a certain project is net-negative, its certificates may be traded for a high price due to the chance that the project will end up being beneficial.[1]
Impact markets can thus incentivize people to create or fund net-negative projects. Denis Drescher used the term "distribution mismatch" for this risk: the probability distribution of investor profit does not match the probability distribution of the project's EV.
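To make the mismatch concrete, here is a minimal sketch with purely illustrative numbers (the probabilities and values are our assumptions, not anyone's actual estimates):

```python
# Hypothetical project: 10% chance of a large benefit, 90% chance of a
# comparably large harm. All figures are illustrative.
p_good = 0.1
value_if_good = 100    # social value if the project ends up beneficial
value_if_bad = -100    # social value if it ends up harmful

# The project's expected value is clearly negative:
project_ev = p_good * value_if_good + (1 - p_good) * value_if_bad  # -80.0

# An investor's payoff, however, is floored at zero: a harmful outcome makes
# the certificate worthless but imposes no further loss on its holders.
payoff_if_good = value_if_good  # assume a retro funder buys the impact at value
payoff_if_bad = 0               # certificate worth nothing, never less
expected_payoff = p_good * payoff_if_good + (1 - p_good) * payoff_if_bad  # 10.0

print(project_ev, expected_payoff)  # -80.0 10.0
```

Despite a clearly negative EV, buying the certificate early has a positive expected return.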
It seems especially important to prevent the risk from materializing in the domains of anthropogenic x-risks and meta-EA. Many projects in those domains can cause a lot of accidental harm because, for example, they can draw attention to info hazards, produce harmful outreach campaigns, run dangerous experiments (e.g. in machine learning or virology), shorten AI timelines, or intensify competition dynamics among AI labs.
Mitigating the risk is hard
The Toward Impact Markets post describes an approach that attempts to mitigate this risk. The core idea is that retro funders should consider the ex-ante EV rather than the ex-post EV if the former is smaller. (The details are more complicated; a naive implementation of this idea would incentivize people to launch a safe project and later expand it to include high-risk high-reward interventions.)
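A minimal sketch of that core idea, under our simplified reading of the proposal (`retro_payout` is a hypothetical helper, not part of the actual specification):

```python
def retro_payout(ex_ante_ev: float, ex_post_ev: float) -> float:
    """Pay for the smaller of the ex-ante and ex-post EV, floored at zero,
    so a negative-in-expectation gamble earns nothing even if it pays off."""
    return max(0.0, min(ex_ante_ev, ex_post_ev))

retro_payout(-80.0, 100.0)  # 0.0  -- lucky gamble, no reward
retro_payout(50.0, 100.0)   # 50.0 -- reward capped at the ex-ante estimate
```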
We think that this approach cannot be relied upon to sufficiently mitigate the risk, for the following reasons:
- For that approach to succeed, retro funders must be familiar with it and be sufficiently willing and able to adhere to it. However, some potential retro funders are more likely to use a much simpler approach, such as "you should buy impact that you like".
- Other things being equal, simpler approaches are easier to communicate, more appealing to potential retro funders, more prone to become a meme and a norm, and more likely to be advocated for by teams who work on impact markets and want to get more traction.
- If there is no way to prevent anyone from becoming a retro funder, being careful about choosing/training the initial set of retro funders may not help much, especially if the market allows people to profit from outreach interventions that attract new, less careful retro funders.
- The price of a certificate tracks the maximum amount of money that any future retro funder will be willing to pay for it. Prudent retro funders do not (significantly) offset the influence of imprudent retro funders on the prices of certificates of net-negative projects (see the sketch after this list).
- Traditional (prospective) charitable funding can have a similar dynamic: it takes only one funder to support a project even if everyone else thinks it's bad. Impact markets make the problem much worse, though, because they add variance from uncertainty about project outcomes on top of variance in funder views.
- Suppose that a risky project that is ex-ante net-negative ends up being beneficial. If retro funders evaluate it only after it has turned out well, hindsight bias can easily cause them to overestimate its ex-ante EV. This phenomenon can make the certificates of net-negative projects more appealing to investors even at an early stage of the project, before it is known whether it will end up beneficial or harmful.
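As a sketch of the pricing point above (the bids are hypothetical):

```python
# One imprudent retro funder is enough to set the price of a certificate of
# a net-negative project; prudent funders bidding zero do not offset them.
bids = {"prudent_funder_a": 0, "prudent_funder_b": 0, "imprudent_funder": 90}
price = max(bids.values())
print(price)  # 90
```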
The conflict of interest problem for establishing impact markets
Can we just trust that people interested in establishing impact markets will do so only if it's a good idea? Unfortunately, the incentivization of risky projects applies at this level too. If someone establishes an impact market and it has large benefits, they can expect to sell their impact in establishing it for a large amount of money. If, on the other hand, the market causes large harms, they will not lose a correspondingly large amount.
Establishing impact markets would probably involve many high-stakes decisions under great uncertainty. (E.g. Should an impact market be launched? Should the impact market be decentralized? Should a certain person be invited to serve as a retro funder? Should certain certificates be deleted? What instructions should be communicated to potential market participants?) We should protect the integrity of these decisions by insulating them from conflicts of interest.
This point seems important even conditional on the people involved being the most careful and EA-aligned people in the world: they are still human, and human judgment is likely to be affected by biases and self-deception when a huge financial profit is at stake.
Suggestions
- Currently, launching impact markets seems to us (non-robustly) net-negative. The following types of impact markets seem especially concerning:
- Decentralized impact markets (in which there are no accountable decision makers that can control or shut down the market).
- Impact markets that allow certificates for risky interventions, and especially interventions that are related to the impact market itself (e.g. recruiting new retro funders).
- On the other hand, we’re excited about work to further understand the benefits and costs of different funding structures. If there were a robust mechanism that allowed the markets to avoid the risks discussed in this post (and ideally handle moral trade as well), we think impact markets could have very high potential. We just don’t think we’re there yet.
- In any case, launching an impact market should not be done without (weak) consensus among the EA community, in order to avoid the unilateralist's curse.
- To avoid tricky conflicts of interest, work to establish impact markets should only ever be funded in forward-looking ways. Retro funders should commit to not buying the impact of work that led to impact markets (at least work done before the incentivization of net-negative projects has been robustly resolved, if it ever is). The EA community should socially disapprove of anyone who worked on impact markets trying to sell the impact of that work.
- All of this relates to markets which encourage retrospective funding (especially but not exclusively if they also allow for the resale of impact).
- In particular, this is not intended to apply to introducing market-like mechanisms like explicit allocation of credit between contributors to projects. While such mechanisms may be useful for supporting impact markets, they are also useful in their own right (for propagating price information without distorting incentives), and we’re in favour of experiments with such credit allocation.
[1] The risk was probably first pointed out by Ryan Carey.
Dawn’s (Denis’s) Intellectual Turing Test Red-Teaming Impact Markets
[Edit: Before you read this, note that I failed. See the comment below.]
I want to check how well I understand Ofer’s position against impact markets. The “Imagined Ofer” below is how I imagine Ofer to respond (minus language – I’m not trying to imitate his writing style though our styles seem similar to me). I would like to ask the real Ofer to correct me wherever I’m misunderstanding his true position.
I currently favor using the language of prize contests to explain impact markets unless I talk to someone intimately familiar with for-profit startups. People seem to understand it more easily that way.
My model of Ofer is informed by (at least) these posts/comment threads.
Dawn: I’m doing these prize contests now where I encourage people to help each other (monetarily and otherwise) to produce awesome work to reduce x-risks, and finally I reward all participants in the best projects. I’m writing software to facilitate this. I will only reward them in proportion to the gains from moral trade that they’ve generated, and I’ll use my estimate of a project’s ex ante EV as a ceiling for my overall evaluation of it.
This has all sorts of benefits! It’s basically a wide-open regrantor program where the quasi-regrantors (the investors) absorb most of the risk. It scales grantmaking up and down: grantmakers have ~10x less work and can thus scale their operation up by 10x, and the investors can be anyone around the world, drawing on their existing networks, so they can consider many more, much smaller investments, or investments that require very niche knowledge or access. Many more ideas will get tried, and it’ll be easier for people to start projects even when they still lack personal contact with the right grantmakers.
Imagined Ofer: That seems very dangerous to me. What if someone else also offers a reward and also encourages people to help each other with the projects but does not apply your complicated ex ante EV ceiling? Someone may create a flashy but extremely risky project and attract a lot of investors for it.
Dawn: But they can do that already? All sorts of science prizes, all the other EA-related prizes, Bountied Rationality, new prizes they promote on Twitter, etc.
Imagined Ofer: Okay, but you’re building software to make it easier, so presumably you’ll thereby increase the number of people who offer such prizes and the number of people who attract investments in advance, because the user experience and the networking with investors are smoother and because they’re encouraged to do so.
Dawn: That’s true. We should make our software relatively unattractive to such prize offerers and their audiences, for example by curating the projects on it such that only the ones that are deemed to be robustly positive in impact are displayed (something I proposed from the start, in Aug. 2021). I could put together a team of experts for this.
Imagined Ofer: That’s not enough. What if you or your panel of experts overlook that a project was actually ex ante net-negative in EV, for example because it has already matured and so happened to turn out good? You’d be biased in a predictably upward direction in your assessment of the ex ante EV. In fact, people could do a lot of risky projects and then only ever submit the ones that worked out fine.
Dawn: Well, we can try really hard… Pay bounties for spotting projects that were negative in ex ante EV but slipped through; set up a network of auditors; make it really easy and effortless to hold compounding short positions on projects that manage their -1x leverage automatically; recruit firms like Hindenburg Research (or individuals with similar missions) to short projects and publish exposés on them; require issuers to post collateral; set up mechanisms whereby it becomes unlikely that there’ll be other prizes with any but a small market share (such as the “pot”); maybe even require preregistration of projects to avoid the tricks you mention; etc. (All the various fixes I propose in Toward Impact Markets.)
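(One possible reading of the automatically managed -1x short mentioned here, as an illustrative sketch rather than a mechanism specified in Toward Impact Markets: the position rebalances its exposure after each price update.)

```python
# Illustrative sketch: a short position that resets to -1x exposure after
# each price update, so it compounds gains as a certificate's price falls
# and can never lose more than the collateral behind it.
def rebalanced_short_value(prices, start_value=1.0):
    value = start_value
    for prev, cur in zip(prices, prices[1:]):
        period_return = cur / prev - 1
        value *= max(0.0, 1 - period_return)  # re-establish -1x each period
    return value

rebalanced_short_value([1.0, 0.5, 0.25])  # 2.25 -- profits as the price falls
```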
Imagined Ofer: Those are only unreliable patches for a big fundamental problem. None of them is going to be enough, not even in combination. They are slow and incomplete. Ex ante negative projects can slip through the cracks or remain undetected for long enough to cause harm in this world or a likely counterfactual world.
Dawn: Okay, so one slips through, attracts a lot of investment, gets big, maybe even manages to fool us into awarding it prize money. It or new projects in the same reference class have some positive per-year probability of being found out due to all the safety mechanisms. Eventually a short-seller or an exposé-bounty poster will spot them and make a lot of money for doing so. We will react and make it super-duper clear going forward that we will not reward projects in that reference class ever again. Anyone who wants to get investments will need to make the case that their project is not in that reference class.
Imagined Ofer: But by that time the harm is done, be it only to a counterfactual world. Next time the harm will be done to the factual world. Besides, regardless of how safe you actually make the system, what’s important is that there can always be issuers and investors who believe (however wrongly) that they can get their risky project retro-funded. You can’t prevent that no matter how safe you make the system.
Dawn: But that seems overly risk averse to me because prospective funders can also make mistakes, and current prizes – including prizes in EA – are nowhere near as safe. Once our system is safer than any other existing methods, the bad actors will prefer the existing methods.
Imagined Ofer: The existing methods are much safer. Prospective funding is as safe as it gets, and current prizes have a time window of months or so; by the time such prizes are awarded, the winning projects are still very young, so the awards are based on something still very close to ex ante EV.
Dawn: But retroactive funders can decide when to award prizes. In fact, we have gone with a month in our experiment. Admittedly, in the end I imagine that cycles of a year or two are more realistic. That is still not that much more. (See this draft FAQ for some calculations. Retro funders will pay out prizes of up to 1000% in the success case, but outside the success case investors will lose all or most of their principal. They are hits-based investors, so their benchmark return is probably much higher than a riskless ~5% per year. They’ll probably not want to stay in certificates for more than a few years even at a 1000% return in the success case.)
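Roughly, the arithmetic behind those numbers looks like this (an illustrative sketch, not the FAQ’s exact calculation):

```python
# Illustrative break-even sketch with the figures mentioned above.
success_multiple = 10.0  # "up to 1000%" return in the success case
years_held = 2           # cycles of a year or two
hurdle_per_year = 0.20   # hits-based benchmark, well above a riskless ~5%/yr

required_growth = (1 + hurdle_per_year) ** years_held  # 1.44x over two years
break_even_p = required_growth / success_multiple      # total loss otherwise
print(break_even_p)  # ~0.144 -- investors need at least this success probability
```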
Imagined Ofer: A lot more can happen in a year or two than in a month. EA, for example, looked very different in 2013 compared to 2015, but it looked about the same in January vs. February 2015. But more importantly, you write about tying the windfall clauses of AGI companies to retro funding with enormous budgets, budgets that surely offset even the 20 years that it may take to get to that point and the very low probability.
Dawn: The plan I wrote about has these windfalls reward projects that were previously already rewarded by our regular retro funders, no more.
Imagined Ofer: But what keeps a random, unaligned AGI company from just using the mechanism to reward anyone they like?
Dawn: True. Nothing. Let’s keep this idea private. I can unpublish my EA Forum post too, but maybe that’s the audience that should know about it if anyone should. As an additional safeguard against uncontrolled speculation, how about we require people to always select one or several actual present prize rounds when they submit a project?
Imagined Ofer: That might help, but people could just churn out small projects and select whatever prize happens to be offered at the time, when in actuality they’re hoping that one of these prizes will eventually be mentioned in a windfall clause, or that their project will otherwise be retro funded through a windfall clause or some other future funder who ignores the setting.
Dawn: Okay, but consider how far down the rabbit hole we’ve gone now: We have a platform that is moderated; we have relatively short cycles for the prize contest (currently just one month); we explicitly offer prizes for exposés; we limit our prizes to stuff that is, by dint of its format, unlikely to be very harmful; we even started with EA Forum posts, which come with their own highly qualified moderation team. Further, we want to institute more mechanisms – besides exposés – that make it easy to short certificates to encourage people to red-team them; mechanisms to retain control of the market norms even if many new retro funders enter; even stricter moderation; etc. We’re even considering requiring preregistration, mandatory selection of present prize rounds (even though it runs counter to how I feel impact markets should work), and very narrow targets set by retro funders (like my list of research questions in our present contest). Compare that to other EA prize contests. Meanwhile, the status quo is that anyone with some money and a Twitter following can run a prize contest, and anyone can make a contract with a rich friend to secure a seed investment that they’ll repay if they win. All of our countless safeguards should make it vastly easier for unaligned retro funders and unaligned project founders to do anything other than use our platform. All that remains is that maybe we’re spreading the meme that you can seed-invest in potential prize winners, but that’s also something that is already happening around the world with countless science prizes. What more can we do?
Imagined Ofer: This is not an accusation – we’re all human – but money and the sunk-cost fallacy corrupt. For all I know, this could be a motte-and-bailey type of situation: The moment a big crypto funder offers you a $1m grant, you might throw caution to the wind and write a wide-open, ungated blockchain implementation of an impact market.
Dawn: I hope I’ve made clear in my 20,000+ words of writing on impact market safety that were unprompted by your comments (other than the first one in 2021) that my personal prioritization has long rested on robustness over mere positive EV. I’ve just quit my well-paid ETG job as a software engineer in Switzerland to work on this. If I were in it for the money (beyond what I need for my financial safety), I wouldn’t be doing this. Our organization is also set up with a very general purview so that we can pivot easily. So if I should start work on a more open version of the currently fully moderated, centralized implementation, it will be because I’ve come to believe that it’s more robustly positive than I currently think it is. (Or it may well be possible to find a synthesis of permissionlessness and curation.) The only things that can convince me otherwise are evidence and arguments.
Imagined Ofer: I think that most interventions that have a substantial chance to prevent an existential catastrophe also have a substantial chance to cause an existential catastrophe, such that it’s very hard to judge whether they are net-positive or net-negative (due to complex cluelessness dynamics that are caused by many known and unknown crucial considerations). So the typical EA Forum post with sufficient leverage over our future to make a difference at all is about equally likely to increase or to decrease x-risk.
Dawn: I find that to be an unusual opinion. CEA and others try to encourage people to post on the EA Forum rather than discourage them. That was also the point of the CEA-run EA Forum contest. Personally, I also find it unintuitive: For any given post, I try to think of pathways along which it could be beneficial and detrimental. Usually there are few detrimental pathways, and if there are any, strong social norms around malice and government institutions such as the police stand in the way of pursuing them. A few posts come to mind that are rare, unusual exceptions to this theme, but it’s been several years since I read one of those. Complex cluelessness also doesn’t seem to make a difference here because it applies equally to any prospective funding, to prizes after one month, and to prizes after one year. Do you think that writing on high-leverage topics such as x-risks should generally be discouraged rather than encouraged on the EA Forum?
Imagined Ofer: Even if you create a very controlled impact market that is safer than the average EA prize contest, you are still creating a culture and a meme around retroactive funding. You could inspire someone to post on Twitter: “The current impact markets are too curated. I’m offering a $10m retro prize for dumping 500 tons of iron sulfate into the ocean to solve climate change.” If someone posted this now, no one would take them seriously. If you create an impact market with tens of millions of dollars flowing through it and many market actors, it will become believable to some rogue players that this payout is likely real.
I do not endorse the text written by "Imagined Ofer" here. Rather than describing all the differences between that text and what I would really say, I've now published this reply to your first comment.