Summary
Impact markets (which encourage retrospective funding, especially if they allow the resale of impact) have a severe downside risk: they can incentivize risky, likely net-negative projects, because they allow people to profit if they cause positive impact while inflicting no cost on them if they cause negative impact. This risk is hard to mitigate.
Impact markets are therefore themselves such a risky project. To avoid the resulting conflicts of interest, work to establish impact markets should only ever be funded prospectively (never retrospectively).
The risk
Suppose the certificates of a risky project are traded on an impact market. If the project ends up being beneficial, the market allows the people who own the certificates to profit. But if the project ends up being harmful, the market does not inflict a cost on them. The certificates of a project that ended up being extremely harmful are worth as much as the certificates of a project that ended up being neutral, namely nothing. Therefore, even if everyone believes that a certain project is net-negative, its certificates may be traded for a high price due to the chance that the project will end up being beneficial.[1]
Impact markets can thus incentivize people to create or fund net-negative projects. Denis Drescher used the term "distribution mismatch" for this risk: the probability distribution of investor profit does not match the probability distribution of the project's EV.
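To make the mismatch concrete, here is a minimal sketch with purely illustrative numbers (the probabilities and values are our assumptions, not anyone's actual estimates):

```python
# Hypothetical project: 10% chance of a large benefit, 90% chance of a
# comparably large harm. All figures are illustrative.
p_good = 0.1
value_if_good = 100    # social value if the project ends up beneficial
value_if_bad = -100    # social value if it ends up harmful

# The project's expected value is clearly negative:
project_ev = p_good * value_if_good + (1 - p_good) * value_if_bad  # -80.0

# An investor's payoff, however, is floored at zero: a harmful outcome makes
# the certificate worthless but imposes no further loss on its holders.
payoff_if_good = value_if_good  # assume a retro funder buys the impact at value
payoff_if_bad = 0               # certificate worth nothing, never less
expected_payoff = p_good * payoff_if_good + (1 - p_good) * payoff_if_bad  # 10.0

print(project_ev, expected_payoff)  # -80.0 10.0
```

Despite a clearly negative EV, buying the certificate early has a positive expected return.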
It seems especially important to prevent the risk from materializing in the domains of anthropogenic x-risks and meta-EA. Many projects in those domains can cause a lot of accidental harm because, for example, they can draw attention to info hazards, produce harmful outreach campaigns, run dangerous experiments (e.g. in machine learning or virology), shorten AI timelines, or intensify competition dynamics among AI labs.
Mitigating the risk is hard
The Toward Impact Markets post describes an approach that attempts to mitigate this risk. The core idea is that retro funders should consider the ex-ante EV rather than the ex-post EV if the former is smaller. (The details are more complicated; a naive implementation of this idea would incentivize people to launch a safe project and later expand it to include high-risk high-reward interventions.)
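A minimal sketch of that core idea, under our simplified reading of the proposal (`retro_payout` is a hypothetical helper, not part of the actual specification):

```python
def retro_payout(ex_ante_ev: float, ex_post_ev: float) -> float:
    """Pay for the smaller of the ex-ante and ex-post EV, floored at zero,
    so a negative-in-expectation gamble earns nothing even if it pays off."""
    return max(0.0, min(ex_ante_ev, ex_post_ev))

retro_payout(-80.0, 100.0)  # 0.0  -- lucky gamble, no reward
retro_payout(50.0, 100.0)   # 50.0 -- reward capped at the ex-ante estimate
```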
We think that this approach cannot be relied upon to sufficiently mitigate the risk, for the following reasons:
- For that approach to succeed, retro funders must be familiar with it and be sufficiently willing and able to adhere to it. However, some potential retro funders are more likely to use a much simpler approach, such as "you should buy impact that you like".
- Other things being equal, simpler approaches are easier to communicate, more appealing to potential retro funders, more prone to become a meme and a norm, and more likely to be advocated for by teams who work on impact markets and want to get more traction.
- If there is no way to prevent anyone from becoming a retro funder, being careful about choosing/training the initial set of retro funders may not help much, especially if the market allows people to profit from outreach interventions that attract new, less careful retro funders.
- The price of a certificate tracks the maximum amount of money that any future retro funder will be willing to pay for it. Prudent retro funders do not (significantly) offset the influence of imprudent retro funders on the prices of certificates of net-negative projects (see the sketch after this list).
- Traditional (prospective) charitable funding can have a similar dynamic: it takes only one funder to support a project even if everyone else thinks it's bad. Impact markets make the problem much worse, though, because they add variance from uncertainty about project outcomes on top of variance in funder views.
- Suppose that a risky project that is ex-ante net-negative ends up being beneficial. If retro funders evaluate it only after it has turned out well, hindsight bias can easily cause them to overestimate its ex-ante EV. This phenomenon can make the certificates of net-negative projects more appealing to investors even at an early stage of the project, before it is known whether it will end up beneficial or harmful.
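As a sketch of the pricing point above (the bids are hypothetical):

```python
# One imprudent retro funder is enough to set the price of a certificate of
# a net-negative project; prudent funders bidding zero do not offset them.
bids = {"prudent_funder_a": 0, "prudent_funder_b": 0, "imprudent_funder": 90}
price = max(bids.values())
print(price)  # 90
```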
The conflict of interest problem for establishing impact markets
Can we just trust that people interested in establishing impact markets will do so only if it's a good idea? Unfortunately, the incentivization of risky projects applies at this level too. If someone establishes an impact market and it has large benefits, they can expect to sell their impact in establishing it for a large amount of money. If, on the other hand, the market causes large harms, they will not lose a correspondingly large amount.
Establishing impact markets would probably involve many high-stakes decisions under great uncertainty. (E.g. Should an impact market be launched? Should the impact market be decentralized? Should a certain person be invited to serve as a retro funder? Should certain certificates be deleted? What instructions should be communicated to potential market participants?) We should protect the integrity of these decisions by insulating them from conflicts of interest.
This point seems important even conditional on the people involved being the most careful and EA-aligned people in the world: they are still human, and human judgment is likely to be affected by biases and self-deception when a huge financial profit is at stake.
Suggestions
- Currently, launching impact markets seems to us (non-robustly) net-negative. The following types of impact markets seem especially concerning:
- Decentralized impact markets (in which there are no accountable decision makers that can control or shut down the market).
- Impact markets that allow certificates for risky interventions, and especially interventions that are related to the impact market itself (e.g. recruiting new retro funders).
- On the other hand, we’re excited about work to further understand the benefits and costs of different funding structures. If there were a robust mechanism that allowed the markets to avoid the risks discussed in this post (and ideally handle moral trade as well), we think impact markets could have very high potential. We just don’t think we’re there yet.
- In any case, launching an impact market should not be done without (weak) consensus among the EA community, in order to avoid the unilateralist's curse.
- To avoid tricky conflicts of interest, work to establish impact markets should only ever be funded in forward-looking ways. Retro funders should commit to not buying the impact of work that led to impact markets (at least work done before the incentivization of net-negative projects has been robustly resolved, if it ever is). The EA community should socially disapprove of anyone who worked on impact markets trying to sell the impact of that work.
- All of this relates to markets which encourage retrospective funding (especially but not exclusively if they also allow for the resale of impact).
- In particular, this is not intended to apply to introducing market-like mechanisms like explicit allocation of credit between contributors to projects. While such mechanisms may be useful for supporting impact markets, they are also useful in their own right (for propagating price information without distorting incentives), and we’re in favour of experiments with such credit allocation.
[1] The risk was probably first pointed out by Ryan Carey.
Dawn’s (Denis’s) Intellectual Turing Test Red-Teaming Impact Markets
[Edit: Before you read this, note that I failed. See the comment below.]
I want to check how well I understand Ofer’s position against impact markets. The “Imagined Ofer” below is how I imagine Ofer to respond (minus language – I’m not trying to imitate his writing style though our styles seem similar to me). I would like to ask the real Ofer to correct me wherever I’m misunderstanding his true position.
I currently favor using the language of prize contests to explain impact markets unless I talk to someone intimately familiar with for-profit startups. People seem to understand it more easily that way.
My model of Ofer is informed by (at least) these posts/comment threads.
Dawn: I’m doing these prize contests now where I encourage people to help each other (monetarily and otherwise) to produce awesome work to reduce x-risks, and finally I reward all participants in the best projects. I’m writing software to facilitate this. I will only reward them in proportion to the gains from moral trade that they’ve generated, and I’ll use my estimate of a project’s ex ante EV as a ceiling for my overall evaluation of it.
This has all sorts of benefits! It’s basically a wide-open regrantor program where the quasi-regrantors (the investors) absorb most of the risk. It scales grantmaking up and down: grantmakers have ~10x less work and can thus scale their operation up by 10x, and the investors can be anyone around the world, drawing on their existing networks, so they can consider many more, much smaller investments, or investments that require very niche knowledge or access. Many more ideas will get tried, and it’ll be easier for people to start projects even when they still lack personal contact with the right grantmakers.
Imagined Ofer: That seems very dangerous to me. What if someone else also offers a reward and also encourages people to help each other with the projects but does not apply your complicated ex ante EV ceiling? Someone may create a flashy but extremely risky project and attract a lot of investors for it.
Dawn: But they can do that already? All sorts of science prizes, all the other EA-related prizes, Bountied Rationality, new prizes they promote on Twitter, etc.
Imagined Ofer: Okay, but you’re building software to make it easier, so presumably you’ll thereby increase the number of people who offer such prizes and the number of people who attract investments in advance, because the user experience and the networking with investors are smoother and because they’re encouraged to do so.
Dawn: That’s true. We should make our software relatively unattractive to such prize offerers and their audiences, for example by curating the projects on it such that only the ones that are deemed to be robustly positive in impact are displayed (something I proposed from the start, in Aug. 2021). I could put together a team of experts for this.
Imagined Ofer: That’s not enough. What if you or your panel of experts overlook that a project was actually ex ante net-negative in EV, for example because it has already matured and so happened to turn out good? You’d be biased in a predictably upward direction in your assessment of the ex ante EV. In fact, people could do a lot of risky projects and then only ever submit the ones that worked out fine.
Dawn: Well, we can try really hard… Pay bounties for spotting projects that were negative in ex ante EV but slipped through; set up a network of auditors; make it really easy and effortless to hold compounding short positions on projects that manage their -1x leverage automatically; recruit firms like Hindenburg Research (or individuals with similar missions) to short projects and publish exposés on them; require issuers to post collateral; set up mechanisms whereby it becomes unlikely that there’ll be other prizes with any but a small market share (such as the “pot”); maybe even require preregistration of projects to avoid the tricks you mention; etc. (All the various fixes I propose in Toward Impact Markets.)
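(One possible reading of the automatically managed -1x short mentioned here, as an illustrative sketch rather than a mechanism specified in Toward Impact Markets: the position rebalances its exposure after each price update.)

```python
# Illustrative sketch: a short position that resets to -1x exposure after
# each price update, so it compounds gains as a certificate's price falls
# and can never lose more than the collateral behind it.
def rebalanced_short_value(prices, start_value=1.0):
    value = start_value
    for prev, cur in zip(prices, prices[1:]):
        period_return = cur / prev - 1
        value *= max(0.0, 1 - period_return)  # re-establish -1x each period
    return value

rebalanced_short_value([1.0, 0.5, 0.25])  # 2.25 -- profits as the price falls
```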
Imagined Ofer: Those are only unreliable patches for a big fundamental problem. None of them is going to be enough, not even in combination. They are slow and incomplete. Ex ante negative projects can slip through the cracks or remain undetected for long enough to cause harm in this world or a likely counterfactual world.
Dawn: Okay, so one slips through, attracts a lot of investment, gets big, maybe even manages to fool us into awarding it prize money. It or new projects in the same reference class have some positive per-year probability of being found out due to all the safety mechanisms. Eventually a short-seller or an exposé-bounty poster will spot them and make a lot of money for doing so. We will react and make it super-duper clear going forward that we will not reward projects in that reference class ever again. Anyone who wants to get investments will need to make the case that their project is not in that reference class.
Imagined Ofer: But by that time the harm is done, be it only to a counterfactual world. Next time the harm will be done to the factual world. Besides, regardless of how safe you actually make the system, what’s important is that there can always be issuers and investors who believe (however wrongly) that they can get their risky project retro-funded. You can’t prevent that no matter how safe you make the system.
Dawn: But that seems overly risk averse to me because prospective funders can also make mistakes, and current prizes – including prizes in EA – are nowhere near as safe. Once our system is safer than any other existing methods, the bad actors will prefer the existing methods.
Imagined Ofer: The existing methods are much safer. Prospective funding is as safe as it gets, and current prizes have a time window of months or so; by the time such prizes are awarded, the winning projects are still very young, so the awards are based on something still very close to ex ante EV.
Dawn: But retroactive funders can decide when to award prizes. In fact, we have gone with a month in our experiment. Admittedly, in the end I imagine that cycles of a year or two are more realistic. That is still not that much more. (See this draft FAQ for some calculations. Retro funders will pay out prizes of up to 1000% in the success case, but outside the success case investors will lose all or most of their principal. They are hits-based investors, so their benchmark return is probably much higher than a riskless ~5% per year. They’ll probably not want to stay in certificates for more than a few years even at a 1000% return in the success case.)
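Roughly, the arithmetic behind those numbers looks like this (an illustrative sketch, not the FAQ’s exact calculation):

```python
# Illustrative break-even sketch with the figures mentioned above.
success_multiple = 10.0  # "up to 1000%" return in the success case
years_held = 2           # cycles of a year or two
hurdle_per_year = 0.20   # hits-based benchmark, well above a riskless ~5%/yr

required_growth = (1 + hurdle_per_year) ** years_held  # 1.44x over two years
break_even_p = required_growth / success_multiple      # total loss otherwise
print(break_even_p)  # ~0.144 -- investors need at least this success probability
```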
Imagined Ofer: A lot more can happen in a year or two than in a month. EA, for example, looked very different in 2013 compared to 2015, but it looked about the same in January vs. February 2015. But more importantly, you write about tying the windfall clauses of AGI companies to retro funding with enormous budgets, budgets that surely offset even the 20 years that it may take to get to that point and the very low probability.
Dawn: The plan I wrote about has these windfalls reward projects that were previously already rewarded by our regular retro funders, no more.
Imagined Ofer: But what keeps a random, unaligned AGI company from just using the mechanism to reward anyone they like?
Dawn: True. Nothing. Let’s keep this idea private. I can unpublish my EA Forum post too, but maybe that’s the audience that should know about it if anyone should. As an additional safeguard against uncontrolled speculation, how about we require people to always select one or several actual present prize rounds when they submit a project?
Imagined Ofer: That might help, but people could just churn out small projects and select whatever prize happens to be offered at the time, when in actuality they’re hoping that one of these prizes will eventually be mentioned in a windfall clause, or that their project will otherwise be retro funded through a windfall clause or some other future funder who ignores the setting.
Dawn: Okay, but consider how far down the rabbit hole we’ve gone now: We have a platform that is moderated; we have relatively short cycles for the prize contest (currently just one month); we explicitly offer prizes for exposés; we limit our prizes to stuff that is, by dint of its format, unlikely to be very harmful; we even started with EA Forum posts, which come with their own highly qualified moderation team. Further, we want to institute more mechanisms – besides exposés – that make it easy to short certificates to encourage people to red-team them; mechanisms to retain control of the market norms even if many new retro funders enter; even stricter moderation; etc. We’re even considering requiring preregistration, mandatory selection of present prize rounds (even though it runs counter to how I feel impact markets should work), and very narrow targets set by retro funders (like my list of research questions in our present contest). Compare that to other EA prize contests. Meanwhile, the status quo is that anyone with some money and a Twitter following can run a prize contest, and anyone can make a contract with a rich friend to secure a seed investment that they’ll repay if they win. All of our countless safeguards should make it vastly easier for unaligned retro funders and unaligned project founders to do anything other than use our platform. All that remains is that maybe we’re spreading the meme that you can seed-invest in potential prize winners, but that’s also something that is already happening around the world with countless science prizes. What more can we do?
Imagined Ofer: This is not an accusation – we’re all human – but money and the sunk-cost fallacy corrupt. For all I know, this could be a motte-and-bailey type of situation: The moment a big crypto funder offers you a $1m grant, you might throw caution to the wind and write a wide-open, ungated blockchain implementation of an impact market.
Dawn: I hope I’ve made clear in my 20,000+ words of writing on impact market safety that were unprompted by your comments (other than the first one in 2021) that my personal prioritization has long rested on robustness over mere positive EV. I’ve just quit my well-paid ETG job as a software engineer in Switzerland to work on this. If I were in it for the money (beyond what I need for my financial safety), I wouldn’t be doing this. Our organization is also set up with a very general purview so that we can pivot easily. So if I should start work on a more open version of the currently fully moderated, centralized implementation, it will be because I’ve come to believe that it’s more robustly positive than I currently think it is. (Or it may well be possible to find a synthesis of permissionlessness and curation.) The only things that can convince me otherwise are evidence and arguments.
Imagined Ofer: I think that most interventions that have a substantial chance to prevent an existential catastrophe also have a substantial chance to cause an existential catastrophe, such that it’s very hard to judge whether they are net-positive or net-negative (due to complex cluelessness dynamics that are caused by many known and unknown crucial considerations). So the typical EA Forum post with sufficient leverage over our future to make a difference at all is about equally likely to increase or to decrease x-risk.
Dawn: I find that to be an unusual opinion. CEA and others try to encourage people to post on the EA Forum rather than discourage them. That was also the point of the CEA-run EA Forum contest. Personally, I also find it unintuitive: For any given post, I try to think of pathways along which it could be beneficial and detrimental. Usually there are few detrimental pathways, and if there are any, strong social norms around malice and government institutions such as the police stand in the way of pursuing them. A few posts come to mind that are rare, unusual exceptions to this theme, but it’s been several years since I read one of those. Complex cluelessness also doesn’t seem to make a difference here because it applies equally to any prospective funding, to prizes after one month, and to prizes after one year. Do you think that writing on high-leverage topics such as x-risks should generally be discouraged rather than encouraged on the EA Forum?
Imagined Ofer: Even if you create a very controlled impact market that is safer than the average EA prize contest, you are still creating a culture and a meme around retroactive funding. You could inspire someone to post on Twitter: “The current impact markets are too curated. I’m offering a $10m retro prize for dumping 500 tons of iron sulfate into the ocean to solve climate change.” If someone posted this now, no one would take them seriously. If you create an impact market with tens of millions of dollars flowing through it and many market actors, it will become believable to some rogue players that this payout is likely real.
I do not endorse the text written by "Imagined Ofer" here. Rather than describing all the differences between that text and what I would really say, I've now published this reply to your first comment.