Hey there!
Can you describe your meta process for deciding what analyses to work on and how to communicate them? Analyses about the future development of transformative AI can be extremely beneficial (including via publishing them and getting many people more informed). But getting many people more hyped about scaling up ML models, for example, can also be counterproductive. Notably, The Economist article that you linked to shows your work under the title "The blessings of scale". (I'm […]
META LEVEL REPLY:
Thinking about the ways publications can be harmful is something that I wish were practiced more widely in the world, especially in the field of AI.
That being said, I believe that in EA, and in particular in AI Safety, the pendulum has swung too far: we would benefit from discussing these issues more openly.
In particular, I think that talking about AI scaling is unlikely to goad major companies into investing much more in AI (there are already huge incentives). And I think EAs and people otherwise invested in AI Safety would benefit from having a […]
OBJECT LEVEL REPLY:
Our current publication policy is:
We think this policy has a good mix of being flexible and giving space for Epoch staff to raise concerns.
[…]
First of all, what we’ve summarized as “curation” so far could really be distinguished as follows:
1. Making access for issuers invite-only, maybe keeping the whole marketplace secret (in combination with #2) until we find someone who produces cool papers/articles and who we trust, and then invite them.
2. Making access for investors/retro funders invite-only, maybe keeping the whole marketplace secret (in combination with #1) until we find an impact investor or a retro funder who we trust, and then invite them.
3. Read every certificate either before or shortly after […]
The thing I'm looking for is the comparison between the benefits and the costs; are the costs larger?
Efficient impact markets would allow anyone to create certificates for a project and then sell them for a price that corresponds to a very good prediction of their expected future value. Therefore, sufficiently efficient impact markets will probably fund some high-EV projects that wouldn't otherwise be funded (because it's not easy for classical EA funders to evaluate them, or even find them in the space of possible projects). If we look at that set of projects […]
We would never submit our own certificates to a prize contest that we are judging, but we’d also be open to not submitting any of our impact market–related work to any other prize contests if that’s what consensus comes to.
Does this mean that you (the Impact Markets team) may sell certificates of your work to establish an impact market on that very impact market?
I do not endorse the text written by "Imagined Ofer" here. Rather than describing all the differences between that text and what I would really say, I've now published this reply to your first comment.
Web3: Seems about as bad as any web2 solution that allows people to easily back up their data.
I think that a decentralized impact market that can't be controlled or shut down seems worse. Also, a Web3 platform will make it less effortful for someone to launch a competing platform (either with or without the certificates from the original platform).
But abandoning the project of impact markets because of the downsides seems about as misguided to us as abandoning self-driving cars because of adversarial-example attacks on street signs.
I think the analogy would work better if self-driving cars did risky things that could cause a terrible accident, in order to prevent the battery from running out or to reach the destination sooner.
[…]
Attributed Impact may look complicated, but we’ve just operationalized something that is intuitively obvious to most EAs: expectational consequentialism. (And moral trade and so […]
(3) declaring the impact certificates not burned and allowing people some time to export their data.
That could make it easier for another team to create a new impact market that will seamlessly replace the impact market that is being shut down.
[…]
My original idea from summer 2021 was to use blockchain technology simply for technical ease of implementation (I wouldn’t have had to write any code). That would’ve made the certs random tokens among millions of others on the blockchain. But then to set up a centralized, curated marketplace for them with a smart […]
[Limited liability] is a historically unusual policy (full liability came first), and seems to me to have basically the same downsides (people do risky things, profiting if they win and walking away if they lose), and basically the same upsides (according to the theory supporting LLCs, there's too little investment and support of novel projects).
Can you explain the "same upsides" part?
[…]
Can you say more about why you think this consideration is sufficient to be net negative? (I notice your post seems very 'do-no-harm' to me instead of 'here are the positives […]
Seems like the AF disagrees about this being a problem, no?
(Not an important point [EDIT: meaning the text you are reading in these parentheses], but I don't think that a karma of 18 points is proof of that; maybe the people who took the time to go over that post and vote are mostly amateurs who found the topic interesting. Also, as an aside, if someone one day publishes a brilliant insight about how to develop AGI much faster, taking the post down can be net-negative due to the Streisand effect.)
I'm confident that almost all the alignment researchers […]
I expect this will reduce the price at which OpenAI is traded
But an impact market can still make OpenAI's certificates be worth $100M if, for example, investors have at least 10% credence in some future retro funder being willing to buy them for $1B (+interest). And that could be true even if everyone today believed that creating OpenAI is net-negative. See the "Mitigating the risk is hard" section in the OP for some additional reasons to be skeptical about such an approach.
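To spell out the arithmetic behind this, here is a minimal sketch (the 10% credence and $1B figures are just the illustrative numbers from above; the variable names are hypothetical):

```python
# Sketch: a certificate can trade at a high price today purely because of a
# small credence that *some* future retro funder will pay a huge amount for
# it, even if everyone currently believes the project is net-negative.

credence_retro_buyer = 0.10       # investors' credence that a retro funder buys
retro_price = 1_000_000_000       # amount that retro funder would pay ($1B)

# Risk-neutral expected resale value, ignoring interest and discounting:
expected_value_today = credence_retro_buyer * retro_price
print(f"${expected_value_today:,.0f}")  # $100,000,000
```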
I missed what you're replying to, though. Is it the "The problem of funding net[…]
It's just an example of how a post on the Alignment Forum can be net-negative and how it can be very hard to judge whether it's net-negative. For any net-negative intervention that impact markets would incentivize, if people can do it without funding, then the incentive to do impressive things can also cause them to carry out the intervention. In those cases, impact markets can cause those interventions to be more likely to be carried out.
I think that most interventions that have a substantial chance to prevent an existential catastrophe also have a substantial chance to cause an existential catastrophe, such that it's very hard to judge whether they are net-positive or net-negative (due to complex cluelessness dynamics that are caused by many known and unknown crucial considerations).
[…]
My model of you would say either that:
- funding those particular posts is net bad, or
- funding those two posts in particular may be net good, but it sets a precedent that will cause there to be further counte[…]
If someone wants to advance AI capabilities, they can already get prospective funding by opening a regular for-profit startup.
No?
Right. But without an impact market it can be impossible to profit from, say, publishing a post with a potentially transformative insight about AGI development. (See this post as a probably-harmless version of the type of posts I'm talking about here.)
If someone thinks a net-negative project is being traded on (or run at all), how about posting about it on the forum?
As we wrote in the post, even if everyone believes that a certain project is net-negative, its certificates may be traded for a high price due to the chance that the project will end up being beneficial. For example, consider OpenAI (I'm not claiming here that OpenAI is net-negative, but it seems that many people in EA think it is, and for the sake of this example let's imagine that everyone in EA thinks that). It's plausible that OpenAI […]
I think that it's more likely to be the result of an effort to mitigate potential harm from future pandemics. One piece of evidence that supports this is the grant proposal (which was rejected by DARPA) that is described in this New Yorker article. The grant proposal was co-submitted by the president of the EcoHealth Alliance, a non-profit which is "dedicated to mitigating the emergence of infectious diseases", according to the article.
I find it hard to believe that any version of the lab leak theory involved all the main actors scrupulously doing what they thought was best for the world.
I don't find it hard to believe at all. Conditional on a lab leak, I'm pretty confident no one involved was consciously thinking: "if we do this experiment it can end up causing a horrible pandemic, but on the other hand we can get a lot of citations."
Dangerous experiments in virology are probably usually done in a way that involves a substantial amount of effort to prevent accidental harm. It's not o[…]
Unless ~several people in EA had an opportunity to talk to that billionaire, I don't think this is an example of the unilateralist's curse (regardless of whether it was net negative for you to talk to them).
Fair, though many EAs are probably in positions where they can talk to other billionaires (especially with >5 hours of planning), and probably chose not to do so.
Is there any real-world evidence of the unilateralist's curse being realised?
If COVID-19 is a result of a lab leak that occurred while conducting a certain type of experiment (for the purpose of preventing future pandemics), perhaps many people considered conducting/funding such experiments and almost all of them decided not to.
My sense historically is that this sort of reasoning to date has been almost entirely hypothetical
I think we should be careful with arguments that such and such existential risk factor is entirely hypothetical. Causal chains […]
I messed up when writing that comment (see the EDIT block).
I think people following local financial incentives is always going to happen, and the point of an impact market is to structure financial incentives to be aligned with what the EA community broadly thinks is good.
It may be useful to think about it this way: Suppose an impact market is launched (without any safety mechanisms) and $10M of EA funding is pledged to be used for buying certificates as final buyers 5 years from now. No other final buyers join the market. The creation of the market causes some set of projects X to be funded and some other set […]
or setting up a system to short sell different projects.
I don't think that short selling would work. Suppose a net-negative project has a 10% chance of ending up being beneficial, in which case its certificates will be worth $1M (otherwise the certificates will end up being worth $0). Therefore, the certificates are worth $100K today in expectation. If someone shorts the certificates as if they are worth less than that, they will lose money in expectation.
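A minimal sketch of that calculation (same illustrative numbers as above; this isn't meant as a real pricing model):

```python
# Why shorting doesn't push the price of a net-negative project below its
# expected resale value: the short seller loses money in expectation.

p_beneficial = 0.10             # chance the project ends up looking beneficial
value_if_beneficial = 1_000_000
value_if_harmful = 0

fair_price = p_beneficial * value_if_beneficial + (1 - p_beneficial) * value_if_harmful
print(fair_price)  # 100000.0 -> expected value of the certificates today

def expected_short_profit(short_entry_price: float) -> float:
    # Short at a given price, cover at the expected final value.
    return short_entry_price - fair_price

print(expected_short_profit(50_000))  # -50000.0: shorting "as if worth less" loses money
```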
Furthermore, people looking to make money are already funding net negative companies due to essentially the same problems (companies have non-negative evaluations), so shifting them towards impact markets could be good, if impact markets have better projects than existing markets on average.
See my reply to Austin.
Hm, naively - is this any different than the risks of net-negative projects in the for-profit startup funding markets? If not, I don't think this a unique reason to avoid impact markets.
Impact markets can incentivize/fund net-negative projects that are not currently of interest to for-profit investors. For example, today it can be impossible for someone to make a huge amount of money by launching an aggressive outreach campaign to make people join EA, or publishing a list of "the most dangerous ongoing experiments in virology that we should advocate to stop" […]
Thanks for your responses!
I'm not sure that "uniqueness" is the right thing to look at.
Mostly, I meant: the for-profit world already incentivizes people to take high amounts of risk for financial gain. In addition, there are no special mechanisms to prevent for-profit entities from producing large net-negative harms. So asking that some special mechanism be introduced for impact-focused entities is an isolated demand for rigor.
There are mechanisms like pollution regulation, labor laws, etc. which apply to for-profit entities, but these would apply equally […]
I added an EDIT block in the first paragraph after quoting you (I've misinterpreted your sentence).
Hey there!
the AI safety research seems unlikely to have strong enough negative unexpected consequences to outweigh the positive ones in expectation.
The word "unexpected" sort of makes that sentence trivially true. If we remove it, I'm not sure the sentence is true. [EDIT: while writing this I misinterpreted the sentence as: "AI safety research seems unlikely to end up causing more harm than good"] Some of the things to consider (written quickly, plausibly contains errors, not a complete list):
If we shame each other for using our EA activities to make friends, find mates, raise status, make a living, or feel good about ourselves, we undermine EA.
This seems plausible. On the other hand, it may be important to be nuanced here. In the realms of anthropogenic x-risks and meta-EA, it is often very hard to judge whether a given intervention is net-positive or net-negative. Conflicts of interest can cause people to be less likely to make good decisions from an EA perspective.
In the original EA Forum Prize, the ex-post EV at the time of evaluation is usually similar to the ex-ante EV, assuming that the evaluation happens shortly after the post was written. (In a naive impact market, the price of a certificate can be high due to the chance that 3 years from now its ex-post EV will be extremely high.)
The original EA Forum Prize does not seem to have had the distribution mismatch problem; the posts were presumably evaluated based on their ex-ante EV (or something like that?).
Thanks for the info!
If the shareholders of the public benefit corporation are able to receive dividends, I think there's a conflict of interest problem with this setup. The Impact Markets team will probably need to make high-stakes decisions under great uncertainty. (E.g. should an impact market be launched? Should the impact market be decentralized? Should a certain person be invited to serve as a retro funder? How to navigate the tradeoff between explaining the safety rules thoroughly and writing more engaging posts that are more conducive to gaining […]
There’s a difficult trade-off between the high-fidelity communication of our long explainer posts and the concision that is necessary to get people to actually read a post when it comes to participating in a contest. Our explainer posts get very little engagement. To participate in the contest it’s not necessary to understand exactly how our mechanisms work, so we hope to reach more people by explaining things in simpler terms without words like “ex ante” and comparisons to constructed counterfactual world histories.
After this contest, it will still be […]
We do not plan to resell or consume/open the impact of the posts in the short term but reserve the right to do so in the future.
If you end up reselling impact that you've purchased with a grant from the Future Fund Regranting Program, where does the money go?
Hi Dony!
In a section titled "Other less important details", after a sentence saying "It’s not a requirement to read all of the following, […]" there is the following sentence:
The certificate description justifies the value of the impact as defined by the latest version of the Attributed Impact definition (currently 0.2).
Other than that sentence, the OP does not convey that the retro funders will consider the ex-ante EV of a post (and won't attribute to the post a higher EV than that, even if the post ends up being extremely beneficial). Instead, the OP […]
Okay, but if you’re not actually talking about “malicious” retro funders (a category in which I would include actions that are not typically considered malicious today, such as defecting against minority or nonhuman interests), the difference between a world with and without impact markets becomes very subtle and ambiguous in my mind.
I think it depends on the extent to which the (future) retro funders take into account the ex-ante impact, and evaluate it without an upward bias even if they already know that the project ended up being extremely beneficial.
I think the concern here is not about "unaligned retro funders" who consciously decide to do harmful things. It doesn't take malicious intent to misjudge whether a certain effort is ex-ante beneficial or harmful in expectation.
I wonder, though, when I play this through in my mind, I can’t quite see almost any investor investing anything but tiny amounts into a project on the promise that there might be at some point a retro funder for it.
Suppose investors were able to buy impact certificates of organizations like OpenAI, Anthropic, Conjecture, EcoHealth […]
I’ll create a document for all the things we need to watch out for when it comes to attacks by issuers, investors, and funders, so we can monitor them in our experiments.
(I don't think that potential "attacks" by issuers/investors/funders are the problem here.)
But that does not solve the retro funder alignment that is part of your argument.
I don't think it's an alignment issue here. The price of a certificate tracks the maximum amount of money that any future retro funder will be willing to pay for it. So even if 100% of the current retro funders say […]
Probably something like striving for a Long Reflection process. (Due to complex cluelessness more generally, not just moral uncertainty.)
In general, what do you think of the level of conflicts of interest within EA grantmaking?
My best guess, based on public information, is that CoIs within longtermism grantmaking are being handled with less-than-ideal strictness. For example, generally speaking, if a project related to anthropogenic x-risks would not get funding without the vote of a grantmaker who is a close friend of the applicant, it seems better to not fund the project.
[…]
(For example, Anthropic raised a big Series A from grantmakers closely related to their president Daniella Amodei's […]
Thank you for the info!
I understand that you recently replaced Jonas as the head of the EA Funds. In January, Jonas indicated that the EA Funds intends to publish a polished CoI policy. Is there still such an intention?
The policy that you referenced is the most up-to-date policy that we have, but I do intend to publish a polished version of the COI policy on our site at some point. I am not sure right now when I will have the capacity for this, but thank you for the nudge.
Hi Linch, thank you for writing this!
I started off with a policy of recusing myself from even small CoIs. But these days, I mostly accord with (what I think is) the equilibrium: a) definite recusal for romantic relationships, b) very likely recusal for employment or housing relationships, c) probable recusal for close friends, d) disclosure but no self-recusal by default for other relationships.
In January, Jonas Vollmer published a beta version of the EA Funds' internal Conflict of Interest policy. Here are some excerpts from it:
Any relationship that […]
In general, what do you think of the level of conflicts of interest within EA grantmaking? I’m a bit of an outsider to the meta / AI safety folks located in Berkeley, but I’ve been surprised by the frequency of close relationships between grantmakers and grant receivers. (For example, Anthropic raised a big Series A from grantmakers closely related to their president Daniella Amodei’s husband, Holden Karnofsky!)
Do you think COIs pose a significant threat to EA’s epistemic standards? How should grantmakers navigate potential COIs? How should this be publicly communicated?
(Responses from Linch or anybody else welcome)
Suppose Alice is working on a dangerous project that involves engineering a virus for the purpose of developing new vaccines. Fortunately, the dangerous stage of the project is completed successfully (the new virus is exterminated before it has a chance to leak), and now we have new vaccines that are extremely beneficial. At this point, observing that the project had a huge positive impact, will Retrox retroactively fund the project?
We aimed for participants to form evidence-based views on questions such as:
[...]
- What are the most probable ways AGI could be developed?
A smart & novel answer to this question can be an information hazard, so I'd recommend consulting with relevant people before raising it in a retreat.
Suppose Alice is working on a risky project that has a 50% chance of ending up being extremely beneficial and 50% chance of ending up being extremely harmful. If the project ends up being extremely beneficial, will Retrox allow Alice to make a lot of money from her project?
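For concreteness, here is a rough sketch of why I'm asking, under the assumption that a retro funder pays only in the branch where the project ends up looking beneficial (all numbers are made up for illustration):

```python
# Distribution mismatch in a nutshell: the project's expected impact can be
# ~zero (or negative), while Alice's expected retro payout is large.

p_beneficial = 0.5
p_harmful = 0.5

impact_if_beneficial = +100           # arbitrary units of impact
impact_if_harmful = -100

payout_if_beneficial = 1_000_000      # hypothetical retro purchase of the certificates
payout_if_harmful = 0                 # no one pays for (or penalizes) the bad branch

expected_impact = p_beneficial * impact_if_beneficial + p_harmful * impact_if_harmful
expected_payout = p_beneficial * payout_if_beneficial + p_harmful * payout_if_harmful

print(expected_impact)  # 0.0
print(expected_payout)  # 500000.0 -> a strong incentive to run the risky project
```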
Grifters are optimizing only to get themselves money and power; EAs are optimizing for improving the world.
I think it is not so binary in reality. It's likely that almost no one thinks of themselves as a grifter; and almost everyone in EA is at least somewhat biased towards actions that will cause them to have more money and power (on account of being human). So, while I think this post points at an extremely important problem, I wouldn't use the grifters vs. EAs dichotomy.
Option value considerations dictate that we continue doing AI safety research even if we’re unsure of its value because it’s much easier to stop a research program than to start one.
I think the opposite is often true. Once there are people who get compensated for doing X it can be very hard to stop X. (Especially if it's harder for impartial people, who are not experts-in-X, to evaluate X.)
Thanks, you're right. There's this long thread, but I'll try to explain the issues here more concisely. I think the theorems have the following limitations that were not reasonably explained in the paper (and some accompanying posts):
Hey there!
And then finally there are actually some formal results where we try to formalize a notion of power-seeking in terms of the number of options that a given state allows a system. This is work [...] which I'd encourage folks to check out. And basically you can show that for a large class of objectives defined relative to an environment, there's a strong reason for a system optimizing those objectives to get to the states that give them many more options.
After spending a lot of time on understanding that work, my impression is that the main theorem […]
Have you explained your thoughts somewhere? It'd be more productive to hash out the disagreement rather than generically casting shade!
Yeah, that feels like a continuous kind of failure. Like, you can reduce the risk from 50% to 1% and then to 0.1% but you can’t get it down to 0%.
Suppose we want the certificates of a risky, net-negative project to have a price that is 10x lower than the price they would have on a naive impact market. Very roughly, it needs to be the case that the speculators have a credence of less than 10% that at least one relevant retro funder will evaluate the ex-ante impact to be high (positive). Due to the epistemic limitations of the speculators, that condition […]
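Very roughly, the condition I have in mind looks like this (a sketch with made-up numbers, not a claim about how any actual market would clear):

```python
# If speculators only pay for the chance that at least one relevant retro
# funder will judge the *ex-ante* impact to be high, then a 10x price
# reduction requires that credence to be below 10%.

naive_price = 1_000_000   # hypothetical price on a naive impact market

def price_under_exante_rule(credence_some_funder_judges_exante_positive: float) -> float:
    return credence_some_funder_judges_exante_positive * naive_price

print(price_under_exante_rule(0.10))  # 100000.0 -> exactly a 10x reduction
print(price_under_exante_rule(0.30))  # 300000.0 -> only ~3.3x, not enough
```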
Our solutions to at least remove incentives like that (but not to additionally penalize it) are in the Solutions section of the article that Ofer linked
Will those solutions work?
Do you have control over who can become a retro funder after the market is launched? To what extent will the retro funders understand or care about the complicated definition of Attributed Impact? And will they be aware of, and know how to account for, complicated considerations like: "if the certificate's description is not very specific and leaves the door open to risky, net-negative […]
On the other hand, if someone in EA is making decisions about high-stakes in[…]