Send me anonymous feedback: https://docs.google.com/forms/d/1qDWHI0ARJAJMGqhxc9FHgzHyEFp-1xneyl9hxSMzJP0/viewform
Any type of feedback is welcome, including arguments that a post/comment I wrote is net negative.
I'm interested in ways to increase the EV of the EA community by mitigating downside risks from EA-related activities (primarily ones related to anthropogenic x-risks). I think that:
Feel free to reach out by sending me a PM here or on my website.
Thank you for the info!
I understand that you recently replaced Jonas as the head of the EA Funds. In January, Jonas indicated that the EA Funds intends to publish a polished CoI policy. Is there still such an intention?
Hi Linch, thank you for writing this!
I started off with a policy of recusing myself from even small CoIs. But these days, I mostly follow (what I think is) the equilibrium: a) definite recusal for romantic relationships, b) very likely recusal for employment or housing relationships, c) probable recusal for close friends, d) disclosure but no self-recusal by default for other relationships.
In January, Jonas Vollmer published a beta version of the EA Funds' internal Conflict of Interest policy. Here are some excerpts from it:
Any relationship that could cause significantly biased judgment (or the perception of that) constitutes a potential conflict of interest, e.g. romantic/sexual relationships, close work relationships, close friendships, or living together.
[...]
The default suggestion is that you recuse yourself from discussing the grant and voting on it.
[...]
If the above means we can’t evaluate a grant, we will consider forwarding the application to another high-quality grantmaker if possible. If delegating to such a grantmaker is difficult, and this policy would hamper the EA community’s ability to make a good decision, we prefer an evaluation with conflict of interest over none (or one that’s significantly worse). However, the chair and the EA Funds ED should carefully discuss such a case and consider taking additional measures before moving ahead.
Is this consistent with the current CoI policy of the EA Funds?
Suppose Alice is working on a dangerous project that involves engineering a virus for the purpose of developing new vaccines. Fortunately, the dangerous stage of the project is completed successfully (the engineered virus is destroyed before it has a chance to leak), and now we have new vaccines that are extremely beneficial. At this point, observing that the project had a huge positive impact, will Retrox retroactively fund the project?
We aimed for participants to form evidence-based views on questions such as:
[...]
- What are the most probable ways AGI could be developed?
A smart and novel answer to this question can be an information hazard, so I'd recommend consulting with relevant people before raising it at a retreat.
Suppose Alice is working on a risky project that has a 50% chance of ending up being extremely beneficial and 50% chance of ending up being extremely harmful. If the project ends up being extremely beneficial, will Retrox allow Alice to make a lot of money from her project?
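To make the incentive asymmetry concrete, here's a toy calculation (a minimal sketch with made-up numbers; it does not reflect any actual Retrox payout rule):

```python
# Toy expected-value calculation with made-up numbers (not any actual Retrox payout rule).
p_good = 0.5            # probability the project ends up extremely beneficial
impact_if_good = 100    # impact in the good branch (arbitrary units)
impact_if_bad = -100    # impact in the bad branch (arbitrary units)

project_ev = p_good * impact_if_good + (1 - p_good) * impact_if_bad
print("Project EV:", project_ev)   # 0.0 -- the project is (at best) neutral in expectation

# If retro funders reward Alice only in the good branch, and she bears no cost
# in the bad branch, her personal expected payoff is positive regardless.
reward_if_good = 10
alice_ev = p_good * reward_if_good + (1 - p_good) * 0
print("Alice's EV:", alice_ev)     # 5.0 -- she is personally incentivized to take the gamble
```

If the answer is "yes", rewards that are conditioned only on the good branch can make even negative-EV gambles privately attractive.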
Grifters are optimizing only to get themselves money and power; EAs are optimizing for improving the world.
I think it is not so binary in reality. It's likely that almost no one thinks of themselves as a grifter, and almost everyone in EA is at least somewhat biased towards actions that will bring them more money and power (on account of being human). So, while I think this post points at an extremely important problem, I wouldn't use the grifters vs. EAs dichotomy.
Option value considerations dictate that we continue doing AI safety research even if we’re unsure of its value because it’s much easier to stop a research program than to start one.
I think the opposite is often true. Once there are people who get compensated for doing X, it can be very hard to stop X. (Especially if it's harder for impartial people, who are not experts in X, to evaluate X.)
Thanks, you're right. There's this long thread, but I'll try to explain the issues here more concisely. I think the theorems have the following limitations that were not adequately explained in the paper (and some accompanying posts):
I'm not arguing that the theorems don't prove anything useful. I'm arguing that it's very hard for readers of the paper (and some accompanying posts) to understand what the theorems actually prove: readers need to understand about 20 formal definitions that build on each other. I also argue that the lack of explanation of what the theorems actually prove, together with some of the informal claims made about them, is not reasonable (and makes the theorems appear more impressive than they are). Here's an example of such an informal claim (taken from this post):
Not all environments have the right symmetries
- But most ones we think about seem to
Hey there!
And then finally there are actually some formal results where we try to formalize a notion of power-seeking in terms of the number of options that a given state allows a system. This is work [...] which I'd encourage folks to check out. And basically you can show that for a large class of objectives defined relative to an environment, there's a strong reason for a system optimizing those objectives to get to the states that give them many more options.
After spending a lot of time understanding that work, my impression is that the main theorems in the paper are very complicated and are limited in ways that were not adequately explained, to the point that probably very few people understand the main theorems and which environments they apply to, even though the work has been highly praised within the AI alignment community.
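For readers who want a rough sense of the "more options" intuition, here is a deliberately toy illustration of my own (not the paper's formal setup): suppose that from the start state one action reaches a single terminal state and another reaches three, and terminal rewards are sampled IID uniformly. Then most sampled reward functions prefer the branch with more options:

```python
import random

# Toy environment (my own illustration, not the paper's formal setup):
# from the start state, action "left" reaches 1 terminal state and
# action "right" reaches 3. Terminal rewards are sampled IID Uniform(0, 1).
def prefers_right() -> bool:
    left_value = random.random()                          # value of the single left outcome
    right_value = max(random.random() for _ in range(3))  # best of the three right outcomes
    return right_value > left_value

trials = 100_000
share = sum(prefers_right() for _ in range(trials)) / trials
print(f"Fraction of sampled reward functions preferring the 3-option branch: {share:.3f}")
# Expected ~0.75, i.e. P(max of 3 IID uniforms beats a 4th IID uniform) = 3/4.
```

The actual theorems are much more subtle than this (among other things, they depend on specific symmetry conditions on the environment), which is exactly the part I argue is hard for readers to extract from the paper.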
My best guess, based on public information, is that CoIs within longtermism grantmaking are being handled with less-than-ideal strictness. For example, generally speaking, if a project related to anthropogenic x-risks would not get funding without the vote of a grantmaker who is a close friend of the applicant, it seems better to not fund the project.
My understanding is that Anthropic is not a nonprofit, and it received funding from investors rather than grantmakers. That said, Anthropic can still raise CoI issues related to Holden's decision-making about longtermism funding. Holden said in an interview:
I think CoIs can easily influence decision making (in general, not specifically in EA). In the realm of anthropogenic x-risks, judging whether a high-impact intervention is net-positive or net-negative is often very hard due to complex cluelessness. Therefore, CoI-driven biases and self-deception can easily influence decision making and cause harm.
I think grantmakers should not be placed in a position where they need to decide how to navigate potential CoIs. Rather, the way grantmakers handle CoIs should be dictated by a detailed CoI policy (that should probably be made public).