Thanks Ajeya, this is very helpful and clarifying!
I am the only person who is primarily focused on funding technical research projects ... I began making grants in November 2022
Does this mean that prior to November 2022 there were ~no full-time technical AI safety grantmakers at Open Philanthropy?
Open Philanthropy (previously GiveWell Labs) has been evaluating grants in the AI safety space for over 10 years. In that time, the AI safety field and Open Philanthropy have both grown, with OP granting over $300m towards AI risk. Open Phil has also done a lot of research on the problem. So, from the outside, it seems surprising that the number of people making grants has remained consistently low.
Following the episode with Mustafa, it would be great to interview the founders of the leading AI labs - perhaps Dario (Anthropic) [again], Sam (OpenAI), or Demis (DeepMind). Alternatively, the heads of the companies that invest in / support them - Sundar (Google) or Satya (Microsoft).
It seems valuable to elicit their honest opinions about "p(doom)", timelines, whether they believe they've been net-positive for the world, etc.
I think there's a risk here of either:
a) not challenging them firmly enough - lending them undue credibility / legitimacy in the minds of listeners
b) challenging them too strongly - reducing their willingness to engage, losing goodwill, etc.
For deception (as opposed to deceptive alignment), see AI Deception: A Survey of Examples, Risks, and Potential Solutions (section 2)
This looks very exciting, thanks for posting!
I'll quickly mention a couple of things that stuck out to me as potentially making the CEA overoptimistic:
Each of these reasons on its own is fairly weak, but the likelihood of at least one being true gives us reason to discount future cost-effectiveness analyses. More generally, we might expect some regression to the mean with respect to reducing exposure from turmeric - maybe everything went right for this particular program, but this is unlikely to hold for future programs. To be clear, there are likely also reasons that this analysis is too pessimistic, so on net it may be the case that cost-effectiveness remains at $1/DALY (or even better). Nonetheless, I think it's good to be cautious, since $1/DALY implies this program was >800x better than cash transfers and >80x better than GiveWell's top charities - a strong claim to make (though still possible!).
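To illustrate the regression-to-the-mean worry, here's a minimal toy simulation (all numbers invented, purely illustrative): each program has some true cost-effectiveness, but we only observe it with noise. If we pick the program with the best *observed* result, its true value will typically be lower than what we measured - so a repeat of the "winning" program should be expected to underperform its initial evaluation.

```python
import random

random.seed(0)

# Hypothetical setup: 1000 programs, each with a true effect (e.g. DALYs
# averted per $1k, units arbitrary), observed through a noisy evaluation.
n_programs = 1000
true_effects = [random.gauss(100, 30) for _ in range(n_programs)]
observed = [t + random.gauss(0, 30) for t in true_effects]

# Select the program that *looks* best.
best = max(range(n_programs), key=lambda i: observed[i])

# The best-looking program's true effect is typically well below its
# observed effect, because selection favours lucky noise draws.
print(f"observed: {observed[best]:.0f}, true: {true_effects[best]:.0f}")
```

This is of course a sketch, not a model of the turmeric program itself, but it captures why conditioning on an unusually good result should make us discount expectations for follow-up programs.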
My bad, thanks so much!
It would be great to have some way to filter for multiple topics.
Example: Suppose I want to find posts related to the cost-effectiveness of AI safety. Instead of filtering for just "AI safety" or just "Forecasting and estimation", I might want to find posts only at the intersection of those two. I attempted to do this by customizing my frontpage feed, but that doesn't really work (since it heavily biases towards new/highly-upvoted posts)
it relies primarily on heuristics like organiser track record and higher-level reasoning about plans.
I think this is mostly correct, with the caveat that we don't rely exclusively on qualitative factors and subjective judgement. The way I'd frame it is more as a spectrum between
[Heuristics] <------> [GiveWell-style cost-effectiveness modelling]
I think I'd place FP's longtermist evaluation methodology somewhere between those two poles, with flexibility based on what's feasible in each cause area.
I'll +1 everything Johannes has already said, and add that several people (including myself) have been chewing over the "how to rate longtermist projects" question for quite some time. I'm unsure when we will post something publicly, but I hope it won't be too far in the future.
If anyone is curious for details feel free to reach out!
Quick take: renaming shortforms to Quick takes is a mistake
This looks super interesting, thanks for posting! I especially appreciate the "How to apply" section
One thing I'm interested in seeing is how this actually looks in practice - specifying real exogenous uncertainties (e.g. about timelines, takeoff speeds, etc.), policy levers (e.g. these ideas, different AI safety research agendas, etc.), relations (e.g. between AI labs, governments, etc.), and performance metrics (e.g. "p(doom)", plus many of the sub-goals you outline). What are the conclusions? What would this imply about prioritization decisions? etc.
I appreciate this would be super challenging, but if you are aware of any attempts to do it (even with just a very basic, simplified model), I'd be curious to hear how it went
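Just to gesture at what "a very basic, simplifying model" could look like: a tiny XLRM-style enumeration - uncertainties (X) crossed with levers (L), run through a toy relation (R), scored on a metric (M). Every number below is invented purely for illustration; nothing here reflects anyone's actual estimates.

```python
import itertools

# X: exogenous uncertainties (illustrative, coarse-grained).
uncertainties = {"timelines": ["short", "long"], "takeoff": ["fast", "slow"]}
# L: policy levers (placeholder options).
levers = ["alignment research", "governance", "do nothing"]

def p_doom(timeline, takeoff, lever):
    """R: a made-up relation mapping a scenario + lever to M, a toy 'p(doom)'."""
    base = 0.3 if timeline == "short" else 0.1
    base *= 1.5 if takeoff == "fast" else 1.0
    reduction = {"alignment research": 0.5, "governance": 0.3, "do nothing": 0.0}[lever]
    return base * (1 - reduction)

# Compare levers on worst-case performance across all scenarios.
for lever in levers:
    worst = max(p_doom(t, k, lever)
                for t, k in itertools.product(uncertainties["timelines"],
                                              uncertainties["takeoff"]))
    print(f"{lever}: worst-case p(doom) = {worst:.2f}")
```

Even at this cartoon level of fidelity, the exercise forces you to state which scenarios drive the ranking of levers - which is roughly the kind of output I'd hope a more serious attempt would produce.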