This is a special post for quick takes by Karthik Tadepalli. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

Often people post cost-effectiveness analyses of potential interventions, which invariably conclude that the intervention could rival GiveWell's top charities (GWTC). (I'm guilty of this too!) But this happens very frequently, and I am basically never convinced that the intervention is actually competitive with GWTC. The reason is that they are comparing ex-ante cost-effectiveness (where you make a bunch of assumptions about costs, program delivery mechanisms, etc.) with GiveWell's calculated ex-post cost-effectiveness (where the intervention is already delivered, so there are far fewer assumptions).

Usually, people acknowledge that ex-ante cost-effectiveness is less reliable than ex-post cost-effectiveness. But I haven't seen any acknowledgement that ex-ante estimates systematically overestimate cost-effectiveness, because people who are motivated to pursue an intervention are going to be optimistic about unknown factors. Also, many costs are "unknown unknowns" that you might only discover after implementing the project, so leaving them out underestimates costs. (Also, the planning fallacy in general.) And I haven't seen any discussion of how large the gap between these estimates could be. I think it could be orders of magnitude, just because costs are in the denominator of a benefit-cost ratio, so uncertainty in costs can have huge effects on cost-effectiveness.

One straightforward way to estimate this gap is to redo a GiveWell CEA, but assuming that you were setting up a charity to deliver that intervention for the first time. If GiveWell's ex-post estimate is X and your ex-ante estimate is K*X for the same intervention, then we would conclude that ex-ante cost-effectiveness is K times too optimistic, and deflate ex-ante estimates by a factor of K.
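A minimal sketch of that deflation in Python, using made-up numbers. The helper names `deflation_factor` and `deflate` are hypothetical, and cost-effectiveness is assumed to be expressed as benefit per dollar (higher is better):

```python
def deflation_factor(ex_ante: float, ex_post: float) -> float:
    """K = ex-ante / ex-post cost-effectiveness for the SAME intervention."""
    if ex_post <= 0:
        raise ValueError("ex_post must be positive")
    return ex_ante / ex_post


def deflate(ex_ante_estimate: float, k: float) -> float:
    """Shrink a new ex-ante estimate by the optimism factor K."""
    return ex_ante_estimate / k


# Illustrative numbers only, not taken from any real CEA.
ex_post_x = 10.0      # X: GiveWell-style ex-post estimate
ex_ante_redo = 40.0   # K*X: the same intervention re-estimated "from scratch"
k = deflation_factor(ex_ante_redo, ex_post_x)   # 4.0

new_intervention_ex_ante = 60.0
deflated = deflate(new_intervention_ex_ante, k)  # 15.0
print(k, deflated)
```

This treats K as a single constant estimated from one intervention; in practice one would presumably want K from several redone CEAs before trusting it.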

I might try to do this myself, but I don't have any experience with CEAs, and would welcome someone else doing it.

I think a similar view is found in 'Why we can't take expected value estimates literally even when they're unbiased', i.e. we should have a pretty low prior that any particular intervention is above (e.g.) 10x cash transfers, but the strength and robustness of top charities' CEAs are sufficient to clear them over the bar. And most CEAs of specific interventions written up on the forum aren't compelling enough to bring the estimate all that much higher from the low prior.
I agree it'd be informative to see what 'naive' versions of top charity CEAs would be like. As a quick and dirty version, I looked at GiveWell's stuff on AMF - their 2023 central figure is $5,500 per life saved, with 226 rows in the spreadsheet. If we look at their 2012 CEA (downloadable here), they have 45 rows with their optimistic value being $1,819 per life saved. Leaving aside inter-temporal confounders, naively a 3x cut is reasonable if the random forum CEA is equivalent to 2012 optimistic GiveWell. Though it depends on the quality of the random CEA, I'd guess a 2x-10x cut is a reasonable prior? Plus a stronger cut for really high estimates - e.g. a 500x cost-effectiveness is more likely to be due to an over-generous methodology.
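For concreteness, the arithmetic behind the 3x figure, plus what a 2x-10x cut would do to a forum CEA. The $2,000 figure for the forum CEA is a hypothetical placeholder, not a real estimate:

```python
# Figures quoted above for AMF (cost per life saved, USD).
cost_2023_central = 5500
cost_2012_optimistic = 1819

cut = cost_2023_central / cost_2012_optimistic
print(round(cut, 2))  # 3.02

# Hypothetical forum CEA claiming $2,000 per life saved.
# A "cut" in cost-effectiveness multiplies the cost per life saved.
forum_cea_cost = 2000
adjusted = {factor: forum_cea_cost * factor for factor in (2, 3, 10)}
print(adjusted)  # {2: 4000, 3: 6000, 10: 20000}
```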

That's an interesting separate point, I certainly agree that our prior should have low mass around 10x cash and above and that has its own large effect. But I don't feel like I would make this point contingent on the quality of the CEA; I think even the highest-quality ex-ante CEA can't avoid these issues. Some CEAs are probably high-quality because there are real decisions attached to them (e.g. Charity Entrepreneurship's ex-ante CEAs of their prospective charities) and I don't think I would be convinced by those either.

Neat exercise with 2012 GiveWell. Does 2023 have a country breakdown? Because the main intertemporal confounder I would want to guard against is the change in country mix. I would compare 2012 to the 2023 country in which AMF had the most activity in 2012, which I don't know off the top of my head. But 3x seems reasonable to me.

I sympathise with this view, but I think I see it in more continuous terms than ex ante vs. ex post, and maybe akin to quality. This is because even ex post, I think there would still be substantial guess-work and assumptions, and the bottom line still relies on interpretation. But the difference for ex post is how empirically informed that analysis can be, and how specific. I.e., an ex post analysis can ground estimates on data for that specific org, with that program, in that community. Ex ante analyses can also differ in quality in how empirically informed and how specific they are. But a great ex ante CEA could be more empirically informed and specific than a sub-par ex post CEA. But all this is ~semantics - I think we basically agree.

Not sure about the geography question - from this it looks like the 2012 estimate was based on distribution in Malawi. In 2022 they distribute in DRC, Ghana, Guinea, Malawi, Papua New Guinea, Togo, Uganda, and Zambia, and my guess is that the GiveWell figure is an average of those programs? Read into that what you will.

Ooh another angle would be to compare Charity Entrepreneurship's ex ante CEAs with the eventual charities' own ex post CEAs. But there'd be a strong selection effect given it depends on eventual charity success/stability, plus the interventions change a lot from the research to implementation.

Yes, I agree quality matters a lot, but I think people are universally aware of that - I just wanted to draw attention to the ex-ante/ex-post distinction, which I hadn't seen raised before.

The CE approach is a good idea, because actually I think the interventions changing a lot from research to implementation is a key part of why ex-ante estimates are unreliable. I don't know if both estimates are available but it would be great if they are!

One example I know of off the top of my head is LEEP - their CEA for their Malawi campaign found a median of $14/DALY. CE's original report on lead paint regulation suggested $156/DALY (as a central estimate, I think). That direction and magnitude is pretty surprising to me. I expect it would be explicable based on the details of the different approaches/considerations, but I'd need to look into the details. Maybe a motivating story is that LEEP's Malawi campaign was surprisingly fast and effective compared to the original report's hopes?

Another is Family Empowerment Media. An ex post Rethink Priorities report mentions FEM used a GiveWell model to estimate a cost-effectiveness of 26.9x cash transfers, and Founders Pledge estimated 22x. The original CE report links to a CEA that estimates $984/DALY averted, which is lower than GiveWell top charities - though I don't know the exact comparison to cash transfers, and there are other benefits to family planning than just DALYs.

I suspect a strong selection effect is in play - i.e. I know of these examples and their CEAs are prominent because they were successful - and the ideas survived the gauntlet of further research, selection, founding, piloting, and scaling. 

LEEP is a pretty unusual situation in general I think, and I'm not sure it is super generalisable. If you get an easy-ish win with lead things, the cost-effectiveness can be insane (see the Bangladesh turmeric situation).

Yeah makes sense, and that the early research could have been heavily discounted by pessimism about a charity achieving big wins.

This is one of the reasons I don't love post-hoc cost-effectiveness assessments of successful individual campaigns and policy changes which don't take into account the probability that the (now successful) campaign might have failed - which I have seen a number of times on the lead front. For every win there might be 5, 10 or 20 failures (which is fine). If you just zero in on the successes then cost-effectiveness numbers look unrealistically rosy.

If the initial assessment for, say, LEEP in Malawi put the chance of success at 20%, then I think this should be factored into their final calculation; they can then update it if they realise their success rate is increasing. Otherwise we end up not costing in the failed campaigns, while the successful ones appear ludicrously cost-effective.
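A rough sketch of that adjustment, under two simplifying assumptions that are mine rather than anyone's actual model: every campaign costs the same whether or not it succeeds, and failed campaigns avert zero DALYs. The $14/DALY input is the LEEP Malawi median quoted earlier in this thread; the 20% is the hypothetical success rate from the comment above, and `expected_cost_per_daly` is a made-up helper name:

```python
def expected_cost_per_daly(cost_per_daly_if_success: float, p_success: float) -> float:
    """Portfolio-level cost per DALY once failed campaigns are costed in.

    With equal spend per campaign and zero impact on failure,
    total cost / total DALYs = (cost per DALY given success) / p_success.
    """
    if not 0 < p_success <= 1:
        raise ValueError("p_success must be in (0, 1]")
    return cost_per_daly_if_success / p_success


conditional = 14.0  # $/DALY, conditional on the campaign succeeding
for p in (1.0, 0.2, 0.1, 0.05):
    print(p, round(expected_cost_per_daly(conditional, p), 2))
```

At a 20% success rate the $14/DALY figure becomes roughly $70/DALY, and at 5% roughly $280/DALY, which is the sense in which ignoring failures makes the successes look unrealistically good.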

Yeah, though to be fair the CEA for Malawi was b/c it was LEEP's literal first campaign. I'd imagine LEEP has CEAs for all their country work which include adjustments for likelihood of success, though I don't know whether they intend to publish them any time soon.

This is an important reflection, and one I've found myself querying when seeing various programs claim they are hyper effective. Incredibly well performing interventions are rare, but we might expect to see a higher number of them to be showcased on this forum given there is already a selection bias from the membership/readership here. 

However, I do feel the community naturally creates an incentive to inflate (consciously or not) the CEA of interventions - after all, if you aren't working on something which can compete with AMF, then why take money away from that? The fix to this is to live in the ambiguity of your intervention and argue that, under certain assumptions, your program could be better.

As you effectively note, the problem is that "could" (a priori) judgments are riddled with reasoning risks and errors, which is why I feel the community could do more to support and also challenge reasoning methods (cognitive and computational). For example, lots of posts mention key uncertainties people have about their interventions, but they often don't state the second-order probabilities of them (not even GiveWell does this consistently), along with how much that uncertainty fundamentally underpins the intervention. Stating these is a relatively simple fix, which could be a community norm.

I agree about the incentives/motivated reasoning problem. I suspect that uncertainty intervals would be uninformatively huge, so I don't know if they really are useful in practice. Remember that cost effectiveness is the ratio of two uncertain quantities (benefits and costs), and the ratio of two random variables follows a ratio distribution which generally has huge tails.
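A small simulation of that point, assuming (purely for illustration) that benefits and costs are independent lognormals with the same spread. The distributions and parameters are assumptions, not taken from any real CEA:

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
N = 100_000

# Assumed inputs: independent lognormals, log-scale sd of 1 each.
benefits = [rng.lognormvariate(0.0, 1.0) for _ in range(N)]
costs = [rng.lognormvariate(0.0, 1.0) for _ in range(N)]
ratios = [b / c for b, c in zip(benefits, costs)]


def spread_90(xs):
    """Ratio of the 95th to the 5th percentile (width of the 90% interval)."""
    q = statistics.quantiles(xs, n=20)  # 19 cut points: 5%, 10%, ..., 95%
    return q[-1] / q[0]


print("benefits alone:", round(spread_90(benefits), 1))
print("benefit/cost  :", round(spread_90(ratios), 1))
```

Under these assumptions the ratio is itself lognormal with a log-scale sd of sqrt(2), so in theory the 90% interval spans a factor of about 105 for the ratio versus about 27 for benefits alone. Other input distributions, especially ones where costs can get close to zero, would give heavier tails than this.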

FWIW I think it's a bad solution, but why not quantify the uncertainty in the ex ante CEA? See this GiveWell Change Our Minds submission as an example--I don't think the uncertainty intervals are uninformatively large, although there is a rather strong assumption that the GiveWell models capture the right structure of the problem. Once the uncertainty is quantified, we could run something like the Bayesian adjustment I demonstrate in this PDF to (in theory!) eliminate the positive bias for more uncertain estimates. And then compare the posterior distribution to an analogous distribution for AMF/other relevant benchmark.
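I don't know the exact method used in the linked PDF, so as a stand-in here is a standard normal-normal conjugate update on log10 cost-effectiveness (in multiples of cash transfers). The prior and the estimate uncertainties below are made-up numbers, and `shrink` is a hypothetical helper name:

```python
import math


def shrink(prior_mean, prior_sd, est_mean, est_sd):
    """Precision-weighted posterior for a normal prior and a normal estimate."""
    prior_prec = 1.0 / prior_sd ** 2
    est_prec = 1.0 / est_sd ** 2
    post_mean = (prior_prec * prior_mean + est_prec * est_mean) / (prior_prec + est_prec)
    post_sd = math.sqrt(1.0 / (prior_prec + est_prec))
    return post_mean, post_sd


# Assumed prior: centred on 1x cash (log10 = 0), sd of half an order of magnitude.
prior = (0.0, 0.5)

# Two CEAs that both claim 100x cash (log10 = 2) but differ in uncertainty.
noisy_mean, _ = shrink(*prior, est_mean=2.0, est_sd=1.0)
tight_mean, _ = shrink(*prior, est_mean=2.0, est_sd=0.25)

print(round(10 ** noisy_mean, 1))  # 2.5  (noisy estimate shrinks almost to the prior)
print(round(10 ** tight_mean, 1))  # 39.8 (tight estimate keeps most of its value)
```

The same headline number is discounted far more when its uncertainty is larger, which is the mechanism that would, in theory, remove the positive bias for more uncertain estimates.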

Conceptually, the difference between the ex ante and ex post CEA isn't categorical. It is a matter of degree--the degree of uncertainty about the model and its parameters. This difference could be captured with an adequate explicit treatment of uncertainty in the CEA. 

Interesting, I don't know why the tails aren't larger, and I find Squiggle kinda hard to parse. Do you quantify cost uncertainty in addition to benefit uncertainty? Because that would, I think, make the bounds huge.

How do we find new interventions that can beat the best? The main approach has been to seek leverage: interventions where you pay a dollar to move a larger amount of money/resources to a cause (e.g. advocacy interventions, effective giving campaigns). But people don't often recognize market-based leverage: interventions where you pay a dollar to enable a market transaction whose value is much larger than that dollar. Two examples:

  1. High-school educated workers in urban South Africa struggle to find their first job, because school performance is a very weak signal of ability, and businesses are reluctant to take chances because firing workers is costly. Harambee assesses job-seekers' skills and gives them certification that they can show to employers as a credible signal. Getting the certification increases their chances of finding a job. Thus, Harambee increased their income without giving them any money, just by fixing an information problem in the labor market.
  2. Rug manufacturers in Egypt would like to export their rugs for big profit. An unnamed German retailer would like to buy these rugs, since they are cheaper than alternatives. However, the global market is huge, and these two parties have no way of getting in touch with each other, or knowing that they are mutually interested in a sale. Aid to Artisans helps the local intermediary for the rug manufacturers get in touch with the German retailer, helps them travel to meet with the retailer to provide samples of their products, and arranges an initial large order. The rug manufacturers gain a reputation, and over the next four years, they get orders totalling $150,000. Thus, ATA increased the incomes of all the artisans without giving them any money, just by connecting them with buyers who wanted to give them money.

These interventions get their leverage from fixing market failures. Both of these examples are failures of information, which are both ubiquitous and cheap to fix. I suspect that it shouldn't be too hard to find one where spending $1 generates more than $10 in income, which is roughly the bar for a GiveWell top charity.

"I suspect that it shouldn't be too hard to find one where spending $1 generates more than $10 in income, which is roughly the bar for a GiveWell top charity."


This seems wrong to me, in that both of your examples involve constituencies that are quite a bit better off than the GiveDirectly recipients for which that bar would hold, i.e. the actual multiplier would need to be a lot higher, or apply to constituencies as poor as GD recipients.

Yeah, hence the caveat with roughly. I actually don't think they're much better off - the former group are unemployed and thus have basically no income! - but I feel pretty sanguine about generating $50 or $100 in income per $1 spent if your intervention operates at scale, just because the unit costs of solving an information friction seem trivially small. (Also, business operators are better off but the potential to multiply business income is way higher.)

The easiest way to get this would be through agricultural livelihood interventions. Farmers are the extreme poor, and they have tons of frictions to market transactions, so you are targeting the right population and also getting market-based leverage.

I am not a GHD expert but I would expect someone who has a high school diploma in the richest country in Africa to be a lot better off than the typical GD recipient, who seems to be from the poorest strata of the poorest countries.

And so, yeah, I agree one would probably need a 50-100x expected multiplier to make this work. I am not saying this is not possible, I just thought the bar stated here was significantly too optimistic.

I picked South Africa because Harambee works there, but the same issue - employers don't know who is good to hire so job seekers struggle to find jobs - is true across Africa and for much poorer populations than high school educated workers.

But the point would have been better demonstrated with livelihood interventions for farmers.

Thanks, and sorry if I was too nitpicky then.

I find it encouraging that EAs have quickly pivoted to viewing AI companies as adversaries, after a long period of uneasily viewing them as necessary allies (cf. Why Not Slow AI Progress?). Previously, I worried that social/professional entanglements and image concerns would lead EAs to align with AI companies even after receiving clear signals that AI companies are not interested in safety. I'm glad to have been wrong about that.

Caveat: we've only seen this kind of scrutiny applied to OpenAI and it remains to be seen whether Anthropic and DeepMind will get the same scrutiny.

I don't think it's accurate to say that "EAs have quickly pivoted to viewing AI companies as adversaries, after a long period of uneasily viewing them as necessary allies."

My understanding is that no matter how you define "EAs," many people have always been supportive of working with/at AI companies, and many others sceptical of that approach.

I think Kelsey Piper's article marks a huge turning point. In 2022, there were lots of people saying in an abstract sense "we shouldn't work with AI companies", but I can't imagine that article being written in 2022. And the call for attorneys for ex-OpenAI employees is another step so adversarial I can't imagine it being taken in 2022. Both of these have been pretty positively received, so I think they reflect a real shift in attitudes.

To be concrete, I imagine if Kelsey wrote an article in 2022 about the non-disparagement clause (assume it existed then), a lot of people's response would be "this clause is bad, but we shouldn't alienate the most safety-conscious AI company or else we might increase risk". I don't see anyone saying that today. The obvious reason is that people have quickly updated on evidence that OpenAI is not actually safety-conscious. My fear was that they would not update this way, hence my positive reaction.
