
Summary

  • GiveWell currently uses a time discount rate of 4% for all their cost-effectiveness analyses (CEAs).
  • I argue that it is a mathematical mistake to pick any single best guess value to use for the CEAs.
    • Instead, GiveWell should use a probability distribution over possible discount rates.
  • This is not just an aesthetic judgement for mathematical purists; it materially changes the CEAs, notably by making all the deworming interventions more attractive relative to other interventions.
    • This is because deworming interventions rely on multi-decadal effects, and so a lower discount rate would make them much more valuable.

Epistemic Status

  • On the object level, I cannot think of any reasons to justify GiveWell's current modelling choice over my proposal.
  • However, I still doubt my conclusion, because on the meta level this seems like an obvious enough idea that it would be surprising if no one at GiveWell had ever thought of it, which is evidence that I am missing something important.

Main

GiveWell’s CEAs are an impressive attempt to model many different factors in assessing the near-term impacts of various interventions.[1] I will ignore all of this complexity. For my purposes, it is sufficient to note that the CEA for most interventions is well characterised by decomposing impact into several constituents, and multiplying these numbers together. Consider Helen Keller International’s Vitamin A Supplementation program:

$$V = \frac{M \cdot R}{C}$$ [2]

where:

  • $V$ is cost-effectiveness [deaths/dollar],
  • $M$ is baseline mortality [deaths/year/child],
  • $R$ is mortality reduction [%], and
  • $C$ is treatment cost [dollars/child/year].

Obviously, all of these terms are uncertain. Treatment costs we can estimate quite accurately, but there may be fluctuations in the price of labour or materials needed in the distribution. Mortality data is generally good, but some deaths may not be reported, and mortality rates will change over time. The mortality reduction is based on a solid-seeming meta-analysis of RCTs, but things change over time, and circumstances differ between the trial and intervention locations.

GiveWell’s model makes a subtle mathematical assumption, namely that the expectation of the product of these three random variables is equal to the product of their expectations:

$$\mathbb{E}[V] = \mathbb{E}\left[M \cdot R \cdot \frac{1}{C}\right] = \mathbb{E}[M] \cdot \mathbb{E}[R] \cdot \mathbb{E}\left[\frac{1}{C}\right]$$
This is not, in general, true.[3] However, if the three random variables are independent, it is true. I cannot think of any plausible ways in which these three random variables correlate. Surely learning that the price of vitamin A tablets just doubled ($C$) does not affect how effective they are ($R$) or change the baseline of how many kids die ($M$). Thus, while GiveWell’s method is mathematically unsound, it gives the correct answer in this case. It could well be that GiveWell has considered this, and decided not to explain it in their CEAs because it doesn’t change the answer. I think this would be a mistake in communication, but otherwise benign.
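To make this concrete, here is a minimal Monte Carlo sketch (in Python, with made-up input distributions chosen purely for illustration) showing that, under independence, the factored calculation matches the full expectation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Hypothetical, independently drawn inputs (illustrative numbers only):
M = rng.normal(0.010, 0.002, n)  # baseline mortality [deaths/year/child]
R = rng.normal(0.15, 0.03, n)    # mortality reduction [fraction]
C = rng.normal(1.50, 0.20, n)    # treatment cost [dollars/child/year]

lhs = np.mean(M * R / C)                        # E[M·R/C]
rhs = np.mean(M) * np.mean(R) * np.mean(1 / C)  # E[M]·E[R]·E[1/C]
print(lhs, rhs)  # agree up to sampling noise because M, R and C are independent
```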

The one place where I believe this mathematical mistake translates into an incorrect answer is in the use of discount rates. From GiveWell’s explanatory document:

“The discount rate's primary effect in the cost-effectiveness analyses of our top charities is to represent how much we discount increases in consumption resulting from the long run effects of improved child health for our malaria, deworming and vitamin A charities (which we call "developmental effects"). It also affects the longer-run benefits from cash transfers. We don't discount mortality benefits in our cost-effectiveness analyses.”

This figure shows the cost-effectiveness of all the charities in the CEA spreadsheet, when varying the discount rate.[4]

[Figure: cost-effectiveness vs discount rate]

Deworming interventions, shown in dashed lines, vary considerably with the discount rate because their path to impact relies on multi-decadal earnings increases for the recipients. The other interventions are less sensitive to the discount rate because more of their impact comes via direct mortality reduction, which is not subject to a discount rate in GiveWell’s analysis. However, all the CEAs include an analysis of long-term income changes, so all display some sensitivity to the discount rate.

The GiveWell approach of making a best point estimate of a random variable and using it in subsequent calculations breaks down here, because the cost-effectiveness is not a linear function of the discount rate. The precise function relating discount rate to cost-effectiveness for the deworming interventions is a bit complex.[5] We can just note that the cost-effectiveness is some function $f$ of, among other things, the discount rate $d$:

$$V = f(d, \ldots)$$
We care about the expected value of the cost-effectiveness:

$$\mathbb{E}[V] = \mathbb{E}[f(d, \ldots)]$$
GiveWell makes the further implicit assumption that

$$\mathbb{E}[f(d)] = f(\mathbb{E}[d])$$
But this assumption is false!

Because $f$ is nonlinear (specifically, convex in $d$), being wrong about the discount rate in one direction changes the cost-effectiveness by more than being wrong by the same amount in the other direction; by Jensen's inequality, $\mathbb{E}[f(d)] > f(\mathbb{E}[d])$. In this case, if the appropriate[6] discount rate is significantly lower than GiveWell’s current value of 4%, then deworming looks very good indeed, while if the appropriate rate is higher, deworming looks only moderately worse.
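To see the asymmetry numerically, here is a minimal sketch (in Python) using the simplified benefit stream from footnote 5 as a stand-in for the real deworming CEA; the absolute numbers are illustrative only:

```python
# Simplified deworming benefit stream (footnote 5): income benefits start
# after 8 years and last 40 years, with each year's benefit discounted by
# r^t where r = 1/(1+d).
def f(d: float) -> float:
    r = 1.0 / (1.0 + d)
    return sum(r**t for t in range(8, 48))  # well-defined even at d = 0

base = f(0.04)         # GiveWell's point estimate
print(f(0.02) - base)  # gain if the true rate is 2%: roughly +8.8
print(base - f(0.06))  # size of the loss if the true rate is 6%: roughly 5.0
```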

This means that simply plugging our best guess for the discount rate into the CEA is a mathematically invalid move that gives the wrong answer.[7] The correct approach would involve deciding on a probability distribution representing our credences over different discount rates, and then integrating the product of $f$ and that distribution. Others have argued GiveWell should do uncertainty analysis to keep better track of the risk that their recommendations are badly off. My concern differs from this: even if we just care about maximising expected value, without being risk averse, we should avoid using point estimates of the discount rate.

I make no claim to have a good way of choosing this probability distribution, just that the current implicit one (100% credence on 4%) is clearly wrong. For illustrative purposes, here are three slightly less wrong discrete distributions we could have over discount rates: equal credence between 2% and 6%, a binomial distribution, and uniform credence on 0%-8%.

[Figure: three probability distributions over discount rates]

The more credence we place on lower discount rates, the better deworming charities look compared to the baseline methodology, including relative to other charities:

[Figure: changes to cost-effectiveness under different scenarios]
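As a sketch of how figures like these can be produced, the snippet below evaluates E[f(d)]/f(4%) under each of the three distributions, again using the simplified f from footnote 5 rather than the full CEA; the exact binomial construction (d = k% with k ~ Binomial(8, ½)) is my assumption about the distribution pictured above:

```python
import numpy as np
from math import comb

def f(d: float) -> float:
    # Simplified benefit stream from footnote 5: years 8..47, discounted at rate d
    r = 1.0 / (1.0 + d)
    return sum(r**t for t in range(8, 48))

base = f(0.04)  # the current implicit distribution: all credence on 4%

# 1. Equal credence on 2% and 6%
two_point = 0.5 * f(0.02) + 0.5 * f(0.06)
# 2. d = k% with k ~ Binomial(8, 1/2), so the mean stays at 4% (my assumption)
binom = sum(comb(8, k) / 2**8 * f(k / 100) for k in range(9))
# 3. Uniform credence on 0%-8%, approximated on a fine grid
uniform = np.mean([f(d) for d in np.linspace(0.0, 0.08, 801)])

for name, ev in [("2%/6%", two_point), ("binomial", binom), ("uniform", uniform)]:
    print(f"{name}: E[f(d)] / f(4%) = {ev / base:.3f}")
# Every ratio exceeds 1 (Jensen's inequality); I get roughly 1.12, 1.06 and 1.18.
```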

Thus, GiveWell improving their modelling of discount rates could increase the estimated cost-effectiveness of deworming charities by upwards of 10%. The takeaway I am hoping for is the pointer towards a better process, rather than a defence of my own very dodgy numbers.

My accompanying post argues that the large chance of TAI within the coming decades should cause GiveWell to use a higher discount rate.

Notes


  1. I am compelled by the foundational critique of GiveWell’s approach that its CEAs ignore almost everything of value by leaving out longterm effects, such as via population size and economic growth (Greaves 2020). However, I feel I have nothing important and new to add on this matter, and moreover by my reading of the rules it is outside the scope of the competition. ↩︎

  2. In the real model, many other adjustments are made, and developmental benefits are also included, which I ignore. In my notation, V corresponds to row 33, M to row 31, R to row 24 and C to row 19. ↩︎

  3. For instance, imagine the three random variables are each the outcome of flipping the same coin. Suppose I have 50% credence the coin is fair, and 25% credence each that the coin has heads on both sides or tails on both sides. Using the fallacious reasoning, we would calculate E[HHH] = E[H]·E[H]·E[H] = (½)³ = ⅛. The correct reasoning, though, gives E[HHH] = E[HHH|fair]·P[fair] + E[HHH|heads bias]·P[heads bias] + E[HHH|tails bias]·P[tails bias] = ⅛·½ + 1·¼ + 0·¼ = 5/16, which is (significantly) greater than ⅛. ↩︎

  4. I got this data manually, by making a copy of the CEA spreadsheet, plugging in each discount rate value, and then copying the cost-effectiveness of each charity. Where the charity has separate columns for different countries/regions, I always used the highest cost-effectiveness figure. I used the numbers from the “Units of value generated per philanthropic dollar spent, after accounting for all adjustments” row, or the row with the most similar name in that sheet, with the exception of AMF, where I used the “Units of value generated per dollar spent” row, as the other row is hard-coded numbers that do not update when I change the discount rate. The data I used, and all graphs, are in an accompanying spreadsheet. ↩︎

  5. Let the discount rate be $d$, and the ratio of value from one year to the next be $r = 1/(1+d)$ (which is just how discount rates are defined); then the cost-effectiveness is proportional to $r^8 \cdot \frac{1 - r^{40}}{1 - r}$, using simple geometric series maths and GiveWell’s own values of the income benefits starting after 8 years and lasting for 40. ↩︎

  6. I will refrain from saying “true”, given the subjective judgements involved. ↩︎

  7. A similar complaint could be made about these parameters of 8 and 40 years: a better model would treat them as variables with probability distributions, or better yet would not have the effect be binary, but rather continuously increasing over the first decade and later gradually declining to zero. This is outside the scope of this article, but could be addressed in conceptually similar ways. ↩︎

Comments



Nice!

However, I still doubt my conclusion because on the meta level it seems like an obvious thing that would be surprising if no one at GiveWell had ever thought of doing, which is evidence I am missing something important.

A thing that you might be missing is that GiveWell only assigns limited manpower to its CEAs. This means they may not have noticed relatively subtle issues like this: clear once you notice them, but perhaps hard to notice unprompted.

Thanks.  I suppose I was thinking that the CEAs are one of the core products GiveWell creates and a lot of effort has gone into them.  But yes, once I have thought of something it probably seems more obvious than it is.

With limited manpower, GiveWell also has to prioritize which CEA improvements to make, and added complexity can moreover increase the risk of errors.

Thanks for your entry!

I cannot think of any plausible ways in which these three random variables correlate. Surely learning that the price of vitamin A tablets just doubled (C) does not affect how effective they are

Price and effectiveness correlate because people are willing to pay higher prices for more effective things. One reason the price would double abruptly is because of new information indicating higher effectiveness.

Ah, good point; I think I was too bold there. Perhaps the weaker thing I should have said is that I don't think these effects would be large, and that I don't have a good suggestion for what GiveWell could practically do to model any such correlations among these variables.

Do you think you making that mistake could serve as a good model for how they could have made the mistakes that you think they made?

Yes, perhaps. I suppose one disanalogy is the length of time spent thinking about the topic; I would hope GiveWell's work is more thorough than mine.

You can never escape Jensen's inequality :) nice, clean argument!

Very interesting argument. As someone fond of Bayesian modeling, I am almost always in favor of replacing point estimates with distributions.

GiveWell links to this paper in their CEA spreadsheet. Their recommendation is a discount rate of 4% for upper-middle-income countries and 5% for low-income countries. This recommendation seems to be based on three factors: past growth of GDP, implied social discount rate, and projected GDP growth. All three are measured with uncertainty and will vary by country. I think it would be very interesting to take that variability into account!

Thanks! I think most of the value would be captured by just switching to a single distribution for all countries, but yes having slightly different distributions for each country could be slightly better. My (fairly uninformed) guess would be that the socio-economic status of each of GiveWell's countries is similar enough that we wouldn't have much reason to use different distributions.

I was actually thinking of the same thing.  Nice work here bro.
