- The exponential distribution offers a statistical framework for evaluating the cost-effectiveness of global health interventions
- Shifting donations from interventions near the 50th percentile of effectiveness to groups near the 97th percentile of effectiveness can multiply the impact by at least 5 times
The Exponential Distribution
I was reading an old essay by Toby Ord when I came across this striking graph that ranked 108 health interventions from the Disease Control Priorities in Developing Countries (DCP2) report:
Most obvious is the implication that interventions differ dramatically in their effectiveness, with some interventions averting orders of magnitude more DALYs (Disability-Adjusted Life Years) per dollar than others. As Ord points out,
"Moving money from the least effective intervention to the most effective would produce about 15,000 times the benefit, and even moving it from the median intervention to the most effective would produce about 60 times the benefit."
However, what also struck me was the graph's resemblance to a common statistical model: the exponential distribution. The exponential distribution is a model used to describe the probability and impact of events that are usually benign but potentially dramatic, like the costs of natural disasters, the largest single-day declines for the Nasdaq, and the wealth of individuals. In each of these cases, most of the events have values near 0 (most natural disasters cause very little damage, most single-day stock market drops are small, most people hold relatively little wealth) but a few events have extremely large values (Hurricane Katrina caused $170 billion in damage, the largest Nasdaq drop was 12.32% in 2020, and Jeff Bezos is worth $177 billion).
Exponential distributions are completely determined by their mean. For example, taking the mean of Ord's data to be 60 DALYs per $1000 and generating 1000 sample data points according to an exponential distribution with a mean of 60, we get this graph:
The blue bars represent how many generated values fell within each bar's range (i.e. about 175 of the 1000 sample data points were between 0 and 10 DALYs per $1000), and the black line shows the expected density at that value based on the exponential distribution with a mean of 60. The key here is that just taking the mean of Ord's data and generating new data based on what we would expect from the corresponding exponential distribution gives a graph that looks quite similar in shape and DALY scale to Ord's actual data from earlier. (Note: Ord's graph plots a single horizontal bar for each data value, while this approach plots a vertical bar whose height depends on how many "interventions" had a certain effect. However, the effect is the same: the few extremely cost-effective interventions stand out to the right while the majority of the not-very-cost-effective interventions clump to the left.)
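The sampling step described above can be reproduced with a short sketch. This is not the code used to make the original graph; the seed, the helper names (`sample_exponential`, `expected_count`, `histogram`), and the choice to print counts instead of plotting are assumptions made for illustration.

```python
import math
import random

MEAN = 60.0        # DALYs per $1000, the mean quoted above
N_SAMPLES = 1000   # number of simulated "interventions"
BIN_WIDTH = 10.0   # bin width matching the 0-10 example above


def sample_exponential(mean, n, seed=0):
    """Draw n values from an exponential distribution with the given mean."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    # random.expovariate takes the rate, which is 1 / mean
    return [rng.expovariate(1.0 / mean) for _ in range(n)]


def expected_count(lo, hi, mean, n):
    """Expected number of values in [lo, hi) under the exponential model."""
    return n * (math.exp(-lo / mean) - math.exp(-hi / mean))


def histogram(samples, bin_width, n_bins):
    """Count samples per bin; values beyond the last bin are ignored."""
    counts = [0] * n_bins
    for x in samples:
        i = int(x // bin_width)
        if i < n_bins:
            counts[i] += 1
    return counts


samples = sample_exponential(MEAN, N_SAMPLES)
counts = histogram(samples, BIN_WIDTH, n_bins=10)
for i, observed in enumerate(counts):
    lo, hi = i * BIN_WIDTH, (i + 1) * BIN_WIDTH
    model = expected_count(lo, hi, MEAN, N_SAMPLES)
    print(f"{lo:5.0f}-{hi:<5.0f} observed={observed:4d} expected={model:6.1f}")
```

Under this model the first bin (0 to 10) is expected to hold roughly 154 of the 1000 points on average, and any single random draw will land somewhat above or below that.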
More recent data show a similar pattern. Plotting the DCP3 (2018) equivalents of Ord's earlier data gives this graph:
The blue bars represent the number of health interventions (out of the 94 evaluated) that avert DALYs in the range of the bar (i.e. 48 interventions avert between 0 and 0.005 DALYs per dollar), while the black line shows the expected distribution of interventions according to the exponential distribution with a mean of 0.016 DALYs per dollar (the mean of the data). Admittedly, the fit is not as nice here, but with only 94 data points (versus the 1000 in the previous graph of example data), more fluctuation from the line is expected.
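To put a number on how far the first bar sits from the line, the expected count for that bin can be computed directly from the figures quoted above (94 interventions, mean 0.016 DALYs per dollar, 48 observed between 0 and 0.005). The helper name `expected_count` is an assumption for illustration, and this is a rough check, not a formal goodness-of-fit test.

```python
import math

MEAN = 0.016             # DALYs per dollar, the DCP3 mean quoted above
N = 94                   # number of interventions evaluated
OBSERVED_FIRST_BIN = 48  # interventions between 0 and 0.005 DALYs per dollar


def expected_count(lo, hi, mean, n):
    """Expected number of values in [lo, hi) under the exponential model."""
    return n * (math.exp(-lo / mean) - math.exp(-hi / mean))


expected_first_bin = expected_count(0.0, 0.005, MEAN, N)
print(f"expected about {expected_first_bin:.1f}, observed {OBSERVED_FIRST_BIN}")
```

The model predicts roughly 25 interventions in the lowest bin against the 48 observed, which suggests the data may contain more very ineffective interventions than a pure exponential model would produce.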
Interesting comparison, but who cares?
The Memoryless Property
What makes the exponential distribution so powerful is a result known as the memoryless property. This result says that the data points in an exponential distribution past some cut-off, measured from that cut-off, also follow an exponential distribution with the same mean. For example, taking the exponential distribution with a mean of 0.016 from earlier, the proportion of all interventions between 0 and 0.05 DALYs/dollar is the same as the proportion of interventions better than 0.05 DALYs/dollar that are between 0.05 and 0.1 DALYs/dollar. The proportions are the same; the interval is just shifted up in the subset. In other words, every upper tail of an exponential distribution has the same shape as the original distribution.
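The example above can be checked numerically. The sketch below uses the survival function P(X > x) = exp(-x / mean) of the exponential distribution; the function names are hypothetical and chosen only for this illustration.

```python
import math

MEAN = 0.016  # DALYs per dollar, as in the example above


def survival(x, mean):
    """P(X > x) for an exponential distribution with the given mean."""
    return math.exp(-x / mean)


def prob_between(lo, hi, mean):
    """P(lo < X <= hi) for an exponential distribution with the given mean."""
    return survival(lo, mean) - survival(hi, mean)


# Share of all interventions between 0 and 0.05 DALYs/dollar
share_all = prob_between(0.0, 0.05, MEAN)

# Share of interventions above 0.05 that fall between 0.05 and 0.1
share_tail = prob_between(0.05, 0.10, MEAN) / survival(0.05, MEAN)

print(f"{share_all:.4f} {share_tail:.4f}")
```

Both shares come out the same (about 0.956), because dividing by the probability of exceeding the cut-off cancels the shift exactly.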
Assuming that health intervention effectiveness follows an exponential distribution, this says that shifting your investment from the median intervention (in terms of effectiveness) to the 75th percentile intervention adds as much benefit per dollar as shifting your investment from the 75th percentile to the 87.5th percentile, and so on. In the real world, you'll eventually run out of interventions to shift your investment to, but until then, the differences can be dramatic. For example, taking the DCP3 data's approximate exponential distribution, shifting donations from groups near the 50th percentile of effectiveness to groups near the 97th percentile of effectiveness can multiply the impact by at least 5 times. Applying the same process to Ord's data shows even more dramatic differences.
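The percentile arithmetic above follows from the quantile function of the exponential distribution, which is -mean * ln(1 - p) for percentile p. A minimal sketch, assuming the DCP3 mean of 0.016 DALYs per dollar and a hypothetical helper named `quantile`:

```python
import math

MEAN = 0.016  # DALYs per dollar, the DCP3 mean quoted earlier


def quantile(p, mean):
    """Value below which a fraction p of an exponential distribution falls."""
    return -mean * math.log(1.0 - p)


median = quantile(0.50, MEAN)
p75 = quantile(0.75, MEAN)
p875 = quantile(0.875, MEAN)
p97 = quantile(0.97, MEAN)

# Each halving of the remaining upper tail adds the same amount: mean * ln(2)
print(f"median -> 75th:  +{p75 - median:.5f} DALYs per dollar")
print(f"75th -> 87.5th:  +{p875 - p75:.5f} DALYs per dollar")

# Ratio of the 97th percentile to the median
print(f"97th / median: {p97 / median:.2f}")
```

The two steps add the same amount, and the 97th percentile comes out at a little over 5 times the median. Within this model the ratio depends only on the two percentiles, since the mean cancels when dividing.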
A few caveats worth mentioning:
- The exponential model seems to fit the DCP2 and DCP3 data pretty well but potentially underestimates how many very ineffective interventions and how many very effective interventions there are. That is, the exponential model might suggest there are more interventions in the middle than there actually are. This difference would make shifting investments from low-effectiveness to high-effectiveness even more useful but would limit the use of the exponential model to describe interventions.
- The underlying data in both DCP reports is somewhat sparse and uneven. For example, some interventions use volunteers and therefore have no labor costs, while others give DALYs only in vague "expert estimates." This leads to seemingly strange results, like voluntary male circumcision being up to seven times more cost-effective than malaria prevention (sprays, nets, and insect control).
- The exponential distribution can model the cost-effectiveness of health interventions
- Properties of the exponential distribution show that shifting investments (especially between good interventions and very good interventions) can have dramatic effects