The "mean" cost-effectiveness of interventions with uncertain impact can be misleading, sometimes significantly.
We want to consider mean(cost)/mean(effect), not mean(cost/effect).
2023 Edit: I now think that the top comment is right: we want to use mean(effect/cost). I'll keep the rest of the post as is, as I still think it can be useful.
Epistemic status (how sure I am of this): I'm pretty confident about the main claim, but still confused about the details. I end the post with some questions.
Minimal extreme example:
Let's say that you have a magical intervention that has:
- 33% chance of saving 1 life
- 33% chance of saving 100 lives
- 33% chance of saving 199 lives
All for the known cost of $10,000.
It would be an amazing intervention! If you run hundreds of similar interventions, you can save lives with cost-effectiveness of $100/life: the expected value is 100 lives saved, and the cost is always $10,000.
But here is what happens if you model it in Guesstimate:
You get $3400 mean cost per life! Changing the useful value by a factor of 34!
This is obvious in hindsight, since Guesstimate shows the "mean" cost-effectiveness, mean(cost/effect),
instead of what we care about, which is mean(cost)/mean(effect).
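As a rough check of the two quantities in the minimal example, mean(cost/effect) and mean(cost)/mean(effect), here is a small Python sketch. It assumes the three outcomes are exactly equally likely (1/3 each rather than 33%), and the variable names are made up for illustration.

```python
# Outcomes of the minimal example, assumed equally likely (1/3 each).
lives_saved = [1, 100, 199]
cost = 10_000  # known, constant cost in dollars

# What a "mean cost per life" cell reports: the mean of the per-outcome ratios.
mean_of_ratios = sum(cost / lives for lives in lives_saved) / len(lives_saved)

# What we care about: expected cost divided by expected lives saved.
mean_lives = sum(lives_saved) / len(lives_saved)
ratio_of_means = cost / mean_lives

print(f"mean(cost/effect)       = ${mean_of_ratios:,.0f} per life")  # $3,383
print(f"mean(cost)/mean(effect) = ${ratio_of_means:,.0f} per life")  # $100
```

The first number is dominated by the single outcome where only 1 life is saved ($10,000 per life), which is why it lands near the $3400 that Guesstimate reports from sampling.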
Looking at the 5th and 95th percentiles helps in many cases, but not in scenarios where there is a very small probability of very high effects and a significant probability of small effects. A minimal Guesstimate example is one with a 4.8% chance of saving 1000 lives and a 95.2% chance of saving 1 life.
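To see why the percentiles do not help in this example, here is a minimal sketch that computes them exactly for the two-outcome distribution. The $10,000 cost is an assumption carried over from the earlier example (no cost is stated for this one), and `quantile` is a hypothetical helper written for this sketch.

```python
cost = 10_000  # assumed constant cost in dollars (not stated for this example)

# (probability, lives saved) for the two possible outcomes.
outcomes = [(0.048, 1000), (0.952, 1)]

# Distribution of cost per life as (value, probability), sorted by value.
cost_per_life = sorted((cost / lives, p) for p, lives in outcomes)

def quantile(dist, q):
    """Smallest value whose cumulative probability reaches q (0 <= q <= 1)."""
    cumulative = 0.0
    for value, p in dist:
        cumulative += p
        if cumulative >= q:
            return value
    return dist[-1][0]

p5 = quantile(cost_per_life, 0.05)
p95 = quantile(cost_per_life, 0.95)
mean_of_ratios = sum(value * p for value, p in cost_per_life)
ratio_of_means = cost / sum(p * lives for p, lives in outcomes)

print(f"5th percentile of cost/effect:  ${p5:,.0f}")              # $10,000
print(f"95th percentile of cost/effect: ${p95:,.0f}")             # $10,000
print(f"mean(cost/effect):              ${mean_of_ratios:,.0f}")  # $9,520
print(f"mean(cost)/mean(effect):        ${ratio_of_means:,.0f}")  # $204
```

Because the high-impact outcome has a probability just below 5%, both percentiles sit at $10,000 per life, and nothing in the interval hints at the roughly $204 per life you would get across many such interventions.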
Some practical examples of very small chances of huge value might be deworming or policy interventions. For those, mean(cost/effect) and mean(cost)/mean(effect) might differ by orders of magnitude.
Three recent examples:
- https://forum.effectivealtruism.org/posts/h2N9qEbvQ6RHABcae/a-critical-review-of-open-philanthropy-s-bet-on-criminal (search for "EDIT 22/06/2022:" in this post, it changes a result by an order of magnitude)
- https://forum.effectivealtruism.org/posts/RXm2mxvq3ReXmsHm4/ (in this case the difference is smaller, 54 vs 31, see https://colab.research.google.com/drive/1lnwjw2_zJHL4rBepw9yzAHQE0zL2ClDb?usp=sharing )
- https://forum.effectivealtruism.org/posts/9iBpokRpoJ2xspfnb/estimating-the-cost-effectiveness-of-scientific-research (19 vs 12, see https://colab.research.google.com/drive/1B_jnFGeUB_2fV7XR942QKJ_xyGMzvGMz?usp=sharing )
If you want to check another Guesstimate model
For most models, you can just manually calculate mean(cost)/mean(effect), since the means are shown in the Guesstimate UI.
For more complex cases, if you are comfortable with Python, you can port a Guesstimate model to numpy using https://recursing.github.io/guesstimate_to_squiggle/ and add .mean() liberally (very MVP, let me know if it doesn't work with a model you want to try).
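As an illustration of where the .mean() calls go, here is a hedged numpy sketch. It is not the output of the tool linked above: the lognormal inputs and their parameters are invented for this example and do not come from any real model.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is reproducible
n = 100_000

# Hypothetical inputs: a fairly certain cost and a very uncertain effect.
cost = rng.lognormal(mean=np.log(10_000), sigma=0.2, size=n)
effect = rng.lognormal(mean=np.log(50), sigma=1.5, size=n)

# Dividing the sample arrays and then averaging gives mean(cost/effect).
mean_of_ratios = (cost / effect).mean()

# Averaging each array first and then dividing gives mean(cost)/mean(effect).
ratio_of_means = cost.mean() / effect.mean()

print(f"mean(cost/effect)       = {mean_of_ratios:,.0f}")
print(f"mean(cost)/mean(effect) = {ratio_of_means:,.0f}")
```

With these made-up parameters the first number comes out several times larger than the second, because the samples with a small effect dominate the average of the ratios.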
Possible solutions / mitigations:
- If costs are constant in your model, consider looking at the value per dollar (or per $1,000) instead of dollars per value, so the denominator is constant. The minimal example would become https://www.getguesstimate.com/models/20682
Edit: this is by far the most favored approach in comments, and should cover most cases.
My view is that this is useful in part because huge uncertainties in costs are rare.
- If you're interested in a single number for some sense of "expected cost-effectiveness", get the expected value and the expected cost and divide those numbers instead of the distributions (if the distributions can be considered independent).
- Other ideas? I'm definitely not an expert in any of this and there's probably a nice mathematical/statistical solution that I can't think of! Please comment if you think of anything!
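The first mitigation above can be sketched on the minimal example. This assumes a constant cost and equally likely outcomes (1/3 each), and the variable names are made up for illustration.

```python
lives_saved = [1, 100, 199]  # outcomes, assumed equally likely
cost = 10_000                # constant cost in dollars

# Value per $1,000: the uncertain quantity is now in the numerator,
# so averaging over outcomes is the same as dividing the means.
lives_per_1000 = [lives / (cost / 1_000) for lives in lives_saved]
mean_lives_per_1000 = sum(lives_per_1000) / len(lives_per_1000)

# Converting back to dollars per life recovers mean(cost)/mean(effect).
implied_cost_per_life = 1_000 / mean_lives_per_1000

print(f"mean lives saved per $1,000: {mean_lives_per_1000:.1f}")     # 10.0
print(f"implied cost per life:       ${implied_cost_per_life:,.0f}")  # $100
```

When the cost is constant, the mean of value per dollar and the ratio of the means agree, so the displayed mean is the number we care about.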
Some questions I still have:
- How can we express the uncertainty around cost/effectiveness if the ratio distribution is hard to reason about and has misleading moments?
- How could the UI in Guesstimate or some potential alternative indicate to the user when to use mean(f(x)) and when to use f(mean(x)) for nonlinear functions, to prevent people from making this very common mistake?
We might want to use the former for e.g. the value of cash transfers
Really curious to know if anyone has ideas!
Huge thanks to Sam Nolan, Justis Mills, and many others for fleshing out the main idea, editing, and correcting mistakes.
This work is licensed under a Creative Commons Attribution 4.0 International License.
Edit: this used to say "Underestimating the actual effectiveness by a factor of 34". But I don't think that this value is more "actual" than the other, just much more useful.
Assuming independence between cost and effect
Edit: several commenters pointed out that I'm implicitly considering mean(cost)/mean(effect) over many interventions, and it's not obvious at all that that's what we want in most cases.
I still think that's what we want in almost every case, but there's some interesting discussion going on in comments