About two months ago, Michael Plant, Joel McGuire and Clare Donaldson from the Happier Lives Institute posted on this forum about Using Subjective Well-Being to Estimate the Moral Weights of Averting Deaths and Reducing Poverty. I wrote some feedback on the post and Michael asked me to share it here.
These are my own views but I'm very grateful to my colleagues on the Founders Pledge research and advisory teams for comments and discussion.
The trade-off between saving lives and increasing consumption is central to prioritisation in global health and development, and I think subjective well-being (SWB) offers potentially very useful tools for making trade-offs like this, so I’m really excited to see this post. It functions very well as an initial proof of concept of using SWB in cost-effectiveness analyses and I think it’s a strong early step in applying well-being analysis. The analysis impressively comes to reasonable conclusions, often with only poor-quality data available. I’m excited to see this methodology develop further.
The post and guesstimate model are clear and easy to understand. I was especially impressed by the presentation of the guesstimate model. These easily get unwieldy but this is extremely clear and easy to follow.
I have one major comment and a few minor comments. Most of the minor comments are best interpreted as things to think about in future work rather than direct criticisms of the post.
One major comment
Beware the difference between an expected value of a ratio and a ratio of expected values. You calculate the expected value of a ratio but I think you should instead look at the ratio of expected values. The ratio of expected values is about 40% lower than the expected ratio.
A simple illustrative example
Suppose there are two options, A and B, and two scenarios, 1 and 2. You are uncertain about which scenario is actual but you know that they are equally likely. Suppose that you can get the following pay-offs by choosing option A or B in scenario 1 or 2:
Which option should we choose? Ratios are helpful: if $x/y > 1$, then we know that $x > y$ (as long as $x$ and $y$ are positive). Here the expected ratio $E(A/B)$ is greater than 1, so should we expect A to be better than B and choose A over B? No. Note that $E(B/A)$ is also greater than 1, so the expected ratio is misleading. We care about the higher expected value and $E(B) > E(A)$, so choose B.
Similarly, we could look at the ratio of expected values: $E(A)/E(B) < 1$. This is the right ratio to look at when choosing between A and B.
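To make this concrete, here is a minimal sketch with hypothetical pay-offs. The numbers are assumptions chosen only to reproduce the pattern described above (both expected ratios above 1, yet B has the higher expected value), and the variable names are illustrative.

```python
# Hypothetical pay-offs (assumed for illustration only):
# one value per scenario, and the two scenarios are equally likely.
payoff_a = [10.0, 1.0]   # option A in scenario 1, scenario 2
payoff_b = [1.0, 12.0]   # option B in scenario 1, scenario 2


def mean(values):
    """Expected value when every entry is equally likely."""
    return sum(values) / len(values)


expected_a = mean(payoff_a)
expected_b = mean(payoff_b)

# Expected value of each ratio, taken scenario by scenario.
expected_ratio_a_over_b = mean([a / b for a, b in zip(payoff_a, payoff_b)])
expected_ratio_b_over_a = mean([b / a for a, b in zip(payoff_a, payoff_b)])

# Ratio of expected values: the quantity that tracks the better choice.
ratio_of_expected = expected_a / expected_b

print("E(A/B) =", expected_ratio_a_over_b)
print("E(B/A) =", expected_ratio_b_over_a)
print("E(A)/E(B) =", ratio_of_expected)
```

With these assumed numbers, both expected ratios exceed 1, so neither can be telling us which option is better, while the ratio of expected values is below 1 and correctly points to B.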
In the context of this post
In the case of this post, we’re not actually choosing between two options, we’re trying to determine how to weigh two outcomes against each other. So perhaps it’s not immediately obvious that this criticism applies here, but it does. Suppose there are two acts:
- Act C, that doubles n people’s consumption for one year
- Act L, that saves m lives of children under the age of 5
Ideally, in choosing between these, we’d build probabilistic models like this one that incorporate the distributions here for the value of doubling consumption for one person for one year (call this $c$) and the value of saving one life of a child under the age of 5 (call this $l$). We look at the expected value of performing C and L and choose whichever is greater. The relevant ratio here is $E(C)/E(L)$ (rather than $E(C/L)$). We might approximate the expected value of C as the expected number of income doublings multiplied by the expected value of doubling one person’s income, i.e. $E(C) \approx E(n) \cdot E(c)$ (and similarly, $E(L) \approx E(m) \cdot E(l)$). Then the relevant ratio for comparing C and L is:
$$\frac{E(C)}{E(L)} \approx \frac{E(n)}{E(m)} \cdot \frac{E(c)}{E(l)}$$
So the relevant ratio for comparing income doublings and saving lives is the ratio of their expected values, not the expected value of their ratio. I don’t see a use for the expected ratio $E(l/c)$ in this analysis, so this shouldn’t be the headline figure.
Another way to see that the expected ratio is misleading is to consider inverses. Suppose the expected ratio told us how many times better saving a life is than doubling one person’s consumption. One could just as easily have calculated the ratio the other way round, i.e. the expected ratio of doubling consumption over saving a life, and this would tell us how many times better doubling consumption is than saving a life. If that’s the case, then the reciprocal of this expected ratio should also tell us how many times better saving a life is than doubling one person’s consumption (e.g. if we think doubling consumption is $1/k$ times as good as saving a life, then we think saving a life is $k$ times as good as doubling consumption). But this gives a different answer to the expected ratio of saving a life over doubling consumption:
1. Expected ratio of saving a life/doubling consumption = 240 (your model)
2. Expected ratio of doubling consumption/saving a life = 0.0085 (can calculate in your model)
3. 1/(expected ratio of doubling consumption/saving a life) = 1/0.0085 ≈ 120 ≠ 240 (simple calculation)
But 1 and 3 purport to describe the same thing, i.e. how many times better saving a life is than doubling consumption, so they should be equal. Therefore, at least one of them fails to do this and, since we have no principled reason to choose one over the other, we should reject both.
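The same asymmetry can be reproduced in a small simulation. This is only a sketch under stated assumptions: the two lognormal distributions below are hypothetical stand-ins (they are not the distributions in the guesstimate model), and the two quantities are assumed to be independent.

```python
import random
from statistics import fmean

random.seed(0)
N = 100_000

# Hypothetical, independent lognormal distributions (assumptions for illustration):
# value of doubling one person's consumption, and value of saving one life.
cons = [random.lognormvariate(0.0, 0.5) for _ in range(N)]
life = [random.lognormvariate(5.0, 0.5) for _ in range(N)]

# Expected ratio in each direction.
expected_life_over_cons = fmean([l / c for l, c in zip(life, cons)])
expected_cons_over_life = fmean([c / l for l, c in zip(life, cons)])

# Ratio of expected values.
ratio_of_expected = fmean(life) / fmean(cons)

print("E(life/cons)      =", expected_life_over_cons)
print("1 / E(cons/life)  =", 1 / expected_cons_over_life)
print("E(life) / E(cons) =", ratio_of_expected)
```

For independent positive quantities, the ratio of expected values sits between the expected ratio and the reciprocal of the inverse expected ratio, which matches the ordering of the figures above.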
Calculate the expected value of doubling consumption for one person for one year in your model (about 1.5), then read off the relevant expected values and divide:
- Deprivationism: 210/1.5 = 140 (rather than 240)
- TRIA: 45/1.5 = 30 (rather than 50)
Totalism and births averted per life saved
As you develop this methodology further, I think it’s important that you account for other moral views, most notably totalism. As you’re aware, totalism is a popular view (especially in EA) and, depending on how we ought to respond to moral uncertainty, we might think that totalism (or something similar) dominates our decision calculus when acting under moral uncertainty (Greaves and Ord 2017). I think it would be valuable to know what a similar totalist analysis yields.
As you mention, the value of saving lives, according to totalism will be sensitive to the effect that saving a child’s life has on fertility. You write:
A report written for GiveWell estimated that in some areas where it recommends charities the number of births averted per life saved is as large as 1:1, a ratio at which population size and growth are left effectively unchanged by saving lives. For totalists, the value of saving lives in a 1:1 context would be very small (compared to one where there was no fertility reduction) as the value of saving one life is ‘negated’ by the disvalue of causing one less life to be created.
This wasn’t my reading of Roodman’s report (though this isn’t my area of expertise and my memory of the report is a little hazy). I understood the conclusion to be that the 1:1 ratio is mainly in countries with low infant mortality rates, in which parents choose the number of children they want to have, but that saving lives in a high infant mortality rate environment could be very different. Areas in which we’d aim to save lives of young children, almost by definition, have relatively high infant mortality rates, so saving lives is more likely to lead to additional lives. E.g. in Kenya, the infant mortality rate is about 3%, compared to 0.4% in the UK. Many parents in the UK choose the size of their family, so saving a child’s life here often means an additional child won’t be born. I don’t know the effect of saving an infant’s life in Kenya on fertility but the large difference in infant mortality rate suggests that the effect on fertility could be much smaller.
I think a valuable improvement would be to investigate the effect of saving lives on fertility rates in the relevant contexts further and use this to run a totalist analysis. As you note, totalism raises difficult questions about the extent to which the world/the relevant regions are under or overpopulated, as these bear on the extent to which adding extra people affects average well-being. Even just side-stepping these for now by holding average well-being constant could be informative but further investigation of these questions could also be valuable.
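As a rough sketch of the simplest version of such a totalist adjustment, the function below scales the value of a life saved by the net number of additional lives. It assumes average well-being is held constant and that an averted birth forgoes a life of the same value as the one saved; the function name and inputs are hypothetical.

```python
def totalist_value_of_saving_life(value_per_life, births_averted_per_life_saved):
    """Net totalist value of saving one life, in the same units as value_per_life.

    Assumptions (illustrative only): average well-being is unchanged, and each
    averted birth forgoes a life worth exactly value_per_life.
    """
    if births_averted_per_life_saved < 0:
        raise ValueError("births averted per life saved must be non-negative")
    net_additional_lives = 1.0 - births_averted_per_life_saved
    return value_per_life * net_additional_lives


# Hypothetical inputs: value normalised to 1 per life.
no_fertility_response = totalist_value_of_saving_life(1.0, 0.0)
full_replacement = totalist_value_of_saving_life(1.0, 1.0)
partial_response = totalist_value_of_saving_life(1.0, 0.5)

print(no_fertility_response, partial_response, full_replacement)
```

The point of the sketch is only that the totalist result is linear in the fertility response, so the estimate for the relevant context matters a great deal.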
Wide bounds for parameters of exponential decay model
The exponential decay model does well to make reasonable estimates from your data. The 90% confidence interval displayed in Figure 1A is really wide though and, importantly, the areas under the lower and upper bounds of the confidence interval are very different. Guesstimate does a good job of incorporating this uncertainty but it seems like it would be valuable to reduce the uncertainty with longer-term follow-up data.
Moral value as a linear function of well-being
What we really care about is moral value. We’re looking at life satisfaction here as a proxy for that. Maybe moral value is a function of life satisfaction. In this post, you treat moral value as a linear function of life satisfaction because you treat a point gain in life satisfaction as equally valuable, no matter one’s starting life satisfaction. I know it’s standard to assume this but how well-grounded is it? Is a jump from 0 to 1 life satisfaction really as valuable as a jump from 9 to 10? This seems like something that could be tested through revealed and/or stated preferences (but of course that would then partially undermine the reason to look at life satisfaction in the first place). This could be understood in prioritarian terms but I don’t intend it like that: the difference in well-being between 0 and 1 life satisfaction could be larger than the difference in well-being between 9 and 10.
Perhaps, the (near) logarithmic relationship between income and life satisfaction provides indirect evidence for a linear relationship between moral value and life satisfaction because this shows that life satisfaction has diminishing returns to income and surely so does moral value. But this is only a very weak argument: it could be like saying “ and both display diminishing returns, so maybe y and z are roughly linear.” But , so this really isn’t a linear relationship at all! (I don't intend to imply that you make this argument, I'm just highlighting that diminishing returns can look very different.)
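As a purely illustrative pair (these particular functions are an assumption, picked only to make the point concrete), take two functions of $x$ that both display diminishing returns:

$$y = \log x, \qquad z = \sqrt{x}$$

Both are increasing and concave in $x$, but eliminating $x$ gives

$$z = e^{y/2}$$

so $z$ is exponential in $y$, which is far from a linear relationship.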
A few small points on the life expectancy estimates:
- How much variation in life expectancy is there across and within counties? I would expect the life expectancy of those whose lives are saved by GiveWell charities to be among the very lowest and if there’s lots of variation in life expectancy, then the relevant life expectancy could be significantly lower than at the county or national level.
- Your bounds seem too high to me, given the evidence you cite. There being 2 counties with life expectancy less than 60 seems fairly strong reason to believe that it’s more than 5% likely that the relevant life expectancy is less than 62. Even if life expectancy in every county were above 60, say, it’s possible that we could still expect the relevant life expectancy to be less than 60 if there’s lots of variation within counties (e.g. even if life expectancy across a county is 60, life expectancy of the relevant subgroup within the county could be less and this is more likely if there’s high variation in length of lifetime within counties).
- Technically, we should look at life expectancy given the current age rather than life expectancy at birth, and this increases as we survive more years (in practice). I would guess that life expectancy at 5 (or 2.5) is pretty similar to life expectancy at birth, so I doubt this matters much here.
- If you wanted to, you could estimate life expectancy at 5 (or 2.5 etc.) using the infant mortality rate. A simple version: if a proportion $p$ of children survive to age $n$ and life expectancy at birth is $E$, then (assuming those who die before age $n$ die at age $n/2$ on average) we have $E = p E_n + (1 - p) \frac{n}{2}$, where $E_n$ is life expectancy at age $n$ (measured as expected age at death). Just rearrange to find $E_n = \frac{E - (1 - p) n/2}{p}$
- You cite median life expectancy for a one-year old (i.e. the median age at which a one-year old will die?), which is relevant, but it’s not obvious to me how much weight to put on this. What proportion of infants in Kenya are potential recipients of life-saving interventions? If it’s much less than 50%, the median might not be very informative
- I’m not entirely sure how life expectancy is calculated but I think that ideally, you’d also account for life expectancy trends over time. Current life expectancy might not predict how long we can expect current infants to live (it might instead predict how long we can expect infants born N years ago to live)
- Is it possible to get any data on the distribution of the length of lifetimes? This would actually be better than life expectancy. Some people live much longer or shorter than their country/county’s life expectancy, so estimating the life expectancy understates the uncertainty in your model. Ideally, you’d estimate how long people will live, which would be a distribution with life expectancy as the mean but with more uncertainty. I doubt this makes a big difference to the end results though
- Historically, Japanese life expectancy data has been used in DALY calculations across the world so that the lives of people living in countries with low life expectancy aren’t weighted less than the lives of people living in countries with high life expectancy
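The simple adjustment for life expectancy at age n suggested above can be sketched as follows. It rearranges $E = p E_n + (1 - p) n/2$, treating $E_n$ as the expected age at death for those who survive to age $n$. The function name and the example inputs are hypothetical, not Kenyan data.

```python
def life_expectancy_at_age(e_birth, p_survive, n):
    """Expected age at death for those who survive to age n.

    Rearranges E = p * E_n + (1 - p) * n / 2, which assumes that those who
    die before age n die at age n / 2 on average.
    """
    if not 0 < p_survive <= 1:
        raise ValueError("p_survive must be in (0, 1]")
    return (e_birth - (1 - p_survive) * n / 2) / p_survive


# Hypothetical inputs: life expectancy at birth of 65, 95% survival to age 5.
e_5 = life_expectancy_at_age(e_birth=65.0, p_survive=0.95, n=5)
print("Expected age at death given survival to 5:", e_5)
```

With inputs of this kind the adjustment is a few years at most, which fits the guess that it is unlikely to matter much here.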
Bias in life satisfaction studies
- Psychology and economics studies (among many others) often overstate effect sizes, so the effect of cash transfers on life satisfaction might be overstated here
- However, the samples are very large, so maybe this isn’t too concerning
- This applies less to the average life satisfaction studies because there’s no particular reason to expect these to be biased one way or the other (there’s no pressure to find an effect)
- This could overstate the benefits of doubling consumption relative to saving lives
Indirect/spillover effects are hard to quantify and this is a nice start, with effects on other household members. Wider effects are hard to account for but perhaps a notable exception is the economic effects of cash transfers on non-recipients. You note the possibility of negative spillovers, but the larger, more relevant Egger et al. (2019) study that you cite suggests large positive spillovers on non-recipient households and firms. Well-being analysis could be well placed to account for such spillovers (though I do fear that such an investigation could quickly get tangled up in the Easterlin Paradox, in which case it might not be that tractable after all).
Variation in consumption
The use of the linear-log model to adjust from consumption increase to consumption doubling makes sense. In theory though, just looking at the mean pre-intervention consumption and mean cash transfer to estimate the mean effect on life satisfaction could be misleading if there’s lots of variation in these quantities. If there’s lots of variation in consumption, then the mean pre-intervention (and post-intervention) life satisfaction could be much lower than the life satisfaction corresponding to mean pre-intervention (and post-intervention) consumption: by Jensen’s inequality, $E(\log x) \le \log(E(x))$ for consumption $x$. I haven’t thought much about the implications for estimating the effect size but in principle, this could make a difference so it might be worth looking into.
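A tiny numerical check of this point, as a sketch only: the consumption values are hypothetical, and life satisfaction is assumed to be exactly the natural log of consumption (slope 1, no intercept) to keep the example minimal.

```python
import math

# Hypothetical consumption levels for four people (arbitrary units).
consumption = [0.5, 1.0, 1.5, 5.0]

# Mean life satisfaction, assuming life satisfaction = log(consumption).
mean_of_log = sum(math.log(x) for x in consumption) / len(consumption)

# Life satisfaction evaluated at mean consumption.
log_of_mean = math.log(sum(consumption) / len(consumption))

print("mean of log(consumption):", mean_of_log)
print("log of mean consumption: ", log_of_mean)
```

The gap between the two numbers grows with the spread of consumption, so how much this matters depends on how unequal consumption is in the study samples.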
Moral vs prudential value
It might be helpful to distinguish between moral and prudential value since we could hold different views about the moral and prudential badness of death. For example, some version of TRIA strikes me as a very plausible account of the badness of my death for me (i.e. its prudential value) but less plausible as an account of the moral badness of my death (because it seems to depend on some form of person-affecting view and/or a preference-based theory of well-being). In the context of doing good, the moral value is what matters rather than the prudential value and we wouldn’t want decision-makers to erroneously give too much weight to some moral views because they mistake such views for their more plausible prudential counterparts. This isn’t really a criticism of the post (indeed, introducing this distinction might just have been distracting, confusing and unnecessary) but it could be important when it comes to putting this framework into practice.
Sensitivity analysis on TRIA representation
Caveats: (1) I’m only very vaguely familiar with the TRIA literature, so could be way off here. These comments are based on my initial thoughts and could be largely disconnected from or simply repeating the literature. (2) these concerns plausibly apply less to the badness of death of young children compared to adults, as most plausible TRIAs will be in relatively close agreement that young children aren’t strongly psychologically connected to their future adult selves. However, as I’ll elaborate later, I can see these concerns being relevant for evaluating the badness of death for infants too.
As you note, there are many possible ways of formalising TRIA. The method you’ve chosen is simple and intuitive and seems fine for a proof of concept. I have two concerns going forward though: (i) it seems plausible that the results could vary quite significantly on different versions of TRIA, (ii) it’s not obvious to me that your version is the best. Combined, these concerns suggest that the choice of representation matters (there are important differences between different representations) and that you may have chosen the wrong one.
I’ll start by motivating (ii). Is it standard to assume that we reach 'full psychological connectedness' at around age 10-21 and maintain that until death? Is that supposed to be a realistic or a simplifying assumption? It's clearly false for some people, e.g. with dementia, and intuitively, this seems to me to be probably false for most people. Memories fade, personalities change etc. over time and we gradually become more and more psychologically disconnected from our earlier selves. I would expect the peak TRIA discount factor to be less than 1 because even at the peak, I wouldn’t expect people to be fully psychologically connected to their entire future selves (so death isn’t as bad for them as on the deprivationist account). I can also imagine the peak coming later than 21 (but for sure, there should be a rapid increase between birth and age 10-21). I would also expect the discount to drop off again after the peak (but that’s probably not important here). I’m not trying to argue for a specific TRIA here, just to note that the TRIA used in your model doesn’t strike me as the most plausible version. Or at least, it’s unclear enough that it’s worth considering other representations in future work.
Now onto (i). In the most extreme case, one could have an eliminativist view about personal identity, according to which persons don’t persist over time and so in each moment, I am different to all the other mes at different times. Combined with TRIA, death isn’t bad for someone who dies on this view because no one that they are psychologically connected to loses well-being. In between this extreme and the view you propose, we might hold a view according to which periods of psychological connectedness are relatively short, lasting a few years or small number of decades. Each of these views could have widely varying results for the badness of death, so it might be important to run a sensitivity analysis on this.
Returning briefly to (ii), the third view I sketched above seems more intuitively plausible to me than the one you’ve used. But I haven’t thought about this much and don’t know what the rest of the literature looks like here.
Even though this more obviously applies to the badness of death for adults, it could still have consequences for the badness of death for infants. You’ve modelled TRIA by increasing the TRIA discount factor linearly from 0 at birth to 1 at age 15 (90% CI: 10-21). If the discount factor peaks below 1, then a similar linear increase would be shallower so the relevant discount factor would be lower. I expect that, if anything, this is probably an argument for a non-linear discount factor rather than a significantly lower one. In any case, I’d be interested in more of a sensitivity analysis here.
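A minimal sketch of the representation described above, to show how a lower peak changes the discount factor for a young child. The linear ramp from 0 at birth to the peak at age 15 follows the description in this comment; the function name, the `peak` parameter and the example peak of 0.8 are hypothetical.

```python
def tria_discount(age, full_age=15.0, peak=1.0):
    """Linear TRIA discount factor: 0 at birth, rising to `peak` at `full_age`,
    then held constant. `peak` below 1 is a hypothetical variant."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return peak * min(age / full_age, 1.0)


# Discount factor for a child aged 2.5 under the baseline and under a lower peak.
baseline = tria_discount(2.5)
lower_peak = tria_discount(2.5, peak=0.8)

print("baseline (peak 1.0):", baseline)
print("lower peak (0.8):   ", lower_peak)
```

A sensitivity analysis could vary `full_age` and `peak` together, or swap the linear ramp for a non-linear one, and report how much the value of saving a young child's life moves.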
Comparability of SWB measures across different income settings
Similar to Jack Malde’s comment, I think there are legitimate concerns about the comparability of SWB measures across different income settings due to the possibility of people using the scales differently in different contexts. I look forward to seeing your paper on this, Michael!
This might not be too significant a concern here – if the SWB measures are comparable across (e.g.) GiveDirectly recipients and (e.g.) children saved from malaria, then we’re all good. This could be much more important when trying to use SWB measures to make comparisons across more strongly differing contexts.
Location of results in guesstimate model
As mentioned above, I think the guesstimate model is really well presented. The only slight improvement in presentation (for me) would be to have the results at the top. It's good to repeatedly refresh guesstimate models to check how variable the results are over multiple runs and it's a little bit easier to do this if you don't have to scroll each time you refresh.