Thanks to the experts with whom we spoke while researching this report!
This was written by Stephen Clare and Aidan Goth.
- In order to help us prioritize Founders Pledge’s research and advising efforts across cause areas, we created a rough model that tries to compare an animal welfare intervention (The Humane League’s cage-free campaigns) to a global health intervention (AMF’s bednet distribution).
- In assessing the impact of THL’s intervention, we identified three main sources of uncertainty:
- How many animal years are affected by the intervention
- How much the intervention improves each animal’s subjective well-being
- How much an animal’s well-being matters compared to a human’s (the moral weight)
- The last factor is really tough to figure out. There are good reasons to think the weight might be quite high, and good reasons to think it might be very low. That means the range of our moral weight estimates spans multiple orders of magnitude.
- For this report, we made spreadsheet and Guesstimate models that compare The Humane League to the Against Malaria Foundation for a range of different assumptions about the above uncertain factors.
- Importantly, we assumed hedonism (sentient experience is all that matters morally), that chickens have moral status (their experience matters morally), and anti-speciesism (the value of an experience is independent of the species of animal that is experiencing it). Accordingly, this analysis does not offer an all-things-considered view on the relative goodness of THL and AMF – it assumes a particular worldview that is relatively favourable to THL.
- In this model, in most of the most plausible scenarios, THL appears better than AMF. The difference in cost-effectiveness is usually within 1 or 2 orders of magnitude. Under some sets of reasonable assumptions, AMF looks better than THL. Because we have so much uncertainty, one could reasonably believe that AMF is more cost-effective than THL or one could reasonably believe that THL is more cost-effective than AMF.
- In general, if you value human well-being >10,000 times more than chicken well-being, AMF looks better. If you value human well-being <300 times more than chicken well-being, THL looks better. But between these moral weights the ranking is less clear. We think there’s a good chance (at least 50%) that the moral weight falls between these bounds, where factors like THL’s effectiveness and the badness of battery cages are more important.
- It’s very likely that we’re missing key considerations that could change our estimates by orders of magnitude. For example, we haven’t tried to account for moral uncertainty, indirect effects of the interventions or longtermist considerations.
- Links to our models:
We have to decide how to allocate limited resources between the best human charities and the best animal charities. We can work out the trade-offs involved explicitly or make them implicitly, but either way we have to decide.
Theoretical Framework
Why this is hard
- Our epistemic status is weak. It’s hard to know what the relevant variables are, much less how they vary among different animals. We do not have access to the internal experience of other animal species. We have to try to approximate what it’s like based on observations about their behavior, reflecting on our own experience, and considering the different biological and philosophical factors that seem likely to shape consciousness.
- Animal-focused interventions are less well studied than human-focused interventions, so we know much less about how effective interventions to improve animal welfare are.
- So there are (at least) three sources of uncertainty:
- How many animal years are affected by the interventions
- How much do the interventions improve each animal’s subjective well-being
- How much does an animal’s well-being matter compared to a human’s?
Moral weight vs. moral status
- A moral weight suggests how much we should value the welfare of an animal in a given species relative to an animal from a different species. These are usually defined relative to humans, i.e. humans are given a moral weight of 1.
- Moral weights are different to an animal’s moral status, i.e. whether its welfare matters to us. Here we focus on moral weights, which relate more to a creature’s ‘capacity for welfare.’ For farm animals, Muehlhauser’s 2017 report, the most in-depth treatment of this question we found, assigns chickens an 80% chance of moral status
Candidates for key variables that determine moral weight
Some of the variables we came across in researching moral weights are:
- Clock speed of consciousness (suggested by Muehlhauser)
- Smaller animals have faster reaction times (e.g. imagine trying to swat a fly).
- To the extent those reactions are under conscious control, smaller animals would experience more subjective moments per unit of objective time
- When measuring well-being, it seems very likely we should care more about the subjective length of an experience than about its objective length
- This updates us towards valuing smaller animals more
- Experience intensity (Muehlhauser)
- More ‘intense’ experiences seem like they should matter more
- Human experiences seem like they’re probably more intense. But experiences of other species may be more intense - for example, it’s not clear whether our ‘linguistic thoughts’ make our experience more or less intense
- Unity of consciousness (Muehlhauser)
- Refers broadly to how various different conscious inputs are combined into a single experience. E.g. conscious states over time all belong to some persistent “self” (subject unity) and contents of conscious states are unified (representational unity and phenomenal unity)
- Debatable how much this influences moral weight, but subject and phenomenal unity relate to subjective experience which seems relevant to moral weight. Humans probably have more unity of consciousness than animals.
- Brain size/complexity (Tomasik)
- Brian Tomasik is very uncertain, but weakly suggests a view where moral weight scales non-linearly with brain size. As a rough approximation he suggests scaling by N^(2/5), where N is the number of neurons the animal has.
- However, some evidence suggests that the correlation between cognitive sophistication and neuron count is weak.
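To make the N^(2/5) suggestion concrete, here is a minimal Python sketch of what that scaling would imply for a chicken relative to a human. The neuron counts are rough, order-of-magnitude figures that we are assuming for illustration; they are not taken from the sources above, and the function name is hypothetical.

```python
# Rough neuron counts, assumed for illustration only (not from the post).
HUMAN_NEURONS = 86e9     # roughly 86 billion
CHICKEN_NEURONS = 2e8    # roughly 200 million


def relative_moral_weight(neurons, reference_neurons=HUMAN_NEURONS, exponent=2 / 5):
    """Moral weight relative to the reference species under N**exponent scaling."""
    return (neurons / reference_neurons) ** exponent


if __name__ == "__main__":
    sublinear = relative_moral_weight(CHICKEN_NEURONS)
    linear = relative_moral_weight(CHICKEN_NEURONS, exponent=1.0)
    print(f"N^(2/5) scaling: {sublinear:.3f} (about 1/{1 / sublinear:.0f})")
    print(f"linear scaling:  {linear:.4f} (about 1/{1 / linear:.0f})")
```

Under these assumed counts, the sub-linear exponent gives a much higher weight than linear scaling would, which is one reason the choice of scaling rule matters so much.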
We made a spreadsheet model that generates cost-effectiveness estimates for many different plausible values and noted where the key considerations lie. We only estimated a model to compare our most popular global health intervention (AMF) to our top animal welfare recommendation (The Humane League’s campaigns for aviaries instead of battery cages for egg-laying hens). Our model combines four quantities. The first three are multiplied together to give THL’s impact per dollar, which is then compared with the fourth:
- number of 'hen-years' spent in aviaries rather than battery cages (per dollar donated to THL)
- benefits of moving from a battery cage to an aviary
- moral weight of chickens
- how many DALYs are averted by $1 to AMF
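As a rough sketch of how these four quantities fit together, the comparison can be written as a ratio of THL's impact per dollar to AMF's impact per dollar. The function and parameter names below are hypothetical. The sketch also assumes that one unit of chicken well-being for one year, multiplied by the moral weight, can be set against one DALY, and that aviary badness is expressed as a fraction of battery cage badness.

```python
def thl_vs_amf_ratio(hen_years_per_dollar, cage_wellbeing, aviary_fraction,
                     moral_weight, amf_dalys_per_dollar):
    """How many times more cost-effective THL is than AMF (values above 1 favour THL).

    cage_wellbeing:  well-being of battery cage life, e.g. -10
    aviary_fraction: aviary badness as a fraction of cage badness, e.g. 0.5
    """
    # Well-being gained per hen-year moved from a battery cage to an aviary.
    benefit_per_hen_year = abs(cage_wellbeing) * (1 - aviary_fraction)
    thl_impact_per_dollar = hen_years_per_dollar * benefit_per_hen_year * moral_weight
    return thl_impact_per_dollar / amf_dalys_per_dollar


if __name__ == "__main__":
    # AMF figure: ~$1,700 per life saved and ~50 DALYs per life, as used in the post.
    amf = 50 / 1700
    ratio = thl_vs_amf_ratio(hen_years_per_dollar=10, cage_wellbeing=-10,
                             aviary_fraction=0.5, moral_weight=1 / 1000,
                             amf_dalys_per_dollar=amf)
    print(f"THL / AMF = {ratio:.2f}")
```

The example inputs are just one point from the ranges discussed in this post, not a best guess.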
Effectiveness of THL campaigns in changing corporate behaviour
The FP report on animal welfare (published 2018) estimates that THL moves 10 hen-years from battery cages to aviaries per dollar donated. This number assumes there is a 60% probability that companies honour their cage-free commitments and that THL’s advocacy brought these pledges forward by between half a year and one year on average. Simcikas (2019)’s estimates of corporate campaign effectiveness are higher, though not all corporate campaigns relate to battery cages and aviaries specifically. His upper bound is that 160 hen-years are affected per dollar, with a median estimate of 54. Similarly, Bollard (2016) suggests corporate campaigns spare at least 38 hen years from battery cages per dollar. On the other hand, Bollard (2019) documents that some companies have delayed or reneged on their pledges. While he remains optimistic, this indicates the need for ongoing campaigning to ensure pledges are fulfilled which would raise the expected cost.
We think the 2018 FP estimate of 10 hen-years/$ is likely a slight underestimate. Across the different tabs on the spreadsheet, we model four scenarios: 1, 10, 30 and 100 hen-years affected per dollar.
Benefits of moving from battery cages to aviaries
We measure the benefits of moving from battery cages to aviaries by estimating how bad battery cages are and then estimating how bad aviaries are as a proportion of the badness of battery cages.
Battery cages are plausibly extremely bad. Hens are kept in tiny spaces, live on wire racks, are unable to move around, and are kept from engaging in natural behaviors such as rooting, preening, and socializing. Pages 20 through 23 of our report have more detail. In aviaries, birds are still kept in quite cramped conditions. However, birds have up to 80% more space, access to litter and perches, can move around, and can engage in more of their preferred behaviors. However, there is some evidence that the rate of hen mortality is higher in aviaries. Due to this, the FP report estimates there is a 5-10% chance that aviaries are worse than battery cages. OpenPhil’s “current – though uncertain – best guess is that even without additional reforms, the U.S. transition to cage-free housing systems will on net reduce hen suffering once mortality rates have stabilized.”
There have been a couple other attempts at quantifying this welfare change. Charity Entrepreneurship’s weighted animal welfare index gives battery cages a score of -57 out of -100. As of April 2020, this is the worst score on their scale. In Compassion, by the Pound, F. Bailey Norwood gives caged egg-laying hens a welfare score of -8 (again, the worst score on his scale) and cage-free egg-laying hens a score of +2 (see Table 8.2 on p. 229).
We think that life in a battery cage is very likely to have a negative value - i.e. we conceptualize their badness as a negative multiplier x of 1 unit of healthy time. Plausible estimates of x could vary across several orders of magnitude. Battery cages could be only mildly unpleasant, in which case x would be close to 0. Battery cages could also be truly horrific, with hens spending their entire lives in extreme distress and pain. In that case we think x could be -100 or -1000, meaning 100 to 1000 weeks of healthy life would be morally cancelled out by 1 week in a cage.
(The existence of a unit of “healthy life” as an upper bound to human welfare is common in analyses based on Quality and/or Disability Adjusted Life Years. However, we recognise that there are varying levels of “healthy life”, some of which are better than others, and that well-being might not be bounded in this way. Defining a unit of positive well-being is important to our model because the negative units of well-being are defined in terms of trade-offs with positive units of well-being. We suggest interpreting “healthy life” for a chicken as living with all needs met, no or minimal fear of predation and disease-free (e.g. perhaps the best moments on a very good farm animal sanctuary) and defining one unit of well-being as one instant of healthy life, understood in this way. We’re uncertain about whether this is the best way to formulate our model).
Our median estimates, made independently, were between -10 and -30, but the bounds of our spreadsheet model are -0.1 to -1000. Since battery cages could plausibly be extremely bad (e.g. well-being level of around -1000), we think that the expected well-being of life in a battery cage is lower than the median.
The spreadsheet model is not intended to have a probabilistic interpretation, so these bounds do not represent a specific confidence interval. In the first instance, the spreadsheet model shows how much better or worse donating to THL is than AMF given various assumptions without committing to a judgement about how likely it is that those assumptions are true. In particular, the model can help to identify points at which small changes in assumptions change which charity looks more cost-effective. This can sometimes be sufficient for making decisions.
While we are concerned by the data showing increased mortality rates in aviaries, we do not model the scenario in which aviaries are worse for hens than battery cages in our non-probabilistic model because the cost-effectiveness would, of course, be infinitely worse than AMF. In our probabilistic model, we use a probability density function (pdf) that puts some weight (5-10%) on the possibility that aviaries are worse than battery cages.
Moral weight: how much does a hen’s suffering matter compared to a human’s?
These estimates assume hedonism (all that matters morally is conscious experience of pleasure and suffering), that chickens have moral status (their experience matters morally), and anti-speciesism (the moral value of an experience depends only on the quality of the experience, not on the species of animal who experiences it).
Not much research has been done on moral weights and most people are reluctant to give even speculative estimates, so we rely pretty heavily on Luke Muehlhauser’s work (while recognizing its limitations). Extreme uncertainty entails extremely wide bounds, especially because some considerations push in different directions. Some factors suggest animals have more weight while others suggest the opposite. This is why we consider what would happen throughout the range of plausible estimates.
Our model has moral weights ranging from 1/1,000,000 to 1 (i.e. from 1 human = 1 million chickens to 1 human = 1 chicken). We think it’s very likely that if animals have moral status, their moral weight is not vanishingly small (say, less than 1 in 1 million).
Benefits of AMF
We use GiveWell’s updated estimate of AMF’s cost-effectiveness as a point estimate, i.e. ~$1,700 per outcome as good as saving the life of a child under 5. We assume that saving a life is worth about 50 DALYs. These estimates have a margin of error, but since they’re unlikely to be wrong by an order of magnitude that shouldn’t affect our findings too much.
In our spreadsheet model, the numbers in the cells show how many orders of magnitude better THL is than AMF. So if the number is black (positive), THL is better. If the number is white (negative), AMF is better.
The columns are moral weight values, and show how chicken experience is valued relative to human experience. The lower bound is 1/1,000,000, i.e. 1 million chickens to 1 human; the upper bound is 1.
The rows measure how good moving from a battery cage to an aviary is for a chicken. This has two layers. First, there’s a range for the badness of battery cages, from -0.1 (close to indifference between life and death), to -1,000 (extreme torture, 1 day of battery cage life outweighs 1000 days of healthy life). Second, there’s a range for how bad aviaries are relative to battery cages, ranging from 0.7 (70% as bad as battery cages) to -0.3 (a life worth living, 30% as good as battery cages are bad).
In the different tabs, we replicate this spreadsheet for different estimates of THL’s effectiveness.
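For readers who want to reproduce the shape of such a grid, here is a minimal Python sketch. It is not the spreadsheet itself: the row and column values are a coarse selection from the ranges described above, the names are hypothetical, and it reuses the AMF point estimate of roughly 50 DALYs per $1,700. Cells show log10 of the THL/AMF ratio, so positive numbers favour THL.

```python
import math

AMF_DALYS_PER_DOLLAR = 50 / 1700   # ~50 DALYs per ~$1,700, as in the post

# Coarse grids chosen for illustration; the spreadsheet uses finer steps.
MORAL_WEIGHTS = [10.0 ** -k for k in range(6, -1, -1)]   # 1e-6 up to 1
CAGE_WELLBEING = [-0.1, -1, -10, -100, -1000]


def orders_of_magnitude(hen_years_per_dollar, cage_wellbeing, aviary_fraction, moral_weight):
    """log10 of the THL/AMF cost-effectiveness ratio.

    Requires aviary_fraction < 1 (aviaries better than cages); otherwise the
    ratio is not positive and the logarithm is undefined.
    """
    benefit = abs(cage_wellbeing) * (1 - aviary_fraction)
    ratio = hen_years_per_dollar * benefit * moral_weight / AMF_DALYS_PER_DOLLAR
    return math.log10(ratio)


def build_grid(hen_years_per_dollar=10, aviary_fraction=0.7):
    """Rows are cage well-being levels, columns are moral weights."""
    return [[orders_of_magnitude(hen_years_per_dollar, cage, aviary_fraction, w)
             for w in MORAL_WEIGHTS]
            for cage in CAGE_WELLBEING]


if __name__ == "__main__":
    grid = build_grid()
    print("cage / weight " + " ".join(f"{w:>8.0e}" for w in MORAL_WEIGHTS))
    for cage, row in zip(CAGE_WELLBEING, grid):
        print(f"{cage:>13} " + " ".join(f"{v:>8.1f}" for v in row))
```

Calling `build_grid` with 1, 10, 30 or 100 hen-years per dollar mirrors the idea of one tab per effectiveness scenario.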
We also translated these inputs to Guesstimate to get an expected value. We describe the Guesstimate model at the end of this section.
Where are the inflection points?
- If you think the moral weight of chickens is less than 1/10,000, then AMF is better than THL unless battery cages are extremely bad
- If you think the moral weight of chickens is more than 1/100, then THL is better than AMF unless THL is very ineffective and battery cages aren’t that bad
- If you think battery cages are very bad (-100 or worse), then THL is better than AMF unless chickens have a very low moral weight (<1/10,000) or aviaries are as bad or worse than battery cages
- If the FP estimate of THL is a little bit pessimistic and THL’s effectiveness is closer to Saulius’ estimates, then THL is usually better than AMF unless battery cages are not that bad (>-1) or chickens have an extremely low moral weight (<1/100,000)
- It rarely matters how much better aviaries are than battery cages, assuming aviaries are at least 30% better than battery cages
For each scenario, we have graphed the line along which AMF and THL are equally cost-effective given various assumptions. Above this line, our model suggests a donation to AMF is better; below the line, THL is better. The y-axis is the inverse of the moral weight, i.e. the number of chickens equal to one human, and the x-axis shows the (negative) momentary well-being of life in a battery cage. We plotted graphs separately for different assumptions about how many hen-years THL affects per dollar, with two graphs per scenario. One plot includes values of battery cage badness from -1 to -1000 with logarithmic axes, and the other zooms in on 0 to -30 with linear axes. The different lines represent different assumptions about how bad life in an aviary is compared to life in a battery cage.
Here we include the plots for the scenario in which 30 hen-years are affected per dollar donated to THL. We have plots for other scenarios of THL’s effectiveness in this ibb album.
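The break-even line in these plots can also be computed directly. The sketch below is a minimal illustration rather than the code behind the graphs: it solves for the number of chickens per human at which THL and AMF come out equal, using the same simple ratio structure and the AMF point estimate used elsewhere in this post. The function name is hypothetical.

```python
AMF_DALYS_PER_DOLLAR = 50 / 1700   # ~50 DALYs per ~$1,700, as in the post


def break_even_chickens_per_human(hen_years_per_dollar, cage_wellbeing, aviary_fraction):
    """Inverse moral weight at which THL and AMF are equally cost-effective.

    If you think one human counts for more chickens than this, AMF looks
    better; if fewer, THL looks better. Requires aviary_fraction < 1.
    """
    benefit = abs(cage_wellbeing) * (1 - aviary_fraction)
    break_even_weight = AMF_DALYS_PER_DOLLAR / (hen_years_per_dollar * benefit)
    return 1 / break_even_weight


if __name__ == "__main__":
    for cage in (-1, -10, -30, -100, -1000):
        n = break_even_chickens_per_human(30, cage, aviary_fraction=0.5)
        print(f"cage well-being {cage:>6}: break-even at 1 human = {n:,.0f} chickens")
```

With 30 hen-years per dollar, battery cages at -10 and aviaries half as bad, the break-even point is 1 human = 5,100 chickens, which sits inside the 300 to 10,000 band where the ranking is unclear.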
What do our median and expected value estimates suggest?
- These estimates are speculative and not stable
- A combination of median estimates is not the same as the overall best guess, so combine estimates with care
- There are potentially important factors for which our model doesn’t account (e.g. rich meat eaters, non-hedonistic considerations, variable moral status by species, effects of corporate campaigns on the number of chickens that exist)
Our low-confidence median parameter estimates:
- THL affects 30 hen-years per $
- Life in battery cages is worth around -10 to -30 units
- Life in aviaries is probably a bit better than battery cages, but with lots of uncertainty. When we translate our credence to Guesstimate, the model gives an expected value of aviaries being ~50% as bad as battery cages
- Human experience is about 300-500 times more valuable than chicken experience. Our distribution would place >10% probability on each order of magnitude between 10x and 10,000x, with tails stretching out to 1M and 1/10 (i.e. chicken experience worth more than humans, due to clock speed and experience intensity)
A Guesstimate model with these estimates suggests that:
- In expectation, THL is >100x better than AMF
- In the median scenario, THL is about 2-4x more cost-effective than AMF
- There is a 71% chance that THL is more cost-effective than AMF (calculations here)
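The gap between the expected value and the median comes from the heavy right tail of the inputs. The following Python sketch is not our Guesstimate model and will not reproduce its numbers: every distribution and parameter in it is an assumption, picked only so that the medians loosely resemble the estimates listed above. It shows how a mean, a median and a probability that THL beats AMF can all be read off the same set of samples.

```python
import math
import random
import statistics

AMF_DALYS_PER_DOLLAR = 50 / 1700   # point estimate, as in the post


def simulate(n=50_000, seed=0):
    """Draw THL/AMF ratios from assumed (hypothetical) input distributions."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n):
        # All shapes and spreads below are illustrative assumptions.
        hen_years = rng.lognormvariate(math.log(30), 0.8)         # median 30 per $
        cage_badness = rng.lognormvariate(math.log(15), 1.5)      # median 15 units
        aviary_fraction = rng.gauss(0.5, 0.35)                    # > 1 means aviaries are worse
        moral_weight = rng.lognormvariate(math.log(1 / 400), 2.0) # median 1/400
        thl = hen_years * cage_badness * (1 - aviary_fraction) * moral_weight
        ratios.append(thl / AMF_DALYS_PER_DOLLAR)
    return ratios


def summarize(ratios):
    """Mean, median, and share of samples in which THL beats AMF."""
    return {
        "mean": statistics.fmean(ratios),
        "median": statistics.median(ratios),
        "p_thl_better": sum(r > 1 for r in ratios) / len(ratios),
    }


if __name__ == "__main__":
    for name, value in summarize(simulate()).items():
        print(f"{name}: {value:.3g}")
```

Because a few samples with very bad cages and high moral weights dominate the average, the mean lands far above the median, and changing the seed or the tail widths moves the mean much more than it moves the median.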
The expected moral weight in our guesstimate model is about 0.03 (chicken experience is ~30x less valuable than human experience), which might seem very high. However, note that (1) we assume moral status and (2) if one thinks there is some probability that the moral weight is one, then there is a lower bound to the expected moral weight. If, as we do here, one assumes hedonism and that chickens have moral status, then we think that it is difficult to rule out the chance that humans and chickens have equal moral weights. As a result, we would expect a relatively high moral weight in expectation. There may be other reasons for caring about human experience more than chicken experience such that an all things considered view would be less favourable to chickens. We have not taken such considerations into account in this analysis.
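The lower-bound argument can be checked with a two-outcome sketch. The probability and the fallback weight below are hypothetical numbers chosen for illustration, not our credences.

```python
def expected_moral_weight(p_equal, weight_otherwise):
    """Expected weight if chickens equal humans with probability p_equal
    and otherwise have the (smaller) weight weight_otherwise."""
    return p_equal * 1.0 + (1 - p_equal) * weight_otherwise


if __name__ == "__main__":
    # Hypothetical: 2% chance of equal weight, otherwise 1/1000.
    p, w = 0.02, 1 / 1000
    print(f"expected weight = {expected_moral_weight(p, w):.5f} (never below p = {p})")
```

However small the weight is in the other case, the expectation cannot fall below the probability placed on a weight of one.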
We should note some other limitations of the Guesstimate model:
- It does not seem stable (e.g. the numbers change if you refresh the page and get a new sample)
- The bounds and distribution you choose for the badness of battery cages and the moral weight have big effects
- Specifying probability distributions that accurately represent our credences for many of these variables is really hard
- Our pdf for the moral weight is neither lognormal nor normal and we weren’t sure how best to represent it, so we’ve had to fudge the bounds a bit to get a reasonable approximation
- Our pdf for how bad aviaries are has some weight that they’re worse than battery cages, but also some weight that they’re much better. We’ve fudged the bounds to get an ~8% chance aviaries are worse than battery cages
- We’re not sure if we should be calculating the ratio of the expected cost-effectiveness of each charity, or the expected ratio of the cost-effectiveness of each charity. But this doesn’t seem to matter much
- Nevertheless, we’ve played around with the bounds and distributions for key parameters and under most reasonable assumptions, THL is expected to be >100 times better than AMF
- Although according to our model THL is much more cost-effective than AMF in expectation, the probability that THL is more cost-effective than AMF is relatively low, at 0.71
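On the question of a ratio of expectations versus an expectation of ratios, a small sketch with made-up numbers shows when the two can differ. If AMF's cost-effectiveness is treated as a fixed point estimate, they coincide exactly; they only come apart when the denominator is itself uncertain. The sample values below are toy inputs, not model outputs.

```python
import statistics


def expectation_of_ratio(thl_samples, amf_samples):
    """Mean of the per-sample THL/AMF ratios."""
    return statistics.fmean([t / a for t, a in zip(thl_samples, amf_samples)])


def ratio_of_expectations(thl_samples, amf_samples):
    """Mean THL cost-effectiveness divided by mean AMF cost-effectiveness."""
    return statistics.fmean(thl_samples) / statistics.fmean(amf_samples)


if __name__ == "__main__":
    thl = [1.0, 2.0, 6.0]
    fixed_amf = [2.0, 2.0, 2.0]   # point estimate: both measures agree
    print(expectation_of_ratio(thl, fixed_amf), ratio_of_expectations(thl, fixed_amf))

    flat_thl = [2.0, 2.0, 2.0, 2.0]
    noisy_amf = [1.0, 1.0, 4.0, 4.0]   # uncertain denominator: they diverge
    print(expectation_of_ratio(flat_thl, noisy_amf), ratio_of_expectations(flat_thl, noisy_amf))
```

When the uncertainty about AMF is small relative to the uncertainty about THL, as we assume here, the two measures stay close.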
We think THL is more cost-effective in expectation than AMF given certain reasonable assumptions, but due to high uncertainty we don’t think that our models offer strong evidence for this claim in general.
Given our current (lack of) understanding of animal sentience and suffering, one could reasonably believe that AMF is more cost-effective than THL or one could reasonably believe that THL is more cost-effective than AMF, even given our THL-friendly assumptions.
Would future work be useful?
We haven’t thought about:
- The effect of AMF on animals - i.e. the Meat Eater Problem.
- How to introduce moral uncertainty into the model. We assume hedonism, but there are other moral views on which we have some probability that would produce very different estimates (e.g. deny chickens moral status).
It seems unlikely that we’ll learn about more variables that contribute to moral weights without scientific advances in our ability to understand consciousness. However, a better understanding of how bad aviaries are relative to battery cages could be valuable.
Despite the significant limitations of this work, we still think it's important to try and make cross-cause comparisons. We'd welcome any feedback on how to interpret the results or improve our approach to shed more light on this question!
Sorry for the late comment, but I was wondering:
Why do you think it's an underestimate?
One major thing that I think you missed in this analysis is the impact of the song "Don't You Want Me?". Arguably massive.
Something I've been wondering about lately is the supply and demand effects of cage-free reforms. If cage-free costs more, then presumably production should decrease, and fewer hens will be raised for eggs. Maybe this makes up for some of the concerns about higher mortalities in cage-free systems?
I agree that this seems important. It also makes me worry about the equilibrium effects. If producer A switches to a more expensive system and producer B doesn't, then I wonder how many consumers just end up buying more cheap eggs from B.
Commitments are usually made by grocers, restaurants, hotels, etc., not producers. You can see in this document by USDA that at least in the U.S., most important companies that made commitments are retailers, followed by restaurants. I think it's somewhat unlikely that many people will go to another grocer just to save a little bit of money on eggs. Similarly, I don't think that it will impact people's choice of restaurants much because egg prices probably won't influence meal prices that much. Also, some animal advocates believe that eventually all the production in some countries/regions like the U.S. will be cage-free because egg producers won't want to invest in new caged facilities when there is a risk that further corporate campaigns or law changes will take away the few remaining customers that buy caged eggs.
In Compassion, by the Pound, Norwood and Lusk estimate that a transition from cage to cage-free eggs would increase prices 21% which would decrease consumption 4% (350-351). But cage-free eggs also require more chickens. Norwood and Lusk estimate that one cage egg requires 0.003212204 chickens and one cage-free egg requires 0.003229267 chickens (233).
There is actually a typo in table 8.4 on page 233, on which you are basing this. If you read the text closely, you can see that the value for "Number of non breeder animals associated with one cage egg" should be 1/509 = 0.001964637, not 1/314 = 0.003184713. The book does not make the same mistake in a very similar table 8.7.
However, in my opinion, what matters more is how many chicken-years are required per egg. And since cage-free hens seem to live shorter lives, the difference in chicken-years required per egg is not as big as chickens required per egg.
But according to a person with more knowledge, the bigger problem is that the book is comparing industrial cage systems with small-scale cage-free systems that are not using optimal genetics. That is not the relevant comparison for the current situation where large-scale producers are switching to cage-free systems. Numbers that the book uses differ quite a lot from numbers in other sources that are discussing industrial systems.
I have spent two or three weeks looking into these issues and have quite a neat document about it that I decided not to publish. If somebody thinks that the information in the document could be action-relevant to them, you can email me at firstname.lastname@example.org and I will send you the document.
Awesome, thanks! Looks like the difference in number of chickens required per egg is basically dominated by the 4% change in demand, working out to about 3.5% fewer chickens. It seems plausible to me that the roughly 3.5% fewer chickens raised might even dominate the changes in average welfare, assuming their lives are very bad either way.
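For anyone who wants to check the 3.5% figure, here is the arithmetic as a short Python sketch. It uses the per-egg chicken numbers and the 4% consumption drop quoted earlier in this thread, before the correction to the table is applied; the function name is made up.

```python
# Figures as quoted in this thread from Compassion, by the Pound.
CHICKENS_PER_CAGE_EGG = 0.003212204
CHICKENS_PER_CAGE_FREE_EGG = 0.003229267
CONSUMPTION_CHANGE = -0.04   # 4% fewer eggs consumed after the price increase


def relative_change_in_chickens(cage=CHICKENS_PER_CAGE_EGG,
                                cage_free=CHICKENS_PER_CAGE_FREE_EGG,
                                consumption_change=CONSUMPTION_CHANGE):
    """Fractional change in chickens needed after a full switch to cage-free."""
    more_chickens_per_egg = cage_free / cage    # about 0.5% more per egg
    fewer_eggs = 1 + consumption_change         # 4% fewer eggs
    return more_chickens_per_egg * fewer_eggs - 1


if __name__ == "__main__":
    print(f"change in number of chickens: {relative_change_in_chickens():.2%}")
```

The roughly 0.5% increase in chickens per egg is small next to the 4% drop in eggs, so the net change is close to -3.5%.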
There are also recent analyses of ballot initiatives in California, both ex ante and ex post that might tell us about this, too, e.g.: http://www.zachgroff.com/2017/11/animal-welfare-reforms-are-looking.html?m=1
This doesn't change the bottom line much, but for the technical correctness sake, I feel it should be noted that not all cage-free commitments that THL wins shift hens from battery cages to aviaries. For example, some THL-funded campaigns are in the EU or UK where battery cages have been banned since 2012. You can see in this graph that in the EU, caged hens are in enriched cages, not battery cages. Enriched cages are better than battery cages. Charity Entrepreneurship’s weighted animal welfare index gives battery cages a score of -57 and enriched cages a score of -46. That said, looking at THL’s 2020 room for more funding report, it seems that a lot of the cage-free focus will be on countries where battery cages (rather than enriched cages) are used.
Less importantly, aviary is not the only cage-free system that producers may switch to after converting. E.g., some producers may switch to barn, free-range, or organic systems.
Thanks for this. I think this stems from the same issue as your nitpick about AMF bringing about outcomes as good as saving lives of children under 5. The Founders Pledge Animal Welfare Report estimates that THL historically brought about outcomes as good as moving 10 hen-years from battery cages to aviaries per dollar, so we took this as our starting point and that's why this is framed in terms of moving hens from battery cages to aviaries. We should have been clearer about this though, to avoid suggesting that the only outcomes of THL are shifts from battery cages to aviaries.
Note that (unless I missed something) your animal welfare report commits this same minor mistake of assuming that all hens used by companies that made cage-free commitments were in battery cages. While I think that's true for the majority of hens, some of them were already in cage-free systems, and some were in enriched cages. But this is more than outweighed by some very conservative assumptions. E.g., that THL's work only moved policies forward by 1 year or something like that. So it's no big deal :)
What is way more important is all the indirect effects and other factors that I list in the "Ways this estimate could be misleading" section of my corporate campaigns CEA here. I think that they might be more important than direct effects. The same could also be true about AMF.
Thanks Stephen and Aidan for this great report! These sorts of questions are super difficult but plausibly quite important. I appreciate how transparent you are about your uncertainty. Rethink Priorities has been doing some work on moral weight that will point to ways to hopefully reduce some key uncertainties. Stay tuned in the next few weeks as we begin to release our reports!
Thanks Jason! Looking forward to reading the new research.
Thanks for doing this! Though it seems like you kinda buried the lede. Why isn't this in the top level summary?
Thanks for raising this. It's a fair question but I think I disagree that the numbers you quote should be in the top level summary.
I'm wary of overemphasising precise numbers. We're really uncertain about many parts of this question and we arrived at these numbers by making many strong assumptions, so these numbers don't represent our all-things-considered-view and it might be misleading to state them without a lot of context. In particular, the numbers you quote came from the Guesstimate model, which isn't where the bulk of the work on this project was focused (though we could have acknowledged that more). To my mind, the upshot of this investigation is better described by this bullet in the summary than by the numbers you quote:
Thanks! I appreciate your wariness of overemphasizing precise numbers and I agree that it is important to hedge your estimates in this way.
However, none of the claims in the bullet you cite give us any indication of the expected value of each intervention. For two interventions A and B, all of the following is consistent with the expected value of A being astronomically higher than the expected value of B:
Extremely little information is communicated about the relative expected value of A and B by the above points, and what information is communicated misleadingly suggests that both interventions are quite close in expected value. Because EAs are concerned with the expected value of interventions, I think you ought to communicate more about the relative expected value of the interventions and frame your summary of the interventions in a way that is less likely to mislead people about the relative expected value of each intervention.
I think the ideally informative way to both communicate the relative expected value of the interventions and hedge on your model uncertainty in the summary is to (1) provide your expected value estimate, (2) explain that you have high model uncertainty and one could arrive at a different expected value estimate with different assumptions, and (3) invite participants to adjust the Guesstimate and generate their own predictions.
Thanks, this is a good criticism. I think I agree with the main thrust of your comment but in a bit of a roundabout way.
I agree that focusing on expected value is important and that ideally we should communicate how arguments and results affect expected values. I think it's helpful to distinguish between (1) expected value estimates that our models output and (2) the overall expected value of an action/intervention, which is informed by our models and arguments etc. The guesstimate model is so speculative that it doesn't actually do that much work in my overall expected value, so I don't want to overemphasise it. Perhaps we under-emphasised it though.
The non-probabilistic model is also speculative of course, but I think this offers stronger evidence about the relative cost-effectiveness than the output of the guesstimate model. It doesn't offer a precise number in the same way that the guesstimate model does but the guesstimate model only does that by making arbitrary distributional assumptions, so I don't think it adds much information. I think that the non-probabilistic model offers evidence of greater cost-effectiveness of THL relative to AMF (given hedonism, anti-speciesism) because THL tends to come out better and sometimes comes out much, much better. I also think this isn't super strong evidence but that you're right that our summary is overly agnostic, in light of this.
In case it's helpful, here's a possible explanation for why we communicated the findings in this way. We actually came into this project expecting THL to be much more cost-effective, given a wide range of assumptions about the parameters of our model (and assuming hedonism, anti-speciesism) and we were surprised to see that AMF could plausibly be more cost-effective. So for me, this project gave an update slightly in favour of AMF in terms of expected cost-effectiveness (though I was probably previously overconfident in THL). For many priors, this project should update the other way and for even more priors, this project should leave you expecting THL to be more cost-effective. I expect we were a bit torn in communicating how we updated and what the project showed and didn't have the time to think this through and write this down explicitly, given other projects competing for our time and energy. It's been helpful to clarify a few things through this discussion though :)
This is another nitpick but I just want to say this to prevent slightly incorrect information from spreading.
When you look at GiveWell’s latest estimates, you can see that the cost per outcome as good as averting the death of a child under 5 is ~$1,700. It costs around $3,710 to avert a death of a child under 5.
Nice catch, thanks for the careful read Saulius. I think this is especially important because it means that moral weight considerations creep into our measure of AMF's cost-efficiency even before we try to compare them to THL. GW currently assigns the same value to averting under-5 and age 5+ deaths (100 units), so that's convenient. I'd guess the "Cost per outcome as good as" cell also factors in other benefits from reduced morbidity?
I don’t fully understand GiveWell’s spreadsheet myself but I’ll try to answer. By default, the "Cost per outcome as good as" cell seems to factor in averting under-5 deaths (46% of the total benefit), averting age 5+ deaths (27%) and development effects (28%).
Developmental effects here seem to refer to the fact that reducing the burden of malaria may have a lasting impact on children's development, and thus on their ability to be productive and successful throughout life.
In the ‘results’ tab, you see that by default, the estimation doesn’t include additional adjustments. If you change that, then the estimate takes into account the effects listed in the “Inclusion/Exclusion” sheet (see below)
It also seems to take something else into account, but I haven’t figured out what. In the end, including additional adjustments changes "Cost per outcome as good as" very modestly, from $1,690 to $1,678.
Note that according to the WHO, in 2018 there were 228 million cases of malaria worldwide, resulting in an estimated 405,000 deaths. So for every lethal case, there were roughly 228 million / 405,000 ≈ 563 cases in total, i.e. about 562 non-lethal cases. AMF's founder said that bednets prevent these non-lethal cases as well. I don’t know how much suffering an average case of malaria causes, but the combined effect is probably significant, especially when we take into account some of the complications that sometimes arise from malaria.
For example, GiveWell claims that “It is also believed that malaria can cause permanent disability (hearing impairment, visual impairment, epilepsy, etc.)”. An old Giving What We Can report says “our model suggests that the distribution of long-lasting insecticide treated bednets averts one case of epilepsy for about $25,000.” Note that it is not only difficult to live with epilepsy, but it’s also difficult and stressful to raise a child that has epilepsy (see this video).
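As a quick sanity check on the WHO figures quoted above, here is a minimal sketch of the case-to-death arithmetic. The variable names are my own; the only inputs are the two numbers already cited (228 million cases, 405,000 deaths in 2018).

```python
# WHO figures for 2018, as quoted in the comment above.
total_cases = 228_000_000
deaths = 405_000

# Total malaria cases per death, and non-lethal cases per death.
cases_per_death = total_cases / deaths
non_lethal_per_death = (total_cases - deaths) / deaths

print(f"cases per death: {cases_per_death:.0f}")
print(f"non-lethal cases per death: {non_lethal_per_death:.0f}")
```

This says nothing about how bad an average non-lethal case is; it only shows how many of them there are per death.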
How much to weigh these effects and effects of other diseases AMF may prevent (e.g. dengue, yellow fever, zika, encephalitis) depends on the subjective trade-off between preventing deaths and preventing suffering. I feel that my personal trade-off would give much more relative weight to the suffering than GiveWell does, although I’m sure that GiveWell has solid reasons for making their estimates in the way that they did.
Finally, GiveWell’s estimate doesn’t seem to take into account many other effects. E.g.:
Note that I'm not at all an expert on any of these problems so don't put too much weight on what I say.
Good points. :) That post of mine isn't really about the mosquitoes themselves but more about the impacts that a larger human population would have on invertebrates (assuming AMF does increase the size of the human population, which is a question I also mention briefly).
Note that taking expectations over your uncertainty about chicken moral weight relative to humans runs into the two envelopes problem:
If instead you had fixed chicken moral weight and used a variable for human moral weight relative to chicken moral weight and multiplied AMF's value into cDALYs, you'd get different results. In particular, since chickens may have 0 moral weight on the (imo unlikely) possibility that they are not conscious at all, you would get division by 0 and positive infinity as the expected value of the ratio of cost-effectiveness of AMF relative to THL. And of course this wouldn't capture how we should think about this.
I don't think it makes sense to fix either chicken or human moral weight in the denominator and use a multiplier for the other and take expectations across. While it might feel like we have direct acquaintance with our own moral weight, this is individual-relative, not absolute.
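To make the two envelopes worry concrete, here is a toy sketch with made-up numbers. The two-point distribution over chicken moral weight and the constant `RATIO_AT_EQUAL_WEIGHT` are purely hypothetical and are not taken from the report's models. The point is only that taking expectations of the THL/AMF ratio and of the AMF/THL ratio can each make a different charity look better.

```python
# Hypothetical two-point distribution over chicken moral weight (human = 1).
weights = [0.001, 0.1]
probs = [0.5, 0.5]

# Hypothetical constant: the THL/AMF cost-effectiveness ratio we would get
# if a chicken's moral weight were exactly equal to a human's.
RATIO_AT_EQUAL_WEIGHT = 100.0


def expectation(values, probabilities):
    """Expected value of a discrete random variable."""
    return sum(v * p for v, p in zip(values, probabilities))


# Fix human weight as the unit and let chicken weight vary.
thl_over_amf = [RATIO_AT_EQUAL_WEIGHT * w for w in weights]
# Same scenarios, but expressed with THL's value as the unit instead.
amf_over_thl = [1.0 / r for r in thl_over_amf]

e_thl_over_amf = expectation(thl_over_amf, probs)
e_amf_over_thl = expectation(amf_over_thl, probs)

print(f"E[THL/AMF] = {e_thl_over_amf:.2f}")
print(f"E[AMF/THL] = {e_amf_over_thl:.2f}")
```

With these toy inputs both expectations come out above 1, so each framing "shows" that its numerator is the better bet. If any scenario had a chicken weight of exactly 0, the second list would involve division by zero, which is the infinite-expectation case described above.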
The sensitivity analysis and distribution of EV ratios on different fixed moral weights are still fine to present and very useful, though!
A bit of a nitpick, but "reasonable assumptions" may not be reasonable to others, so what's considered reasonable is really personal. You could say assumptions close to what you think are representative of EAs.
Personally, I found your assumptions about tradeoffs between chickens and humans reasonable, but I also don't find that death in itself (or the loss of pleasure to the individual because of it) is bad. Others might have opposite intuitions.
Thanks for this post!
The current cost-effectiveness of AMF is about $4,500 per life saved. This implies, according to the Guesstimate model, that the median of the ratio between THL's and AMF's cost-effectiveness is 20, and its mean is larger than 1000.
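A mean that is 50 or more times the median suggests a very heavy right tail. As a rough illustration only, the sketch below assumes the ratio is lognormal, which is my assumption and not something the Guesstimate model is known to use, and takes the mean to be exactly 1000 (the comment above only gives this as a lower bound). Under those assumptions it backs out the implied spread and the chance that the ratio falls below 1.

```python
import math
from statistics import NormalDist

# Figures from the comment above; the mean is treated as exactly 1000
# for illustration, although it was only stated to be larger than 1000.
median_ratio = 20.0
mean_ratio = 1000.0

# For a lognormal: median = exp(mu), mean = exp(mu + sigma**2 / 2).
mu = math.log(median_ratio)
sigma = math.sqrt(2.0 * math.log(mean_ratio / median_ratio))

# Probability that the THL/AMF ratio is below 1 (AMF more cost-effective),
# i.e. that log(ratio) < 0 under the assumed normal for log(ratio).
p_amf_better = NormalDist(mu, sigma).cdf(0.0)

print(f"implied sigma of log(ratio): {sigma:.2f}")
print(f"P(ratio < 1) under the lognormal assumption: {p_amf_better:.2f}")
```

A different distributional shape with the same median and mean could give a quite different probability, so this is best read as a check on how much the tail drives the mean rather than as an estimate.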
It's not clear to me that DALYs or QALYs track hedonistic welfare that well. Although life satisfaction isn't hedonistic either, QALYs give relatively less weight to mental pain (anxiety and depression) and ability to perform usual activities compared to life satisfaction. Michael Plant also argues in favour of using life satisfaction over QALYs in that post. DALYs are calculated slightly differently, based on the judgements of experts rather than something closer to a random sample from the general population, but those experts may also have no personal experience living with the conditions.
I think your sensitivity analysis might be broad enough for this to not matter, though, since from that link, the difference seems to be at most a factor of about 3.
On the other hand, there's the question of whether the kinds of tradeoffs people make between pleasure and suffering, or between different levels of suffering or pleasure for different durations, actually track hedonistic value. Often there are too many different factors involved to isolate the hedonistic value (and when they try to, like with the experience machine or wireheading, many people seem to reject hedonism and experientialism outright, so the kinds of tradeoffs people make normally might not refer much to the value of experiences; then again, see this). It seems unlikely that there's a one-size-fits-all answer, but maybe the average responses are good enough, or the best we can do.
Some more discussion of welfare metrics here: Why does EA use QALYs instead of experience sampling?
Thanks for this comment, you raise a number of important points. I agree with everything you've written about QALYs and DALYs. We decided to frame this in terms of DALYs for simplicity and familiarity. This was probably just a bit confusing though, especially as we wanted to consider values of well-being (much) less than 0 and, in principle, greater than 1. So maybe a generic unit of hedonistic well-being would have been better. I think you're right that this doesn't matter a huge amount because we're uncertain over many orders of magnitude for other variables, such as the moral weight of chickens.
The trade-off problem is really tricky. I share your scepticism about people's actual preferences tracking hedonistic value. We just took it for granted that there is a single, privileged way to make such trade-offs but I agree that it's far from obvious that this is true. I had in mind something like "a given experience has well-being -1 if an idealised agent/an agent with the experiencer's idealised preferences would be indifferent between non-existence and a life consisting of that experience as well as an experience of well-being 1". There are a number of problems with this conception, including the issue that there might not be a single idealised set of preferences for these trade-offs, as you suggest. I think we needed to make some kind of assumption like this to get this project off the ground but I'd be really interested to hear thoughts/see future discussion on this topic!