How good is The Humane League compared to the Against Malaria Foundation?

by smclare, AidanGoth, 29th Apr 2020



Thanks to the experts with whom we spoke while researching this report!

This was written by Stephen Clare and Aidan Goth.

1. Summary

  • To help us prioritize Founders Pledge’s research and advising efforts across cause areas, we created a rough model that tries to compare an animal welfare intervention (The Humane League’s cage-free campaigns) to a global health intervention (AMF’s bednet distribution).
  • In assessing the impact of THL’s intervention, we identified three main sources of uncertainty:
    • How many animal years are affected by the intervention
    • How much the intervention improves each animal’s subjective well-being
    • How much an animal’s well-being matters compared to a human’s (the moral weight)
  • The last factor is really tough to figure out. There are good reasons to think the weight might be quite high, and good reasons to think it might be very low. That means the range of our moral weight estimates spans multiple orders of magnitude.
  • For this report, we made spreadsheet and Guesstimate models that compare The Humane League to the Against Malaria Foundation for a range of different assumptions about the above uncertain factors.
  • Importantly, we assumed hedonism (sentient experience is all that matters morally), that chickens have moral status (their experience matters morally), and anti-speciesism (the value of an experience is independent of the species of animal that is experiencing it). Accordingly, this analysis does not offer an all-things-considered view on the relative goodness of THL and AMF – it assumes a particular worldview that is relatively favourable to THL.
  • In this model, in most of the most plausible scenarios, THL appears better than AMF. The difference in cost-effectiveness is usually within 1 or 2 orders of magnitude. Under some sets of reasonable assumptions, AMF looks better than THL. Because we have so much uncertainty, one could reasonably believe that AMF is more cost-effective than THL or one could reasonably believe that THL is more cost-effective than AMF.
  • In general, if you value human well-being >10,000 times more than chicken well-being, AMF looks better. If you value human well-being <300 times more than chicken well-being, THL looks better. But between these moral weights the ranking is less clear. We think there’s a good chance (at least 50%) that the moral weight falls between these bounds, where factors like THL’s effectiveness and the badness of battery cages are more important.
  • It’s very likely that we’re missing key considerations that could change our estimates by orders of magnitude. For example, we haven’t tried to account for moral uncertainty, indirect effects of the interventions or longtermist considerations.
  • Links to our models:

2. Motivation

We have to decide how to allocate limited resources between the best human charities and the best animal charities. We can work out the trade-offs explicitly or make them implicitly, but either way we have to decide.

3. Theoretical Framework

Why this is hard

  • Our epistemic status is weak. It’s hard to know what the relevant variables are, much less how they vary among different animals. We do not have access to the internal experience of other animal species. We have to try to approximate what it’s like based on observations about their behavior, reflecting on our own experience, and considering the different biological and philosophical factors that seem likely to shape consciousness.
  • Animal-focused interventions are less well studied than human-focused interventions, so we know much less about how effective interventions to improve animal welfare are.
  • So there are (at least) three sources of uncertainty:
    • How many animal years are affected by the interventions
    • How much do the interventions improve each animal’s subjective well-being
    • How much does an animal’s well-being matter compared to a human’s?

Moral weight vs. moral status

  • A moral weight suggests how much we should value the welfare of an animal in a given species relative to an animal from a different species. These are usually defined relative to humans, i.e. humans are given a moral weight of 1.
  • Moral weights are different to an animal’s moral status, i.e. whether its welfare matters to us. Here we focus on moral weights, which relate more to a creature’s ‘capacity for welfare.’ For farm animals, Muehlhauser’s 2017 report, the most in-depth treatment of this question we found, assigns chickens an 80% chance of moral status.

Candidates for key variables that determine moral weight

Some of the variables we came across in researching moral weights are:

  • Clock speed of consciousness (suggested by Muehlhauser)
    • Smaller animals have faster reaction times (e.g. imagine trying to swat a fly).
    • To the extent those reactions are under conscious control, smaller animals would experience more subjective moments per unit of objective time
    • When measuring well-being, it seems very likely we should care more about the subjective length of an experience than about its objective duration
    • This updates us towards valuing smaller animals more
  • Experience intensity (Muehlhauser)
    • More ‘intense’ experiences seem like they should matter more
    • Human experiences seem like they’re probably more intense. But experiences of other species may be more intense - for example, it’s not clear whether our ‘linguistic thoughts’ make our experience more or less intense
  • Unity of consciousness (Muehlhauser)
    • Refers broadly to how various different conscious inputs are combined into a single experience. E.g. conscious states over time all belong to some persistent “self” (subject unity) and contents of conscious states are unified (representational unity and phenomenal unity)
    • Debatable how much this influences moral weight, but subject and phenomenal unity relate to subjective experience which seems relevant to moral weight. Humans probably have more unity of consciousness than animals.
  • Brain size/complexity (Tomasik)
    • Brian Tomasik is very uncertain, but weakly suggests a view where moral weight scales non-linearly with brain size. As a rough approximation he suggests scaling by N^(2/5), where N is the number of neurons the animal has.
    • However, some evidence suggests that the correlation between cognitive sophistication and neuron count is weak.
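
To make the N^(2/5) scaling concrete, here is a minimal sketch. The neuron counts are rough placeholder figures chosen for illustration and are not taken from this report; the function name is also hypothetical.

```python
def neuron_scaled_weight(animal_neurons: float,
                         human_neurons: float,
                         exponent: float = 2 / 5) -> float:
    """Moral weight relative to a human (weight 1) under N**exponent scaling."""
    return (animal_neurons / human_neurons) ** exponent


# Placeholder neuron counts, for illustration only (assumptions, not report data).
HUMAN_NEURONS = 8.6e10
CHICKEN_NEURONS = 2.2e8

weight = neuron_scaled_weight(CHICKEN_NEURONS, HUMAN_NEURONS)
linear_weight = neuron_scaled_weight(CHICKEN_NEURONS, HUMAN_NEURONS, exponent=1.0)

print(f"N^(2/5) scaling: {weight:.3f}")
print(f"linear scaling:  {linear_weight:.5f}")
```

Because the exponent is below 1, this kind of scaling gives small-brained animals far more weight than a simple linear neuron count would.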

4. Methodology

We made a spreadsheet model that generates cost-effectiveness estimates for many different plausible values and noted where the key considerations lie. We only estimated a model to compare our most popular global health intervention (AMF) to our top animal welfare recommendation (The Humane League’s campaigns for aviaries instead of battery cages for egg-laying hens). Our model is:


$$
\frac{(\text{number of hen-years spent in aviaries rather than battery cages}) \times (\text{benefits of moving from a battery cage to an aviary}) \times (\text{moral weight of chickens})}{\text{how many DALYs are averted by \$1 to AMF}}
$$
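
As a rough sketch, the model can be written as a single function. The function and parameter names are ours, and the way "benefits" is computed (the magnitude of battery-cage badness times one minus the relative badness of aviaries) is our reading of the parameter ranges described later in this post, not necessarily the exact spreadsheet formula. The example inputs are illustrative only.

```python
def thl_to_amf_ratio(hen_years_per_dollar: float,
                     cage_wellbeing: float,
                     aviary_relative_badness: float,
                     moral_weight: float,
                     amf_dalys_per_dollar: float) -> float:
    """How many times more cost-effective THL is than AMF (>1 favours THL).

    cage_wellbeing: negative multiplier of one unit of healthy life (e.g. -10).
    aviary_relative_badness: aviary badness as a share of cage badness
        (0.7 = 70% as bad; negative values mean a life worth living).
    """
    # Assumed reading of "benefits": well-being gained per hen-year moved.
    gain_per_hen_year = abs(cage_wellbeing) * (1.0 - aviary_relative_badness)
    thl_value_per_dollar = hen_years_per_dollar * gain_per_hen_year * moral_weight
    return thl_value_per_dollar / amf_dalys_per_dollar


# Illustrative inputs only: 10 hen-years/$, cages at -10, aviaries half as bad,
# 1 human = 500 chickens, and 50 DALYs per $1,700 for AMF.
ratio = thl_to_amf_ratio(10, -10, 0.5, 1 / 500, 50 / 1700)
print(f"THL is about {ratio:.1f}x as cost-effective as AMF under these inputs")
```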


Effectiveness of THL campaigns in changing corporate behaviour

The FP report on animal welfare (published 2018) estimates that THL moves 10 hen-years from battery cages to aviaries per dollar donated. This number assumes there is a 60% probability that companies honour their cage-free commitments and that THL’s advocacy brought these pledges forward by between half a year and one year on average. Simcikas’s (2019) estimates of corporate campaign effectiveness are higher, though not all corporate campaigns relate to battery cages and aviaries specifically. His upper bound is that 160 hen-years are affected per dollar, with a median estimate of 54. Similarly, Bollard (2016) suggests corporate campaigns spare at least 38 hen-years from battery cages per dollar. On the other hand, Bollard (2019) documents that some companies have delayed or reneged on their pledges. While he remains optimistic, this indicates the need for ongoing campaigning to ensure pledges are fulfilled, which would raise the expected cost.

We think the 2018 FP estimate of 10 hen-years/$ is likely a slight underestimate. Across the different tabs on the spreadsheet, we model four scenarios: 1, 10, 30 and 100 hen-years affected per dollar.

Benefits of moving from battery cages to aviaries

We measure the benefits of moving from battery cages to aviaries by estimating how bad battery cages are and then estimating how bad aviaries are as a proportion of the badness of battery cages.

Battery cages are plausibly extremely bad. Hens are kept in tiny spaces, live on wire racks, are unable to move around, and are kept from engaging in natural behaviors such as rooting, preening, and socializing. Pages 20 through 23 of our report have more detail. In aviaries, birds are still kept in quite cramped conditions, but they have up to 80% more space, access to litter and perches, can move around, and can engage in more of their preferred behaviors. However, there is some evidence that the rate of hen mortality is higher in aviaries. Due to this, the FP report estimates there is a 5-10% chance that aviaries are worse than battery cages. OpenPhil’s “current – though uncertain – best guess is that even without additional reforms, the U.S. transition to cage-free housing systems will on net reduce hen suffering once mortality rates have stabilized.”

There have been a couple of other attempts at quantifying this welfare change. Charity Entrepreneurship’s weighted animal welfare index gives battery cages a score of -57 out of -100. As of April 2020, this is the worst score on their scale. In Compassion, by the Pound, F. Bailey Norwood gives caged egg-laying hens a welfare score of -8 (again, the worst score on his scale) and cage-free egg-laying hens a score of +2 (see Table 8.2 on p. 229).

We think that life in a battery cage is very likely to have a negative value - i.e. we conceptualize its badness as a negative multiplier x of 1 unit of healthy time. Plausible estimates of x could vary across several orders of magnitude. Battery cages could be only mildly unpleasant, in which case x would be close to 0. Battery cages could also be truly horrific, with hens spending their entire lives in extreme distress and pain. In that case we think x could be -100 or -1000, meaning 100 to 1000 weeks of healthy life would be morally cancelled out by 1 week in a cage.

(The existence of a unit of “healthy life” as an upper bound to human welfare is common in analyses based on Quality and/or Disability Adjusted Life Years. However, we recognise that there are varying levels of “healthy life”, some of which are better than others, and that well-being might not be bounded in this way. Defining a unit of positive well-being is important to our model because the negative units of well-being are defined in terms of trade-offs with positive units of well-being. We suggest interpreting “healthy life” for a chicken as living with all needs met, no or minimal fear of predation and disease-free (e.g. perhaps the best moments on a very good farm animal sanctuary) and defining one unit of well-being as one instant of healthy life, understood in this way. We’re uncertain about whether this is the best way to formulate our model).

Our median estimates, made independently, were between -10 and -30, but the bounds of our spreadsheet model are -0.1 to -1000. Since battery cages could plausibly be extremely bad (e.g. well-being level of around -1000), we think that the expected well-being of life in a battery cage is lower than the median.
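
To illustrate why a heavy tail of very bad outcomes pulls the expected value below the median, here is a small simulation. The lognormal shape, the median magnitude of 20 and the spread are all assumptions picked for illustration; they are not the distributions used in our models.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Assumed: badness magnitude is lognormal with median 20 and a wide spread.
MEDIAN_MAGNITUDE = 20.0
SIGMA = 1.5

# Well-being in a battery cage is negative, so negate the sampled magnitude.
samples = [-random.lognormvariate(math.log(MEDIAN_MAGNITUDE), SIGMA)
           for _ in range(100_000)]

median_wellbeing = statistics.median(samples)
mean_wellbeing = statistics.fmean(samples)

print(f"median well-being: {median_wellbeing:.1f}")
print(f"mean well-being:   {mean_wellbeing:.1f}")
```

Under these assumptions the mean sits well below the median, because rare draws near -1000 dominate the average.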

The spreadsheet model is not intended to have a probabilistic interpretation, so these bounds do not represent a specific confidence interval. In the first instance, the spreadsheet model shows how much better or worse donating to THL is than AMF given various assumptions without committing to a judgement about how likely it is that those assumptions are true. In particular, the model can help to identify points at which small changes in assumptions change which charity looks more cost-effective. This can sometimes be sufficient for making decisions.

While we are concerned by the data showing increased mortality rates in aviaries, we do not model the scenario in which aviaries are worse for hens than battery cages in our non-probabilistic model because the cost-effectiveness would, of course, be infinitely worse than AMF. In our probabilistic model, we use a probability density function (pdf) that puts some weight (5-10%) on the possibility that aviaries are worse than battery cages.

Moral weight: how much does a hen’s suffering matter compared to a human’s?

These estimates assume hedonism (all that matters morally is conscious experience of pleasure and suffering), that chickens have moral status (their experience matters morally), and anti-speciesism (the moral value of an experience depends only on the quality of the experience, not on the species of animal who experiences it).

Not much research has been done on moral weights and most people are reluctant to give even speculative estimates, so we rely pretty heavily on Luke Muehlhauser’s work (while recognizing its limitations). Extreme uncertainty entails extremely wide bounds, especially because some considerations push in different directions. Some factors suggest animals have more weight while others suggest the opposite. This is why we consider what would happen throughout the range of plausible estimates.

Our model has moral weights ranging from 1/1,000,000 to 1 (i.e. from 1 human = 1 million chickens to 1 human = 1 chicken). We think it’s very likely that if animals have moral status, their moral weight is not vanishingly small (say, less than 1 in 1 million).

Benefits of AMF

We use GiveWell’s updated estimate of AMF’s cost-effectiveness as a point estimate, i.e. ~$1,700 per outcome as good as saving the life of a child under 5. We assume that saving a life is worth about 50 DALYs. These estimates have a margin of error, but since they’re unlikely to be wrong by an order of magnitude that shouldn’t affect our findings too much.
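
The arithmetic behind the AMF benchmark is short enough to spell out. The variable names are ours, and the factor-of-3 error is a hypothetical figure chosen only to show how little a moderate error would shift an order-of-magnitude comparison.

```python
import math

COST_PER_LIFE_EQUIVALENT_USD = 1700.0  # GiveWell-based point estimate used above
DALYS_PER_LIFE_SAVED = 50.0            # assumption stated above

amf_dalys_per_dollar = DALYS_PER_LIFE_SAVED / COST_PER_LIFE_EQUIVALENT_USD
cost_per_daly_usd = COST_PER_LIFE_EQUIVALENT_USD / DALYS_PER_LIFE_SAVED

# Hypothetical: if the estimate were off by a factor of 3, how many orders of
# magnitude would the THL/AMF comparison move?
shift_in_orders_of_magnitude = math.log10(3)

print(f"DALYs averted per $: {amf_dalys_per_dollar:.4f}")
print(f"Cost per DALY:       ${cost_per_daly_usd:.0f}")
print(f"Shift from a 3x error: {shift_in_orders_of_magnitude:.2f} orders of magnitude")
```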

5. Results

In our spreadsheet model, the numbers in the cells show how many orders of magnitude better THL is than AMF. So if the number is black (positive), THL is better. If the number is white (negative), AMF is better.

The columns are moral weight values, and show how chicken experience is valued relative to human experience. The lower bound is 1/1,000,000, i.e. 1 million chickens to 1 human; the upper bound is 1.

The rows measure how good moving from a battery cage to an aviary is for a chicken. This has two layers. First, there’s a range for the badness of battery cages, from -0.1 (close to indifference between life and death), to -1,000 (extreme torture, 1 day of battery cage life outweighs 1,000 days of healthy life). Second, there’s a range for how much better aviaries are, ranging from 0.7 (70% as bad as battery cages) to -0.3 (a life worth living, 30% as good as battery cages are bad).

In the different tabs, we replicate this spreadsheet for different estimates of THL’s effectiveness.
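
A grid like the one in the spreadsheet can be sketched in a few lines. This is an approximation under our own assumptions, not the spreadsheet itself: the intermediate grid values are chosen for illustration (the post only states the bounds), and the benefit formula is our reading of the parameter descriptions above.

```python
import math

AMF_DALYS_PER_DOLLAR = 50 / 1700                       # point estimate from above
MORAL_WEIGHTS = [10.0 ** -k for k in range(6, -1, -1)]  # 1/1,000,000 up to 1
CAGE_WELLBEING = [-0.1, -1, -10, -100, -1000]           # assumed grid steps
HEN_YEARS_TABS = [1, 10, 30, 100]                       # one "tab" per scenario


def log10_ratio(hen_years, cage_wellbeing, aviary_relative_badness, moral_weight):
    """Orders of magnitude by which THL beats AMF (negative means AMF is better).

    Only defined when aviaries are better than cages (relative badness < 1).
    """
    gain = abs(cage_wellbeing) * (1.0 - aviary_relative_badness)
    return math.log10(hen_years * gain * moral_weight / AMF_DALYS_PER_DOLLAR)


def build_grid(hen_years, aviary_relative_badness):
    """Rows are battery-cage badness levels, columns are moral weights."""
    return [[round(log10_ratio(hen_years, cage, aviary_relative_badness, w), 1)
             for w in MORAL_WEIGHTS]
            for cage in CAGE_WELLBEING]


tabs = {hen_years: build_grid(hen_years, 0.5) for hen_years in HEN_YEARS_TABS}

for cage, row in zip(CAGE_WELLBEING, tabs[30]):
    print(f"{cage:>7}: {row}")
```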

We also translated these inputs to Guesstimate to get an expected value. We describe the Guesstimate model at the end of this section.

Where are the inflection points?

  • If you think the moral weight of chickens is less than 1/10,000, then AMF is better than THL unless battery cages are extremely bad
  • If you think the moral weight of chickens is more than 1/100, then THL is better than AMF unless THL is very ineffective and battery cages aren’t that bad
  • If you think battery cages are very bad (-100 or worse), then THL is better than AMF unless chickens have a very low moral weight (<1/10,000) or aviaries are as bad or worse than battery cages
  • If the FP estimate of THL is a little bit pessimistic and THL’s effectiveness is closer to Simcikas’s (2019) estimates, then THL is usually better than AMF unless battery cages are not that bad (>-1) or chickens have an extremely low moral weight (<1/100,000)
  • It rarely matters how much better aviaries are than battery cages, assuming aviaries are at least 30% better than battery cages

For each scenario, we have graphed the line along which AMF and THL are equally cost-effective given various assumptions. Above this line, our model suggests a donation to AMF is better; below the line, THL is better. The y-axis is the inverse of the moral weight, i.e. the number of chickens equal to one human, and the x-axis shows the (negative) momentary well-being of life in a battery cage. We plotted graphs separately for different assumptions about how many hen-years THL affects per dollar, with two graphs per scenario. One plot covers values of battery cage badness from -1 to -1000 with logarithmic axes, and the other zooms in on 0 to -30 with linear axes. The different lines represent different assumptions about how bad life in an aviary is compared to life in a battery cage.
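
The break-even line can be derived by setting the THL and AMF values per dollar equal and solving for the moral weight. The sketch below does this under the same assumed benefit formula as elsewhere (cage badness magnitude times one minus aviary relative badness); the function name and example inputs are ours.

```python
AMF_DALYS_PER_DOLLAR = 50 / 1700  # point estimate from the AMF section


def breakeven_chickens_per_human(hen_years_per_dollar: float,
                                 cage_wellbeing: float,
                                 aviary_relative_badness: float) -> float:
    """Number of chickens equal to one human at which THL and AMF tie.

    If you think one human is worth more chickens than this, AMF looks better;
    if fewer, THL looks better.
    """
    gain = abs(cage_wellbeing) * (1.0 - aviary_relative_badness)
    # Tie condition: hen_years * gain * moral_weight == AMF_DALYS_PER_DOLLAR,
    # and the y-axis value is 1 / moral_weight.
    return hen_years_per_dollar * gain / AMF_DALYS_PER_DOLLAR


# Illustrative point: 30 hen-years/$, cages at -10, aviaries half as bad.
threshold = breakeven_chickens_per_human(30, -10, 0.5)
print(f"Break-even at roughly 1 human = {threshold:,.0f} chickens")
```

The threshold grows linearly with battery-cage badness, which is why the break-even curves are straight lines on the linear-axis plots.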

Here we include the plots for the scenario in which 30 hen-years are affected per dollar donated to THL. We have plots for other scenarios of THL’s effectiveness in this ibb album.

What do our median and expected value estimates suggest?


  • These estimates are speculative and not stable
  • A combination of median estimates is not the same as the overall best guess, so combine estimates with care
  • There are potentially important factors for which our model doesn’t account (e.g. rich meat eaters, non-hedonistic considerations, variable moral status by species, effects of corporate campaigns on the number of chickens that exist)
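
One way to see why medians need to be combined with care is a toy example with two independent, skewed quantities. The numbers are made up for illustration and have nothing to do with our parameter estimates.

```python
import statistics

# Two independent toy quantities, each equally likely to take one of three values.
x_values = [1, 2, 10]
y_values = [1, 2, 10]

product_of_medians = statistics.median(x_values) * statistics.median(y_values)

# All equally likely outcomes of the product X * Y.
products = [x * y for x in x_values for y in y_values]
median_of_product = statistics.median(products)

print(f"product of medians: {product_of_medians}")
print(f"median of product:  {median_of_product}")
```

Here multiplying the two medians gives a different answer from taking the median of the combined quantity, which is the quantity we actually care about.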

Our low-confidence median parameter estimates:

  • THL affects 30 hen-years per $
  • Life in battery cages is worth around -10 to -30 units
  • Life in aviaries is probably a bit better than battery cages, but with lots of uncertainty. When we translate our credence to Guesstimate, the model gives an expected value of aviaries being ~50% as bad as battery cages
  • Human experience is about 300-500 times more valuable than chicken experience. Our distribution would place >10% probability on each order of magnitude between 10x and 10,000x, with tails stretching out to 1M and 1/10 (i.e. chicken experience worth more than humans, due to clock speed and experience intensity)

A Guesstimate model with these estimates suggests that:

  • In expectation, THL is >100x better than AMF
  • In the median scenario, THL is about 2-4x more cost-effective than AMF
  • There is a 71% chance that THL is more cost-effective than AMF (calculations here)
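
For readers who want to experiment outside Guesstimate, here is a minimal Monte Carlo sketch of the same kind of calculation. Every distribution below is a stand-in assumption (lognormals and a normal with hand-picked spreads), not our actual Guesstimate inputs, so the outputs should not be expected to match the figures above. It only illustrates how the mean, the median and the probability that THL beats AMF can differ.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

AMF_DALYS_PER_DOLLAR = 50 / 1700


def sample_ratio() -> float:
    """Draw one THL/AMF cost-effectiveness ratio from assumed distributions."""
    hen_years = random.lognormvariate(math.log(30), 0.8)       # assumed spread
    cage_magnitude = random.lognormvariate(math.log(20), 1.0)  # assumed spread
    # Normal stand-in: leaves some probability that aviaries are worse (>1).
    aviary_relative_badness = random.gauss(0.5, 0.35)
    moral_weight = random.lognormvariate(math.log(1 / 400), 2.5)  # stand-in shape
    gain = cage_magnitude * (1.0 - aviary_relative_badness)
    return hen_years * gain * moral_weight / AMF_DALYS_PER_DOLLAR


ratios = [sample_ratio() for _ in range(50_000)]

mean_ratio = statistics.fmean(ratios)
median_ratio = statistics.median(ratios)
prob_thl_better = sum(r > 1 for r in ratios) / len(ratios)

print(f"mean ratio:        {mean_ratio:,.0f}")
print(f"median ratio:      {median_ratio:,.1f}")
print(f"P(THL beats AMF):  {prob_thl_better:.2f}")
```

With heavy-tailed inputs like these, the mean is driven by rare draws with very high moral weight or very bad cages, so it can be far above the median.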

The expected moral weight in our Guesstimate model is about 0.03 (chicken experience is ~30x less valuable than human experience), which might seem very high. However, note that (1) we assume moral status and (2) if one thinks there is some probability that the moral weight is one, then there is a lower bound to the expected moral weight. If, as we do here, one assumes hedonism and that chickens have moral status, then we think that it is difficult to rule out the chance that humans and chickens have equal moral weights. As a result, we would expect a relatively high moral weight in expectation. There may be other reasons for caring about human experience more than chicken experience such that an all-things-considered view would be less favourable to chickens. We have not taken such considerations into account in this analysis.
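
The lower-bound point can be written out explicitly. Treating the moral weight w as a non-negative quantity that takes values w_i with probabilities P(w = w_i) (a discrete simplification we adopt only for illustration):

$$
\mathbb{E}[w] = \sum_i P(w = w_i)\, w_i \;\ge\; P(w = 1) \cdot 1
$$

So, for a hypothetical credence of 2% that humans and chickens have equal moral weight, the expected moral weight is at least 0.02, however small the other possible values are.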

We should note some other limitations of the Guesstimate model:

  • It does not seem stable (e.g. the numbers change if you refresh the page and get a new sample)
    • The bounds and distribution you choose for the badness of battery cages and the moral weight have big effects
  • Specifying probability distributions that accurately represent our credences for many of these variables is really hard
    • Our pdf for the moral weight is neither lognormal nor normal and we weren’t sure how best to represent it, so we’ve had to fudge the bounds a bit to get a reasonable approximation
    • Our pdf for how bad aviaries are has some weight that they’re worse than battery cages, but also some weight that they’re much better. We’ve fudged the bounds to get an ~8% chance aviaries are worse than battery cages
  • We’re not sure if we should be calculating the ratio of the expected cost-effectiveness of each charity, or the expected ratio of the cost-effectiveness of each charity. But this doesn’t seem to matter much
  • Nevertheless, we’ve played around with the bounds and distributions for key parameters and under most reasonable assumptions, THL is expected to be >100 times better than AMF
  • Although according to our model THL is much more cost-effective than AMF in expectation, the probability that THL is more cost-effective than AMF is relatively low, at 0.71

6. Discussion


We think THL is more cost-effective in expectation than AMF given certain reasonable assumptions, but due to high uncertainty we don’t think that our models offer strong evidence for this claim in general.

Given our current (lack of) understanding of animal sentience and suffering, one could reasonably believe that AMF is more cost-effective than THL or one could reasonably believe that THL is more cost-effective than AMF, even given our THL-friendly assumptions.

Would future work be useful?

We haven’t thought about:

  • The effect of AMF on animals - i.e. the Meat Eater Problem.
  • How to introduce moral uncertainty into the model. We assume hedonism, but there are other moral views to which we assign some probability that would produce very different estimates (e.g. views that deny chickens moral status).

It seems unlikely that we’ll learn about more variables that contribute to moral weights without scientific advances in our ability to understand consciousness. However, a better understanding of how bad aviaries are relative to battery cages could be valuable.

Despite the significant limitations of this work, we still think it's important to try and make cross-cause comparisons. We'd welcome any feedback on how to interpret the results or improve our approach to shed more light on this question!