I’ve now read everything on the GiveWell website about the Against Malaria Foundation, a top rated charity since 2011.  This has helped me increase my understanding of the work they do and the challenges involved.  This is the first in a series of posts summarising my outstanding concerns from this reading.

It may be that I’ll find the answers to some of these points by looking elsewhere, for example reading the AMF website or getting in touch with them directly.  That means this is not the final word on my view of the Against Malaria Foundation.  However, I’m capturing my progress at this stage so that I have a clear basis to build on for further work.

Concern #1:  When you try to measure outputs (malaria case rates/deaths) rather than inputs (bed nets distributed) for non-RCT/"real world" distributions, there is no evidence of impact.  

One of the big lessons I took from William Easterly’s work was to focus on outputs, not inputs.  In general, charities and NGOs like to boast about how much effort they’ve put in, but what we really care about is the impact they’ve had.  A lot of the argument for bed net distributions is about inputs.  The cost-effectiveness calculations are a prediction of how many lives should be saved, and how many malaria cases should be avoided, if our assumptions hold.  What we really care about is how many lives are saved and how many malaria cases are avoided, something that in principle can be measured by counting malaria deaths, or by counting malaria cases before and after our distributions.

In principle I think GiveWell and AMF would agree with this.  Indeed, AMF had a plan to monitor malaria case rates before and after distributions to prove their effectiveness.  However, when they actually collected the data they concluded the data was of poor quality and so abandoned this plan.  From the GiveWell website:

“Malaria case rate data: Previously, AMF expected to collect data on malaria case rates from the regions in which it funded LLIN distributions…In 2016, AMF shared malaria case rate data from Malawi…but we have not prioritized analyzing it closely. AMF believes that this data is not high quality enough to reliably indicate actual trends in malaria case rates, so we do not believe that the fact that AMF collects malaria case rate data is a consideration in AMF’s favor, and do not plan to continue to track AMF’s progress in collecting malaria case rate data.”

I find this very worrying.  Maybe the data was of poor quality, but that is a reason for working harder in this area rather than abandoning it altogether.  In general, if we only have poor quality data about malaria in a region, doesn’t that mean we do not know how effective a bednet distribution will be?  More cynically, maybe the data did not demonstrate a significant reduction in malaria and that in itself was taken as evidence that the data was low quality.  If that is the case then we may be ignoring evidence that the world is more complex than we thought, something which effective altruists ignore at their peril.

Elsewhere, I have read that AMF requires its distribution partners to collect monthly malaria case rate data from all health centers in the distribution zone for 12 months preceding and 4 years following the distribution.  I don’t think this requirement is actually enforced. 

Whatever the reasons, it seems that the only time AMF tried to evidence their impact by collecting data they were unable to do so.  This is a very bad sign.  The fact that GiveWell is not concerned with this is also confusing, though not my primary issue in this review.

Taking a step back from the Against Malaria Foundation to look at the malaria problem more generally, there is mixed evidence that bed net distributions reduce malaria case rates.  GiveWell has a macro review of the evidence which finds that, at the nation level, no impact can be demonstrated from malaria control initiatives as a whole.  Highlights include:

“On the whole, continent-level data do not convincingly show a relationship between the scale-up of malaria control and a fall in malaria mortality, or even a clear trend in malaria mortality.”

“In most cases, funding allocated to ITNs is significant, but many malaria control measures at once are occurring and malaria data quality is unclear (more below), so it is difficult to say much about the relative contribution of ITNs.”

“There are also 15 countries where it appears that malaria control efforts have been strong, yet there is … ‘Limited evidence of decrease’ in malaria burden.”

“GiveWell charted ITN coverage and malaria deaths 2000-2009.  Some countries look like they could be cases where a rapid scale-up in ITN coverage failed to result in a drop in malaria deaths”

“Available data and studies appear to show some cases of apparent malaria control success, and also seem to indicate that the overall burden of malaria in Africa is more likely to be falling than rising. However, in most cases it is difficult to link changes in the burden of malaria to particular malaria control measures, or to malaria control in general, and the data remains quite limited and incomplete, such that we cannot confidently say that the burden of malaria has been falling on average.”

Digging around on the GiveWell website turns up more details worth highlighting.  Malaria rates in Benin, DRC, Ghana, Mali & Sierra Leone increased as net coverage increased, which is more evidence that the malaria data being used is not great.  In central Africa, malaria was trending downwards before bednet coverage was scaled up, further muddying the waters when trying to measure impact.

GiveWell’s response to all of these points seems to boil down to “we don’t use this data as a basis for our recommendations so these issues are irrelevant”.  GiveWell's recommendation is based on the evidence from Randomised Controlled Trials that using bednets does reduce malaria cases and deaths. Maybe the RCT evidence is so convincing that the noise of country-level data doesn’t matter.  However, the point remains that there are multiple attempts at evidencing impact of bednet distributions and none of these attempts are convincing.  A lack of evidence of impact for real-world distributions should be a concern to any donors to this cause.


 


(I only skimmed your post, and it has been some time since I've read either the GiveWell intervention reports or the studies they draw from)

I appreciate attempts to criticize/red-team existing EA organizations and EA evaluations of interventions. That said, this argument mostly falls flat for me.

My understanding is that the structure of the GiveWell recommendation for the Against Malaria Foundation (AMF) is really quite simple:

  1. At the intervention level, there is strong, RCT-backed evidence that long-lasting insecticidal bednets are very good at preventing mosquito-borne illnesses and decreasing overall child mortality.
  2. At the charity level, AMF is unusually good at distributing such bednets at scale.

These arguments are not iron-clad. For example, for #1, maybe you think insecticidal bednets are so a priori implausible as an anti-malaria intervention that you would not trust any level of RCT evidence? But that objection falls flat for me, as "bednets that repel or kill mosquitoes make it harder for malarial mosquitoes to bite kids at night" passes some very simple sanity checks, at least for me. (Or perhaps you think drawing GiveWell's conclusion from the RCTs is statistically wrong, because of reasons? If so, it'd be good to list the reasons!)

Another reason you might doubt #2 is relevant is if you suspect that AMF cannot achieve results similar to those implied by the RCTs.  For example, if you think the places AMF works in are so "out-of-distribution" relative to the RCTs, because of lower malarial load[1]. But my understanding is that a) the GiveWell analysis accounts for this and b) the malarial loads aren't that different.

There are a number of other reasons, which I won't go into, that engage with the argument structure.

However, your critique does not engage with the structure of the argument, and instead[2] argues that because there's no direct empirical evidence of AMF's specific distribution of bednets  saving lives, we cannot assume that AMF's bednets save lives.

 I currently think your post is an overly myopic treatment of the evidence. For a better extension by my lights, I'd be interested to see more engagement from you on whether the structure of the original argument is wrong, or alternatively, why you think your alternative formulation/framework of the problem ought to be the preferred one. I would also be interested in a very different critique of AMF that takes GiveWell's structure as a given but argues that by those lights, AMF is not a good donation target (eg because the intervention research is actually shoddy, or because AMF is actually bad at delivering bednets).

[1] My understanding is that, in contrast, substantially lower worm load is a serious reason to be skeptical of the present-day impact of deworming interventions.

[2] You also argue that there's observational data against AMF's effectiveness because the countries they work in don't have obviously lower malarial loads. However I think causality is just pretty hard to determine from observational data, for reasons Charles mentions here.

Thanks Linch, interesting thoughts.

To clarify, my point is not just there's no direct empirical evidence of AMF's specific distributions saving lives.  My point is that there is no direct evidence of any  non-RCT/"real world" distributions saving lives.

Further, this is not because nobody is looking for such evidence.  GiveWell's macro review of the evidence suggests every time somebody has looked for evidence of non-RCT/"real world" distributions saving lives they've come up with nothing.

I agree with your summary of the GiveWell argument (strong RCT evidence + AMF as competent distributor).  However, in order to turn these two facts into a prediction about the future we need to add the assumption that the RCT evidence applies to future distributions.  This is the weak link in the chain.  As you say, differences in malarial load could distort things.  Differences in the underlying health of the population, differences in net usage and increasing insecticide resistance are other contenders, along with many more I'm sure.  If we can't see any evidence of impact after distributing hundreds of millions of bednets then it seems reasonable to question if this key assumption is leading us astray.

In your post, I think your concerns are in two categories:

Issue A. Not tracking outcomes for recipients (or, more likely, initially trying to track, finding no positive statistical effect, and dropping data collection).

Indeed, AMF had a plan to monitor malaria case rates before and after distributions to prove their effectiveness.  However, when they actually collected the data they concluded the data was of poor quality and so abandoned this plan...I find this very worrying.  Maybe the data was of poor quality, but that is a reason for working harder in this area rather than abandoning it altogether.  In general, if we only have poor quality data about malaria in a region, doesn’t that mean we do not know how effective a bednet distribution will be?

Issue B. At the country level (not monitoring recipients of AMF nets but malaria levels in countries), there is no/limited/mixed evidence for malaria reduction:

Taking a step back from the Against Malaria Foundation to look at the malaria problem more generally, there is mixed evidence that bed net distributions reduce malaria case rates.  GiveWell has a macro review of the evidence which shows at the nation-level you cannot demonstrate any impact from all malaria control initiatives.  

...Malaria rates in Benin, DRC, Ghana, Mali & Sierra Leone increased as net coverage increased, which is more evidence that the malaria data being used is not great.  In central Africa malaria was trending downwards before bednet coverage was scaled up, further muddying the waters when trying to measure impact.

“Available data and studies appear to show some cases of apparent malaria control success, and also seem to indicate that the overall burden of malaria in Africa is more likely to be falling than rising. However, in most cases it is difficult to link changes in the burden of malaria to particular malaria control measures, or to malaria control in general, and the data remains quite limited and incomplete, such that we cannot confidently say that the burden of malaria has been falling on average.”

What you wrote is a complete, well-reasoned line of thought from careful study of the AMF website.

However, this is not sufficient evidence for strong updates against AMF. 

For me, it's not even enough evidence that would cause me to investigate this issue further.

The root issue/crux is that the "causal inference"/"causal identification" (the information you can get from the statistics collected here) is very weak, and far from a model of impact or a way of finding the Truth.

Some perspectives:

Issue A: For the first issue, where tracking recipients was ineffective (or, as you suggest and I also find plausible, they found no statistical effect and data collection was then dropped): I don't know more than what you wrote, but finding no effect is plausible, even common, for highly successful interventions.

  • The statistical power may be very low. To get intuition for this, remember that a life saved costs $5000 in expectation and a bednet costs ~$2. In a real statistical sense, you literally need thousands of bednets to get one "observation" of a death or life saved. So you may need hundreds of thousands, or really millions, of bednets to get enough observations for statistical power. And that's just one layer of the difficulty: it assumes perfectly balanced treatment/control groups and demographics; you may need an order of magnitude more observations to do a proper observational study. Even generously, that's a large fraction of all the bednets distributed in a year. From this problem alone, my prior would be to find no effect, and I would also expect the exercise to impose large operational costs that many donors (myself included) would find unacceptable.
  • The above implies a pretty clean, controlled test environment. E.g. two villages, one with bednets and one without; or really, two children in the same household, one with a bednet and one without. This isn't going to happen in the actual program, and the effects are wildly different when uncontrolled.
  • Examples of random stories that are going to mess up inference: a principled bednet distributor might give nets to poorer families, or families with sicker children and adults. Since everyone probably knows bednets are effective, wealthier families might get their own (which is good; AMF can give to the really poor), and these wealthier families might get more premium bednets and treatments (e.g. $10 instead of $2), so you don't have a clean comparison group.
  • There are even more pathological stories that mess up your inference: if you were a skilled implementer who had worked on the ground in this program for many years, and you knew you only had 100 bednets for 1000 people (maybe because the EAs got captured by the AI/futurist memes, which diverted all the billionaire funds), you might know that who gets the bednets is very important, like by a factor of 2 or 4. That is, if you give the right bednets to the right people you can increase cost-effectiveness by 200-400%. By definition, this skill isn't legible in a survey. So your very skill in giving bednets to the worst-off families, most afflicted by malaria, means that someone looking at the data will go "hey, when we collect data for recipients of malaria nets, these families don't look better off; let's cancel this."
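To put rough numbers on the statistical-power point above, here is a minimal back-of-envelope sketch. All rates are assumed for illustration (they are not figures from AMF or GiveWell); the calculation uses the standard normal-approximation sample-size formula for comparing two proportions.

```python
import math

def n_per_arm(p_control, p_treated):
    """Approximate participants needed per arm to detect a difference
    between two proportions (normal approximation; z-values fixed at
    two-sided alpha = 0.05 and 80% power)."""
    z_alpha, z_beta = 1.96, 0.84
    variance_sum = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum
                     / (p_control - p_treated) ** 2)

# Assumed baseline: 4 deaths per 1000 children per year without nets,
# and a ~17% relative mortality reduction (the RCT-sized effect).
p_control = 0.004
p_treated = p_control * (1 - 0.17)

n = n_per_arm(p_control, p_treated)
print(n)  # on the order of 120,000 children per arm
```

Over a hundred thousand children per arm, before any adjustment for clustering or confounding, which is consistent with the "hundreds of thousands to millions of bednets" intuition above.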

 

Issue B: Cross country effects

  • The cross-country sort of examination suffers from all of the issues above, but is even weaker. For example, climate trends, poverty, and institutional change are all forces that will mess up results, and even this description is a crude gesture at the realities of what is going on. What about other ways malaria can be contracted, besides sleeping in a bednet-eligible bed?
  • These confounding effects mean that nation-level studies might never find an effect at all, even with very effective interventions. Another major crux is how much bednet coverage there is in a country. Again, I don't know anything about this beyond reading your post, but if bednet coverage is 10% or even 30%, that may not be enough to find an effect even if bednets were 100% effective.
  • That's assuming bednets were 100% effective. If bednets were even 1% effective (which, by the way, would still make them completely worth it and is consistent with the CEA of $5000 per life at ~$2 per bednet), you may never be able to find an effect in an observational study.
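A toy calculation makes the coverage point concrete. All numbers here are assumed for illustration; the death-estimate range is the WHO 2011 range quoted elsewhere in this thread.

```python
# Toy model: the national-level reduction in malaria deaths is capped
# by coverage x efficacy, however good the nets are for their users.
coverage = 0.30   # assumed fraction of the population sleeping under nets
efficacy = 0.50   # assumed reduction in malaria deaths among net users

national_reduction = coverage * efficacy  # at best ~15% fewer deaths

# WHO's 2011 estimate of annual malaria deaths spans 537k-907k,
# roughly +/-26% around the midpoint -- wider than the signal above.
lo, hi = 537_000, 907_000
mid = (lo + hi) / 2
relative_noise = (hi - mid) / mid

print(national_reduction, round(relative_noise, 2))
```

Even under these generous assumptions, the best-case national signal (15%) sits entirely inside the measurement uncertainty (~26%), so a null result at the country level tells you very little.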

Basically, cross-country regressions aren't good unless embedded in a strong model/context, and this domain is sort of an "also-ran" in economics.

 

Again, what you wrote is a complete, well-reasoned line of thought from careful study of the AMF website.

You said:

we may be ignoring evidence that the world is more complex than we thought, something which effective altruists ignore at their peril.

Like, to be clear, let's flip the evidence another way around: 

Imagine someone who came to you for money for a new project or new business. This person didn't understand the intervention, and didn't understand the country or its people. All they present is an argument they read from papers, with just country-level observational data, or data from someone they didn't know who collected it while giving nets to families.

If you were being asked to give money to this person, this information would not be enough to trust them (and it may even be wise to distrust them if this was the only argument they were able to present).

Thanks Charles for your detailed response.

I agree with your central point that it's very hard to use statistics to prove anything.  In particular, you need a huge amount of data, and there is lots of noise, as the real world is not a clean & tidy place.

For bednets, we do have a huge amount of data.  The World Malaria Report 2011, used in GiveWell's macro review, says 145 million bednets were distributed in sub-Saharan Africa in 2010 alone [1].  That's theoretical coverage for around 30% of the population [2].  This is a massive level of intervention.

For malaria, we also have lots of noise.  The same World Malaria Report puts annual deaths in the range 537,000-907,000.  That's a pretty wide confidence interval.  The Lancet gives 929,000-1,685,000 deaths per year.  That's a wider range than the first and the two ranges don't even overlap. [3]
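The arithmetic behind both of these claims can be checked directly (using GiveWell's 1.8-people-per-net assumption from footnote [2]):

```python
# Theoretical coverage from the 2010 distribution figures:
nets_2010 = 145e6          # nets distributed in sub-Saharan Africa, 2010
people_per_net = 1.8       # GiveWell's coverage assumption
population_ssa = 869e6     # sub-Saharan Africa population

coverage = nets_2010 * people_per_net / population_ssa
print(round(coverage, 2))  # ~0.3, i.e. roughly 30% of the population

# The two annual-deaths estimates really don't overlap:
who_range = (537_000, 907_000)       # World Malaria Report 2011
lancet_range = (929_000, 1_685_000)  # The Lancet
overlap = who_range[1] >= lancet_range[0] and lancet_range[1] >= who_range[0]
print(overlap)  # False
```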

I understand GiveWell's position (and yours?) to be "There is so much noise, the real world observations don't really tell you anything.  You have to focus on the Randomised Control Trials as proving the concept & monitor AMF to ensure competent delivery".  This might well be right.  However, it is then unclear what information could ever be supplied to change GiveWell's mind.  How many bednets would we have to distribute with no evidence of impact before we revisit the recommendation? A billion? 100 billion?  Put another way, imagine 10 years from now we find out that bednet distributions had much less impact than we expected.  What would be the evidence that demonstrates this, and where might we look now for clues that such evidence is emerging?

More generally, if an intervention can't stand out from the statistical noise then I'm not sure it passes my personal threshold for a top intervention.  As a minimum this means the scale of the problem, and so the scale of our impact, is not well understood.  An intervention that can't stand out from statistical noise has no way of providing feedback to providers on when it is going well or badly, and so has no way to avoid mistakes and no way to improve.  Finally, there's also a psychological element about certainty of impact that will be a big deal to some donors, but that's a topic for another day.

 

[1] Source: https://www.who.int/malaria/world_malaria_report_2011/WMR2011_factsheet.pdf

[2] Based on GiveWell's assumption of 1.8 people covered per net & a population of 869m, as per here:  https://www.statista.com/statistics/805605/total-population-sub-saharan-africa/

[3] Source:  https://blog.givewell.org/2013/01/23/guest-post-from-david-barry-about-deworming-cost-effectiveness/

When you're saying "a lack of evidence should be concerning" are you saying "this has been studied and it looks like there's no impact" or "this hasn't been studied as well as it should be"? The word "concerned" is quite vague, but sometimes when people say "donors should be concerned" they mean "donors should stop donating", which I think is the wrong answer here.

The CDC says, "LLINs have been associated with sharp decreases in malaria in countries where malaria programs have achieved high LLIN coverage." They don't list their source, but emailing them might be one of the things you'd like to do as you look into this?

https://www.cdc.gov/malaria/malaria_worldwide/reduction/itn.html

Have you read this GiveWell page on bed nets? They state:

There is strong evidence that when large numbers of people use LLINs to protect themselves while sleeping, the burden of malaria can be reduced, resulting in a reduction in child mortality among other benefits.

Or this Cochrane review? 

Insecticide‐treated nets reduce child mortality from all causes by 17% compared to no nets (rate ratio 0.83, 95% CI 0.77 to 0.89; 5 trials, 200,833 participants, high‐certainty evidence). This corresponds to a saving of 5.6 lives (95% CI 3.6 to 7.6) each year for every 1000 children protected with ITNs. Insecticide‐treated nets also reduce the incidence of uncomplicated episodes of Plasmodium falciparum malaria by almost a half (rate ratio 0.55, 95% CI 0.48 to 0.64; 5 trials, 35,551 participants, high‐certainty evidence) and probably reduce the incidence of uncomplicated episodes of Plasmodium vivax malaria (risk ratio (RR) 0.61, 95% CI 0.48 to 0.77; 2 trials, 10,967 participants, moderate‐certainty evidence).
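As a quick consistency check on that quote (my own back-of-envelope, not from the review): the 17% relative reduction and the 5.6 lives saved per 1,000 children together imply a baseline all-cause death rate in the trial populations of about 33 per 1,000 child-years.

```python
# If a 17% relative reduction in all-cause child mortality saves
# 5.6 lives per 1000 children per year, the implied baseline rate is:
lives_saved_per_1000 = 5.6
relative_reduction = 0.17

baseline_deaths_per_1000 = lives_saved_per_1000 / relative_reduction
print(round(baseline_deaths_per_1000, 1))  # ~32.9 deaths per 1000 child-years
```

That high baseline is one reason to wonder how well the trial settings transfer to present-day distribution areas with lower mortality.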

If the nation-level data isn't supportive of this, then perhaps this is worthy of further investigation to understand why it may be different from the trials. 

You seem to acknowledge this by saying 'Maybe the RCT evidence is so convincing that the noise of country-level data doesn’t matter' - but if your claim is that there is 'no evidence of impact' specifically at the country-level, then I'd encourage you to be clear about this with your heading. The statement that 'when you try to measure outputs there is no evidence of impact'  doesn't seem true.

Thanks for the comments Matt.  I've adjusted and improved the post based on your input.

I was aware of this info and assumed everybody else would be too, so I just took it as read. However, I agree these points are not clear enough in the original post above.

I've now changed the heading to add the clarity that it only applies to non-RCT/"real world" distributions.  I've also inserted a sentence in the final paragraph to make it clear the RCTs do show such evidence and this is the basis for GiveWell's recommendation.  

Malaria rates in Benin, DRC, Ghana, Mali & Sierra Leone increased as net coverage increased, which is more evidence that the malaria data being used is not great. 

I appreciate your sharing what you find as you dig through the data. But I'd also recommend sharing links for statements like this (or at least instructions for finding the same information you found). This makes it much easier for other people to dig along with you.

Elsewhere, I have read that AMF requires its distribution partners to collect monthly malaria case rate data from all health centers in the distribution zone for 12 months preceding and 4 years following the distribution.  I don’t think this requirement is actually enforced. 

If you email AMF to ask about whether they enforce this, I think you'll get a response pretty quickly. Ditto other questions you come up with in your research. (At least, based on my own experience emailing AMF about things.)

I think you should be cautious about statements like:

"More cynically, maybe the data did not demonstrate a significant reduction in malaria and that in itself was taken as evidence that the data was low quality."

From reading your post, you don't seem to have any evidence for this at all and it doesn't seem like you've made any effort to find any evidence (e.g. by asking AMF). If that is indeed the case, this suggestion is baseless.

I really think you ought to consider renaming this post given you've not even emailed GiveWell yet? It seems like you're planning to post a series on "concerns with AMF" before asking any follow up questions to anyone. Probably about 1000 people will see the title. There's some chance you could convince someone to stop donating to AMF just from the title - that tends to be how brains work, even though it isn't very rational.

I really think you ought to consider renaming this post... Probably about 1000 people will see the title. There's some chance you could convince someone to stop donating to AMF just from the title - that tends to be how brains work, even though it isn't very rational.

I think it's not a good idea to respond to criticism in this way. I imagine myself as an outsider, skeptical of some project, and having supporters of the project tell me, "It's morally wrong to say we're not doing good without following our things-to-do-before-critiquing-us checklist, because critiques of us (if improperly done) might cause us to lose support, which is tantamount to causing harm."

I think this would (and should) make skeptic-me take a dimmer view of the project in question. It's unconvincing on the object level; to the extent that I already don't think what you're doing is valuable, I shouldn't be moved by arguments about how critiquing it might destroy value. And it pattern-matches to the many other instances of human organizations wanting to dictate the terms on which they can be criticized, and leveraging the force of moral arguments to do so. Organizations that do this kind of thing are often not truth-seeking and genuinely open to criticism (even when it's done "properly" by their lights).

I think telling someone not to post criticism without having done X, Y or Z seems bad, but I think asking someone for a title change to make clear that this is a set of concerns one person has come up with rather than e.g. news of an evaluation change from GiveWell is reasonable, and that's what I read the request as.

Specifically about asking organisations ahead of posting criticism, I think this is a good thing to do, but absolutely shouldn't be required before posting. In this case, I expect asking someone before posting would have led to a much higher quality post, as the responses from Charles and Linch would almost certainly have come up, and there would have been a chance to discuss them.

I literally have nothing to do with AMF, I just think the title is bad and not representative of the post.

I didn't mean to imply you did, though I see how "human organizations wanting to dictate the terms on which they can be criticized" might sound that way. My sense that it's bad if posts on the Forum that are critical of AMF get met with this kind of argument doesn't hinge on whether the person making the argument is involved with AMF or not.

There's a really interesting meta-point here where it looks like the Europeans broadly agreed that requesting a title change was reasonable (was at +30 karma when I went to sleep) and the West coast EAs disagreed (back down to the original +2 when I woke up).

I disagree on the broader point. I think people, and especially EA, should be grateful for well-considered technical criticism, and  we should be especially wary of adding too many roadblocks for criticism.

Minor point:

I really think you ought to consider renaming this post given you've not even emailed GiveWell yet?

Shouldn't the natural organization to give a heads-up for feedback of this sort be the Against Malaria Foundation, and not GiveWell?

This mostly seemed to be a criticism of how GiveWell communicates about AMF rather than of AMF itself, given the author hasn't even visited AMF's website yet, but either would make sense.