Since folks are interested in encouraging critiques of EA—an admirable sentiment!—I wrote the following as a good-faith, friendly, and hopefully modest critique. [Note: all of the following is about global health/development, not about long-termist endeavors. I try to make this clear throughout, but might occasionally have let slip some overly-broad phrasing.]
EAs write compelling articles about why RCTs are a great way to understand the causal impact of a policy or treatment. And GiveWell’s claim to fame is that it has led to many millions of dollars of donations to “several charities focusing on RCT-backed interventions as the ‘most effective’ ones around the world.”
But I wonder if the EA movement is allocating nearly enough money to new RCTs and program evaluations, or to R&D more broadly, so as to build out new evidence in a strategic way.
After all, the agreed-upon list of the “best” interventions identified by RCTs seems . . . a bit stagnant.
- When I spoke at the EA Global conference in 2016, GiveWell’s best ideas for global giving involved malaria, deworming, and cash transfers.
- When I look at GiveWell’s current list of the top charities, they still are mostly focused on malaria, deworming, and cash transfers (albeit with the addition of Vitamin A supplements and a vaccine program operating in northwest Nigeria).
Such a tiny set of interventions doesn’t seem anywhere near the scale of the many inequities and problems in the world today. Indeed, Open Phil is offering up to $150 million in a regranting challenge, which seems to be a signal that they have more money to give away than they currently are able to deploy to existing causes.
In any event, how do we know that a handful of interventions and organizations are the best ideas to fund? Because at some point in the past, someone thought to fund rigorous RCTs on anti-malaria efforts, deworming, Vitamin A, cash incentives, etc.
But why would a handful of isolated ideas be the best we can possibly do?
To be a bit provocative (commenters will hopefully point out corrections):
We’ve mostly [albeit not entirely] taken the world’s supply of research as a given—with all of its oversights, poorly-aligned academic incentives, and irreproducibility—and then picked the best-supported interventions we could find there.
But the world’s supply of program evaluations, RCTs, jurisdiction-wide studies (e.g., difference-in-differences), and implementation research is not fixed. Gates, WHO, Wellcome, the World Bank, etc., do fund a constant stream of research, but it isn’t clear why we would expect them to identify and fund the most promising programs and the best studies for EA purposes.
If we want to expand the list of cost-effective ideas, and if EA as a movement has more money than it knows what to do with, perhaps we should develop an EA-focused R&D agenda that is robust, coherent, and focused on the problems of effectiveness at a broad scale? Over time, we could come up with any number of ideas to add to GiveWell’s list.
Doesn't EA Already Fund Research?
There are a number of cases where EA does indeed fund academic research on the effectiveness of interventions, such as GiveWell’s recent funding of this Michael Kremer et al. meta-analysis finding that water chlorination is a highly cost-effective way of improving child mortality. GiveWell has written recently of its commitment to research on malnutrition and lead exposure, while OpenPhil has recently funded research on air quality sensors, Covid vaccines, a potential syphilis vaccine, etc. And I'm sure there are other examples I've missed.
But on a closer look, not much of this research is squarely within the realm of what I’m talking about – i.e., directly funding RCTs and program evaluations themselves as part of a broader and well-designed agenda.
For example, the Kremer et al. meta-analysis of water treatment hinged on 15 main program evaluations (see Table 1). As far as I can tell, none of them were funded by major EA initiatives or donors:
- The Haushofer et al. 2021 paper was funded by NIH, the Dioraphte Foundation, and Sint Antonius Stichting.
- The Dupas et al. 2021 paper was funded by Stichting Dioraphte and the Stanford Center for Innovation in Global Health.
- The Humphrey et al. 2019 paper was funded by Gates Foundation, UK Department for International Development, Wellcome Trust, Swiss Development Cooperation, UNICEF, and NIH.
- The Kirby et al. 2019 paper was funded by DelAgua Health Limited.
- The Null et al. 2018 paper was funded by the US Agency for International Development and the Gates Foundation.
- The Luby et al. 2018 paper was funded by the Gates Foundation.
- The Boisson et al. 2013 paper was funded by Program for Appropriate Technology in Health (PATH); United States Agency for International Development (USAID); Medentech, Ltd.; and Chemical Chlorine Association.
- The Peletz et al. 2012 paper was funded by Vestergaard-Frandsen SA and the United States National Science Foundation.
- The Kremer et al. 2011 paper was funded by Hewlett Foundation, USDA/Foreign Agricultural Service, International Child Support, Swedish International Development Agency, Finnish Fund for Local Cooperation in Kenya, google.org, the Bill and Melinda Gates Foundation, and the Sustainability Science Initiative at the Harvard Center for International Development.
- The other studies are from 2006 and before, when the EA movement didn’t really exist yet.
Instead, the term “research” in this case consisted of summarizing other people’s research, followed by inserting the effect sizes into cost models, etc.
Which is a fine and valuable activity! Indeed, I think that rigorous meta-analysis is one of the best things to perform and fund (while at the Arnold Foundation, I funded BITTS to work with Rachael Meager on her groundbreaking work in this area).
Again, though, it’s derivative of what everyone else chooses to fund. If we don’t fund enough underlying RCTs and evaluations, then we are assuming that we can mostly sit back and wait to see what emerges from Gates/WHO/etc. Then we have to assume that those studies are conducted with the right amount of rigor and the right amount of focus on cost and quality, etc.
But that isn’t happening very often (see above). Indeed, an EA leader told me in conversation that there are problems with relying on the existing academic literature.
First, there is an imperfect overlap between the questions that applied researchers want to study, and the questions that EAs would want to answer.
Second, even when there’s overlap, the academic journal system usually doesn’t ask researchers to collect cost data, which means that if you want to know anything about cost-effectiveness, you’re left with trying to reconstruct those numbers after the fact.
This implies that there are many opportunities for EA to exploit weaknesses in the current academic system by funding a large number of R&D projects with an eye towards cost-effectiveness and scale.
What About Existing Research Agendas?
Aren’t there existing research agendas compiled by EAs? Absolutely. But the most thoughtful and thorough research agendas (e.g., here and here) are all about philosophy and long-termism. I haven’t yet found any similarly thorough research agenda for global health and development, economic advancement, etc. (check out the page for GiveWell's "research agenda" by comparison).
What about GiveWell’s incubation grants, of which there are several dozen listed here? Isn’t that what I’m looking for?
Yes and no. Some of the grants are exactly the sort of thing I would suggest, such as this grant to IDinsight, or this grant to CEGA, this grant to Evidence Action, and this grant to conduct an RCT on cash incentives for vaccination. The recent $14m grant to Evidence Action (one of many to that organization) might also be quite within the spirit of what I would recommend, although to date that organization seems mostly like an implementing organization for deworming and safe water initiatives.
Indeed, GiveWell says they are looking for “academic research to evaluate program evidence,” and “early-stage funding for a promising organization,” and “monitoring and evaluation of an existing organization.”
While it’s a bit subtle, notice what isn’t here: academic research that doesn’t just “evaluate” program evidence (often generated elsewhere), but an extensive R&D agenda to create new program evidence.
That might explain why much of the list seems to be focused on deworming, malaria, and malnutrition, with some grants to organizations focused on other assorted topics (lead exposure, suicide prevention, water treatment, syphilis screening, seasonal migration, and of course, the well-known study of face masks for preventing Covid).
Moreover, many of the grants, on further reading, aren’t necessarily about funding rigorous RCTs or other types of rigorous evaluation, but are about program support or technical assistance.
In sum, the list of incubation grants seems like a great start—but it would be even better if fleshed out into a coherent and extensive R&D agenda akin to the ones on long-termism.
An Interlude on Scale
As an interlude here: We talk about RCTs and RCT results all the time. Not nearly enough people to date have been talking about the problem of scale.
I don’t mean “what organization or government can deliver this intervention at a larger scale,” although that is part of it. Instead, I mean to ask a much broader question: how do we even know that what works in one or two studies will work when replicated at a larger scale?
Because you know what is disastrous? Some studies say that X works; everyone decides that X works; whole organizations spin up to do X and government agencies decide to do X; much money is spent; and then it turns out that there’s no way to make X work at scale no matter what you do. It isn’t replicable, or it isn’t even the type of idea that would ever work at scale.
One problem is that whether it’s economics, international development, education, or any other public policy issue, there are far too many small, ad hoc, one-off studies. Not enough chance for replicability, in other words.
But even if the studies are replicable, not enough ideas are scalable in the first place.
There are human capital effects—perhaps the small startup study was run with the best teachers/workers/etc. someone could recruit, but scaling up would inevitably mean that you get less-qualified people delivering the program. That’s just reality no matter what organization or government is involved.
A classic example is what happened with class size reduction. One fairly small experiment showed that reducing class size had amazingly positive effects, but when class size reduction was adopted by the state of California, a study showed that the “increase in the share of teachers with neither prior experience nor full certification dampened the benefits of smaller classes, particularly in schools with high shares of economically disadvantaged, minority students.”
In other words, putting students in small classes was a great idea in a small study, but when you try to reduce class size statewide (never mind nationwide), you end up having to hire so many teachers that their quality and experience goes down, mostly or fully offsetting the benefits of smaller classes.
Then there are peer effects. Perhaps if you give a drug treatment program or a high school graduation program to just 10% of the local students, they get distracted by the other 90% of kids not in the program, but if you put everyone in the program, they would all reinforce each other’s decisions. In this case, that would mean that the small study in fact underestimates the program’s effect at a larger scale. So if we dismiss this type of program on the basis of small studies, we might be missing the boat.
Then there are general equilibrium effects. For example, job training programs often have positive effects in isolation. If you pick 100 or 200 people to give better training, they might do better than otherwise.
But can such programs work when scaled up? After all, if there are 100 welding jobs in a community (just to make up a hypothetical example), and if you train 100 people for those jobs, they might do well, but if you train 500 people for the 100 welding jobs, the training effect will necessarily dissipate.
Unfortunately, that’s what one study found in France, where the study team had the (rare!) opportunity to randomize not just who got access to the job training program, but how many people the program actually served across different communities. They found that the job training successes came mostly at the expense of the control group.
Discouraging news, but a highly valuable study because it pointed out the flaw in thinking that if job training works for a few people, it would work just as well when offered to everyone.
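The welding hypothetical above can be sketched as a toy displacement model. This is only an illustration under loud assumptions that are not in the France study or anywhere else in this post: a community of 1,000 job seekers (a made-up number), a fixed pool of 100 jobs, and employers who always hire trained applicants first.

```python
def employment_outcomes(jobs, seekers, trained):
    """Toy model: a fixed number of jobs, trained applicants hired first.

    All parameters are hypothetical; this is not the model used in the
    France study mentioned in the text.
    """
    untrained = seekers - trained
    # Trained applicants fill jobs first, up to the number of jobs available.
    trained_hired = min(trained, jobs)
    # Untrained applicants get whatever jobs are left over.
    untrained_hired = min(untrained, jobs - trained_hired)
    return {
        "trained_rate": trained_hired / trained if trained else 0.0,
        "untrained_rate": untrained_hired / untrained if untrained else 0.0,
        "total_employed": trained_hired + untrained_hired,
    }


for n_trained in (0, 100, 500):
    result = employment_outcomes(jobs=100, seekers=1000, trained=n_trained)
    print(n_trained, result)
```

In this sketch, total employment stays at 100 no matter how many people are trained. Training 100 people looks like a large success for them, but only because they take jobs the untrained would otherwise have had, and training 500 people cuts the trained group's employment rate to 20%.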
Note: this isn’t a comprehensive look at the scaling issue. For more, see John List’s new book or his recent A16z interview (or his scholarly work on the issue here, here, and here). And my friends Mary Ann Bates and Rachel Glennerster (both formerly at J-PAL) wrote a great article about scaling and generalizability.
Maybe Research Isn't Worth It?
I’ve been critiquing short-termist EA for lacking a full-blown R&D agenda. But maybe that is unfair, as much research wouldn’t be a cost-effective way of donating money. After all, most ideas don’t work if rigorously evaluated. Maybe sponsoring research, if discounted by the probability of actually finding something that works, ends up not being anywhere near as good as giving to GiveDirectly.
That’s a great point. Let’s consider why it might be wrong.
First, GiveWell’s 2020 giving portfolio was ~$250M to interventions that presumably have around a 10x return over just giving cash (the GiveWell expectation is to find charities that are 5-15x as effective as cash). Imagine that with a $20 million investment in research, we could identify two new interventions that are 15x as effective as cash, rather than 5x, and therefore GiveWell could move donations to the 15x category rather than 5x. The additional social value from deploying, let's say, $100 million of GiveWell donations to those new interventions would be ($100m * 15) - ($100m * 5), or $1 billion. That would be a 50x return on the $20 million investment in research, and that's just counting one year's worth of giving!
Put another way, spending $20 million on research needs only a 2% chance of succeeding in order to have an expected value of paying for itself within the first year. Never mind future years (I don't want to bother with the effects of inflation, discount rates, etc.).
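For anyone who wants to vary the inputs, the back-of-the-envelope calculation above can be written as a small sketch. The function name and parameter names are mine, and the defaults are just the illustrative figures from the text ($20 million of research, $100 million of redirected giving, 5x versus 15x). It covers one year only and ignores inflation and discounting, as above.

```python
def research_botec(research_cost, redirected_giving, old_multiple, new_multiple):
    """One-year back-of-the-envelope value of research (illustrative only).

    Returns (extra_value, return_multiple, breakeven_probability), where
    values are in cash-equivalent dollars.
    """
    # Extra social value from moving donations to the higher-multiple intervention.
    extra_value = redirected_giving * (new_multiple - old_multiple)
    # Extra value per dollar of research spending.
    return_multiple = extra_value / research_cost
    # Chance of success at which expected extra value equals the research cost.
    breakeven_probability = research_cost / extra_value
    return extra_value, return_multiple, breakeven_probability


extra, multiple, p = research_botec(
    research_cost=20e6, redirected_giving=100e6, old_multiple=5, new_multiple=15
)
print(f"extra value: ${extra:,.0f}")              # extra value: $1,000,000,000
print(f"return on research: {multiple:.0f}x")     # return on research: 50x
print(f"break-even success probability: {p:.0%}")  # break-even success probability: 2%
```

Changing `new_multiple` or `redirected_giving` shows how sensitive the conclusion is to those guesses.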
Switch contexts to the United States, where the OpenPhil expectation is that after taking into account the diminishing marginal value of money, grants should have roughly a 1,000x payoff in order to be as good as GiveWell's recommendations. Imagine, if you will, that R&D could identify interventions that have a 2,000x return or even a 1,000,000x return (it's not hard to imagine that a 2017 grant to create a pan-coronavirus vaccine might have easily gotten such a return, if it had worked!).
The math is similar, but essentially a 2,000x return in the US is 2x better than GiveWell, and thus if $10 million in R&D found such a program/policy and then was used to drive $100 million in yearly giving, that would be an additional $190 million in benefits in the first year alone ($200m - $10m), for a 19x return on investment.
These numbers are obviously a bit arbitrary—as are all numbers about the expected value of future hypothetical interventions! But they nonetheless illustrate the fact that funding research on new interventions and policies might be highly cost-effective even if very few of the new ideas pan out.
[Caveat: I'm an advisor to the Social Science Research Council, which applied to OpenPhil for funding based on the above rationale. So take it with a huge grain of salt.]
Second, to be slightly less arbitrary: Michael Kremer et al. have a working paper from last year, in which they estimate the return on investing in “social science R&D,” based on data from USAID’s Development Innovation Ventures. They developed a model “for determining whether the return on an innovation portfolio exceeds a benchmark, such as the economy-wide return on capital or the opportunity cost of more conventional development assistance investments.”
Then, they applied this model to 41 R&D awards made between 2010 and 2012; all of the awards were for issues like safe water filters in Kenya, rural solar accessibility in Uganda, fighting tuberculosis in India, and the like.
It turned out that there was only enough data to estimate benefits from five of the innovations, but those alone “generated $281 million in social benefits,” compared to a $16 million cost for the entire portfolio. In other words, “setting aside any potential future benefits and any realized benefits of the other 36 innovations . . .benefits of these five innovations would have paid for the cost of the entire DIV portfolio at least 17 times over.”
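As a quick check on the quoted figures, here is a minimal sketch of the ratio. The function name is mine; the inputs are the $281 million in measured benefits and the $16 million portfolio cost cited above. Treating the 36 unmeasured awards as producing zero benefit is what makes this a lower bound.

```python
def benefit_cost_ratio(measured_benefits, portfolio_cost):
    """Lower-bound benefit-cost ratio: unmeasured awards count as zero benefit."""
    return measured_benefits / portfolio_cost


ratio = benefit_cost_ratio(measured_benefits=281e6, portfolio_cost=16e6)
print(f"benefit-cost ratio: {ratio:.1f}x")  # benefit-cost ratio: 17.6x
```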
This sort of benefit-cost ratio shouldn’t be a surprise. For example, as OpenPhil’s blog post observes, the Rockefeller Foundation saved “over a billion people from starvation” by employing Norman Borlaug as a plant scientist.
I could be missing something, but it seems like few (if any) EA organizations employ hard scientists who are directly working on issues like that. EA organizations are much more likely to employ scientists working on AGI issues that, while possibly important, are as yet quite speculative and sci-fi in nature. (Indeed, I suspect that whatever they are doing is not only hopeless, but might be as likely to cause future harm as future good--it's hard enough to measure the correct sign as to a medical or health intervention delivered to people right in front of you, let alone the sign of today's efforts as to AGI impacts on people in the far future).
Third, no such cost-benefit analysis is available as to many long-termist research questions (e.g., “how much weight should we place on philosophical arguments” or whether to diversify across different worldviews). I doubt many (or any) of those questions would pass muster under a reasonable cost-benefit analysis—after all, any effects of most philosophizing would be exceedingly diffuse. It still may be worth researching those questions, but you can’t justify such research on the ground that it is highly likely to lead to direct donations that are more impactful than GiveWell’s current list.
Fourth, as Holden Karnofsky points out, there’s a strong case for funding high-risk, high-reward projects that are up to 90+% likely to fail: “hits-based giving.” My one disagreement is that he says this “calls for approaching our giving with some counterintuitive principles — principles that are very different from those underlying our work on GiveWell.”
I’d argue that if we invest in RCTs, evaluations, and other forms of R&D that have a tiny chance of finding a cost-effective treatment/program/innovation, that research investment would be a hits-based way of furthering GiveWell’s aims. Thus, there is no contradiction here. Indeed, funding a broad range of empirical evaluations might be the best example of “hits-based giving.” The result would be a much broader set of interventions, programs, policies, etc., that feed into GiveWell’s analyses.
Fifth, funding more R&D could have positive spillovers in ways far beyond the direct effect of the intervention or program in question. For example, the mere act of getting program implementers to think about R&D questions could help them improve the program in the future (such as better targeting or delivery). Moreover, a greater focus on research could, in theory, influence other actors (policymakers, grantors, etc.) to care more about having solid evidence, thus hastening the discontinuation of ineffective programs while increasing the support for effective programs.
In conclusion, insofar as the EA movement focuses on short-term economic and human development, it needs a more robust, thoughtful, and thorough R&D agenda with three stages. The agenda should start with a vast number of possible innovations that might need some help to spin up in the first place (albeit with an eye towards what would ever be possible to scale); it should fund a number of pilot experiments; and the agenda should then recommend how to fund RCTs and other evaluations to show a program’s cost-effectiveness at scale.
To be clear, I'm sure there are lots of research grants that I've missed! And I do see bits and pieces of such an R&D agenda right now—akin to seeing 50 or 100 pieces of a 1,000-piece puzzle.
A good start, but not enough.
Thanks to Kerry Vaughan and to an anonymous person for comments. They don't necessarily agree with anything I said.
In general, I am sympathetic to this argument, but I think the BOTEC hinges on something that is far beyond implausible.
Let's say you can do a well-managed RCT on an intervention for a half million dollars. That seems pretty cheap, but not implausible for a developing country RCT. At that price, the post's $20 million buys about 40 RCTs, so finding two winners implies that you think that 5% of the best candidate not-already-exhaustively-researched interventions would turn out to be 50% better than anything we've found so far? That seems implausible, at best.
And even if you could find them, you wouldn't trust a single RCT - you'd need to do several more over time before you'd be willing to strongly trust that these have such high ROIs. And you would probably see reversion to the mean - but remember that the mean for development interventions isn't "as effective as cash" - it's much much less effective. So we're talking about finding good evidence for something that works many standard deviations better than the mean. Unless the distribution of quantifiable interventions looks very strange, that seems like a strange claim to make. And I'm certainly not going to claim that I think there is an efficient market in development economics, but I'm still very skeptical that billion-dollar bills are quite that easy to find.
Fair point! Perhaps a more modest standard would be appropriate -- i.e., "giving that produces a net positive effect in the world, something more than 1x."
If the bar is set so high, then obviously there will be almost nothing worth funding except for a minuscule set of interventions on a minuscule number of issues, and large foundations will be left with piles of money that they don't know what to do with. Meanwhile, the world still has lots of problems that need solving even if there's no 10X intervention in sight.
I don't think we were advocating leaving money on the sidelines for that reason - patient philanthropy is largely a different argument.
I think that we buy down the 10x interventions, then the 9x, 8x, etc. But even if they are not known, discovering those interventions may be possible without the same level of investment in RCTs.
I’m also sympathetic to the argument, but I think the BOTEC overstates the potential benefit for another reason. If GiveWell finds an opportunity to give $100 million per year at an effectiveness of 15x of cash transfers rather than 5x (and assuming there is a large supply of giving opportunities at 5x), I think the benefit is $200 million per year rather than $1 billion. The $100 million spent on the 15x intervention achieves what they could have achieved by spending $300 million on a 5x intervention. Of course, as noted, that is for only one year, so the number over a longer time horizon would be much larger.
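The adjustment in the paragraph above can be sketched as follows. The function name is mine, and the sketch relies on the same assumption the paragraph states: a large supply of giving opportunities at the 5x baseline. It measures the benefit in donor dollars saved rather than in cash-equivalent value.

```python
def donor_dollars_saved(giving, new_multiple, baseline_multiple):
    """Dollars of baseline-multiple giving made unnecessary by the better option."""
    # Spending needed at the baseline multiple to do as much good as
    # `giving` does at the new multiple.
    equivalent_baseline_spend = giving * new_multiple / baseline_multiple
    return equivalent_baseline_spend - giving


saved = donor_dollars_saved(giving=100e6, new_multiple=15, baseline_multiple=5)
print(f"donor dollars saved: ${saved:,.0f}")  # donor dollars saved: $200,000,000
# The same gain expressed in cash-equivalent units, valuing each saved
# dollar at the 5x baseline.
print(f"cash-equivalent value: ${saved * 5:,.0f}")  # cash-equivalent value: $1,000,000,000
```

On this reading, the $200 million and $1 billion figures describe the same gain in different units: dollars of 5x giving versus cash-transfer-equivalent dollars.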
Even with that adjustment, and considering the issues raised by David Manheim and other commenters, I find this post quite compelling – thank you for sharing it.
Aside: It's not funding research, as you are proposing, but I'm hoping the Unjournal will help encourage academic research in the right direction through:
- In choosing which research to evaluate and feature, prioritizing work relevant to the most effective interventions, that explicitly addresses cost and scalability, that provides transparent calculations and reasoning, Monte Carlo CEAs, etc.
- Through the evaluation and communication process, encouraging and helping researchers to do the above.
I enjoyed reading this, thank you for writing it. Two things:
Firstly, I wondered if you were aware of this recent GiveWell scoping grant to Precision Development (PxD) which explores something very close to what you're suggesting - it's asking them to come up with an evaluation design (which could be an RCT) for their work on providing information to smallholder farmers, which GiveWell is then open to funding ("we think there's a 70% chance we will provide a grant to fund implementation, and evaluation of PxD's agriculture program...40% chance we'll provide a grant of $30 million or more..."). This isn't a full recent research agenda, but it could be a peer-review suitable RCT (depending of course on what PxD propose). How close is this to what you're advocating?
Secondly, if you wanted to submit this to the Open Philanthropy cause exploration prizes as is or after any changes, it is eligible.
You gave lots of good examples of low-cost high-impact interventions like water chlorination, vaccines and lead removal. I agree that there are far more examples like that, particularly in health & medicine, which we already know about but where the scale of the benefits is underestimated. Water chlorination is a particularly good example because it's one where the large benefits were expected by experts but were surprising to others.
And thank you for linking to my article on RCTs, the arguments you made above were actually a big part of the reason that I wrote that!
I think the thesis is right (EA should fund RCTs), but we actually shouldn't believe RCTs will find interventions better than those currently being funded.
I will also restrict my analysis to global health/development as you did. A simple model of cost effectiveness is as follows:
cost effectiveness = (severity of problem instance * probability of solution from one instance of intervening) / cost of one instance.
I think this is straightforward but I'll try to justify it here. Cost effectiveness is by definition effectiveness over cost. Effectiveness is the expected value of the intervention. The expected value is the good done per success * probability of success. Therefore, if we define problem severity as "amount of good done by solving the problem" - which I think is intuitive - the model holds.
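As a minimal sketch, the model can be written as a function. The function name and the numbers below are made up purely for illustration (they are not estimates for any real intervention); the point is only that each of the three inputs moves the result proportionally.

```python
def cost_effectiveness(severity, success_probability, cost_per_instance):
    """Expected good done per dollar, following the simple model in the text."""
    if cost_per_instance <= 0:
        raise ValueError("cost_per_instance must be positive")
    return severity * success_probability / cost_per_instance


# Hypothetical baseline: severity in arbitrary units of good, cost in dollars.
base = cost_effectiveness(severity=1.0, success_probability=0.01, cost_per_instance=5.0)

# Three ways to double cost effectiveness, one per input.
more_severe = cost_effectiveness(2.0, 0.01, 5.0)
more_likely = cost_effectiveness(1.0, 0.02, 5.0)
cheaper = cost_effectiveness(1.0, 0.01, 2.5)

for label, value in [("severity x2", more_severe),
                     ("probability x2", more_likely),
                     ("cost / 2", cheaper)]:
    print(f"{label}: {value / base:.1f}x baseline")
```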
Looking at this model we can see that to find interventions more cost effective than the ones currently being funded the intervention would have to do at least one of the following:
1. Solve a problem more severe than the problems currently funded
2. Have a higher success probability than the interventions currently funded
3. Be cheaper than the interventions currently funded
1 seems pretty unlikely. The best currently funded interventions (let's just use the GiveWell top charities list as an example) address the following problems:
- Child mortality - malaria, vitamin A deficiency, vaccines
- Blindness - Deworming
- Poverty - GiveDirectly - I'll leave this one aside because it's the benchmark
You'll be hard pressed to find problems where a single instance of solution does more good than preventing a single death or a single instance of blindness. The only thing I can think of would be infectious disease intervention where one person not being infected means that multiple lives are saved at the margin by preventing transmission. Importantly, malaria and vaccines already have this quality, so the only possible improvement would be changing to a more prevalent infectious disease, but the only very deadly/prevalent infectious disease not being heavily intervened on above is HIV/AIDS, which right now is very expensive to prevent or treat.
2 is also very tough. The basic formula for success probability in the currently funded interventions is
This is how anthelmintics, vaccines, vitamin A supplementation, and malaria prevention work.
I can only see two ways to improve this formula: treating more common ailments (the problem is there aren't any) or targeting more effectively, for example by giving bednets only to the particular people who would have gotten deadly malaria without them. The problem here is that even if it is possible to predict who will get malaria, which children are vitamin A deficient, who will get pneumonia, or who will have schistosomiasis, the data collection needed to make those predictions would probably cost more than the gains from targeting.
That leaves 3, but the popular interventions are already so cheap in dollars per instance.
But I started by saying I agree. The reason is that the current interventions are basically fully funded. GiveWell is holding money because they can't spend as fast as they receive donations. We don't need to find interventions that are better than the standards; we need to find new effective interventions, period. RCTs are the way to do this, and no organization is funding RCTs with the explicit goal of finding cost-effective interventions that can be implemented by granters. The main funders of RCTs are governments funding the academy with the goal of advancing knowledge, which of course is extremely important. But this leads to RCTs that are often undertaken in support of theory (see microfinance), or to funding of RCTs based in the host country rather than in developing countries, where problems are more severe and interventions are cheaper. An organization that exclusively funds RCTs with an EA focus would be filling a huge need in the EA community.
Thanks for sharing this! It touched on a lot of the topics that I've recently found most challenging in thinking about the use of RCTs in development economics and offered some helpful perspectives. Below I've outlined a few challenges.
EAs creating impact through marginal grants
To me, it seems highly improbable that a new intervention is in expected value terms:
yet is still struggling to find funds in the current RCT funding landscape.
So I think it is unlikely that EA funds for RCTs will precipitate new, high-quality opportunities that generate giving opportunities at the same cost-benefit as GiveWell’s current recommendations on the margin.
One might think that in the “market place” of development economics grants, grants tend to be allocated to the most promising programs on which we have the least information. The marginal grant is probably far less impactful than the average, in a heavily right-tailed impact area. For this reason, the ROI calculation looks very optimistic to me.
In fact, I think that currently too many RCTs are happening in benefit-cost terms and that if EA funds are held to their traditionally high ROI standard, there will be few opportunities to fund new RCT research. Actually, I’m somewhat pessimistic that many interventions that are both 5x+ as effective as cash transfers and plausibly scalable exist and have not been found (the low-hanging fruit has probably been exhausted). However, if these exist and the recipient couldn't have accessed a grant but for EA support, we should definitely spend money on them. I am just unconvinced that 1) we should treat this as an area for substantive EA engagement and 2) that if we think like Bayesians who are interested in medium-long-term impact, RCTs are necessarily cost-effective relative to other forms of evidence-generating activities such as pseudo-experiments / looking at correlation evidence.
Of course, the RCT funding landscape is far from a perfect market. My personal experience (at least in the development economics space) is that RCTs tend to be led by relatively senior development economists who have the record, partnerships and experience in-country to foster government buy-in and successfully manage an RCT. These same attributes mean that they are well-placed to find or create funding opportunities with partner governments, their home government and with NGOs so the marginal EA grant is probably barely impactful.
By contrast, I am more optimistic that grants to PhD or early researchers and researchers from small institutions/independent researchers might be more impactful, as these researchers have fewer signals of their capacity to bring to the grant market (so they might have good ideas that are unfunded).
EAs creating impact through changing the research agenda
I am sympathetic to the argument that EA grants in this space might cause better research to happen. For example, I think that funding Kremer’s meta-analysis was probably worthwhile as it helped to normalise the use of non-RCT methodologies in the RCT-crowded development economics space.
That said, I actually think that the literature is very responsive to critiques such as those raised about reporting cost-effectiveness and general equilibrium effects, and is rapidly self-correcting. For example, the norm in development economics RCTs is transitioning to expecting cost-effectiveness analyses (or that is the case in the papers, workshops and review processes of which I am part). Studies on general equilibrium effects and spillovers are also gaining traction in the top 5 econ journals, for example this recent paper in Econometrica. Indeed, my personal view is that economists have gotten better at rigorously evaluating spillovers due to the robust critiques of epidemiologists (see the worm wars). I’m not sure, for this reason, that EA should be allocating many funds to trying to push the development economics literature in a given direction, as it seems to already be quite responsive to internal critiques.
More importantly, I am pretty convinced that development economics is quite EA-aligned, and probably relatively more aligned than the academic disciplines concerned with longtermist cause areas (most notably AGI). With Esther Duflo, Abhijit Banerjee and Michael Kremer winning the Nobel Prize in Economics, I think it is pretty clear that the field aspires to have a large impact on improving lives in the developing world through extremely rigorously evaluated approaches (and the peer review system for microeconometric evaluations seems to expect an extremely high degree of rigour). Hence, I'm not sure that EA has much room to create impact by using funds to re-align the field to focus on areas of interest to EAs, as it seems to be pretty close to what we would hope for.
In fact, I think that the primary point of non-alignment with EA is that development economics has a much greater focus on randomista development strategies than on macroeconomic development strategies. That is, relative to the EA median, the field probably focuses too much on the use of RCTs and rigorous evidence at the expense of evaluating macro policy, which is almost impossible to evaluate using RCTs. Therefore, I think there are probably more promising cases for research grants in the relatively neglected space of macroeconomic development policy, rather than new RCTs.
Lots of great points here, and I agree with much of what you say!
Just to clarify, I didn't mean to focus solely on RCTs, and tried throughout to use broader terms like R&D, or "other forms of evaluation," or "difference-in-differences," so as to encompass other research methods that might be more suitable for everything from 1) trying to develop a new program, to 2) evaluating country-wide policies that could never be subjected to an RCT.
Not sure how useful this is but I tried to develop a model to help us decide between carrying out our best existing interventions and carrying out research into potentially better interventions: https://forum.effectivealtruism.org/posts/jp3yaQczFWk7yiNXz/to-fund-research-or-not-to-fund-research-that-is-the
My key takeaway was that the longer the timescale we care about doing good over, the better research is relative to carrying out existing interventions. This is because there is a greater period over which we would gain from a better intervention.
As someone with long timescales I’m therefore very on board with more research!
TLDR: more practical applications of existing research.
I think that these days everything competes for attention ("attention economy").
I think that popularising existing research and funding new research can go side by side.
I'm more on the practical side, implementing what we know so far.
Just like a brilliant product: will it go to market organically, or will it require a marketing push? The same analogy applies to research: more mainstream attention, popularisation, impact, and getting on Joe Rogan and Lex Fridman can in turn provide more funds and interest for new research.
Overall it seems it is a balance - more new research will naturally trigger more new high quality research and more new real-life implications.
Another benefit I can think of - INDEPENDENCE - whenever something is sponsored by someone I wonder about incentives and spheres of influence.
Interesting to see the skepticism about the existence of currently unknown interventions with high ROI (e.g., large impacts on reducing mortality). There seem to be a very large number of problems for which we have not yet identified effective interventions. For example, given what we know about the effectiveness of Covid vaccines at averting death, and the fact that, despite availability, currently only about 18% of residents of LICs have received even just one dose of a Covid vaccine, an intervention that cost-effectively increased Covid vaccination demand would seem to be well worth the investment in R&D to find such an intervention. The same is presumably true for interventions that cost-effectively increase demand for/take-up of other life-saving vaccinations.
I wouldn't say there is skepticism that there are interventions with high ROI, only that there are interventions with ROI higher than the currently known ones. Medicare pays $45 for a two-dose course of a COVID vaccine. Compare this to the $7 cost of a bednet and you have to ask: is a COVID vaccine more than 6x as likely to save a life as a bednet? Plus the search cost of the RCTs has to be recouped. You are saying work on the demand side, but I think the evidence from many interventions shows the best way to increase uptake is to meet people where they are and provide the product/service for free.
CGD reports the cost of an AstraZeneca-like Covid vaccine (0.75 efficacy) as $3/dose. But perhaps more to the point, the costs of developing, producing, and distributing Covid vaccines are by now largely sunk. They are widely available free of cost in areas with low vaccination rates. Yet vaccination rates in LMIC remain stubbornly low (and aren't that great in many HIC). We don't know which interventions might work to increase vaccination rates. We have some evidence that scaled SMS campaigns increase Covid vax rates in the US; could that work in LMIC? We don't know! Social signaling increased childhood immunization rates in Sierra Leone. Could social signaling increase Covid vax rates in LMIC? We don't know!
In order for an intervention that increases Covid vax rates to have greater than 1000x ROI, we need each $100K invested in a vaccine uptake intervention to yield greater than $100M in benefits. Using the Open Philanthropy benchmarks of 32 DALYS per adult death, and $100,000 per DALY, averting an adult death is worth approximately $3.2M. Assuming reducing mortality is the only benefit of Covid vaccine uptake, for each $100K in the costs of an intervention to increase Covid vax uptake, we would need to avert more than 31.25 adult deaths ($100M/$3.2M). Recent estimates of excess mortality rates due to Covid in LIC range as high as 0.007. A population-wide COVID mortality rate of even 0.004 implies that we would see 32 deaths in a population of 8,000 unvaccinated adults. With vaccine efficacy of 0.75, we would need to fully vaccinate about 10,667 adults to avert 32 deaths.
For an investment of $100,000, we therefore need each full vaccination (as a result of an intervention) to cost no more than about $9.37. Given that SMS and social signaling campaigns are both relatively low cost (and if anything have diminishing marginal costs), and that there are presumably other interventions that could overcome vaccine hesitancy, that seems within reach.
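For anyone who wants to check or vary the numbers, here is a minimal Python sketch of the back-of-envelope calculation above. The function name and parameter names are my own (hypothetical), the default values are the ones quoted in this comment, and rounding the required deaths averted up to a whole number is an assumption made to match the "32 deaths" step.

```python
import math


def break_even_cost_per_vaccination(
    budget=100_000,          # dollars spent on the uptake intervention
    target_roi=1000,         # required benefit-to-cost multiple
    dalys_per_death=32,      # benchmark quoted above
    value_per_daly=100_000,  # dollars per DALY, benchmark quoted above
    mortality_rate=0.004,    # assumed Covid mortality among unvaccinated adults
    efficacy=0.75,           # assumed vaccine efficacy against death
):
    """Return the intermediate steps and the maximum cost per full vaccination."""
    required_benefit = budget * target_roi               # $100M
    value_per_death = dalys_per_death * value_per_daly   # $3.2M
    exact_deaths = required_benefit / value_per_death    # 31.25
    deaths_to_avert = math.ceil(exact_deaths)            # rounded up to 32
    # Unvaccinated adults among whom we would expect that many deaths.
    population = deaths_to_avert / mortality_rate        # about 8,000
    # The vaccine averts only a share of those deaths, so more people are needed.
    vaccinations_needed = population / efficacy          # about 10,667
    return {
        "exact_deaths": exact_deaths,
        "deaths_to_avert": deaths_to_avert,
        "vaccinations_needed": vaccinations_needed,
        "max_cost_per_vaccination": budget / vaccinations_needed,
    }


result = break_even_cost_per_vaccination()
for name, value in result.items():
    print(f"{name}: {value:,.2f}")
```

Skipping the rounding step (using 31.25 deaths rather than 32) raises the break-even cost slightly, to about $9.60 per full vaccination, so the conclusion does not hinge on that choice. The result is much more sensitive to the assumed mortality rate and efficacy, which can be varied through the function arguments.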