
This is amazing work! I have a bunch of thoughts, which I'll number so it's easier for you or others to respond to. Sorry that this comment is a bit long; you can respond to the numbers one-by-one instead of all at once if you'd like:

  1. I would love to hear what GiveWell's response is to your findings here. As you show, I think there's a strong case to be made for why StrongMinds should be a GiveWell top charity. I'm definitely not an expert in this though, and maybe there are good reasons GiveWell or others have for why StrongMinds shouldn't be a GiveWell top charity.

    It will likely take some time before GiveWell would be able to make StrongMinds a top charity, but it would be exciting if StrongMinds (or any mental health charity) could make it to GiveWell's list of recommended charities as early as the end of 2022. It would also be nice to hear from GiveWell about whether they:
    1. do push through with doing moral weights work related to SWB (which I think would be valuable and important to do)
    2. will assess the cost-effectiveness of task-shifted psychotherapy in low- and middle-income countries, i.e. by creating a program review on it
    3. plan on doing anything else with regards to assessing interventions in terms of SWB
  2. I'm looking forward to HLI's report on the cost-effectiveness of deworming in terms of SWB! I'm also looking forward to your reports on other interventions you think might be just as cost-effective (or even better) than StrongMinds.
  3. Just thought I'd share this here: I and Shen Javier from EA Philippines recently got a grant from the EA Infrastructure Fund to lead a 6-month part-time research project to find the top mental health charity ideas in the Philippines. This is our follow-up project as part of our participation in Charity Entrepreneurship's 2021 incubation program, specifically their region-specific research track. We have recently hired 3 part-time research analysts (all with more background in mental health and research than Shen and I) to help us with this project.

    Our goal is to find 1-2 charity ideas that are highly cost-effective to implement in the Philippines (and competitive with StrongMinds), and that Charity Entrepreneurship will be willing to incubate in 2022. These reports of yours will likely be very useful for us, which is why I took the time to read this report and browse through some of the others linked here. And I can see how we can build off and learn from this research in various ways. We'll probably email you within the next couple of weeks to schedule a call with you and/or Joel, with more specific questions about HLI's research and to get advice about our project!
  4. You talk about how you estimate the long-term effects on SWB of StrongMinds and psychotherapy via modeling it as exponentially decaying. I want to understand this better - does this mean you are saying that improvements to subjective wellbeing are high post-treatment, and then basically decrease a lot (eventually to zero) over time? Is this saying that these beneficiaries become depressed again, or is this saying that their continuous lack of depression is more going to be due to other factors (and not due to the initial StrongMinds program)? If none of what I wrote above is correct, feel free to explain it better.
  5. Hope it's okay I ask this publicly, but feel free to respond privately if you'd like: How much time did each of these reports take (on cash transfers, psychotherapy, StrongMinds, and this summary report)?
  6. (Optional) Do you see HLI possibly becoming like GiveWell in the future, but with charity recommendations based on which ones improve subjective wellbeing the most? Although maybe you'd want another organization to do the charity analysis, while you focus on the intervention analysis or bigger-picture research questions.

    This might make sense as a vision if GiveWell doesn't plan on recommending some charities that do well on improving SWB (e.g. StrongMinds). Hopefully GiveWell does though.
  7. (Optional) I'm not sure if you'll answer this, since you probably are quite uncertain, but do you tentatively think that donating to StrongMinds would be better than donating to any of the life-saving GiveWell top charities (e.g. AMF)? I know you've written about this before here, but I wonder if you have updated views on this.

Brian, I am glad to see your interest in our work! 

1.) We have discussed our work with GiveWell. But we will let them respond :).

2.) We're also excited to wade deeper into deworming. The analysis has opened up a lot of interesting questions. 

3.) I’m excited about your search for new charities! Very cool. I would be interested to discuss this further and learn more about this project. 

4.) You’re right that in both the case of CTs and psychotherapy we estimate that the effects eventually become zero. We show the trajectory of StrongMinds effects over time in Figure 5. I think you’re asking if we could interpret this as an eventual tendency towards depression relapse. If so, I think you’re correct since most individuals in the studies we summarize are depressed, and relapse seems very common in longitudinal studies. However, it’s worth noting that this is an average. Some people may never relapse after treatment and some may simply receive no effect. 
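The decay model described here can be sketched in a few lines. The parameter values below are made up purely for illustration (they are not HLI's actual estimates); the point is just that under simple exponential decay, the total benefit is the initial effect divided by the decay rate:

```python
import math

def effect_at(t_years, initial_effect_sd, decay_rate):
    """Effect on SWB (in SDs) t years post-treatment, assuming the
    effect decays exponentially toward zero."""
    return initial_effect_sd * math.exp(-decay_rate * t_years)

def total_effect(initial_effect_sd, decay_rate):
    """Total SD-years of benefit: the integral of d0 * exp(-r*t)
    from 0 to infinity, which equals d0 / r."""
    return initial_effect_sd / decay_rate

# Illustrative numbers only, not HLI's fitted parameters:
d0, r = 1.6, 0.7   # initial effect in SDs; annual decay rate
print(round(effect_at(5, d0, r), 3))   # effect is mostly gone by year 5
print(round(total_effect(d0, r), 3))   # total SD-years of benefit
```

On average the effect never quite reaches zero in this model, but it becomes negligible within a few years; as noted above, the average masks some people who never relapse and some who receive no benefit at all.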

5.) I'll message you privately about this for the time being. 

6.) In general we hope to get more people to make decisions using SWB. 

7.) I am going to pass the buck on making a comment on this :P. This decision will depend heavily on your view of the badness of death for the person dying and if the world is over or underpopulated. We discuss this a bit more in our moral weights piece. In my (admittedly limited) understanding, the goodness of improving the wellbeing of presently existing people is less sensitive to the philosophical view you take. 

 

Thanks for the flag, Joel. 

Brian, our team is working on our own reports on how we view interpersonal group therapy interventions and subjective well-being measures more generally. We expect to publish our reports within the next 3-6 months. 

We have spoken to HLI about their work, and HLI has given us feedback on our reports. It’s been really helpful to discuss this topic with Michael, Joel, and the team at HLI. Their work has provided some updates to how we view this topic, even if we do not ultimately end up reaching the same conclusions.

We’re still looking into this area and some of the important questions HLI has raised. While we plan to provide a more detailed view once our reports are published, a few areas where we differ from HLI are below:

  • The most concrete methodological difference between our approach and HLI’s is that we have a stronger view that there are larger intra-household spillovers for cash transfers than for therapy. HLI assumes any spillover effects to other household members are proportional across interventions—i.e., if a cash transfer benefits other household members’ subjective well-being x% as much as it benefits the recipient, the same is true for therapy. We have a strong prior that, since income is shared within households, other household members would benefit much more from a cash transfer than from therapy delivered to one of its members. If there are 4 to 5 members per household (roughly what we estimate for participants in GiveDirectly's program) and there were no household multipliers from therapy, psychotherapy would be 2x-3x as cost-effective as cash transfers, taking HLI’s other assumptions as given. HLI has flagged this as an uncertainty and something they may look further into, and we would be interested to see what they find. It seems possible there are significant spillovers from therapy, but our current best guess is these would be much smaller than for cash.
  • Another concrete difference is that we’re not sure comparing standard deviation effects across interventions is appropriate. HLI compares the effect on well-being in standard deviations (SDs) across cash transfer and therapy interventions. Our impression is that therapy interventions like StrongMinds target individuals with depression, who might be concentrated at the lower end of subjective well-being scales, to a greater extent than cash transfer interventions do. If this is the case, then the measure of well-being per SD may be lower for therapy than for cash transfer interventions. For example, a SD for psychotherapy may be a 0.5 on a 10-point scale of well-being, while the SD for cash transfers may be 1. HLI notes this as an uncertainty, and we’d be interested in more evidence on this, too.
  • We put less weight on subjective well-being as a measure to compare different interventions. As HLI notes, subjective well-being measures could provide a common currency for comparing interventions like cash and therapy. We agree this is a useful perspective. However, we think there are some limitations to these measures, and give weight to other factors. I expect that GiveWell would find interventions that have benefits beyond subjective well-being more cost-effective than HLI would. As a result, even if a subjective well-being approach showed a program was more cost-effective than programs we’d expect to recommend marginal funding to, we wouldn’t necessarily make a recommendation based on that alone (though we would give it some weight).
  • When we’ve looked at group therapy from perspectives other than comparing different interventions' effects on subjective well-being measures, we find much lower cost-effectiveness than HLI is finding. We’ve considered a few different angles. First, we’ve looked at the effect of therapy under our current moral weights, which we use to trade off outcomes like increasing consumption, averting deaths, and averting morbidity. Under this approach and using the effect of depression on DALYs, we find similar cost-effectiveness between therapy and cash transfers. Second, we’ve taken a very shallow look at a recent trial that compares cash transfers with therapy head-to-head. This trial finds a much smaller effect of therapy vs. cash. Third, it seems intuitively surprising that a $1,000 cash transfer for a household in a low-income country would have a substantially smaller effect than 3-4 months of therapy (0.5 SD for cash transfers vs. 1.6 SD for therapy in HLI's study). If the average GiveDirectly household has consumption of $2,000-$3,000, a $1,000 transfer amounts to a 30%-50% increase in baseline consumption for an entire household. We would guess that if we, for example, gave households the choice between 3-4 months of therapy and a $1,000 transfer, many more would choose the transfer. While the above approaches definitely have drawbacks and we don’t think we should put 100% weight on them either, the implications seem substantially different enough from HLI’s estimates that they give us some additional skepticism.
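The SD-comparability concern in the second bullet can be made concrete with the numbers above. Note that 0.5 and 1 point per SD are GiveWell's hypothetical illustration, not measured values:

```python
def effect_in_scale_points(effect_sd, points_per_sd):
    """Convert an effect size in standard deviations into points on
    the underlying 0-10 well-being scale."""
    return effect_sd * points_per_sd

# Effect sizes from the comparison above, with hypothetical SD widths:
therapy_points = effect_in_scale_points(1.6, 0.5)  # depressed sample: narrower SD
cash_points = effect_in_scale_points(0.5, 1.0)     # general sample: wider SD
print(therapy_points, cash_points)
```

If the hypothetical SD widths were right, the 3.2x gap in SD units (1.6 vs 0.5) would shrink to a 1.6x gap in scale points (0.8 vs 0.5), which is why evidence on the actual SDs in each study population matters.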

We still have a lot of uncertainty about how to compare different interventions like cash transfers and therapy, and making these comparisons is crucial to our decisions on what funding opportunities to recommend to our donors. As a result, we hope to continue to discuss this topic with individuals who have a differing view than us on our moral weights so that we can continue to refine our approach.

We look forward to engaging once we publish a fully vettable report. Until then, I hope this answers the immediate questions you have about where the views of GiveWell and HLI differ.

Before I respond to the details, I’d like to thank GiveWell for engaging with these questions. I’m delighted our research has led to them producing their own reports into group psychotherapy and using subjective wellbeing to determine one's moral weights. 

GiveWell kindly shared a draft of their reply with us in advance and we made several comments clarifying our position. However, they decided to publish their original draft unchanged (without offering a further explanation) so we're restating our comments here so that readers can build a better understanding of where our positions differ. I’ll split these up so it’s easier to follow, first quoting the response from GiveWell, then providing our reply.

We would guess that if we, for example, gave households the choice between 3-4 months of therapy and a $1,000 transfer, many more would choose the transfer. 

To clarify, the intervention is not to provide therapy to anyone, it's just to provide it to those who are depressed. I expect that even some depressed people would choose cash over therapy. But it's reasonable to assume people don't always know what's best for them and under/overconsume on certain goods due to lack of information, etc. That's why we need studies to see what truly improves people's subjective well-being.

If one was serious about always giving people what they choose, then you would just give people cash and let them decide. Given that GiveWell claims that bednets and deworming are better than cash, it seems they already accept cash is not necessarily best. Hence, it’s unclear how they could raise this as a problem for therapy without being inconsistent.

Third, it seems intuitively surprising that a $1,000 cash transfer for a household in a low-income country would have a substantially smaller effect than 3-4 months of therapy (0.5 SD for cash transfers vs. 1.6 SD for therapy in HLI's study).

What I think might have been overlooked here is that therapy is only being given to people diagnosed with mental illnesses, but the cash transfers go to poor people in general (only some of whom will be depressed). Hence, it's perhaps not so surprising that directly treating the depression of depressed people is more impactful than giving out money (even if those people are poor). If you were in pain but also poor, no one would assume that giving you money would do more for your happiness than morphine would.

Second, we’ve taken a very shallow look at a recent trial that compares cash transfers with therapy head-to-head. This trial finds a much smaller effect of therapy vs. cash.

We account for this trial in our meta-analysis - if we hadn’t incorporated it, therapy would look even a bit more cost-effective. Of course, the point of meta-analyses is to look at the whole evidence base, rather than just selecting one or two pieces of evidence; one could discount any meta-analysis this way by pointing to the trial with the lowest effect.

We don’t think one study should overshadow the results of a meta-analysis, which aggregates a much wider set of data ("beware the man of only one study" etc). If there was one study finding no impact of bednets, I doubt GiveWell would conclude it would be reasonable to discount all the previous data on bednets.

First, we’ve looked at the effect of therapy under our current moral weights, which we use to trade off outcomes like increasing consumption, averting deaths, and averting morbidity. Under this approach and using the effect of depression on DALYs, we find similar cost-effectiveness between therapy and cash transfers.

What is the conversion rate here between DALYs and income increases, and on what is it based? I'm not sure what method could be used here except inputting one's intuitions. In which case, it would be good to make that clear, as people may think the conversion rate is an authoritative fact, rather than (just) an opinion. It would be interesting to state how much readers' opinions would need to differ from GiveWell’s to reach alternative conclusions!

To bang a familiar drum, the reason to use subjective wellbeing measures is that we can observe how much health and income changes improve wellbeing, rather than having to guess.

However, we think there are some limitations to these measures, and give weight to other factors. I expect that GiveWell would find interventions that have benefits beyond subjective well-being more cost-effective than HLI would.

It's not easy to respond to this - it's not stated what the limitations and other factors are.

More generally, there's no reason to think in the abstract that, if you're pluralist rather than monist about value, this changes the relative cost-effectiveness ranking of different actions. You'd need to provide a specific argument about what the different values are, how each intervention relatively does on this, and how the units of value are commensurate.

For example, imagine a scenario where intervention X provides 15 units of happiness/$ but does nothing for autonomy and intervention Y provides 10 units of happiness/$ and 10 units of autonomy/$. If we take one unit of happiness as being as valuable as one unit of autonomy, then Y is better than X. However, someone who only valued happiness would think X is better.

It seems possible there are significant spillovers from therapy, but our current best guess is these would be much smaller than for cash.

It would be helpful if GiveWell could share what their current best guess is. Even if spillovers are 30% for therapy and 100% for cash, assuming the original 12x multiple and 3 other household members, then the multiple would still be 5.7.

Cash = 1 + (1*3) = 4

Psychotherapy = 12 + ((0.3*12)*3) = 12 + 10.8 = 22.8

Hence, therapy still looks quite a bit better even if the spillover effects are small.
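The arithmetic above can be written out as a short calculation (the 12x multiple, 30%/100% spillover fractions, and 3 other household members are the illustrative values from the text):

```python
def total_household_effect(direct_effect, spillover_fraction, other_members):
    """Direct effect on the recipient, plus each other household
    member receiving spillover_fraction of that effect."""
    return direct_effect + spillover_fraction * direct_effect * other_members

cash = total_household_effect(1, 1.0, 3)       # 1 + (1 * 3) = 4
therapy = total_household_effect(12, 0.3, 3)   # 12 + (0.3 * 12 * 3) = 22.8
print(round(therapy / cash, 1))                # 5.7
```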

HLI assumes any spillover effects to other household members are proportional across interventions—i.e., if a cash transfer benefits other household members’ subjective well-being x% as much as it benefits the recipient, the same is true for therapy. 

If there are 4 to 5 members per household (roughly what we estimate for participants in GiveDirectly's program) and there were no household multipliers from therapy, psychotherapy would be 2x-3x as cost-effective as cash transfers, taking HLI’s other assumptions as given.

This is definitely not what we think, particularly the assumption that it will be proportional across 'any' intervention! I'm not sure why someone would believe that.

Our position, as outlined in this twitter thread, is quite a bit more nuanced. There wasn't much evidence we could find on household spillovers - five studies for cash, one for mental health - and in each case it indicated very large spillover effects, i.e. in the range that other household members got 70-100% of the benefit the recipient did. We didn't include that in the final estimate because there was so little evidence and, if we'd taken it at face value, it would only have modestly changed the results (making therapy 8-10x better). Even in the extreme, and implausible, case where therapy has no household spillovers, it wouldn't have overturned the result that psychotherapy is more cost-effective than cash transfers. We discussed this in the individual cost-effectiveness analysis reports and flagged it as something to come back to for further research.

We agree that the effects of household spillovers from cash are large. Where our priors may diverge is that HLI (and others) think that the spillovers from therapy are also large, whereas GiveWell is very sceptical about this. We are now conducting a thorough search for more evidence on household spillovers, so we are not just swapping priors.

We have published an updated cost-effectiveness comparison of psychotherapy and cash transfers to include an estimate of the effects on other household members. You can read a summary here.

For cash transfers, we estimate that each household member experiences 86% of the benefits experienced by the recipient. For psychotherapy, we estimate the spillover ratio to be 53%.

After including the household spillover effects, we estimate that StrongMinds is 9 times more cost-effective than GiveDirectly (a slight reduction from 12 times in our previous analysis). 
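Plugging the updated spillover ratios into the same household calculation reproduces roughly the reported figure. I'm assuming 3 other household members and carrying over the earlier 12x direct multiple for illustration; the actual updated model may differ:

```python
def household_effect(direct, spillover_ratio, other_members):
    """Recipient's effect plus each other member receiving
    spillover_ratio of the recipient's effect."""
    return direct * (1 + spillover_ratio * other_members)

# Updated spillover estimates: 86% for cash, 53% for psychotherapy.
cash = household_effect(1, 0.86, 3)      # ~3.58
therapy = household_effect(12, 0.53, 3)  # ~31.08
print(round(therapy / cash, 1))          # ~8.7, close to the reported 9x
```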

There is much to be admired in this report, and I don't find it intuitively implausible that mental health interventions are several times more cost-effective than cash transfers in terms of wellbeing (which I also agree is probably what matters most). That said, I have several concerns/questions about certain aspects of the methodology, most of which have already been raised by others. Here are just a few of them, in roughly ascending order of importance:

  1. Outcomes should be time-discounted, for at least two reasons. First, to account for uncertainty as to whether they will obtain, e.g. there could be no counterfactual benefit in 10 years because of social upheaval, catastrophic events (e.g. an AI apocalypse, natural disaster), or the availability of more effective treatments for depression/ill-being/poverty. Second, to account for generally improving circumstances and opportunities for reinvestment: these countries are generally getting richer, people can invest cash transfers, etc. This will be even more important when assessing deworming and other interventions with benefits far in the future. (There is probably no need to discount costs as it seems they are incurred around the time the intervention is delivered in both cases.)
  2. I've only skimmed the reports, but it isn't clear to me what exactly is included in the costs for StrongMinds, e.g. sometimes capital costs (buildings etc), or overheads like management salaries and rent, are incorrectly left out of cost-effectiveness analyses. If you haven't already, you might also want to consider any costs to the beneficiaries, e.g. if therapy recipients had to travel, pay for materials, miss work, etc. As you note, most of the difference in the cost-effectiveness is determined by the programmes' costs rather than their consequences, so it's important to get this right (which you may well have done).
  3. You note that both interventions are assessed only in terms of their effect on depression. A couple years ago I summarised the findings of the four available evaluations of GiveDirectly in an unpublished draft post (see Appendix 2.1, copied below, and the "GiveWell" subsection of section 2.2, the relevant part of which is copied below). The studies recorded data on many other indicators of wellbeing, which were sometimes combined into indices of "psychological wellbeing" with up to 10 components (as well as many non-wellbeing outcomes like consumption and education). Apologies if you explain this somewhere, but why did you only use the data on depression? Was it to facilitate an 'apples to apples' comparison, or something like that? If so, I wonder if that was loading the dice a bit: at first blush, it seems unfair to compare two interventions in terms of outcome A when one is aimed solely at improving outcome A and the other is aimed at improving outcomes A, B, C, D, E, F, G and H (at least when B–H are relevant, i.e. indicators of subjective wellbeing).
  4. I share others' concerns about the omission of spillovers. In the draft post I linked above (partly copied below), I recorded my impression that the evidence so far, while somewhat lacking, suggests only null or positive spillovers to other households (at least for the current version of the programme, which 'treats' all eligible households in the village). As part of a separate project I did last year (which I'm not allowed to share), I also concluded that non-recipients within the household benefited considerably: "Only about 1.6 members of each household (average size ~4.3) were surveyed to get the wellbeing results, of which only 1 actually received the money. There was no statistically significant wellbeing difference between the recipients and surveyed non-recipient household members, and there is evidence of many benefits to non-recipients other than psychological wellbeing (e.g. education, domestic violence, child labour). Nevertheless, we expect the effects to be a little lower among non-recipients…" Omitting the inter-household spillovers is perhaps reasonable for the primary analysis, but it seems harder to justify ignoring benefits to others within the household.
  5. Whatever may be justified for the base case, I don't understand why you haven't done a proper sensitivity analysis. Stochastic uncertainty is captured well by the Monte Carlo simulations, but it is standard practice in many fields (including health economics) to carry out scenario analyses that investigate the effects of contestable structural and methodological assumptions. It should be quite straightforward to adapt the model so as to include/exclude (or vary the values of) spillovers, non-depression data, certain kinds of costs, discount rates, etc. You can present the results of these analyses yourself, but users can also put their own set of assumptions into a well-constructed model to see how that changes things. (Many other analyses are also potentially helpful, especially when the difference in cost-effectiveness between the alternatives is relatively small, e.g. deterministic one-way and two-way analyses that show how the cost-effectiveness ratio changes with high/low values for each parameter; threshold analyses that show what value a parameter must attain for the 'worse' programme to become the more cost-effective one; value of information analyses, showing how much it would be worth spending on further studies to reduce uncertainty; and perhaps most usefully in this case, a cost-effectiveness acceptability curve indicating the probability that StrongMinds is cost-effective at a given threshold, such as the 3-8x GiveDirectly that GiveWell is currently using as its bar for new charities. Some examples are here.)
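To illustrate the last suggestion in point 5, a cost-effectiveness acceptability curve falls straight out of Monte Carlo draws. All of the distributions below are made up for illustration; they are not HLI's model or its fitted parameters:

```python
import random

random.seed(0)

def simulate_multiple(n_draws=10_000):
    """Draw the StrongMinds-vs-GiveDirectly cost-effectiveness multiple
    from illustrative (made-up) parameter distributions."""
    draws = []
    for _ in range(n_draws):
        therapy_effect = random.gauss(1.6, 0.4)  # therapy effect, SDs
        cash_effect = random.gauss(0.5, 0.1)     # cash effect, SDs
        cost_ratio = random.gauss(1.0, 0.2)      # relative cost per person
        if cash_effect > 0 and cost_ratio > 0:   # discard nonsensical draws
            draws.append((therapy_effect / cash_effect) / cost_ratio)
    return draws

def acceptability(draws, threshold):
    """Share of simulations in which the multiple clears a threshold,
    e.g. GiveWell's 3x-8x GiveDirectly bar for new charities."""
    return sum(d >= threshold for d in draws) / len(draws)

draws = simulate_multiple()
for bar in (3, 5, 8):
    print(bar, round(acceptability(draws, bar), 2))
```

Plotting `acceptability` against a range of thresholds gives the acceptability curve; a reader who disputes any input distribution can swap in their own and re-run it.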

My out-of-date notes:

Topic 2.2: (Re-)prioritising causes and interventions

[…]

GiveWell

[…]

Spillover effects

Secondly, there are also potential issues with ‘spillover effects’ of increased consumption, i.e. the impact on people other than the beneficiaries. This is particularly relevant to GiveDirectly, which provides unconditional cash transfers; but consumption is also, according to GiveWell’s model, the key outcome of deworming (Deworm the World, Sightsavers, the END Fund) and vitamin A supplementation (Helen Keller International). Evidence from multiple contexts suggests that, to some extent, the psychological benefits of wealth are relative: increasing one person’s income improves their SWB, but this is at least partly offset by decreases in the SWB of others in the community, particularly on measures of life satisfaction (e.g. Clark, 2017). If increasing overall wellbeing is the ultimate aim, it seems important to factor these ‘side-effects’ into the cost-effectiveness analysis.

As usual, GiveWell provides a sensible discussion of the relevant evidence. However, it is somewhat out of date and does not fully report the findings most relevant to SWB, so I’ve provided a summary of wellbeing outcomes from the four most relevant papers in Appendix 2.1. In brief: 

  • All four studies found positive treatment effects, i.e. improvement to the psychological wellbeing of cash recipients, though in two cases this finding was sensitive to particular methodological choices.
  • Two studies of GiveDirectly found negative psychological spillovers.
  • Two found only null or positive spillovers.

As GiveWell notes, it is hard to aggregate the evidence on spillovers (psychological and otherwise) because of:

  • Major differences in study methodology (e.g. components of the psychological wellbeing index, type of control, inclusion/exclusion criteria, follow-up period).
  • Major differences in the programs being studied (e.g. size of transfers, proportion of households in a village receiving transfers).
  • Absence of key information (e.g. how many non-recipient households are affected by spillover effects for each treated household, how the magnitude of spillovers changes with distance and over time, how they differ among eligible and ineligible households).

Like GiveWell, I suspect the adverse happiness spillovers from GiveDirectly’s current program are fairly small. In order of importance, these are the three main reasons:

  • The negative findings were based on within-village analyses, i.e. comparing treated and untreated households in the same village. These may not be relevant to the current GiveDirectly program, which gives money to all eligible households in treated villages (and sometimes all households in the village). The two studies that investigated potential spillovers in untreated villages in the same area as the treated ones found no statistically significant effect.
  • Egger et al. (2019) (the “general equilibrium” study), which found only null or positive spillovers, was by far the largest, seems to have had the fewest methodological limitations, and investigated a version of the program most similar to current practice.
  • At least one of the ‘negative’ studies, Haushofer & Shapiro (2018), had significant methodological issues, e.g. differential attrition rates and lack of baseline data on across-village controls (though results were fairly robust to authors’ efforts to address these).

In addition, any psychological harm seems to be primarily to life satisfaction rather than hedonic states. As noted in Haushofer, Reisinger, & Shapiro (2019): “This result is intuitive: the wealth of one’s neighbors may plausibly affect one’s overall assessment of life, but have little effect on how many positive emotional experiences one encounters in everyday life. This result complements existing distinctions between these different facets of well-being, e.g. the finding that hedonic well-being has a “satiation point” in income, whereas evaluative well-being may not (Kahneman and Deaton, 2010).” This is reassuring for those of us who tend to think feelings ultimately matter more than cognitive evaluations. 

Nevertheless, I’m not extremely confident in the net wellbeing impact of GiveDirectly. 

  • Non-trivial comparison effects are found in many other contexts, so it is perhaps reasonable to expect them here too. (I haven’t properly looked at that evidence so I’m not sure how strong my prior should be.)
  • As with any metric, there are various potential biases in wellbeing measures that could lead to under- or over-estimation of effects. When assessing the actual effect on wellbeing/welfare/utility (rather than on the specific measures of wellbeing used in the study), we should consider the evidence in the context of other findings that I haven’t discussed here.
  • Even a negative spillover with a very small effect size, which seems plausible in this case, could offset much or all of the positive impact. For instance, if recipient households gain 1 happiness point from the transfer, but every transfer causes 10 other households to lose 0.1 points for the same duration, the net effect is neutral.
  • I have only summarised the relevant papers; I haven’t tried to critique them in detail. GiveWell has also not analysed the latest versions of some of the key studies, which differ considerably from the working papers, so they might uncover some issues that I haven’t spotted.

A few more notes on interpreting the wellbeing effects of GiveDirectly:

  • As with other health and poverty interventions, I suspect the overall, long-run impact will be more sensitive to unmeasured and unmodeled indirect effects (e.g. consumption of factory-farmed meat, population size, CO2 emissions) than to methods for estimating welfare (e.g. SWB instruments vs consumption). But I’m leaving these broader issues with short-termist methodology aside for now.
  • The mechanisms of any adverse wellbeing effects have not been established in this case, and may not be pure psychological ‘comparison effects’ (jealousy, reduced status, etc). For instance, they could be mediated through consumption (e.g. poorer households selling goods to richer ones) or through some other, perhaps culture-specific, process.
  • Like any metric, SWB measures are imperfect. So even when SWB data are available, an assessment of the SWB effects of an intervention may be improved by taking into account information on other outcomes, plus ‘common sense’ reasoning.

In addition, I would note that the other income-boosting charities reviewed by GiveWell could potentially cause negative psychological spillovers. According to GiveWell’s model, the primary benefit of deworming and vitamin A supplementation is increased earnings later in life, yet no adjustment is made for any adverse effects this could have on other members of the community. As far as I can tell, the issue has not been discussed at all. Perhaps this is because these more ‘natural’ boosts to consumption are considered less likely to impinge on neighbours’ wellbeing than windfalls such as large cash transfers. But I’d like to see this justified using the available evidence. 

I make some brief suggestions for improving assessment of psychological spillover effects in the “potential solutions” subsection below.

 

Appendix 2.1

Four studies investigated psychological impacts of GiveDirectly transfers. Two of these found wellbeing gains for cash recipients (“treatment effects”) and only null or positive psychological spillovers:

  • Haushofer & Shapiro (2016) (9-month follow-up)
    • 0.26 standard deviation (SD; p<0.01), positive, within-village treatment effect (i.e. comparing treated and untreated households in the same village) on an index of psychological wellbeing with 10 components (Table IV, p. 2011).
      • Statistically significant benefits for (in decreasing order of magnitude) Depression, Stress, Life Satisfaction, and Happiness at the 1% level, and Worries at the 10% level. Null effects (at the 10% level) on Cortisol, Trust, Locus of Control, Optimism, and Self-esteem (though point estimates were mostly positive).
    • Null, precise, within-village spillover effect on the index of psychological wellbeing; point estimate positive (0.1 SD; Table III, p. 2004).
  • Egger et al. (2019) (the “general equilibrium” study)
    • 0.09 SD (p<0.01) within-village treatment effect (i.e. assuming all spillovers are contained within a village) on a 4-item index of psychological wellbeing.
      • Driven entirely by Life Satisfaction; no effect on Depression, Happiness, or Stress. (See this table, which the authors kindly sent to me on request.)
    • 0.12 SD (p<0.1) “total” treatment effect (both within-village and across-village) on psychological wellbeing.
      • Driven by Happiness (0.15 SD; p<0.05); no others significant at the 10% level. (See this table.)
    • Null, fairly precise “total” spillover effect (combining within- and across-village effects) on the index of psychological wellbeing (and on every individual component); point estimate small and positive (0.08 SD). (See this table.)
    • Note: GiveWell reports a positive, statistically significant within-village spillover effect on psychological wellbeing of about 0.1 SD, based on an earlier draft of the paper. I can’t find this in the published paper; perhaps it was cut because of the authors’ stated preference for the “total” specification.

However, two studies are more concerning:

  • Haushofer & Shapiro (2018) (3-year follow-up; working paper)
    • Within-village 0.16 SD (p<0.01) treatment effect on an 8-component index of psychological wellbeing (Table 3, p. 16).
      • Driven primarily by improvements to Depression and Locus of Control (p<0.05), followed by Happiness and Life Satisfaction (p<0.1). No statistically significant (at the 10% level) change in Stress, Trust, Optimism, and Self-esteem. (Table B.7, p. 55)
    • Null across-village treatment effect on psychological wellbeing (Table 5, p. 22).
    • Approx. -0.2 SD (p<0.01) adverse psychological wellbeing spillover on untreated households in treated villages (Table 7, p. 26).
      • Driven by Stress (p<0.01), Depression (p<0.05), Happiness (p<0.1), and Optimism (p<0.1). No statistically significant (at the 10% level) change in Life Satisfaction, Trust, Locus of control, or Self-esteem. (Table B.15, p. 63)
  • Haushofer, Reisinger, & Shapiro (2019)
    • A 1 SD increase in own wealth causes a 0.13 SD (p<0.01) increase in the psychological well-being index (p.13; Table 3, p. 27).
      • At the average change in own wealth of eligible (thatched-roof) households of USD 354, this translates into a treatment effect of 0.09 SD.
      • At the average transfer of $709 among treated households, this translates into a treatment effect of 0.18 SD.
      • Driven by Happiness and Stress (p<0.01) then Life Satisfaction and Depression (p<0.05). No statistically significant (at the 10% level) effect on Salivary Cortisol. (Table 5, p. 29)
    • A 1 SD increase in village mean wealth (i.e. neighbours in one’s own village having a larger average transfer size) causes a decrease of 0.06 SD in psychological well-being over a 15 month period, only significant at the 10% level (p. 14; Table 3, p. 27).
      • At the average cross-village change in neighbours’ wealth of $327, this translates into an effect of -0.2 SD.
      • Driven entirely by Life Satisfaction (0.14 SD; p<0.01; p. 15; Table 5, p. 29).
        • At a change in neighbours’ wealth of $327, this translates into a Life Satisfaction effect of -0.4 SD (which is much larger than the own-wealth benefit, but less precisely estimated).
    • Subgroup analysis 1: No statistically significant within-village difference between treated and untreated households in psychological wellbeing effects of a change in neighbours’ wealth. (This suggests that what matters is how much more your neighbours received, not whether you received any transfer.)
    • Subgroup analysis 2: No statistically significant within-village difference in the psychological wellbeing effect of a change in neighbours’ wealth between households below versus above the median wealth of their village at baseline. (This suggests poorer households did not suffer more adverse psychological spillovers than wealthier ones.)
    • Methodological variations: Broadly similar results using alternative measures of the change in village mean wealth. (See p. 17 and Tables A.9–A.14 for details.)
    • No effect of village-level inequality on psychological wellbeing (holding constant one’s own wealth) over any time period and using three alternative measures of inequality.

Note: GiveWell’s review of an earlier version of the paper reports a “statistically significant negative effect on an index of psychological well-being that is larger than the short-term positive effect that the study finds for receiving a transfer, but the negative effect becomes smaller and non-statistically significant when including data from the full 15 months of follow-up… The authors interpret these results as implying that cash transfers have a negative effect on well-being that fades over time.” I’m not sure why the authors removed those analyses from the final version.

Hi Derek, it’s good to hear from you, and I appreciate your detailed comments. You suggest several features we should consider in the next version of these analyses and our upcoming intervention comparisons. I think trying to test the robustness of our results to more fundamental assumptions is where we are likeliest to see our uncertainty expand. But I moderately disagree that adapting our model in these ways is straightforward. I’ll address your points in turn.

  • Time discounting: We omitted time discounting because we only look at effects lasting ten years or less. Given our limited time, adding a section discussing time discounting did not seem worth the effort. It’s worth noting that adding time discounting would only make psychotherapy look better, because cash transfers’ benefits last longer.
  • Cost of StrongMinds: We include all costs StrongMinds incurs. The cost is "total expenditure of StrongMinds" / "number of people treated". We don't record any monetary cost to the beneficiary. If an expense to a beneficiary is bad because it decreases their wellbeing, we expect subjective well-being to account for that.
  • Only depression data? We have subjective well-being and mental health measures for cash transfers, but only the latter for psychotherapy. We discuss why we don’t think differences between MH and SWB measures will make much difference in section 3.1 of the CT CEA and Appendix A of the psychotherapy report. Section 4.4 of the psychotherapy report discusses the literature on social desirability/experimenter demand (which I take to be what you’re pointing to with your concern about “loading the dice”). The limited evidence suggests, perhaps surprisingly, that people don’t seem very responsive to the perceived demands of the experimenter, either in general or in LMIC settings.
  • Spillovers: We are working on updating our analysis to include household spillovers. We discuss the intra-village spillovers in the cost-effectiveness analysis and the meta-analysis. I think we agree that the community spillovers do not appear likely to be influential.
  • Sensitivity / robustness: You are correct that we haven't run as many robustness tests as we could have. These seem like reasonable candidates to consider in an updated version of the CEA comparison. Adding these tests can be conceptually straightforward and sometimes time-efficient. I especially think it’d be good to add another framing of the cost-effectiveness analysis that outputs the likelihood of surpassing the 5x-8x bar.
    • On the other hand, adding robustness checks for model-level assumptions seems like it could take a decent amount of time. In my view it doesn't seem straightforward to, for example, operationalise moral views, the value of information, reasonable bounds for discount rates, or the differences in “conversion rates” between MH and SWB data. But maybe we should be more willing to make semi-uninformed guesses at the range of these values and include them in our robustness tests.

Thanks for the reply. I don't have much more time to think about this at the moment, but some quick thoughts:

  1. On time discounting: It might have been reasonable to omit discounting in this case for the reasons you suggest, but (a) it limits comparability across analyses if you or others do it elsewhere; (b) for various reasons, it would be good to have some estimate of the absolute, not just relative, costs and effects of these interventions; and (c) it's pretty easy to implement in most software, e.g. Excel and R (maybe less so in Guesstimate), so there isn't usually much reason not to do it.
  2. On costs: (a) You only seem to measure depression, so if costs affect some other aspect of SWB then your analysis will not account for it. (b) It is also a good idea, where feasible, to account for non-monetary costs, such as lost time spent with family, and informal caregiver time. In this case, these are probably best covered by SWB outcomes, rather than being monetised, but since they involve spillovers on people other than the patient, they were not captured in this case. (c) Your detailed CEA of StrongMinds does not make it entirely clear what you mean by "all costs"; it just says "Our estimates of the average cost for treating a person in each programme are taken directly from StrongMinds' accounting of its costs from 2019," with no details about those accounts. For example, if they bought an expensive building in which to deliver training in 2018, that cost should normally be amortised over future years (roughly speaking, shared among future beneficiaries for the life of the building). So simply looking at 2019 expenditure does not necessarily capture "all costs". I suggest reading Chapter 7 of Drummond et al to begin with, for a discussion of practical and conceptual issues in costing of health interventions.
  3. On the focus on depression data: My "loading the dice" comment wasn't about SDB/demand effects. Suppose, for example, that you want to compare intervention A, which treats both depression and severe physical pain; and intervention B, which only treats depression. You find that B reduces depression by more per dollar than A, so you conclude it is more cost-effective than A, and recommend it to donors. But it's not really a fair comparison: you don't know whether the overall benefit per dollar is greater in B than A, because you are ignoring the pain-relieving effects, which are likely greater in A. I haven't looked at the GD data recently, but I can imagine something like that going on here, e.g. the cash has all sorts of benefits that aren't captured by the depression measure, whereas the psychotherapy could have few such benefits.
  4. On spillovers:  I'm glad you are updating the analysis. To be frank, I think you probably shouldn't have published this analysis in its current state, primarily due to the omission of spillovers. It's just too misleading.
  5. On sensitivity analysis: Also pleased you are going to add some of these. You're right that some take longer than others, and it's hard/impossible to do some of them in Guesstimate. But I think you can export the samples from Guesstimate to Excel, which should allow you to do some of the key ones without too much work, e.g. EVPI and CEAC/CEAF just need a simple macro and graph; see my Donational model for examples. (For extra usability and flexibility, you can do it in R and make a Shiny web app, but that takes a lot more work.)
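On point 1, a constant annual discount really is only a few lines in a general-purpose language. A minimal Python sketch, purely illustrative (the 0.5 SD annual effect, 10-year duration, and 4% rate are placeholder values, not figures from HLI's analysis):

```python
def discounted_total(annual_effect, years, rate):
    """Sum a constant annual effect over `years`, discounted at `rate`
    per year (the first year is undiscounted)."""
    return sum(annual_effect / (1 + rate) ** t for t in range(years))

# Placeholder numbers: a 0.5 SD annual benefit lasting 10 years.
undiscounted = discounted_total(0.5, 10, 0.0)   # 5.0
discounted = discounted_total(0.5, 10, 0.04)    # ≈ 4.22
```

A longer-lasting benefit stream loses proportionally more of its value to discounting, which is why adding it would tend to favour psychotherapy over cash transfers, as noted in the reply above.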

 This paper, the Drummond book above, and this book are good starting points if you want to learn how to do cost-effectiveness analysis (including sensitivity analysis).

A couple of nitpicks:

  • Your title is misleading: this isn't/these aren't "meta-analyses comparing the cost-effectiveness of cash transfers and psychotherapy". AFAICT, you are doing a cost-effectiveness analysis informed by meta-analyses of the effects of the two interventions. You aren't doing a meta-analysis of cost-effectiveness studies.
  • The y axes of your graphs, and some of your tables, say things like "Effects of Depression Improvement". As far as I can tell, these are showing the effects of the interventions on depression/SWB/MHa in terms of SD. They aren't, for example, showing the effects of depression (i.e. the consequences of depression for something else), as implied by this wording.

Hi Derek, thank you for your comment and for clarifying a few things.

  1. Time discounting: We will revisit time discounting when looking at interventions with longer time scales. To be clear, we plan to update these analyses for backwards compatibility as we introduce refinements to our models and analyse new interventions. 
  2. Costs: You’re right, expenses in an organisation can be lumpy over time. If costs were high in all previous years but low in 2019 and we only use the 2019 figures, we'd probably be making a wrong prediction about future costs. I think a reasonable way to account for this is to treat an organisation's cost as an average over previous years, giving increasingly more weight to years closer to the present.
  3. Depression data: Thanks for the clarification; I think I understand better now. We make a critical assumption that a one-unit improvement in depression scales corresponds to the same improvement in well-being as a one-unit change in subjective well-being scales. If SWB is our gold standard, we can ask if depression scale changes predict SWB scale changes. Our preliminary analyses suggest that the difference here would, in any case, be pretty small. For cash transfers, we found the 'SWB only' effect would be about 13% larger than the pooled 'SWB-and-MH' effect (see page 10, footnote 16). To assess therapy, we looked at some psychological interventions that had outcome measures in SWB and MH and found the SWB effect was 11% smaller (see p27-8). We'd like to dig further into this in the future. But these are not result-reversing differences.
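The cost-averaging scheme suggested under point 2 could look like the following sketch in Python; the yearly figures and the halving weight are hypothetical, not StrongMinds' actual accounts:

```python
def weighted_average_cost(yearly_costs, decay=0.5):
    """Average past yearly per-person costs, weighting recent years more.

    `yearly_costs` is ordered oldest to newest; each year's weight is
    `decay` times the weight of the following (more recent) year."""
    n = len(yearly_costs)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * c for w, c in zip(weights, yearly_costs)) / sum(weights)

# Hypothetical per-person costs for three successive years:
weighted_average_cost([200, 150, 100])  # weights 0.25, 0.5, 1 → ≈ 128.6
```

The decay factor is a free parameter; a slower decay smooths out lumpy years (like a one-off building purchase) more aggressively, at the cost of reacting slowly to genuine changes in cost structure.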

I strongly agree with Derek's point about measuring the nonmonetary costs to the recipients and their families. If your benefits are driven mainly by the differences in costs, then omitting potentially relevant costs can invalidate the entire analysis. You absolutely must account for the time that recipients spent in the program, and traveling to and from the program, and any other money or time costs that they or their families incurred as a result of program participation. At minimum, this time should be valued at the local wage rate.  Until this is addressed, I will assume that your analysis is junk, and say so to anyone who asks me about it.

I haven't read the debate closely, but people who like this post would probably be interested in the authors' Twitter conversation about this research with Alexander Berger (head of global health and wellbeing at Open Philanthropy).

On Twitter, Michael Plant wrote:

Hi! Not exactly sure why you disagree, but some context might help. We found 5 studies of hshold spillovers for CTs. Taken at face value, they imply other members of hshold get approx 100% of SWB value as recipient.

Weirdly, we could only find 1 study of intra-hshold therapy effects; that was non-RCT and had N<200. It found hshold spillover was 77% size of that to recipient. Taking this at face value too, and assuming hshold size is about 4, therapy goes from 12x to 10x better.

I think the right thing to do here (besides further research) would be to give less weight to the psychotherapy spillover estimate and adjust the effect downwards (or at least more downwards than CTs' spillover, which has evidence from multiple studies, presumably some better designed and with larger sample sizes), based on a skeptical prior.

PT = psychotherapy, CT = cash transfers

PT = 12 x CT (without spillovers)

CT' = 4 x CT (with spillovers)

PT' = (1+3s) x PT (with spillovers), where s = spillover effect for psychotherapy and 3 = non-recipient household members (assuming a household size of about 4)

PT' = ((1+3s)/4) x 12 x CT' = 3(1+3s) x CT'

The worst case, s=0, gives PT' = 3 x CT'. With s=0.25, PT' = 5.25 x CT', and with s=0.5, PT' = 7.5 x CT'.
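The back-of-envelope arithmetic above is easy to check mechanically. A small Python sketch of the same calculation (the defaults are the figures Michael states: a 12x pre-spillover ratio, a 4x household multiple for cash transfers, and a household of about 4):

```python
def pt_multiple_with_spillovers(s, pt_vs_ct=12, ct_household_multiple=4,
                                household_size=4):
    """PT' as a multiple of CT': psychotherapy's advantage over cash
    transfers once both are credited with household spillovers.

    `s` is the spillover fraction each of the (household_size - 1)
    non-recipients gets from psychotherapy."""
    non_recipients = household_size - 1
    return (1 + non_recipients * s) / ct_household_multiple * pt_vs_ct

pt_multiple_with_spillovers(0.0)   # 3.0 (worst case)
pt_multiple_with_spillovers(0.25)  # 5.25
pt_multiple_with_spillovers(0.5)   # 7.5
```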

As a result of Alexander's feedback, we’ve updated our cost-effectiveness comparison of psychotherapy and cash transfers to include an estimate of the effects on other household members. Our previous analysis only considered the effects on recipients.  

You can read a summary of our new analysis here.

For cash transfers, we estimate that each household member experiences 86% of the benefits experienced by the recipient. For psychotherapy, we estimate the spillover ratio to be 53%.

After including the household spillover effects, we estimate that StrongMinds is 9 times more cost-effective than GiveDirectly (a slight reduction from 12 times in our previous analysis). 
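One simple way to reproduce roughly that headline figure, assuming a household of four (an assumption of this sketch, not a figure stated above; HLI's actual model may aggregate differently):

```python
def adjusted_ratio(base_ratio, pt_spillover, ct_spillover, household_size=4):
    """Scale a recipient-only cost-effectiveness ratio by each intervention's
    total household benefit: the recipient plus spillovers to other members."""
    others = household_size - 1
    pt_total = 1 + others * pt_spillover   # psychotherapy, per household
    ct_total = 1 + others * ct_spillover   # cash transfers, per household
    return base_ratio * pt_total / ct_total

# 53% spillover per member for psychotherapy, 86% for cash transfers:
adjusted_ratio(12, 0.53, 0.86)  # ≈ 8.7, close to the reported 9x
```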

This new analysis of household effects is based on a small number of studies: eight for cash transfers and three for psychotherapy. The lack of data on household effects is a serious gap in the literature that should be addressed by further research, because it is such a large part (indeed, the majority) of the total effects. Household effects seem plausibly crucial for many interventions, such as poverty alleviation programmes, housing improvement interventions, and air or water quality improvements.

How does the meta-analysis avoid the garbage-in-garbage-out problem? Are you simply averaging across studies, or do you weight by study quality (eg. sample size, being pre-registered, etc)? Did you consider replicating the individual studies?

Do you worry about effect sizes decreasing as StrongMinds scales up? Eg. they start targeting a different population where therapy has smaller effects.

One quibble: "post-treatment effect" sounds weird, I would just call it a "treatment effect".

I share this concern. I don't have much of a baseline on how much meta-analyses overstate effect sizes, but I suspect it is substantial.

One comparison I do know about: as of about 2018, the average effect size of unusually careful studies funded by the EEF (https://educationendowmentfoundation.org.uk/projects-and-evaluation/projects) was 0.08, while the mean of meta-analytic effect sizes overall was allegedly 0.40 (https://visible-learning.org/hattie-ranking-influences-effect-sizes-learning-achievement/), suggesting that meta-analysis in that field on average yields effect sizes about five times higher than is realistic.

The point is, these concerns cannot be dealt with simply by suggesting that they won't make enough difference to change the headline result; in fact they could.

If this issue was addressed in the research discussed here, it's not obvious to me how it was done.

GiveWell rated the evidence of impact for GiveDirectly "Exceptionally strong", though it's not clear exactly what this means with regard to the credibility of studies that estimate the size of the effect of cash transfers on wellbeing (https://www.givewell.org/charities/top-charities#cash). Nevertheless, if a charity were being penalized in such comparisons for doing rigorous research, then I would expect to see assessments like "strong evidence, lower effect size", which is what we see here.

Hi Michael, 

I try to avoid the problem by discounting the average effect of psychotherapy. The point isn’t to try to find the “true effect”. The goal is to adjust for the risk of bias present in psychotherapy’s evidence base relative to that of cash transfers. We judge the CT evidence to be higher quality. Psychotherapy has lower sample sizes on average and fewer unpublished studies, both of which are related to larger effect sizes in meta-analyses (MetaPsy, 2020; Vivalt, 2020; Dechartres et al., 2018; Slavin et al., 2016). FWIW, I discuss this more in appendix C of the psychotherapy report.

I should note that I think the tool I use needs development. Detecting and adjusting for the bias present in a study is a more general problem in social science.

I do worry about the effect sizes decreasing, but the hope is that the cost will drop to a greater degree as StrongMinds scales up. 

We say "post-treatment effect" because it makes clear which time point we are discussing. "Treatment effect" could refer either to the post-treatment effect or to the total effect of psychotherapy, where the total effect is the decision-relevant one.

Great work! :) Very happy to see the increase in rigour over earlier estimates. If your research is correct (and, in my casual reading of it, I can find no reason why it wouldn't be) this opens up a whole new area of funding opportunities in the global health & wellbeing space!

I'm also excited about the rest of your research agenda. It seems very ambitious ;)

Some things I find interesting:

"we found evidence that group psychotherapy is more effective than psychotherapy delivered to individuals which is in line with other meta-analyses (Barkowski et al., 2020; Cuijpers et al., 2019). One explanation for the superiority is that the peer relationships formed in a group provide an additional source of value beyond the patient-therapist relationship." --> I did not expect group therapy to be more effective. Instead I expected it to be less effective per person, but more cost effective in total. This is great news.

I am also surprised by the extremely low cost of lay therapy. Is there any correlation between the effectiveness of lay therapy and its cost? I can imagine training costing money but increasing effectiveness.

Most charities not responding/willing to share their costs is... maybe not so surprising? Let's hope that changes if/when StrongMinds gets a bunch of funding, and you develop your reputation!

Last question: what's HLI's current funding situation? (Current funding, room for funding in different growth scenarios)

Last question: what's HLI's current funding situation? (Current funding, room for funding in different growth scenarios)

Our funding situation is, um, "actively seeking new donors"! We haven't yet filled our budget for 2022. 

Our gap up to the end of 2022 on our lean budget is £120k; that's the minimum we need to 'keep the lights on'.

On our growth budget, the gap to the end of 2022 is probably £300k; I'm not sure we could efficiently scale up much faster than that. (But if someone insisted on giving me more than that, I would have a good go!)

Hi Michael, has the funding situation of the HLI changed in the last three months?

I'm especially interested to know if this analysis and the discussion around it brought new funds.

Also, are financial statements for 2020 and 2021 from the HLI available somewhere by any chance?
I'm thinking about directing there some of my donations for 2022

Hello Lorenzo,

Sorry for the slow reply on this - I've been taking a bit of a break from the EA forum.

I'm pleased to report the funding situation at HLI has substantially improved on the back of our new research. We've raised enough money that we've switched to our 'growth' budget (most of the difference between 'lean' and 'growth' is hiring 2 new staff). However, we are still $50k short of funding for our growth budget for 2022 and donations would be welcome! Feel free to reach out to me privately too.

I'll talk to our fiscal sponsor, PPF, and find out about the financial statements.

Hi Michael,

You replied in less than a week, I consider it a fast reply, nothing to be sorry about!

I'm so very glad to hear that the funding situation of the HLI has improved, I think the work you're doing is really important and something that was/is sorely missing when trying to maximize and align impact.

On a personal note, I'll take this opportunity to mention that I find your posts and comments on this forum very valuable.
As someone ignorant about philosophy and happiness research, I was introduced through them to important topics like population ethics and WELLBYs. They helped me understand my intuitive uneasiness about GiveWell's values much better and cleared up a lot of internal confusion.

Seeing a different perspective in what sometimes seems a bit of a monoculture (EA short-term interventions to improve human lives), and seeing the importance of SWB being relentlessly pushed for years was very enlightening and inspiring.

So thank you for all your amazing work!

Hi Siebe, thank you for the kind words! We agree that using SWB could help us find new opportunities! We’re excited to explore more of this area.

I was also surprised by the things you mention, but I think they make sense on reflection. I can share more of my reasoning if you'd like (but I'm unsure if that's what you were asking for).

We don’t have enough information to estimate the relationship between cost and effectiveness, but this is an interesting question! The issue is that we lack studies that contain both the effects and the costs of psychotherapy. However, we should be able to get cost information from another psychotherapy NGO operating in LMICs, so we hope to analyze that too.

I will let Michael comment on the funding situation!

Firstly, the recipient is plausibly not the only person impacted by a cash transfer. They can share it with their partner, children, and even friends or neighbours. Such sharing should benefit non-recipients' well-being. However, it’s also possible that any benefit that non-recipients receive could be offset by envy of their neighbour’s good fortune. There appears to be no evidence of significant negative within-village spillover effects, but there is some evidence for positive within-household and across-village spillover effects. We have not included these spillover effects in our main analysis because of the large uncertainty about the relative magnitude of spillovers across interventions and the slim evidence available to estimate the household spillover effects.

 

I may be misremembering, but doesn't GiveDirectly give to whole villages at a time, anyway, making negative spillover very unlikely? If that's the case, it seems like all of the spillover effects should be positive (in expectation).

Do you have any thoughts on how the spillover effects of these interventions might compare, and is there any interest in looking further into this? Mental health interventions may also improve productivity (and so increase income), and people's mental health can affect others (especially family, and parents' mental health on children in particular) in important ways. On the other hand, people build wealth (and other resources, including human capital) within their communities, and cash transfers/deworming could facilitate this, but this may happen over longer time scales.

I would guess the effects on SWB through increased income for the direct beneficiaries of StrongMinds are already included in the measurements of effects on SWB, assuming the research participants were similar demographically (including in income, importantly) as the beneficiaries of StrongMinds.

EDIT:  Saw this in your post:

That being said, even if we take the upper range of GiveDirectly’s total effect on the household of the recipient (8 SDs), psychotherapy is still around twice as cost-effective.

Hi Michael and thank you for your comments and engaging with our work! 

I may be misremembering, but doesn't GiveDirectly give to whole villages at a time, anyway, making negative spillover very unlikely? If that's the case, it seems like all of the spillover effects should be positive (in expectation).

To my understanding, GiveDirectly gives cash transfers to everyone in a village who is eligible. GiveWell says this means almost everyone in a village receives CTs in Kenya and Uganda but not Rwanda (note that GiveDirectly no longer works in Uganda). So it seems like negative spillovers are still possible. However, I think you’re still right that it makes negative spillovers less likely. 

Do you have any thoughts on how the spillover effects of these interventions might compare, and is there any interest in looking further into this?

It’s tough to say how the (intra-household) spillovers compare. I guess that CTs could provide a bit more benefit to the household than psychotherapy, but I am very uncertain about this.

My thinking is that household spillovers are at least the benefit your family gets from having you be happier. As you say, “people's mental health can affect others (especially family, and parents' mental health on children in particular) in important ways.” I expect this to be roughly balanced across interventions. Then there are the other benefits, which I think will mostly be pecuniary. In that case, it seems like cash transfers, if shared, will boost the household’s consumption more than psychotherapy will. Again, we are quite uncertain about how spillovers compare across interventions, but it seems important to figure out what’s going on at least within the household. I can go into more detail if you’d like.

We are very interested in looking further into how the spillover effects of these interventions might compare, particularly intra-household spillovers. But as you might guess, the existing evidence is very slim. To advance the question we need to either wait for more primary research to be done or ask researchers for their data and do the analysis ourselves. We will revisit this topic after we’ve looked into other interventions.  

I would guess the effects on SWB through increased income for the direct beneficiaries of StrongMinds are already included in the measurements of effects on SWB, assuming the research participants were similar demographically (including in income, importantly) as the beneficiaries of StrongMinds

I think that’s pretty much correct! 

How would the possibility of scale norming with life satisfaction scores (using the scales differently across people or over time, in possibly predictable ways) affect these results? There's a recent paper on this, and also an attempt to correct for this here (video here). (I haven't read any of these myself; just the abstracts.)

(Disclosure: I’m the author of the second linked paper, board member of HLI, and a collaborator on some of its research.)

Hi Michael! 

In my paper on scale use, I generally find that people who become more satisfied tend to also become more stringent in the way they report their satisfaction (i.e., for a given satisfaction level, they report a lower number). As a consequence, effects tend to be underestimated. 

If effects are underestimated by the same amount across different variables/treatments, scale norming is not an issue (apart from costing us statistical power). However, in the context of this post, if (say) the change in reporting behaviour is stronger for cash transfers than for psychotherapy, then cash transfers will seem relatively less cost-effective than psychotherapy.

To assess whether this is indeed a problem, we’d either need data on so-called vignettes (link), or people’s assessment of their past wellbeing. Unfortunately, as far as I know, this data does not currently exist. 

That being said, in my paper (which is based on a sample from the UK), I find that accounting for changes in scale use does not, compared to the other included variables, result in statistically significantly larger associations between income and satisfaction.

This Twitter thread from economist Chris Blattman, who "spent the last 15 years studying cash and also CBT", is an interesting response to the Vox article based on this study. An excerpt:

There ought to be huge amounts of investment in testing whether these techniques can be automated into apps, implemented by non experts, performed in groups or over mass media. Some of this testing is already happening but it needs to explode in scale.

That’s because scaling these interventions is harder than the CBT enthusiasts are letting on. Helping an average villager be less sad & anxious is different than tackling serious depression. Both are different than a program to help riskiest young men to be less violent.

Do you have any plans to look into the welfare benefits from GiveWell's life-saving charities to those who would otherwise lose loved ones (mostly parents losing their children)?

Yes! We've looked into this a bit already in our report on comparing the value of doubling consumption to saving the life of a child using SWB. We plan to revisit and expand on this work.  

I recently looked into StrongMinds' "research" and their findings. I was extremely disappointed by the low standards. It seemed like they simply wanted to make up super-good numbers. Their results are extremely unrealistic. Are there new results from proper research?

I'm interested in reading critiques of StrongMinds' research, but downvoted this comment because I didn't find it very helpful or constructive.  Would you mind saying a bit more about why you think their standards are low, and the evidence that led you to believe they are "making up" numbers?

They did not have a placebo-receiving control group, for example some kind of unstructured talking group, etc. Ideally an intervention known to be "useless" but sounding plausible. So we do not know which effects are due to regression to the mean, socially desirable answers, etc. This is basically enough to make their research rather useless. And proper control groups have been common for quite a while.

No "real" evaluation of the results. They relied only on what their patients said, without checking whether it was correct (e.g. whether children actually went to school more often). Not even for a subgroup.

They had the impression that patients answered in a socially desirable way, and addressed that problem completely inadequately: arguing that socially desirable answers would happen only at the end of the treatment, but not near the end of the treatment?! So they simply took the near-end numbers for granted.

If their depression treatment is as good as they claim, then it is orders of magnitude better than ALL common treatments in high-income countries. And much cheaper. And faster. And with less specialized instructors… And did they invent something new? No. They took an already existing treatment, and now it works SO much better? This seems implausible to me.

As far as I know, SoGive is reviewing StrongMinds' research. They should be able to back up (or reject) my comments here.

They did not have a placebo-receiving control group. 

All the other points you mentioned seem very relevant, but I somewhat disagree about the importance of a placebo control group when it comes to estimating counterfactual impact. If the control group is assigned to standard of care, they will know they are receiving no treatment and thus not experience any placebo effects (but, unlike what you wrote, regression to the mean is still expected in that group), while the treatment group experiences placebo + "real effect from treatment". This makes it difficult to do causal attribution (placebo vs treatment), but on the other hand it is exactly what happens in real life when the intervention is rolled out!

If there is no group psychotherapy, the would-be patients receive standard of care, so they will not experience the placebo effect either. Thus a non-placebo design estimates precisely what we are considering doing in real life: giving an intervention to people who will know that they are being treated and who would otherwise have received only standard of care (in the context of Uganda, this presumably means receiving nothing?).

Of course, there are issues with blinding the evaluators; whether StrongMinds has done so is unclear to me. All of your other points seem fairly strong though.

Thanks for commenting. I have to agree with you and disagree somewhat with my earlier comment (#placebo). Actually, placebo effects are fine, and if a placebo helps people: great!

And yes, getting a specific treatment effect + the placebo effect is better (and more like real life) than getting no treatment at all.

"Still: I thought it be good to make this comment right now, so people see my opinion."

I think it would have been better to wait until you had time to give proper arguments for your views. I agree with Stephen that the above comment wasn't helpful or constructive.
 

I think the follow up is much more helpful, but I found the original helpful too. I think it may be possible to say the same content less rudely, but "I think strong minds research is poor" is still a useful comment to me.

I disagree. I should also say that the follow up looked very different when I commented on it; it was extensively edited after I had commented.

Please don't get me wrong. I do not like the research from StrongMinds for the above-mentioned reasons (I am sure nobody got me wrong on this), and for some other reasons. But that does not mean that their therapy work is bad or inefficient. Even if they overestimate their effects by a factor of 4 (it might be 20, it might be 2; I just made those numbers up), it would still be very valuable work.

I think that somewhere a "placebo effect" is involved. People may think something is helpful when it is not.

Just recently I read the https://www.health.harvard.edu/mental-health/the-power-of-the-placebo-effect article about it. A bit shocked, to be honest.

P.S. I do not want to offend anybody. 
