All of Derek's Comments + Replies

As you note, this was written in 2012. Have you looked for more recent research into lie detection? As well as fMRI and polygraph, there's voice stress analysis, non-verbal cues, microexpressions, and cognitive interviewing. I did a brief search a while ago but couldn't find anything particularly accurate. I would be extremely keen to hear about any methods with high sensitivity and/or specificity, or with the potential to achieve that in the near future. I might be willing to pay someone a modest amount to review the evidence and predict when accurate tec... (read more)

DALYs appear to weight pain very lightly. For example, terminal illness with constant, untreated pain has a disability (DALY) weight of 0.569, which is only 0.029 more than the weight for the same condition with pain medication. QALYs are better at capturing pain: physical pain is the dimension given the highest weight in the EQ-5D, an instrument used to measure quality of life.

You might want to check disability weights for other painful conditions; I don't remember if they were generally low. 

I suspect QALYs still underweight extreme pain, for vario... (read more)

Thanks for writing this - there's some good stuff here. A few comments: 

1 QALY is equal to a year of life in full health, while 0 QALYs is a health state equivalent to death … The QALY scale admits scores below zero, which represent states worse than death

Minor point, but I think 'being dead' is more accurate than 'death'. The latter suggests permanency, whereas values <0 can represent temporary states that are deemed worse than being dead. That said, there is some uncertainty over the meaning of negative valuations, and the best interpretation may ... (read more)

1
Stan Pinsent
3mo
Thanks for your comment, Derek. This has been really useful. Some changes I have made in response:
* Changed "death" to "being dead" in my explanation of the DALY scale
* Now say that DALYs likely underweight pain, but QALYs may not
* Mention that even sufferers may underestimate the badness of depression [with a link to your comment]

A question:
* I see from the summary you linked that IHME have used sequelae to identify ailments that are present in multiple health conditions. That seems sensible. I guess the kind of problem I often face is "What will be the reduction in someone's disability weight if they are protected from getting diabetes / cured of depression / etc.?"
  * In the diabetes example, it seems fair to count DALYs averted by not having diabetes and DALYs averted by depression-caused-by-diabetes. Maybe not fair to count, say, obesity, since the increased risk of obesity associated with diabetes is likely to be correlational, not causal. Am I thinking along the right lines?
  * If we go with the depression example, it seems fair to count both prevented suicide and prevented depression (but not prevented depression-while-dead-by-suicide)

The urgency of relieving severe physical pain reveals the serious limitations of the “QALYs gained” approach to measuring scale of impact. A person with terminal cancer treated with morphine for two months might remain highly disabled and in a very poor state of health, and gain only a fraction of a QALY, yet be spared two months of agony.

I'm not sure I follow this. QALYs allow negative values, so if morphine treatment increased health-related quality of life from, say, -0.5 to +0.1, it would gain 0.6 QALYs per year. Most/all currently-used value sets... (read more)
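The arithmetic in this reply can be sketched in a few lines of Python. The weights (-0.5 and +0.1) are the hypothetical values used in the comment, and the two-month duration comes from the quoted passage; the function name `qalys_gained` is made up for illustration.

```python
def qalys_gained(utility_before: float, utility_after: float, years: float) -> float:
    """Return QALYs gained: change in HRQoL weight multiplied by duration in years."""
    return (utility_after - utility_before) * years


# Hypothetical weights from the comment: morphine moves HRQoL from -0.5 to +0.1.
gain_per_year = qalys_gained(-0.5, 0.1, years=1.0)
# The quoted passage describes two months of treatment, i.e. 2/12 of a year.
gain_two_months = qalys_gained(-0.5, 0.1, years=2 / 12)

print(f"Gain over one year:   {gain_per_year:.2f} QALYs")
print(f"Gain over two months: {gain_two_months:.2f} QALYs")
```

Under these assumed weights, the two-month gain is a fraction of a QALY only because the duration is short, not because the scale cannot register the relief of severe pain.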

Note that there are also methods for calculating confidence intervals around ICERs that avoid issues with ratios. The best I'm aware of is by Hatswell et al. I have an Excel sheet with all the macros etc set up if you want.

MAICER = maximum acceptable incremental cost-effectiveness ratio. This is often called the willingness to pay for a unit of outcome, though the concepts are a little different. It is typically represented by lambda. 

The CE plane is also useful as it indicates which quadrant the samples are in, i.e. NE = more effective but more costly (the most common), SE = more effective and cheaper (dominant), NW  = less effective and more costly (dominated), and SW =  less effective and cheaper. When there are samples in more than one quadrant, which is v... (read more)
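As a rough illustration of the quadrant labels described above, the following Python sketch classifies simulated (incremental cost, incremental effect) pairs. The sample values and the tie-breaking rule for points on an axis are assumptions made for the example, not part of the original comment.

```python
from collections import Counter


def ce_quadrant(delta_cost: float, delta_effect: float) -> str:
    """Label one (incremental cost, incremental effect) sample by CE-plane quadrant.

    Samples lying exactly on an axis are assigned to the less favourable side;
    that tie-breaking rule is an assumption of this sketch.
    """
    more_effective = delta_effect > 0
    cheaper = delta_cost < 0
    if more_effective:
        return "SE" if cheaper else "NE"  # SE = dominant, NE = better but costlier
    return "SW" if cheaper else "NW"      # NW = dominated, SW = worse but cheaper


# Hypothetical simulation output: (incremental cost, incremental QALYs).
samples = [(1200.0, 0.08), (-300.0, 0.02), (900.0, -0.01), (-150.0, -0.03), (2500.0, 0.15)]
quadrant_counts = Counter(ce_quadrant(cost, effect) for cost, effect in samples)
print(dict(quadrant_counts))
```

Counting samples per quadrant is a quick way to see whether a single ICER summary is likely to be misleading.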


This is a recognised issue in health technology assessment. The most common solution is to first plot the incremental costs and effects on a cost-effectiveness plane to get a sense of the distributions:

Then to represent uncertainty in terms of the probability that an intervention is cost-effective at different cost-effectiveness thresholds (e.g. 20k and 30k per QALY). On the CEP above this is the proportion of samples below the respective lines, but it's generally better represented by cost-effectiveness acceptability curves (CEACs), as below:

 

Often, ... (read more)
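A minimal sketch of the calculation behind a cost-effectiveness acceptability curve: for each threshold, take the share of simulated samples that fall below the threshold line on the cost-effectiveness plane, i.e. those with positive net monetary benefit. The sample values are invented for illustration; the 20k and 30k thresholds are the ones mentioned in the comment.

```python
def prob_cost_effective(samples: list, threshold: float) -> float:
    """Share of (incremental cost, incremental effect) samples that are
    cost-effective at the given threshold (cost per unit of effect).

    A sample counts as cost-effective when its net monetary benefit,
    threshold * effect - cost, is positive.
    """
    if not samples:
        raise ValueError("need at least one sample")
    favourable = sum(1 for cost, effect in samples if threshold * effect - cost > 0)
    return favourable / len(samples)


# Hypothetical simulation output: (incremental cost, incremental QALYs).
samples = [(1000.0, 0.10), (5000.0, 0.20), (-500.0, 0.05), (3000.0, 0.05), (2000.0, -0.02)]

# One point on the curve per threshold.
ceac = {threshold: prob_cost_effective(samples, threshold) for threshold in (20_000, 30_000)}
for threshold, probability in ceac.items():
    print(f"threshold {threshold}: P(cost-effective) = {probability:.2f}")
```

Working with net monetary benefit avoids dividing by the incremental effect, so samples in any quadrant are handled consistently.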

2
Lorenzo Buonanno
2y
This is super interesting, thanks! Exactly the kind of thing I was hoping for when posting this!
1. Point clouds with cost-effectiveness lines seem definitely useful, I saw them in the HLI reports but indeed should probably be much more standard.
2. Cost-effectiveness acceptability curves also seem a useful tool for reasoning about funding opportunities. Especially since (as far as I understand) most grantmakers have a "cost-effectiveness bar" to decide whether to fund things. I hope it's well known inside EA as it's the first time I've seen it! I think it might have some downsides though, or at the very least need some small modifications. If there is a policy intervention that has a fixed cost and a 5% chance of having a huge value, how would the curve look? Also, what is "MAICER" in that plot?
3. This is also super interesting! Not sure how to use it to compare interventions in an EA context though, e.g. malaria nets vs vitamin A supplements but I have a feeling there is something there.

After looking at this post score, the comments, and some discussions I'm having, I think I'm not the only person a bit confused about these things. So I think any overview of these topics would definitely be useful, especially if it presents well-thought-out industry standards! I would especially be interested in examples of how to use these tools in an EA context (even if very simplified and theoretical). But in general, having examples of different ways to look at these things I think can be very valuable!

For traditional QALY calculations, researchers simply ask people how they feel when experiencing certain things (like a particular surgery or a disease) and normalize/aggregate those responses to get a scale where 0 quality is as good as death, 1 is perfect health, and negative numbers can be used for experiences worse than death.

 

This isn't correct. QALY weights are typically based on hypothetical preferences, not experiences.

What Richard described is more like a WELBY, which has a similar structure but covers wellbeing in some sense rather than just... (read more)

Glad you found it useful. I am not qualified to comment on the role of neuron count in sentience; you may want to look at work by Jason Schukraft and others at Rethink Priorities on animal sentience and/or get in touch with them.

If you haven't already, you may also want to review the 2018 Humane Slaughter Association report, which was the best I could find in early 2019. While looking for it, I also just came across one from Compassion in World Farming, which I don't think I've read.

On fish, there were several comments here, including this one from me.

The 2018 Humane Slaughter Association report was probably the best info available at the time; not sure what's happened since. 

1
Aaron Bergman
2y
Wow thanks so much, super valuable info! Too bad I can't give it more than four karma haha. One of the reasons it took so long for me to reply is that I kinda fell into a rabbit hole investigating whether buying the Crustastun patent, manufacturing it, and giving it away would be a good intervention. It all looked good until I finally thought to look into lobsters themselves, and it turns out that they have way fewer neurons - ~100,000 according to an OpenPhil report (lost the link) - which is 2 orders of magnitude lower than even very small fish and 10^-5 as many as humans. And crabs are very similar. FWIW, I was not at all expecting to find this, and had no idea crustaceans had extremely disproportionately small brains. May as well link this Google doc as what I had written before I met some inconvenient statistics. I know not everyone is convinced that linear neuron comparisons are ideal, but they intuitively seem unlikely to be too far off from what "matters". Given this, I'm gonna conclude that Crustastun isn't worth pursuing unless we get more, different info about lobster sentience. On to the other bullet points!

There are also easy-access savings accounts giving a bit more than 1.3%: https://www.moneysavingexpert.com/savings/savings-accounts-best-interest/

Answer by Derek, Jul 11, 2022

If you are under 40 and might want to spend the money on a first property costing <450k, you could consider a Lifetime ISA (either cash or stocks & shares):

https://www.gov.uk/lifetime-isa https://www.moneysavingexpert.com/savings/lifetime-isas/


There is a lot of potential in fish welfare/stunning. In addition to what others have mentioned, IIRC from some reading a few years ago:

  •  The greatest bottleneck in humane slaughter is research, e.g. determining parameters/designing machines for stunning each major species, as they differ so much. There just aren't many experts in this field, and the leading researchers are mostly very busy (and pretty old), but perhaps financial incentives would persuade some people with the right sort of background to go into this area.
  • As well as electrical and percu
... (read more)
2
Amber Dawn
1y
I just had the exact same question, so thanks Aaron for asking this, and Derek for giving this answer :)
9
Aaron Gertler
2y
This is excellent, thanks! These two papers, in particular, were what I was looking for. The corresponding information on QALYs was also great. (For future readers of my post, the relevant info is under the "descriptive system" and "valuation methods" subheadings in Derek's post.)

Yeah that's what I use, and it's cheaper than the fancy Wiley-branded fish-based product he linked to.  You can get much cheaper fish oil, but if you're going to get the expensive stuff anyway (I guess due to concerns about the quality of the cheaper brands), why not get vegan?

[Recording of the talk and related papers]

 

You can now view the recording of the talk from Professor John Brazier - Extending the QALY beyond health - the EQ HWB (Health and Wellbeing)

 

Kaltura

https://digitalmedia.sheffield.ac.uk/media/t/1_8k5slrc4

 

YouTube

https://www.youtube.com/watch?v=KTlsIvqyhNI

 

Papers associated with this talk

Special issue of Value in Health Development papers: 

Brazier, J et al. ‘The EQ-HWB: overview of the development of a measure of health and well-being and key results’. Value in Health. https://www.sciencedi

... (read more)

FYI the E-QALY work has been progressing quite well since you asked that question; I've just come out of a webinar on it. Let me know if you want me to send you notes/slides.  

A few key points:

  1. The measure has been named the EuroQol Health and Wellbeing (EQ-HWB); E-QALY seems to be what they are calling the broader project of extending the scope of the QALY.
  2. Psychometric work and stakeholder consultation resulted in a 25-item 'long' measure, then further consultation resulted in a 9-item EQ-HWB-S (Short Form) covering 9 domains: Mobility, Daily activit
... (read more)
2
Derek
2y
[Recording of the talk and related papers]

You can now view the recording of the talk from Professor John Brazier - Extending the QALY beyond health - the EQ HWB (Health and Wellbeing)

Kaltura: https://digitalmedia.sheffield.ac.uk/media/t/1_8k5slrc4

YouTube: https://www.youtube.com/watch?v=KTlsIvqyhNI

Papers associated with this talk

Special issue of Value in Health Development papers:

Brazier, J et al. ‘The EQ-HWB: overview of the development of a measure of health and well-being and key results’. Value in Health. https://www.sciencedirect.com/science/article/pii/S1098301522000833

Mukuria, C et al. "Qualitative Review on Domains of Quality of Life Important for Patients, Social Care Users, and Informal Carers to Inform the Development of the EQ Health and Wellbeing." Value in Health (2022). https://www.sciencedirect.com/science/article/pii/S1098301521032277

Carlton, J et al. "Generation, Selection, and Face Validation of Items for a New Generic Measure of Quality of Life: The EQ Health and Wellbeing." Value in Health (2022). https://www.sciencedirect.com/science/article/pii/S1098301522000109

Peasgood, T et al. "Developing a New Generic Health and Wellbeing Measure: Psychometric Survey Results for the EQ Health and Wellbeing." Value in Health (2022). https://www.sciencedirect.com/science/article/pii/S1098301521031922

International papers:

Monteiro AL, et al. A Comparison of a Preliminary Version of the EQ Health and Wellbeing Short and the 5-Level Version EQ-5D. Value Health. 2022 Mar 8:S1098-3015(22)00051-1. doi: 10.1016/j.jval.2022.01.003. Epub ahead of print. PMID: 35279371.

Augustovski F, Argento F, Rocío R, Luz G, Mukuria C, Belizán M. The Development of a New International Generic Measure (EQ Health and Wellbeing): Face Validity And Psychometric Stages In Argentina. https://www.sciencedirect.com/science/article/abs/pii/S1098301522000134

Thanks. I tried 5-HTP a few years ago and didn't notice any benefit, but maybe I'll give it another go.

Thanks for the reply. I don't have much more time to think about this at the moment, but some quick thoughts:

  1. On time discounting: It might have been reasonable to omit discounting in this case for the reasons you suggest, but (a) it limits comparability across analyses if you or others do it elsewhere; (b) for various reasons, it would be good to have some estimate of the absolute, not just relative, costs and effects of these interventions; and (c) it's pretty easy to implement in most software, e.g. Excel and R (maybe less so in Guesstimate), so there is
... (read more)
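To illustrate the point that time discounting is simple to implement, here is a minimal Python sketch. The yearly effect values and the 3.5% rate are placeholders chosen for the example, not figures from the report under discussion, and the convention that year 0 is undiscounted is an assumption.

```python
from typing import Sequence


def discounted_total(values_by_year: Sequence[float], annual_rate: float) -> float:
    """Sum a stream of yearly values, discounting year t by (1 + rate) ** t.

    Year 0 is treated as undiscounted; other conventions exist.
    """
    return sum(value / (1 + annual_rate) ** year for year, value in enumerate(values_by_year))


# Hypothetical wellbeing effects that decay over five years.
effects = [1.0, 0.8, 0.6, 0.4, 0.2]
undiscounted = discounted_total(effects, annual_rate=0.0)
discounted = discounted_total(effects, annual_rate=0.035)  # placeholder rate

print(f"Undiscounted total: {undiscounted:.3f}")
print(f"Discounted total:   {discounted:.3f}")
```

The same function can be applied to a cost stream, which keeps both sides of a cost-effectiveness ratio on a consistent footing.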
2
JoelMcGuire
2y
Hi Derek, thank you for your comment and for clarifying a few things.
1. Time discounting: We will revisit time discounting when looking at interventions with longer time scales. To be clear, we plan to update these analyses for backwards compatibility as we introduce refinements to our models and analyse new interventions.
2. Costs: You’re right, expenses in an organisation can be lumpy over time. If costs are high in all previous years but low in 2019 and we only use the 2019 figures, we'd probably be making a wrong prediction about future costs. I think a reasonable way to account for this is by treating the cost for an organisation as an average of the previous years, where you give increasingly more weight to years closer to the present.
3. Depression data: Thanks for the clarification; I think I understand better now. We make a critical assumption that a one-unit improvement in depression scales corresponds to the same improvement in well-being as a one-unit change in subjective well-being scales. If SWB is our gold standard, we can ask if depression scale changes predict SWB scale changes. Our preliminary analyses suggest that the difference here would, in any case, be pretty small. For cash transfers, we found the 'SWB only' effect would be about 13% larger than the pooled 'SWB-and-MH' effect (see page 10, footnote 16). To assess therapy, we looked at some psychological interventions that had outcome measures in SWB and MH and found the SWB effect was 11% smaller (see p27-8). We'd like to dig further into this in the future. But these are not result-reversing differences.
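One possible reading of the cost-averaging idea in point 2 is a recency-weighted mean. The sketch below uses geometrically decaying weights; that weighting scheme, the `decay` parameter, and the cost figures are all assumptions for illustration, since the reply does not specify how the weights would be chosen.

```python
from typing import Sequence


def recency_weighted_mean(costs_oldest_first: Sequence[float], decay: float = 0.5) -> float:
    """Average yearly costs, weighting each year `decay` times the following one.

    The most recent year has weight 1, the year before it `decay`, and so on.
    The geometric scheme is an assumption of this sketch.
    """
    if not costs_oldest_first:
        raise ValueError("need at least one year of costs")
    n = len(costs_oldest_first)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    weighted_sum = sum(w * c for w, c in zip(weights, costs_oldest_first))
    return weighted_sum / sum(weights)


# Hypothetical cost per person treated, oldest year first; the last year is unusually low.
yearly_costs = [300.0, 280.0, 260.0, 120.0]
estimate = recency_weighted_mean(yearly_costs, decay=0.5)
print(f"Recency-weighted cost estimate: {estimate:.2f}")
```

The estimate sits between the latest figure and the simple average, so a single unusually cheap or expensive year moves the prediction without dominating it.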

There is much to be admired in this report, and I don't find it intuitively implausible that mental health interventions are several times more cost-effective than cash transfers in terms of wellbeing (which I also agree is probably what matters most). That said, I have several concerns/questions about certain aspects of the methodology, most of which have already been raised by others. Here are just a few of them, in roughly ascending order of importance:

  1. Outcomes should be time-discounted, for at least two reasons. First, to account for uncertainty as to
... (read more)
6
JoelMcGuire
2y
Hi Derek, it’s good to hear from you, and I appreciate your detailed comments. You suggest several features we should consider in our following intervention comparison and version of these analyses. I think trying to test the robustness of our results to more fundamental assumptions is where we are likeliest to see our uncertainty expand. But I moderately disagree that this is straightforward to adapt our model to. I’ll address your points in turn.
* Time discounting: We omitted time-discounting because we only look at effects lasting ten years or less. Given our limited time available, adding a section discussing time-discounting would not be worth the effort. It’s worth noting that adding time discounting would only make psychotherapy look better because cash transfers’ benefits last longer.
* Cost of StrongMinds: We include all costs StrongMinds incurs. The cost is "total expenditure of StrongMinds" / "number of people treated". We don't record any monetary cost to the beneficiary. If an expense to a beneficiary is bad because it decreases their wellbeing, we expect subjective well-being to account for that.
* Only depression data? We have subjective well-being and mental health measures for cash transfers, but only the latter for psychotherapy. We discuss why we don’t think differences between MH and SWB measures will make much difference in sections 3.1 of the CT CEA and Appendix A of the psychotherapy report. Section 4.4 of the psychotherapy report discusses the literature on social desirability/experimenter demand (what I take you’re pointing to with your concern about “loading the dice”). The limited evidence suggests, perhaps surprisingly, that people don’t seem very responsive to the perceived demands of the experimenter, in general, or in LMIC settings.
* Spillovers: We are working on updating our analysis to include household spillovers. We discuss the intra village spillovers in the cost-effectiveness analysis and the meta-analysis. I think we a

Is the CO2 accumulation entirely due to human (or I suppose animal) respiration? So it will typically be worse in small houses with lots of people (holding other factors, like ventilation, constant)?

In a modern house, with no open fires, lead paint etc, what "household air pollution" might there be?

2
RayTaylor
2y
More than you would think - a lot from the kitchen, some from (newer) furniture, some faecal matter from mites, house dust which is largely human skin, cleaning chemicals, ozone, positive ions (the bad ones) from laptops especially Macbooks, mould spores, etc. www.blf.org.uk/support-for-you/indoor-air-pollution/causes-and-effects www.epa.gov/indoor-air-quality-iaq/introduction-indoor-air-quality  But in many countries the original source of 'household' (indoor) air pollution is actually from outside the home: www.conserve-energy-future.com/causes-and-effects-of-indoor-air-pollution.php  www.ncbi.nlm.nih.gov/pmc/articles/PMC5089137  I bought two HEPA filters to help protect others during a home isolation, but I also had in mind that it would be useful afterwards!

Thanks - this is useful and I will explore some of the suggestions.

Is there much research comparing immediate vs extended release melatonin? E.g.:

  1. Is IR better for speeding sleep onset, as one might expect?
  2. Does XR actually improve sleep maintenance/duration more than IR? 
  3. Do they have the same effect on sleep efficiency?
  4. Is the optimal dose the same for each?
  5. Dose aside, do combined IR/XR supplements, or taking a bit of each, give you the 'best of both worlds'?
3
RayTaylor
2y
The decision may be between IR melatonin and ER 5-HTP which is a precursor: www.foodstuffs.ca/scrapbookmain/2017/5/14/5-htp-vs-melatonin "For some people, taking melatonin will help induce and maintain sleep. However, melatonin supplements usually only work if a person has low levels of melatonin in their system (this situation is commonly found in elderly persons). In other words, if you have normal levels of melatonin, taking melatonin supplements won't be as effective in helping you sleep. That's where 5-HTP comes in. Since it works on serotonin as well (and indirectly on melatonin), it may be a better supplement to take for individuals with normal levels of melatonin that are suffering from insomnia. Because it interacts with serotonin, people who are already on anti-depressants or MAOIs should talk to their doctor before trying 5-HTP (melatonin, on the other hand, is generally safe to use with these other drugs when taken as directed)." www.quora.com/What-is-the-difference-between-taking-melatonin-and-5HTP  General intro to 5-HTP and uses: www.mountsinai.org/health-library/supplement/5-hydroxytryptophan-5-htp 

[Edited on 19 Nov 2021: I removed links to my models and report, as I was asked to do so.]

Just to clarify, our (Derek Foster's/Rethink Priorities') estimated Effect Size of ~0.01–0.02 DALYs averted per paying user assumes a counterfactual of no treatment for anxiety. It is misleading to estimate total DALYs averted without taking into account the proportion of users who would have sought other treatment, such as a different app, and the relative effectiveness of that treatment. 

In our Main Model, these inputs are named "Relative impact of Alternative ... (read more)
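The adjustment described here can be sketched as follows. This is not the Main Model itself: the formula, the function name, and the input values other than the ~0.01–0.02 range quoted above are assumptions chosen to show how the share of users with an alternative, and that alternative's relative impact, could reduce the per-user estimate.

```python
def counterfactual_adjusted_effect(
    effect_vs_no_treatment: float,
    share_using_alternative: float,
    relative_impact_of_alternative: float,
) -> float:
    """DALYs averted per user, net of what users would have gained from an alternative.

    Assumes a share of users would otherwise have used another treatment that
    delivers `relative_impact_of_alternative` times the app's effect.
    """
    if not 0.0 <= share_using_alternative <= 1.0:
        raise ValueError("share_using_alternative must be between 0 and 1")
    forgone = share_using_alternative * relative_impact_of_alternative * effect_vs_no_treatment
    return effect_vs_no_treatment - forgone


# 0.02 is the upper end of the range quoted above; the other two inputs are hypothetical.
adjusted = counterfactual_adjusted_effect(
    effect_vs_no_treatment=0.02,
    share_using_alternative=0.5,
    relative_impact_of_alternative=0.5,
)
print(f"Counterfactual-adjusted DALYs averted per user: {adjusted:.4f}")
```

With no users switching from an alternative, the function returns the unadjusted estimate, which corresponds to the "no treatment" counterfactual.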

7
jh
2y
Hi Derek, hope you are doing well. Thank you for sharing your views on this analysis that you completed while you were at Rethink Priorities. The difference between your estimates and Hauke's certainly made our work more interesting. A few points that may be of general interest:
* For both analysts we used 3 estimates, an 'optimistic guess', 'best guess' and 'pessimistic guess'.
* For users from middle-income countries we doubled the impact estimates. Without reviewing our report/notes in detail, I don't recall the rationale for the specific value of this multiplier. The basic idea is that high-income countries are better served, more competitive markets, so apps are more likely to find users with worse counterfactuals in middle income countries.
* The estimates were meant to be conditional on Mind Ease achieving some degree of success. We simply assumed the impact of failure scenarios is 0. Hauke's analysis seems to have made more clear use of this aspect. Not only is Hauke's reading of the literature more optimistic, but he is more optimistic about how much more effective a successful Mind Ease will be relative to the competition.
* Indeed the values we used for Derek's analysis, for high income countries, were all less than 0.01. We simplified the 3 estimates, doing a weighted average across the two types of countries, into the single value of 0.01 for Derek's analysis after rounding up (I think the true number may be more like 0.006). The calculations in the post use rounded values so it is easier for a reader to follow. Nevertheless, the results are in line with our more detailed calculations in the original report.
* Similar to this point of rounding, we simplified the explanation of the robustness tilt we applied. It wasn't just about Derek vs Hauke. It was also along the dimensions of the business analysis (e.g. success probabilities). We simplified the framing of the robustness tilt both here and in a 'Fermi Estimate' section of the original report

[Edited on 19 Nov 2021: I was asked to remove the links.]

For those who are interested, here is the write-up of my per-user impact estimate (which was based in part on statistical analyses by David Moss): [removed]

The Main Model in Guesstimate is here: [removed]

The Effect Size model, which feeds into the Main Model, is here: [removed]

I was asked to compare it to GiveDirectly donations, so results are expressed as such. Here is the top-level summary:

Our analysis suggests that, compared to doing nothing to relieve anxiety, MindEase causes about as much benefi

... (read more)

Hi Sam,

Thanks for the comments.

1. Have you done much stakeholder engagement? No. I discuss this a little bit in this section of Part 2, but I basically just suggest that people look into this and come up with a strategy before spending a huge amount of time on the research. I do know of academics who may be able to advise on this, e.g. people who have developed previous metrics in consultation with NICE etc, but they’re busy and I suspect they wouldn’t want to invest a lot of time into efforts outside academia.

I think they’d reject the assumption tha... (read more)

2
Derek
2y
FYI the E-QALY work has been progressing quite well since you asked that question; I've just come out of a webinar on it. Let me know if you want me to send you notes/slides.

A few key points:
1. The measure has been named the EuroQol Health and Wellbeing (EQ-HWB); E-QALY seems to be what they are calling the broader project of extending the scope of the QALY.
2. Psychometric work and stakeholder consultation resulted in a 25-item 'long' measure, then further consultation resulted in a 9-item EQ-HWB-S (Short Form) covering 9 domains: Mobility, Daily activities, Pain, Fatigue, Loneliness, Concentration & thinking clearly, Depression, Anxiety, Control.
3. A feasibility valuation study in 521 members of the UK public uses the time tradeoff (TTO, EQ-VT protocol) and discrete choice experiments (DCE). Due to covid this was done using video conferencing.
4. There was also a deliberative exercise with a 12-member panel of experts at NICE which reviewed the valuation results.
5. Based on the size of the utility decrement associated with the most severe level of each dimension, the order of importance is: Pain (by a long way); Mobility; Daily activities; Depression; Loneliness; Anxiety; Fatigue; Control; Concentration. (To me, the weight given to Mobility in particular might indicate that this measure does not overcome some of the biggest problems with earlier measures like the EQ-5D, though it seems to be much better overall.)
6. Other valuation studies, using different methodologies, are underway or planned. As far as I know, these don't include ones that obtain weights based on SWB, but I think they will be looking at own-state utilities (i.e. weights derived from preferences of people with the relevant conditions).
7. Several papers are being published on it this year in a special edition of the journal Value in Health.
8. It started with a grant of 850,000 GBP; more has been spent since, but I'm not sure how much.
9. NICE still seems wedded to the EQ-5D fo

I've made a few edits to address some of these issues, e.g.:

Clearly, there are many possible “wellbeing approaches” to economic evaluation and population health summary, defined both by the unit of value (hedonic states, preferences, objective lists, SWB) and by how they aggregate those units when calculating total value. Indeed, welfarism can be understood as a specific form of desire theory combined with a maximising principle (i.e., simple additive aggregation); and extra-welfarism, in some forms, is just an objective list theory plus equity (i.e., no

... (read more)

Hi Michael. Thanks for the feedback.

A few general points to begin with:

  1. I think it’s generally fine to use terminology any way you like as long as you’re clear about what you mean.
  2. In this piece I was summarising debates in health economics, and my framing reflects that literature.
  3. The main objective of these posts is to highlight particular issues that may deserve further attention from researchers, and sometimes that has to come at the expense of conceptual rigour (or at least I couldn’t think of a way to avoid that tradeoff). Like you, my natural incli
... (read more)
1
Derek
3y
I've made a few edits to address some of these issues, e.g.: ---------------------------------------- ---------------------------------------- Changed the first two problem headings to avoid ambiguity and, in the first case, to focus on the result of the problem rather than the cause, which helps distinguish it from 5.

As far as I can tell, it isn't possible to have line breaks in footnotes (though I may just be doing something wrong). This also precludes bulleted/numbered lists, block quotes, etc. Any chance that could be changed? 

3
Aaron Gertler
3y
See the "long footnote with multiple blocks" syntax here. You need to indent successive lines within a footnote to add line breaks by adding four spaces in front of each line. See here for an example of someone doing this in a post.

H3s are still being converted to regular Paragraph format when I paste them in from GDocs. What am I doing wrong?

3
Aaron Gertler
3y
H3 headers should be available again soon; the feature broke after a recent migration.
3
MichaelA
3y
I had the same problem when posting a few days ago. Though I think level 3 headings work for me if I use the markdown editor (e.g., a paragraph that only has "### How often have people been wrong about such things in the past?" will show up as a level 3 heading). And when I just put a sentence fragment in a line by itself and in bold, it at least showed up in the sidebar as if it was a level three heading. (Well, one of them didn't initially work, but then I fixed it somehow - I think the fix was simple, but can't remember.)

I'm sure there are many giving opportunities in global health that are better than the GiveWell top charities, and I'm pleased to see promising small or medium-sized projects like this being brought to the attention of EAs. 

However, I think you should try to get better estimates of QALYs gained (or DALYs averted)—especially if you're going to feature the cost-effectiveness ratio so prominently in your write-up. This should be possible by referring to the relevant literature. The current estimates don't seem all that plausible to me, e.g. an episode of... (read more)

1
brb243
4y
Hello! I found the dataset that I thought I saw before: the Institute for Health Metrics and Evaluation (IHME) Global Burden of Disease Study 2017 (GBD 2017) Disability Weights. Disability weights are the changes in Health-related Quality of Life (HRQoL) due to a condition. I re-ran the calculations and found the cost-effectiveness of the mobile clinics project to be 26.63 USD/QALY, with a low estimate of 184.14 USD/QALY and a high estimate of 6.33 USD/QALY. I used the same data to estimate the cost-effectiveness of AMF and found 56.07 USD/QALY (low 112.14 and high 11.21). The Business Insider AMF number is about 49.76 USD/QALY. Thus, these updated calculations may be more accurate. Still, the calculations do not take into account the preventive care outcomes, deaths averted due to the Ebola outbreak response, and economic benefits (e.g. of deworming) that may lead to further health improvements, let alone the positive long-term virtuous cycle of improved health and wealth - but that may apply to other health-related programs too.
1
brb243
4y
Hello. I apologize for the late reply. I was moving over the weekend. I am looking at the IHME DALY by cause data (my calculations here) but these do not seem to take into account the long-term effects of the diseases. For example, deworming and vitamin A supplementation may have positive long-term effects in terms of schooling and economic gains that may far outweigh the direct short-term QALY losses. From there the upper estimate of 5. Simple malaria I would presume to be one that does not require immediate medical attention but that may still result in a severe condition if untreated (CDC). For the life-threatening conditions, my rationale was also that children treated for severe acute malnutrition are younger than average-age patients and that persons who survive 5 years live on average longer than life expectancy. Also, the QALY estimates do not take into account the effects of preventive measures - e.g. almost 90,000 persons informed on STIs and the response to a cholera outbreak (training and material provided) - before the intervention, 5 persons died; after it, no other deaths occurred. On that note, I would actually appreciate it if anyone could provide more credible estimates, taking into account the effectiveness and long-term consequences of the treatment. I am sure that REO would welcome such cooperation, also for capacity building reasons.

"Next" and "Previous" arrows/buttons at the bottom of a post, to move to the next/previous post - useful when you haven't read the forum for a while and want to catch up. This would obviously have to assume a certain ordering (e.g. chronological vs karma) and selection (e.g. all or excluding Community/Questions), which could perhaps be adjusted in Settings.

Level 3 headings should be supported. Unless it's changed recently, it currently jumps from Level 2 to Level 4, which makes it hard to logically format complex documents.

4
JP Addison
4y
It has. We no longer apply the same styling to h2 and h3. While you still can’t create h3s using the editor, you can paste in from google docs and they will appear correctly. Sorry for not mentioning this anywhere, it’s such an invisible change — I don’t know what I was thinking. (Unfortunately, I will need to remake this change once the new editor ships. LessWrong does not want its posts to have more than 3 levels of headings [h1, h2 and bold text]. I don’t think that’s the right choice for the EA Forum, but sometimes their updates won’t be checked for compatibility with minor features of the Forum).
6
Will Bradshaw
4y
Strongly agree with this, have been very frustrated in the past with how the Forum (via LessWrong) coerces my header usage. It looks bad in the sidebar too.

Thanks for the comments!

1. The put could cover ~90% of the cost of the accelerated production, taking into account the additional costs.

2. Sales are likely to be higher if they move more quickly: the company with the first billion vaccines is likely to sell a lot more items than the company with the second, and this could more than offset any additional costs. (The second may not sell any, even if it’s a good product, if the first can meet all needs quickly enough.)

3. Some variants outlined in the brief, such as declining payouts, can further incent...

1
edcon
4y
Thanks for your response! I think this is a really promising idea. Just a few minor points. 1/ I agree that, if set up right, it could incentivise pace if it includes accelerated costs, esp. if it erred on the side of being overly generous. I'm just sceptical it will do this to a large extent, as some costs of haste are hard to quantify, e.g. moving the best/more staff onto this project to the detriment of other projects, and I doubt these would be covered in a politically feasible payout structure (e.g. 1% a month). 2/ I think the market incentive for coming first to market is quite small, as there is large social pressure to sell these products at low margins and the market for some of these products (esp. vaccines) is so huge compared to manufacturing capability, so the first-mover advantage seems small, though this certainty about the size of the market if the put options are not used will not apply to all products.
Answer by Derek, Apr 03, 2020
6
0
0

Should Covid-19 be a priority for EAs?

A scale-neglectedness-tractability assessment, or even a full cost-effectiveness analysis, of Covid as a cause area (compared to other EA causes) could be useful. I'm starting to look into this now – please let me know if it's already been done.

1
El-Nino9
3y
Was asking myself the same question

Suicide is a very poor indicator of the dead/neutral point, for a host of reasons.

A few small, preliminary surveys I've seen place it around 2/10, though it ranges from about 0.5 to 6 depending on whom and how you ask.

(I share your concerns in parentheses, and am doing some work along these lines - it's been sidelined in part due to covid projects.)

1
bfinn
4y
OK, well reworking the numbers with a 2/10 neutral point (and Imperial's latest figures as noted below): Death is now a fall from 5.17 to 2 points, i.e. by 3.17 points, though presumably out of 8 not 10 as we've compressed our scale. So 4.5 years = 4.5 x 3.17/8 = 1.78 WALYs lost. So 1.9 to 24 million deaths = 3.4 to 43 million WALYs lost. Presumably the WALYs lost to the financial crisis are also out of 8 not 10, i.e. 0.2/8 per person = 194 million WALYs. Which is 4.5 to 57 times worse than the deaths.
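
The arithmetic in the reply above can be reproduced with a short sketch. The variable names are illustrative, and the world population of 7.76 billion is an assumption back-solved from the quoted 194 million figure (it is not stated in the comment); the sketch also assumes the 0.2-point drop lasts one year.

```python
# Figures quoted in the comment above; names are illustrative only.
NEUTRAL_POINT = 2.0           # dead/neutral point on the 0-10 scale
AVERAGE_WELLBEING = 5.17      # average wellbeing score before death
SCALE_MAX = 10.0
YEARS_LOST_PER_DEATH = 4.5
DEATHS_LOW, DEATHS_HIGH = 1.9e6, 24e6
CRISIS_DROP_PER_PERSON = 0.2  # wellbeing points lost per person (one year assumed)
WORLD_POPULATION = 7.76e9     # ASSUMPTION: back-solved from the 194 million figure

# The scale is compressed to the 8 points between the neutral point and the top.
span = SCALE_MAX - NEUTRAL_POINT
drop_at_death = AVERAGE_WELLBEING - NEUTRAL_POINT

per_death = YEARS_LOST_PER_DEATH * drop_at_death / span
deaths_walys_low = DEATHS_LOW * per_death
deaths_walys_high = DEATHS_HIGH * per_death
crisis_walys = WORLD_POPULATION * CRISIS_DROP_PER_PERSON / span

# Smallest ratio uses the highest death toll, and vice versa.
ratio_low = crisis_walys / deaths_walys_high
ratio_high = crisis_walys / deaths_walys_low

print(f"WALYs lost per death: {per_death:.2f}")
print(f"WALYs lost to deaths: {deaths_walys_low / 1e6:.1f} to {deaths_walys_high / 1e6:.0f} million")
print(f"WALYs lost to crisis: {crisis_walys / 1e6:.0f} million")
print(f"Crisis vs deaths: {ratio_low:.1f} to {ratio_high:.0f} times")
```

Changing `NEUTRAL_POINT` shows how sensitive the comparison is to where the dead/neutral point is placed.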

Hah! I was working on them before getting sidelined with covid stuff.

I can send you the drafts if you send me a PM. The content is >80% done (I've decided to add more, so the % complete has dropped) but they need reorganising into ~10 manageable posts rather than 3 massive ones.

Thanks Aidan! Hope you're feeling better now.

Most of your comments sound about right.

On retention rates: Your general methods seem to make sense, since one would expect gradual tapering off of benefits, but your inputs seem even more optimistic than I originally thought.

I'm not sure StrongMinds is a great benchmark for retention rates, partly because of the stark differences in context (rural Uganda vs UK cities), and partly because IIRC there were a number of issues with SM's study, e.g. a non-randomised allocation and evidence of social ...

2
AidanGoth
4y
Yes, feeling much better now fortunately! Thanks for these thoughts and studies, Derek. Given our time constraints, we did make some judgements relatively quickly but in a way that seemed reasonable for the purposes of deciding whether to recommend AfH. So this can certainly be improved and I expect your suggestions to be helpful in doing so. This conversation has also made me think it would be good to explore six monthly/quarterly/monthly retention rates rather than annual ones - thanks for that. :) Our retention rates for StrongMinds were also based partly on this study, but I wasn't involved in that analysis so I'm not sure on the details of the retention rates there.

Do you think GiveWell top charities are the best of all current giving opportunities? If so, what is the next best opportunity?

Do you think adopting subjective wellbeing as your primary focus would materially affect your recommendations?

In particular:

(a) Would using SWB as the primary outcome measure in your cost-effectiveness analysis change the rank ordering of your current top charities in terms of estimated cost-effectiveness?

(b) If it did, would that affect the ranking of your recommendations?

(c) Would it likely cause any of your current top charities to no longer be recommended?

(d) Would it likely cause the introduction of other charities (such as ones focused on mental health) into your top charity list?

How likely is it that GiveWell will ultimately (e.g. over a 100-year or 10,000-year period) do more harm than good? If that happens, what is the most likely explanation?

A recent post on this forum (one of the most upvoted of all time) argued that "randomista" development projects like GiveWell's top charities are probably less cost-effective than projects to promote economic growth. Do you have any thoughts on this?

I like your general approach to this evaluation, especially:

  • the use of formal Bayesian updating from a prior derived in part from evidence for related programmes
  • transparent manual discounting of the effect size based on particular concerns about the direct study
  • acknowledgement of most of the important limitations of your analysis and of the RCT on which it was based
  • careful consideration of factors beyond the cost-effectiveness estimate.

I'd like to see more of this kind of medium-depth evaluation in EA.

I don't have time at the moment for a close ...

2
AidanGoth
4y
Thanks very much for this thoughtful comment and for taking the time to read and provide feedback on the report. Sorry about the delay in replying - I was ill for most of last week.
1. Yes, you're absolutely right. The current bounds are very wide and they represent extreme, unlikely scenarios. We're keen to develop probabilistic models in future cost-effectiveness analyses to produce e.g. 90% confidence intervals and carry out sensitivity analyses, probably using Guesstimate or R. We didn't have time to do so for this project but this is high on our list of methodological improvements.
2. Estimating the retention rates is challenging so it's helpful for us to know that you think our values are too high. We based this primarily on our retention rate for StrongMinds, but adjusted downwards. It's possible we anchored on this too much. However, it's not clear to me that our values are too high. In particular, if our best-guess retention rate for AfH is too high, then this is probably also true for StrongMinds. Since we're using StrongMinds as a benchmark, this might not change our conclusions very much. The total benefits are calculated somewhat confusingly and I appreciate you haven't had the chance to look at the CEA in detail. If x is the effect directly post-treatment and r is the retention rate, we calculated the total benefits as
$$\frac{1}{2}x + \sum_{i=1}^{\infty} r^i x = \frac{x(1+r)}{2(1-r)}$$
That is, we assume half a year of full effect, and then discount each year that follows by r each time. We calculated it in this way because for StrongMinds, we had 6 month follow-up data. However, it's not clear that this approach is best in this case. It might have been better to:
* Assume 0.15 years at full effect
  * Since the study has only an 8 week follow-up, as you mention
* Assume somewhere in between 0.15 and 0.5 years at full effect
  * Since the effects still looked very good at 8 week follow-up (albeit with no control) and evidence from interventions such as StrongMinds that suggest lo
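
The total-benefit formula in the reply above can be checked numerically. This is a minimal sketch, not the evaluators' actual model: the function names are hypothetical, and the inputs (x = 1, r = 0.5) are illustrative values that do not come from the report.

```python
def total_benefit(x, r, full_effect_years=0.5):
    """Full effect x for `full_effect_years`, then r*x, r^2*x, ... in each later year.

    Uses the geometric-series sum x*r/(1 - r), which requires 0 <= r < 1.
    """
    if not 0 <= r < 1:
        raise ValueError("retention rate r must satisfy 0 <= r < 1")
    return full_effect_years * x + x * r / (1 - r)


def closed_form(x, r):
    """The expression quoted in the comment, x(1+r) / (2(1-r)), for half a year at full effect."""
    return x * (1 + r) / (2 * (1 - r))


def total_benefit_truncated(x, r, full_effect_years=0.5, years=200):
    """Same quantity summed term by term over a finite horizon, as a sanity check."""
    return full_effect_years * x + sum(x * r**i for i in range(1, years + 1))


# Illustrative inputs only (not taken from the report).
x, r = 1.0, 0.5
print(total_benefit(x, r))                          # half a year at full effect
print(closed_form(x, r))                            # should agree with the line above
print(total_benefit(x, r, full_effect_years=0.15))  # the 8-week alternative
```

Passing `full_effect_years=0.15` corresponds to the alternative mentioned in the reply, where only the 8-week follow-up period is counted at full effect.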
There is also evidence that health problems have a much smaller effect on subjective well-being than one might imagine.

This is only the case for (some) physical health problems, especially those associated with reduced mobility. People tend to underestimate the SWB impact of (at least some) mental health problems. See e.g. Gilbert & Wilson, 2000; De Wit et al., 2000; Dolan & Kahneman, 2007; Dolan 2008; Pyne et al., 2009; Karimi et al., 2017

2
AidanGoth
4y
Yes, we had physical health problems in mind here. I appreciate this isn't clear though - thanks for pointing it out. Indeed, we are aware of the underestimation of the badness of mental health problems and aim to take this into account in future research in the subjective well-being space.

You might want to mention the publication date (1937)

3
Aaron Gertler
4y
Added a note, thanks!

Thanks - I missed that on my skim. But the "extended" follow-up is only for another two months. It does seem to indicate that effects persist for at least that period, without any trend towards baseline, which is promising (though without a control group the counterfactual is impossible to establish with confidence). I wonder why they didn't continue to collect data beyond this period.

Thanks - "trained facilitator" might be a bit misleading. Still, it looks like there were two volunteer course leaders for each course, selected in part for their unspecified "skills", who were given "on-going guidance and support" to facilitate the sessions, and who have to arrange a venue etc themselves, then go through a follow-up process when it's over. So it's not a trivial amount of overhead for an average of 13 participants.
