Striking paper by Anant Sudarshan and Eyal Frank (via Dylan Matthews at Vox Future Perfect) on the importance of vultures as a keystone species.
To quote the paper and newsletter, the basic story: vultures are extraordinarily efficient scavengers, eating nearly all of a carcass within an hour of finding it, and farmers in India historically relied on them to quickly remove livestock carcasses, so they functioned as a natural sanitation system, controlling diseases that would otherwise spread from rotting remains. In 1994, after a long-held Novartis patent expired and cheap generics from Indian manufacturers entered the market, farmers began using diclofenac to treat their livestock. Diclofenac is a common painkiller, harmless to humans, but vultures develop kidney failure and die within weeks of digesting carrion with even small residues of it. Unfortunately this only came to light via research published a decade later, in 2004, by which time the number of Indian vultures in the wild had tragically plummeted from tens of millions to just a few thousand today – the fastest decline of any bird species in recorded history, and the largest in magnitude since the extinction of the passenger pigeon.
When the vultures died out, far more dead animals lay around rotting, transmitting pathogens to other scavengers like dogs and rats and entering the water supply. Dogs and rats are less efficient than vultures at fully eliminating flesh from carcasses, leading to a higher incidence of human contact with infected remains, and they're also more likely to transmit diseases like anthrax and rabies to people. Sudarshan and Frank estimate that this led to ~100,000(!) additional deaths each year from 2000-05, via a +4.2%(!) increase in all-cause mortality among the 430 million people living in districts that once had many vultures. That's staggering: it exceeds India's 2001 death tolls from HIV/AIDS (92,000), malaria (53,000), and alcohol use disorders (14,000).
(Cause X, anyone? Preventing a hundred thousand deaths a year for less than half a billion dollars annually clears the GiveWell top charity-level threshold, and half a billion is in the ballpark of Open Philanthropy's entire annual grantmaking...)
So what to do? For vultures in particular, Sudarshan and Frank say their results "inform current vulture recovery efforts in India, and conservation efforts elsewhere" e.g. parts of Africa and Spain, albeit without elaborating. More broadly, they hope their paper informs better policymaking by providing "a particularly stark example of the type of hard-to-reverse and unpredictable costs that must be accounted for when evaluating the introduction of new chemicals into fragile and diverse ecosystems", stating "it is plausible that a counterfactual policy regime in India that tested chemicals for their toxicity to at least keystone species might have avoided the collapse of vultures". They conclude:
In the absence of empirical estimates of the social benefits conferred by different species, conservation policy may be heavily influenced by existence values unrelated to utility. The vulture is not a particularly attractive bird and evokes rather different emotions at first sight than do more charismatic poster-animals of wildlife conservation such as tigers and panda bears. Nevertheless our results suggest that subjective existence values alone may not be the best way to formulate conservation policy.
The remark that vultures are not particularly attractive reminds me of the overlooked plight of farmed chickens, shrimp, insects etc for not being charismatic fauna. (I am admittedly sort of emotionally conflating the welfare of vultures with their ecosystem importance as a keystone species here.)
Navigating the global health funding landscape can be confusing even for global health veterans; there are scores of donors and multilateral funding mechanisms, each with its own particular structure, personality, and philosophy. For the uninitiated, PEPFAR, GAVI, PMI, WHO, the Global Fund, UNITAID, and the Gates Foundation can all appear obscure and intimidating. But if your head is spinning from acronym-induced vertigo, fear not! We are here to help you make sense of it all. How, you ask? With a clear method for donor identification: comparing the donors to your parents. So what would happen if the donors were your parents and you asked them for a new car?
PEPFAR: Ok, we’ll buy you a new car, but we’re going with you to the dealership and it must be American-made. At least one seat must be devoted to abstinence and the delay of sexual debut. Before you drive the car, you must promise not to support prostitution. Each quarter, you must report how many miles you’ve driven with how many passengers, with a target of 1000 passenger-miles per month.
President’s Malaria Initiative: We’ve made it very clear that we only support four proven, cost-effective interventions for child rearing: food, clothing, health care, and education. What, do you think money in the Malaria family just grows on trees? Just because HIV/AIDS has a shiny new car doesn’t mean we can afford it.
UNITAID: We’ve identified pediatric vehicles as a niche market which is currently underserved by the major transport providers. By buying cars for you and all our other children, we are helping to create a pediatric automotive market with new and superior transportation commodities. Prior to our innovative entry into the pediatric vehicle market, most of our potential beneficiaries were getting around using lower-quality forms of transportation, such as bicycles, buses, and walking.
GAVI: We will purchase and deliver a car for you from a particular GAVI-approved dealership. However, you must co-finance the purchase with wages from your part-time job. Gas and insurance will require separate applications.
WHO: Sorry, we haven’t had a car budget in ten years. But we DO have a new set of guidelines on best practices for safe car driving, and a box full of old Carfax vehicle reports that you’re welcome to look at any time. Please let us know right away if you experience any engine trouble; regular and reliable reporting allows us to maintain an up-to-date transmission failure surveillance system. And don’t forget to celebrate Vehicle Safety Day on May 11!
Gates Foundation: Of course, darling, we gave your boarding school plenty of money to buy a car. And since we’re on the Board, we’ll make sure they buy the right car. And you can drive it any time you want…as long as one of us is in the passenger seat to make sure you’re going the right way.
Global Fund: We’ve reviewed your proposal for a Range Rover and according to Consumer Reports it is a technically capable car for city driving. Here is a $70,000 check for you to go and buy the Range Rover, as discussed in your proposal.
I like Austin Vernon's idea for scaling CO2 direct air capture to 40 billion tons per year, i.e. matching our current annual CO2 emissions, using (extreme versions of) well-understood industrial processes.
The proposed solution may not be the cheapest out there. Other ideas like ocean seeding or olivine weathering might be less expensive. But most of the science is understood, and it can scale quickly. I'd guess 100,000 workers could build enough sites to capture our 40 billion tons goal in a decade. The capital expenditure rate would be between $1 trillion and $5 trillion yearly, or 1% to 5% of global GDP. That cost and deployment speed take doomer scenarios off the table. Say something scary like melting permafrost threatens runaway warming. You can target the area with a few years of sulfur cooling while a tiny portion of the global economy builds carbon capture devices. It is nothing like a wartime mobilization.
The most disruptive aspect would be energy usage. We'd need to ramp output up at double-digit rates because each ton of CO2 requires 2-3 MWh of energy for removal. Thankfully low-grade heat is easy to come by. There is enough energy near coal mines in Wyoming or natural gas fields in SW Pennsylvania at less than $5/MWh. Other places might use solar, hydro, or geothermal steam if they lack fossil fuel reserves. The key is to put the facilities at the energy sources instead of trying to move the energy. Cheap energy makes the operating costs <1% of global GDP. Many clean energy proponents have fretted about how to keep fossil fuel reserves in the ground. Burning them to run carbon capture equipment kills two birds with one stone!
The takeaway is that we could completely turn around the carbon dioxide problem within a few years with a similar spending rate as rich world COVID relief. There won't be a scenario where we've waited too long to act.
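A quick sanity check on the quoted figures (a hedged back-of-envelope; the capture target, energy intensity, and energy price are from the excerpt above, while the ~$100T world GDP input is my own rough assumption, not Vernon's):

```python
# Back-of-envelope on the quoted figures only; nothing here is from
# Vernon's underlying model.
CO2_TONS_PER_YEAR = 40e9   # capture target matching current annual emissions
PRICE_PER_MWH = 5          # "<$5/MWh" low-grade heat near coal/gas fields
GLOBAL_GDP = 100e12        # ~$100T world GDP, my rough assumption

for mwh_per_ton in (2, 3):  # "each ton of CO2 requires 2-3 MWh"
    energy_twh = CO2_TONS_PER_YEAR * mwh_per_ton / 1e6  # MWh -> TWh
    opex = CO2_TONS_PER_YEAR * mwh_per_ton * PRICE_PER_MWH
    print(f"{mwh_per_ton} MWh/ton: ~{energy_twh:,.0f} TWh/yr of heat, "
          f"opex ~${opex / 1e12:.1f}T/yr (~{opex / GLOBAL_GDP:.1%} of GDP)")
# 2 MWh/ton: ~80,000 TWh/yr of heat, opex ~$0.4T/yr (~0.4% of GDP)
# 3 MWh/ton: ~120,000 TWh/yr of heat, opex ~$0.6T/yr (~0.6% of GDP)
# Consistent with Vernon's "operating costs <1% of global GDP" claim.
```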
I admittedly may be biased toward wanting moonshots like Vernon's idea to work, and toward believing society at large can coordinate and act on the required scale, after seeing the depressing charts in Assessing the costs of historical inaction on climate change.
Curious what people think of Gwern Branwen's take that our moral circle has historically narrowed as well as expanded (contra Singer), such that we should probably just call it a shifting circle. His summary:
The “expanding circle” historical thesis ignores all instances in which modern ethics narrowed the set of beings to be morally regarded, often backing its exclusion by asserting their non-existence, and thus assumes its conclusion: where the circle is expanded, it’s highlighted as moral ‘progress’, and where it is narrowed, what is outside is simply defined away.
When one compares modern with ancient society, the religious differences are striking: almost every single supernatural entity (place, personage, or force) has been excluded from the circle of moral concern, where they used to be huge parts of the circle and one could almost say the entire circle. Further examples include estates, houses, fetuses, prisoners, and graves.
(I admittedly don't find his examples all that persuasive, probably because I'm already biased to only consider beings that can feel pleasure and suffering.)
What's the "so what"? Gwern:
One of the most difficult aspects of any theory of moral progress is explaining why moral progress happens when it does, in such apparently random non-linear jumps. (Historical economics has a similar problem with the Industrial Revolution & Great Divergence.) These jumps do not seem to correspond to simply how many philosophers are thinking about ethics.
As we have already seen, the straightforward picture of ever more inclusive ethics relies on cherry-picking if it covers more than, say, the past 5 centuries; and if we are honest enough to say that moral progress isn’t clear before then, we face the new question of explaining why things changed then and not at any point previous in the 2500 years of Western philosophy, which included many great figures who worked hard on moral philosophy such as Plato or Aristotle.
It is also troubling how much morality & religion seems to be correlated with biological factors. Even if we do not go as far as Julian Jaynes’s theories of gods as auditory hallucinations, there are still many curious correlations floating around.
Why did India's happiness ratings consistently drop so much over time even as its GDP per capita rose?
Epistemic status: confused. Haven't looked into this for more than a few minutes
My friend recently alerted me to an observation that puzzled him: this dynamic chart from Our World in Data's happiness and life satisfaction article shows India's self-reported life satisfaction dropping an astounding 1.20 points (from 4.97 to 3.78) from 2011 to 2021, even as its GDP per capita rose +51% (I$4,374 to I$6,592 in 2017 prices).
(I included China for comparison to illustrate the sort of trajectory I expected to see for India.)
The sliding year scale on OWID's chart shows how this drop has been consistent and worsening over the years. This picture hasn't changed much recently: the most recent 2024 World Happiness Report reports a 4.05 rating averaged over the 3-year window 2021-23, only slightly above the 2021 rating.
A 1.20-point drop is huge. For context, it's 10x(!) larger than the effect of doubling income, at +0.12 LS points (Clarke et al 2018 p199, via HLI's report), and is comparable to major negative life events like widowhood and extended unemployment.
Given India's ~1.4 billion population, such a large drop is alarming: ballparking very roughly, something like ~5 billion LS-years lost since 2011. For context, and keeping in mind that LS-years and DALYs aren't the same thing, the entire world's DALY burden is ~2.5 billion DALYs p.a.
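Here's one way to reproduce that order of magnitude (a hedged sketch; the linear-ramp assumption and the population figure are mine, not from any source):

```python
# Very rough reconstruction of the LS-years ballpark above; all
# assumptions are mine, not the original calculation.
POPULATION = 1.35e9   # rough average population of India over the period
FINAL_DROP = 1.20     # LS points lost between 2011 and 2021
YEARS = 10

# Assume the drop ramped up roughly linearly, so the average shortfall
# over the decade is about half the final drop:
avg_shortfall = FINAL_DROP / 2
ls_years_lost = POPULATION * avg_shortfall * YEARS
print(f"~{ls_years_lost / 1e9:.0f} billion LS-years")  # ~8 billion
# Same ballpark (single-digit billions) as the ~5 billion figure above.
```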
But – again caveating with my lack of familiarity with the literature and extremely cursory look into this – I haven't seen any writeup look into this, which makes me wonder if it's not a 'real issue'? For instance, the 2021 WHR just says
Since 2006-08, world well-being has been static, but life expectancy increased by nearly four years up to 2017-19 (we shall come to 2020 later). The rate of progress differed a lot across regions. The biggest improvements in life expectancy were in the former Soviet Union, in Asia, and (the greatest) in Sub-Saharan Africa. And these were the regions that had the biggest increases in WELLBYs. In Asia, the exception is South Asia, where India has experienced a remarkable fall in Well-being which more than outweighs its improved life expectancy.
That's it: no elaboration, no footnotes, nothing.
So what am I missing? What's going on here?
A quick search turned up this WEF article (based on Ipsos data and research, not the WHR's Gallup World Poll, so take it with a grain of salt) pointing to
increased internet access -> pressure to portray airbrushed lives on social media & a feeling that 'their lives have become meaningless'
covid-19 mitigation-induced isolation curtailing activities that improve wellbeing (employment, socializing, going to school, exercising and accessing health services)
urban migration to seek work -> traffic congestion, noise and pollution, demanding bosses -> less sleep and exercise -> higher anxiety and worsening health
But I'm not sure these factors are differential (i.e. that they, for instance, happen much more in India than elsewhere s.t. it explains the wellbeing vs development trajectory difference over 2011-24)?
Interesting! I think figure 2.1 here provides a partial answer. According to the FAQ:
"the sub-bars show the estimated extent to which each of the six factors (levels of GDP, life expectancy, generosity, social support, freedom, and corruption) is estimated to contribute to making life evaluations higher in each country than in Dystopia. Dystopia is a hypothetical country with values equal to the world’s lowest national averages for each of the six factors (see FAQs: What is Dystopia?). The sub-bars have no impact on the total score reported for each country but are just a way of explaining the implications of the model estimated in Table 2.1. People often ask why some countries rank higher than others—the sub-bars (including the residuals, which show what is not explained) attempt to answer that question."
India seems to score very low on social support, compared to similarly ranked countries.
I did some googling and found this, which shows the sub-factors over time for India. Looks like social support declined a lot, but is now increasing again.
I haven't checked whether it declined more than in other countries and, if it has, I'm not sure why it has.
Your second link helped me refine my line of questioning / confusion. You're right that social support declined a lot, but the sum of the six key variables (GDP per capita, etc) still mostly trended upwards over time, huge covid dip aside, which is what I'd expect in the India development success story.
It's the dystopia residual that keeps dropping: from 2.275 - 1.83 = 0.445 in 2015 (i.e. Indians reported 0.445 points higher life satisfaction than you'd predict using the model) to 0.979 - 1.83 = -0.85 now – a plummeting of life satisfaction across a sizeable fraction of the world's population that is, for some reason, not explained by the six key variables. Hm...
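To spell out the arithmetic (my sketch of the WHR decomposition as I understand it; the 1.83 dystopia baseline and the dystopia-plus-residual values are from the data linked above):

```python
# WHR decomposition: ladder score = dystopia baseline + six factor
# contributions + residual, so the published "dystopia + residual"
# value minus the baseline gives the residual.
DYSTOPIA_BASELINE = 1.83

for label, dystopia_plus_residual in [("2015", 2.275), ("latest", 0.979)]:
    residual = dystopia_plus_residual - DYSTOPIA_BASELINE
    print(f"{label}: residual = {residual:+.3f}")
# 2015: residual = +0.445   (higher LS than the model predicts)
# latest: residual = -0.851 (lower LS than the model predicts)
```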
(please don't feel obliged to respond – I appreciate the link!)
Could this be related to the rising level of inequality in happiness levels in Asia? (See the graph on page 44 of the WHR2024). It can be assumed that the benefits of GDP growth are not evenly distributed, and increasing inequalities trigger frustration and a decrease in well-being in the majority of the population (since to a certain extent, the sense of welfare is relative).
This is how Our World in Data explains a similar phenomenon in the US:
"Income inequality in the US is exceptionally high and has been on the rise in the last four decades, with incomes for the median household growing much more slowly than incomes for the top 10%. As a result, trends in aggregate life satisfaction should not be seen as paradoxical: the income and standard of living of the typical US citizen have not grown much in the last couple of decades."
Yeah rising inequality is a good guess, thank you – the OWID chart also shows the US experiencing the same trajectory direction as India (declining average LS despite rising GDP per capita). I suppose one way to test this hypothesis is to see if China had inequality rise significantly as well in the 2011-23 period, since it had the expected LS-and-GDP-trending-up trajectory. Probably a weak test due to potential confounders...
As someone predisposed to like modeling, the key takeaway I got from Justin Sandefur's Asterisk essay PEPFAR and the Costs of Cost-Benefit Analysis was this corrective reminder – emphasis mine, focusing on what changed my mind:
Second, economists were stuck in an austerity mindset, in which global health funding priorities were zero-sum: $300 for a course of HIV drugs means fewer bed nets to fight malaria. But these trade-offs rarely materialized. The total budget envelope for global public health in the 2000s was not fixed. PEPFAR raised new money. That money was probably not fungible across policy alternatives. Instead, the Bush White House was able to sell a dramatic increase in America’s foreign aid budget by demonstrating that several billion dollars could, realistically, halt an epidemic that was killing more people than any other disease in the world.
...
A broader lesson here, perhaps, is about getting counterfactuals right. In comparative cost-effectiveness analysis, the counterfactual to AIDS treatment is the best possible alternative use of that money to save lives. In practice, the actual alternative might simply be the status quo, no PEPFAR, and a 0.1% reduction in the fiscal year 2004 federal budget. Economists are often pessimistic about the prospects of big additional spending, not out of any deep knowledge of the budgeting process, but because holding that variable fixed makes analyzing the problem more tractable. In reality, there are lots of free variables.
More detail:
Economists’ standard optimization framework is to start with a fixed budget and allocate money across competing alternatives. At a high-level, this is also how the global development community (specifically OECD donors) tends to operate: foreign aid commitments are made as a proportion of national income, entirely divorced from specific policy goals. PEPFAR started with the goal instead: Set it, persuade key players it can be done, and ask for the money to do it.
Bush didn’t think like an economist. He was apparently allergic to measuring foreign aid in terms of dollars spent. Instead, the White House would start with health targets and solve for a budget, not vice versa. ... Economists are trained to look for trade-offs. This is good intellectual discipline. Pursuing “Investment A” means forgoing “Investment B.” But in many real-world cases, it’s not at all obvious that the realistic alternative to big new spending proposals is similar levels of big new spending on some better program. The realistic counterfactual might be nothing at all.
In retrospect, it seems clear that economists were far too quick to accept the total foreign aid budget envelope as a fixed constraint. The size of that budget, as PEPFAR would demonstrate, was very much up for debate.
When Bush pitched $15 billion over five years in his State of the Union, he noted that $10 billion would be funded by money that had not yet been promised. And indeed, 2003 marked a clear breaking point in the history of American foreign aid. In real-dollar terms, aid spending had been essentially flat for half a century at around $20 billion a year. By the end of Bush’s presidency, between PEPFAR and massive contracts for Iraq reconstruction, that number hovered around $35 billion. And it has stayed there since.
Compared to normal development spending, $15 billion may have sounded like a lot, but exactly one sentence after announcing that number in his State of the Union address, Bush pivoted to the case for invading Iraq, a war that would eventually cost America something in the region of $3 trillion — not to mention thousands of American and hundreds of thousands of Iraqi lives. Money was not a real constraint.
Tangentially, I suspect this sort of attitude (Iraq invasion notwithstanding) would naturally arise out of a definite optimism mindset (that essay by Dan Wang is incidentally a great read; his follow-up is more comprehensive and clearly argued, but I prefer the original for inspiration). It seems to me that Justin has this mindset as well, cf. his analogy to climate change in comparing economists' carbon taxes and cap-and-trade schemes vs progressive activists pushing for green tech investment to bend the cost curve. He concludes:
You don’t have to give up on cost-effectiveness or utilitarianism altogether to recognize that these frameworks led economists astray on PEPFAR — and probably some other topics too. Economists got PEPFAR wrong analytically, not emotionally, and continue to make the same analytical mistakes in numerous domains. Contrary to the tenets of the simple, static, comparative cost-effectiveness analysis, cost curves can sometimes be bent, some interventions scale more easily than others, and real-world evidence of feasibility and efficacy can sometimes render budget constraints extremely malleable. Over 20 years later, with $100 billion dollars appropriated under both Democratic and Republican administrations, and millions of lives saved, it’s hard to argue a different foreign aid program would’ve garnered more support, scaled so effectively, and done more good. It’s not that trade-offs don’t exist. We just got the counterfactual wrong.
Aside from his climate change example above, I'd be curious to know what other domains economists are making analytical mistakes in w.r.t. cost-benefit modeling, since I'm probably predisposed to making the same kinds of mistakes.
This WHO press release was a good reminder of the power of immunization – a new study forthcoming in The Lancet reports that (liberally quoting / paraphrasing the release)
global immunization efforts have saved an estimated 154 million lives over the past 50 years, 146 million of them children under 5 and 101 million of them infants
for each life saved through immunization, an average of 66 years of full health were gained – with a total of 10.2 billion full health years gained over the five decades
measles vaccination accounted for 60% of the lives saved due to immunization, and will likely remain the top contributor in the future
vaccination against 14 diseases has directly contributed to reducing infant deaths by 40% globally, and by more than 50% in the African Region
the 14 diseases: diphtheria, Haemophilus influenzae type B, hepatitis B, Japanese encephalitis, measles, meningitis A, pertussis, invasive pneumococcal disease, polio, rotavirus, rubella, tetanus, tuberculosis, and yellow fever
fewer than 5% of infants globally had access to routine immunization when the Expanded Programme on Immunization (EPI) was launched 50 years ago in 1974 by the World Health Assembly; today 84% of infants are protected with 3 doses of the vaccine against diphtheria, tetanus and pertussis (DTP) – the global marker for immunization coverage
there's still a lot to be done – for instance, 67 million children missed out on one or more vaccines during the pandemic years
(Attention conservation notice: rambling in public)
A striking throwaway remark, given its context:
There is remarkably little evidence that evidence-based medicine leads to better health outcomes for patients, though this is absence of (good) evidence rather than (good) evidence of absence of effect.
It's striking given that this comes from this book on Thailand’s Health Intervention and Technology Assessment Program (HITAP) (ch 1 pg 22), albeit perhaps understandable given the authors' stance that evidence is necessary but not sufficient to determine the best course of action (to treat a patient, to design a social insurance scheme, etc), which seems completely unobjectionable.
That said, I did wonder about the first half of the quoted throwaway remark, so I asked Elicit; its top-4 paper summary is
Evidence-based medicine (EBM) has been shown to improve patient outcomes and healthcare efficiency. A study in a Spanish hospital found that an EBP unit had lower mortality rates (6.27% vs 7.75%) and shorter lengths of stay (6.01 vs 8.46 days) compared to standard practice (Emparanza et al., 2015). EBM can reduce clinical uncertainty, leading to better patient outcomes, improved population health, and reduced costs (Molony & Samuels, 2012). The implementation of EBM is expected to enhance the quality of care as part of healthcare reform initiatives (Hughes, 2011). Additionally, EBM has paralleled the growth of patient empowerment, supporting informed decision-making by integrating the best available research with individual patient values and concerns (Hendler, 2004). While challenges remain in translating EBM principles for public consumption, its adoption has the potential to significantly improve healthcare delivery and patient outcomes.
although the summary didn't include these papers it listed in the top 10
Bahtsevani et al 2004's systematic review (weak evidence of limited findings)
Every-Palmer & Howick 2014's paper with these dramatic sentences in their abstract:
"In this paper we suggest that EBM's potential for improving patients' health care has been thwarted by bias in the choice of hypotheses tested, manipulation of study design and selective publication."
"Evidence for these flaws is clearest in industry-funded studies. We argue EBM's indiscriminate acceptance of industry-generated 'evidence' is akin to letting politicians count their own votes. Given that most intervention studies are industry funded, this is a serious problem for the overall evidence base. Clinical decisions based on such evidence are likely to be misinformed, with patients given less effective, harmful or more expensive treatments."
"More investment in independent research is urgently required. Independent bodies, informed democratically, need to set research priorities. We also propose that evidence rating schemes are formally modified so research with conflict of interest bias is explicitly downgraded in value."
Shaw et al 2007's dramatically-titled Why Evidence Based Medicine May Be Bad for You and Your Patients ("This review argues that the basis of EBM is so deeply flawed that in many cases it cannot usefully inform clinical practice, reflected in fact by the current majority outcome of most trials as “no-blood,” or no result")
With the proviso that I'm a layperson w.r.t. medicine and healthcare, and that I didn't ask Elicit further questions or really dig further into this at all — I find myself mostly unmoved by these papers & reviews, while the younger me of (say) a decade ago would've epistemically panicked. Partly it's that they aren't really contra "using evidence to inform medicine" per se: to oversimplify a bit, Bahtsevani et al recommend more evidence generation, Every-Palmer & Howick recommend less industry-biased evidence generation, and Shaw et al argue that other less legible-than-RCT types of evidence should occupy more mindshare than they did back in '07 (there's a loose parallel here to the more recent growth vs randomista debate in dev econ). Partly it's that I suspect there's some talking past each other, which only becomes clear when one digs into the nuts-and-bolts. Partly it's that I think the general underlying ethos of "using evidence to inform medicine" is a lot more robust than any particular instantiation of it (e.g. using only empirical data from systematic reviews of RCTs), sort of like how cluster thinking > sequence thinking for decision-making, or like how foxes have weak views strongly held (side note: in that essay's framing I used to be a hedgehog, hopefully I'm now more fox than degenerate cactus). Partly it's that I've "seen this before" with other topics, cf. Scott Alexander's many deep dives. Maybe I'm just getting old...
I haven't looked in detail, but my quick comment would be that these studies seem to basically be comparing extremely careful following of evidence-based medicine vs. "normal medical practise", which is like 90%+ based on evidence anyway. Standard medical training and registered medical practise in most of the world closely follows the evidence - it would be very difficult (maybe impossible) to practise "outside" of the evidence. So not finding a huge difference between these 2 ways of practising isn't so surprising.
Epistemic status: public attempt at self-deconfusion & not just stopping at knee-jerk skepticism
The recently published Cost-effectiveness of interventions for HIV/AIDS, malaria, syphilis, and tuberculosis in 128 countries: a meta-regression analysis (so recent it's listed as being published next month), in my understanding, aims to fill country-specific gaps in CEAs for all interventions in all countries for HIV/AIDS, malaria, syphilis, and tuberculosis, to help national decision-makers allocate resources effectively – to a first approximation I think of it as "like the DCP3 but at country granularity and for Global Fund-focused programs". They do this by predicting ICERs, IQRs, and 95% UIs in US$/DALY using the meta-regression parameters obtained from analysing ICERs published for these interventions (more here).
AFAICT their methodology and execution seem superb, so I was keen to see their results:
Antenatal syphilis screening ranks as the lowest median ICER in 81 (63%) of 128 countries, with median ICERs ranging from $3 (IQR 2–4) per DALY averted in Equatorial Guinea to $3473 (2244–5222) in Ukraine.
At risk of being overly skeptical: $3 per DALY averted is >30x better than Open Phil's 1,000x bar of $100 per DALY, which is roughly GiveWell top charity level and which OP has said is hard to beat, especially for a direct intervention like antenatal syphilis screening. It makes me wonder how much credence to put in the study's findings for actual resource allocation decisions (esp. Figure 4, which ranks top interventions at country granularity). Also:
Specifically re: antenatal syphilis screening, CE/AIM's report on screening + treating antenatal syphilis estimates $81 per DALY; I'm hard-pressed to believe that removing treatment improves cost-eff >1 OOM
I'm reminded of the time GW found 5 separate spreadsheet errors in a DCP2 estimate of soil-transmitted-helminth (STH) treatment that together misleadingly 'improved' its cost-effectiveness ~100-fold from $326.43 per DALY (correct output) to just $3.41 (wrong, and coincidentally in the ballpark of the estimate above that triggered my skepticism)
So how should I think about and use their findings given what seems like reasonable grounds for skepticism, if I'm primarily interested in helping decision-makers help people better? Scattered thoughts to defend the study / push back on my nitpicking above:
even if imperfect – and I'm not confident in my skepticism above – they clearly improve substantially upon the previous state of affairs (CEA gaps everywhere at country-disease-intervention level granularity; expert opinion not lending itself to country-specific predictions; case-by-case methods often being unsuccessful)
their recommendations seem reasonably hedged, not naively maximalist: they include 95% uncertainty intervals; they clearly say "cost-effectiveness... should not be the only criterion... [consider also] enhancing equity and providing financial risk protection"
even a naively maximalist recommendation ("first fund lowest-ICER intervention, then 2nd-lowest, ... until funds run out") doesn't seem unreasonable in this context – essentially countries would end up funding more antenatal syphilis screening, intermittent preventive treatment of malaria in pregnant women and infants, and chemotherapy for drug-susceptible TB (just from eyeballing Figure 4); see the sketch of this allocation rule after this list
I interpret what they're trying to do as not so much "here are the ICER league tables, use them", but shifting decision-makers' approach to resource allocation from needing a single threshold for all healthcare funding decisions to (quoting them) "ICERs ranked in country-specific league tables", and in the long run this perspective shift seems useful to "bake into" decision-making processes, even if the specific figures in this specific study aren't necessarily the most accurate and shouldn't be taken at face value
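For concreteness, here's a minimal sketch of that naively maximalist league-table rule. The intervention names mirror the ones above, but the ICERs and program costs are made-up placeholders for illustration, not figures from the study:

```python
# League-table allocation: fund interventions in ascending ICER order
# until the budget runs out. All numbers below are hypothetical.
interventions = [
    # (name, ICER in $ per DALY averted, cost to fully fund in $)
    ("antenatal syphilis screening", 3, 2_000_000),
    ("IPT of malaria in pregnancy/infancy", 10, 5_000_000),
    ("chemotherapy for drug-susceptible TB", 25, 20_000_000),
]

def allocate(budget, interventions):
    funded = []
    for name, icer, cost in sorted(interventions, key=lambda x: x[1]):
        spend = min(cost, budget)
        if spend <= 0:
            break
        funded.append((name, spend, spend / icer))  # spend/ICER ~ DALYs averted
        budget -= spend
    return funded

for name, spend, dalys in allocate(10_000_000, interventions):
    print(f"{name}: ${spend:,} -> ~{dalys:,.0f} DALYs averted")
```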
That said, I do wonder if the authors could have done a bit better, like
cautioning against naively taking the best cost-eff estimates at face value, instead of suggesting "Funds could be first spent on the intervention that has the lowest ICER. Following that, other interventions could be funded in order of their ICER rankings, as long as there are available funds"
spot-checking some of (not all) the top cost-eff ICERs that went into their meta-regression analysis to get a sense of their credibility, especially those which feed into their main recommendations, like GW did above with the DCP2 estimate for STH treatment
extracting qualitative proxies for decision-maker guidance from an analysis of the main drivers behind the substantial ranking differences in intervention ICERs across economic and epidemiological contexts (eg "we should expect antenatal syphilis screening to be substantially less cost-effective in our context due to factors XYZ, let's look at other interventions instead" – what would a short useful list of XYZ look like?), instead of just saying "we found the rankings differ substantially"
The positive spin is that someone got funded to do this kind of big-picture analysis and got it published in The Lancet.
There were 1,792 potential country-intervention pairs (although it is not immediately clear if they did all 1,792 pairs). So I don't think most reasonable readers would view these findings as substitutes for a more in-depth, country-specific analysis on the potentially promising intervention. They did publish at least some data for each intervention, although maybe it isn't enough to poke at each of the country-intervention pairs.
One of the more surprising things I learned from Karen Levy's 80K podcast interview on misaligned incentives in global development was how her experience directly contradicted a stereotype I had about for-profits vs nonprofits:
Karen Levy: When I did Y Combinator, I expected it to be a really competitive environment: here you are in the private sector and it’s all about competition. And I was blown away by the level of collaboration that existed in that community — and frankly, in comparison to the nonprofit world, which can be competitive. People compete for funding, and so very often we’re fighting over slices of the same pie. Whereas the Y Combinator model is like, “We’re making the pie bigger. It’s getting bigger for everybody.”
My assumption had been that the opposite was true.
The following table is from Scott Alexander's post, which you should check out for the sources and (many, many) caveats.
This table can’t tell you what your ethical duties are. I'm concerned it will make some people feel like whatever they do is just a drop in the bucket - all you have to do is spend 11,000 hours without air conditioning, and you'll have saved the same amount of carbon an F-35 burns on one airstrike! But I think the most important thing it could convince you of is that if you were previously planning on letting yourself be miserable to save carbon, you should buy carbon offsets instead. Instead of boiling yourself alive all summer, spend between $0.04 and $2.50 an hour to offset your air conditioning use.
I'm curious what people who're more familiar with infinite ethics think of Manheim & Sandberg's What is the upper limit of value?, in particular where they discuss infinite ethics (emphasis mine):
Bostrom’s discussion of infinite ethics is premised on the moral relevance of physically inaccessible value. That is, it assumes that aggregative utilitarianism is over the full universe, rather than the accessible universe. This requires certain assumptions about the universe, as well as being premised on a variant of the incomparability argument that we dismissed above, but has an additional response which is possible, presaged earlier. Namely, we can argue that this does not pose a problem for ethical decision-making even using aggregative ethics, because the consequences of any ethical decision can have only a finite (difference in) value. This is because the value of a moral decision relates only to the impact of that decision. Anything outside of the influenced universe is not affected, and the arguments above show that the difference any decision makes is finite.
I first read their paper a few years ago and found their arguments for the finiteness of value persuasive, as well as their collectively-exhaustive responses in section 4 to possible objections. So ever since then I've been admittedly confused by claims that the problems of infinite ethics still warrant concern w.r.t. ethical decision-making (e.g. I don't really buy Joe Carlsmith's arguments for acknowledging that infinities matter in this context, same for Toby Ord's discussion in a recent 80K podcast). What am I missing?
Rob Wiblin: OK, so the argument is something like valuing is a process that requires information to be encoded, and information to be processed — and there are just maximum limits on how much information can be encoded and processed given a particular amount of mass and given a finite amount of mass and energy. So that ultimately is going to set the limit on how much valuing can be done physically in our universe. No matter what things we create, no matter what minds we generate, there’s going to be some finite limit there. That’s basically it?
Anders Sandberg: That’s it. In some sense, this is kind of trivial. I think some readers would no doubt feel almost cheated, because they wanted to know that metaphysical limit for value, and we can’t say anything about that. But it seems very likely that if value has to do with some entity that is doing the valuing, then there is always going to be this limit — especially since the universe is inconveniently organised in such a way that we can’t get hold of infinite computational power, as far as we know.
Pay close attention to ideas that repel other people for non-impact-related reasons, but not you. If you can get obsessed about something important that most people find horribly boring, you're uniquely well placed to make a big impact.
Unfortunately it's bereft of concrete examples. The closest to a shortlist he shares is in this comment:
Horrible career moves e.g. investigating the corrupt practices of powerful EAs / Orgs
Boring to most people e.g. compiling lists and data
Low status outside EA e.g. welfare of animals nobody cares about (e.g. shrimp)
Low status within EA e.g. global mental health
Living in relatively low quality of living areas e.g. fieldwork in many African countries
(I disagree with some of these; e.g. the first bullet seems contradicted by the propensity for forum drama on adjacent topics, and as someone who likes compiling lists and data I don't actually see much low-hanging fruit for me to contribute here due to the work of e.g. Hamish)
I'd be keen to learn other examples. He does give this advice to brainstorm examples:
What work do you wish someone else would do?
although in my case it's not useful because I either just end up doing it (or trying, failing, and learning why), or discover that it's already been done better than I could (e.g. Rethink Priorities' new CCM).
That said, I still think the original takeaway is a useful reminder.
[Question] How should we think about the decision relevance of models estimating p(doom)?
(Epistemic status: confused & dissatisfied by what I've seen published, but haven't spent more than a few hours looking. Question motivated by Open Philanthropy's AI Worldviews Contest; this comment thread asking how OP updated reminded me of my dissatisfaction. I've asked this before on LW but got no response; curious to retry, hence repost)
To illustrate what I mean, switching from p(doom) to timelines:
The recent post AGI Timelines in Governance: Different Strategies for Different Timeframes was useful to me in pushing back against Miles Brundage's argument that "timeline discourse might be overrated", by showing how choice of actions (in particular in the AI governance context) really does depend on whether we think that AGI will be developed in ~5-10 years or after that.
A separate takeaway of mine is that decision-relevant estimation "granularity" need not be that fine-grained, and in fact is not relevant beyond simply "before or after ~2030" (again in the AI governance context).
Finally, that post was useful to me in simply concretely specifying which actions are influenced by timelines estimates.
Question: Is there something like this for p(doom) estimates? More specifically, following the above points as pushback against the strawman(?) that "p(doom) discourse, including rigorous modeling of it, is overrated":
What concrete high-level actions do most alignment researchers agree are influenced by p(doom) estimates, and would benefit from more rigorous modeling (vs just best guesses, even by top researchers e.g. Paul Christiano's views)?
What's the right level of granularity for estimating p(doom) from a decision-relevant perspective? Is it just a single bit ("below or above some threshold X%") like estimating timelines for AI governance strategy, or OOM (e.g. 0.1% vs 1% vs 10% vs >50%), or something else?
I suppose the easy answer is "the granularity depends on who's deciding, what decisions need making, in what contexts", but I'm in the dark as to concrete examples of those parameters (granularity i.e. thresholds, contexts, key actors, decisions)
e.g. reading Joe Carlsmith's personal update from ~5% to >10% I'm unsure if this changes his recommendations at all, or even his conclusion – he writes that "my main point here, though, isn't the specific numbers... [but rather that] there is a disturbingly substantive risk that we (or our children) live to see humanity as a whole permanently and involuntarily disempowered by AI systems we’ve lost control over", which would've been true for both 5% and 10%
Or is this whole line of questioning simply misguided or irrelevant?
Some writings I've seen gesturing in this direction:
Carl Shulman disagrees, but his comment (while answering my 1st bullet point) isn't clear in the way the different AI gov strategies for different timelines post is, so I'm still left in the dark – to (simplistically) illustrate with a randomly-chosen example from his reply and making up numbers, I'm looking for statements like "p(doom) < 2% implies we should race for AGI with less concern about catastrophic unintended AI action, p(doom) > 10% implies we definitely shouldn't, and p(doom) between 2-10% implies reserving this option for last-ditch attempts", which he doesn't provide
Froolow's attempted dissolution of AI risk (which takes Joe Carlsmith's model and adds parameter uncertainty – inspired by Sandberg et al's Dissolving the Fermi paradox – to argue that low-risk worlds are more likely than non-systematised intuition alone would suggest)
Froolow's modeling is useful to me for making concrete recommendations for funders, e.g. (1) "prepare at least 2 strategies for the possibility that we live in one of a high-risk or low-risk world instead of preparing for a middling-ish risk", (2) "devote significantly more resources to identifying whether we live in a high-risk or low-risk world", (3) "reallocate resources away from macro-level questions like 'What is the overall risk of AI catastrophe?' towards AI risk microdynamics like 'What is the probability that humanity could stop an AI with access to nontrivial resources from taking over the world?'", (4) "When funding outreach / explanations of AI Risk, it seems likely it would be more convincing to focus on why this step would be hard than to focus on e.g. the probability that AI will be invented this century (which mostly Non-Experts don’t disagree with)". I haven't really seen any other p(doom) model do this, which I find confusing
I'm encouraged by the long-term vision of the MTAIR project "to convert our hypothesis map into a quantitative model that can be used to calculate decision-relevant probability estimates", so I suppose another easy answer to my question is just "wait for MTAIR", but I'm wondering if there's a more useful answer to the "current SOTA" than this. (That project's introduction post illustrates a notional version of how MTAIR can help with decision analysis.)
This question was mainly motivated by my attempt to figure out what to make of people's widely-varying p(doom) estimates, e.g. in the appendix section of Apart Research's website, beyond simply "there is no consensus on p(doom)". I suppose one can argue that rigorous p(doom) modeling helps reduce disagreement on intuition-driven estimates by clarifying cruxes or deconfusing concepts, thereby improving confidence and coordination on what to do, but in practice I'm unsure if this is the case (reading e.g. the public discussion around the p(doom) modeling by Carlsmith, Froolow, etc), so I'm not sure I buy this argument, hence my asking for concrete examples.
I just learned about Tom Frieden via Vadim Albinsky's writeup Resolve to Save Lives Trans Fat Program for Founders Pledge. His impact in sheer lives saved is astounding, and I'm embarrassed I didn't know about him before:
The CEO of RTSL, Tom Frieden, likely prevented tens of millions of deaths by creating an international tobacco control initiative in a prior role that may have been much more cost effective than most of our top recommended charities. ...
We believe that by leveraging his influence with governments, and the relatively low cost of advocating for regulations to improve health, Tom Frieden has the potential to again save a vast number of lives at a low cost.
How many more? Albinsky estimates:
RTSL is aiming to save 94 million lives over 25 years by advocating for countries to implement policies to reduce non-communicable diseases. We believe the industrially-produced trans fat elimination program is the most cost-effective of their initiatives. ... Even after very conservative discounts to RTSL’s impact projections we estimate this program to be more cost effective than most of our top global health and development recommendations.
Tangentially, if a "Borlaug" is a billion lives saved, then Frieden's impact is probably on the scale of ~100 milliBorlaugs (to nearest OOM). Bill and Melinda likely have had similar impact. This makes me wonder who else I don't know about who's done ~100 milliBorlaugs of good.
(It's arguably unfair to wholly attribute all those lives saved to Frieden, and I am honestly unsure what credit attribution makes most sense, but applying the same logic to Borlaug you can no longer really say he saved a billion lives.)
The 1,000-ton rule is Richard Parncutt's suggestion for reframing the political message of the severity of global warming in particularly vivid human rights terms; it says that someone in the next century or two is prematurely killed every time humanity burns 1,000 tons of carbon.
I came across this paper while (in the spirit of Nuno's suggestion) trying to figure out the 'moral cost of climate change', so to speak, driven by my annoyance that e.g. climate charity BOTECs report $ per ton of CO2-eq averted, in contrast to (say) the $ per death averted bottom line of GHW charities, since I don't intrinsically care about averting CO2-equivalent emissions the way I care about averting deaths. (To be clear, I understand why the BOTECs do so and would do the same for work; this is for my own moral clarity.)
Parncutt's derivation is simple: burning a trillion tons of carbon will cause ~2 °C of anthropogenic global warming, which will in turn cause 1-10 million premature deaths a year "for a period of several centuries".
Modelling the rise in global mean surface temperature (GMST) as a function of carbon burned is already very hard; Parncutt doesn't try to model premature deaths as a function of GMST but just makes a semi-quantitative order-of-magnitude estimation anchored extensively at the lower and upper ends to various catastrophic outcomes discussed in the literature on climate change, and assumes a lognormal distribution around a billion future deaths with a 10x range for worst-vs-best case scenario, which over time looks 'very approximately' like this:
The lower line represents deaths due to poverty without AGW. As the negative effect of AGW overtakes the positive effect of development, the death rate will increase, as shown by the upper line. In a more accurate model, the upper line might be concave upward on the left (exponential increase) and concave downward on the right (approaching a peak).
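The headline arithmetic of the rule, restated (my own sketch of Parncutt's numbers, not code from the paper):

```python
# Parncutt anchors on ~a billion future premature deaths from burning
# ~a trillion tons of carbon (1-10 million deaths/year over several
# centuries, per the paper's order-of-magnitude estimate).
TONS_CARBON_BURNED = 1e12
FUTURE_PREMATURE_DEATHS = 1e9

tons_per_death = TONS_CARBON_BURNED / FUTURE_PREMATURE_DEATHS
print(tons_per_death)  # 1000.0 -> one premature death per ~1,000 tons of carbon
```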
Based on the 1,000-ton rule, Pearce & Parncutt suggest the 'millilife' as "an accessible unit of measure for carbon footprints that is easy to understand and may be used to set energy policy to help accelerate carbon emissions reductions". A millilife is a measure of intrinsic value defined to be 1/1000th of a human life; the 1,000-ton rule says that burning a ton of fossil carbon destroys a millilife. This lets Pearce & Parncutt make statements like these, at an individual level (all emphasis mine):
For example in Canada, which has some of the highest yearly carbon emissions per capita in the world at around 19 tons of CO2 or 5 tons of carbon per person, roughly 5 millilives are sacrificed by an average person each year. As the average Canadian lives to be about 80, he/she sacrifices about 400 millilives (0.4 human lives) in the course of his/her lifetime, in exchange for a carbon-intensive lifestyle
and
... an average future AGW-victim in a developing country will lose half of a lifetime or 30–40 life-years, as most victims will be either very young or very old. If the average climate victim loses 35 life-years (or 13,000 life-days), a millilife corresponds to 13 days.
Stated in another way: if a person is responsible for burning a ton of fossil carbon by flying to another continent and back, they effectively steal 13 days from the life of a future poor person living in the developing world. If the traveler takes 1000 such trips, they are responsible for the death of a future person.
and for "large-scale energy decisions":
... the Adani Carmichael coalmine in Queensland, Australia, is currently under construction and producing coal since 2021. Despite massive protests over several years, it will be the biggest coalmine ever. Its reserves are up to 4 billion tons of coal, or 3 billion tons of carbon. If all of that was burned, the 1000-tonne rule says it would cause the premature deaths of 3 million future people. Given that the 1000-tonne rule is only an order-of-magnitude estimate, the number of caused deaths will lie between one million and 10 million. ... Many of those who will die are already living as children in the Global South; burning Carmichael coal will cause their future deaths with a high probability. Should energy policy allow that to occur?
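Putting the quoted conversions in one place (a sketch that just rearranges Pearce & Parncutt's own figures from the passages above):

```python
# 1 millilife = 1/1000th of a life; by the 1,000-ton rule, burning
# 1 ton of fossil carbon destroys 1 millilife.
canada_tons_carbon_per_year = 5           # ~19 t CO2 = ~5 t carbon per person-year
print(canada_tons_carbon_per_year * 80)   # 400 millilives over an 80-year life

life_years_lost_per_victim = 35           # average future AGW victim, per the quote
print(life_years_lost_per_victim * 365 / 1000)  # ~12.8 days per millilife (~13 in the paper)

carmichael_tons_carbon = 3e9              # ~4 Gt of coal = ~3 Gt of carbon
print(carmichael_tons_carbon / 1000)      # 3,000,000 premature deaths if all burned
```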
Pearce & Parncutt then use the 1,000-ton rule and millilife to make various suggestions. Here's one:
Under what circumstances might a government ban or outlaw an entire corporation or industry, considered a legal entity or person—for example, the entire global coal industry? ...
Ideally, a company should not cause any human deaths at all. If it does, those deaths should be justifiable in terms of improvements to the quality of life of others. For example, a company that builds a bridge might reasonably risk a future collapse that would kill 100 people with a probability of 1%. In that case, the company accepts that on average one future person will be killed as a result of the construction of the bridge. It may be reasonable to claim that the improved quality of life for thousands or millions of people who cross the bridge justifies the human cost.
Fossil fuel industries are causing far more future deaths than that, raising the question of the point at which the law should intervene. As a first step to solving this problem, it has been proposed that a rather high threshold (generous toward the corporations) is appropriate. A company does not have the right to exist if its net impact on human life (e.g., a company/industry might make products that save lives, like medicine, but kill a small fraction of users) is such that it kills more people than it employs. This requirement for a company’s existence is thus:
Number of future premature deaths/year < Number of full-time employees (1)
This criterion can be applied to an entire industry. If the industry kills more people than it employs, then primary rights (life) are being sacrificed for secondary rights (jobs or profits) and the net benefit to humankind is negative. If an industry is not able to satisfy Equation (1), it should be closed down by the government.
... the coal industry kills people by polluting the air that they breathe. ... In the U.S., about 52,000 human lives are sacrificed per year to provide coal-fired electricity. ... In the U.S., coal employed 51,795 people in 2016. Since the number of people killed is greater than the number employed, the U.S. coal industry does not satisfy Equation (1) and should be closed down. This conservative conclusion does not include future deaths caused by climate change due to burning coal.
One more energy policy suggestion (there's many more in the paper):
Applying asset forfeiture laws (also referred to as asset seizure) to manslaughter caused by AGW. These laws enable the confiscation of assets by the U.S. government as a type of criminal-justice financial obligation that applies to the proceeds of crime. Essentially, if criminals profit from the results of unlawful activity, the profits (assets) are confiscated by the authorities.
This is not only a law in the U.S. but is in place throughout the world. For example, in Canada, Part XII.2 of the Criminal Code, provides a national forfeiture régime for property arising from the commission of indictable offenses. Similarly, ‘Son of Sam laws’ could also apply to carbon emissions. In the U.S., Son of Sam laws refer to laws designed to keep criminals from profiting from the notoriety of their crimes and often authorize the state to seize funds earned by the criminals to be used to compensate the criminal’s victims.
If that logic of asset forfeiture is applied to fossil fuel company investors who profit from carbon-emission-related manslaughter, taxes could be set on fossil fuel profits, dividends, and capital gains at 100% and the resultant tax revenue could be used for energy efficiency and renewable energy projects or to help shield the poor from the most severe impacts of AGW. ...
Such AGW-focused asset forfeiture laws would also apply to fossil fuel company executive compensation packages. Energy policy research has shown that it is possible to align energy executive compensation with careful calibration of incentive equations such that the harmful effects of emissions can be prevented through incentive pay. Executives who were compensated without these safeguards in place would have their incomes seized the same as other criminals benefiting materially from manslaughter.
I have no (defensible) opinion on these suggestions; curious to know what anyone thinks.
Some notes from trying out Rethink Priorities' new cross-cause cost-effectiveness model (CCM) from their post, for personal reference. Each entry below gives: cost-effectiveness in DALYs per $1k (90% CI) / % of simulation results with positive, negative, or no effect / cost-effectiveness under alternative risk-aversion profiles and weighting schemes, in weighted DALYs per $1k (min to max):
Portfolio of biorisk projects ($15-30M budget, 60% chance of no effect, 70% of effects positive): 132 (middle 99.9% of expected utility (EU) is 0) / >99.9% no effect / 0 to 132 across risk weightings
Nanotech safety megaproject ($10-30M budget, 90% chance of no effect, 70% of effects positive): 73 (middle 99.9% of EU is 0) / >99.9% no effect / -10 to 73 across risk weightings
AI misalignment megaproject ($8-28B budget, 97.3% chance of no effect, 70% of effects positive): 154 (middle 99.9% of EU is 27, middle 99% is 0) / >99.6% no effect / -56 to 154 across risk weightings
Some things that jumped out at me (caveating that I don't work in any of these areas):
I'm a little surprised that only chicken campaigns are modeled as clearly higher-EV (OOM-wise) than the global health and development (GHD) interventions considered good by GiveWell's and Open Philanthropy's lights, while interventions for other nonhuman animals fall short
I'm also surprised that chickens > all other nonhuman animals on both EV and p(+ve simulation outcome). There's some discussion indicating that cage-free work is much lower EV now than previously, although I'm not sure it changes the takeaway (and in any case funding prioritization shouldn't be purely EV-based)
I'm surprised yet again that a >$10B AI misalignment megaproject is modeled as having no effect in >99.6% of simulations. I probably hadn't internalized the 'hits' in 'hits-based giving' as well as I should have, since my earlier gut intuition (based on no data whatsoever) was that a near-Manhattan-scale megaproject would surely have some effect in >10% of possible worlds; the sketch after this list shows how a mostly-null outcome distribution can still carry a high EV
I didn't expect the model to say chickens > misaligned AI, unsafe nanotech, and biorisk from a risk-neutral EV perspective. That said, the x-risk inputs are in some sense just placeholders, so I don't put much weight on this
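As a check on my intuition in the third bullet above, here's a minimal Monte Carlo sketch in Python. Only the 97.3% / 70% split comes from the CCM row above; the conditional effect sizes are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000  # simulated worlds

# Only the 97.3% / 70% split comes from the CCM row; the conditional
# effect sizes below are made-up numbers purely for illustration.
p_effect = 0.027                 # ~2.7% of worlds see any effect at all
p_positive_given_effect = 0.70   # of those, 70% are positive
positive_size = 8_000            # hypothetical DALYs averted per $1k on a hit
negative_size = -2_000           # hypothetical DALYs per $1k on a backfire

has_effect = rng.random(n) < p_effect
is_positive = rng.random(n) < p_positive_given_effect
outcomes = np.where(has_effect,
                    np.where(is_positive, positive_size, negative_size),
                    0.0)

print(f"share of worlds with no effect: {(~has_effect).mean():.1%}")  # ~97.3%
print(f"risk-neutral EV: {outcomes.mean():.0f} DALYs per $1k")        # ~135
```

With these made-up numbers the risk-neutral EV works out to roughly 0.027 × (0.7 × 8,000 + 0.3 × (−2,000)) ≈ 135 DALYs per $1k, even though ~97% of worlds see nothing: the rare 'hits' do all the work.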
In any case, I'd be curious to see how the CCM is taken into consideration by funders and other stakeholders going forward.
I thought I had mostly internalized the heavy-tailed worldview from a life-guiding perspective, but reading Ben Kuhn's searching for outliers made me realize I hadn't. So here are some summarized reminders for posterity:
Key idea: lots of important things in life are generated by multiplicative processes, resulting in heavy-tailed distributions – jobs, employees / colleagues, ideas, romantic relationships, success in business / investing / philanthropy, how useful it is to try new activities
Decision relevance to living better, i.e. what Ben thinks I should do differently:
Getting lots of samples improves outcomes a lot, so draw as many samples as possible (see the sketch after this list)
Trust the process and push through the demotivation of super-high failure rates (instead of taking them as evidence that the process is bad)
But don't just trust any process; it must have 2 parts: (1) a good way to tell if a candidate is an outlier ("maybe amazing" below), and (2) a good way to draw samples
Optimize less, draw samples more (for a certain type of person)
Filter for "ruling in" candidates, not "ruling out" (e.g. in dating)
Cultivate an abundance mindset to help reject more candidates early on (to find 99.9th percentile not just 90th)
Think ahead about what outliers look like (e.g. by asking others about their experience), to avoid accidentally rejecting 99.9th-percentile candidates out of miscalibration
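A minimal sketch (in Python, all parameters invented) of why sample count matters so much under heavy tails: if quality is the product of many independent multipliers, the distribution is approximately lognormal, and the best draw keeps improving as you take more samples:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model of a multiplicative process (all parameters invented):
# a candidate's "quality" is the product of many independent multipliers,
# which gives an approximately lognormal, heavy-tailed distribution.
def sample_quality(n_candidates: int, n_factors: int = 10) -> np.ndarray:
    factors = rng.lognormal(mean=0.0, sigma=0.5, size=(n_candidates, n_factors))
    return factors.prod(axis=1)

# Under heavy tails, best-of-n grows substantially with n:
for n in (10, 100, 1_000, 10_000):
    print(f"best of {n:>6} samples: {sample_quality(n).max():8.1f}")
```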
My reservations with Ben's advice, despite thinking it's mostly sound and idea-generating:
"Stick with the process through super-high failure rates instead of taking them as evidence that the process is bad" feels uncomfortably close to protecting a belief from falsification
Filtering for "maybe amazing", not "probably good" makes me uncomfortable because I'm not risk-neutral (e.g. in RP's CCM I'm probably closest to "difference-making risk-weighted expected utility = low to moderate risk aversion", which for instance assesses RP's default AI risk misalignment megaproject as resulting in, not averting, 300+ DALYs per $1k)
Unlike Ben, I'm a relatively young person in a middle-income country, and the abundance mindset feels privileged (i.e. not as much runway to try and fail)
So maybe a precursor / enabling activity for the "sample more" approach above is "more runway-building": money, leisure time, free attention & health, proximity to opportunities(?)
From Richard Y Chappell's post Theory-Driven Applied Ethics, answering "what is there for the applied ethicist to do, that could be philosophically interesting?", emphasis mine:
A better option may be to appeal to mid-level principles likely to be shared by a wide range of moral theories. Indeed, I think much of the best work in applied ethics can be understood along these lines. The mid-level principles may be supported by vivid thought experiments (e.g. Thomson’s violinist, or Singer’s pond), but these hypothetical scenarios are taken to be practically illuminating precisely because they support mid-level principles (supporting bodily autonomy, or duties of beneficence) that we can then apply generally, including to real-life cases.
The feasibility of this principled approach to applied ethics creates an opening for a valuable (non-trivial) form of theory-driven applied ethics. Indeed, I think Singer’s famous argument is a perfect example of this. For while Singer in no way assumes utilitarianism in his famous argument for duties of beneficence, I don’t think it’s a coincidence that the originator of this argument was a utilitarian. Different moral theories shape our moral perspectives in ways that make different factors more or less salient to us. (Beneficence is much more central to utilitarianism, even if other theories ought to be on board with it too.)
So one fruitful way to do theory-driven applied ethics is to think about what important moral insights tend to be overlooked by conventional morality. That was basically my approach to pandemic ethics: to those who think along broadly utilitarian lines, it’s predictable that people are going to be way too reluctant to approve superficially “risky” actions (like variolation or challenge trials) even when inaction would be riskier. And when these interventions are entirely voluntary—and the alternative of exposure to greater status quo risks is not—you can construct powerful theory-neutral arguments in their favour. These arguments don’t need to assume utilitarianism. Still, it’s not a coincidence that a utilitarian would notice the problem and come up with such arguments.
Another form of theory-driven applied ethics is to just do normative ethics directed at confused applied ethicists. For example, it’s commonplace for people to object that medical resource allocation that seeks to maximize quality-adjusted life years (QALYs) is “objectionably discriminatory” against the elderly and disabled, as a matter of principle. But, as I argue in my paper, Against 'Saving Lives': Equal Concern and Differential Impact, this objection is deeply confused. There is nothing “objectionably discriminatory” about preferring to bestow 50 extra life-years to one person over a mere 5 life-years to another. The former is a vastly greater benefit, and if we are to count everyone equally, we should always prefer greater benefits over lesser ones. It’s in fact the opposing view, which treats all life-saving interventions as equal, which fails to give equal weight to the interests of those who have so much more at stake.
Two asides:
This seems broadly correct (at least for someone who shares my biases); e.g. even in pure math John von Neumann warned:
As a mathematical discipline travels far from its empirical source, or still more, if it is a second and third generation only indirectly inspired by ideas coming from "reality" it is beset with very grave dangers. It becomes more and more purely aestheticizing, more and more purely l'art pour l'art. This need not be bad, if the field is surrounded by correlated subjects, which still have closer empirical connections, or if the discipline is under the influence of men with an exceptionally well-developed taste. But there is a grave danger that the subject will develop along the line of least resistance, that the stream, so far from its source, will separate into a multitude of insignificant branches, and that the discipline will become a disorganized mass of details and complexities. In other words, at a great distance from its empirical source, or after much "abstract" inbreeding, a mathematical subject is in danger of degeneration. ... In any event, whenever this stage is reached, the only remedy seems to me to be the rejuvenating return to the source: the re-injection of more or less directly empirical ideas.
This makes me wonder if it would be fruitful to look at & somehow incorporate mid-level principles into decision-relevant cost-effectiveness analyses that attempt to incorporate moral uncertainty, e.g. HLI's app or Rethink's CCM. (This is not at all a fleshed-out thought, to be clear)
Michael Dickens' 2016 post Evaluation Frameworks (or: When Importance / Neglectedness / Tractability Doesn't Apply) makes the following point I think is useful to keep in mind as a corrective:
INT has its uses, but I believe many people over-apply it.
Generally speaking (with some exceptions), people don’t choose between causes, they choose between interventions. That is, they don’t prioritize broad focus areas like global poverty or immigration reform. Instead, they choose to support specific interventions such as distributing deworming treatments or lobbying to pass an immigration bill. The INT framework doesn’t apply to interventions as well as it does to causes. In short, cause areas correspond to problems, and interventions correspond to solutions; INT assesses problems, not solutions.
(aside: Michael Plant makes the same point in chapters 5 & 6 of his PhD thesis as per Edo Arad's post, using it as a starting point to develop a systematic cause prio approach he called 'cause mapping')
In most cases, we can try to directly assess the true marginal impact of investing in an intervention. These assessments will never be perfectly accurate, but they generally seem to tell us more than INT does. ...
How can we estimate an intervention’s impact more directly? To develop a better framework, let’s start with the final result we want and work backward to see how to get it.
Dickens' post has more; the framework they end up with is this:
which (somewhat less practically, they note) could be fine-grained further:
I also appreciated that Dickens actually used this framework to guide their giving decision (more details in their post).
List of charities providing humanitarian assistance in the Israel-Hamas war mentioned in response to this request, for posterity and ease of reference:
Just came across Max Dalton's 2014 writeup Estimating the cost-effectiveness of research into neglected diseases, part of Owen Cotton-Barratt's project on estimating cost-effectiveness of research and similar activities. Some things that stood out to me:
~100x 95% CI range (mostly from estimates of total current funding to date, and difficulty of continuing with research), so the figures below can't really argue for a change in priorities so much as compel further research
This uncertainty is a lower bound, including only statistical uncertainty and not model uncertainty
Differing returns to research are largely driven by disease burden size, so look at diarrheal diseases, malaria, hookworm, ascariasis, trichuriasis, lymphatic filariasis, meningitis, typhoid, and salmonella – i.e. nothing too surprising
Estimated figures:
13.9 DALYs/$1k for the sector as a whole (vs ~20 DALYs/$1k for GWWC top charities back in 2014), 95% CI 1.43-130 DALYs/$1k
Median estimates: diarrheal diseases (e.g. cholera and dysentery) 121 DALYs/$1k, salmonella infections 74 DALYs/$1k, worms ~50 DALYs/$1k, leprosy 0.058 DALYs/$1k
Most of the top diseases have ~100x 95% CI range, except salmonella whose range is ~3,000x(!)
The following is a collection of long quotes from Ozy Brennan's post On John Woolman (which I stumbled upon via Aaron Gertler) that spoke to me. Woolman was clearly what David Chapman would call mission-oriented with respect to meaning of and purpose in life; Chapman argues instead for what he calls "enjoyable usefulness", which is I think healthier in ~every way ... it just doesn't resonate. All bolded text is my own emphasis, not Ozy's.
As a child, Woolman experienced a moment of moral awakening: ... [anecdote]
This anecdote epitomizes the two driving forces of John Woolman’s personality: deep compassion and the refusal to ever cut himself a moment of slack. You might say “it was just a bird”; you might say “come on, Woolman, what were you? Ten?” Woolman never thought like that. It was wrong to kill; he had killed; that was all there was to say about it.
When Woolman was a teenager, the general feeling among Quakers was that they were soft, self-indulgent, not like the strong and courageous Quakers of previous generations, unlikely to run off to Massachusetts to preach the Word if the Puritans decided once again to torture Quakers for their beliefs, etc. Woolman interpreted this literally. He spent his teenage years being like “I am depraved, I am evil, I have not once provoked anyone into whipping me to death, I don’t even want to be whipped to death.”
As a teenager, Woolman fell in with a bad crowd and committed some sins. What kind of sins? I don’t know. Sins. He's not telling us:
“I hastened toward destruction,” he writes. “While I meditate on the gulf toward which I travelled … I weep; mine eye runneth down with water.”
In actuality, Woolman’s corrupting friends were all... Quakers who happened to be somewhat less strict than he was. We have his friends' diaries and none of them remarked on any particular sins committed in this period. Biographers have speculated that Woolman was part of a book group and perhaps the great sin he was reproaching himself for was reading nonreligious books. He may also have been reproaching himself for swimming, skating, riding in sleighs, or drinking tea.
Woolman is so batshit about his teenage wrongdoing that many readers have speculated about the existence of different, non-Quaker friends who were doing all the sins. However, we have no historical evidence of him having other friends, and we have a fuckton of historical evidence of Woolman being extremely hard on himself about minor failings (or “failings”).
Most people who are Like That as teenagers grow out of it. Woolman didn’t. He once said something dumb in Weekly Meeting and then spent three weeks in a severe depression about it. He never listened to nonreligious music, read fiction or newspapers, or went to plays. He once stormed down to a tavern to tell the tavern owner that celebrating Christmas was sinful.
... if Woolman were just an 18th century neurotic, no one would remember him. We care about him because of his attitude about slavery.
When Woolman was 21, his employer asked him to write a bill of sale for an enslaved woman. Woolman knew it was wrong. But his employer told him to and he was scared of being fired. Both Woolman’s employer and the purchaser were Quakers themselves, so surely if they were okay with it it was okay. Woolman told both his master and the purchaser that he thought that Christians shouldn't own enslaved people, but he wrote the bill.
After he wrote the bill of sale Woolman lost his inner peace and never really recovered it. He spent the rest of his life struggling with guilt and self-hatred. He saw himself as selfish and morally deficient. ...
Woolman worked enough to support himself, but the primary project of his life was ending slavery. He wrote pamphlet after pamphlet making the case that slavery was morally wrong and unbiblical. He traveled across America making speeches to Quaker Meetings urging them to oppose slavery. He talked individually with slaveowners, both Quaker and not, which many people criticized him for; it was “singular”, and singular was not okay. ...
It is difficult to overstate how much John Woolman hated doing anti-slavery activism. For the last decade of his life, in which he did most of his anti-slavery activities, he was clearly severely depressed. ... Partially, he hated the process of traveling: the harshness of life on the road; being away from his family; the risk of bringing home smallpox, which terrified him.
But mostly it was the task being asked of Woolman that filled him with grief. Woolman was naturally "gentle, self-deprecating, and humble in his address", but he felt called to harshly condemn slaveowning Quakers. All he wanted was to be able to have friendly conversations with people who were nice to him. But instead, he felt, God had called him to be an Old Testament prophet, thundering about God’s judgment and the need for repentance. ...
Woolman craved approval from other Quakers. But even Quakers personally opposed to slavery often thought that Woolman was making too big a deal about it. There were other important issues. Woolman should chill. His singleminded focus on ending slavery was singular, and being singular was prideful. Isn’t the real sin how different Woolman’s abolitionism made him from everyone else?
Sometimes he persuaded individual people to free their slaves, but successes were few and far between. Mostly, he gave speeches and wrote pamphlets as eloquently as he could, and then his audience went “huh, food for thought” and went home and beat the people they’d enslaved. Nothing he did had any discernible effect.
... Woolman spent much of his time feeling like a failure. If he were better, if he followed God’s will more closely, if he were kinder and more persuasive and more self-sacrificing, then maybe someone would have lived free who now would die a slave, because Woolman wasn’t good enough.
The modern version of this is probably what Thomas Kwa wrote about here:
I think that many people new to EA have heard that multipliers like these exist, but don't really internalize that all of these multipliers stack multiplicatively. ... If she misses one of these multipliers, say the last one, ... Ana is losing out on 90% of her potential impact, consigning literally millions of chickens to an existence worse than death. To get more than 50% of her maximum possible impact, Ana must hit every single multiplier. This is one way that reality is unforgiving.
From one perspective, Woolman was too hard on himself about his relatively tangential connection to slavery. From another perspective, he is one of a tiny number of people in the eighteenth century who has a remotely reasonable response to causing a person to be in bondage when they could have been free. Everyone else flinched away from the scale of the suffering they caused; Woolman looked at it straight. Everyone else thought of slaves as property; Woolman alone understood they were people.
Some people’s high moral standards might result in unproductive self-flagellation and the refusal to take actions because they might do something wrong. But Woolman derived strength and determination from his high moral standards. When he failed, he regretted his actions and did his best to change them. At night he might beg God to fucking call someone else, but the next morning he picked up his walking stick and kept going.
And the thing he was doing mattered. Quaker abolitionism wasn’t inevitable; it was the result of hard work by specific people, of whom Woolman was one of the most prominent. If Woolman were less hard on himself, many hundreds if not thousands of free people would instead have been owned things that could be beaten or raped or murdered with as little consequence as I experience from breaking a laptop.
An aside (doubling as warning) on mission orientation, quoting Tanner Greer's Questing for Transcendence:
... out of the lands I’ve lived in and the roles I’ve donned, none blaze in my memory like the two years I spent as a missionary for the Church of Jesus Christ. It is a shame that few who review my resume ask about that time; more interesting experiences were packed into those few mission years than in the rest of the lot combined. ... I doubt I shall ever experience anything like it again. I cannot value its worth. I learned more of humanity’s crooked timbers in the two years I lived as missionary than in all the years before and all the years since.
Attempting to communicate what missionary life is like to those who have not experienced it themselves is difficult. ... Yet there is one segment of society that seems to get it. In the years since my service, I have been surprised to find that the one group of people who consistently understands my experience are soldiers. In many ways a Mormon missionary is asked to live something like a soldier... [they] spend years doing a job which is not so much a job as it is an all-encompassing way of life.
The last point is the one most salient to this essay. It is part of the reason both many ex-missionaries (known as “RMs” or “Return Missionaries” in Mormon lingo) and many veterans have such trouble adapting to life when they return to their homes. ... Many RMs report a sense of loss and aimlessness upon returning to “the real world.” They suddenly find themselves in a society that is disgustingly self-centered, a world where there is nothing to sacrifice or plan for except one’s own advancement. For the past two years there was a purpose behind everything they did, a purpose whose scope far transcended their individual concerns. They had given everything—“heart, might, mind and strength“—to this work, and now they are expected to go back to racking up rewards points on their credit card? How could they?
The soldier understands this question. He understands how strange and wonderful life can be when every decision is imbued with terrible meaning. Things which have no particular valence in the civilian sphere are a matter of life or death for the soldier. Mundane aspects of mundane jobs (say, those of the former vehicle mechanic) take on special meaning. A direct line can be drawn between everything he does—laying out a sandbag, turning off a light, operating a radio—and the ability of his team to accomplish their mission. Choice of food, training, and exercise before combat can make the difference between the life and death of a soldier’s comrades in combat. For good or for ill, it is through small decisions like these that great things come to pass.
In this sense the life of the soldier is not really his own. His decisions ripple. His mistakes multiply. The mission demands strict attention to things that are of no consequence in normal life. So much depends on him, yet so little is for him.
This sounds like a burden. In some ways it is. But in other ways it is a gift. Now, and for as long as he is part of the force, even his smallest actions have a significance he could never otherwise hope for. He does not live a normal life. He lives with power and purpose—that rare power and purpose given only to those whose lives are not their own.
... It is an exhilarating way to live.
This sort of life is not restricted to soldiers and missionaries. Terrorists obviously experience a similar sort of commitment. So do dissidents, revolutionaries, reformers, abolitionists, and so forth. What matters here is conviction and cause. If the cause is great enough, and the need for service so pressing, then many of the other things—obedience, discipline, exhaustion, consecration, hierarchy, and separation from ordinary life—soon follow. It is no accident that great transformations in history are sprung from groups of people living in just this way. Humanity is both at its most heroic and its most horrifying when questing for transcendence.
The remark that vultures are not particularly attractive reminds me of the overlooked plight of farmed chickens, shrimp, insects, etc., neglected for not being charismatic fauna. (I am admittedly sort of emotionally conflating the welfare of vultures with their ecosystem importance as a keystone species here.)
Pretty funny CGD blog post by Victoria Fan and Rachel Bonnifield: If the Global Health Donors Were Your Parents: A (Whimsical) Comparative Perspective. Quoting at length (with some reformatting):
I like Austin Vernon's idea for scaling CO2 direct air capture to 40 billion tons per year, i.e. matching our current annual CO2 emissions, using (extreme versions of) well-understood industrial processes.
I am admittedly perhaps biased to want moonshots like Vernon's idea to work, and for society at large to be able to coordinate and act on the required scale, after seeing these depressing charts from Assessing the costs of historical inaction on climate change:
Curious what people think of Gwern Branwen's take that our moral circle has historically narrowed as well, not just expanded (so contra Singer), so we should probably just call it a shifting circle. His summary:
(I admittedly don't find his examples all that persuasive, probably because I'm already biased to only consider beings that can feel pleasure and suffering.)
What's the "so what"? Gwern:
Hi Mo. I'm unsure if you've seen it, but Gwern’s article was discussed here.
I hadn't, thanks for the pointer Pablo.
Why did India's happiness ratings consistently drop so much over time even as its GDP per capita rose?
Epistemic status: confused. Haven't looked into this for more than a few minutes
My friend recently alerted me to an observation that puzzled him: this dynamic chart from Our World in Data's happiness and life satisfaction article showing how India's self-reported life satisfaction dropped by an astounding 1.20 points (4.97 to 3.78) from 2011 to 2021, even as its GDP per capita rose +51% (I$4,374 to I$6,592 in 2017 prices):
(I included China for comparison to illustrate the sort of trajectory I expected to see for India.)
The sliding year scale on OWID's chart shows how this drop has been consistent and worsening over the years. This picture hasn't changed much recently: the most recent 2024 World Happiness Report reports a 4.05 rating averaged over the 3-year window 2021-23, only slightly above the 2021 rating.
A -1.20 point drop is huge. For context, it's 10x(!) larger than the effect of doubling income at +0.12 LS points (Clarke et al 2018 p199, via HLI's report), and compares to major negative life events like widowhood and extended unemployment:
Given India's ~1.4 billion population, such a large drop is alarming: roughly ~5 billion LS-years lost since 2011, very roughly ballparking. For context, and keeping in mind that LS-years and DALYs aren't the same thing, the entire world's DALY burden is ~2.5 billion DALYs p.a.
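(One way to recover that ballpark, purely my own back-of-envelope and not the original reasoning: assume an average shortfall of ~0.3 points sustained over the ~12 years since 2011 across ~1.4 billion people:

$$1.4 \times 10^{9}\ \text{people} \times 0.3\ \text{points} \times 12\ \text{years} \approx 5 \times 10^{9}\ \text{LS-years}$$

The ~0.3-point average shortfall is an assumption on my part.)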
But – again caveating with my lack of familiarity with the literature and extremely cursory look into this – I haven't seen any writeup look into this, which makes me wonder if it's not a 'real issue'? For instance, the 2021 WHR just says
That's it: no elaboration, no footnotes, nothing.
So what am I missing? What's going on here?
A quick search turned up this WEF article (based on Ipsos data and research, not the WHR's Gallup World Poll, so take it with a grain of salt) pointing to
But I'm not sure these factors are differential (i.e. that they, for instance, happen much more in India than elsewhere, such that they explain the difference in wellbeing-vs-development trajectories over 2011-24)?
Interesting! I think figure 2.1 here provides a partial answer. According to the FAQ:
"the sub-bars show the estimated extent to which each of the six factors (levels of GDP, life expectancy, generosity, social support, freedom, and corruption) is estimated to contribute to making life evaluations higher in each country than in Dystopia. Dystopia is a hypothetical country with values equal to the world’s lowest national averages for each of the six factors (see FAQs: What is Dystopia?). The sub-bars have no impact on the total score reported for each country but are just a way of explaining the implications of the model estimated in Table 2.1. People often ask why some countries rank higher than others—the sub-bars (including the residuals, which show what is not explained) attempt to answer that question."
India seems to score very low on social support, compared to similarly ranked countries.
I did some googling and found this, which shows the sub-factors over time for India. Looks like social support declined a lot, but is now increasing again.
I haven't checked whether it declined more than in other countries and, if it has, I'm not sure why it has.
Thank you for the pointer!
Your second link helped me refine my line of questioning / confusion. You're right that social support declined a lot, but the sum of the six key variables (GDP per capita, etc) still mostly trended upwards over time, huge covid dip aside, which is what I'd expect in the India development success story.
It's the dystopia residual that keeps dropping, from 2.275 - 1.83 = 0.445 in 2015 (i.e. Indians reported 0.445 points higher life satisfaction than you'd predict using the model) to 0.979 - 1.83 = -0.85: an absolute plummeting of life satisfaction across a sizeable fraction of the world's population that's for some reason not explained by the six key variables. Hm...
(please don't feel obliged to respond – I appreciate the link!)
Could this be related to the rising level of inequality in happiness levels in Asia? (See the graph on page 44 of the WHR2024). It can be assumed that the benefits of GDP growth are not evenly distributed, and increasing inequalities trigger frustration and a decrease in well-being in the majority of the population (since to a certain extent, the sense of welfare is relative).
This is how Our World in Data explains a similar phenomenon in the US: "Income inequality in the US is exceptionally high and has been on the rise in the last four decades, with incomes for the median household growing much more slowly than incomes for the top 10%. As a result, trends in aggregate life satisfaction should not be seen as paradoxical: the income and standard of living of the typical US citizen have not grown much in the last couple of decades."
Yeah rising inequality is a good guess, thank you – the OWID chart also shows the US experiencing the same trajectory direction as India (declining average LS despite rising GDP per capita). I suppose one way to test this hypothesis is to see if China had inequality rise significantly as well in the 2011-23 period, since it had the expected LS-and-GDP-trending-up trajectory. Probably a weak test due to potential confounders...
As someone predisposed to like modeling, the key takeaway I got from Justin Sandefur's Asterisk essay PEPFAR and the Costs of Cost-Benefit Analysis was this corrective reminder – emphasis mine, focusing on what changed my mind:
More detail:
Tangentially, I suspect this sort of attitude (Iraq invasion notwithstanding) would naturally arise out of a definite optimism mindset (that essay by Dan Wang is incidentally a great read; his follow-up is more comprehensive and clearly argued, but I prefer the original for inspiration). It seems to me that Justin has this mindset as well, cf. his analogy to climate change in comparing economists' carbon taxes and cap-and-trade schemes vs progressive activists pushing for green tech investment to bend the cost curve. He concludes:
Aside from his climate change example above, I'd be curious to know what other domains economists are making analytical mistakes in w.r.t. cost-benefit modeling, since I'm probably predisposed to making the same kinds of mistakes.
This WHO press release was a good reminder of the power of immunization – a new study forthcoming in The Lancet reports that (liberally quoting / paraphrasing the release)
Great OWID charts for this:
(Attention conservation notice: rambling in public)
A striking throwaway remark, given its context:
It's striking given that this comes from this book on Thailand’s Health Intervention and Technology Assessment Program (HITAP) (ch 1 pg 22), albeit perhaps understandable given the authors' stance that evidence is necessary but not sufficient to determine the best course of action (to treat a patient, to design a social insurance scheme, etc), which seems completely unobjectionable.
That said, I did wonder about the first half of the quoted throwaway remark, so I asked Elicit; its top-4 paper summary is
although the summary didn't include these papers it listed in the top 10
With the proviso that I'm a layperson w.r.t. medicine and healthcare, and that I didn't ask Elicit further questions or really dig further into this at all — I find myself mostly unmoved by these papers & reviews, while the younger me of (say) a decade ago would've epistemically panicked. Partly it's that they aren't really contra "using evidence to inform medicine" per se: to oversimplify a bit, Bahtsevani et al recommend more evidence generation, Every-Palmer & Howick recommend less industry-biased evidence generation, and Shaw et al argue that other less legible-than-RCT types of evidence should occupy more mindshare than they did back in '07 (there's a loose parallel here to the more recent growth vs randomista debate in dev econ). Partly it's that I suspect there's some talking past each other, which only becomes clear when one digs into the nuts-and-bolts. Partly it's that I think the general underlying ethos of "using evidence to inform medicine" is a lot more robust than any particular instantiation of it (e.g. using only empirical data from systematic reviews of RCTs), sort of like how cluster thinking > sequence thinking for decision-making, or like how foxes have weak views strongly held (side note: in that essay's framing I used to be a hedgehog, hopefully I'm now more fox than degenerate cactus). Partly it's that I've "seen this before" with other topics, cf. Scott Alexander's many deep dives. Maybe I'm just getting old...
I haven't looked in detail, but my quick comment would be that these studies seem to basically be comparing extremely careful following of evidence-based medicine vs. "normal medical practise", which is like 90%+ based on evidence anyway. Standard medical training and registered medical practise in most of the world closely follows the evidence - it would be very difficult (maybe impossible) to practise "outside" of the evidence. So not finding a huge difference between these 2 ways of practising isn't so surprising.
Epistemic status: public attempt at self-deconfusion & not just stopping at knee-jerk skepticism
The recently published Cost-effectiveness of interventions for HIV/AIDS, malaria, syphilis, and tuberculosis in 128 countries: a meta-regression analysis (so recent it's listed as being published next month), in my understanding, aims to fill country-specific gaps in CEAs for all interventions in all countries for HIV/AIDS, malaria, syphilis, and tuberculosis, to help national decision-makers allocate resources effectively – to a first approximation I think of it as "like the DCP3 but at country granularity and for Global Fund-focused programs". They do this by predicting ICERs, IQRs, and 95% UIs in US$/DALY using the meta-regression parameters obtained from analysing ICERs published for these interventions (more here).
AFAICT their methodology and execution seem superb, so I was keen to see their results:
At risk of being overly skeptical: $3 per DALY averted is >30x better than Open Phil's 1,000x bar of $100 per DALY, which is roughly GiveWell top charity level and which OP has said is hard to beat, especially for a direct intervention like antenatal syphilis screening. It makes me wonder how much credence to put in the study's findings for actual resource allocation decisions (esp. Figure 4, ranking top interventions at country granularity). Also:
So how should I think about and use their findings given what seems like reasonable grounds for skepticism, if I'm primarily interested in helping decision-makers help people better? Scattered thoughts to defend the study / push back on my nitpicking above:
That said, I do wonder if the authors could have done a bit better, like
The positive spin is that someone got funded to do this kind of big-picture analysis and got it published in The Lancet.
There were 1,792 potential country-intervention pairs (although it is not immediately clear if they did all 1,792 pairs). So I don't think most reasonable readers would view these findings as substitutes for a more in-depth, country-specific analysis on the potentially promising intervention. They did publish at least some data for each intervention, although maybe it isn't enough to poke at each of the country-intervention pairs.
One of the more surprising things I learned from Karen Levy's 80K podcast interview on misaligned incentives in global development was how her experience directly contradicted a stereotype I had about for-profits vs nonprofits:
My assumption had been that the opposite was true.
The following table is from Scott Alexander's post, which you should check out for the sources and (many, many) caveats.
I'm curious what people who're more familiar with infinite ethics think of Manheim & Sandberg's What is the upper limit of value?, in particular where they discuss infinite ethics (emphasis mine):
I first read their paper a few years ago and found their arguments for the finiteness of value persuasive, as well as their collectively-exhaustive responses in section 4 to possible objections. So ever since then I've been admittedly confused by claims that the problems of infinite ethics still warrant concern w.r.t. ethical decision-making (e.g. I don't really buy Joe Carlsmith's arguments for acknowledging that infinities matter in this context, same for Toby Ord's discussion in a recent 80K podcast). What am I missing?
Sandberg's recent 80K podcast interview transcript has this quote:
I like John Salter's post on schlep blindness in EA (inspired by Paul Graham's eponymous essay), whose key takeaway is
Unfortunately it's bereft of concrete examples. The closest to a shortlist he shares is in this comment:
(I disagree with some of these; e.g. the first bullet seems contradicted by the propensity for forum drama on adjacent topics, and as someone who likes compiling lists and data I don't actually see much low-hanging fruit for me to contribute here due to the work of e.g. Hamish)
I'd be keen to learn other examples. He does give this advice to brainstorm examples:
although in my case it's not useful because I either just end up doing it (or trying, failing, and learning why), or discover that it's already been done better than I could (e.g. Rethink Priorities' new CCM).
That said, I still think the original takeaway is a useful reminder.
[Question] How should we think about the decision relevance of models estimating p(doom)?
(Epistemic status: confused & dissatisfied by what I've seen published, but haven't spent more than a few hours looking. Question motivated by Open Philanthropy's AI Worldviews Contest; this comment thread asking how OP updated reminded me of my dissatisfaction. I've asked this before on LW but got no response; curious to retry, hence repost)
To illustrate what I mean, switching from p(doom) to timelines:
Question: Is there something like this for p(doom) estimates? More specifically, following the above points as pushback against the strawman(?) that "p(doom) discourse, including rigorous modeling of it, is overrated":
Or is this whole line of questioning simply misguided or irrelevant?
Some writings I've seen gesturing in this direction:
This question was mainly motivated by my attempt to figure out what to make of people's widely-varying p(doom) estimates, e.g. in the appendix section of Apart Research's website, beyond simply "there is no consensus on p(doom)". I suppose one can argue that rigorous p(doom) modeling helps reduce disagreement on intuition-driven estimates by clarifying cruxes or deconfusing concepts, thereby improving confidence and coordination on what to do, but in practice I'm unsure if this is the case (reading e.g. the public discussion around the p(doom) modeling by Carlsmith, Froolow, etc), so I'm not sure I buy this argument, hence my asking for concrete examples.
I just learned about Tom Frieden via Vadim Albinsky's writeup Resolve to Save Lives Trans Fat Program for Founders Pledge. His impact in sheer lives saved is astounding, and I'm embarrassed I didn't know about him before:
How many more? Albinsky estimates:
Tangentially, if a "Borlaug" is a billion lives saved, then Frieden's impact is probably on the scale of ~100 milliBorlaugs (to nearest OOM). Bill and Melinda likely have had similar impact. This makes me wonder who else I don't know about who's done ~100 milliBorlaugs of good.
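Spelling out the unit conversion (taking a Borlaug to be $10^9$ lives saved, per the definition above):

$$100\ \text{milliBorlaugs} = 0.1\ \text{Borlaug} = 0.1 \times 10^{9} = 10^{8}\ \text{lives saved}$$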
(It's arguably unfair to wholly attribute all those lives saved to Frieden, and I am honestly unsure what credit attribution makes most sense, but applying the same logic to Borlaug you can no longer really say he saved a billion lives.)
The 1,000-ton rule is Richard Parncutt's suggestion for reframing the political message of the severity of global warming in particularly vivid human rights terms; it says that someone in the next century or two is prematurely killed every time humanity burns 1,000 tons of carbon.
I came across this paper while (in the spirit of Nuno's suggestion) trying to figure out the 'moral cost of climate change' so to speak, driven by my annoyance that e.g. climate charity BOTECs report $ per ton of CO2-eq averted, in contrast to (say) the $ per death averted bottom line of GHW charities, since I don't intrinsically care about averting CO2-equivalent emissions the way I care about averting deaths. (To be clear, I understand why the BOTECs do so and would do the same for work; this is for my own moral clarity.)
Parncutt's derivation is simple: burning a trillion tons of carbon will cause ~2 °C of anthropogenic global warming, which will in turn cause 1 - 10 million premature deaths a year "for a period of several centuries", something like this:
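Spelled out (my reconstruction of the arithmetic, taking ~1 billion total premature deaths, i.e. a few million per year over a few centuries):

$$\frac{\sim 10^{9}\ \text{premature deaths}}{10^{12}\ \text{tonnes of carbon}} = 10^{-3}\ \text{deaths per tonne} = 1\ \text{death per } 1000\ \text{tonnes burned}$$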
Modelling the rise in global mean surface temperature (GMST) as a function of carbon burned is already very hard. Parncutt doesn't try to model premature deaths as a function of GMST; instead he makes a semi-quantitative, order-of-magnitude estimate, anchored at the lower and upper ends to various catastrophic outcomes discussed in the climate change literature, and assumes a lognormal distribution around a billion future deaths with a 10x range between worst- and best-case scenarios, which over time looks 'very approximately' like this:
Based on the 1,000-ton rule, Pearce & Parncutt suggest the 'millilife' as "an accessible unit of measure for carbon footprints that is easy to understand and may be used to set energy policy to help accelerate carbon emissions reductions". A millilife is a measure of intrinsic value defined to be 1/1000th of a human life; the 1,000-ton rule says that burning a ton of fossil carbon destroys a millilife. This lets Pearce & Parncutt make statements like these, at an individual level (all emphasis mine):
and
and for "large-scale energy decisions":
Pearce & Parncutt then use the 1,000-ton rule and millilife to make various suggestions. Here's one:
Notes from Ozy Brennan's On capabilitarianism