Striking paper by Anant Sudarshan and Eyal Frank (via Dylan Matthews at Vox Future Perfect) on the importance of vultures as a keystone species.
To quote the paper and newsletter, the basic story: vultures are extraordinarily efficient scavengers, eating nearly all of a carcass within an hour of finding it, and farmers in India historically relied on them to quickly remove livestock carcasses, so they functioned as a natural sanitation system, controlling diseases that would otherwise spread from rotting remains. In 1994, after a long-held Novartis patent expired and cheap generics from Indian manufacturers entered the market, farmers began using diclofenac to treat their livestock. Diclofenac is a common painkiller, harmless to humans, but vultures develop kidney failure and die within weeks of digesting carrion with even small residues of it. Unfortunately this only came to light via research published a decade later, in 2004, by which time the number of Indian vultures in the wild had tragically plummeted from tens of millions to just a few thousand today – the fastest decline of any bird species in recorded history, and the largest in magnitude since the extinction of the passenger pigeon.
When the vultures died out, far more dead animals lay around rotting, transmitting pathogens to other scavengers like dogs and rats and entering the water supply. Dogs and rats are less efficient than vultures at fully eliminating flesh from carcasses, leading to a higher incidence of human contact with infected remains, and they're also more likely to transmit diseases like anthrax and rabies to people. Sudarshan and Frank estimate that this led to ~100,000(!) additional deaths each year from 2000-05, via a +4.2%(!) increase in all-cause mortality among the 430 million people living in districts that once had many vultures. That's staggering: it exceeds India's 2001 death tolls from HIV/AIDS (92,000), malaria (53,000), and alcohol use disorders (14,000).
(Cause X, anyone? Preventing a hundred thousand deaths a year for less than half a billion dollars annually clears the GiveWell top charity-level threshold, and half a billion is in the ballpark of Open Philanthropy's entire annual grantmaking...)
So what to do? For vultures in particular, Sudarshan and Frank say their results "inform current vulture recovery efforts in India, and conservation efforts elsewhere" e.g. parts of Africa and Spain, albeit without elaborating. More broadly, they hope their paper informs better policymaking by providing "a particularly stark example of the type of hard-to-reverse and unpredictable costs that must be accounted for when evaluating the introduction of new chemicals into fragile and diverse ecosystems", stating "it is plausible that a counterfactual policy regime in India that tested chemicals for their toxicity to at least keystone species might have avoided the collapse of vultures". They conclude:
In the absence of empirical estimates of the social benefits conferred by different species, conservation policy may be heavily influenced by existence values unrelated to utility. The vulture is not a particularly attractive bird and evokes rather different emotions at first sight than do more charismatic poster-animals of wildlife conservation such as tigers and panda bears. Nevertheless our results suggest that subjective existence values alone may not be the best way to formulate conservation policy.
The remark that vultures are not particularly attractive reminds me of the overlooked plight of farmed chickens, shrimp, insects etc for not being charismatic fauna. (I am admittedly sort of emotionally conflating the welfare of vultures with their ecosystem importance as a keystone species here.)
Navigating the global health funding landscape can be confusing even for global health veterans; there are scores of donors and multilateral funding mechanisms, each with its own particular structure, personality, and philosophy. For the uninitiated, PEPFAR, GAVI, PMI, WHO, the Global Fund, UNITAID, and the Gates Foundation can all appear obscure and intimidating. But if your head is spinning from acronym-induced vertigo, fear not! We are here to help you make sense of it all. How, you ask? With a clear method for donor identification: comparing the donors to your parents. So what would happen if the donors were your parents and you asked them for a new car?
PEPFAR: Ok, we’ll buy you a new car, but we’re going with you to the dealership and it must be American-made. At least one seat must be devoted to abstinence and the delay of sexual debut. Before you drive the car, you must promise not to support prostitution. Each quarter, you must report how many miles you’ve driven with how many passengers, with a target of 1000 passenger-miles per month.
President’s Malaria Initiative: We’ve made it very clear that we only support four proven, cost-effective interventions for child rearing: food, clothing, health care, and education. What, do you think money in the Malaria family just grows on trees? Just because HIV/AIDS has a shiny new car doesn’t mean we can afford it.
UNITAID: We’ve identified pediatric vehicles as a niche market which is currently underserved by the major transport providers. By buying cars for you and all our other children, we are helping to create a pediatric automotive market with new and superior transportation commodities. Prior to our innovative entry into the pediatric vehicle market, most of our potential beneficiaries were getting around using lower-quality forms of transportation, such as bicycles, buses, and walking.
GAVI: We will purchase and deliver a car for you from a particular GAVI-approved dealership. However, you must co-finance the purchase with wages from your part-time job. Gas and insurance will require separate applications.
WHO: Sorry, we haven’t had a car budget in ten years. But we DO have a new set of guidelines on best practices for safe car driving, and a box full of old Carfax vehicle reports that you’re welcome to look at any time. Please let us know right away if you experience any engine trouble; regular and reliable reporting allows us to maintain an up-to-date transmission failure surveillance system. And don’t forget to celebrate Vehicle Safety Day on May 11!
Gates Foundation: Of course, darling, we gave your boarding school plenty of money to buy a car. And since we’re on the Board, we’ll make sure they buy the right car. And you can drive it any time you want…as long as one of us is in the passenger seat to make sure you’re going the right way.
Global Fund: We’ve reviewed your proposal for a Range Rover and according to Consumer Reports it is a technically capable car for city driving. Here is a $70,000 check for you to go and buy the Range Rover, as discussed in your proposal.
I like Austin Vernon's idea for scaling CO2 direct air capture to 40 billion tons per year, i.e. matching our current annual CO2 emissions, using (extreme versions of) well-understood industrial processes.
The proposed solution may not be the cheapest out there. Other ideas like ocean seeding or olivine weathering might be less expensive. But most of the science is understood, and it can scale quickly. I'd guess 100,000 workers could build enough sites to capture our 40 billion tons goal in a decade. The capital expenditure rate would be between $1 trillion and $5 trillion yearly, or 1% to 5% of global GDP. That cost and deployment speed take doomer scenarios off the table. Say something scary like melting permafrost threatens runaway warming. You can target the area with a few years of sulfur cooling while a tiny portion of the global economy builds carbon capture devices. It is nothing like a wartime mobilization.
The most disruptive aspect would be energy usage. We'd need to ramp output up at double-digit rates because each ton of CO2 requires 2-3 MWh of energy for removal. Thankfully low-grade heat is easy to come by. There is enough energy near coal mines in Wyoming or natural gas fields in SW Pennsylvania at less than $5/MWh. Other places might use solar, hydro, or geothermal steam if they lack fossil fuel reserves. The key is to put the facilities at the energy sources instead of trying to move the energy. Cheap energy makes the operating costs <1% of global GDP. Many clean energy proponents have fretted about how to keep fossil fuel reserves in the ground. Burning them to run carbon capture equipment kills two birds with one stone!
The takeaway is that we could completely turn around the carbon dioxide problem within a few years with a similar spending rate as rich world COVID relief. There won't be a scenario where we've waited too long to act.
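A quick sanity check on the quoted figures (a hedged back-of-envelope; the capture target, energy intensity, and energy price are from the excerpt above, while the ~$100T world GDP input is my own rough assumption, not Vernon's):

```python
# Back-of-envelope on the quoted figures only; nothing here is from
# Vernon's underlying model.
CO2_TONS_PER_YEAR = 40e9   # capture target matching current annual emissions
PRICE_PER_MWH = 5          # "<$5/MWh" low-grade heat near coal/gas fields
GLOBAL_GDP = 100e12        # ~$100T world GDP, my rough assumption

for mwh_per_ton in (2, 3):  # "each ton of CO2 requires 2-3 MWh"
    energy_twh = CO2_TONS_PER_YEAR * mwh_per_ton / 1e6  # MWh -> TWh
    opex = CO2_TONS_PER_YEAR * mwh_per_ton * PRICE_PER_MWH
    print(f"{mwh_per_ton} MWh/ton: ~{energy_twh:,.0f} TWh/yr of heat, "
          f"opex ~${opex / 1e12:.1f}T/yr (~{opex / GLOBAL_GDP:.1%} of GDP)")
# 2 MWh/ton: ~80,000 TWh/yr of heat, opex ~$0.4T/yr (~0.4% of GDP)
# 3 MWh/ton: ~120,000 TWh/yr of heat, opex ~$0.6T/yr (~0.6% of GDP)
# Consistent with Vernon's "operating costs <1% of global GDP" claim.
```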
I admittedly may be biased toward wanting moonshots like Vernon's idea to work, and toward believing society at large can coordinate and act on the required scale, after seeing the depressing charts in Assessing the costs of historical inaction on climate change.
Curious what people think of Gwern Branwen's take that our moral circle has historically narrowed as well as expanded (contra Singer), such that we should probably just call it a shifting circle. His summary:
The “expanding circle” historical thesis ignores all instances in which modern ethics narrowed the set of beings to be morally regarded, often backing its exclusion by asserting their non-existence, and thus assumes its conclusion: where the circle is expanded, it’s highlighted as moral ‘progress’, and where it is narrowed, what is outside is simply defined away.
When one compares modern with ancient society, the religious differences are striking: almost every single supernatural entity (place, personage, or force) has been excluded from the circle of moral concern, where they used to be huge parts of the circle and one could almost say the entire circle. Further examples include estates, houses, fetuses, prisoners, and graves.
(I admittedly don't find his examples all that persuasive, probably because I'm already biased to only consider beings that can feel pleasure and suffering.)
What's the "so what"? Gwern:
One of the most difficult aspects of any theory of moral progress is explaining why moral progress happens when it does, in such apparently random non-linear jumps. (Historical economics has a similar problem with the Industrial Revolution & Great Divergence.) These jumps do not seem to correspond to simply how many philosophers are thinking about ethics.
As we have already seen, the straightforward picture of ever more inclusive ethics relies on cherry-picking if it covers more than, say, the past 5 centuries; and if we are honest enough to say that moral progress isn’t clear before then, we face the new question of explaining why things changed then and not at any point previous in the 2500 years of Western philosophy, which included many great figures who worked hard on moral philosophy such as Plato or Aristotle.
It is also troubling how much morality & religion seems to be correlated with biological factors. Even if we do not go as far as Julian Jaynes’s theories of gods as auditory hallucinations, there are still many curious correlations floating around.
Why did India's happiness ratings consistently drop so much over time even as its GDP per capita rose?
Epistemic status: confused. Haven't looked into this for more than a few minutes
My friend recently alerted me to an observation that puzzled him: this dynamic chart from Our World in Data's happiness and life satisfaction article shows India's self-reported life satisfaction dropping an astounding 1.20 points (from 4.97 to 3.78) from 2011 to 2021, even as its GDP per capita rose +51% (I$4,374 to I$6,592 in 2017 prices).
(I included China for comparison to illustrate the sort of trajectory I expected to see for India.)
The sliding year scale on OWID's chart shows how this drop has been consistent and worsening over the years. This picture hasn't changed much recently: the most recent 2024 World Happiness Report reports a 4.05 rating averaged over the 3-year window 2021-23, only slightly above the 2021 rating.
A 1.20-point drop is huge. For context, it's 10x(!) larger than the effect of doubling income, at +0.12 LS points (Clarke et al 2018 p199, via HLI's report), and is comparable to major negative life events like widowhood and extended unemployment.
Given India's ~1.4 billion population, such a large drop is alarming: ballparking very roughly, something like ~5 billion LS-years lost since 2011. For context, and keeping in mind that LS-years and DALYs aren't the same thing, the entire world's DALY burden is ~2.5 billion DALYs p.a.
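Here's one way to reproduce that order of magnitude (a hedged sketch; the linear-ramp assumption and the population figure are mine, not from any source):

```python
# Very rough reconstruction of the LS-years ballpark above; all
# assumptions are mine, not the original calculation.
POPULATION = 1.35e9   # rough average population of India over the period
FINAL_DROP = 1.20     # LS points lost between 2011 and 2021
YEARS = 10

# Assume the drop ramped up roughly linearly, so the average shortfall
# over the decade is about half the final drop:
avg_shortfall = FINAL_DROP / 2
ls_years_lost = POPULATION * avg_shortfall * YEARS
print(f"~{ls_years_lost / 1e9:.0f} billion LS-years")  # ~8 billion
# Same ballpark (single-digit billions) as the ~5 billion figure above.
```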
But – again caveating with my lack of familiarity with the literature and extremely cursory look into this – I haven't seen any writeup look into this, which makes me wonder if it's not a 'real issue'? For instance, the 2021 WHR just says
Since 2006-08, world well-being has been static, but life expectancy increased by nearly four years up to 2017-19 (we shall come to 2020 later). The rate of progress differed a lot across regions. The biggest improvements in life expectancy were in the former Soviet Union, in Asia, and (the greatest) in Sub-Saharan Africa. And these were the regions that had the biggest increases in WELLBYs. In Asia, the exception is South Asia, where India has experienced a remarkable fall in Well-being which more than outweighs its improved life expectancy.
That's it: no elaboration, no footnotes, nothing.
So what am I missing? What's going on here?
A quick search turned up this WEF article (based on Ipsos data and research, not the WHR's Gallup World Poll, so take it with a grain of salt) pointing to
increased internet access -> pressure to portray airbrushed lives on social media & a feeling that 'their lives have become meaningless'
covid-19 mitigation-induced isolation curtailing activities that improve wellbeing (employment, socializing, going to school, exercising and accessing health services)
urban migration to seek work -> traffic congestion, noise and pollution, demanding bosses -> less sleep and exercise -> higher anxiety and worsening health
But I'm not sure these factors are differential (i.e. that they, for instance, happen much more in India than elsewhere s.t. it explains the wellbeing vs development trajectory difference over 2011-24)?
Interesting! I think figure 2.1 here provides a partial answer. According to the FAQ:
"the sub-bars show the estimated extent to which each of the six factors (levels of GDP, life expectancy, generosity, social support, freedom, and corruption) is estimated to contribute to making life evaluations higher in each country than in Dystopia. Dystopia is a hypothetical country with values equal to the world’s lowest national averages for each of the six factors (see FAQs: What is Dystopia?). The sub-bars have no impact on the total score reported for each country but are just a way of explaining the implications of the model estimated in Table 2.1. People often ask why some countries rank higher than others—the sub-bars (including the residuals, which show what is not explained) attempt to answer that question."
India seems to score very low on social support, compared to similarly ranked countries.
I did some googling and found this, which shows the sub-factors over time for India. Looks like social support declined a lot, but is now increasing again.
I haven't checked whether it declined more than in other countries and, if it has, I'm not sure why it has.
Your second link helped me refine my line of questioning / confusion. You're right that social support declined a lot, but the sum of the six key variables (GDP per capita, etc) still mostly trended upwards over time, huge covid dip aside, which is what I'd expect in the India development success story.
It's the dystopia residual that keeps dropping: from 2.275 - 1.83 = 0.445 in 2015 (i.e. Indians reported 0.445 points higher life satisfaction than you'd predict using the model) to 0.979 - 1.83 = -0.85 now – a plummeting of life satisfaction across a sizeable fraction of the world's population that is, for some reason, not explained by the six key variables. Hm...
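To spell out the arithmetic (my sketch of the WHR decomposition as I understand it; the 1.83 dystopia baseline and the dystopia-plus-residual values are from the data linked above):

```python
# WHR decomposition: ladder score = dystopia baseline + six factor
# contributions + residual, so the published "dystopia + residual"
# value minus the baseline gives the residual.
DYSTOPIA_BASELINE = 1.83

for label, dystopia_plus_residual in [("2015", 2.275), ("latest", 0.979)]:
    residual = dystopia_plus_residual - DYSTOPIA_BASELINE
    print(f"{label}: residual = {residual:+.3f}")
# 2015: residual = +0.445   (higher LS than the model predicts)
# latest: residual = -0.851 (lower LS than the model predicts)
```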
(please don't feel obliged to respond – I appreciate the link!)
Could this be related to the rising level of inequality in happiness levels in Asia? (See the graph on page 44 of the WHR2024). It can be assumed that the benefits of GDP growth are not evenly distributed, and increasing inequalities trigger frustration and a decrease in well-being in the majority of the population (since to a certain extent, the sense of welfare is relative).
This is how Our World in Data explains a similar phenomenon in the US:
"Income inequality in the US is exceptionally high and has been on the rise in the last four decades, with incomes for the median household growing much more slowly than incomes for the top 10%. As a result, trends in aggregate life satisfaction should not be seen as paradoxical: the income and standard of living of the typical US citizen have not grown much in the last couple of decades."
Yeah rising inequality is a good guess, thank you – the OWID chart also shows the US experiencing the same trajectory direction as India (declining average LS despite rising GDP per capita). I suppose one way to test this hypothesis is to see if China had inequality rise significantly as well in the 2011-23 period, since it had the expected LS-and-GDP-trending-up trajectory. Probably a weak test due to potential confounders...
As someone predisposed to like modeling, the key takeaway I got from Justin Sandefur's Asterisk essay PEPFAR and the Costs of Cost-Benefit Analysis was this corrective reminder – emphasis mine, focusing on what changed my mind:
Second, economists were stuck in an austerity mindset, in which global health funding priorities were zero-sum: $300 for a course of HIV drugs means fewer bed nets to fight malaria. But these trade-offs rarely materialized. The total budget envelope for global public health in the 2000s was not fixed. PEPFAR raised new money. That money was probably not fungible across policy alternatives. Instead, the Bush White House was able to sell a dramatic increase in America’s foreign aid budget by demonstrating that several billion dollars could, realistically, halt an epidemic that was killing more people than any other disease in the world.
...
A broader lesson here, perhaps, is about getting counterfactuals right. In comparative cost-effectiveness analysis, the counterfactual to AIDS treatment is the best possible alternative use of that money to save lives. In practice, the actual alternative might simply be the status quo, no PEPFAR, and a 0.1% reduction in the fiscal year 2004 federal budget. Economists are often pessimistic about the prospects of big additional spending, not out of any deep knowledge of the budgeting process, but because holding that variable fixed makes analyzing the problem more tractable. In reality, there are lots of free variables.
More detail:
Economists’ standard optimization framework is to start with a fixed budget and allocate money across competing alternatives. At a high-level, this is also how the global development community (specifically OECD donors) tends to operate: foreign aid commitments are made as a proportion of national income, entirely divorced from specific policy goals. PEPFAR started with the goal instead: Set it, persuade key players it can be done, and ask for the money to do it.
Bush didn’t think like an economist. He was apparently allergic to measuring foreign aid in terms of dollars spent. Instead, the White House would start with health targets and solve for a budget, not vice versa. ... Economists are trained to look for trade-offs. This is good intellectual discipline. Pursuing “Investment A” means forgoing “Investment B.” But in many real-world cases, it’s not at all obvious that the realistic alternative to big new spending proposals is similar levels of big new spending on some better program. The realistic counterfactual might be nothing at all.
In retrospect, it seems clear that economists were far too quick to accept the total foreign aid budget envelope as a fixed constraint. The size of that budget, as PEPFAR would demonstrate, was very much up for debate.
When Bush pitched $15 billion over five years in his State of the Union, he noted that $10 billion would be funded by money that had not yet been promised. And indeed, 2003 marked a clear breaking point in the history of American foreign aid. In real-dollar terms, aid spending had been essentially flat for half a century at around $20 billion a year. By the end of Bush’s presidency, between PEPFAR and massive contracts for Iraq reconstruction, that number hovered around $35 billion. And it has stayed there since.
Compared to normal development spending, $15 billion may have sounded like a lot, but exactly one sentence after announcing that number in his State of the Union address, Bush pivoted to the case for invading Iraq, a war that would eventually cost America something in the region of $3 trillion — not to mention thousands of American and hundreds of thousands of Iraqi lives. Money was not a real constraint.
Tangentially, I suspect this sort of attitude (Iraq invasion notwithstanding) would naturally arise out of a definite optimism mindset (that essay by Dan Wang is incidentally a great read; his follow-up is more comprehensive and clearly argued, but I prefer the original for inspiration). It seems to me that Justin has this mindset as well, cf. his analogy to climate change in comparing economists' carbon taxes and cap-and-trade schemes vs progressive activists pushing for green tech investment to bend the cost curve. He concludes:
You don’t have to give up on cost-effectiveness or utilitarianism altogether to recognize that these frameworks led economists astray on PEPFAR — and probably some other topics too. Economists got PEPFAR wrong analytically, not emotionally, and continue to make the same analytical mistakes in numerous domains. Contrary to the tenets of the simple, static, comparative cost-effectiveness analysis, cost curves can sometimes be bent, some interventions scale more easily than others, and real-world evidence of feasibility and efficacy can sometimes render budget constraints extremely malleable. Over 20 years later, with $100 billion dollars appropriated under both Democratic and Republican administrations, and millions of lives saved, it’s hard to argue a different foreign aid program would’ve garnered more support, scaled so effectively, and done more good. It’s not that trade-offs don’t exist. We just got the counterfactual wrong.
Aside from his climate change example above, I'd be curious to know what other domains economists are making analytical mistakes in w.r.t. cost-benefit modeling, since I'm probably predisposed to making the same kinds of mistakes.
This WHO press release was a good reminder of the power of immunization – a new study forthcoming in The Lancet reports that (liberally quoting / paraphrasing the release)
global immunization efforts have saved an estimated 154 million lives over the past 50 years, 146 million of them children under 5 and 101 million of them infants
for each life saved through immunization, an average of 66 years of full health were gained – with a total of 10.2 billion full health years gained over the five decades
measles vaccination accounted for 60% of the lives saved due to immunization, and will likely remain the top contributor in the future
vaccination against 14 diseases has directly contributed to reducing infant deaths by 40% globally, and by more than 50% in the African Region
the 14 diseases: diphtheria, Haemophilus influenzae type B, hepatitis B, Japanese encephalitis, measles, meningitis A, pertussis, invasive pneumococcal disease, polio, rotavirus, rubella, tetanus, tuberculosis, and yellow fever
fewer than 5% of infants globally had access to routine immunization when the Expanded Programme on Immunization (EPI) was launched 50 years ago in 1974 by the World Health Assembly; today 84% of infants are protected with 3 doses of the vaccine against diphtheria, tetanus and pertussis (DTP) – the global marker for immunization coverage
there's still a lot to be done – for instance, 67 million children missed out on one or more vaccines during the pandemic years
(Attention conservation notice: rambling in public)
A striking throwaway remark, given its context:
There is remarkably little evidence that evidence-based medicine leads to better health outcomes for patients, though this is absence of (good) evidence rather than (good) evidence of absence of effect.
It's striking given that this comes from this book on Thailand’s Health Intervention and Technology Assessment Program (HITAP) (ch 1 pg 22), albeit perhaps understandable given the authors' stance that evidence is necessary but not sufficient to determine the best course of action (to treat a patient, to design a social insurance scheme, etc), which seems completely unobjectionable.
That said, I did wonder about the first half of the quoted throwaway remark, so I asked Elicit; its top-4 paper summary is
Evidence-based medicine (EBM) has been shown to improve patient outcomes and healthcare efficiency. A study in a Spanish hospital found that an EBP unit had lower mortality rates (6.27% vs 7.75%) and shorter lengths of stay (6.01 vs 8.46 days) compared to standard practice (Emparanza et al., 2015). EBM can reduce clinical uncertainty, leading to better patient outcomes, improved population health, and reduced costs (Molony & Samuels, 2012). The implementation of EBM is expected to enhance the quality of care as part of healthcare reform initiatives (Hughes, 2011). Additionally, EBM has paralleled the growth of patient empowerment, supporting informed decision-making by integrating the best available research with individual patient values and concerns (Hendler, 2004). While challenges remain in translating EBM principles for public consumption, its adoption has the potential to significantly improve healthcare delivery and patient outcomes.
although the summary didn't include these papers it listed in the top 10
Bahtsevani et al 2004's systematic review (weak evidence of limited findings)
Every-Palmer & Howick 2014's paper with these dramatic sentences in their abstract:
"In this paper we suggest that EBM's potential for improving patients' health care has been thwarted by bias in the choice of hypotheses tested, manipulation of study design and selective publication."
"Evidence for these flaws is clearest in industry-funded studies. We argue EBM's indiscriminate acceptance of industry-generated 'evidence' is akin to letting politicians count their own votes. Given that most intervention studies are industry funded, this is a serious problem for the overall evidence base. Clinical decisions based on such evidence are likely to be misinformed, with patients given less effective, harmful or more expensive treatments."
"More investment in independent research is urgently required. Independent bodies, informed democratically, need to set research priorities. We also propose that evidence rating schemes are formally modified so research with conflict of interest bias is explicitly downgraded in value."
Shaw et al 2007's dramatically-titled Why Evidence Based Medicine May Be Bad for You and Your Patients ("This review argues that the basis of EBM is so deeply flawed that in many cases it cannot usefully inform clinical practice, reflected in fact by the current majority outcome of most trials as “no-blood,” or no result")
With the proviso that I'm a layperson w.r.t. medicine and healthcare, and that I didn't ask Elicit further questions or really dig further into this at all — I find myself mostly unmoved by these papers & reviews, while the younger me of (say) a decade ago would've epistemically panicked. Partly it's that they aren't really contra "using evidence to inform medicine" per se: to oversimplify a bit, Bahtsevani et al recommend more evidence generation, Every-Palmer & Howick recommend less industry-biased evidence generation, and Shaw et al argue that other less legible-than-RCT types of evidence should occupy more mindshare than they did back in '07 (there's a loose parallel here to the more recent growth vs randomista debate in dev econ). Partly it's that I suspect there's some talking past each other, which only becomes clear when one digs into the nuts-and-bolts. Partly it's that I think the general underlying ethos of "using evidence to inform medicine" is a lot more robust than any particular instantiation of it (e.g. using only empirical data from systematic reviews of RCTs), sort of like how cluster thinking > sequence thinking for decision-making, or like how foxes have weak views strongly held (side note: in that essay's framing I used to be a hedgehog, hopefully I'm now more fox than degenerate cactus). Partly it's that I've "seen this before" with other topics, cf. Scott Alexander's many deep dives. Maybe I'm just getting old...
I haven't looked in detail, but my quick comment would be that these studies seem to basically be comparing extremely careful following of evidence-based medicine vs. "normal medical practise", which is like 90%+ based on evidence anyway. Standard medical training and registered medical practise in most of the world closely follows the evidence - it would be very difficult (maybe impossible) to practise "outside" of the evidence. So not finding a huge difference between these 2 ways of practising isn't so surprising.
Epistemic status: public attempt at self-deconfusion & not just stopping at knee-jerk skepticism
The recently published Cost-effectiveness of interventions for HIV/AIDS, malaria, syphilis, and tuberculosis in 128 countries: a meta-regression analysis (so recent it's listed as being published next month), in my understanding, aims to fill country-specific gaps in CEAs for all interventions in all countries for HIV/AIDS, malaria, syphilis, and tuberculosis, to help national decision-makers allocate resources effectively – to a first approximation I think of it as "like the DCP3 but at country granularity and for Global Fund-focused programs". They do this by predicting ICERs, IQRs, and 95% UIs in US$/DALY using the meta-regression parameters obtained from analysing ICERs published for these interventions (more here).
AFAICT their methodology and execution seem superb, so I was keen to see their results:
Antenatal syphilis screening ranks as the lowest median ICER in 81 (63%) of 128 countries, with median ICERs ranging from $3 (IQR 2–4) per DALY averted in Equatorial Guinea to $3473 (2244–5222) in Ukraine.
At risk of being overly skeptical: $3 per DALY averted is >30x better than Open Phil's 1,000x bar of $100 per DALY, which is roughly GiveWell top charity level and which OP has said is hard to beat, especially for a direct intervention like antenatal syphilis screening. It makes me wonder how much credence to put in the study's findings for actual resource allocation decisions (esp. Figure 4, which ranks top interventions at country granularity). Also:
Specifically re: antenatal syphilis screening, CE/AIM's report on screening + treating antenatal syphilis estimates $81 per DALY; I'm hard-pressed to believe that removing treatment improves cost-eff >1 OOM
I'm reminded of the time GW found 5 separate spreadsheet errors in a DCP2 estimate of soil-transmitted-helminth (STH) treatment that together misleadingly 'improved' its cost-effectiveness ~100-fold from $326.43 per DALY (correct output) to just $3.41 (wrong, and coincidentally in the ballpark of the estimate above that triggered my skepticism)
So how should I think about and use their findings given what seems like reasonable grounds for skepticism, if I'm primarily interested in helping decision-makers help people better? Scattered thoughts to defend the study / push back on my nitpicking above:
even if imperfect – and I'm not confident in my skepticism above – they clearly improve substantially upon the previous state of affairs (CEA gaps everywhere at country-disease-intervention level granularity; expert opinion not lending itself to country-specific predictions; case-by-case methods often being unsuccessful)
their recommendations seem reasonably hedged, not naively maximalist: they include 95% uncertainty intervals; they clearly say "cost-effectiveness... should not be the only criterion... [consider also] enhancing equity and providing financial risk protection"
even a naively maximalist recommendation ("first fund lowest-ICER intervention, then 2nd-lowest, ... until funds run out") doesn't seem unreasonable in this context – essentially countries would end up funding more antenatal syphilis screening, intermittent preventive treatment of malaria in pregnant women and infants, and chemotherapy for drug-susceptible TB (just from eyeballing Figure 4); see the sketch of this allocation rule after this list
I interpret what they're trying to do as not so much "here are the ICER league tables, use them", but shifting decision-makers' approach to resource allocation from needing a single threshold for all healthcare funding decisions to (quoting them) "ICERs ranked in country-specific league tables", and in the long run this perspective shift seems useful to "bake into" decision-making processes, even if the specific figures in this specific study aren't necessarily the most accurate and shouldn't be taken at face value
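For concreteness, here's a minimal sketch of that naively maximalist league-table rule. The intervention names mirror the ones above, but the ICERs and program costs are made-up placeholders for illustration, not figures from the study:

```python
# League-table allocation: fund interventions in ascending ICER order
# until the budget runs out. All numbers below are hypothetical.
interventions = [
    # (name, ICER in $ per DALY averted, cost to fully fund in $)
    ("antenatal syphilis screening", 3, 2_000_000),
    ("IPT of malaria in pregnancy/infancy", 10, 5_000_000),
    ("chemotherapy for drug-susceptible TB", 25, 20_000_000),
]

def allocate(budget, interventions):
    funded = []
    for name, icer, cost in sorted(interventions, key=lambda x: x[1]):
        spend = min(cost, budget)
        if spend <= 0:
            break
        funded.append((name, spend, spend / icer))  # spend/ICER ~ DALYs averted
        budget -= spend
    return funded

for name, spend, dalys in allocate(10_000_000, interventions):
    print(f"{name}: ${spend:,} -> ~{dalys:,.0f} DALYs averted")
```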
That said, I do wonder if the authors could have done a bit better, like
cautioning against naively taking the best cost-eff estimates at face value, instead of suggesting "Funds could be first spent on the intervention that has the lowest ICER. Following that, other interventions could be funded in order of their ICER rankings, as long as there are available funds"
spot-checking some of (not all) the top cost-eff ICERs that went into their meta-regression analysis to get a sense of their credibility, especially those which feed into their main recommendations, like GW did above with the DCP2 estimate for STH treatment
extracting qualitative proxies for decision-maker guidance from an analysis of the main drivers behind the substantial ranking differences in intervention ICERs across economic and epidemiological contexts (eg "we should expect antenatal syphilis screening to be substantially less cost-effective in our context due to factors XYZ, let's look at other interventions instead" – what would a short useful list of XYZ look like?), instead of just saying "we found the rankings differ substantially"
The positive spin is that someone got funded to do this kind of big-picture analysis and got it published in The Lancet.
There were 1,792 potential country-intervention pairs (although it is not immediately clear if they did all 1,792 pairs). So I don't think most reasonable readers would view these findings as substitutes for a more in-depth, country-specific analysis on the potentially promising intervention. They did publish at least some data for each intervention, although maybe it isn't enough to poke at each of the country-intervention pairs.
One of the more surprising things I learned from Karen Levy's 80K podcast interview on misaligned incentives in global development was how her experience directly contradicted a stereotype I had about for-profits vs nonprofits:
Karen Levy: When I did Y Combinator, I expected it to be a really competitive environment: here you are in the private sector and it’s all about competition. And I was blown away by the level of collaboration that existed in that community — and frankly, in comparison to the nonprofit world, which can be competitive. People compete for funding, and so very often we’re fighting over slices of the same pie. Whereas the Y Combinator model is like, “We’re making the pie bigger. It’s getting bigger for everybody.”
My assumption had been that the opposite was true.
The following table is from Scott Alexander's post, which you should check out for the sources and (many, many) caveats.
This table can’t tell you what your ethical duties are. I'm concerned it will make some people feel like whatever they do is just a drop in the bucket - all you have to do is spend 11,000 hours without air conditioning, and you'll have saved the same amount of carbon an F-35 burns on one airstrike! But I think the most important thing it could convince you of is that if you were previously planning on letting yourself be miserable to save carbon, you should buy carbon offsets instead. Instead of boiling yourself alive all summer, spend between $0.04 and $2.50 an hour to offset your air conditioning use.
I'm curious what people who're more familiar with infinite ethics think of Manheim & Sandberg's What is the upper limit of value?, in particular where they discuss infinite ethics (emphasis mine):
Bostrom’s discussion of infinite ethics is premised on the moral relevance of physically inaccessible value. That is, it assumes that aggregative utilitarianism is over the full universe, rather than the accessible universe. This requires certain assumptions about the universe, as well as being premised on a variant of the incomparability argument that we dismissed above, but has an additional response which is possible, presaged earlier. Namely, we can argue that this does not pose a problem for ethical decision-making even using aggregative ethics, because the consequences of any ethical decision can have only a finite (difference in) value. This is because the value of a moral decision relates only to the impact of that decision. Anything outside of the influenced universe is not affected, and the arguments above show that the difference any decision makes is finite.
I first read their paper a few years ago and found their arguments for the finiteness of value persuasive, as well as their collectively-exhaustive responses in section 4 to possible objections. So ever since then I've been admittedly confused by claims that the problems of infinite ethics still warrant concern w.r.t. ethical decision-making (e.g. I don't really buy Joe Carlsmith's arguments for acknowledging that infinities matter in this context, same for Toby Ord's discussion in a recent 80K podcast). What am I missing?
Rob Wiblin: OK, so the argument is something like valuing is a process that requires information to be encoded, and information to be processed — and there are just maximum limits on how much information can be encoded and processed given a particular amount of mass and given a finite amount of mass and energy. So that ultimately is going to set the limit on how much valuing can be done physically in our universe. No matter what things we create, no matter what minds we generate, there’s going to be some finite limit there. That’s basically it?
Anders Sandberg: That’s it. In some sense, this is kind of trivial. I think some readers would no doubt feel almost cheated, because they wanted to know that metaphysical limit for value, and we can’t say anything about that. But it seems very likely that if value has to do with some entity that is doing the valuing, then there is always going to be this limit — especially since the universe is inconveniently organised in such a way that we can’t get hold of infinite computational power, as far as we know.
Pay close attention to ideas that repel other people for non-impact-related reasons, but not you. If you can get obsessed about something important that most people find horribly boring, you're uniquely well placed to make a big impact.
Unfortunately it's bereft of concrete examples. The closest to a shortlist he shares is in this comment:
Horrible career moves e.g. investigating the corrupt practices of powerful EAs / Orgs
Boring to most people e.g. compiling lists and data
Low status outside EA e.g. welfare of animals nobody cares about (e.g. shrimp)
Low status within EA e.g. global mental health
Living in relatively low quality of living areas e.g. fieldwork in many African countries
(I disagree with some of these; e.g. the first bullet seems contradicted by the propensity for forum drama on adjacent topics, and as someone who likes compiling lists and data I don't actually see much low-hanging fruit for me to contribute here due to the work of e.g. Hamish)
I'd be keen to learn other examples. He does give this advice to brainstorm examples:
What work do you wish someone else would do?
although in my case it's not useful because I either just end up doing it (or trying, failing, and learning why), or discover that it's already been done better than I could (e.g. Rethink Priorities' new CCM).
That said, I still think the original takeaway is a useful reminder.
[Question] How should we think about the decision relevance of models estimating p(doom)?
(Epistemic status: confused & dissatisfied by what I've seen published, but haven't spent more than a few hours looking. Question motivated by Open Philanthropy's AI Worldviews Contest; this comment thread asking how OP updated reminded me of my dissatisfaction. I've asked this before on LW but got no response; curious to retry, hence repost)
To illustrate what I mean, switching from p(doom) to timelines:
The recent post AGI Timelines in Governance: Different Strategies for Different Timeframes was useful to me in pushing back against Miles Brundage's argument that "timeline discourse might be overrated", by showing how choice of actions (in particular in the AI governance context) really does depend on whether we think that AGI will be developed in ~5-10 years or after that.
A separate takeaway of mine is that decision-relevant estimation "granularity" need not be that fine-grained, and in fact is not relevant beyond simply "before or after ~2030" (again in the AI governance context).
Finally, that post was useful to me in simply concretely specifying which actions are influenced by timelines estimates.
Question: Is there something like this for p(doom) estimates? More specifically, following the above points as pushback against the strawman(?) that "p(doom) discourse, including rigorous modeling of it, is overrated":
What concrete high-level actions do most alignment researchers agree are influenced by p(doom) estimates, and would benefit from more rigorous modeling (vs just best guesses, even by top researchers e.g. Paul Christiano's views)?
What's the right level of granularity for estimating p(doom) from a decision-relevant perspective? Is it just a single bit ("below or above some threshold X%") like estimating timelines for AI governance strategy, or OOM (e.g. 0.1% vs 1% vs 10% vs >50%), or something else?
I suppose the easy answer is "the granularity depends on who's deciding, what decisions need making, in what contexts", but I'm in the dark as to concrete examples of those parameters (granularity i.e. thresholds, contexts, key actors, decisions)
e.g. reading Joe Carlsmith's personal update from ~5% to >10% I'm unsure if this changes his recommendations at all, or even his conclusion – he writes that "my main point here, though, isn't the specific numbers... [but rather that] there is a disturbingly substantive risk that we (or our children) live to see humanity as a whole permanently and involuntarily disempowered by AI systems we’ve lost control over", which would've been true for both 5% and 10%
Or is this whole line of questioning simply misguided or irrelevant?
Some writings I've seen gesturing in this direction:
Carl Shulman disagrees, but his comment (while answering my 1st bullet point) isn't clear in the way the different AI gov strategies for different timelines post is, so I'm still left in the dark – to (simplistically) illustrate with a randomly-chosen example from his reply and making up numbers, I'm looking for statements like "p(doom) < 2% implies we should race for AGI with less concern about catastrophic unintended AI action, p(doom) > 10% implies we definitely shouldn't, and p(doom) between 2-10% implies reserving this option for last-ditch attempts", which he doesn't provide
Froolow's attempted dissolution of AI risk (which takes Joe Carlsmith's model and adds parameter uncertainty – inspired by Sandberg et al's Dissolving the Fermi paradox – to argue that low-risk worlds are more likely than non-systematised intuition alone would suggest)
Froolow's modeling is useful to me for making concrete recommendations for funders, e.g. (1) "prepare at least 2 strategies for the possibility that we live in one of a high-risk or low-risk world instead of preparing for a middling-ish risk", (2) "devote significantly more resources to identifying whether we live in a high-risk or low-risk world", (3) "reallocate resources away from macro-level questions like 'What is the overall risk of AI catastrophe?' towards AI risk microdynamics like 'What is the probability that humanity could stop an AI with access to nontrivial resources from taking over the world?'", (4) "When funding outreach / explanations of AI Risk, it seems likely it would be more convincing to focus on why this step would be hard than to focus on e.g. the probability that AI will be invented this century (which mostly Non-Experts don’t disagree with)". I haven't really seen any other p(doom) model do this, which I find confusing
I'm encouraged by the long-term vision of the MTAIR project "to convert our hypothesis map into a quantitative model that can be used to calculate decision-relevant probability estimates", so I suppose another easy answer to my question is just "wait for MTAIR", but I'm wondering if there's a more useful answer to the "current SOTA" than this. (That project's introduction post illustrates a notional version of how MTAIR can help with decision analysis.)
This question was mainly motivated by my attempt to figure out what to make of people's widely-varying p(doom) estimates, e.g. in the appendix section of Apart Research's website, beyond simply "there is no consensus on p(doom)". I suppose one can argue that rigorous p(doom) modeling helps reduce disagreement on intuition-driven estimates by clarifying cruxes or deconfusing concepts, thereby improving confidence and coordination on what to do, but in practice I'm unsure if this is the case (reading e.g. the public discussion around the p(doom) modeling by Carlsmith, Froolow, etc), so I'm not sure I buy this argument, hence my asking for concrete examples.
I just learned about Tom Frieden via Vadim Albinsky's writeup Resolve to Save Lives Trans Fat Program for Founders Pledge. His impact in sheer lives saved is astounding, and I'm embarrassed I didn't know about him before:
The CEO of RTSL, Tom Frieden, likely prevented tens of millions of deaths by creating an international tobacco control initiative in a prior role that may have been much more cost effective than most of our top recommended charities. ...
We believe that by leveraging his influence with governments, and the relatively low cost of advocating for regulations to improve health, Tom Frieden has the potential to again save a vast number of lives at a low cost.
How many more? Albinsky estimates:
RTSL is aiming to save 94 million lives over 25 years by advocating for countries to implement policies to reduce non-communicable diseases. We believe the industrially-produced trans fat elimination program is the most cost-effective of their initiatives. ... Even after very conservative discounts to RTSL’s impact projections we estimate this program to be more cost effective than most of our top global health and development recommendations.
Tangentially, if a "Borlaug" is a billion lives saved, then Frieden's impact is probably on the scale of ~100 milliBorlaugs (to nearest OOM). Bill and Melinda likely have had similar impact. This makes me wonder who else I don't know about who's done ~100 milliBorlaugs of good.
(It's arguably unfair to wholly attribute all those lives saved to Frieden, and I am honestly unsure what credit attribution makes most sense, but applying the same logic to Borlaug you can no longer really say he saved a billion lives.)
The 1,000-ton rule is Richard Parncutt's suggestion for reframing the political message of the severity of global warming in particularly vivid human rights terms; it says that someone in the next century or two is prematurely killed every time humanity burns 1,000 tons of carbon.
I came across this paper while (in the spirit of Nuno's suggestion) trying to figure out the 'moral cost of climate change', so to speak, driven by my annoyance that e.g. climate charity BOTECs report $ per ton of CO2-eq averted, in contrast to (say) the $ per death averted bottom line of GHW charities, since I don't intrinsically care about averting CO2-equivalent emissions the way I care about averting deaths. (To be clear, I understand why the BOTECs do so and would do the same for work; this is for my own moral clarity.)
Parncutt's derivation is simple: burning a trillion tons of carbon will cause ~2 °C of anthropogenic global warming, which will in turn cause 1-10 million premature deaths a year "for a period of several centuries".
Modelling the rise in global mean surface temperature (GMST) as a function of carbon burned is already very hard; Parncutt doesn't try to model premature deaths as a function of GMST but just makes a semi-quantitative order-of-magnitude estimation anchored extensively at the lower and upper ends to various catastrophic outcomes discussed in the literature on climate change, and assumes a lognormal distribution around a billion future deaths with a 10x range for worst-vs-best case scenario, which over time looks 'very approximately' like this:
The lower line represents deaths due to poverty without AGW. As the negative effect of AGW overtakes the positive effect of development, the death rate will increase, as shown by the upper line. In a more accurate model, the upper line might be concave upward on the left (exponential increase) and concave downward on the right (approaching a peak).
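The headline arithmetic of the rule, restated (my own sketch of Parncutt's numbers, not code from the paper):

```python
# Parncutt anchors on ~a billion future premature deaths from burning
# ~a trillion tons of carbon (1-10 million deaths/year over several
# centuries, per the paper's order-of-magnitude estimate).
TONS_CARBON_BURNED = 1e12
FUTURE_PREMATURE_DEATHS = 1e9

tons_per_death = TONS_CARBON_BURNED / FUTURE_PREMATURE_DEATHS
print(tons_per_death)  # 1000.0 -> one premature death per ~1,000 tons of carbon
```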
Based on the 1,000-ton rule, Pearce & Parncutt suggest the 'millilife' as "an accessible unit of measure for carbon footprints that is easy to understand and may be used to set energy policy to help accelerate carbon emissions reductions". A millilife is a measure of intrinsic value defined to be 1/1000th of a human life; the 1,000-ton rule says that burning a ton of fossil carbon destroys a millilife. This lets Pearce & Parncutt make statements like these, at an individual level (all emphasis mine):
For example in Canada, which has some of the highest yearly carbon emissions per capita in the world at around 19 tons of CO2 or 5 tons of carbon per person, roughly 5 millilives are sacrificed by an average person each year. As the average Canadian lives to be about 80, he/she sacrifices about 400 millilives (0.4 human lives) in the course of his/her lifetime, in exchange for a carbon-intensive lifestyle
and
... an average future AGW-victim in a developing country will lose half of a lifetime or 30–40 life-years, as most victims will be either very young or very old. If the average climate victim loses 35 life-years (or 13,000 life-days), a millilife corresponds to 13 days.
Stated in another way: if a person is responsible for burning a ton of fossil carbon by flying to another continent and back, they effectively steal 13 days from the life of a future poor person living in the developing world. If the traveler takes 1000 such trips, they are responsible for the death of a future person.
and for "large-scale energy decisions":
... the Adani Carmichael coalmine in Queensland, Australia, is currently under construction and producing coal since 2021. Despite massive protests over several years, it will be the biggest coalmine ever. Its reserves are up to 4 billion tons of coal, or 3 billion tons of carbon. If all of that was burned, the 1000-tonne rule says it would cause the premature deaths of 3 million future people. Given that the 1000-tonne rule is only an order-of-magnitude estimate, the number of caused deaths will lie between one million and 10 million. ... Many of those who will die are already living as children in the Global South; burning Carmichael coal will cause their future deaths with a high probability. Should energy policy allow that to occur?
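Putting the quoted conversions in one place (a sketch that just rearranges Pearce & Parncutt's own figures from the passages above):

```python
# 1 millilife = 1/1000th of a life; by the 1,000-ton rule, burning
# 1 ton of fossil carbon destroys 1 millilife.
canada_tons_carbon_per_year = 5           # ~19 t CO2 = ~5 t carbon per person-year
print(canada_tons_carbon_per_year * 80)   # 400 millilives over an 80-year life

life_years_lost_per_victim = 35           # average future AGW victim, per the quote
print(life_years_lost_per_victim * 365 / 1000)  # ~12.8 days per millilife (~13 in the paper)

carmichael_tons_carbon = 3e9              # ~4 Gt of coal = ~3 Gt of carbon
print(carmichael_tons_carbon / 1000)      # 3,000,000 premature deaths if all burned
```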
Pearce & Parncutt then use the 1,000-ton rule and millilife to make various suggestions. Here's one:
Under what circumstances might a government ban or outlaw an entire corporation or industry, considered a legal entity or person—for example, the entire global coal industry? ...
Ideally, a company should not cause any human deaths at all. If it does, those deaths should be justifiable in terms of improvements to the quality of life of others. For example, a company that builds a bridge might reasonably risk a future collapse that would kill 100 people with a probability of 1%. In that case, the company accepts that on average one future person will be killed as a result of the construction of the bridge. It may be reasonable to claim that the improved quality of life for thousands or millions of people who cross the bridge justifies the human cost.
Fossil fuel industries are causing far more future deaths than that, raising the question of the point at which the law should intervene. As a first step to solving this problem, it has been proposed that a rather high threshold (generous toward the corporations) is appropriate. A company does not have the right to exist if its net impact on human life (e.g., a company/industry might make products that save lives, like medicine, but kill a small fraction of users) is such that it kills more people than it employs. This requirement for a company’s existence is thus:
Number of future premature deaths/year < Number of full-time employees (1)
This criterion can be applied to an entire industry. If the industry kills more people than it employs, then primary rights (life) are being sacrificed for secondary rights (jobs or profits) and the net benefit to humankind is negative. If an industry is not able to satisfy Equation (1), it should be closed down by the government.
... the coal industry kills people by polluting the air that they breathe. ... In the U.S., about 52,000 human lives are sacrificed per year to provide coal-fired electricity. ... In the U.S., coal employed 51,795 people in 2016. Since the number of people killed is greater than the number employed, the U.S. coal industry does not satisfy Equation (1) and should be closed down. This conservative conclusion does not include future deaths caused by climate change due to burning coal.
One more energy policy suggestion (there's many more in the paper):
Applying asset forfeiture laws (also referred to as asset seizure) to manslaughter caused by AGW. These laws enable the confiscation of assets by the U.S. government as a type of criminal-justice financial obligation that applies to the proceeds of crime. Essentially, if criminals profit from the results of unlawful activity, the profits (assets) are confiscated by the authorities.
This is not only a law in the U.S. but is in place throughout the world. For example, in Canada, Part XII.2 of the Criminal Code, provides a national forfeiture régime for property arising from the commission of indictable offenses. Similarly, ‘Son of Sam laws’ could also apply to carbon emissions. In the U.S., Son of Sam laws refer to laws designed to keep criminals from profiting from the notoriety of their crimes and often authorize the state to seize funds earned by the criminals to be used to compensate the criminal’s victims.
If that logic of asset forfeiture is applied to fossil fuel company investors who profit from carbon-emission-related manslaughter, taxes could be set on fossil fuel profits, dividends, and capital gains at 100% and the resultant tax revenue could be used for energy efficiency and renewable energy projects or to help shield the poor from the most severe impacts of AGW. ...
Such AGW-focused asset forfeiture laws would also apply to fossil fuel company executive compensation packages. Energy policy research has shown that it is possible to align energy executive compensation with careful calibration of incentive equations such that the harmful effects of emissions can be prevented through incentive pay. Executives who were compensated without these safeguards in place would have their incomes seized the same as other criminals benefiting materially from manslaughter.
I have no (defensible) opinion on these suggestions; curious to know what anyone thinks.
Some notes from trying out Rethink Priorities' new cross-cause cost-effectiveness model (CCM) from their post, for personal reference. Each entry below gives: cost-effectiveness in DALYs per $1k (90% CI) / % of simulation results with positive, negative, or no effect / cost-effectiveness under alternative risk-aversion profiles and weighting schemes, in weighted DALYs per $1k (min to max):
Portfolio of biorisk projects ($15-30M budget, 60% chance of no effect, 70% of effects positive): 132 (middle 99.9% of expected utility (EU) is 0) / >99.9% no effect / 0 to 132 across risk weightings
Nanotech safety megaproject ($10-30M budget, 90% chance of no effect, 70% of effects positive): 73 (middle 99.9% of EU is 0) / >99.9% no effect / -10 to 73 across risk weightings
AI misalignment megaproject ($8-28B budget, 97.3% chance of no effect, 70% of effects positive): 154 (middle 99.9% of EU is 27, middle 99% is 0) / >99.6% no effect / -56 to 154 across risk weightings
Some things that jumped out at me (caveating that I don't work in any of these areas):
I'm a little surprised that only chicken campaigns are modeled as clearly higher-EV (OOM-wise) than the global health and development (GHD) interventions considered good by GiveWell's and Open Philanthropy's lights, while interventions for other nonhuman animals fall short
I'm also surprised that chickens > all other nonhuman animals on both EV and p(+ve simulation outcome). There's some discussion indicating that cage-free work is much lower EV now than previously, although I'm not sure it changes the takeaway (and in any case funding prioritization shouldn't be purely EV-based)
I'm surprised yet again that a >$10B AI misalignment megaproject is modeled as having no effect in >99.6% of simulations. I probably hadn't internalized the 'hits' in 'hits-based giving' as well as I should have, since my earlier gut intuition (based on no data whatsoever) was that a near-Manhattan-scale megaproject would surely have some effect in >10% of possible worlds; the sketch after this list shows how a mostly-null outcome distribution can still carry a high EV
I didn't expect the model to say chickens > misaligned AI, unsafe nanotech, and biorisk from a risk-neutral EV perspective. That said, the x-risk inputs are in some sense just placeholders, so I don't put much weight on this
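As a check on my intuition in the third bullet above, here's a minimal Monte Carlo sketch in Python. Only the 97.3% / 70% split comes from the CCM row above; the conditional effect sizes are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000  # simulated worlds

# Only the 97.3% / 70% split comes from the CCM row; the conditional
# effect sizes below are made-up numbers purely for illustration.
p_effect = 0.027                 # ~2.7% of worlds see any effect at all
p_positive_given_effect = 0.70   # of those, 70% are positive
positive_size = 8_000            # hypothetical DALYs averted per $1k on a hit
negative_size = -2_000           # hypothetical DALYs per $1k on a backfire

has_effect = rng.random(n) < p_effect
is_positive = rng.random(n) < p_positive_given_effect
outcomes = np.where(has_effect,
                    np.where(is_positive, positive_size, negative_size),
                    0.0)

print(f"share of worlds with no effect: {(~has_effect).mean():.1%}")  # ~97.3%
print(f"risk-neutral EV: {outcomes.mean():.0f} DALYs per $1k")        # ~135
```

With these made-up numbers the risk-neutral EV works out to roughly 0.027 × (0.7 × 8,000 + 0.3 × (−2,000)) ≈ 135 DALYs per $1k, even though ~97% of worlds see nothing: the rare 'hits' do all the work.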
In any case, I'd be curious to see how the CCM is taken into consideration by funders and other stakeholders going forward.
I thought I had mostly internalized the heavy-tailed worldview from a life-guiding perspective, but reading Ben Kuhn's searching for outliers made me realize I hadn't. So here are some summarized reminders for posterity:
Key idea: lots of important things in life are generated by multiplicative processes, resulting in heavy-tailed distributions – jobs, employees / colleagues, ideas, romantic relationships, success in business / investing / philanthropy, how useful it is to try new activities
Decision relevance to living better, i.e. what Ben thinks I should do differently:
Getting lots of samples improves outcomes a lot, so draw as many samples as possible (see the sketch after this list)
Trust the process and push through the demotivation of super-high failure rates (instead of taking them as evidence that the process is bad)
But don't just trust any process; it must have 2 parts: (1) a good way to tell if a candidate is an outlier ("maybe amazing" below), and (2) a good way to draw samples
Optimize less, draw samples more (for a certain type of person)
Filter for "ruling in" candidates, not "ruling out" (e.g. in dating)
Cultivate an abundance mindset to help reject more candidates early on (to find 99.9th percentile not just 90th)
Think ahead about what outliers look like (e.g. by asking others about their experience), to avoid accidentally rejecting 99.9th-percentile candidates out of miscalibration
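A minimal sketch (in Python, all parameters invented) of why sample count matters so much under heavy tails: if quality is the product of many independent multipliers, the distribution is approximately lognormal, and the best draw keeps improving as you take more samples:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model of a multiplicative process (all parameters invented):
# a candidate's "quality" is the product of many independent multipliers,
# which gives an approximately lognormal, heavy-tailed distribution.
def sample_quality(n_candidates: int, n_factors: int = 10) -> np.ndarray:
    factors = rng.lognormal(mean=0.0, sigma=0.5, size=(n_candidates, n_factors))
    return factors.prod(axis=1)

# Under heavy tails, best-of-n grows substantially with n:
for n in (10, 100, 1_000, 10_000):
    print(f"best of {n:>6} samples: {sample_quality(n).max():8.1f}")
```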
My reservations with Ben's advice, despite thinking it's mostly sound and idea-generating:
"Stick with the process through super-high failure rates instead of taking them as evidence that the process is bad" feels uncomfortably close to protecting a belief from falsification
Filtering for "maybe amazing", not "probably good" makes me uncomfortable because I'm not risk-neutral (e.g. in RP's CCM I'm probably closest to "difference-making risk-weighted expected utility = low to moderate risk aversion", which for instance assesses RP's default AI risk misalignment megaproject as resulting in, not averting, 300+ DALYs per $1k)
Unlike Ben, I'm a relatively young person in a middle-income country, and the abundance mindset feels privileged (i.e. not as much runway to try and fail)
So maybe a precursor / enabling activity for the "sample more" approach above is "more runway-building": money, leisure time, free attention & health, proximity to opportunities(?)
From Richard Y Chappell's post Theory-Driven Applied Ethics, answering "what is there for the applied ethicist to do, that could be philosophically interesting?", emphasis mine:
A better option may be to appeal to mid-level principles likely to be shared by a wide range of moral theories. Indeed, I think much of the best work in applied ethics can be understood along these lines. The mid-level principles may be supported by vivid thought experiments (e.g. Thomson’s violinist, or Singer’s pond), but these hypothetical scenarios are taken to be practically illuminating precisely because they support mid-level principles (supporting bodily autonomy, or duties of beneficence) that we can then apply generally, including to real-life cases.
The feasibility of this principled approach to applied ethics creates an opening for a valuable (non-trivial) form of theory-driven applied ethics. Indeed, I think Singer’s famous argument is a perfect example of this. For while Singer in no way assumes utilitarianism in his famous argument for duties of beneficence, I don’t think it’s a coincidence that the originator of this argument was a utilitarian. Different moral theories shape our moral perspectives in ways that make different factors more or less salient to us. (Beneficence is much more central to utilitarianism, even if other theories ought to be on board with it too.)
So one fruitful way to do theory-driven applied ethics is to think about what important moral insights tend to be overlooked by conventional morality. That was basically my approach to pandemic ethics: to those who think along broadly utilitarian lines, it’s predictable that people are going to be way too reluctant to approve superficially “risky” actions (like variolation or challenge trials) even when inaction would be riskier. And when these interventions are entirely voluntary—and the alternative of exposure to greater status quo risks is not—you can construct powerful theory-neutral arguments in their favour. These arguments don’t need to assume utilitarianism. Still, it’s not a coincidence that a utilitarian would notice the problem and come up with such arguments.
Another form of theory-driven applied ethics is to just do normative ethics directed at confused applied ethicists. For example, it’s commonplace for people to object that medical resource allocation that seeks to maximize quality-adjusted life years (QALYs) is “objectionably discriminatory” against the elderly and disabled, as a matter of principle. But, as I argue in my paper, Against 'Saving Lives': Equal Concern and Differential Impact, this objection is deeply confused. There is nothing “objectionably discriminatory” about preferring to bestow 50 extra life-years to one person over a mere 5 life-years to another. The former is a vastly greater benefit, and if we are to count everyone equally, we should always prefer greater benefits over lesser ones. It’s in fact the opposing view, which treats all life-saving interventions as equal, which fails to give equal weight to the interests of those who have so much more at stake.
Two asides:
This seems broadly correct (at least for someone who shares my biases); e.g. even in pure math John von Neumann warned:
As a mathematical discipline travels far from its empirical source, or still more, if it is a second and third generation only indirectly inspired by ideas coming from "reality" it is beset with very grave dangers. It becomes more and more purely aestheticizing, more and more purely l'art pour l'art. This need not be bad, if the field is surrounded by correlated subjects, which still have closer empirical connections, or if the discipline is under the influence of men with an exceptionally well-developed taste. But there is a grave danger that the subject will develop along the line of least resistance, that the stream, so far from its source, will separate into a multitude of insignificant branches, and that the discipline will become a disorganized mass of details and complexities. In other words, at a great distance from its empirical source, or after much "abstract" inbreeding, a mathematical subject is in danger of degeneration. ... In any event, whenever this stage is reached, the only remedy seems to me to be the rejuvenating return to the source: the re-injection of more or less directly empirical ideas.
This makes me wonder if it would be fruitful to look at & somehow incorporate mid-level principles into decision-relevant cost-effectiveness analyses that attempt to incorporate moral uncertainty, e.g. HLI's app or Rethink's CCM. (This is not at all a fleshed-out thought, to be clear)
Michael Dickens' 2016 post Evaluation Frameworks (or: When Importance / Neglectedness / Tractability Doesn't Apply) makes the following point I think is useful to keep in mind as a corrective:
INT has its uses, but I believe many people over-apply it.
Generally speaking (with some exceptions), people don’t choose between causes, they choose between interventions. That is, they don’t prioritize broad focus areas like global poverty or immigration reform. Instead, they choose to support specific interventions such as distributing deworming treatments or lobbying to pass an immigration bill. The INT framework doesn’t apply to interventions as well as it does to causes. In short, cause areas correspond to problems, and interventions correspond to solutions; INT assesses problems, not solutions.
(aside: Michael Plant makes the same point in chapters 5 & 6 of his PhD thesis as per Edo Arad's post, using it as a starting point to develop a systematic cause prio approach he called 'cause mapping')
In most cases, we can try to directly assess the true marginal impact of investing in an intervention. These assessments will never be perfectly accurate, but they generally seem to tell us more than INT does. ...
How can we estimate an intervention’s impact more directly? To develop a better framework, let’s start with the final result we want and work backward to see how to get it.
Dickens' post has more; the framework they end up with is this:
which (somewhat less practically, they note) could be fine-grained further:
I also appreciated that Dickens actually used this framework to guide their giving decision (more details in their post).
List of charities providing humanitarian assistance in the Israel-Hamas war mentioned in response to this request, for posterity and ease of reference:
Just came across Max Dalton's 2014 writeup Estimating the cost-effectiveness of research into neglected diseases, part of Owen Cotton-Barratt's project on estimating cost-effectiveness of research and similar activities. Some things that stood out to me:
~100x 95% CI range (mostly from estimates of total current funding to date, and difficulty of continuing with research), so the figures below can't really argue for a change in priorities so much as compel further research
This uncertainty is a lower bound, including only statistical uncertainty and not model uncertainty
Differing returns to research are largely driven by disease burden size, so look at diarrheal diseases, malaria, hookworm, ascariasis, trichuriasis, lymphatic filariasis, meningitis, typhoid, and salmonella – i.e. nothing too surprising
Estimated figures:
13.9 DALYs/$1k for the sector as a whole (vs ~20 DALYs/$1k for GWWC top charities back in 2014), 95% CI 1.43-130 DALYs/$1k
Median estimates: diarrheal diseases (e.g. cholera and dysentery) 121 DALYs/$1k, salmonella infections 74 DALYs/$1k, worms ~50 DALYs/$1k, leprosy 0.058 DALYs/$1k
Most of the top diseases have ~100x 95% CI range, except salmonella whose range is ~3,000x(!)
The following is a collection of long quotes from Ozy Brennan's post On John Woolman (which I stumbled upon via Aaron Gertler) that spoke to me. Woolman was clearly what David Chapman would call mission-oriented with respect to meaning of and purpose in life; Chapman argues instead for what he calls "enjoyable usefulness", which is I think healthier in ~every way ... it just doesn't resonate. All bolded text is my own emphasis, not Ozy's.
As a child, Woolman experienced a moment of moral awakening: ... [anecdote]
This anecdote epitomizes the two driving forces of John Woolman’s personality: deep compassion and the refusal to ever cut himself a moment of slack. You might say “it was just a bird”; you might say “come on, Woolman, what were you? Ten?” Woolman never thought like that. It was wrong to kill; he had killed; that was all there was to say about it.
When Woolman was a teenager, the general feeling among Quakers was that they were soft, self-indulgent, not like the strong and courageous Quakers of previous generations, unlikely to run off to Massachusetts to preach the Word if the Puritans decided once again to torture Quakers for their beliefs, etc. Woolman interpreted this literally. He spent his teenage years being like “I am depraved, I am evil, I have not once provoked anyone into whipping me to death, I don’t even want to be whipped to death.”
As a teenager, Woolman fell in with a bad crowd and committed some sins. What kind of sins? I don’t know. Sins. He's not telling us:
“I hastened toward destruction,” he writes. “While I meditate on the gulf toward which I travelled … I weep; mine eye runneth down with water.”
In actuality, Woolman’s corrupting friends were all... Quakers who happened to be somewhat less strict than he was. We have his friends' diaries and none of them remarked on any particular sins committed in this period. Biographers have speculated that Woolman was part of a book group and perhaps the great sin he was reproaching himself for was reading nonreligious books. He may also have been reproaching himself for swimming, skating, riding in sleighs, or drinking tea.
Woolman is so batshit about his teenage wrongdoing that many readers have speculated about the existence of different, non-Quaker friends who were doing all the sins. However, we have no historical evidence of him having other friends, and we have a fuckton of historical evidence of Woolman being extremely hard on himself about minor failings (or “failings”).
Most people who are Like That as teenagers grow out of it. Woolman didn’t. He once said something dumb in Weekly Meeting and then spent three weeks in a severe depression about it. He never listened to nonreligious music, read fiction or newspapers, or went to plays. He once stormed down to a tavern to tell the tavern owner that celebrating Christmas was sinful.
... if Woolman were just an 18th century neurotic, no one would remember him. We care about him because of his attitude about slavery.
When Woolman was 21, his employer asked him to write a bill of sale for an enslaved woman. Woolman knew it was wrong. But his employer told him to and he was scared of being fired. Both Woolman’s employer and the purchaser were Quakers themselves, so surely if they were okay with it it was okay. Woolman told both his master and the purchaser that he thought that Christians shouldn't own enslaved people, but he wrote the bill.
After he wrote the bill of sale Woolman lost his inner peace and never really recovered it. He spent the rest of his life struggling with guilt and self-hatred. He saw himself as selfish and morally deficient. ...
Woolman worked enough to support himself, but the primary project of his life was ending slavery. He wrote pamphlet after pamphlet making the case that slavery was morally wrong and unbiblical. He traveled across America making speeches to Quaker Meetings urging them to oppose slavery. He talked individually with slaveowners, both Quaker and not, which many people criticized him for; it was “singular”, and singular was not okay. ...
It is difficult to overstate how much John Woolman hated doing anti-slavery activism. For the last decade of his life, in which he did most of his anti-slavery activities, he was clearly severely depressed. ... Partially, he hated the process of traveling: the harshness of life on the road; being away from his family; the risk of bringing home smallpox, which terrified him.
But mostly it was the task being asked of Woolman that filled him with grief. Woolman was naturally "gentle, self-deprecating, and humble in his address", but he felt called to harshly condemn slaveowning Quakers. All he wanted was to be able to have friendly conversations with people who were nice to him. But instead, he felt, God had called him to be an Old Testament prophet, thundering about God’s judgment and the need for repentance. ...
Woolman craved approval from other Quakers. But even Quakers personally opposed to slavery often thought that Woolman was making too big a deal about it. There were other important issues. Woolman should chill. His singleminded focus on ending slavery was singular, and being singular was prideful. Isn’t the real sin how different Woolman’s abolitionism made him from everyone else?
Sometimes he persuaded individual people to free their slaves, but successes were few and far between. Mostly, he gave speeches and wrote pamphlets as eloquently as he could, and then his audience went “huh, food for thought” and went home and beat the people they’d enslaved. Nothing he did had any discernible effect.
... Woolman spent much of his time feeling like a failure. If he were better, if he followed God’s will more closely, if he were kinder and more persuasive and more self-sacrificing, then maybe someone would have lived free who now would die a slave, because Woolman wasn’t good enough.
The modern version of this is probably what Thomas Kwa wrote about here:
I think that many people new to EA have heard that multipliers like these exist, but don't really internalize that all of these multipliers stack multiplicatively. ... If she misses one of these multipliers, say the last one, ... Ana is losing out on 90% of her potential impact, consigning literally millions of chickens to an existence worse than death. To get more than 50% of her maximum possible impact, Ana must hit every single multiplier. This is one way that reality is unforgiving.
From one perspective, Woolman was too hard on himself about his relatively tangential connection to slavery. From another perspective, he is one of a tiny number of people in the eighteenth century who has a remotely reasonable response to causing a person to be in bondage when they could have been free. Everyone else flinched away from the scale of the suffering they caused; Woolman looked at it straight. Everyone else thought of slaves as property; Woolman alone understood they were people.
Some people’s high moral standards might result in unproductive self-flagellation and the refusal to take actions because they might do something wrong. But Woolman derived strength and determination from his high moral standards. When he failed, he regretted his actions and did his best to change them. At night he might beg God to fucking call someone else, but the next morning he picked up his walking stick and kept going.
And the thing he was doing mattered. Quaker abolitionism wasn’t inevitable; it was the result of hard work by specific people, of whom Woolman was one of the most prominent. If Woolman were less hard on himself, many hundreds if not thousands of free people would instead have been owned things that could be beaten or raped or murdered with as little consequence as I experience from breaking a laptop.
An aside (doubling as warning) on mission orientation, quoting Tanner Greer's Questing for Transcendence:
... out of the lands I’ve lived in and the roles I’ve donned, none blaze in my memory like the two years I spent as a missionary for the Church of Jesus Christ. It is a shame that few who review my resume ask about that time; more interesting experiences were packed into those few mission years than in the rest of the lot combined. ... I doubt I shall ever experience anything like it again. I cannot value its worth. I learned more of humanity’s crooked timbers in the two years I lived as missionary than in all the years before and all the years since.
Attempting to communicate what missionary life is like to those who have not experienced it themselves is difficult. ... Yet there is one segment of society that seems to get it. In the years since my service, I have been surprised to find that the one group of people who consistently understands my experience are soldiers. In many ways a Mormon missionary is asked to live something like a soldier... [they] spend years doing a job which is not so much a job as it is an all-encompassing way of life.
The last point is the one most salient to this essay. It is part of the reason both many ex-missionaries (known as “RMs” or “Return Missionaries” in Mormon lingo) and many veterans have such trouble adapting to life when they return to their homes. ... Many RMs report a sense of loss and aimlessness upon returning to “the real world.” They suddenly find themselves in a society that is disgustingly self-centered, a world where there is nothing to sacrifice or plan for except one’s own advancement. For the past two years there was a purpose behind everything they did, a purpose whose scope far transcended their individual concerns. They had given everything—“heart, might, mind and strength“—to this work, and now they are expected to go back to racking up rewards points on their credit card? How could they?
The soldier understands this question. He understands how strange and wonderful life can be when every decision is imbued with terrible meaning. Things which have no particular valence in the civilian sphere are a matter of life or death for the soldier. Mundane aspects of mundane jobs (say, those of the former vehicle mechanic) take on special meaning. A direct line can be drawn between everything he does—laying out a sandbag, turning off a light, operating a radio—and the ability of his team to accomplish their mission. Choice of food, training, and exercise before combat can make the difference between the life and death of a soldier’s comrades in combat. For good or for ill, it is through small decisions like these that great things come to pass.
In this sense the life of the soldier is not really his own. His decisions ripple. His mistakes multiply. The mission demands strict attention to things that are of no consequence in normal life. So much depends on him, yet so little is for him.
This sounds like a burden. In some ways it is. But in other ways it is a gift. Now, and for as long as he is part of the force, even his smallest actions have a significance he could never otherwise hope for. He does not live a normal life. He lives with power and purpose—that rare power and purpose given only to those whose lives are not their own.
... It is an exhilarating way to live.
This sort of life is not restricted to soldiers and missionaries. Terrorists obviously experience a similar sort of commitment. So do dissidents, revolutionaries, reformers, abolitionists, and so forth. What matters here is conviction and cause. If the cause is great enough, and the need for service so pressing, then many of the other things—obedience, discipline, exhaustion, consecration, hierarchy, and separation from ordinary life—soon follow. It is no accident that great transformations in history are sprung from groups of people living in just this way. Humanity is both at its most heroic and its most horrifying when questing for transcendence.
The remark that vultures are not particularly attractive reminds me of the overlooked plight of farmed chickens, shrimp, insects, etc., neglected for not being charismatic fauna. (I am admittedly sort of emotionally conflating the welfare of vultures with their ecosystem importance as a keystone species here.)
Pretty funny CGD blog post by Victoria Fan and Rachel Bonnifield: If the Global Health Donors Were Your Parents: A (Whimsical) Comparative Perspective. Quoting at length (with some reformatting):
I like Austin Vernon's idea for scaling CO2 direct air capture to 40 billion tons per year, i.e. matching our current annual CO2 emissions, using (extreme versions of) well-understood industrial processes.
I am admittedly perhaps biased to want moonshots like Vernon's idea to work, and for society at large to be able to coordinate and act on the required scale, after seeing these depressing charts from Assessing the costs of historical inaction on climate change:
Curious what people think of Gwern Branwen's take that our moral circle has historically narrowed as well, not just expanded (so contra Singer), so we should probably just call it a shifting circle. His summary:
(I admittedly don't find his examples all that persuasive, probably because I'm already biased to only consider beings that can feel pleasure and suffering.)
What's the "so what"? Gwern:
Hi Mo. I'm unsure if you've seen it, but Gwern’s article was discussed here.
I hadn't, thanks for the pointer Pablo.
Why did India's happiness ratings consistently drop so much over time even as its GDP per capita rose?
Epistemic status: confused. Haven't looked into this for more than a few minutes
My friend recently alerted me to an observation that puzzled him: this dynamic chart from Our World in Data's happiness and life satisfaction article showing how India's self-reported life satisfaction dropped by an astounding 1.20 points (4.97 to 3.78) from 2011 to 2021, even as its GDP per capita rose +51% (I$4,374 to I$6,592 in 2017 prices):
(I included China for comparison to illustrate the sort of trajectory I expected to see for India.)
The sliding year scale on OWID's chart shows how this drop has been consistent and worsening over the years. This picture hasn't changed much recently: the most recent 2024 World Happiness Report reports a 4.05 rating averaged over the 3-year window 2021-23, only slightly above the 2021 rating.
A -1.20 point drop is huge. For context, it's 10x(!) larger than the effect of doubling income at +0.12 LS points (Clarke et al 2018 p199, via HLI's report), and compares to major negative life events like widowhood and extended unemployment:
Given India's ~1.4 billion population, such a large drop is alarming: roughly ~5 billion LS-years lost since 2011, very roughly ballparking. For context, and keeping in mind that LS-years and DALYs aren't the same thing, the entire world's DALY burden is ~2.5 billion DALYs p.a.
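(One way to recover that ballpark, purely my own back-of-envelope and not the original reasoning: assume an average shortfall of ~0.3 points sustained over the ~12 years since 2011 across ~1.4 billion people:

$$1.4 \times 10^{9}\ \text{people} \times 0.3\ \text{points} \times 12\ \text{years} \approx 5 \times 10^{9}\ \text{LS-years}$$

The ~0.3-point average shortfall is an assumption on my part.)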
But – again caveating with my lack of familiarity with the literature and extremely cursory look into this – I haven't seen any writeup look into this, which makes me wonder if it's not a 'real issue'? For instance, the 2021 WHR just says
That's it: no elaboration, no footnotes, nothing.
So what am I missing? What's going on here?
A quick search turned up this WEF article (based on Ipsos data and research, not the WHR's Gallup World Poll, so take it with a grain of salt) pointing to
But I'm not sure these factors are differential (i.e. that they, for instance, happen much more in India than elsewhere, such that they explain the difference in wellbeing-vs-development trajectories over 2011-24)?
Interesting! I think figure 2.1 here provides a partial answer. According to the FAQ:
"the sub-bars show the estimated extent to which each of the six factors (levels of GDP, life expectancy, generosity, social support, freedom, and corruption) is estimated to contribute to making life evaluations higher in each country than in Dystopia. Dystopia is a hypothetical country with values equal to the world’s lowest national averages for each of the six factors (see FAQs: What is Dystopia?). The sub-bars have no impact on the total score reported for each country but are just a way of explaining the implications of the model estimated in Table 2.1. People often ask why some countries rank higher than others—the sub-bars (including the residuals, which show what is not explained) attempt to answer that question."
India seems to score very low on social support, compared to similarly ranked countries.
I did some googling and found this, which shows the sub-factors over time for India. Looks like social support declined a lot, but is now increasing again.
I haven't checked whether it declined more than in other countries and, if it has, I'm not sure why it has.
Thank you for the pointer!
Your second link helped me refine my line of questioning / confusion. You're right that social support declined a lot, but the sum of the six key variables (GDP per capita, etc) still mostly trended upwards over time, huge covid dip aside, which is what I'd expect in the India development success story.
It's the dystopia residual that keeps dropping, from 2.275 - 1.83 = 0.445 in 2015 (i.e. Indians reported 0.445 points higher life satisfaction than you'd predict using the model) to 0.979 - 1.83 = -0.85: an absolute plummeting of life satisfaction across a sizeable fraction of the world's population that's for some reason not explained by the six key variables. Hm...
(please don't feel obliged to respond – I appreciate the link!)
Could this be related to the rising level of inequality in happiness levels in Asia? (See the graph on page 44 of the WHR2024). It can be assumed that the benefits of GDP growth are not evenly distributed, and increasing inequalities trigger frustration and a decrease in well-being in the majority of the population (since to a certain extent, the sense of welfare is relative).
This is how Our World in Data explains a similar phenomenon in the US: "Income inequality in the US is exceptionally high and has been on the rise in the last four decades, with incomes for the median household growing much more slowly than incomes for the top 10%. As a result, trends in aggregate life satisfaction should not be seen as paradoxical: the income and standard of living of the typical US citizen have not grown much in the last couple of decades."
Yeah rising inequality is a good guess, thank you – the OWID chart also shows the US experiencing the same trajectory direction as India (declining average LS despite rising GDP per capita). I suppose one way to test this hypothesis is to see if China had inequality rise significantly as well in the 2011-23 period, since it had the expected LS-and-GDP-trending-up trajectory. Probably a weak test due to potential confounders...
As someone predisposed to like modeling, the key takeaway I got from Justin Sandefur's Asterisk essay PEPFAR and the Costs of Cost-Benefit Analysis was this corrective reminder – emphasis mine, focusing on what changed my mind:
More detail:
Tangentially, I suspect this sort of attitude (Iraq invasion notwithstanding) would naturally arise out of a definite optimism mindset (that essay by Dan Wang is incidentally a great read; his follow-up is more comprehensive and clearly argued, but I prefer the original for inspiration). It seems to me that Justin has this mindset as well, cf. his analogy to climate change in comparing economists' carbon taxes and cap-and-trade schemes vs progressive activists pushing for green tech investment to bend the cost curve. He concludes:
Aside from his climate change example above, I'd be curious to know what other domains economists are making analytical mistakes in w.r.t. cost-benefit modeling, since I'm probably predisposed to making the same kinds of mistakes.
This WHO press release was a good reminder of the power of immunization – a new study forthcoming in The Lancet reports that (liberally quoting / paraphrasing the release)
Great OWID charts for this:
(Attention conservation notice: rambling in public)
A striking throwaway remark, given its context:
It's striking given that this comes from this book on Thailand’s Health Intervention and Technology Assessment Program (HITAP) (ch 1 pg 22), albeit perhaps understandable given the authors' stance that evidence is necessary but not sufficient to determine the best course of action (to treat a patient, to design a social insurance scheme, etc), which seems completely unobjectionable.
That said, I did wonder about the first half of the quoted throwaway remark, so I asked Elicit; its top-4 paper summary is
although the summary didn't include these papers it listed in the top 10
With the proviso that I'm a layperson w.r.t. medicine and healthcare, and that I didn't ask Elicit further questions or really dig further into this at all — I find myself mostly unmoved by these papers & reviews, while the younger me of (say) a decade ago would've epistemically panicked. Partly it's that they aren't really contra "using evidence to inform medicine" per se: to oversimplify a bit, Bahtsevani et al recommend more evidence generation, Every-Palmer & Howick recommend less industry-biased evidence generation, and Shaw et al argue that other less legible-than-RCT types of evidence should occupy more mindshare than they did back in '07 (there's a loose parallel here to the more recent growth vs randomista debate in dev econ). Partly it's that I suspect there's some talking past each other, which only becomes clear when one digs into the nuts-and-bolts. Partly it's that I think the general underlying ethos of "using evidence to inform medicine" is a lot more robust than any particular instantiation of it (e.g. using only empirical data from systematic reviews of RCTs), sort of like how cluster thinking > sequence thinking for decision-making, or like how foxes have weak views strongly held (side note: in that essay's framing I used to be a hedgehog, hopefully I'm now more fox than degenerate cactus). Partly it's that I've "seen this before" with other topics, cf. Scott Alexander's many deep dives. Maybe I'm just getting old...
I haven't looked in detail, but my quick comment would be that these studies seem to basically be comparing extremely careful following of evidence-based medicine vs. "normal medical practise", which is like 90%+ based on evidence anyway. Standard medical training and registered medical practise in most of the world closely follows the evidence - it would be very difficult (maybe impossible) to practise "outside" of the evidence. So not finding a huge difference between these 2 ways of practising isn't so surprising.
Epistemic status: public attempt at self-deconfusion & not just stopping at knee-jerk skepticism
The recently published Cost-effectiveness of interventions for HIV/AIDS, malaria, syphilis, and tuberculosis in 128 countries: a meta-regression analysis (so recent it's listed as being published next month), in my understanding, aims to fill country-specific gaps in CEAs for all interventions in all countries for HIV/AIDS, malaria, syphilis, and tuberculosis, to help national decision-makers allocate resources effectively – to a first approximation I think of it as "like the DCP3 but at country granularity and for Global Fund-focused programs". They do this by predicting ICERs, IQRs, and 95% UIs in US$/DALY using the meta-regression parameters obtained from analysing ICERs published for these interventions (more here).
AFAICT their methodology and execution seem superb, so I was keen to see their results:
At risk of being overly skeptical: $3 per DALY averted is >30x better than Open Phil's 1,000x bar of $100 per DALY, which is roughly GiveWell top charity level and which OP has said is hard to beat, especially for a direct intervention like antenatal syphilis screening. It makes me wonder how much credence to put in the study's findings for actual resource allocation decisions (esp. Figure 4, ranking top interventions at country granularity). Also:
So how should I think about and use their findings given what seems like reasonable grounds for skepticism, if I'm primarily interested in helping decision-makers help people better? Scattered thoughts to defend the study / push back on my nitpicking above:
That said, I do wonder if the authors could have done a bit better, like
The positive spin is that someone got funded to do this kind of big-picture analysis and got it published in The Lancet.
There were 1,792 potential country-intervention pairs (although it is not immediately clear if they did all 1,792 pairs). So I don't think most reasonable readers would view these findings as substitutes for a more in-depth, country-specific analysis on the potentially promising intervention. They did publish at least some data for each intervention, although maybe it isn't enough to poke at each of the country-intervention pairs.
One of the more surprising things I learned from Karen Levy's 80K podcast interview on misaligned incentives in global development was how her experience directly contradicted a stereotype I had about for-profits vs nonprofits:
My assumption had been that the opposite was true.
The following table is from Scott Alexander's post, which you should check out for the sources and (many, many) caveats.
I'm curious what people who're more familiar with infinite ethics think of Manheim & Sandberg's What is the upper limit of value?, in particular where they discuss infinite ethics (emphasis mine):
I first read their paper a few years ago and found their arguments for the finiteness of value persuasive, as well as their collectively-exhaustive responses in section 4 to possible objections. So ever since then I've been admittedly confused by claims that the problems of infinite ethics still warrant concern w.r.t. ethical decision-making (e.g. I don't really buy Joe Carlsmith's arguments for acknowledging that infinities matter in this context, same for Toby Ord's discussion in a recent 80K podcast). What am I missing?
Sandberg's recent 80K podcast interview transcript has this quote:
I like John Salter's post on schlep blindness in EA (inspired by Paul Graham's eponymous essay), whose key takeaway is
Unfortunately it's bereft of concrete examples. The closest to a shortlist he shares is in this comment:
(I disagree with some of these; e.g. the first bullet seems contradicted by the propensity for forum drama on adjacent topics, and as someone who likes compiling lists and data I don't actually see much low-hanging fruit for me to contribute here due to the work of e.g. Hamish)
I'd be keen to learn other examples. He does give this advice to brainstorm examples:
although in my case it's not useful because I either just end up doing it (or trying, failing, and learning why), or discover that it's already been done better than I could (e.g. Rethink Priorities' new CCM).
That said, I still think the original takeaway is a useful reminder.
[Question] How should we think about the decision relevance of models estimating p(doom)?
(Epistemic status: confused & dissatisfied by what I've seen published, but haven't spent more than a few hours looking. Question motivated by Open Philanthropy's AI Worldviews Contest; this comment thread asking how OP updated reminded me of my dissatisfaction. I've asked this before on LW but got no response; curious to retry, hence repost)
To illustrate what I mean, switching from p(doom) to timelines:
Question: Is there something like this for p(doom) estimates? More specifically, following the above points as pushback against the strawman(?) that "p(doom) discourse, including rigorous modeling of it, is overrated":
Or is this whole line of questioning simply misguided or irrelevant?
Some writings I've seen gesturing in this direction:
This question was mainly motivated by my attempt to figure out what to make of people's widely-varying p(doom) estimates, e.g. in the appendix section of Apart Research's website, beyond simply "there is no consensus on p(doom)". I suppose one can argue that rigorous p(doom) modeling helps reduce disagreement on intuition-driven estimates by clarifying cruxes or deconfusing concepts, thereby improving confidence and coordination on what to do, but in practice I'm unsure if this is the case (reading e.g. the public discussion around the p(doom) modeling by Carlsmith, Froolow, etc), so I'm not sure I buy this argument, hence my asking for concrete examples.
I just learned about Tom Frieden via Vadim Albinsky's writeup Resolve to Save Lives Trans Fat Program for Founders Pledge. His impact in sheer lives saved is astounding, and I'm embarrassed I didn't know about him before:
How many more? Albinsky estimates:
Tangentially, if a "Borlaug" is a billion lives saved, then Frieden's impact is probably on the scale of ~100 milliBorlaugs (to nearest OOM). Bill and Melinda likely have had similar impact. This makes me wonder who else I don't know about who's done ~100 milliBorlaugs of good.
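Spelling out the unit conversion (taking a Borlaug to be $10^9$ lives saved, per the definition above):

$$100\ \text{milliBorlaugs} = 0.1\ \text{Borlaug} = 0.1 \times 10^{9} = 10^{8}\ \text{lives saved}$$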
(It's arguably unfair to wholly attribute all those lives saved to Frieden, and I am honestly unsure what credit attribution makes most sense, but applying the same logic to Borlaug you can no longer really say he saved a billion lives.)
The 1,000-ton rule is Richard Parncutt's suggestion for reframing the political message of the severity of global warming in particularly vivid human rights terms; it says that someone in the next century or two is prematurely killed every time humanity burns 1,000 tons of carbon.
I came across this paper while (in the spirit of Nuno's suggestion) trying to figure out the 'moral cost of climate change' so to speak, driven by my annoyance that e.g. climate charity BOTECs report $ per ton of CO2-eq averted, in contrast to (say) the $ per death averted bottom line of GHW charities, since I don't intrinsically care about averting CO2-equivalent emissions the way I care about averting deaths. (To be clear, I understand why the BOTECs do so and would do the same for work; this is for my own moral clarity.)
Parncutt's derivation is simple: burning a trillion tons of carbon will cause ~2 °C of anthropogenic global warming, which will in turn cause 1 - 10 million premature deaths a year "for a period of several centuries", something like this:
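Spelled out (my reconstruction of the arithmetic, taking ~1 billion total premature deaths, i.e. a few million per year over a few centuries):

$$\frac{\sim 10^{9}\ \text{premature deaths}}{10^{12}\ \text{tonnes of carbon}} = 10^{-3}\ \text{deaths per tonne} = 1\ \text{death per } 1000\ \text{tonnes burned}$$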
Modelling the rise in global mean surface temperature (GMST) as a function of carbon burned is already very hard. Parncutt doesn't try to model premature deaths as a function of GMST; instead he makes a semi-quantitative, order-of-magnitude estimate, anchored at the lower and upper ends to various catastrophic outcomes discussed in the climate change literature, and assumes a lognormal distribution around a billion future deaths with a 10x range between worst- and best-case scenarios, which over time looks 'very approximately' like this:
Based on the 1,000-ton rule, Pearce & Parncutt suggest the 'millilife' as "an accessible unit of measure for carbon footprints that is easy to understand and may be used to set energy policy to help accelerate carbon emissions reductions". A millilife is a measure of intrinsic value defined to be 1/1000th of a human life; the 1,000-ton rule says that burning a ton of fossil carbon destroys a millilife. This lets Pearce & Parncutt make statements like these, at an individual level (all emphasis mine):
and
and for "large-scale energy decisions":
Pearce & Parncutt then use the 1,000-ton rule and millilife to make various suggestions. Here's one:
Notes from Ozy Brennan's On capabilitarianism