Mo Putera

Research & quantitative modelling @ ARMoR

Bio

CE Research Training Program graduate and research intern at ARMoR under the Global Impact Placements program, working on cost-benefit analyses to help combat AMR. Currently exploring roles involving research distillation and quantitative analysis to improve decision-making, e.g. applied prioritization research, previously supported by an FTX Future Fund regrant and later Open Philanthropy's affected grantees program. Previously spent 6 years doing data analytics, business intelligence, and knowledge + project management in various industries (airlines, e-commerce) and departments (commercial, marketing), after majoring in physics at UCLA. Also collaborating on a local charity evaluation initiative with the moonshot aim of reorienting Malaysia's giving landscape towards effectiveness.

I first learned about effective altruism circa 2014 via A Modest Proposal, a polemic on using dead children as units of currency to force readers to grapple with the opportunity costs of subpar resource allocation under triage. I have never stopped thinking about it since, although my relationship to it has changed quite a bit; I related to Tyler's personal story (which unsurprisingly also references A Modest Proposal as a life-changing polemic):

I thought my own story might be more relatable for friends with a history of devotion – unusual people who’ve found themselves dedicating their lives to a particular moral vision, whether it was (or is) Buddhism, Christianity, social justice, or climate activism. When these visions gobble up all other meaning in the life of their devotees, well, that sucks. I go through my own history of devotion to effective altruism. It’s the story of [wanting to help] turning into [needing to help] turning into [living to help] turning into [wanting to die] turning into [wanting to help again, because helping is part of a rich life].

Comments

Topic contributions

Yeah, rising inequality is a good guess, thank you – the OWID chart also shows the US on the same trajectory as India (declining average LS despite rising GDP per capita). I suppose one way to test this hypothesis is to check whether China also saw inequality rise significantly over the 2011-23 period, since it had the expected LS-and-GDP-trending-up trajectory. Probably a weak test due to potential confounders...

Thank you for the pointer!

Your second link helped me refine my line of questioning / confusion. You're right that social support declined a lot, but the sum of the six key variables (GDP per capita, etc.) still mostly trended upwards over time, huge covid dip aside, which is what I'd expect from the India development success story.

It's the dystopia residual that keeps dropping: from 2.275 - 1.83 = 0.445 in 2015 (i.e. Indians reported 0.445 points higher life satisfaction than you'd predict using the model) to 0.979 - 1.83 = -0.85 – a steep fall in life satisfaction across a sizeable fraction of the world's population that, for some reason, isn't explained by the six key variables. Hm...
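
To spell out the arithmetic (this just restates the numbers above: the 1.83 dystopia baseline and the 2.275 / 0.979 "dystopia + residual" values are what I'm reading off the WHR data):

```python
# Residual = WHR's reported "dystopia + residual" minus the 1.83 dystopia baseline,
# i.e. how far reported life satisfaction sits above/below what the six key
# variables would predict.
DYSTOPIA = 1.83

def residual(dystopia_plus_residual: float) -> float:
    return dystopia_plus_residual - DYSTOPIA

r_2015 = residual(2.275)  # +0.445: reported LS above the model's prediction
r_late = residual(0.979)  # -0.851: reported LS below the model's prediction
print(r_2015, r_late, r_late - r_2015)  # a swing of roughly -1.3 points
```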

(please don't feel obliged to respond – I appreciate the link!)

Why did India's happiness ratings consistently drop so much over time even as its GDP per capita rose?

Epistemic status: confused. Haven't looked into this for more than a few minutes.

My friend recently alerted me to an observation that puzzled him: this dynamic chart from Our World in Data's happiness and life satisfaction article showing how India's self-reported life satisfaction dropped by an astounding 1.20 points (4.97 to 3.78) from 2011 to 2021, even as its GDP per capita rose +51% (I$4,374 to I$6,592 in 2017 prices):

(I included China for comparison to illustrate the sort of trajectory I expected to see for India.)

The sliding year scale on OWID's chart shows how this drop has been consistent and worsening over the years. This picture hasn't changed much recently: the most recent 2024 World Happiness Report gives a 4.05 rating averaged over the 3-year window 2021-23, only slightly above the 2021 rating.

A 1.20-point drop is huge. For context, it's 10x(!) the effect of doubling income, which is worth +0.12 LS points (Clarke et al 2018 p199, via HLI's report), and is comparable to major negative life events like widowhood and extended unemployment:

The effect of life events on life satisfaction

Given India's ~1.4 billion population, such a large drop is alarming: very roughly ballparking, something like ~5 billion LS-years lost since 2011. For context, and keeping in mind that LS-years and DALYs aren't the same thing, the entire world's DALY burden is ~2.5 billion DALYs p.a.
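
To show how sensitive that ballpark is to assumptions, here's a minimal sketch; the linear-ramp shape and the fixed ~1.35 billion population are my own simplifying assumptions, and tweaking them moves the total around a lot (though it stays in the billions):

```python
# Very rough ballpark of cumulative LS-point-years lost in India since 2011.
# Illustrative assumptions (mine, not a claim about the true trajectory):
#   - the LS deficit vs the 2011 level ramps linearly from 0 (2011) to 1.2 (2021),
#     then stays flat through 2023
#   - population held fixed at ~1.35 billion
POPULATION = 1.35e9

def deficit(year: int) -> float:
    """LS points below the 2011 level, under the linear-ramp assumption."""
    return 1.2 * min(year - 2011, 10) / 10

total = sum(deficit(y) for y in range(2011, 2024)) * POPULATION
print(f"{total:.1e} LS-point-years")
# ~1.2e10 under these assumptions – the same (billions) ballpark as the
# ~5 billion figure above, within a factor of a few.
```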

But – again caveating that I'm unfamiliar with the literature and have only taken an extremely cursory look – I haven't seen any writeup examine this, which makes me wonder whether it's a 'real issue' at all. For instance, the 2021 WHR just says:

Since 2006-08, world well-being has been static, but life expectancy increased by nearly four years up to 2017-19 (we shall come to 2020 later). The rate of progress differed a lot across regions. The biggest improvements in life expectancy were in the former Soviet Union, in Asia, and (the greatest) in Sub-Saharan Africa. And these were the regions that had the biggest increases in WELLBYs. In Asia, the exception is South Asia, where India has experienced a remarkable fall in Well-being which more than outweighs its improved life expectancy.

That's it: no elaboration, no footnotes, nothing.

So what am I missing? What's going on here? 

A quick search turned up this WEF article (based on Ipsos data and research, not the WHR's Gallup World Poll, so take it with a grain of salt) pointing to:

  • increased internet access -> pressure to portray airbrushed lives on social media & a feeling that 'their lives have become meaningless' 
  • covid-19 mitigation-induced isolation curtailing activities that improve wellbeing (employment, socializing, going to school, exercising and accessing health services)
  • urban migration to seek work -> traffic congestion, noise and pollution, demanding bosses -> less sleep and exercise -> higher anxiety and worsening health 

But I'm not sure these factors are differential (i.e. that they, for instance, happen much more in India than elsewhere, such that they explain the wellbeing-vs-development trajectory difference over 2011-24).

Michael Dickens' 2016 post Evaluation Frameworks (or: When Importance / Neglectedness / Tractability Doesn't Apply) makes the following point, which I think is a useful corrective to keep in mind:

INT has its uses, but I believe many people over-apply it. 

Generally speaking (with some exceptions), people don’t choose between causes, they choose between interventions. That is, they don’t prioritize broad focus areas like global poverty or immigration reform. Instead, they choose to support specific interventions such as distributing deworming treatments or lobbying to pass an immigration bill. The INT framework doesn’t apply to interventions as well as it does to causes. In short, cause areas correspond to problems, and interventions correspond to solutions; INT assesses problems, not solutions.

(aside: Michael Plant makes the same point in chapters 5 & 6 of his PhD thesis as per Edo Arad's post, using it as a starting point to develop a systematic cause prio approach he called 'cause mapping')

In most cases, we can try to directly assess the true marginal impact of investing in an intervention. These assessments will never be perfectly accurate, but they generally seem to tell us more than INT does. ... 

How can we estimate an intervention’s impact more directly? To develop a better framework, let’s start with the final result we want and work backward to see how to get it.

Dickens' post has more, including the framework they end up with (and a more fine-grained version which, they note, is somewhat less practical). I also appreciated that Dickens actually used this framework to guide their giving decision (more details in their post).

I'd be curious if you have any thoughts on how your proposed refactoring from [neartermist human-only / neartermist incl. AW / longtermist] -> [pure suffering reduction / reliable global capacity growth / moonshots] might change, in broad strokes (i.e. direction & OOM change), current 

Or maybe these are not the right questions to ask / I'm looking at the wrong things, since you seem to be mainly aiming at research (re)prioritisation? 

I do agree with 

we should be especially cautious of completely dismissing commonsense priorities in a worldview-diversified portfolio (even as we give significant weight and support to a range of theoretically well-supported counterintuitive cause areas) 

although I'd have thought the sandboxing of cluster thinking (vs sequence thinking) handles that just fine:

A key difference with “sequence thinking” is the handling of certainty/robustness (by which I mean the opposite of Knightian uncertainty) associated with each perspective. Perspectives associated with high uncertainty are in some sense “sandboxed” in cluster thinking: they are stopped from carrying strong weight in the final decision, even when such perspectives involve extreme claims (e.g., a low-certainty argument that “animal welfare is 100,000x as promising a cause as global poverty” receives no more weight than if it were an argument that “animal welfare is 10x as promising a cause as global poverty”).

I'm from a middle-income country, so when I first seriously engaged with EA, I remember feeling really sad and left out because my order-of-magnitude lower earnings (vs HIC folks) proportionately reduced my giving impact.

It's also why the original title of your post – the post itself is fantastic; I resonate with a lot of the points you bring up – didn't quite land with me, so I appreciate the title change and your consideration in thinking through Jeff's example.

[Question] How should we think about the decision relevance of models estimating p(doom)?

(Epistemic status: confused & dissatisfied by what I've seen published, but haven't spent more than a few hours looking. Question motivated by Open Philanthropy's AI Worldviews Contest; this comment thread asking how OP updated reminded me of my dissatisfaction. I've asked this before on LW but got no response; curious to retry, hence repost) 

To illustrate what I mean, switching from p(doom) to timelines: 

  • The recent post AGI Timelines in Governance: Different Strategies for Different Timeframes was useful to me in pushing back against Miles Brundage's argument that "timeline discourse might be overrated", by showing how choice of actions (in particular in the AI governance context) really does depend on whether we think that AGI will be developed in ~5-10 years or after that. 
  • A separate takeaway of mine is that decision-relevant estimation "granularity" need not be that fine-grained, and in fact is not relevant beyond simply "before or after ~2030" (again in the AI governance context). 
  • Finally, that post was useful to me in simply concretely specifying which actions are influenced by timelines estimates.  

Question: Is there something like this for p(doom) estimates? More specifically, following the above points as pushback against the strawman(?) that "p(doom) discourse, including rigorous modeling of it, is overrated":

  1. What concrete high-level actions do most alignment researchers agree are influenced by p(doom) estimates, and would benefit from more rigorous modeling (vs just best guesses, even by top researchers e.g. Paul Christiano's views)?
  2. What's the right level of granularity for estimating p(doom) from a decision-relevant perspective? Is it just a single bit ("below or above some threshold X%") like estimating timelines for AI governance strategy, or OOM (e.g. 0.1% vs 1% vs 10% vs >50%), or something else?
    • I suppose the easy answer is "the granularity depends on who's deciding, what decisions need making, in what contexts", but I'm in the dark as to concrete examples of those parameters (granularity i.e. thresholds, contexts, key actors, decisions)
    • e.g. reading Joe Carlsmith's personal update from ~5% to >10%, I'm unsure if this changes his recommendations at all, or even his conclusion – he writes that "my main point here, though, isn't the specific numbers... [but rather that] there is a disturbingly substantive risk that we (or our children) live to see humanity as a whole permanently and involuntarily disempowered by AI systems we’ve lost control over", which would've been true for both 5% and 10%

Or is this whole line of questioning simply misguided or irrelevant?


Some writings I've seen gesturing in this direction:

  • harsimony's argument that Precise P(doom) isn't very important for prioritization or strategy ("identifying exactly where P(doom) lies in the 1%-99% range doesn't change priorities much") amounts to the 'single bit granularity' answer
    • Carl Shulman disagrees, but his comment (while answering my 1st bullet point) isn't clear in the way the different AI gov strategies for different timelines post is, so I'm still left in the dark – to (simplistically) illustrate with a randomly-chosen example from his reply and making up numbers, I'm looking for statements like "p(doom) < 2% implies we should race for AGI with less concern about catastrophic unintended AI action, p(doom) > 10% implies we definitely shouldn't, and p(doom) between 2-10% implies reserving this option for last-ditch attempts", which he doesn't provide
  • Froolow's attempted dissolution of AI risk (which takes Joe Carlsmith's model and adds parameter uncertainty – inspired by Sandberg et al's Dissolving the Fermi paradox – to argue that low-risk worlds are more likely than non-systematised intuition alone would suggest) 
    • Froolow's modeling is useful to me for making concrete recommendations for funders, e.g. (1) "prepare at least 2 strategies for the possibility that we live in one of a high-risk or low-risk world instead of preparing for a middling-ish risk", (2) "devote significantly more resources to identifying whether we live in a high-risk or low-risk world", (3) "reallocate resources away from macro-level questions like 'What is the overall risk of AI catastrophe?' towards AI risk microdynamics like 'What is the probability that humanity could stop an AI with access to nontrivial resources from taking over the world?'", (4) "When funding outreach / explanations of AI Risk, it seems likely it would be more convincing to focus on why this step would be hard than to focus on e.g. the probability that AI will be invented this century (which mostly Non-Experts don’t disagree with)". I haven't really seen any other p(doom) model do this, which I find confusing 
  • I'm encouraged by the long-term vision of the MTAIR project "to convert our hypothesis map into a quantitative model that can be used to calculate decision-relevant probability estimates", so I suppose another easy answer to my question is just "wait for MTAIR", but I'm wondering if there's a more useful answer on the "current SOTA" than this. (That introduction post also has a notional illustration of how MTAIR can help with decision analysis.)

This question was mainly motivated by my attempt to figure out what to make of people's widely-varying p(doom) estimates, e.g. in the appendix section of Apart Research's website, beyond simply "there is no consensus on p(doom)". One can argue that rigorous p(doom) modeling helps reduce disagreement on intuition-driven estimates by clarifying cruxes or deconfusing concepts, thereby improving confidence and coordination on what to do, but in practice I'm not sure this is the case (judging from e.g. the public discussion around the p(doom) modeling by Carlsmith, Froolow, etc.), hence my asking for concrete examples.

what I get (2e6) when deriving the US slice of the total DALY burden from global burden of disease data showing 3% global DALYs come from URI

I'm seeing 0.25% globally and 0.31% for the US for URI in the GBD data, ~1 OOM lower (the direct figure for the US is 3.4e5, also ~1 OOM lower). What am I missing? 
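
To make the gap concrete, here's a quick sketch of the two calculations; the ~1e8 figure for the total US DALY burden is my own rough placeholder, used only to show where the ~10x factor comes from:

```python
# Rough reconstruction of the URI discrepancy (illustrative only).
# Assumption (mine): total US DALY burden on the order of ~1e8 DALYs per year.
US_TOTAL_DALYS = 1e8

uri_at_3_pct = 0.03 * US_TOTAL_DALYS       # ~3e6: in the vicinity of the 2e6 derived above
uri_at_0_31_pct = 0.0031 * US_TOTAL_DALYS  # ~3.1e5: close to GBD's direct 3.4e5 for the US

print(uri_at_3_pct, uri_at_0_31_pct, uri_at_3_pct / uri_at_0_31_pct)
# the ratio is ~10x, i.e. the ~1 OOM gap
```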

Akash's Speaking to Congressional staffers about AI risk seems similar:

In May and June of 2023, I (Akash) had about 50-70 meetings about AI risks with congressional staffers. ...

In March of 2023, I started working on some AI governance projects at the Center for AI Safety. One of my projects involved helping CAIS respond to a Request for Comments about AI Accountability that was released by the NTIA.

As part of that work, I started thinking a lot about what a good regulatory framework for frontier AI would look like. For instance: if I could set up a licensing regime for frontier AI systems, what would it look like? Where in the US government would it be housed? What information would I want it to assess?

I began to wonder how actual policymakers would react to these ideas. I was also curious to know more about how policymakers were thinking about AI extinction risks and catastrophic risks.

I started asking other folks in AI Governance. The vast majority had not talked to congressional staffers (at all). A few had experience talking to staffers but had not talked to them about AI risk. A lot of people told me that they thought engagement with policymakers was really important but very neglected. And of course, there are downside risks, so you don't want someone doing it poorly. 

After consulting something like 10-20 AI governance folks, I asked CAIS if I could go to DC and start talking to congressional offices. The goals were to (a) raise awareness about AI risks, (b) get a better sense of how congressional offices were thinking about AI risks, (c) get a better sense of what kinds of AI-related priorities people at congressional offices had, and (d) get feedback on my NTIA request for comment ideas. 

CAIS approved, and I went to DC in May-June 2023. And just to be clear, this wasn't something CAIS told me to do – this was more of an "Akash thing" that CAIS was aware was happening.

Like you, Akash just cold-emailed people:

I sent a mass email to tech policy staffers, and I was pretty impressed by the number who responded. The email was fairly short, mentioned that I was at CAIS, had 1-2 bullets about what CAIS does, and had a bullet point about the fact that I was working on an NTIA request for comment.

I think it was/is genuinely the case that Congressional staffers are extremely interested in AI content right now. Like, I don't think I would've been able to have this many meetings if I was emailing people about other issues.

There are a lot of concrete learnings in that writeup; definitely worth reading, I think.
