TL;DR
- EAs seem more focused on non-extinction risks from AI than they used to be
- I haven’t seen much discussion of why that makes reference to the very long run
- Changes in the world this might be responding to are: alignment seeming more promising than it used to, the speed of AI progress, and concerns around US governance.
- There are also plausible sociological reasons for the shift, so I feel unsure how far to update
What’s the change and why care about it?
My sense is that there’s been a significant shift in how much longtermists prioritise non-extinction risks over the last few years. A decade ago people who were trying to ensure the flourishing of the universe trillions of years from now were very often focused on avoiding events that would kill all humans. My sense is that now they’re much more often focused on situations which they don’t think pose a risk of human extinction, such as extreme power concentration.
This change has felt confusing to me for a couple of reasons:
- There’s reasonably little written about why longtermists should change their prioritisation in this direction. The notable exception is this paper by Will MacAskill.
- To my mind, there’s a clear reason for longtermists to focus on extinction risks:
  - It seems extremely hard for any of our actions now to predictably affect what the world looks like in trillions of years’ time. (Events which predictably affect the very longrun future are often called ‘lock-in’ events.)
  - Preventing humans from going extinct in the next decade continues to affect the future indefinitely, so human extinction seems like a clear ‘lock-in’ event.
You might think that our actual actions won’t be affected much by whether the risks we should prioritise most are totalitarianism or human extinction. There are indeed many actions which are useful for both of these. Pausing AI development will give us more time to allow the world to prepare itself for all sorts of risks that transformative AI might bring. Frontier AI companies having good whistleblower provisions gives us a better hope of finding out about many of the largest of those risks before they come about.
But there are ways in which you could disproportionately affect extinction risks compared to other risks. One is working directly to mitigate the ways humans could all plausibly be killed, such as AI-enabled biorisks. Another is to focus more on the risk of misaligned AI than on the risk of AI misuse by humans. The reason is that AI may be inherently alien to humans and therefore likely to have goals which are far-reaching and extremely different to ours, and so to cause a future which doesn’t include us. "The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else". By comparison, humans tend to have goals for which it’s useful to have other humans surviving.
I wanted to get a sense of why there’s been a shift in favour of longtermists prioritising extreme power concentration amongst humans compared to AI takeover, in order to understand better what’s highest priority for me to focus on. Below are the reasons I came across when talking to people about what has caused this trend.
Differences in the world that motivate the change
Alignment is more promising than some people expected.
For many longtermists, by far the most likely cause of human extinction is a misaligned AI. The reason is that humanity is actually reasonably resilient to large shocks, even ones that kill the vast majority of the population. That means natural disasters are reasonably unlikely to wipe everyone out, whereas a misaligned AI for whom humans were in the way might take a more intentional approach to wiping us out.
Ten years ago, it seemed plausible that AI wouldn’t be able to understand human goals at all, so the idea of AI pursuing totally alien goals seemed more likely. We now have empirical evidence of AI understanding humans reasonably well: it can take a fairly brief and vague prompt and accurately determine what we’d like it to do and how. To the extent their goals diverge from ours, they’re often very intelligible to us (such as wanting to complete the task and therefore cheating on it). There are some structural reasons for that: building on large language models imports a lot of human concepts into models.
Even aside from alignment being less challenging than some expected, the fact that there are now more people working on it provides some reason for thinking that marginal work on it might be less useful than it was in the past.
AI progress has been continuous, and medium-fast
In a world where AI progressed from largely useless to superintelligent in the course of days, there’s little opportunity for humans to figure out how to control it or wield it to their ends. In a world where AI progress is very slow, we have time to adjust our governance mechanisms to the existence of ASI. But where progress is continuous and medium-fast, some humans have enough time to notice it would likely be useful to control superintelligence and make plans to do so, without there being time for things like governments to adapt.
US governance is in a worse state than we might have expected for the point at which we hit transformative AI.
The federal government seems unusually opposed to safety-focused AI legislation (making it more likely than it might have been that AI company execs could amass unusual levels of power), and less respectful of separation of powers. It’s possible that the people running AI companies also seem more power-seeking / less safety conscious than we would have expected (though that seems less compelling to me).
How much of an update should we make?
The above all seem like solid reasons to shift some prioritisation from avoiding AI takeover to avoiding extreme human power concentration.
I’m still not sure if I buy that these risks are in the same ballpark, even if they’re now closer. My hesitation stems from a strong intuition that it’s extremely hard to predictably affect anything over the long-term. Our actions have a very strong tendency to ‘wash-out’ - often over mere years, but particularly after centuries, millennia and onwards.
There are reasons to think that AGI will make it less likely that actions wash out - for example, it will plausibly enable immortality (perhaps by uploading rather than biologically). Dictatorships have often historically been brought down by the natural death of the dictator, so the impossibility of that could really increase the chance of lock-in over the very long run. (For more on how AGI might enable authoritarian lock-in, see the MacAskill and Finnveden et al. papers linked above.) On the other hand, technological breakthroughs have historically swung the balance of power and otherwise washed out previous actions, and transformative AI will precipitate a time of unusually rapid technological progress.
There are also various things which might have causally led to a shift in views without epistemically justifying them:
- When OpenAI and DeepMind were set up, their comms about their aims were heavily focused on keeping the world safe. Things like the OpenAI board turning over have made salient how much some of the key players seem squarely focused on amassing power, and how scary that is. Being viscerally aware of that (even if the extent to which it’s the case is within the bounds of what you would have theoretically expected) makes it harder to look away from risks from human power concentration.
- The US has become more divided and tribal, making it feel more urgent and important to work on preserving democracy
- As AI progress is discussed more in the mainstream, there’s a larger coalition working on ensuring that the transition to superintelligent AI proceeds safely, many of whose values don’t extend to the extremely longrun future. Preventing extreme concentration of power is a unifying issue in the sense that lots of people find it more plausible than extinction, so this provides a (perhaps unconscious) incentive to emphasise it more.
The existence of causal reasons like those above makes me wary of over-updating towards prioritising extreme power concentration. On the other hand, I think it’s plausible that I previously focused too hard on human extinction compared to other plausible lock-in events. A couple of the reasons I might have done that:
- I have a strong intuition about it being very hard to affect things over the longrun. And there is plenty of evidence about people trying to cause lasting change and totally failing. But I think it might be a mistake to trust my intuitions about this question at all. We should expect the world to be unrecognisable after a transition to ASI, so why would we expect it to be similarly easy/hard to lock things in for the future?
- It does seem right to me that there is a chance of us being able to lock in particular values or governance structures for the extremely long run, and also a chance that preventing human extinction does not lock in a change in the value of the universe (for example, if humans went extinct decades later it would wash out the effect of saving them this decade). I think it’s easy to round the former off to ‘basically not lock-in’ and the latter to ‘basically lock-in’ in a way I don’t endorse.
All this mostly leaves me still feeling pretty unclear about how to prioritise between risks that might and might not plausibly cause human extinction. I’m inclined to prioritise those that won’t somewhat higher than I have historically. I still feel nervous that I’m partly doing that for bad reasons. Those reasons are less the explicit causal ones I listed, and more deferring to the ‘general zeitgeist’ without fully understanding its reasoning. I’d be very keen to hear other people’s views on this question.
Thanks to Arden Koehler for making this post (and so many of the things I write) much better, and to various people for the ideas behind the post, including Alex Lawsen and Nick Beckstead.

Well let me stop you right there.
https://forum.effectivealtruism.org/s/8ooZtgeWbsAxxP9bd
https://forum.effectivealtruism.org/s/wmqLbtMMraAv5Gyqn
https://forum.effectivealtruism.org/posts/CxMusuX8E5hiTXEWX/fruit-picking-as-an-existential-risk
https://forum.effectivealtruism.org/posts/zuQeTaqrjveSiSMYo/a-proposed-hierarchy-of-longtermist-concepts
https://forum.effectivealtruism.org/posts/wqmY98m3yNs6TiKeL/parfit-singer-aliens
https://forum.effectivealtruism.org/posts/zLi3MbMCTtCv9ttyz/formalizing-extinction-risk-reduction-vs-longtermism
https://forum.effectivealtruism.org/posts/WebLP36BYDbMAKoa5/the-future-might-not-be-so-great
https://forum.effectivealtruism.org/posts/WebLP36BYDbMAKoa5/the-future-might-not-be-so-great?commentId=cJdqyAAzwrL74x2mG
https://forum.effectivealtruism.org/posts/GsjmufaebreiaivF7/what-is-the-likelihood-that-civilizational-collapse-would
This isn't a comprehensive list, just some stuff off the top of my head.
Personally, I've spent a long time thinking about this and as far as I can tell, it's incredibly uncertain if (e)x(tinction)-risk reduction is positive EV (x-risk reduction is almost tautologically positive to the point of uselessness). In general I think the community got weirdly one shotted by astronomical waste-like arguments which are interesting thought experiments but don't really come close to solving cluelessness.
Thanks for the links!
I don't talk to many people who still identify as longtermist, but as someone who does, I recently wrote these arguments for why longtermists should be less extinction-focused.
The tl;dr is that I think that beyond extinction there are predictable patterns, related to entropy, that provide more nuanced ways to estimate the cost of lesser catastrophes - and that while assessing their cost precisely is unfeasible, we have no basis for thinking they're negligible compared to extinction.
Interesting that you don't come across many people these days who still identify as longtermist, that's pretty different from my experience. I think it feels more intuitive to me to identify as 'longtermist' than 'effective altruist'. The former is a claim about my values (people in the future matter morally) whereas the latter is behavioural and feels presumptuous (how altruistic really am I? Am I effective at it even when I try?). But I guess I'm in the minority on that!
Interesting post, thanks!! Excited to see discussion of it.
One thing I wonder is whether the sort of "mixed" pictures are relevant? Like, maybe you think totalitarian control of AI would be really bad because it increases the risk of human extinction from misalignment, by increasing the success rate of any attempted AI takeover / making it happen sooner in time (bc a totalitarian might e.g. give more control over to an AI they were using to stay in power).
Or, more vaguely, I sorta feel like if world governments / anyone with ambitions to power think that advanced AI will help them, that's bad news for alignment efforts.
Anyway, that could be something going on for some people I think?
Good point, thanks!
This could be entirely explainable by what is most resonant with a broader public and the fact that many of the non-extinction risks have much higher societal buy-in / are much more legible to many more people.
One reason is that longtermists are largely philosophers, who have no particular expertise on the details of aligning AI.
Another reason worth taking into consideration is if the true moral view is "fussy", rather than "easygoing". If you're "easygoing" in what you consider utopia, then, conditional on survival, most achievable value gets realized by default (we get great human lives), and extinction is the one really action-relevant lock-in event. But if you're fussy about realizing the best possible utopia, then, conditional on survival, we're still likely to miss most achievable value across a huge swathe of futures (we don't tile the universe with happy digital minds, say, or whatever crazy future might be the best utopia). The space of "didn't go extinct but missed most of the value" turns out to be enormous, and some of the features determining where in that space we land (early decisions about digital minds, population-ethics, allocation of resources during space settlement, which value-systems get amplified during the AI transition) are themselves plausibly locked-in, even if they don't feel as salient as extinction.
(But then again, maybe I'm recency biased because I just re-read Better Futures for the discussion week here on the Forum)
That doesn't settle the prioritization, and, like, the people at Forethought and 80,000 Hours are directly and explicitly working on the AI transition? So it's not like x-risk is off the table. My vibe for highly-engaged EAs is that perhaps it just feels that the main arguments about x-risk have already been made.
I agree that in deciding how much to prioritise averting extinction vs improving worlds in which we persist, it's important to think about the differences in value between (non-existence), (default survival) and (actual utopia). But that argument has been around a long while. I think Ben Garfinkel was advancing the idea that (actual utopia) - (default survival) might be much larger than (default survival) - (non-existence) in the late 2010s. I'm interested in what's changed that's affected discourse. It's possible the answer is 'more people have read arguments of this form'. But in that case people who had already read those arguments should update less than if the change is e.g. us getting more info about how difficult alignment is.
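To make the comparison between those two value gaps concrete, here is a minimal toy sketch. Everything in it is an assumption made for illustration: the function names, the three-outcome model (non-existence, default survival, actual utopia), and all of the numbers are hypothetical and not taken from any of the papers discussed. It just compares the expected-value gain from a small cut in extinction probability with the gain from a same-sized increase in the chance of utopia given survival.

```python
# Toy three-outcome model; all names and numbers are hypothetical.
def expected_value(p_ext, q_utopia, v_default, v_utopia, v_nonexist=0.0):
    """Expected value given extinction probability p_ext and, conditional
    on survival, probability q_utopia of reaching the best outcome."""
    survive = 1.0 - p_ext
    conditional = q_utopia * v_utopia + (1.0 - q_utopia) * v_default
    return p_ext * v_nonexist + survive * conditional


def gain_from_less_extinction(p_ext, q, v_default, v_utopia, delta=0.01):
    """EV gain from lowering extinction probability by delta."""
    base = expected_value(p_ext, q, v_default, v_utopia)
    return expected_value(p_ext - delta, q, v_default, v_utopia) - base


def gain_from_better_future(p_ext, q, v_default, v_utopia, delta=0.01):
    """EV gain from raising P(utopia | survival) by delta."""
    base = expected_value(p_ext, q, v_default, v_utopia)
    return expected_value(p_ext, q + delta, v_default, v_utopia) - base


# Assumed inputs: 10% extinction risk, 10% chance of utopia given survival.
P_EXT, Q = 0.10, 0.10

# "Easygoing": default survival is worth nearly as much as utopia.
easy_ext = gain_from_less_extinction(P_EXT, Q, v_default=0.9, v_utopia=1.0)
easy_fut = gain_from_better_future(P_EXT, Q, v_default=0.9, v_utopia=1.0)

# "Fussy": default survival captures only a sliver of the achievable value.
fussy_ext = gain_from_less_extinction(P_EXT, Q, v_default=0.01, v_utopia=1.0)
fussy_fut = gain_from_better_future(P_EXT, Q, v_default=0.01, v_utopia=1.0)

print(f"easygoing: extinction {easy_ext:.5f} vs better future {easy_fut:.5f}")
print(f"fussy:     extinction {fussy_ext:.5f} vs better future {fussy_fut:.5f}")
```

Under these made-up inputs the ranking flips with the size of the gap between default survival and utopia. That is only the structure of the argument, and it says nothing about which inputs are realistic or how tractable either intervention is.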
Is this post conflating EAs or AI-safety researchers/advocates with longtermists? My impression is that actually rather few EAs are strong longtermists, and AI safety researchers/advocates maybe even less - they're just united by their wish to avoid catastrophe, but differ in their understandings of the kind of catastrophe we should expect.
I think when AI safety was young and, er, weird, a lot of those talking about it were longtermists. That makes sense. But now it's become a mainstream EA concern, and even a not-weird concern in society generally, so it's unsurprising that it's become less longtermist, from a sociological perspective.
I was intending to pick out a group of people who have for years identified as EAs and longtermists but have changed what they've worked on. I was thinking it was clear in talking about EAs deprioritising a thing that I meant the ones who prioritised that highly initially, but I see how that's confusing - I'll edit to clarify.
I agree that this shift has happened over the past 2 years. But I think 2 to 3 years ago EAs were unusually focused on extinction compared to a decade ago. I remember more discussions back then around positive visions for the longterm.
For what it’s worth, we just announced our first Frontier Biodefense Fellowship at Pivotal, which is more singularly focused on avoiding extinction than most projects within AI safety (including our AI safety fellowship). Obviously the team has a range of motivations for working on biodefense, but for me weak longtermist arguments are quite central.
Interesting, I don't think I noticed that trend between 10 years ago and 2 years ago.
Cool!