This post was written for Convergence Analysis. In it, I summarise and analyse existing ideas more than proposing new ones. Some readers may already be familiar with much of this.
Existential risks are considered by many to be among the most pressing issues of our time (see e.g. The Precipice). But what, precisely, do we mean by “existential risks”?
To clarify this, this post will:
Quote prominent definitions of existential risk or existential catastrophe
Highlight three distinctions which are arguably obvious, but are also often overlooked:
- Existential risk vs existential catastrophe
- Existential catastrophe vs extinction
- Existential vs global catastrophic risks
Discuss two nuances of the concept of an existential catastrophe, regarding:
- How much of our potential must be destroyed?
- Does the catastrophe have to be a specific “event”? Does it have to look “catastrophic”?
Bostrom and Ord’s definitions
I believe that the term existential risk was introduced in this context by Nick Bostrom in a 2002 paper, where he defined such a risk as:
One where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.
Bostrom later (2012) updated this to the definition you’ve probably heard most often:
An existential risk is one that threatens the premature extinction of Earth-originating intelligent life or the permanent and drastic destruction of its potential for desirable future development.
More recently, in The Precipice (2020), Toby Ord gives the following definitions:
An existential catastrophe is the destruction of humanity’s longterm potential.
An existential risk is a risk that threatens the destruction of humanity’s longterm potential.
Three important distinctions
Existential risk ≠ existential catastrophe
An existential risk is the risk of an existential catastrophe occurring; it is not itself the destruction of humanity’s potential. Unfortunately, the term existential risk often seems to be used as if it can refer to either the risk or the catastrophe, while the term existential catastrophe seems to be used relatively rarely.
For example, Millett and Snyder-Beattie write “For the purposes of this model, we assume that for any global pandemic arising from this kind of research, each has only a one in ten thousand chance of causing an existential risk” (emphasis added). As Beard et al. note (in their appendix):
It is not precisely clear whether the authors mean that one in ten thousand pandemics are predicted to cause extinction, or whether one in ten thousand pandemics will have a risk of extinction. The latter reading is implausible because surely there is at least a risk, however small, that any global pandemic would cause extinction.
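To make the ambiguity concrete, here is a minimal probability sketch of the first reading. The numbers and function are purely illustrative assumptions for this post, not Millett and Snyder-Beattie’s actual model:

```python
# Illustrative only: hypothetical numbers, not the paper's actual estimates.
# Reading 1: each global pandemic has a 1-in-10,000 chance of causing an
# existential catastrophe (here simplified to extinction).
P_CATASTROPHE_PER_PANDEMIC = 1e-4

def p_any_catastrophe(n_pandemics: int,
                      p_per_pandemic: float = P_CATASTROPHE_PER_PANDEMIC) -> float:
    """Probability that at least one of n independent pandemics causes the catastrophe."""
    return 1 - (1 - p_per_pandemic) ** n_pandemics

# Under reading 2 ("one in ten thousand pandemics will have a risk of extinction"),
# the other 9,999 would carry zero risk -- which, as Beard et al. note, is implausible.
```

On the first reading, the per-pandemic probability compounds across pandemics (roughly 1% cumulative risk over 100 pandemics); on the second, most pandemics would carry no risk at all, which is the reading Beard et al. reject.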
Existential catastrophe ≠ extinction
People often seem to say existential risk when they’re actually referring specifically to extinction risk (e.g., in this post, this podcast, and this post). Extinction of humanity is certainly one type of existential catastrophe, and perhaps the most likely one. But it’s not the only one. In The Precipice, Ord divides existential catastrophes into extinction and “failed continuation”, with the latter comprising unrecoverable collapse and unrecoverable dystopia.
In line with Cotton-Barratt and Ord, I recommend that, if you are specifically talking about extinction or extinction risk, you use those terms, rather than the terms existential catastrophe or existential risk. This should avoid unnecessary jargon and confusion.
Existential risk ≠ global catastrophic risk
Bostrom & Ćirković use the term global catastrophic risk “to refer, loosely, to a risk that might have the potential to inflict serious damage to human well-being on a global scale.” A variety of other definitions can be found here, all of which refer to a wider set of risks than existential risk does. Unfortunately, people sometimes seem to:
- use the term existential risk to refer to events that don’t actually meet the extremely high bar of destroying the vast majority of humanity’s potential
- use the terms existential risk and global catastrophic risk as if they’re interchangeable (e.g., here and here)
To avoid confusion and concept creep, this should be avoided.
How much of our potential must be destroyed?
Ord (2020) writes that “An existential risk is a risk that threatens the destruction of humanity’s longterm potential.” He also states explicitly that:
This includes cases where the destruction is complete (such as extinction) and where it is nearly complete, such as a permanent collapse of civilisation in which the possibility for some very minor types of flourishing remain, or where there remains some remote chance of recovery. I leave the thresholds vague, but it should be understood that in any existential catastrophe the greater part of our potential is gone and very little remains.
It seems like a good idea to have the term capture such “nearly complete” cases.
That said, at least to me, it seems that “destruction of humanity’s longterm potential” could be read as meaning only its complete destruction. So I’d personally be inclined to tweak Ord’s definitions to:
An existential catastrophe is the destruction of the vast majority of humanity’s long-term potential.
An existential risk is a risk that threatens the destruction of the vast majority of humanity’s long-term potential.
But what if some of humanity’s long-term potential is destroyed, but not the vast majority of it? Given Ord and Bostrom’s definitions, I think the risk of that should not be called an existential risk, and its occurrence should not be called an existential catastrophe. Instead, I’d put such possibilities alongside existential catastrophes in the broader category of things that could cause “persistent trajectory changes”. More specifically, I’d put them in a category I’ll term “non-existential trajectory changes” in an upcoming post. (Note that “non-existential” does not mean “not important”.)
Does an existential catastrophe have to be a specific “event”? Does it have to look “catastrophic”?
It seems to me that discussions of existential risk or catastrophe typically focus on relatively discrete “events”, and typically those that would look “catastrophic”. For example:
- A “treacherous turn” from a misaligned AI
- A global pandemic from a bioengineered pathogen
- A nuclear war and ensuing nuclear winter
These events clearly match the standard, layperson concept of a “catastrophe”: they’d occur over a fairly short time period, have fairly clear start and end points, and involve “destruction” that we’d notice as it happens. The full descent into extinction, unrecoverable collapse, or unrecoverable dystopia might take years in some such scenarios, but it would still have been sparked by a clear-cut “catastrophe”.
But the term “existential catastrophe” can also apply to “slower moving” catastrophes. For example, when discussing climate change, Ord (2020) writes:
Unlike many of the other risks I address, the central concern here isn’t that we would meet our end this century, but that it may be possible for our actions now to all but lock in such a disaster for the future. If so, this could still be the time of the existential catastrophe - the time when humanity’s potential is destroyed.
And when giving his estimates of the chance of existential catastrophe from various outcomes during the next 100 years, he writes “when the catastrophe has delayed effects, like climate change, I’m talking about the point of no return coming within 100 years”.
Additionally, I think the term “existential catastrophe” should be able to apply to scenarios where there’s no obvious “catastrophe” (in the standard sense) at all. For example, Ord writes:
If our potential greatly exceeds the current state of civilisation, then something that simply locks in the current state would count as an existential catastrophe. An example would be an irrevocable relinquishment of further technological progress.
It may seem strange to call something a catastrophe due to merely being far short of optimal. [...] But consider, say, a choice by parents not to educate their child. There is no immediate suffering, yet catastrophic longterm outcomes for the child may have been locked in.
And his “plausible examples” of a “desired dystopia”, one type of existential catastrophe, include:
worlds that forever fail to recognise some key form of harm or injustice (and thus perpetuate it blindly), worlds that lock in a single fundamentalist religion, and worlds where we deliberately replace ourselves with something that we didn’t realise was much less valuable (such as machines incapable of feeling).
Summary
- Existential risks are distinct from existential catastrophes, extinction risks, and global catastrophic risks.
- An existential catastrophe involves the destruction of the vast majority of humanity’s potential - not necessarily all of humanity’s potential, but more than just some of it.
- Existential catastrophes could be “slow-moving” or not apparently “catastrophic”; at least in theory, our potential could be destroyed slowly, or without this being noticed.
Arguably, reducing existential risks should be a top priority of our time. I think one way to improve our existential risk reduction efforts is to clarify and sharpen our thinking, discussion, and research, and one way to do that is to clarify and sharpen our key concepts. I hope this post has helped on that front.
In upcoming posts, I’ll discuss two further complexities with the concept of existential catastrophe:
- The idea that an existential catastrophe is really the destruction of the potential of humanity or its “descendants”; it’s not necessarily solely about human wellbeing, nor solely about Homo sapiens’ potential.
- How an existential catastrophe is (I believe) distinct from a scenario in which humanity maintains its potential but never uses it well, and the implications of that alternative possibility.
This is one of a series of posts I plan to write that summarise, comment on, or take inspiration from parts of The Precipice. You can find a list of all such posts here.
In general, when I imply one meaning of a term is the “correct” or “actual” meaning, I’m really focused on the “useful” meaning, “standard” meaning, or meaning that’s “consistent with explicit definitions”.
I don’t believe Bostrom makes explicit what he means by “potential” in his definitions. Ord writes “I’m making a deliberate choice not to define the precise way in which the set of possible futures determines our potential”, and then discusses that point. I’ll discuss the matter of “potential” more in an upcoming post.
This assumes we don’t classify as “extinction” scenarios such as:
- humanity being “replaced” by a “descendant” which we’d be happy to be replaced by (e.g., whole brain emulations or a slightly different species that we evolve into)
- “humanity (or its descendants) [going] extinct after fulfilling our longterm potential” (Ord)
Seemingly relevantly, Bostrom’s classification of types of existential risk (by which I think he really means “types of existential catastrophe”) includes “plateauing — progress flattens out at a level perhaps somewhat higher than the present level but far below technological maturity”, as well as “unconsummated realization”. Both types seem to me like they could occur in ways such that the catastrophe is very slow or not really recognised by anyone as a catastrophe.
And in Paul Christiano’s description of what “failure” might look like in the context of AI alignment, he writes: “As this world goes off the rails, there may not be any discrete point where consensus recognizes that things have gone off the rails.” Christiano doesn’t use the term “existential catastrophe” in that post, but it seems to me that the scenario he describes would count as one.