One of my pet peeves is people calling AI Safety a longtermist cause area. It is entirely reasonable to consider AI Safety a high priority cause area without putting significant value on future generations, let alone buying things like total utilitarianism or the astronomical waste arguments. Given fairly mainstream (among EAs) models of AI timelines and AI X-Risk, it seems highly likely that we get AGI within my lifetime, and that this will be a catastrophic/existential risk.
I think this misconception is actively pretty costly - I observe people introducing x-risk concerns via longtermism (eg in an EA fellowship) getting caught up in arguments over whether future people matter, total utilitarianism, whether expected value is the correct decision theory when using tiny probabilities and large utilities, etc. And people need to waste time debunking questions like ‘Is AI Safety a Pascal’s Mugging’. And yet, even if you convince people of these questions around longtermism, you still need to convince them of arguments around AI x-risk to get them to work on it.
I further think this is damaging, because the label ‘longtermist’ makes AI Safety seem like an abstract, far off thing. Probably not relevant to the world today, but maybe important to future generations, at some point, in the long-term future. And I think this is wildly incorrect. AI x-risk is a directly personal thing that will plausibly affect me, my loved ones, my community, and most people alive today. I think this is a pretty important motivator to some people to work on this - this is not some philosophical thought experiment that you only decide to work on if you buy into some particular philosophical assumptions. Instead, it entirely follows from common sense morality, so long as you buy a few empirical beliefs
To put this into perspective, I’m going to compare my probability of dying young from AI X-Risk to my probability of dying from other causes (operationalised as ‘probability of dying in the next 30 years). My headline result is that, as a 23 year old man living in the UK, conditional on dying in the next 30 years, the probability that I die due to AI x-risk is 41% (3.7% from AI, 5.3% from natural causes)
The main point I want to make in this post is that ‘AI x-risk is a pressing concern for people alive today’ is an obvious consequence of mainstream EA beliefs. Accordingly, rather than using my personal models of this, I try to combine Ajeya Cotra’s timelines report and Joseph Carlsmith’s report on AI X-risk (the best research I know of on AGI timelines and AGI x-risk respectively) to get the probability.
To get the probability of dying from any other cause, I use UK actuarial tables (giving the probability of death from all-cause mortality in each year of life). Ideally I’d have taken the probability of death by cause, but that’s much harder to find data for. Any one cause is <=25% of all-cause mortality, so showing that AI x-risk is roughly competitive with all-cause mortality shows that it’s by far the most likely cause of death.
Concretely, I take the distribution of the probability of AGI in each year from Ajeya’s spreadsheet (column B in sheet 2); the probability of AI x-risk from Joseph’s report (section 8) - this is 7.9% conditioning on AGI happening; and the probability of dying each year from natural causes starting from my current age (in the UK). To get my headline result of 41%, I take the ratio of the cumulative probabilities in 2052
Over the next 40 years, this gets the following chart:
See my spreadsheet to play around with the numbers yourself.
- The biggest caveat is that these numbers are plausibly biased pretty far upwards - I’ve seen basically no one outside of the EA community do a serious analysis on the question of AGI timelines or AGI x-risk, so the only people who do it are people already predisposed towards taking the issue seriously
- Another big caveat is that the numbers are juiced up heavily by the fact that I am conditioning on dying young - if I extend the timelines long enough, then probability of dying naturally goes to 100% while probability of dying from AI X-risk goes to 7.9%
- Is this enough to justify working on AI X-risk from a purely selfish perspective?
- Probably not - in the same way that it’s not selfish to work on climate change. The effect any one person can have on the issue is tiny, even if the magnitude that it affects any individual is fairly high.
- But this does help it appeal to my deontological/virtue ethics side - I am directly working on one of the world’s most pressing problems. This isn’t some abstract nerdy problem that’s an indulgence to work on over bednets, it’s a real danger to most people alive today, and something worthy and valuable to work on.
- My personal take is that the numbers used are too low, and this matches my sense of the median AI Safety researchers opinion. My personal rough guess would be 25% x-risk conditional on making AGI, and median AGI by 2040, which sharply increase the probability of death from AI to well above natural causes.
- If you take Yudkowsky-style numbers you end up with even more ridiculous results, though IMO that take is wildly overblown and overly pessimistic.
- I’ve taken the 7.9% number from Joseph Carlsmith’s report, and assumed it’s the same, regardless of the year we get AGI. I expect it’s actually substantially higher with short timelines as we have less time to solve alignment.
- I’ve used actuarial tables for probability of death at each age today. I expect probability of death to go down over time with medical advances. In particular, if we do get close to AGI, I expect the rate of biotech advances to significantly accelerate.
- Does this mean that AI Safety is obviously the most important cause area even under a person-affecting view?
- Eh, probably not. You also need to think through tractability, cost effectiveness, what fraction of people alive today are alive by each time, etc. But I do expect it goes through if you just put some value on the long-term future (eg 2-10x the value of today), rather than 10^30 times more value.
- EDIT: This assumes all other forms of catastrophic risk (bio, nuclear, climate change, etc) are negligible. As pointed out by Jack_S, this is (depressingly) a pretty suspect assumption!
- In particular, just because AI is the dominant existential risk does not mean that it's significantly higher in terms of a personal risk of death, and I was incorrectly anchored on existential risk estimates. It's notably more plausible to me that future GCBRs kill, eg, 10-50% of the world population than that they kill 100%, which is still pretty substantial in terms of a personal risk of death.
- This doesn't really change the core point of 'AI Safety is important in a visceral, current-day sense', but does mean that this post shouldn't be taken as an argument for prioritising AI x-risk over over GCRs.
Meta: I wrote this under a pseudonym because I mildly prefer this kind of post to not come up when you Google my name - I don’t expect it to be hard to de-anonymise me, but would rather people didn’t!
Andrew Leigh (Australian MP) has a 2021 book called What's the Worst That Could Happen?.
The book blurb starts as follows:
Book review EA Forum post here
This estimate is taken from Toby Ord's The Precipice, right? I remember 1/6 from there.
Technical note: I think we need to be careful to note the difference in meaning between extinction and existential catastrophe. When Joseph Carlsmith talks about existential catastrophe, he doesn't necessarily mean all humans dying; in this report, he's mainly concerned about the disempowerment of humanity. Following Toby Ord in The Precipice, Carlsmith defines an existential catastrophe as "an event that drastically reduces the value of the trajectories along which human civilization could realistically develop". It's not straightforward to translate his estimates of existential risk to estimates of extinction risk.
Of course, you don't need to rely on Joseph Carlsmith's report to believe that there's a ≥7.9% chance of human extinction conditioning on AGI.
I face enormous challenges convincing people of this. Many people don't see, for example, widespread AI-empowered human rights infringements as an 'existential catastrophe' because it doesn't directly kill people, and as a result it falls between the cracks of AI safety definitions - despite being a far more plausable threat than AGI considering it's already happening. Severe curtailments to humanity's potential still firmly count as an existential risk in my opinion.
I have suggested we stop conflating positive and negative longtermism. I found, for instance, the Precipice hard to read because of the way he flipped back and forth between the two.
Thanks for calling this out, whoever you are. This is another reason why, for the last ~5 years or so, I've been pushing for the terminology "AI x-risk" and "AI x-safety" to replace "(long-term) AI risk" and "(long-term) AI safety". For audiences not familiar with the "x" or "existential" terminology, one can say "large scale risk" to point the large stakes, rather than always saying "long term".
(Also, the fact that I don't know who you are is actually fairly heartening :)
"AGI risk" and "AGI safety" also seem good. But they don't immediately convey the high stakes.
I find this post really uncomfortable. I thought I'd mention that, even though I'm having a hard time putting my finger on why. I'll give a few guesses as a reply to this comment so they can be discussed separately.
I personally feel queasy about telling people that they might die in some detail as an epistemic move, even when it's true. I'm worried that this will move people to be less playful and imaginative in their thinking, and make worse intellectual, project, or career decisions overall, compared to more abstract/less concrete considerations of "how can I, a comfortable and privileged person with most of my needs already met, do the most good."
(I'm not saying my considerations here are overwhelming, or even that well-thought-out).
I am inclined to agree, but to take the other side for a minute, if I imagine being very confident it was true, I feel like there would be important truths that I would only realize if I communicated that knowledge deep into my System 1.
I'm also confused about this. One framing I sometimes have is
And my guess is that I won't act the same way. Like I'd care less about work-life balance, be more okay with life not going as well, be more committed with x-risk stuff always being the top of my mind, etc. And framed that way, I feel more attuned to the change of S1 "getting it" being positive.
Interesting, can you say more? I have the opposite intuition, though here this stems from the specific failure mode of considering AI Safety as weird, speculative, abstract, and only affecting the long-term future - I think this puts it at a significant disadvantage compared to more visceral and immediate forms of doing good, and that this kind of post can help partially counter that bias.
Hmm I want to say that I'm not at all certain what my all-things-considered position on this is.
But I think there are several potential cruxes that I'm framing the question around:
I have a bunch of thoughts on each of those, but they're kinda jumbled and not very coherent at the moment.
One possible related framing is "what types of people/thinking styles/archetypes does the EA or AI safety community most need*?" Do we need:
In the above archetypes, I feel good about "visceralness" for soldiers and maybe generals, but not for philosophers. I think I feel bad about "contemplating your own death" for all three, but especially philosophers and generals (A general who obsesses over their own death probably will make more mistakes because they aren't trying as hard to win).
Perhaps I'm wrong. Other general-like archetypes I've considered is "scientists on the Manhattan Project," and I feel warmer about Manhattan Project scientists having a visceral sense of their own death than for generals. Perhaps I'd be interested in reading about actual scientists trying to solve problems that have a high probability of affecting them one day (e.g. aging, cancer and heart disease researchers). Do they find the thought that their failures may be causally linked to their own death motivating or just depressing?
*as a method to suss out both selection: who should we most try to attract? And training: which virtues/mindsets is it most important to cultivate?
Yeah, this resonates with me as well.
One reason I might be finding this post uncomfortable is that I'm pretty concerned about the mental health of many young EAs, and frankly for some people I met I'm more worried about the chance of them dying from suicide or risky activities over the next decade than from x-risks. Unfortunately I think there is also a link between people who are very focused on death by x-risk and poor mental health. This is an intuition, nothing more.
I share this concern, and this was my biggest hesitation to making this post. I'm open to the argument that this post was pretty net bad because of that.
If you're finding things like existential dread concerning, I'll flag that the numbers in this post are actually fairly low in the grand scheme of total risks to you over your life - 3.7% just isn't that high. Dying young just isn't that likely.
You know, even disregarding AI, I'd never have thought that I had a ~5% chance of dying in the next 30 years. It's frightening.
I wouldn't take this as bearing on the matter that you replied to in any way, though.
I'm sorry that the post made you uncomfortable, and appreciate you flagging this constructively. Responses in thread.
One reason I might be finding this post uncomfortable is the chart it's centered around.
The medical information is based on real people who have died recently. It's a forecast based on counting. We can have a lot of confidence in those numbers.
In contrast, the AI numbers are trying to predict something that's never happened before. It's worth trying to predict, but the numbers are very different, and we can't have much confidence in them especially for one particular year.
It feels kind of misleading to try to put these two very different kinds of numbers side by side as if they're directly comparable.
I fairly strongly disagree with this take on two counts:
Re 2 the right way to compare high and low confidence numbers is to add error bounds. This chart does not do that.
I disagree because I think error bounds over probabilities are less principled than a lot of people assume, and they can add a bunch of false confidence.
More false confidence than not mentioning error ranges at all?
Yes. Quantitive expressions of credal resilience is complicated, there isn't a widely-shared-upon formulation, and a lot of people falsely assume that error bounds on made-up probabilities are more "rigorous" or "objective" than the probabilities themselves.
The issue is that by putting together high-confidence (relatively) and low-confidence estimates in your calculation, your resulting numbers should be low-confidence. For example, if your error bounds for AI risk vary by an order of magnitude each way (which is frankly insanely small for something this speculative) then the error bounds in your relative risk estimate would give you a value between 0.6% and 87%. With an error range like this, I don't think the statement "my most likely reason to die young is AI x-risk" is justified.
Hmm. I agree that these numbers are low confidence. But for the purpose of acting and forming conclusions from this, I'm not sure what you think is a better approach (beyond saying that more resources should be put into becoming more confident, which I broadly agree with).
Do you think I can never make statements like "low confidence proposition X is more likely than high confidence proposition Y"? What would feel like a reasonable criteria for being able to say that kind of thing?
More generally, I'm not actually sure what you're trying to capture with error bounds - what does it actually mean to say that P(AI X-risk) is in [0.5%, 50%] rather than 5%? What is this a probability distribution over? I'm estimating a probability, not a quantity. I'd be open to the argument that the uncertainty comes from 'what might I think if I thought about this for much longer'.
I'll also note that the timeline numbers are a distribution over years, which is already implicitly including a bunch of uncertainty plus some probability over AI never. Though obviously it could include more. The figure for AI x-risk is a point estimate, which is much dodgier.
And I'll note again that the natural causes numbers are at best medium confidence, since they assume the status quo continues!
Nitpick: I think you mean 6%? (0.37/(0.37+5.3) = 0.06). Obviously this doesn't change your core point.
Honestly, yeah, I think it is a weird statement to definitively state that X wildly speculative thing is more likely than Y well known and studied thing (or to put it differently, when the error bounds of X are orders of magnitude different from the error bounds in Y). It might help if you provided a counterexample here? I think my objections here might be partially on the semantics, saying "X is more likely than Y" seems like a smuggling of certainty into a very uncertain proposition.
I think it elucidates more accurately the state of knowledge about the situation, which is that you don't know much at all.
(also, lol, fair point on the calculation error)
I don't disagree with the premise that agreeing on empirical beliefs about AI probably matters more for whether someone does AI safety work than philosophical beliefs. I've made that argument before!
I agree that your risk of dying from misaligned AI in an extinction event in the next 30 years is much more than 3.7%. Actually, Carlsmith would too -- he more than doubled his credence in AI existential catastrophe by 2070 since sharing the report (see, e.g., the "author's note" at the end of the arxiv abstract).
(Edit: modulo mic's observation, but still.)
Current AI projects will easily accelerate present trends toward a far higher likelihood of existential catastrophic events. Risks are multiplied by the many uncoordinated global AI projects and their various experimental applications, particularly genetic engineering in less fastidiously scientific jurisdictions; but also the many social, political, and military applications of misaligned AI. AI safety work would be well-intentioned but irrelevant as these genies won't be put back into every AI 'safety bottle'. Optimistically, as we have survived our existential risks for quite some time, we may yet find a means to survive the Great Filter of Civilizations challenge presented by Fermi's Paradox.
Thank you for writing this. I've been repeating this point to many people and now I can point them to this post.
For context, for college-aged people in the US, the two most likely causes of death in a given year are suicide and vehicle accidents, both at around 1 in 6000. Estimates of global nuclear war in a given year are comparable to both of these. Given a AGI timeline of 50% by 2045, it's quite hard to distribute that 50% over ~20 years and assign much less than 1 in 6000 to the next 365 days. Meaning that even right now, in 2022, existential risks are high up on the list of most probable causes of death for college aged-people. (assuming P(death|AGI) is >0.1 in the next few years)
One project I've been thinking about is making (or having someone else make) a medical infographic that takes existential risks seriously, and ranks them accurately as some of the highest probability causes of death (per year) for college-aged people. I'm worried about this seeming too preachy/weird to people who don't buy the estimates though.
I'd be excited to see this, though agree that it could come across as too weird, and wouldn't want to widely and publicly promote it.
If you do this, I recommend trying to use as reputable and objective a source as you can for the estimates.
Seconding this—I’m definitely personally curious what such a chart would look like!
The infographic could perhaps have a 'today' and a 'in 2050' version, with the bubbles representing the risks being very small for AI 'today' compared to eg suicide, or cancer or heart disease, but then becoming much bigger in the 2050 version, illustrating the trajectory. Perhaps the standard medical cause of death bubbles shrink by 2050 illustrating medical progress.
I think the probability of death would go significantly up with age, undercutting the effect of this.
Yes, that's true for an individual. Sorry, I was more meaning the 'today' infographic would be for a person born in say 2002, and the 2050 one for someone born in eg 2030. Some confusion because I was replying about 'medical infographic for x-risks' generally rather than specifically your point about personal risk.
Mod note: I've enabled agree-disagree voting on this thread. This is still in the experimental phase, see the first time we did so here. Still very interested in feedback.
For experimentation purposes, I think the agree-disagree voting should probably be enabled right after the post is published. Otherwise you get noisy data: some of the upvotes/downvotes will just reflect the fact that the option wasn't enabled back when those votes were cast.
I agree that would be a cleaner experiment, but I'm still finding it interesting (if a little jarring because I'm used to downvotes being bad!)
Maybe people who want to contribute to the experiment can coordinate their posts with JP? (e.g. via sending a PM?)
I agree. I was on vacation being American yesterday so I didn't notice it when it was published.
Thanks for the post, this is definitely a valuable framing.
But I'm a bit concerned that the post creates a misleading impression that the whole catastrophic/ speculative risk field is completely overwhelmed by AI x-risk.
Assuming you don't believe that other catastrophic risks are completely negligible compared to AI x-risk, I'd recommend adding a caveat that this is only comparing AI x-risk and existing/ non-speculative risks. If you do think AI x-risk overwhelms other catastrophic risks, you should probably mention that too.
Wow, lots of disagreement points, I'm curious what people disagree with.
I didn't vote on your comment on either scale, but FWIW my guess is that the disagreement is due to quite a few people having the view that AI x-risk does swamp everything else.
I suspected that, but it didn't seem very logical. AI might swamp x-risk, but seems unlikely to swamp our chances of dying young, especially if we use the model in the piece.
Although he says that he's more pessimistic on AI than his model suggests, in the model, his estimates are definitely within the bounds that other catastrophic risks would seriously change his estimates.
I did a rough estimate with nuclear war vs. natural risk (using his very useful spreadsheet, and loosely based on Rodriguez' estimates) (0.39% annual chance of US-Russia nuclear exchange, 50% chance of a Brit dying in it; I know some EAs have made much lower estimates, but this seems in line with the general consensus). In this model, nuclear risk comes out a bit higher than 'natural' over 30 years.
Even if you're particularly optimistic about other GCRs, if you add all the other potential catastrophic/ speculative risks together (pandemics, non-existential AI risk, nuclear, nano, other), I can't imagine them not shifting the model.
Huh, I appreciate you actually putting numbers on this! I was suprised at nuclear risk numbers being remotely competitive with natural causes (let alone significantly dominating over the next 20 years), and I take this as an at least mild downwards update on AI dominating all other risks (on a purely personal level). Probably I had incorrect cached thoughts from people exclusively discussing extinction risk rather than just catastrophic risks, but from a purely personal perspective this distinction matters much less.
EDIT: Added a caveat to the post accordingly
I thought it might be that people simply didn't find the chart misleading, they thought it was clear enough and didn't need any more caveats.
There are person-affecting views according to which future beings still matter substantially, e.g. wide or asymmetric ones. I don't personally find narrow symmetric person-affecting views very plausible, since they would mean causing someone to come to exist and have a life of pure torture and misery wouldn't be bad for that person (although later preventing their suffering after they exist would be good, if possible), and significantly discounting such harms just because the individual doesn't yet exist seems really wrong to me. So, it's hard to me to justify discounting all of the possible sources of (dis)value in the far future, except to the extent that predictably improving things in the far future is hard.
For general implications of some impartial and asymmetric (but non-antinatalist) person-affecting views, see: https://globalprioritiesinstitute.org/teruji-thomas-the-asymmetry-uncertainty-and-the-long-term/
On asymmetric person-affecting views, negative utilitarianism and other downside-focused ethical views, AI safety looks promising, although the relative focus between interventions in our community is plausibly wrong, as s-risks and other quality risks seem relatively neglected. For s-risks, see: https://forum.effectivealtruism.org/posts/225Aq4P4jFPoWBrb5/cause-prioritization-for-downside-focused-value-systems https://longtermrisk.org/research-agenda
While promoting AI safety on the basis of wrong values may increase AI safety work, it may also increase the likelihood that AI will have wrong values (plausibly increasing the likelihood of quality risks), and shift the values in the EA community towards wrong values. It's very plausibly worth the risks, but these risks are worth considering.
I'm personally pretty unconvinced of this. I conceive of AI Safety work as "solve the problem of making AGI that doesn't kill everyone" more so than I conceive of it as "figure out humanity's coherent extrapolated vision and load it into a sovereign that creates a utopia". To the degree that we do explicitly load a value system into an AGI (which I'm skeptical of), I think that the process of creating this value system will be hard and messy and involve many stakeholders, and that EA may have outsized influence but is unlikely to be the deciding voice.
Having outsized influence could be enough, when we're considering the (dis)value in the far future at stake, which is still much larger than from the deaths of everyone killed by AI. What ratio of probabilities between influencing values in a better direction vs preventing extinction would you assign? Is the ratio small enough to give less overall weight to the expected far future impact than the reduction in risk of everyone killed by AI?
(FWIW, I don't think it's strictly necessary to explicitly "load" a value system to influence the kinds of values an AI system might have.)
I strongly agree. I'm glad you wrote this post. I tweeted about this a few weeks ago.
I have also had reasonable success in convincing normies that worrying about AI risk is sensible, or at least not crazy, in the last few months. Oddly enough, GPT-3 hasn't impressed people as much as DALLE-2, even though the former seems scarier/more capable to me.
I have yet to bring it up in policy circles, but I think the Overton window on AI risk has widened quite a bit.
My brother was recently very freaked out when I asked him to pose a set of questions that he thinks an AI wouldn’t be able to answer, and GPT-3 gave excellent-sounding responses to his prompts.
I made this same point a couple months ago--happy to hear more people pointing it out:
Thanks for writing this--even though I've been familiar with AI x-risk for a while, it didn't really hit me on an emotional level that dying from misaligned AI would happen to me too, and not just "humanity" in the abstract. This post changed that.
Might eventually be useful to have one of these that accounts for biorisk too, although biorisk "timelines" aren't as straightforward as trying to estimate the date that humanity builds the first AGI.
We can quibble over the numbers but I think the point here is basically right, and if not right for AI then probably right for biorisk or some other risks. That point being even if you only look at probabilities in the next few years and only care about people alive today, then these issues appear to be the most salient policy areas. I've noted in a recent draft that the velocity of increase in risk (eg from some 0.0001% risk this year, to eg 10% per year in 50 years) results in issues with such probability trajectories being invisible to eg 2-year national risk assessments at present even though area under curve is greater in aggregate than every other risk. But in a sense potentially 'inevitable' (for the demonstration risk profiles I dreamed up) over a human lifetime. This then begs the question of how to monitor the trajectory (surely this is one role of national risk assessment, to invest in 'fire alarms', but this then requires these risks to be included in the assessment so the monitoring can be prioritized). Persuading policymakers is definitely going to be easier by leveraging decade long actuarial tables than having esoteric discussions about total utilitarianism.
Additionally, in the recent FLI 'World Building Contest' the winning entry from Mako Yass made quite a point of the fact that in the world he built the impetus for AI safety and global cooperation on this issue came from the development of very clear and very specific scenario development of how exactly AI could come to kill everyone. This is analogous to Carl Sagan/Turco's work on nuclear winter in the early 1980s , a specific picture changed minds. We need this for AI.
This can be taken further - if your main priority is people alive today (or yourself) - near term catastrophic risks that aren't x-risks become as important. So, for example, while it may be improbable for a pandemic to kill everyone, I think it's much more probable that one kills, say, at least 90% of people. On the other hand I'm not sure the increase in probability from AI killing everyone to AI killing at least 90% of people is that big.
Then again, AI can be misused much worse than other things. So maybe the chance that it doesn't kill me but still, for example, lets a totalitarian government enslave me, is pretty big?
You should also consider the impacts on nonhuman animals (farmed and wild), which could easily dominate, but may not have the same sign as the impacts on humans. Plus, animal welfare interventions may look more promising overall.
On the other hand, there's also a possibility that humans alive today will live extremely long lives, because of AI going well.
Thank you for this post, I have been trying to make this point to people! (about catastrophic risk in general not necessarily AI)
I wrote something similar recently which has a little interactive calculator for your risk based on age/gender if people are interested
Great post! Thanks for writing it! I'm not great at probability so just trying to understand the methodology.