
The views expressed here are my own, not those of my employers.

Summary

  • I previously estimated an astronomically low annual probability of a conflict causing human extinction by fitting Pareto distributions to increasingly rightmost sections of the tail distribution of the annual conflict deaths as a fraction of the global population. In this post, I run a similar analysis for the annual epidemic/pandemic deaths.
  • The expected damage from epidemics/pandemics is very concentrated in the most severe ones, with 94.0 % of the expected annual epidemic/pandemic deaths as a fraction of the global population coming from years in which that fraction is at least 0.1 %.
  • There has been a downward trend in the logarithm of the annual epidemic/pandemic deaths as a fraction of the global population, with the coefficient of determination (R^2) of the linear regression of it on the year being 38.5 %.
  • In contrast to the conclusion of my past analyses of terrorist attacks, wars and conflicts, a pandemic can possibly cause human extinction, at least on priors. The tail distribution of the annual epidemic/pandemic deaths as a fraction of the global population does not decay fast enough for the annual probability of a pandemic causing human extinction to become astronomically low.

Introduction

I previously estimated an astronomically low annual probability of a conflict causing human extinction by fitting Pareto distributions to increasingly rightmost sections of the tail distribution of the annual conflict deaths as a fraction of the global population. In this post, I run a similar analysis for the annual epidemic/pandemic deaths.

I focussed on Pareto distributions for simplicity, and because they are arguably the distributions most commonly used to model tail risk. One could argue they have overly thin tails due to the possibility of dragon kings, i.e. flatter sections of the right tail (in the unobserved domain), but I assume steeper sections cannot be ruled out either. Consequently, I supposed Pareto distributions still offer a good prior.

Methods

I got the epidemic/pandemic deaths from 1500 to 2023 based on data from Marani et al. 2021 on the start year, end year and deaths of each epidemic/pandemic[1]. I stipulated the deaths of each epidemic/pandemic are distributed uniformly from its start to its end year.

I assumed Marani et al.'s data underestimate the epidemic/pandemic deaths due to underreporting.

  • I got no epidemic/pandemic deaths with the procedure above in 8 of 524 (= 2023 - 1500 + 1) years, i.e. 1.53 % (= 8/524) of them. However, I wanted the deaths to be positive in all years to fairly assess the linear regression of the logarithm of the annual epidemic/pandemic deaths as a fraction of the global population on the year. So I supposed the epidemic/pandemic deaths in each of the aforementioned 8 years to be 597, which is half of the ratio between 5 k deaths (lowest value outside these 8 years; see previous footnote) and the mean epidemic/pandemic duration of 4.19 years.
  • For the other years, I considered Marani et al.'s annual epidemic/pandemic deaths as a fraction of the actual annual epidemic/pandemic deaths to be a piecewise linear function of the year[2]. I asked Marco Marani on July 2 about some points to define it, but I have not heard back[3]. So I used the function of my analysis of conflict deaths (see the sketch after this list), which is defined by the following points (in agreement with guesses from Peter Brecke, who built the Conflict Catalog dataset):

    • In 1400, 4 %.

    • In 1700, 10 %.

    • In 1800, 22.5 %.

    • In 1900, 90 %.

    • In 2000, 100 %.

      • I also relied on a value of 100 % for 2023, as the above implies underreporting decreases with time.
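
Below is a minimal sketch in Python of this underreporting adjustment, assuming a hypothetical function name; reported deaths are divided by a reporting rate interpolated linearly between the points above (the actual calculations are in the Sheet linked below).

```python
import numpy as np

# Reporting rates (reported deaths as a fraction of actual deaths) at the
# points above; np.interp interpolates linearly, and clamps years outside
# the range from 1400 to 2023 to the nearest endpoint.
ANCHOR_YEARS = (1400, 1700, 1800, 1900, 2000, 2023)
REPORTING_RATES = (0.04, 0.10, 0.225, 0.90, 1.00, 1.00)

def adjusted_deaths(year, reported_deaths):
    """Adjust the reported deaths of a year for underreporting."""
    rate = np.interp(year, ANCHOR_YEARS, REPORTING_RATES)
    return reported_deaths / rate

# Example: 1 M reported deaths in 1550 imply 14.3 M actual deaths, as the
# interpolated reporting rate is 7 % (= 0.04 + 0.5*(0.10 - 0.04)).
print(adjusted_deaths(1550, 10**6))
```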

I fit Pareto distributions to the 2, 3, … and 524 (= 2023 - 1500 + 1) years with the most epidemic/pandemic deaths as a fraction of the global population. To do this:

  • I get the slope and intercept of linear regressions of the logarithm of (sections of) the tail distribution on the logarithm of the annual epidemic/pandemic deaths as a fraction of the global population.
  • Since the tail distribution of a Pareto distribution is P(X > x) = (“minimum”/x)^“tail index”, ln(P(X > x)) = “tail index”*ln(“minimum”) - “tail index”*ln(x) = “intercept” + “slope”*ln(x), so I determine the parameters of the Pareto distributions from:
    • “Tail index” = -“slope”.
    • “Minimum” = e^(“intercept”/“tail index”).

Then I obtain the annual probability of a pandemic causing human extinction from that of the annual epidemic/pandemic deaths as a fraction of the global population exceeding 1, which is P(X > 1) = “minimum”^“tail index”. This decreases with the tail index, given the minimum of each Pareto distribution is lower than 1.
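
Here is a minimal sketch in Python of the whole fitting procedure, assuming a hypothetical array fractions containing the 524 values of the annual epidemic/pandemic deaths as a fraction of the global population (the actual calculations are in the Sheet linked below).

```python
import numpy as np
from scipy.stats import linregress

def extinction_probability(fractions, n_rightmost):
    """Fit a Pareto distribution to the n_rightmost highest annual
    epidemic/pandemic deaths as a fraction of the global population, and
    return P(X > 1), i.e. the annual probability of the deaths exceeding
    the global population."""
    x = np.sort(fractions)[-n_rightmost:]
    # Empirical tail distribution P(X > x) at each of the selected points.
    tail = np.arange(n_rightmost, 0, -1) / len(fractions)
    result = linregress(np.log(x), np.log(tail))
    tail_index = -result.slope
    minimum = np.exp(result.intercept / tail_index)
    # Note "minimum"^"tail index" simplifies to e^"intercept".
    return minimum ** tail_index

# Example: fit to the 2 rightmost points of the tail distribution.
# extinction_probability(fractions, 2)
```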

The calculations are in this Sheet.

Results

Historical annual epidemic/pandemic deaths as a fraction of the global population

Preprocessing

Basic stats

| Statistic | Annual epidemic/pandemic deaths as a fraction of the global population |
|---|---|
| Mean | 0.236 % |
| Minimum | 0 |
| 5th percentile | 1.19*10^-6 |
| 10th percentile | 3.60*10^-6 |
| Median | 0.0276 % |
| 90th percentile | 0.414 % |
| 95th percentile | 0.684 % |
| Maximum | 10.3 % |

Worst years

| N | Nth highest annual epidemic/pandemic deaths as a fraction of the global population | Year | Location of the epidemics/pandemics |
|---|---|---|---|
| 1 | 10.3 % | 1520 | Americas, Dominican Republic, Haiti, Ireland and Mexico |
| 2 | 10.1 % | 1519 | China, Dominican Republic, Haiti, Ireland and Mexico |
| 3 | 7.39 % | 1545 | Americas, China and India |
| 4 | 7.34 % | 1546 | Americas |
| 5 | 7.32 % | 1547 | Americas |
| 6 | 7.30 % | 1548 | Americas |
| 7 | 2.02 % | 1576 | Americas, England, France, Ireland and Italy |
| 8 | 2.01 % | 1577 | Americas, England, Ireland and Italy |
| 9 | 1.96 % | 1578 | Americas and France |
| 10 | 1.45 % | 1770 | England, Germany, India, Iraq, Ireland, Italy, Russia, Scotland and Sweden |

Linear regression of the logarithm of the annual epidemic/pandemic deaths as a fraction of the global population on the year

| Slope (1/year) | Intercept | Coefficient of determination |
|---|---|---|
| -0.0106 | 10.2 | 38.5 % |

Risk by severity

The minimum and maximum in the first 2 columns below refer to the annual epidemic/pandemic deaths as a fraction of the global population.

| Minimum | Maximum | Years | Years as a fraction of the total | Conditional annual epidemic/pandemic deaths as a fraction of the global population | Unconditional annual epidemic/pandemic deaths as a fraction of the global population | Unconditional annual epidemic/pandemic deaths as a fraction of the total |
|---|---|---|---|---|---|---|
| 0 | Infinity | 524 | 100 % | 0.236 % | 0.236 % | 100 % |
| 0 | 10^-6 | 6 | 1.15 % | 9.05*10^-7 | 1.04*10^-8 | 4.38*10^-6 |
| 10^-6 | 0.001 % | 74 | 14.1 % | 3.24*10^-6 | 4.57*10^-7 | 0.0193 % |
| 0.001 % | 0.01 % | 81 | 15.5 % | 0.00471 % | 0.00073 % | 0.308 % |
| 0.01 % | 0.1 % | 202 | 38.5 % | 0.0342 % | 0.0132 % | 5.58 % |
| 0.1 % | 1 % | 147 | 28.1 % | 0.368 % | 0.103 % | 43.6 % |
| 1 % | 10 % | 12 | 2.29 % | 3.50 % | 0.0802 % | 33.9 % |
| 10 % | 100 % | 2 | 0.382 % | 10.2 % | 0.0390 % | 16.5 % |

Tail distribution

Pandemics tail risk

The tail distribution decays faster for higher tail indices. A tail index of 1 (10) means annual epidemic/pandemic deaths as a fraction of the global population are 10 % (10^-10) likely to become 10 times as large.
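
This follows from the tail formula above: P(X > 10*x)/P(X > x) = ((“minimum”/(10*x))^“tail index”)/((“minimum”/x)^“tail index”) = 10^-“tail index”, which is 10 % for a tail index of 1, and 10^-10 for a tail index of 10.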

Discussion

Historical annual epidemic/pandemic deaths as a fraction of the global population

The expected damage from epidemics/pandemics is very concentrated in the most severe ones, with 94.0 % (= 0.436 + 0.339 + 0.165) of the expected annual epidemic/pandemic deaths as a fraction of the global population coming from years in which that fraction is at least 0.1 % (see the last table).

The highest annual epidemic/pandemic deaths as a fraction of the global population were 10.3 % in 1520. This involved epidemics/pandemics in the Americas, Dominican Republic, Haiti, Ireland and Mexico.

There has been a downward trend in the logarithm of the annual epidemic/pandemic deaths as a fraction of the global population, with the R^2 of the linear regression of it on the year being 38.5 %. I guess the sign of the slope is resilient to changes to my modelling of the underreporting. One may argue the aforementioned logarithm will increase in the next few decades based on inside-view factors such as technology becoming cheaper and more powerful. Nevertheless, technology also became cheaper and more powerful during the period from 1500 to 2023 covered by my data, which still suggests a decreasing logarithm of the annual epidemic/pandemic deaths as a fraction of the global population.
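
For concreteness, here is a minimal sketch in Python of this regression, again assuming a hypothetical array fractions with the annual epidemic/pandemic deaths as a fraction of the global population from 1500 to 2023.

```python
import numpy as np
from scipy.stats import linregress

def trend(fractions):
    """Regress the logarithm of the annual epidemic/pandemic deaths as a
    fraction of the global population on the year, returning the slope
    (1/year), intercept and coefficient of determination (R^2)."""
    years = np.arange(1500, 1500 + len(fractions))
    result = linregress(years, np.log(fractions))
    return result.slope, result.intercept, result.rvalue ** 2
```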

Pandemics tail risk

As I increase the lowest annual epidemic/pandemic deaths as a fraction of the global population included in the linear regression of the tail distribution, i.e. as I rely on increasingly rightmost sections of the tail, the annual probability of a pandemic causing human extinction (see the last graph):

  • Decreases to 0.0206 % for a lowest included fraction of 0.202 %.
  • Then increases to 0.0823 % for a lowest included fraction of 1.31 %.
  • Then decreases to 9.76*10^-7 for a lowest included fraction of 7.30 %.
  • Then increases to 0.00107 % for a lowest included fraction of 7.39 %.
  • Then decreases to 1.46*10^-34 for a lowest included fraction of 10.1 %.

The median superforecaster and domain expert of The Existential Risk Persuasion Tournament (XPT) predicted an annual probability of an engineered pathogen causing human extinction in the 78 years (= 2100 - 2023 + 1) from 2023 to 2100 of 1.28*10^-6 (= 1 - (1 - 10^-4)^(1/78)) and 0.0129 %[4] (= 1 - (1 - 10^-2)^(1/78)), respectively.
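
For reference, here is a minimal sketch in Python of the annualisation in the footnote, which assumes a constant annual probability over the 78 years.

```python
def annual_probability(total_probability, years=78):
    """Constant annual probability implying the given total probability
    over the given number of years."""
    return 1 - (1 - total_probability) ** (1 / years)

print(annual_probability(10**-4))  # median superforecaster: 1.28*10^-6.
print(annual_probability(10**-2))  # median domain expert: 0.0129 %.
```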

My only astronomically low annual probability of a pandemic causing human extinction, 1.46*10^-34, relies on just the 2 rightmost points of the tail. I believe these should get the most weight in predicting extinction risk, but such a tiny sample is clearly insufficient to conclude the prior annual probability of a pandemic causing human extinction is astronomically low. This is in contrast to the conclusion of my past analyses of terrorist attacks, wars and conflicts.

The tail distribution still has to eventually decay abruptly for annual epidemic/pandemic deaths as a fraction of the global population higher than the ones I studied, as deaths are limited to the global population. However, such an abrupt decrease is not borne out by past data. So, at least on priors, a pandemic can possibly cause human extinction.

  1. ^

     I assumed 5 k deaths (= (0 + 10)/2*10^3) for epidemics/pandemics qualitatively inferred to have caused less than 10 k deaths, which are coded as having caused -999 deaths, and for those said to have caused less than 10 k deaths, which are coded as having caused 0 deaths. I also considered the deaths from COVID-19, which is not in the original dataset.

  2. ^

     By Marani et al.'s annual epidemic/pandemic deaths, I mean those I got based on Marani et al.'s data, but Marani et al. 2021 do not (explicitly) provide data on the annual epidemic/pandemic deaths.

  3. ^

     Marco is the 1st author of Marani et al. 2021.

  4. ^

     I calculated these values by annualising the ones in Table 3.

Comments



Could you please expand on why you think a Pareto distribution is appropriate here? Tail probabilities are often quite sensitive to the assumptions here, and it can be tricky to determine if something is truly power-law distributed.

When I looked at the same dataset, albeit processing the data quite differently, I found that a truncated or cutoff power-law appeared to be a good fit. This gives a much lower value for extreme probabilities using the best-fit parameters. In particular, there were too few of the most severe pandemics in the dataset (COVID-19 and 1918 influenza) otherwise; this issue is visible in fig 1 of Marani et al. Could you please add the data to your tail distribution plot to assess how good a fit it is?

A final note, I think you're calculating the probability of extinction in a single year, but the worst pandemics historically have lasted multiple years. The total death toll from the pandemic is perhaps the quantity of most interest.

Thanks for the relevant points, Joshua. I strongly upvoted your comment.

Could you please expand on why you think a Pareto distribution is appropriate here?

I did not mean to suggest a Pareto distribution is appropriate, just that it is worth considering.

Tail probabilities are often quite sensitive to the assumptions here, and it can be tricky to determine if something is truly power-law distributed.

Agreed. In my analysis of conflict deaths, for the method where I used fitter:

The 5th and 95th percentile annual probability of a conflict causing human extinction are 0 and 5.02 % [depending on the distribution]


When I looked at the same dataset, albeit processing the data quite differently, I found that a truncated or cutoff power-law appeared to be a good fit. This gives a much lower value for extreme probabilities using the best-fit parameters. In particular, there were too few of the most severe pandemics in the dataset (COVID-19 and 1918 influenza) otherwise; this issue is visible in fig 1 of Marani et al. Could you please add the data to your tail distribution plot to assess how good a fit it is?

I did not get what you would like me to add to my tail distribution plot. However, I added here the coefficients of determination (R^2) of the regressions I did.

A final note, I think you're calculating the probability of extinction in a single year, but the worst pandemics historically have lasted multiple years. The total death toll from the pandemic is perhaps the quantity of most interest.

Focussing on the annual deaths as a fraction of the global population is useful because it being 1 is equivalent to human extinction. In contrast, the total epidemic/pandemic deaths as a fraction of the global population in the year in which the epidemic/pandemic started being equal to 1 does not imply human extinction. For example, a pandemic could kill 1 % of the population each year for 100 years while the population remained constant, due to births equalling the pandemic deaths plus other deaths.

However, I agree interventions should be assessed based on standard cost-effectiveness analyses. So I believe the quantity of most interest which could be inferred from my analysis is the expected annual epidemic/pandemic deaths. These would be 2.28 M (= 2.87*10^-4*7.95*10^9), obtained by multiplying:

  • My expected annual epidemic/pandemic deaths as a fraction of the global population of 2.87*10^-4, based on data from 1900 to 2023. Earlier years are arguably not that informative.
  • The population in 2021 of 7.95*10^9.

The above expected death toll would rank as the 6th highest cause of death in 2021.

For reference, based on my analysis of conflicts, using historical data from 1900 to 2000 (also adjusted for underreporting) and the population in 2021, I get an expected conflict death toll of 3.83 M (= 4.82*10^-4*7.95*10^9), which would rank just above as the 5th highest.

Here is a graph with the top 10 actual causes of death and expected conflict and epidemic/pandemic deaths:

Are the high numbers of deaths in the 1500s old world diseases spreading in the new world? If so, that seems to overestimate natural risk: the world's current population isn't separated from a larger population that has lots of highly human-adapted diseases.

In the other direction, this kind of analysis doesn't capture what I personally see as a larger worry: human-created pandemics. I know you're extrapolating from the past, and it's only very recently that these would even have been possible, but this seems at least worth noting.

Thanks for the comment, Jeff.

Are the high numbers of deaths in the 1500s old world diseases spreading in the new world?

Yes, and deaths are especially high in the 1500s given my assumption of high underreporting then.

If so, that seems to overestimate natural risk: the world's current population isn't separated from a larger population that has lots of highly human-adapted diseases.

Agreed. Personally, I guess the annual probability of a natural pandemic causing human extinction is lower than 10^-10.

In the other direction, this kind of analysis doesn't capture what I personally see as a larger worry: human-created pandemics. I know you're extrapolating from the past, and it's only very recently that these would even have been possible, but this seems at least worth noting.

I think it is interesting that:

There has been a downward trend in the logarithm of the annual epidemic/pandemic deaths as a fraction of the global population, with the R^2 of the linear regression of it on the year being 38.5 %. I guess the sign of the slope is resilient to changes to my modelling of the underreporting. One may argue the aforementioned logarithm will increase in the next few decades based on inside-view factors such as technology becoming cheaper and more powerful. Nevertheless, technology also became cheaper and more powerful during the period from 1500 to 2023 covered by my data, which still suggests a decreasing logarithm of the annual epidemic/pandemic deaths as a fraction of the global population.
