# Short summary

There are good reasons to care about sentient beings living in the millions of years to come. Caring about the future of sentience is sometimes taken to imply reducing the risk of human extinction as a moral priority. However, this implication is not obvious so long as one is uncertain whether a future with humanity would be better or worse than one without it.

In this article, we try to give an all-things-considered answer to the question: “Is the expected value of efforts to reduce the risk of human extinction positive or negative?” Among other things, we cover the following points:

• What happens if we simply tally up the welfare of current sentient beings on earth and extrapolate into the future; and why that isn’t a good idea
• Thinking about the possible values and preferences of future generations, how these might align with ours, and what that implies
• Why the “option value argument” for reducing extinction risk is weak
• How the potential of a non-human animal civilisation or an extra-terrestrial civilisation taking over after human extinction increases the expected value of extinction risk reduction
• Why, if we had more empirical insight or moral reflection, we might have moral concern for things outside of earth, and how that increases the value of extinction risk reduction
• How avoiding a global catastrophe that would not lead to extinction can have very long-term effects

# Long Summary

If most expected value or disvalue lies in the billions of years to come, altruists should plausibly focus their efforts on improving the long-term future. It is not clear whether reducing the risk of human extinction would, in expectation, improve the long-term future, because a future with humanity may be better or worse than one without it.

From a consequentialist, welfarist view, most expected value (EV) or disvalue of the future comes from scenarios in which (post-)humanity colonizes space, because these scenarios contain most expected beings. Simply extrapolating the current welfare (part 1.1) of humans and farmed and wild animals, it is unclear whether we should support spreading sentient beings to other planets.

From a more general perspective (part 1.2), future agents will likely care morally about the same things we find valuable or about any of the things we are neutral towards. It seems very unlikely that they would see value exactly where we see disvalue. If future agents are powerful enough to shape the world according to their preferences, this asymmetry implies the EV of future agents colonizing space is positive from many welfarist perspectives.

If we can defer the decision about whether to colonize space to future agents with more moral and empirical insight, doing so creates option value (part 1.3). However, most expected future disvalue plausibly comes from futures controlled by indifferent or malicious agents. Such “bad” agents will make worse decisions than we, currently, could. Thus, the option value in reducing the risk of human extinction is small.

The universe may not stay empty, even if humanity goes extinct (part 2.1). A non-human animal civilization, extraterrestrials or uncontrolled artificial intelligence that was created by humanity might colonize space. These scenarios may be worse than (post-)human space colonization in expectation. Additionally, with more moral or empirical insight, we might realize that the universe is already filled with beings or things we care about (part 2.2). If the universe is already filled with disvalue that future agents could alleviate, this gives further reason to reduce extinction risk.

In practice, many efforts to reduce the risk of human extinction also have other effects of long-term significance. Such efforts might often reduce the risk of global catastrophes (part 3.1) from which humanity would recover, but which might set technological and social progress on a worse track than they are on now. Furthermore, such efforts often promote global coordination, peace and stability (part 3.2), which is crucial for safe development of pivotal technologies and to avoid negative trajectory changes in general.

Aggregating these considerations, efforts to reduce extinction risk seem positive in expectation from most consequentialist views, ranging from neutral on some views to extremely positive on others. As efforts to reduce extinction risk also seem highly leveraged and time-sensitive, they should probably hold a prominent place in the long-termist EA portfolio.

# Introduction and background

The future of Earth-originating life might be vast, lasting millions of years and containing many times more beings than currently alive (Bostrom, 2003). If future beings matter morally, it should plausibly be a major moral concern that the future plays out well. So how should we, today, prioritise our efforts aimed at improving the future?

We could try to reduce the risk of human extinction. A future with humanity would be drastically different from one without it. Few other factors seem as pivotal for what the world will look like in the millions of years to come as whether or not humanity survives the next few centuries and millennia. Effective efforts to reduce the risk of human extinction could thus have immense long-term impact. If we were sure that this impact was positive, extinction risk reduction would plausibly be one of the most effective ways to improve the future.

However, it is not at first glance clear that reducing extinction risk is positive from an impartial altruistic perspective. For example, future humans might have terrible lives that they can’t escape from, or humane values might exert little control over the future, resulting in future agents causing great harm to other beings. If indeed it turned out that we weren’t sure if extinction risk reduction was positive, we would prioritize other ways to improve the future without making extinction risk reduction a primary goal.

To inform this prioritisation, in this article we estimate the expected value of efforts to reduce the risk of human extinction.

## Moral assumptions

Throughout this article, we make two assumptions:

1. That it morally matters what happens in the billions of years to come. From this very long-term view, making sure the future plays out well is a primary moral concern.
2. That we should aim to satisfy our reflected moral preferences. Most people would want to act according to the preferences they would have upon idealized reflection, rather than according to their current preferences. The process of idealized reflection will differ between people. Some people might want to revise their preferences after they became much smarter, more rational and had spent millions of years in philosophical discussion. Others might want to largely keep their current moral intuitions, but learn empirical facts about the world (e.g. about the nature of consciousness).

Most arguments further assume that the state the world is brought into by one’s actions is what matters morally (as opposed to e.g. the actions following a specific rule). We thus take a consequentialist view, judging potential actions by their consequences.

Parts 1.1 and 1.2 further take a welfarist perspective, assuming that what matters morally in states of the world is the welfare of sentient beings. In a way, that means assuming our reflected preferences are welfarist. Welfare will be broadly defined as including pleasure and pain, but also complex values or the satisfaction of preferences. From this perspective, a state of the world is good if it is good for the individuals in this world. Across several beings, welfare will be aggregated additively[1], no matter how far in the future an expected being lives. Additional beings with positive (negative) welfare coming into existence will count as morally good (bad). In short, parts 1.1 and 1.2 take the view of welfarist consequentialism with a total view on population ethics (see e.g. Greaves, 2017), but the arguments also hold for other similar views.

If we make the assumptions outlined above, nearly all expected value or disvalue in a future with humanity arises from scenarios in which (post-)humans colonize space. The colonizable universe seems very large, so scenarios with space colonization likely contain a lot more beings than scenarios with earthbound life only (Bostrom, 2003). Conditional on human survival, space colonization also does not seem too unlikely, so nearly all expected future beings live in scenarios with space colonization[2]. We thus take “a future with humanity” to mean “(post-)human space colonization” for the main text and briefly discuss what a future with only earthbound humanity might look like in Appendix 1.
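The claim that nearly all expected beings live in space-colonization scenarios can be illustrated with a minimal sketch. The probabilities and population sizes below are purely hypothetical placeholders, not estimates from this article or from Bostrom (2003); they only show how a modest probability multiplied by a vastly larger population dominates the expectation.

```python
# Hypothetical placeholder numbers, chosen only to illustrate the structure
# of the argument. They are NOT estimates taken from the article.
scenarios = {
    "earthbound": {"probability": 0.9, "beings": 1e16},
    "space_colonization": {"probability": 0.1, "beings": 1e30},
}


def expected_beings(scenarios):
    """Expected number of future beings: sum of probability * beings."""
    return sum(s["probability"] * s["beings"] for s in scenarios.values())


def share_of_expected_beings(scenarios, name):
    """Fraction of all expected beings contributed by one scenario."""
    s = scenarios[name]
    return s["probability"] * s["beings"] / expected_beings(scenarios)


space_share = share_of_expected_beings(scenarios, "space_colonization")
print(f"Share of expected beings in space scenarios: {space_share:.6f}")
```

Under these assumed inputs, the earthbound scenario contributes a negligible share of expected beings even though it is taken to be nine times more probable. The conclusion is only as robust as the assumption that colonization scenarios are many orders of magnitude more populous.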

## Outline of the article

Ultimately, we want to know: “What is the expected value (EV) of efforts to reduce the risk of human extinction?” We will address this question in three parts:

• In part 1, we ask “What is the EV of (post-)human space colonization[3]?” We first attempt to extrapolate the EV from the amounts of value and disvalue in today’s world and how they would likely develop with space colonization. We then turn toward a more general examination of what future agents’ tools and preferences might look like and how they will, in expectation, shape the future. Finally, we consider whether future agents could make a better decision than we can on whether to colonize space, so that it seems valuable to let them decide (option value).

• In part 1 we tacitly assumed the universe without humanity is and stays empty. In part 2, we drop that assumption. We evaluate how the possibility of space colonization by alternative agents and the possibility of existing but tractable disvalue in the universe change the EV of keeping humans around.

• In part 3, we ask “Besides reducing extinction risk, what will be the consequences of our efforts?” We look at how different efforts to reduce extinction risk might influence the long-term future by reducing global catastrophic risk and by promoting global coordination and stability.

We stress that the conclusions of the different parts should not be separated from the context. Since we are reasoning about a topic as complex and uncertain as the long-term future, we take several views, aiming to ultimately reach a verdict by aggregating across them.

## A note on disvalue-focus

The moral view on which this article is based is very broad and can include enormously different value systems, in particular different degrees of ‘disvalue-focus’. We consider a moral view disvalue-focused if it holds that the prevention or reduction of disvalue is (vastly) more important than the creation of value. One example is the family of views that hold the prevention or reduction of suffering as an especially high moral priority.

The degree of disvalue-focus one adopts strongly influences the EV of reducing extinction risk.

From very disvalue-focused views, (post-)human space colonization may not seem desirable even if the future contains a much better ratio of value to disvalue than today. There is little to gain from space colonization if the creation of value (e.g. happy beings) morally matters little. On the other hand, space colonization would multiply the number of sentient beings and thereby multiply the absolute amount of disvalue.

At first glance it thus seems that reducing the risk of human extinction is not a good idea from a strongly disvalue-focused perspective. However, the value of extinction risk reduction for disvalue-focused views gets shifted upwards considerably by the arguments in parts 2 and 3 of this article.

# Part 1: What is the EV of (post-)human space colonization?[4]

## 1.1: Extrapolating from today’s world

Space colonization is hard. By the time our technology is advanced enough, human civilization will possibly have changed considerably in many ways. However, to get a first grasp of the expected value of the long-term future, we can model it as a rough extrapolation of the present. What if humanity as we know it colonized space? There would be vastly more sentient beings, including humans, farmed animals and wild animals[5]. To estimate the expected value of this future, we will consider three questions:

1. How many humans, farmed animals and wild animals will exist?
2. How should we weigh the welfare of different beings?
3. For each of humans, farmed animals and wild animals:
   1. Is the current average welfare net positive/average life worth living?
   2. How will welfare develop in the future?

We will then attempt to draw a conclusion. Note that throughout this consideration, we take an individualistic welfarist perspective on wild animals. This perspective stands in contrast to e.g. valuing functional ecosystems and might seem unusual, but is increasingly popular.

### There will likely be more farmed and wild animals than humans, but the ratio will decrease compared to the present

In today’s world, both farmed and wild animals outnumber humans by far. There are about 3-4 times more farmed land animals and about 13 times more farmed fish[6] than humans alive. Wild animals outnumber farmed animals, with about 10 times more wild birds than farmed birds and 100 times more wild mammals than farmed mammals alive at any point. Moving on to smaller wild animals, the numbers increase again, with 10 000 times more vertebrates than humans, and between 100 000 000 and 10 000 000 000 times more insects and spiders than humans[7].

In the future, the relative number of animals compared to humans will likely decrease considerably.

Farmed animals will not be alive if animal farming substantially decreases or stops, which seems more likely than not for both moral and economic reasons. Humanity’s moral circle seems to have been expanding throughout history (Singer, 2011) and further expansion to animals may well lead us to stop farming animals.[8] Economically, too, plant-based meat alternatives or lab-grown meat will likely become more efficient than raising animals (Tuomisto and Teixeira de Mattos, 2011). However, none of these developments seems unequivocally destined to end factory farming[9], and the historical track record shows that meat consumption per head has been growing for more than 50 years[10]. Overall, it seems likely but not absolutely clear that the number of farmed animals relative to humans will be smaller in the future. For wild animals, we can extrapolate from a historical trend of decreasing wild animal populations. Even if wild animals were spread to other planets for terraforming, the animal / human ratio would likely be lower than today.

### Welfare of different beings can be weighted by (expected) consciousness

To determine the EV of the future, we need to aggregate welfare across different beings. It seems like we should weigh the experience of a human, a cow and a beetle differently when adding up, but by how much? This is a hard question with no clear answer, but we outline some approaches here. The degree to which an animal is conscious (“the lights are on”, the being is aware of its experiences, emotions and thoughts), or the confidence we have in an animal being conscious, can serve as a parameter by which to weight welfare. To arrive at a number for this parameter, we can use proxies such as brain mass, neuron count and mental abilities directly. Alternatively, we may aggregate these proxies with other considerations into an estimate of confidence that a being is conscious. For instance, the Open Philanthropy Project estimates the probability that cows are conscious at 80%.
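The two weighting approaches described above can be written down as a minimal sketch. The function names and the example neuron counts are hypothetical illustrations; only the 80% figure for cows comes from the text. The `exponent` parameter is an assumption that lets the same function express both linear weighting and damped variants such as a square root.

```python
def neuron_weight(neuron_count, reference_count, exponent=1.0):
    """Proxy-based weight relative to a reference species.

    exponent=1.0 weights linearly by neuron count; exponent=0.5 uses the
    square root, which gives small-brained animals relatively more weight.
    """
    return (neuron_count / reference_count) ** exponent


def expected_consciousness_weight(probability_conscious):
    """Confidence-based weight: the estimated probability of consciousness."""
    return probability_conscious


# Hypothetical animal with 1/100 of the reference species' neurons.
linear = neuron_weight(1e9, 1e11, exponent=1.0)
damped = neuron_weight(1e9, 1e11, exponent=0.5)

# The text cites an 80% probability estimate that cows are conscious.
cow_weight = expected_consciousness_weight(0.80)

print(linear, damped, cow_weight)
```

The choice between these weights is a moral and empirical judgment rather than a technical one. The sketch only makes explicit that the same animal can receive very different weights depending on which proxy is used.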

### The EV of (post-)human lives is likely positive

Currently, the average human life seems to be perceived as being worth living. Survey data and experience sampling suggest that most humans are quite content with their lives and experience more positive than negative emotions on a day-to-day basis[11]. Humans who find their lives not worth living can take their own lives, but relatively few people die by suicide (suicide accounts for 1.7% of all deaths in the US).[12] We could conclude that human welfare is positive.

We should, however, note two caveats to this conclusion. First, a life can be perceived as worth living even if it is negative from a welfarist perspective.[13] Second, the average life might not be worth living if the suffering of the worst off was sufficiently more intense than the happiness of the majority of people.

Overall, it seems that from a large majority of consequentialist views, the current aggregated human welfare is positive.

In the future, we will probably make progress that will improve the average human life. Historic trends have been positive across many indicators of human well-being, knowledge, intelligence and capability. On a global scale, violence is declining, cooperation increasing (Pinker, 2011). Yet, the trend does not include all indicators: subjective welfare has (in recent times) remained stable or improved very little, and mental health problems are more prevalent. These developments have sparked research into positive psychology and mental health treatment, which is slowly bearing fruit. As more fundamental issues are gradually improved, humanity will likely shift more resources towards actively improving welfare and mental health. Powerful tools like genetic design and virtual reality could be used to further improve the lives of the broad majority as well as the worst-off. While there are good reasons to assume that human welfare in the future will be more positive than now, we still face uncertainties (e.g. from low probability events like malicious, but very powerful autocratic regimes and unknown unknowns).

### EV of farmed animals’ lives is probably negative

Currently, 93% of farmed animals live on factory farms in conditions that likely make their lives not worth living. Although there are positive sides to animal life on farms compared to life in the wild[14], these are likely outweighed by negative experiences[15]. Most farmed animals also lack opportunities to exhibit naturally desired behaviours like grooming. While there is clearly room for improvement in factory farming conditions, the question “is the average life worth living?” must be answered separately for each situation and remains controversial[16]. On average, a factory farm animal life today probably has negative welfare.

In the future, factory farming is likely to be abolished or modified to improve animal welfare as our moral circle expands to animals (see above). We can thus be moderately optimistic that farm animal welfare will improve and/or fewer farm animals will be alive.

### The EV of wild animals’ lives is very unclear, but potentially negative

Currently, we know too little about the lives and perception of wild animals to judge whether their average welfare is positive or negative. We see evidence of both positive[17] and negative[18] experiences. Meanwhile, our perspective on wild animals might be skewed towards charismatic big mammals living relatively good lives. We thus overlook the vast majority of wild animals, based both on biomass and neuron count. Most smaller wild animal species (invertebrates, insects etc.) are r-selected, with most individuals living very short lives before dying painfully. While vast numbers of those lives seem negative from a welfarist perspective, we may choose to weight them less based on the considerations outlined above. In summary, most welfarist views would probably judge the aggregated welfare of wild animals as negative. The more one thinks that smaller, r-selected animals matter morally, the more negative average wild animal welfare becomes.

In the future, we may reduce the suffering of wild animals, but it is unclear whether their welfare would be positive. Future humans may be driven by the expansion of the moral circle and empowered by technological progress (e.g. biotechnology) to improve wild animal lives. However, if average wild animal welfare remains negative, it would still be bad to increase wild animal numbers by space colonization.

### Conclusion

It remains unclear whether the EV of a future in which a human civilization similar to the one we know colonized space is positive or negative.

To quantify the above considerations from a welfarist perspective, we created a mathematical model. This model yields a positive EV for a future with space colonization if different beings are weighted by neuron count and a negative EV if they are weighted by sqrt(neuron count). In the first case, average welfare is positive, driven by the spreading of happy (post-)humans. In the second case, average welfare is negative as suffering wild animals are spread. The model is also based on a series of low-confidence assumptions[19], alteration of which could flip the sign of the outcome again.
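How the sign of the result can depend on the weighting scheme can be seen in a toy calculation. This is not the model referred to above. It is a deliberately simplified sketch with two populations, and every number in it (counts per human, neuron counts, average welfare values) is a hypothetical placeholder chosen to make the mechanism visible.

```python
def total_welfare(populations, exponent, reference_neurons=1e11):
    """Sum of count * weight * average welfare over all populations.

    The weight is (neurons / reference_neurons) ** exponent, so
    exponent=1.0 is linear neuron-count weighting and exponent=0.5 is
    square-root weighting.
    """
    total = 0.0
    for p in populations:
        weight = (p["neurons"] / reference_neurons) ** exponent
        total += p["count"] * weight * p["avg_welfare"]
    return total


# Hypothetical inputs: counts are expressed per human, and welfare is in
# arbitrary units with positive meaning a life worth living.
populations = [
    {"name": "humans", "count": 1.0, "neurons": 1e11, "avg_welfare": 1.0},
    {"name": "small wild animals", "count": 1e4, "neurons": 1e5, "avg_welfare": -1.0},
]

linear_total = total_welfare(populations, exponent=1.0)
sqrt_total = total_welfare(populations, exponent=0.5)

print(f"linear weighting: {linear_total:+.2f}")
print(f"sqrt weighting:   {sqrt_total:+.2f}")
```

With these assumed inputs, linear weighting gives a positive total because the many small animals carry almost no weight, while square-root weighting gives a negative total because their numbers outweigh their reduced weight. Changing any of the placeholder values can move the crossover point, which mirrors the sensitivity to low-confidence assumptions noted above.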

More qualitatively, the EV of an extrapolated future heavily depends on one’s moral views. The degree to which one is focused on avoiding disvalue seems especially important. Consider that every day, humans and animals are being tortured, murdered, or in psychological despair. Those who would walk away from Omelas might also walk away from current and extrapolated future worlds.

Finally, we should note how little we know about the world and how this impacts our confidence in considerations about an extrapolated future. To illustrate the extent of our empirical uncertainty, consider that we are extrapolating from 100 000 years of human existence, 10 000 years of civilizational history and 200 years of industrial history to potentially 500 million years on earth (and much longer in the rest of the universe). If people in the past had guessed about the EV of the future in a similar manner, they would most likely have gotten it wrong (e.g. they might not have considered moral relevance of animals, or not have known that there is a universe to potentially colonize). We might be missing crucial considerations now in analogous ways.

## 1.2: Future agents’ tools and preferences

While part 1.1 extrapolates directly from today’s world, part 1.2 takes a more abstract approach. To estimate the EV of (post-)human space-colonization in more broadly applicable terms, we consider three questions:

1. Will future agents have the tools to shape the world according to their preferences?
2. Will future agents’ preferences resemble our 'reflected preferences' (see 'Moral assumptions' section)?
3. Can we expect the net welfare of future agents and powerless beings to be positive or negative?

We then attempt to estimate the EV of future agents colonizing space from a welfarist consequentialist view.

### Future agents will have powerful tools to shape the world according to their preferences

Since climbing down from the trees, humanity has changed the world a great deal. We have done this by developing increasingly powerful tools to satisfy our preferences (e.g. preferences to eat, stay healthy and warm, and communicate with friends, even if they are far away). As far as humans have altruistic preferences, powerful tools have made acting on them less costly. For instance, if you see someone is badly hurt and want to help, you don’t have to carry them home and care for them yourself anymore; you can just call an ambulance. However, powerful tools have also made it easier to cause harm, either by satisfying harmful preferences (e.g. weapons of mass destruction) or as a side-effect of our actions that we are indifferent to. Technologies that enable factory farming do enormous harm to animals, although they were developed to satisfy a preference for eating meat, not for harming animals[20].

It seems likely that future agents will have much more powerful tools than we do today. These tools could be used to make the future better or worse. For instance, biotechnology and genetic engineering could help us cure diseases and live longer, but they could also reinforce inequality if treatments are too expensive for most people. Advanced AI could make all kinds of services much cheaper but could also be misused. For more potent and complex tools, the stakes are even higher. Consider the example of technologies that facilitate space colonization. These tools could be used to cause the existence of many times more happy lives than would be possible on Earth, but also to spread suffering.

In summary, future agents will have the tools to create enormous value or disvalue.[21] It is thus important to consider the values/preferences that future agents might have.

### We can expect future agents to have other-regarding preferences that we would, after reflection, find somewhat positive

When referring to future agents’ preferences, we distinguish between ‘self-regarding preferences’, i.e. preferences about states of affairs that directly affect an agent, and ‘other-regarding preferences’, i.e. preferences about the world that remain even if an agent is not directly affected (see footnote[22] for a precise definition). Future agents’ other-regarding preferences will be crucial for the value of the future. For example, if the future contains powerless beings in addition to powerful agents, the welfare of the former will depend to a large degree on the other-regarding preferences of the latter (much more about that later).

#### We can expect a considerable fraction of future agents’ preferences to be other-regarding

Most people alive today clearly have (positive and negative) other-regarding preferences, but will this be the case for future agents? It has been argued that over time, other-regarding preferences could be stripped away by Darwinian selection. We explore this argument and several counterarguments in appendix 2. We conclude that future agents will, in expectation, have a considerable fraction of other-regarding preferences.

#### Future agents’ preferences will in expectation be parallel rather than anti-parallel to our reflected preferences

We want to estimate the EV of a future shaped by powerful tools according to future agents’ other-regarding preferences. In this article we assume that we should ultimately aim to satisfy our reflected moral preferences, the preferences we would have after an idealized reflection process (as discussed in the "Moral assumptions" section above). Thus, we must establish how future agents’ other-regarding preferences (FAP) compare to our reflected other-regarding preferences (RP). Briefly put, we need to ask: “would we want the same things as these future agents who will shape the world?”

FAP can be somewhere on a spectrum from parallel to orthogonal to anti-parallel to RP. If FAP and RP are parallel, future agents agree exactly with our reflected preferences. If they are anti-parallel, future agents see value exactly where we see disvalue. And if they are orthogonal, future agents value what we regard as neutral, and vice versa. We now examine how FAP will be distributed on this spectrum.

Assume that future agents care about moral reflection. They will then have better conditions for an idealized reflection process than we have, for several reasons:

• Future agents will probably be more intelligent and rational[23]

• Empirical advances will help inform moral intuitions (e.g. experience machines might allow agents to get a better idea of other beings’ experiences)

• Future agents will have more time and resources to deliberate

Given these prerequisites, it seems that future agents’ moral reflection would in expectation lead to FAP that are parallel rather than anti-parallel to RP. How much overlap between FAP and RP to expect remains difficult to estimate.[24]

However, scenarios in which future agents do not care about moral reflection might substantially influence the EV of the future. For example, it might be likely that humanity loses control and the agents shaping the future bear no resemblance to humans. This could be the case if developing controlled artificial general intelligence (AGI) is very hard, and the probability that misaligned AGI will be developed is high (in this case, the future agent is a misaligned AI).[25]

Even if (post-)humans remain in control, human moral intuitions might turn out to be contingent on the starting conditions of the reflection process and not very convergent across the species. Thus, FAP may not develop in any clear direction, but rather drift randomly[26]. Very strong and fast goal drift might be possible if future agents include digital (human) minds because such minds would not be restrained by the cultural universals rooted in the physical brain architecture.

If it turns out that FAP develop differently from RP, FAP will in expectation be orthogonal to RP rather than anti-parallel. The space of possible preferences is vast, so it seems much more likely that FAP will be completely different from RP, rather than exactly opposite[27] (see footnote[28] for an example). In summary, FAP parallel or orthogonal to RP both seem likely, but a large fraction of FAP being anti-parallel to RP seems fairly unlikely. This main claim seems true for most “idealized reflection processes” that people would choose.
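The intuition that a vast preference space makes orthogonality the default can be illustrated geometrically. The sketch below rests on strong assumptions that the text does not state: preferences are modelled as vectors, the number of dimensions stands in for the size of the preference space, and "drifting randomly" is modelled as drawing a direction uniformly at random. It measures how closely random directions align with a fixed reference direction, where a cosine of 1 is parallel, 0 is orthogonal and -1 is anti-parallel.

```python
import math
import random


def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def random_direction(dim, rng):
    """Random direction: independent Gaussian coordinates are rotationally symmetric."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def mean_abs_cosine(dim, samples, rng):
    """Average |cosine| between one reference direction and random ones."""
    reference = random_direction(dim, rng)
    total = 0.0
    for _ in range(samples):
        total += abs(cosine(reference, random_direction(dim, rng)))
    return total / samples


rng = random.Random(0)  # fixed seed so the run is repeatable
low_dim = mean_abs_cosine(dim=3, samples=500, rng=rng)
high_dim = mean_abs_cosine(dim=1000, samples=500, rng=rng)

print(f"mean |cosine| in 3 dimensions:    {low_dim:.3f}")
print(f"mean |cosine| in 1000 dimensions: {high_dim:.3f}")
```

In this toy setting, random directions in a high-dimensional space cluster near orthogonality, so neither strong agreement nor strong opposition arises by chance. The sketch says nothing about preferences that are deliberately shaped to oppose ours, which is a separate question.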

However, FAP being between parallel and orthogonal to RP in expectation does not necessarily imply the future will be good. Actions driven by (orthogonal) FAP could have very harmful side-effects, as judged by our reflected preferences. Harmful side-effects could be devastating especially if future agents are indifferent towards beings we (would on reflection) care about morally. Such negative side-effects might outweigh positive intended effects, as has happened in the past[29]. Indeed, some of the most discussed “risks of astronomical future suffering” are examples of negative side-effects.[30]

### Future agents’ tools and preferences will in expectation shape a world with probably net positive welfare

Above we argued that we can expect some overlap between future agents’ other-regarding preferences (FAP) and our reflected other-regarding preferences (RP). We can thus be somewhat optimistic about the future in a very general way, independent of our first-order moral views, if we ultimately aim to satisfy our reflected preferences. In the following section, we will drop some of that generality. We will examine what future agents’ preferences will imply for the welfare of future beings. In doing so, we assume that we would on reflection hold an aggregative, welfarist altruistic view (as explained in the 'Moral assumptions' section).

If we assume these specific RP, can we still expect FAP to overlap with them? After all, other-regarding preferences anti-parallel to welfarist altruism – such as sadistic, hateful or vengeful preferences – clearly exist within present-day humanity. If current human values transferred broadly into the future, should we then expect a large fraction of FAP to be anti-parallel to welfarist altruism? Probably not. We argue in appendix 3 that although this is hard to quantify, the large majority of human other-regarding preferences seem positive.

Assuming somewhat welfarist FAP, we explore what the future might be like for two types of beings: future agents (post-humans) who have powerful tools to shape the world, and powerless future beings. To aggregate welfare for moral evaluation, we need to estimate how many beings of each type will exist. Powerful agents will likely be able to create powerless beings as “tools” if this seems useful for them. Sentient “tools” could include animals, farmed for meat production or spread to other planets for terraforming (e.g. insects), but also digital sentient minds, like sentient robots for task performance or simulated minds created for scientific experimentation or entertainment. The last example seems especially relevant, as digital minds could be created in vast numbers if digital sentience is possible at all, which does not seem unlikely. If we find we morally care about these “tools” upon reflection, the future would contain many times more powerless beings than powerful agents.

The EV of the future thus depends on the welfare of both powerful agents and powerless beings, with the latter potentially much more relevant than the former. We now consider each in turn, asking:

• How will their expected welfare be affected by intended effects and side-effects of future agents’ actions?
• How should we evaluate this morally?

#### The aggregated welfare of powerful future agents is in expectation positive

Future agents will have powerful tools to satisfy their self-regarding preferences and be somewhat benevolent towards each other. Thus, we can expect future agents’ welfare to be increased through intended effects of their actions.

Side-effects of future agents’ actions that harm other agents’ welfare would mainly arise if their civilization is poorly coordinated. However, compromise and cooperation usually seem to benefit all involved parties, indicating that we can expect future agents to develop good tools for coordination and use them a lot.[31] Coordination also seems essential to avert many extinction risks. Thus, a civilization that avoided extinction so successfully that it colonizes space is expected to be quite coordinated.

Taken together, vastly more resources will likely be used in ways that improve the welfare of powerful agents than in ways that diminish their welfare. From the large majority of welfarist views, future agents’ aggregated welfare is thus expected to be positive. This conclusion is also supported by human history, as improved tools, cooperation and altruism have increased the welfare of most humans and average human lives are seen as worth living by many (see part 1.1).

#### The aggregated welfare of powerless future beings may in expectation be positive

Assuming that future agents are mostly indifferent towards the welfare of their “tools”, their actions would affect powerless beings only via (in expectation random) side-effects. It is thus relevant to know the “default” level of welfare of powerless beings. If the affected powerless beings were animals shaped by evolution, their default welfare might be net negative. This is because evolutionary pressure might result in a pain-pleasure asymmetry with suffering being much more intense than pleasure (see footnote for further explanation[32]). Such evolutionary pressure would not apply to designed digital sentience. Given that our experience with welfare is restricted to animals (incl. humans) shaped by evolution, it is unclear what the default welfare of digital sentients would be. If there is at least some moral concern for digital sentience, it seems fairly likely that the creating agents would prefer to give their sentient tools net positive welfare[33].

If future agents intend to affect the welfare of powerless beings, they might, besides treating their sentient “tools” accordingly, create (dis-)value optimized sentience: minds that are optimized for extreme positive or negative welfare. For example, future agents could simulate many minds in bliss, or many minds in agony. The motivation for creating (dis-)value optimized sentience could be altruism, sadism or strategic reasons[34]. Creating (dis-)value optimized sentience would likely produce much more (negative) welfare per unit of invested resources than the side-effects on sentient tools mentioned above, as sentient tools are optimized for task performance, not production of (dis-)value[35]. (Dis-)value optimized sentience would then be the main determinant of the expected value of post-human space colonization, and not side-effects on sentient tools.

FAP may be orthogonal to welfarist altruism, in which case little (dis-)value optimized sentience will be produced. However, we expect a much larger fraction of FAP to be parallel to welfarist altruism than anti-parallel to it, and thus expect that future agents will use many more resources to create value-optimized sentience than disvalue-optimized sentience. The possibility of (dis-)value optimized sentience should increase the net expected welfare of powerless future beings. However, there is considerable uncertainty about the moral implications of one resource-unit spent optimized for value or disvalue (see e.g. here and here). On the one hand, (dis-)value optimized sentience created without evolutionary pressure might be equally efficient in producing moral value and disvalue, while being used far more often to produce value. On the other hand, disvalue-optimized sentience might lead to especially intense suffering. Many people intuitively give more moral importance to the prevention of suffering the worse it gets (e.g. prioritarianism).

In summary, it seems plausible that a little concern for the welfare of sentient tools could go a long way. Even if most future agents were completely indifferent towards sentient tools (=majority of FAP orthogonal to RP), positive intended effects – creation of value-optimized sentience – could plausibly weigh heavier than side-effects.
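The trade-off described in this section can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a model used in this article: the function names, the resource split and the `disvalue_weight` parameter (how much more a unit of disvalue-optimized sentience counts, morally, than a unit of value-optimized sentience) are all hypothetical.

```python
def net_optimized_welfare(resources_value: float,
                          resources_disvalue: float,
                          disvalue_weight: float = 1.0) -> float:
    """Net moral value of (dis-)value optimized sentience.

    Assumes (hypothetically) that each resource unit produces one unit of
    value or one unit of disvalue, and that disvalue is multiplied by
    `disvalue_weight` to reflect how disvalue-focused the evaluator is.
    """
    return resources_value - disvalue_weight * resources_disvalue


def break_even_disvalue_weight(resources_value: float,
                               resources_disvalue: float) -> float:
    """Weight on disvalue at which the net value is exactly zero."""
    return resources_value / resources_disvalue


if __name__ == "__main__":
    # Hypothetical split: ten times more resources go to value-optimized
    # sentience than to disvalue-optimized sentience.
    value_units, disvalue_units = 10.0, 1.0
    for weight in (1.0, 5.0, 20.0):
        net = net_optimized_welfare(value_units, disvalue_units, weight)
        print(f"disvalue weight {weight:>4}: net value {net:+.1f}")
    print("break-even weight:",
          break_even_disvalue_weight(value_units, disvalue_units))
```

Under these assumptions the sign of the result only flips once the weight on disvalue exceeds the ratio of resources spent on value to resources spent on disvalue, which is one way to read the claim that only strongly disvalue-focused views yield a negative estimate.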

### Conclusion

Morally evaluating the future scenarios sketched in part 1.2 is hard because we are uncertain: both empirically uncertain what the future will be like and morally uncertain what our intuitions will be like. The key unanswered questions are:

• How much can we expect the preferences that shape the future to overlap with our reflected preferences?
• In absence of concern for the welfare of sentient tools, how good or bad is their default welfare?
• How will the scales of intended effects and side-effects compare?

Taken together, we believe that the arguments in this section indicate that the EV of (post)-human space colonization would only be negative from relatively strongly disvalue-focused views. From the majority, but not overwhelming majority, of welfarist views the EV of (post)-human space colonization seems positive.[36][37]

In parts 1.1 and 1.2, we directly estimated the EV of (post-)human space colonization and found it to be very uncertain. In the remaining parts, we will improve our estimate via other approaches that are less dependent on specific predictions about how (post-)humans will shape the future.

## 1.3: Future agents could later decide not to colonize space (option value)

We are often uncertain about what the right thing to do is. If we can defer the decision to someone wiser than ourselves, this is generally a good call. We can also defer across time: we can keep our options open for now, and hope our descendants will be able to make better decisions. This option value may give us a reason to prefer to keep our options open.

For instance, our descendants may be in a better position to judge whether space colonization would be good or bad. If they can see that space colonization would be negative, they can refrain from (further) colonizing space: They have the option to limit the harm. In contrast, if humanity goes extinct, the option of (post)-human space colonization is forever lost. So avoiding extinction creates ‘option value’ (e.g. MacAskill).[38] This specific type of ‘option value’, which arises from future agents choosing not to colonize space, and not the more general value of keeping options open, is what we will be referring to throughout this section.[39] This type of option value exists for nearly all moral views, and is very unlikely to be negative.[40] However, as we will discuss in this chapter, this value is rather small compared to other considerations.

### A considerable fraction of futures contains option value

Reducing the risk of human extinction only creates option value if future agents will make a better decision, by our (reflected) lights, about whether to colonize space than we could. If they will make worse decisions than us, we would rather decide ourselves.

In order for future agents to make better decisions than us and actually act on them, they need to surpass us in at least one of the following aspects:

• Better values
• Better judgement of what space colonization will be like (based on increased empirical understanding and rationality)
• Greater willingness and ability to make decisions based on moral values (non-selfishness and coordination)

#### Values

Human values change. We are disgusted by many of our ancestors’ moral views, and they would find ours equally repugnant. We can even look back on our own moral views and disagree. There is no reason for these trends to stop exactly now: human morality will likely continue to change.

Yet at each stage in the change, we are likely to view our values as obviously correct. This encourages a greater degree of moral uncertainty than feels natural. We should expect that our moral views would change after idealized reflection (although this also depends on which meta-ethical theory is correct and how idealized reflection works).

We argued in part 1.2 that future agents’ preferences will in expectation have some overlap with our reflected preferences. Even if that overlap is not very high, a high degree of moral uncertainty would indicate that we would often prefer future agents’ preferences over our current, unreflected preferences. In a sizeable fraction of future scenarios, future agents with more time and better tools to reflect can be expected to make better decisions than one could today.

#### Empirical understanding and rationality

We now understand the world better than our ancestors, and are able to think more clearly. If those trends continue, future agents may understand better what space colonization will be like, and so better understand how good it will be on a given set of values.

For example, future agents’ estimate of the EV of space colonization will benefit from

• Better empirical understanding of the universe (for instance about questions discussed in part 2.2)[41] and better predictions, fuelled by more scientific knowledge and better forecasting techniques
• Increased intelligence and rationality[42], allowing them to more accurately determine what the best action is based on their values.

As long as there is some overlap between their preferences and one’s reflected preferences, this gives an additional reason to defer to future agents’ decisions (see footnote for an example).[43]

#### Non-selfishness and coordination

We often know what’s right, but don’t follow through on it anyway. What is true for diets also applies here:

• Future agents would need to actually make the decision about space colonization based on moral reasoning[44]. This might imply acting against strong economic incentives pushing towards space colonization.

• Future agents need to be coordinated well enough to avoid space colonization. That might be a challenge in non-singleton futures since future civilization would need ways to ensure that not a single agent starts space colonization.

It seems likely that future agents would surpass our current level of empirical understanding, rationality, and coordination, and in a considerable fraction of possible futures they might also do better on values and non-selfishness. However, we should note that to actually refrain from colonizing space, they would have to surpass a certain threshold in all of these fields, which may be quite high. Thus, a little bit of progress doesn’t help: deferring the decision to future agents only creates option value if they surpass this threshold.

### Only the relatively good futures contain option value

For any future scenario to contain option value, the agents in that future need to surpass us in various ways, as outlined above. This has an implication that further diminishes the relevance of the option value argument. Future agents need to have relatively good values and be relatively non-selfish to decide not to colonize space for moral reasons. But even if these agents colonized space, they would probably do it in a relatively good manner. Most expected future disvalue plausibly comes from futures controlled by indifferent or malicious agents (like misaligned AI). Such “bad” agents will make worse decisions about whether or not to colonize space than we, currently, could, because their preferences are very different from our (reflected) preferences. Potential space colonization by indifferent or malicious agents thus generates large amounts of expected future disvalue, which cannot be alleviated by option value. Option value doesn’t help in the cases where it is most needed (see footnote for an explanatory example).[45]

### Conclusion

If future agents are good enough, there is option value in deferring the decision whether to colonize space to them. In some not-too-small fraction of possible futures, agents will fulfill the criteria and thus option value adds positively to the EV of reducing extinction risk. However, the futures accounting for most expected future disvalue are likely controlled by indifferent or malicious agents. Such “bad” agents would likely make worse decisions than we could. A large amount of expected future disvalue is thus not amenable to alleviation through option value. Overall, we think the option value in reducing the risk of human extinction is probably fairly moderate, but there is a lot of uncertainty and contingency on one’s specific moral and empirical views[46]. Modelling the considerations of this section showed that if the 90% confidence interval of the value of the future ran from -0.9 to 0.9 (arbitrary value units), option value was 0.07.
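To show how the pieces of this argument combine, here is a minimal sketch. It is not the model referred to above and is not meant to reproduce its result: it assumes, purely for illustration, that the value V of colonization is normally distributed, that option value is only realised in the fraction of futures where agents are good enough to refrain (`p_good_enough`, a hypothetical parameter), and that those futures may have a higher mean value than average (`mean_if_good`, also hypothetical).

```python
import math


def normal_pdf(z: float) -> float:
    """Density of the standard normal distribution at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_avoidable_disvalue(mean: float, sd: float) -> float:
    """E[max(0, -V)] for V ~ Normal(mean, sd).

    This is the expected disvalue that agents could avoid by refraining
    from colonization whenever V turns out to be negative.
    """
    z = mean / sd
    return sd * normal_pdf(z) - mean * normal_cdf(-z)


def option_value(p_good_enough: float, mean_if_good: float, sd: float) -> float:
    """Option value = P(agents can and will refrain) * avoidable disvalue."""
    return p_good_enough * expected_avoidable_disvalue(mean_if_good, sd)


# Assumption: a 90% interval of [-0.9, 0.9] is read as a normal
# distribution with mean 0, so sd = 0.9 / 1.645.
SD = 0.9 / 1.645

if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5):          # hypothetical fractions of "good" futures
        for mean in (0.0, 0.3):        # hypothetical mean value in those futures
            print(f"p={p:.1f}, mean_if_good={mean:.1f}: "
                  f"option value = {option_value(p, mean, SD):.3f}")
```

The sketch captures two qualitative points of this section: option value scales with the fraction of futures in which agents clear the threshold, and it shrinks further if those same futures are the ones that would have gone relatively well anyway.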

# Part 2: Absence of (post-)human space colonization does not imply a universe devoid of value or disvalue

Up to now, we have tacitly assumed that the sign of EV of (post)-human space colonization determines whether extinction risk reduction is worthwhile. This only holds if without humanity, the EV of the future is roughly zero, because the (colonizable) universe is and will stay devoid of value or disvalue. We now consider two classes of scenarios in which this is not the case, with important implications especially for people who think that EV of (post-)human space colonization is likely negative.

## 2.1 Whether (post-)humans colonizing space is good or bad, space colonization by other agents seems worse

If humanity goes extinct without colonizing space, some kind of other beings would likely survive on earth[47]. These beings might evolve into a non-human technological civilization in the hundreds of millions of years left on earth and eventually colonize space. Similarly, extraterrestrials (that might already exist or come into existence in the future) might colonize (more of) our corner of the universe, if humanity does not.

In these cases, we must ask whether we prefer (post-)human space colonization over the alternatives. Whether alternative civilizations would be more or less compassionate or cooperative than humans, we can only guess. We may however assume that our reflected preferences depend on some aspects of being human, such as human culture or the biological structure of the human brain[48]. Thus, our reflected preferences likely overlap more with a (post-)human civilization than alternative civilizations. As future agents will have powerful tools to shape the world according to their preferences, we should prefer (post-)human space colonization over space colonization by an alternative civilization.

To understand how we can factor this consideration into the overall EV of a future with (post-)human space colonization, consider the following example of Ana and Chris. Ana thinks the EV of (post-)human space colonization is negative. For her, the EV of potential alternative space colonization is thus even more negative. This should cause people who, like Ana, are pessimistic about the EV of (post-)human space colonization (and thus the value of reducing the risk of human extinction) to update towards reducing the risk of human extinction because the alternative is even worse (technical caveat in footnote[49]).

Chris thinks that the EV of (post-)human space colonization is positive. For him, the EV of potential alternative space colonization could be positive or negative. For people like Chris, who are optimistic about the EV of (post-)human space colonization (and thus the value of reducing the risk of human extinction), the direction of update is thus less clear. They should update towards reducing the risk of human extinction if the potential alternative civilization is bad, or away from it if the potential alternative civilization is merely less good. Taken together, this consideration implies a stronger update for future pessimists like Ana than for future optimists like Chris. This becomes clearer in the mathematical derivation[50] or when considering an example[51].

It remains to estimate how big the update should be. Based on our best guesses about the relevant parameters (for a Fermi estimate, see here), it seems that future pessimists should considerably shift their judgement of the EV of human extinction risk reduction in the less negative direction. Future optimists should moderately shift their judgement downwards. Therefore, if one was previously uncertain with roughly equal credence in future pessimism and future optimism, one’s estimate for the EV of human extinction risk reduction should increase.
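The direction of the update for Ana and Chris can be sketched as follows. This is a simplified illustration, not the derivation or the Fermi estimate referred to in this section: it assumes the value of preventing extinction is the EV of (post-)human colonization minus the EV of alternative colonization weighted by the probability that it happens. All names and numbers are hypothetical.

```python
def ev_of_preventing_extinction(ev_human: float,
                                ev_alternative: float,
                                p_alternative: float) -> float:
    """EV of a future with humanity relative to a future without it.

    ev_human:       EV of (post-)human space colonization
    ev_alternative: EV of colonization by an alternative civilization
    p_alternative:  probability that such a civilization colonizes space
                    if humanity goes extinct (hypothetical parameter)
    """
    return ev_human - p_alternative * ev_alternative


def update_from_alternatives(ev_alternative: float, p_alternative: float) -> float:
    """Change relative to the naive estimate that assumes an empty universe."""
    return -p_alternative * ev_alternative


if __name__ == "__main__":
    p = 0.4  # hypothetical probability of alternative colonization

    # Ana: pessimist. The alternative is assumed to be even worse.
    ana = ev_of_preventing_extinction(ev_human=-1.0, ev_alternative=-1.5,
                                      p_alternative=p)
    # Chris: optimist. Here the alternative is assumed to be merely less good.
    chris = ev_of_preventing_extinction(ev_human=1.0, ev_alternative=0.5,
                                        p_alternative=p)

    print(f"Ana:   naive -1.0, adjusted {ana:+.2f}")
    print(f"Chris: naive +1.0, adjusted {chris:+.2f}")
```

With these made-up inputs, Ana's estimate moves up by more than Chris's moves down, so someone with equal credence in both positions would end up with a higher overall estimate. If Chris instead believed the alternative civilization to be net negative, his estimate would move up as well.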

We should note that this is a very broad consideration, with details contingent on the actual moral views people hold and specific empirical considerations[52].

A specific case of alternative space colonization could arise if humanity gets extinguished by misaligned AGI. It seems likely that misaligned AI would colonize space. Space colonization by an AI might include (among other things of value/disvalue to us) the creation of many digital minds for instrumental purposes. If the AI is only driven by values orthogonal to ours, it would likely not care about the welfare of those digital minds. Whether we should expect space colonization by a human-made, misaligned AI to be morally worse than space colonization by future agents with (post-)human values has been discussed extensively elsewhere. Briefly, nearly all moral views would most likely rather have human value-inspired space colonization than space colonization by AI with arbitrary values, giving extra reason to work on AI alignment especially for future pessimists.

## 2.2 Existing disvalue could be alleviated by colonizing space

With more empirical knowledge and philosophical reflection, we may find that the universe is already filled with beings/things that we morally care about. Instead of just increasing the number of morally relevant things (i.e. earth originating sentient beings), future agents might then influence the states of morally relevant beings/things already existing in the universe[53]. This topic is highly speculative and we should stress that most of the EV probably comes from “unknown unknowns”, which humanity might discover during idealized reflection. Simply put, we might find some way in which future agents can make the existing world (a lot) better if they stick around. To illustrate this general concept, consider the following ideas.

We might find that we morally care about things other than sentient beings, which could be vastly abundant in the universe. For example, we may develop moral concern for fundamental physics, e.g. in the form of panpsychism. Another possibility could arise if the solution to the simulation argument (Bostrom, 2003) is indeed that we live in a simulation, with most things of moral relevance positioned outside of our simulation but modifiable by us in yet unknown ways. It might also turn out that we can interact with other agents in the (potentially infinite) universe or multiverse by acausal trade or multiverse-wide cooperation, thereby influencing existing things of moral relevance (to us) in their part of the universe/multiverse. These specific ideas may look weird. However, given humanity’s history of realizing that we care about more/other things than previously thought[54], it should in principle seem likely that our reflected preferences include some yet unknown unknowns.

We argued in part 1.2 that future agents’ preferences will in expectation be parallel rather than anti-parallel to our reflected preferences. If the universe is already filled with things/beings of moral concern, we can thus assume that future agents will in expectation improve the state of these things[55]. This creates additional reason to reduce the risk of human extinction: There might be a moral responsibility for humanity to stick around and “improve the universe”. This perspective is especially relevant for disvalue-focused views. From a (strongly) disvalue-focused view, increasing the numbers of conscious beings by space colonization is negative because it generates suffering and disvalue. It might seem that there is little to gain if space colonization goes well, but much to lose if it goes wrong. If, however, future agents could alleviate existing disvalue, then humanity’s survival (potentially including space colonization) has upsides that may well be larger than the expected downsides (see footnote[56] for a Fermi estimate).[57]

# Part 3: Efforts to reduce extinction risk may also improve the future

If we had a button that reduces human extinction risk, and has no other effect, we would only need the considerations in parts 1 and 2 to know whether we should press it. In practice, efforts to reduce extinction risk often have other morally relevant consequences, which we examine below.

## 3.1: Efforts to reduce non-AI extinction risk reduce global catastrophic risk[58]

Global catastrophe here refers to a scenario of hundreds of millions of human deaths and resulting societal collapse. Many potential causes of human extinction, like a large-scale epidemic, nuclear war, or runaway climate change, are far more likely to lead to a global catastrophe than to complete extinction. Thus, many efforts to reduce the risk of human extinction also reduce global catastrophic risk. In the following, we argue that this effect adds substantially to the EV of efforts to reduce extinction risk, even from the very long-term perspective of this article. This doesn’t hold for efforts to reduce risks that, like risks from misaligned AGI, are more likely to lead to complete extinction than to a global catastrophe.

Apart from being a dramatic event of immense magnitude for current generations, a global catastrophe could severely curb humanity’s long-term potential by destabilizing technological progress and derailing social progress[59].

Technological progress might be uncoordinated and incautious in a world that is politically destabilized by global catastrophe. For pivotal technologies such as AGI, development in an arms race scenario (e.g. driven by post-catastrophe resource scarcity or war) could lead to adverse outcomes we cannot correct afterwards.

Social progress might likewise divert towards opposing open society and general utilitarian-type values. Can we expect the “new” value system emerging after a global catastrophe to be robustly worse than our current value system? While this issue is debated[60], Nick Beckstead gives a strand of arguments suggesting the “new” values would in expectation be worse. Compared to the rest of human history, we currently seem to be on an unusually promising trajectory of social progress. What exactly would happen if this period was interrupted by a global catastrophe is a difficult question, and any answer will involve many judgement calls about the contingency and convergence of human values. However, as we hardly understand the driving factors behind the current period of social progress, we cannot be confident it would recommence if interrupted by a global catastrophe. Thus, if one sees the current trajectory as broadly positive, one should expect this value to be partially lost if a global catastrophe occurs.

Taken together, reducing global catastrophic risk seems to be a valuable effect of efforts to reduce extinction risk. This aspect is fairly relevant even from a very long-term perspective because catastrophes are much more likely than extinction. A Fermi estimate suggests the long-term impact from the prevention of global catastrophes is about 50% of the impact from avoiding extinction events. The potential long-term consequences from a global catastrophe include worse values and an increase in the likelihood of misaligned AI scenarios. These consequences seem bad from most moral perspectives, including strongly disvalue-focused ones. Considering the effects on global catastrophic risk should suggest a significant update in the evaluation of the EV of efforts to reduce extinction risk towards more positive (or less negative) values.
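The structure of such a comparison can be sketched in a few lines. The Fermi estimate mentioned above is not reproduced here: the function, its parameters and every input value below are hypothetical placeholders, chosen only to show how a much higher probability of catastrophe can partly offset a much smaller loss of long-term value per event.

```python
def relative_long_term_impact(p_catastrophe: float,
                              p_extinction: float,
                              loss_fraction_catastrophe: float,
                              loss_fraction_extinction: float = 1.0) -> float:
    """Expected long-term loss from catastrophes relative to extinction.

    p_catastrophe, p_extinction: probabilities of each outcome over the
        same period (hypothetical inputs).
    loss_fraction_*: share of the future's expected value lost in each
        outcome. Treating extinction as a total loss (1.0) is a
        simplifying assumption of this sketch.
    """
    expected_loss_catastrophe = p_catastrophe * loss_fraction_catastrophe
    expected_loss_extinction = p_extinction * loss_fraction_extinction
    return expected_loss_catastrophe / expected_loss_extinction


if __name__ == "__main__":
    # Placeholder scenarios: (P(catastrophe), P(extinction), loss fraction)
    scenarios = [
        (0.10, 0.01, 0.01),
        (0.10, 0.01, 0.05),
        (0.20, 0.01, 0.05),
    ]
    for p_cat, p_ext, loss in scenarios:
        ratio = relative_long_term_impact(p_cat, p_ext, loss)
        print(f"P(cat)={p_cat:.2f}, P(ext)={p_ext:.2f}, loss={loss:.2f} "
              f"-> relative impact {ratio:.2f}")
```

The result is highly sensitive to the assumed loss fraction, which is exactly the quantity that the discussion of technological and social progress in this section tries to inform.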

## 3.2: Efforts to reduce extinction risk often promote coordination, peace and stability, which is broadly good

The shared future of humanity is a (transgenerational) global public good (Bostrom, 2013), so society needs to coordinate to preserve it, e.g. by providing funding and other incentives. Most extinction risk also arises from technologies that allow for one agent (intentionally or by mistake) to start a potential extinction event (e.g. release a harmful virus or start a nuclear war). Coordinated action and careful decisions are thus needed, and indeed the broadest efforts to reduce extinction risk directly promote global coordination, peace and stability. More focused efforts often promote “narrow cooperation” within a specific field (e.g. nuclear non-proliferation) or set up processes (e.g. pathogenic surveillance) that increase global stability by reducing perceived levels of threat from non-extinction events (e.g. bioterrorist attacks).

Taken together, efforts to reduce extinction risk also promote a more coordinated, peaceful and stable global society. Future agents in such a society will probably make wiser and more careful decisions, reducing the risk of unexpected negative trajectory changes in general. Safe development of AI will specifically depend on these factors. Therefore, efforts to reduce extinction risk may also steer the world away from some of the worst non-extinction outcomes, which likely involve war, violence and arms races.

Note that there may be a trade-off: more targeted efforts seem more neglected and are therefore promising levers for extinction risk reduction. However, their effects on global coordination, peace and stability are less certain and likely smaller than the effects of broad efforts aimed directly at increasing these factors. Broad efforts to promote global coordination, peace and stability might be among the most promising approaches to robustly improve the future and reduce the risk of dystopian outcomes conditional on human survival.

# Conclusion

## The expected value of efforts to reduce the risk of human extinction (from non-AI causes) seems robustly positive

So all things considered, what is the expected value of efforts to reduce the risk of human extinction? In the first part, we considered what might happen if human extinction is prevented for long enough that future agents, maybe our biological descendants, digital humans, or (misaligned) AGI created by humans, colonize space. The EV of (post-)human space colonization is probably positive from many welfarist perspectives, but very uncertain. We also examined the ‘option value argument’, according to which we should try to avoid extinction and defer the decision to colonize space (or not) to wiser future agents. We concluded that option value, while mostly positive, is small and the option value argument hardly conclusive.

In part 2, we explored what the future universe might look like if humans do go extinct. Vast amounts of value or disvalue might (come to) exist in those scenarios as well. Some of this (dis-)value could be influenced by future agents if they survive. This insight has little impact for people who were optimistic about the future anyway, but shifts the EV of reducing extinction risk upwards for people who were previously pessimistic about the future. In part 3, we extended our considerations to additional effects of many efforts to reduce extinction risk, namely reducing the risk of “mere” global catastrophes and increasing global cooperation and stability. These effects generate considerable additional positive long-term impact. This is because global catastrophes would likely change the direction of technological and social progress in a bad way, while global cooperation and stability are prerequisites for a positive long-term trajectory.

Some aspects of moral views make the EV of reducing extinction risk look less positive than suggested above. We will consider three such aspects:

• From a strongly disvalue-focused view, increasing the total number of sentient beings seems negative regardless of the empirical circumstances. The EV of (post-)human space colonization (part 1.1 and 1.2) is thus negative, at least if the universe is currently devoid of value.
• From a very stable moral view (with low moral uncertainty, thus very little expected change in preferences upon idealized reflection), there are no moral insights for future agents to discover and act upon. Future agents could then only make better decisions than us about whether to colonize space through empirical insights. Likewise, future agents could only discover opportunities to alleviate astronomical disvalue that we currently do not see through empirical insights. Option value (part 1.3) and the effects from potentially existing disvalue (part 2.2) are reduced.
• From a very unusual moral view (with some of one’s reflected other-regarding preferences expected to be anti-parallel to most of humanity’s reflected other-regarding preferences), future agents will sometimes do the opposite of what one would have wanted[61]. This would be true even if future agents are reflected and act altruistically (according to a different conception of ‘altruism’). From that view the future looks generally worse. There is less option value (part 1.3), and if the universe is already filled with beings/things that we morally care about (part 2.2), sometimes future agents might do the wrong thing upon this discovery.

To generate the (hypothetical) moral view that is most sceptical about reducing extinction risk, we unite all of the three aspects above. We assume a strongly disvalue-focused, very stable and unusual moral view. Even from this perspective (in rough order of descending relevance):

• Efforts to reduce extinction risk may improve the long-term future by reducing the risk of global catastrophes and increasing global cooperation and stability (part 3).
• There may be some opportunity for future agents to alleviate existing disvalue (as long as the moral view in question isn’t completely ‘unusual’ in all aspects) (part 2.2).
• (Post-)human space colonization might be preferable to space colonization by non-human animals or extraterrestrials (part 2.1).
• Small amounts of option value might arise from empirical insights improving decisions (part 1.3).

From this maximally sceptical view, targeted approaches to reduce the risk of human extinction likely seem somewhat unexciting or neutral, with high uncertainty (see footnote[62] for how advocates of strongly disvalue-focused views see the EV of efforts to reduce extinction risk). Reducing the risk of extinction by misaligned AI probably seems positive because misaligned AI would also colonize space (see part 2.1).

From views that value the creation of happy beings or creation of value more broadly, have considerable moral uncertainty, and believe future reflected and altruistic agents could make good decisions, the EV of efforts to reduce extinction risk is likely positive and extremely high.

In aggregation, efforts to reduce the risk of human extinction seem in expectation robustly positive from many consequentialist perspectives.

## Efforts to reduce extinction risk should be a key part of the EA long-termist portfolio

Effective altruists whose primary moral concern is making sure the future plays out well will, in practice, need to allocate their resources between different possible efforts. Some of these efforts are optimized to reduce extinction risk (e.g. promoting biosecurity), others are optimized to improve the future conditional on human survival while also reducing extinction risk (e.g. promoting global coordination or otherwise preventing negative trajectory changes) and some are optimized to improve the future without making extinction risk reduction a primary goal (e.g. promoting moral circle expansion or "worst-case" AI safety research).

We have argued above that the EV of efforts to reduce extinction risk is positive, but is it large enough to warrant investment of marginal resources? A thorough answer to this question requires detailed examination of the specific efforts in question and goes beyond the scope of this article. We are thus in no position to provide a definitive answer for the community. We will, however, present two arguments that favor including efforts to reduce extinction risk as a key part of the long-termist EA portfolio: such efforts are time-sensitive, and they seem very leveraged. We know of specific risks this century, we have reasonably good ideas for ways to reduce them, and if we actually avert an extinction event, the impact is robust for millions of years to come (at least in expectation). As a very broad generalization, many efforts optimized to otherwise improve the future - such as improving today’s values in the hope that they will propagate to future generations - are less time-sensitive or leveraged. In short, it seems easier to prevent an event from happening in this century than to otherwise robustly influence the future millions of years down the line.

Key caveats to this argument include that it is not clear how large the differences in time-sensitivity and leverage are[63] and that we may still discover highly leveraged ways to “otherwise improve the future”. Therefore, it seems that the EA long-termist portfolio should contain all of the efforts described above, allowing each member of the community to contribute according to their comparative advantage. For those holding very disvalue-focused moral views, the more attractive efforts would plausibly be those optimized to improve the future without making extinction risk reduction a primary goal.

# Appendix 1: What if humanity stayed Earth-bound?

In this appendix, we apply the approach of part 1.1 to a situation in which humanity stays Earth-bound. We recommend reading part 1.1 first.

We think that scenarios in which humanity stays Earth-bound are of very limited relevance for the EV of the future for two reasons:

• Even if humanity staying Earth-bound were the most likely outcome, probably only a small fraction of expected beings live in these scenarios, so they constitute only a small fraction of expected value or disvalue (as argued in the introduction).
• Humanity staying Earth-bound may not actually be a very likely scenario, because reaching post-humanity and realizing astronomical value might be a default path, conditional on humanity not going extinct (Bostrom, 2009).
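
The first point above can be made concrete with a rough sketch. The symbols and numbers below are illustrative assumptions introduced here, not estimates from this article. Suppose, as a simplification, that there are only two kinds of scenario: let $p$ be the probability that humanity colonizes space (conditional on not going extinct), $N_S$ the number of beings in a space-colonization scenario, and $N_E$ the number of beings in an Earth-bound scenario. The share $f_E$ of expected beings that live in Earth-bound scenarios is then:

```latex
f_E = \frac{(1-p)\,N_E}{(1-p)\,N_E + p\,N_S}

% Purely hypothetical placeholder values: p = 0.01 and N_S = 10^{6}\,N_E
f_E = \frac{0.99\,N_E}{0.99\,N_E + 10^{4}\,N_E} \approx 10^{-4}
```

Under these placeholder values, Earth-bound scenarios are 99% likely yet contain only about one ten-thousandth of expected beings. The result is driven entirely by the assumed ratio $N_S / N_E$, so it should be read as a statement about the structure of the argument rather than as a forecast.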

If we assume humanity will stay Earth-bound, it seems that most welfarist views would probably favour reducing extinction risk. If one thinks humans are much more important than animals, this is obvious (unless one combines that view with suffering-focused ethics, such as antinatalism). If one also cares about animals, then very plausibly humanity’s impact on wild animals is more relevant than humanity’s impact on farmed animals, because of the enormous numbers of the former (and especially since it seems plausible that factory farming will not continue indefinitely). So far, humanity’s main effect on wild animals has been a permanent decrease in population size (through habitat destruction), which is expected to continue as the human population grows. Compared to that, our direct influence on wild animal well-being is currently unclear and probably small (though it is less clear for aquatic life):

• We kill significant numbers of wild animals, but we don’t know how painful human-caused death is compared to death from other causes.
• Wild animal generation times are very short, so the number of animals affected by “never coming into existence” is probably much larger.

If one thinks that wild animals are on net suffering, future population size reduction seems beneficial. If one thinks that wild animal welfare is net positive, then habitat reduction would be bad. However, there is still undeniably a lot of suffering in nature. Humanity might eventually - if we have much more knowledge and better tools that allow us to do so at limited cost to ourselves - improve wild animals’ lives (as we already do, e.g. with vaccinations), so the prospect of that might offset some of the negative value of current habitat reduction. Obviously, habitat destruction is negative from a conservationist/environmentalist perspective.

# Appendix 2: Future agents will in expectation have a considerable fraction of other-regarding preferences

Altruism in humans likely evolved as a “shortcut” solution to coordination problems. It was often impossible to forecast how much an altruistic act would help spread your own genes, but it often would (especially in small tribes, where all members were closely related). Thus, humans for whom altruism just felt good had a selective advantage.

As agents become more rational and better at long-term planning, a tendency to help for purely selfless reasons seems less adaptive. Agents can deliberately cooperate for strategic reasons whenever necessary, and to exactly the degree that optimizes their own reproductive fitness. One might fear that in the long run, only preferences for increasing one’s own power and influence (and that of one’s descendants) might remain under Darwinian selection.

But this is not necessarily the case, for two reasons:

## Darwinian processes will select for patience, not “selfishness” (Paul Christiano)

As agents reason from a longer-term perspective, and as the tools for preserving values and influence into the future improve, the need for altruistic preferences may decline, but so does the selection pressure for selfishness. In contrast to (overly) altruistic agents that plan only for the short term, long-term planning agents that want to create value would realize that amassing power is an instrumental goal towards that end. They will try to survive, acquire resources for instrumental reasons, and coordinate with others against the unchecked expansion of selfish agents. Thus, future evolution might select not for selfishness, but for patience, i.e. for how strongly an agent cares about the long term. Such long-term preferences should be expected to be more altruistic.

Carl Shulman additionally makes the point that in a space colonization scenario, agents that want to create value would only be very slightly disadvantaged in direct competition with agents that only care about expanding.

Brian Tomasik thinks Christiano’s argument is valid and that altruism might not be driven to zero in the future, but he is doubtful that very long-term altruists will have strategic advantages over medium-term corporations and governments, and he cautions against putting too much weight on theoretical arguments: “Human(e) values have only a mild degree of control in the present. So it would be surprising if such values had significantly more control in the far future.”

## Preferences might not even be subject to Darwinian processes indefinitely

If the losses from evolutionary pressure indeed loom large, it seems quite likely that future generations would coordinate against it, e.g. by forming a singleton (Bostrom, 2006) (which broadly encompasses many forms of global coordination or value/goal-preservation). (Of course, there are also future scenarios that would strip away all other-regarding preferences, e.g. in Malthusian scenarios.)

In conclusion, we will end up somewhere between no other-regarding preferences and even more than today, with a considerable probability of future agents having a considerable fraction of other-regarding preferences.

# Appendix 3: What if current human values transferred broadly into the future?

Most humans (past and present) intend to do what we now consider good (be loving, friendly, altruistic) more than they intend to harm (be sadistic, hateful, seek revenge). Positive[64] other-regarding preferences might be more universal: most people would, all else equal, prefer all humans or animals to be happy, while fewer people would have such a general preference for suffering. This relative overhang of positive preferences in human society is evident from rules that ban hurting (some) others but do not ban helping others. These rules will (if they persist) also shape the future, as they increase the costs of doing harm.[65]

Throughout human history, there has been a trend away from cruelty and violence.[66] Although humans cause a lot of suffering in the world today, this is mostly because people are indifferent or “lazy”, rather than evil. All in all, it seems fair to say that the significant majority of human other-regarding preferences is positive, and that most people would, all else equal, prefer more happiness and less suffering. However, we admit this is hard to quantify.[67]

