Crossposted on LessWrong.
Over the last few years, progress has been made in estimating the density of intelligent life in the universe (e.g., Olson 2015, Sandberg 2018, Hanson 2021). Some progress has been made in using these results to update longtermist macrostrategy, but this work remains partial and stops short of its potential (Finnveden 2019, Olson 2020, Olson 2021, Cook 2022). At best, it only hints at the most consequential implications and leaves half of the work almost untouched: comparing the expected utility produced by different Space-Faring Civilizations (SFCs). In this post, we hint at the possible macrostrategic implications of these works: a possible switch for the longtermist community from targeting decreasing X-Risks (including increasing P(Alignment)) to increasing P(Alignment | Humanity creates an SFC).
Sequence: This post is part 1 of a sequence investigating the longtermist implications of alien Space-Faring Civilizations. Each post aims to be standalone.
Summary
We define two hypotheses:
- Civ-Saturation Hypothesis: Most resources will be claimed by Space-Faring Civilizations (SFCs) regardless of whether humanity creates an SFC.
- Civ-Similarity Hypothesis: Humanity's Space-Faring Civilization would produce utility similar to other SFCs.
If these hypotheses hold, this could shift longtermist priorities away from reducing pure extinction risks and toward specifically optimizing P(Alignment | Humanity creates an SFC). This means that rather than focusing broadly on preventing misaligned AI and extinction, longtermists might need to prioritize strategies that specifically increase the probability of alignment conditional on humanity creating an SFC. Macrostrategy updates include the following:
- (i) Significantly deprioritizing extinction risks, such as nuclear weapon and bioweapon risks.
- (ii) Deprioritizing, to some degree, AI Safety agendas that mostly increase P(Humanity creates an SFC) without much increasing P(Alignment | Humanity creates an SFC).
- (iii) Giving more weight to previously neglected AI Safety agendas. E.g., a "Plan B AI Safety" agenda that would focus on decreasing P(Humanity creates an SFC | Misalignment), for example, by implementing (active & corrigible) preferences against space colonization in early AI systems.
The Civ-Saturation Hypothesis
Will Humanity's SFC grab marginal resources? The Civ-Saturation Hypothesis posits that, when making decisions, we should assume most of the resources Humanity's SFC would claim will eventually be grabbed by SFCs regardless of whether Humanity's SFC exists.
Plausibly low marginal resources under EDT. The validity of this hypothesis can be studied using models estimating the frequency of Space-Faring Civilizations (SFCs) in the universe (Sandberg 2018, Finnveden 2019, Olson 2020, Hanson 2021, Snyder-Beattie 2021, Cook 2022). Its validity will also depend on which decision theory we use and on the beliefs underlying that choice. As soon as we put some credence on evidential decision theories and on our actions being correlated with those of our exact copies, we may have to put significant weight on the Civ-Saturation Hypothesis. We will produce a first quantitative evaluation of this hypothesis in a later post.
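To make the role of exact copies concrete, here is a toy sketch in Python. It assumes, purely for illustration, that the decision-relevant weight of a possible world is proportional to our credence in it times the number of exact copies of us it contains. The two world types, the credences, and the copy counts are hypothetical placeholders, not estimates, and the sketch ignores any difference in per-copy impact between worlds.

```python
def decision_weights(worlds):
    """Return normalized weights proportional to credence * copies.

    `worlds` maps a world label to a (credence, copies) pair.
    Assumption (not from any model in the literature): under EDT-like
    reasoning with correlated exact copies, a world's weight in our
    decision scales with how many copies of us it contains.
    """
    raw = {name: credence * copies for name, (credence, copies) in worlds.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


# Hypothetical numbers: equal credence, but far more copies in saturated worlds.
example_worlds = {
    "empty": (0.5, 1),
    "saturated": (0.5, 1000),
}

weights = decision_weights(example_worlds)
print(weights)
```

With these placeholder inputs, almost all of the decision weight lands on the saturated world even though both worlds start with equal credence. The conclusion is only as good as the assumed copy counts.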
Hinting at longtermist macrostrategic implications
What is the impact of human ancestry on an SFC's expected utility? For simplicity, let’s assume the Civ-Saturation Hypothesis is 100% true. How much counterfactual value Humanity creates then depends entirely on the utility Humanity’s SFC creates relative to other SFCs. Will other SFCs create more or less utility per unit of resources than Humanity’s SFC? I.e., how different are U(SFC) and U(SFC | Human-ancestry)? Little progress has been made on this question. For reference, see quotes from (Finnveden, 2019), (Brian Tomasik, 2015), (Brauner and Grosse-Holz, 2019), (Anthony DiGiovanni, 2021) in footnotes. Most discussions stop after a few of the following arguments.
- Under moral anti-realism, humans are more likely to be higher utility since we should expect a lower level of convergence between moral values, and we are more likely to carry out our own precise values.
- There may be convergence among goals, especially between ancestors of SFCs.
- Human moral values may depend on the biological structure of our brain or on contingent cultural features.
- It is plausible that humans are more compassionate than other Intelligent Civilizations since humans are somewhat abnormally compassionate among Earthly animals.
- Humanity's SFC may cause more suffering than other SFCs because we are sentient, and conflicts may target our values.
- Humanity's SFC may create more utility because we are sentient and are more likely to create a sentient SFC.
- Humans are likely biased towards valuing themselves.
For clarity, I am not endorsing these arguments. I am listing arguments found in existing discussions.
No existing work directly studies this precise question in depth. Some related work exists, but it mostly looks at the moral values of aliens or alien SFCs, much more rarely at those of SFCs’ alien ancestors, and not at the relative expected utility of Humanity’s SFC compared with other SFCs. I will introduce novel object-level arguments about this question in a later post.
A priori, Humanity's SFC expected utility is not special. For now, let’s assume we know nothing about how conditioning on Human-ancestry impacts the utility produced by an SFC. Then U(SFC) ~ U(SFC | Human-ancestry). This assumption is similar to using the Principle of Mediocrity. What would be the macrostrategic longtermist implications in that case?
- Reducing pure extinction risks is much less valuable. Increasing P(Humanity creates an SFC) has much less longtermist value, and Nuclear and Bio X-risk reduction agendas would have a reduced priority, though their neartermist justifications would remain.
- Longtermists should optimize P(Alignment | Humanity creates an SFC). Concerning AI Safety, from the point of view of impartial longtermists, increasing P(Alignment | Humanity creates an SFC) would replace the currently common target of increasing P(Alignment AND Humanity creates an SFC). Longtermist AI Safety agendas would need to be re-evaluated using this new target.
Some existing AI Safety agendas may increase P(Alignment AND Humanity creates an SFC) while at the same time not increasing as much or even, if unlucky, reducing P(Alignment | Humanity creates an SFC). For example, such agendas may significantly reduce the chance that early AIs and AI usages destroy the potential of both Humanity and AIs at once, that is, outcomes in which no SFC is created at all.
Other currently neglected agendas may increase P(Alignment | Humanity creates an SFC) while not increasing P(Alignment AND Humanity creates an SFC). Those include agendas aiming at decreasing P(Humanity creates an SFC | Misalignment). An example of intervention in such an agenda is overriding instrumental goals for space colonization and replacing them with an active desire not to colonize space. This defensive preference could be removed later, conditional on achieving corrigibility.
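The difference between the two targets can be checked with a few lines of arithmetic. The sketch below is a minimal illustration in Python. The function name and all probabilities are hypothetical placeholders chosen for readability, not estimates. It decomposes P(Humanity creates an SFC) by alignment outcome and shows that lowering P(Humanity creates an SFC | Misalignment) raises the conditional target while leaving the joint target unchanged.

```python
def joint_and_conditional(p_aligned, p_sfc_given_aligned, p_sfc_given_misaligned):
    """Return (P(Alignment AND SFC), P(Alignment | SFC)).

    Uses the law of total probability:
    P(SFC) = P(A) * P(SFC | A) + (1 - P(A)) * P(SFC | not A).
    """
    joint = p_aligned * p_sfc_given_aligned
    p_sfc = joint + (1.0 - p_aligned) * p_sfc_given_misaligned
    conditional = joint / p_sfc
    return joint, conditional


# Hypothetical baseline: misaligned AI is as likely to create an SFC as aligned AI.
baseline = joint_and_conditional(0.5, 0.8, 0.8)

# Hypothetical "Plan B" intervention: only P(SFC | Misalignment) is reduced.
plan_b = joint_and_conditional(0.5, 0.8, 0.2)

print("baseline (joint, conditional):", baseline)
print("plan B   (joint, conditional):", plan_b)
```

In this toy setting the joint probability stays at 0.4 in both cases, while the conditional probability rises from 0.5 to 0.8. An agenda evaluated only on the joint target would register no improvement.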
The Civ-Similarity Hypothesis
Is Human ancestry neutral, positive, or negative? The implications hinted above are only plausible if U(SFC) ~ U(SFC | Human-ancestry). We formulate this requirement as a hypothesis. The Civ-Similarity Hypothesis posits that the expected utility efficiency, per unit of resources, of Humanity's future SFC is similar to that of other SFCs.
How could this hypothesis be valid? There are two main components contributing in that direction:
- High uncertainty about the future may flatten expected utilities. We may not know enough about how conditioning on human (or other) ancestry impacts the value of the long-term future produced by an SFC.
- SFCs are rare, and creating them may be very constrained, AKA convergent evolution and strong selection. We may observe that selection mechanisms and convergent evolutionary processes drastically reduce the space of possible characteristics an SFC’s ancestors can have.
How could this hypothesis be invalid?
- We may know enough to predict significant differences in expected utilities. We may already have enough information to say that Humanity's SFC will be abnormal in some specific ways relative to other SFCs. If, additionally, we are confident in how these abnormalities impact the long-term utility of Humanity's SFC, then we should be able to conclude that our future SFC is significantly higher or lower utility than other SFCs.
- We may only care about our precise values, and we may succeed at aligning our future SFC. We may consider that only our own precise values are valuable (e.g., no moral uncertainty). If, additionally, the distribution of alien moral values is much more diffuse than that of humans, even after conditioning on their ancestors first creating an SFC, and if, finally, we are confident enough in how our values impact the long-term utility produced by SFCs (e.g., we think we will succeed at alignment), then we should conclude that the hypothesis is invalid.
In later posts, we will look deeper into evaluating the Civ-Similarity Hypothesis and the tractability of making further progress there. We will see that a lot can be said regarding this hypothesis.
The Existence Neutrality Hypothesis
A third hypothesis as the conjunction of the previous two. The Existence Neutrality Hypothesis is simply the conjunction of the first two hypotheses. It posits that influencing Humanity's chance of creating an SFC produces little value compared to increasing the quality of the SFC we would eventually create, conditional on creating one. Note that this hypothesis somewhat contradicts Nick Bostrom's astronomical waste argument.
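As a rough illustration of how the two hypotheses combine, the sketch below uses a simple linear model in Python. The model is an assumption made for this example, not a result from the literature: Humanity's SFC would claim some amount of resources, a fraction `saturation` of which other SFCs would claim anyway, and each SFC produces a fixed utility per unit of resources. All names and numbers are hypothetical.

```python
def value_of_existence(resources, saturation, u_human, u_other):
    """Counterfactual value of Humanity's SFC existing at all.

    With Humanity's SFC: resources * u_human.
    Without it: other SFCs still claim `saturation * resources`,
    producing saturation * resources * u_other.
    """
    return resources * (u_human - saturation * u_other)


def value_of_quality_gain(resources, delta_u):
    """Value of raising Humanity's SFC utility per unit of resources by delta_u."""
    return resources * delta_u


# Hypothetical inputs: near-full saturation and near-equal utility per resource.
existence = value_of_existence(resources=1.0, saturation=0.95, u_human=1.0, u_other=0.98)

# Same world, but with no other SFCs around to claim the resources.
existence_if_alone = value_of_existence(resources=1.0, saturation=0.0, u_human=1.0, u_other=0.98)

# Hypothetical quality improvement, conditional on Humanity creating an SFC.
quality = value_of_quality_gain(resources=1.0, delta_u=0.5)

print(existence, existence_if_alone, quality)
```

Under these placeholder values, the value of mere existence shrinks to a small fraction of what it would be in an empty universe, while the value of a quality improvement is unaffected by saturation. This is the qualitative pattern the hypothesis describes. Real estimates would need the quantitative evaluations planned for later posts.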
Whispers of plausible importance. A few discussions about the implications of the existence of alien SFCs (including the Existence Neutrality Hypothesis) are already available online but, to my knowledge, have never led to a proper assessment of these questions. For reference, in the footnotes, you can find relevant quotes from (Brian Tomasik 2015), (Jan M. Brauner and Friederike M. Grosse-Holz, 2018), (Anthony DiGiovanni, 2021), (Maxwell Tabarrok, 2022), (MacAskill, 2023), (Toby Ord's answer to MacAskill 2023), (Jim Buhler, 2023), (Magnus Vinding 2024).
Context
Evaluating the Existence Neutrality Hypothesis - Introductory Series. This post is part of a series introducing a research project for which I am seeking funding: Evaluating the Existence Neutrality Hypothesis. This project includes evaluating both the Civ-Saturation and the Civ-Similarity Hypotheses and their longtermist macrostrategic implications. This introductory series hints at preliminary research results and looks at the tractability of making further progress in evaluating these hypotheses.
Next: A first evaluation of the Civ-Saturation Hypothesis. Over the next few posts, we will introduce a first evaluation of the Civ-Saturation Hypothesis. We will start by reviewing existing SFC density estimates and the models producing them, and by clarifying the meaning of Civ-Saturation and its impact on which possible world we should bet on.
Plan of the sequence
(Introduction)
- (1) Longtermist implications of aliens Space-Faring Civilizations - Introduction
(A first pass at evaluating the Civ-Saturation Hypothesis)
(Object-level arguments about the Civ-Similarity Hypothesis and its tractability)
- (6) The Convergent Path to the Stars - Similar Utility Across Civilizations Challenges Extinction Prioritization
- (7) High-level reasons for optimism in studying the Existence Neutrality Hypothesis
(Introducing the research project & implications)
- (8) Evaluating the Existence Neutrality Hypothesis - A research project
- (9) Macrostrategic Implications of the Existence Neutrality Hypothesis
Acknowledgments
Thanks to Tristan Cook, Magnus Vinding, Miles Kodama, and Justis Mills for their excellent feedback on this post and ideas. Note that this research was done under my personal name and that this content is not meant to represent any organization's stance.
I'm very skeptical about making moral decisions concerning the donations of potentially millions of dollars based on something so speculative. I think it's too far down the EA crazy train to prioritise different causes based on the density of alien civilisations. It's probably more speculative than the simulation hypothesis (which, if true, significantly increases the likelihood that you are the only sentient being in this universe), but we don't make moral decisions based on that.
I get that there's been a lot of work on this and that we can make progress on it (I know, I'm an astrobiologist), but I'm sure there are so many unknown unknowns associated with the origin of life, development of sentience, and spacefaring civilisation that we just aren't there yet. The universe is so enormous and bonkers and our brains are so small - we can make numerical estimates sure, but creating a number doesn't necessarily mean we have more certainty.
I've got a big moral circle (all sentient beings and their descendants), but it does not extend to aliens because of cluelessness.
I think you're posing a post-understanding of consciousness question. Consciousness might be very special or it might be an emergent property of anything that synthesises information, we just don't know. But it's possible to imagine aliens with complex behaviour similar to ours that never evolved the consciousness aspect, as superintelligent AI probably will be. For now, the safe assumption is that we're the only conscious life, and I think it's very important that we act like it until proven otherwise.
So for now, I'm quite confident that if we're thinking about the moral utility of spacefaring civilisation, we should at least limit our scope to our own civilisation, more specifically, our own sentience and its descendants (I personally prefer to limit that scope even further to the next few thousand years, or just our Solar System to reduce the ambiguity a bit - longtermism still stands strong with this huge limitation). I think the main value in looking into the potential density of aliens in the universe is that it helps figure out what our own future might look like. Even if humans only colonise the Solar System because alien SFCs colonise the galaxy, that's still 10^27 potential future lives (1.2 sextillion over the next 6000 years; future life equivalents based on the Solar System's carrying capacity; as opposed to 100 trillion if we stay on Earth till its destruction). We can control and predict that to an extent, and there's enough ambiguity and cluelessness already associated with how to make human civilisation's future in space good in the context of AI - but we can at least make some concrete decisions (e.g. work by Simon Institute & CLR).
Very interesting post though! Lots to think about and I can see that this could be the most important moral consideration... maybe... I look forward to your series and I definitely think it's worthwhile to try and figure out what that consideration might be.
I agree that the particular guesses we make about aliens will be very speculative/arbitrary. But "we shouldn't take the action recommended by our precise 'best guess' about XYZ" does not imply "we can set the expected contribution of XYZ to the value of our interventions to 0". I think if you buy cluelessness — in particular, the indeterminate beliefs framing on cluelessness — the lesson you should take from Maxime's post is that we simply aren't justified in saying any intervention with effects on x-risk is net-positive or net-negative (w.r.t. total welfare of sentient beings).
I somewhat agree with your points. Here are some contributions, and pushbacks:
Something interesting about these hypotheses and implications is that they get stronger the more uncertain we are, as long as one uses some form of EDT (e.g., CDT + exact copies). The less we know about how conditioning on Human ancestry impacts utility production, the closer the Civ-Similarity Hypothesis is to being correct. The broader our distribution over the density of SFCs in the universe, the closer the Civ-Saturation Hypothesis is to being correct. This seems true as long as correlated agents (e.g., exact copies) exist and you account for their impact. For the Civ-Similarity Hypothesis, this comes from the application of the Mediocrity Principle. For the Civ-Saturation Hypothesis, this comes from the fact that we have orders of magnitude more exact copies in saturated worlds than in empty worlds.
Consciousness is indeed one of the arguments pushing the Civ-Similarity Hypothesis toward lower values (humanity being more important), and I am eager to discuss its potential impact. Here are several reasons why the update from consciousness may not be that large:
I am very happy to get pushback and to debate the strength of the "consciousness argument" on Humanity's expected utility.
Thanks for your reply, lots of interesting points :)
I particularly appreciate that reframing of consciousness. I think it's probably both binary and continuous though. Binary in the sense that you need a "machinery" that's capable of producing consciousness i.e. neurons in a brain seem to work. And then if you have that capable machinery, you then have the range from low to high consciousness, like we see on Earth. If intelligence is related to consciousness level as it seems to be on Earth, then I would expect that any alien with "capable machinery" that's intelligent enough to become spacefaring would have consciousness high enough to satisfy my worries (though not necessarily at the top of the range).
So then any alien civilisation would either be "conscious enough" or "not conscious at all", conditional on (a) the machinery of life being binary in its ability to produce a scale of consciousness and (b) consciousness being correlated with intelligence.
So I'm not betting on it. The stakes are so high (a universe devoid of sentience) that I would have to meet and test the consciousness of aliens with a 'perfect' theory of consciousness before I updated any strategy towards reducing P(ancestral-human SFC) even if there's an extremely high probability of Civ-Similarity Hypothesis being true.