The Grabby Values Selection Thesis: What values do space-faring civilizations plausibly have?

Jim Buhler

Comments 12

Will Aldred

I’m enjoying this sequence, thanks for writing it.

I imagine you’re well aware of what I write below – I write it to maybe help some readers place this post within some wider context.

My model of space-faring civilizations' values, which I’m sure isn’t original to me, goes something like the following:

1. Start with a uniform prior over all possible values, and with the reasonable assumption that any agent or civilization in the universe, whether biological or artificial, originated from biological life arising on some planet.
2. Natural selection. All biological life probably goes through a Darwinian selection process. This process predictably favors values that are correlated with genetic fitness.
3. Cultural evolution, including moral progress. Most sufficiently intelligent life (e.g., humans) probably organizes itself into a civilization, with culture and cultural evolution. It seems harder to predict which values cultural evolution might tend to favor, though.
4. Great filters.^[1] Notably,
   - Self-destruction. Values that increase the likelihood of self-destruction (e.g., via nuclear brinkmanship-gone-wrong) are disfavored.
   - Desire to colonize space, aka grabbiness. As this post discusses, values that correlate with grabbiness are favored.
   - (For more, see Oesterheld (n.d.).)
5. A potentially important curveball: the transition from biological to artificial intelligence.
   - AI alignment appears to be difficult. This probably undoes some of the value selection effects I describe above, because some fraction of space-faring agents/civilizations is presumably AI with values not aligned with those of their biological creators, and I expect the distribution of misaligned AI values, relative to the distribution of values that survive the above selections, to be closer to uniform over all values (i.e., the prior we started with).
   - Exactly how hard alignment is (i.e., what fraction of biological civilizations that build superintelligent AI are disempowered?), as well as some other considerations (e.g., are alignment failures generally near misses or big misses?; if alignment is effectively impossible, then what fraction of civilizations are cognizant enough to not build superintelligence?), likely factor into how this curveball plays out.

^[1] Technically, I mean late-stage steps within the great filter hypothesis (Wikipedia, n.d.; LessWrong, n.d.).
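
As a rough illustration of the model in this comment, here is a toy sketch in Python. Everything numeric in it is an assumption invented for illustration: the three value types, the selection weights, and the `f_misaligned` parameter are hypothetical and are not estimates from the comment. Cultural evolution is left out because the comment says its direction is hard to predict.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def apply_selection(dist, fitness):
    """Reweight a distribution by per-value selection strength, then renormalize."""
    return normalize({k: dist[k] * fitness[k] for k in dist})


def mix_with_uniform(dist, f_misaligned):
    """Blend the selected distribution with the uniform prior.

    f_misaligned is the (hypothetical) fraction of space-faring agents that are
    misaligned AIs, whose values are assumed here to be drawn uniformly.
    """
    n = len(dist)
    return {k: (1 - f_misaligned) * p + f_misaligned / n for k, p in dist.items()}


# Hypothetical value types and made-up selection weights (illustration only).
values = ["grabby", "neutral", "self_destructive"]
prior = normalize({v: 1.0 for v in values})  # step 1: uniform prior

natural_selection = {"grabby": 2.0, "neutral": 1.0, "self_destructive": 1.0}
great_filters = {"grabby": 2.0, "neutral": 1.0, "self_destructive": 0.5}

selected = apply_selection(apply_selection(prior, natural_selection), great_filters)
final = mix_with_uniform(selected, f_misaligned=0.5)
```

Under these assumptions, the larger `f_misaligned` is, the closer `final` sits to the uniform prior, which is the sense in which the curveball "undoes" some of the earlier selection.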

Jim Buhler

Thanks a lot for this comment! I linked to it in a footnote. I really like this breakdown of different types of relevant evolutionary dynamics. :)

kokotajlod

Nice post! My current guess is that the inter-civ selection effect is extremely weak and that the intra-civ selection effect is fairly weak. N=1, but in our civilization the people gunning for control of AGI seem more grabby than average but not drastically so, and it seems possible for this trend to reverse, e.g., if the US government nationalizes all the AGI projects.

Jim Buhler

Thanks for the comment! :) You're assuming that the AGI's values will be pretty much locked-in forever once it is deployed such that the evolution of values will stop, right? Assuming this, I agree. But I can also imagine worlds where the AGI is made very corrigible (such that the overseers stay in control of the AGI's values) and where intra-civ value evolution continues/accelerates. I'd be curious if you see reasons to think these worlds are unlikely.

kokotajlod

Not sure I'm assuming that. Maybe. The way I'd put it is, selection pressure towards grabby values seems to require lots of diverse agents competing over a lengthy period, with the more successful ones reproducing more / acquiring more influence / etc. Currently we have this with humans competing for influence over AGI development, but it's overall fairly weak pressure. What sorts of things are you imagining happening that would strengthen the pressure? Can you elaborate on the sort of scenario you have in mind?

Jim Buhler

Right, so assuming no early value lock-in and the values of the AGI being (at least somewhat) controlled/influenced by its creators, I imagine these creators to have values that are grabby to varying extents, and these values are competing against one another in the big tournament that is cultural evolution.

For simplicity, say there are only two types of creators: the pure grabbers (who value grabbing (quasi-)intrinsically) and the safe grabbers (who are in favor of grabbing only if it is done in a "safe" way, whatever that means).

Since we're assuming there hasn't been any early value lock-in, the AGI isn't committed to some form of compromise between the values of the pure and safe grabbers. Therefore, you can imagine that the AGI allows for competition and helps both groups accomplish what they want proportionally to their size, or something like that. From there, I see two plausible scenarios:
A) The pure and safe grabbers are two cleanly separated groups running a space expansion race against one another, and we should -- all else equal -- expect the pure grabbers to win, for the same reasons why we should -- all else equal -- expect the AGI race to be won by the labs optimizing for AI capabilities rather than for AI safety.
B) The safe grabbers "infiltrate" the pure grabbers in an attempt to make their space-expansion efforts "safer", but are progressively selected against since they drag the pure-grabby project down. The few safe grabbers who might manage not to value drift and not to get kicked out of the pure grabbers are those who are complacent and not pushing really hard for more safety.
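
A minimal sketch of the dynamic in scenario A, assuming each group's resources simply compound at a fixed per-step growth factor. The function name and the growth factors are hypothetical and chosen only to illustrate the "all else equal" reasoning; nothing here is an estimate.

```python
def pure_share_over_time(initial_pure_share, pure_growth, safe_growth, steps):
    """Track the pure grabbers' share of total controlled resources.

    Each group's resources are multiplied by its own (hypothetical) growth
    factor at every step; the share is pure / (pure + safe).
    """
    pure = initial_pure_share
    safe = 1.0 - initial_pure_share
    shares = [pure / (pure + safe)]
    for _ in range(steps):
        pure *= pure_growth
        safe *= safe_growth
        shares.append(pure / (pure + safe))
    return shares


# Made-up numbers: both groups start equal, pure grabbers expand slightly faster.
shares = pure_share_over_time(0.5, pure_growth=1.10, safe_growth=1.05, steps=50)
```

In this toy setup, any persistent growth advantage compounds, so the pure grabbers' share rises at every step; with equal growth factors the share never moves, which corresponds to there being no selection at all.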

The reason why the intra-civ grabby values selection is currently fairly weak on Earth, as you point out, is that humans haven't even started colonizing space, which makes something like A or B very unlikely to have happened yet. Arguably, the process that may eventually lead to something like A or B hasn't even begun for real. We're unlikely to notice a selection for grabby values before people actually start running something like a space expansion race. And most of those we might expect to want to somehow get involved in the potential^[1] space expansion race are currently focused on the race to AGI, which makes sense. It seems like this latter race is more relevant/pressing right now.

^[1] It seems like this race will happen (or actually be worth running) if, and only if, AGI has non-locked-in values and is corrigible(-ish) and aligned(-ish) with its creators, as we suggested.

Filip Sondej

My main crux regarding the inter-civ selection effect is how fast space colonization will get. E.g., if it's possible to produce small black holes, you can use them for incredibly efficient propulsion, and even just slightly grabby civs would still spread at approximately the speed of light - roughly the same speed as extremely grabby civs. Maybe it's also possible with fusion propulsion but I'm not sure - you'd need to ask astro nerds.
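
To make the speed point concrete, here is a back-of-the-envelope sketch. It assumes two civilizations that start at the same time and expand as uncontested spheres, so claimed volume scales with the cube of expansion speed; it ignores cosmological expansion and any contested borders. The function name and the example speeds are hypothetical.

```python
def volume_share(speed_a, speed_b):
    """Share of total claimed volume held by civilization A.

    Assumes both civilizations start at the same time and expand as
    uncontested spheres, so volume scales with speed cubed.
    Speeds are given as fractions of the speed of light.
    """
    vol_a = speed_a ** 3
    vol_b = speed_b ** 3
    return vol_a / (vol_a + vol_b)


# Made-up speeds: both civilizations expand at nearly the speed of light.
near_light = volume_share(0.999, 0.99)

# Made-up speeds: a large gap in expansion speed.
large_gap = volume_share(0.9, 0.1)
```

Under these assumptions, when both speeds are close to light speed the split stays close to even, so extra grabbiness buys little; a large speed gap would instead hand almost everything to the faster civilization.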

> values aligned with a (potentially discoverable?) moral truth will be more competitive than those that are the most grabbing-prone

I guess the main hope is not that morality gives you a competitive edge (that's unlikely) but rather that enough agents stumble on it anyway, e.g., by realizing through philosophical reflection that open/empty individualism is true.

> agents with values that might a priori seem less grabbing-prone could still prioritize colonizing space, as a first step, to not fall behind in the race against other agents (aliens or other agents within their civilization), and actually optimize for their values later, such that there is little selection effect

Yeah, I definitely expect that.

Jim Buhler

> My main crux regarding the inter-civ selection effect is how fast space colonization will get. E.g., if it's possible to produce small black holes, you can use them for incredibly efficient propulsion, and even just slightly grabby civs would still spread at approximately the speed of light - roughly the same speed as extremely grabby civs. Maybe it's also possible with fusion propulsion but I'm not sure - you'd need to ask astro nerds.

I haven't thought about whether this should be the main crux but very good point! Magnus Vinding and I discuss this in this recent comment thread.

> I guess the main hope is not that morality gives you a competitive edge (that's unlikely) but rather that enough agents stumble on it anyway, e.g., by realizing through philosophical reflection that open/empty individualism is true.

Yes. Related comment thread I find interesting here.

Filip Sondej

(The links you sent are broken)

Jim Buhler

Oops yeah thanks. Fixed :)

Iyngkarran Kumar

Just stumbled upon this sequence and happy to have found it! There seems to be lots of analysis ripe for picking here.

Some thoughts on the strength of the grabbiness selection effect below. I’ll likely come back to this to add further thoughts in the future.

One factor that seems to be relevant here is the number of pathways to technological completion. If we assume that the only civilisations that dominate the universe in the far future are the ones that have reached technological completion (seems pretty true to me), then tautologically, the dominating civilisations must be those who have walked the path to technological completion. Now imagine that in order to reach technological completion, you must tile 50% of the planets under your control with computer chips, but your value system means that you assign huge disvalue to tiling planets with computer chips*. As a result, you’ll refuse to walk the path to technological completion, and be subjugated or wiped out by the civilisations that did go forward with this action.

The more realistic example here is a future in which suffering subroutines are a necessary step towards technological completion, and so civilisations that disvalue suffering enough to not take this step will be dominated by civilisations that either (1) don’t care about suffering or (2) are willing to bite the bullet of creating suffering subroutines in order to pre-emptively colonise their available resources.

So the question here is how many paths are there to technological completion? Technological completion could be like a mountain summit that is accessible from many directions - in that case, if your value system doesn’t allow you to follow one path, you can change course and reach the summit from the other direction. But if there’s just a single path with some steps that are necessary to take, then this will constrain the set of value systems that dominate the far future. Sketching out precedents for technological completion would be a first step to gaining clarity here.
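
One way to see why the number of paths matters is the toy calculation below. It assumes, purely for illustration, that a given value system rules out each path independently with the same probability; the function name and both parameters are hypothetical.

```python
def prob_can_reach_completion(num_paths, prob_path_blocked):
    """Chance that at least one path is acceptable to a given value system.

    Assumes each path is blocked independently with the same probability,
    so all paths are blocked with probability prob_path_blocked ** num_paths.
    """
    return 1.0 - prob_path_blocked ** num_paths


# Made-up numbers: a value system that objects to half of all possible paths.
single_path = prob_can_reach_completion(num_paths=1, prob_path_blocked=0.5)
many_paths = prob_can_reach_completion(num_paths=10, prob_path_blocked=0.5)
```

With a single path, a value system that objects to it is simply excluded; with many independent paths, even a fairly restrictive value system almost always finds a route, so technological completion would filter values much less.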

*This value system is just for the thought experiment, I’m not claiming that it’s a likely one.

^{^}

By “the most powerful”, I mean “those who control the most resources such that they’re also those who achieve their goals most efficiently.”

^{^}

Other pieces have pointed at potential dynamics that are fairly similar/analogous. Nick Bostrom (2004) explores “scenarios where freewheeling evolutionary developments, while continuing to produce complex and intelligent forms of organization, lead to the gradual elimination of all forms of being that we care about.” Paul Christiano (2019) depicts a scenario where “ML training [...] gives rise to “greedy” patterns that try to expand their own influence.” Allan Dafoe (2019; 2020) coined the term “value erosion” to illustrate a dynamic where “[j]ust as a safety-performance tradeoff, in the presence of intense competition, pushes decision-makers to cut corners on safety, so can a tradeoff between any human value and competitive performance incentivize decision makers to sacrifice that value.”

^{^}

They arguably have already been somewhat selected for, via natural and cultural evolution (see Will Aldred's comment) long before space colonization becomes a possibility, though.

^{^}

Thanks to Robin Hanson for pointing out this last part to me, and for helping me realize that differentiating between the intra-civ selection and the inter-civ one was much more important than I previously thought.

^{^}

Dafoe (2019, section Frequently Asked Question) makes an analogous point.
