An even deeper atheism

Joe_Carlsmith

Comments 2

Matthew_Barnett

"We can haggle about some of the details of Yudkowsky's pessimism here... but I'm sympathetic to the broad vibe: if roughly all the power is held by agents entirely indifferent to your welfare/preferences, it seems unsurprising if you end up getting treated poorly. Indeed, a lot of the alignment problem comes down to this."

I agree with the weak claim that if literally every powerful entity in the world is entirely indifferent to my welfare, it is unsurprising if I am treated poorly. But I suspect there's a stronger claim underneath this thesis that seems more relevant to the debate, and also substantially false.

The stronger claim is: adding powerful entities to the world who don't share our values is selfishly bad, and the more such entities we add to the world, the worse our situation becomes (according to our selfish values). We know this stronger claim is likely false because, assuming we accept the deeper atheism claim that humans have non-overlapping utility functions, the claim would imply that ordinary population growth is selfishly bad. Think about it: by permitting ordinary population growth, we are filling the universe with entities who don't share our values. Population growth, in other words, causes our relative power in the world to decline.

Yet, I think a sensible interpretation is that ordinary population growth is not bad on these grounds. I doubt it is better, selfishly, for the Earth to have 800 million people compared to 8 billion people, even though I would have greater relative power in the first world compared to the second. [ETA: see this comment for why I think population growth seems selfishly good on current margins.]

Similarly, I doubt it is better, selfishly, for the Earth to have 8 billion humans compared to 80 billion human-level agents, 90% of which are AIs. Likewise, I'm skeptical that it is worse for my values if there are 8 billion slightly-smarter-than-human AIs who are individually, on average, 9 times more powerful than humans, living alongside 8 billion humans.
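To make the arithmetic in these two scenarios concrete, here is a rough sketch. It assumes (this is an illustrative simplification, not something stated in the comment) that power is additive, that each human holds one unit of power, and that each AI holds a fixed multiple of that unit. The function name `human_power_share` and its parameters are hypothetical.

```python
from fractions import Fraction

BILLION = 10**9


def human_power_share(humans: int, ais: int, ai_power_ratio: int = 1) -> Fraction:
    """Fraction of total power held by humans.

    Assumption (illustrative only): each human holds 1 unit of power and
    each AI holds `ai_power_ratio` units, and power simply adds up.
    """
    human_power = humans
    ai_power = ais * ai_power_ratio
    return Fraction(human_power, human_power + ai_power)


# Scenario 1: 80 billion human-level agents, 90% of which are AIs.
share_many_equal_ais = human_power_share(8 * BILLION, 72 * BILLION)

# Scenario 2: 8 billion humans alongside 8 billion AIs, each 9x as powerful.
share_fewer_stronger_ais = human_power_share(8 * BILLION, 8 * BILLION, ai_power_ratio=9)

print(share_many_equal_ais, share_fewer_stronger_ais)
```

Under these assumptions, humans hold the same one-tenth share of total power in both scenarios, which is also the factor by which an individual's relative power falls when the population grows from 800 million to 8 billion.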

(This is all with the caveat that the details here matter a lot. If, for example, these AIs have a strong propensity to be warlike, or aren't integrated into our culture, or otherwise form a natural coalition against humans, it could very well end poorly for me.)

If our argument for the inherent danger of AI applies equally to ordinary population growth, I think something has gone wrong in our argument, and we should probably reject it, or at least revise it.

Vasco Grilo🔸

Nice post, Joe!

"I reject Yudkowsky's story that some particular AI will foom and become dictator-of-the-future; rather, I think there will be a multi-polar ecosystem of different AIs with different values. Thus: problem solved?" Well, hmm: what values in particular? Is it all still ultimately an office-supplies thing? If so, it depends how much you like a complex ecosystem of staple-maximizers, thumb-tack-maximizers, and so on – fighting, trading, etc. "Better than a monoculture." Maybe, but how much?^[9] Also, are all the humans still dead?

In my mind, there is a sense in which this last question is analogous to Neanderthals^[1] asking a few hundred thousand years ago whether they would still be around now. They are not, but is this any significant evidence that the world has gone through a much less valuable trajectory? I do not think so. What arguably matters is whether there are still beings around with the desire and ability to increase welfare. So I would instead ask, "are all intelligent welfarists dead?", where intelligent could be interpreted as sufficiently intelligent to eventually leverage (via successors or not) the cosmic endowment to increase welfare. My question is equivalent to yours in the near term, since humans are the only intelligent welfarists now, but the answers may come apart in the next few decades thanks to (even more) intelligent sentient AI. To the extent the answers to the two questions differ, it seems important to focus on the right one.

^[1] Or individuals of another species of the genus Homo. There are 12 besides Homo sapiens!

"You could very analogously say 'human faces are fragile' because if you just leave out the nose it suddenly doesn't look like a typical human face at all. Sure, but is that the kind of error you get when you try to train ML systems to mimic human faces? Almost none of the faces on thispersondoesnotexist.com are blatantly morphologically unusual in any way, let alone noseless."
I think Stuart Russell's comment here – "A system that is optimizing a function of n variables, where the objective depends on a subset of size k<n, will often set the remaining unconstrained variables to extreme values; if one of those unconstrained variables is actually something we care about, the solution found may be highly undesirable" – really doesn't cut it.
See also: "The tails come apart" and "Beware surprising and suspicious convergence." Plus Yudkowsky's discussion of Corrigible and Sovereign AIs here, both of which appeal to the notion of wanting "exactly what we extrapolated-want."
I'm no fan of experience machines, but still – yes? Worth paying a lot for over paperclips, I think.
Indeed, Soares gives various examples of humans doing similar stuff here.
See also Soares here.
Thanks to Carl Shulman for suggesting this example, years ago. One empirical hypothesis here is that in fact, human reflection will specifically try to avoid leading to path-dependent conclusions of this kind. But again, this is a convenient and substantive empirical hypothesis about where our meta-reflection process will lead (and note that anti-realism assumes that some kind of path dependence must be OK regardless – e.g., you need ways of not caring about the fact that in some possible worlds, you ended up caring about paperclips).
My sense is that Yudkowsky deems this behavior roughly as anti-natural as believing that 222+222=555, after exposure to the basics of math.
And note that "having AI systems with lots of different value systems increases the chances that those values overlap with ours" doesn't cut it, at least in the context of extremal Goodhart, because sufficient similarity with human values requires hitting such a narrow target so precisely that throwing more not-aimed-well-enough darts doesn't help much. And the same holds if we posit that the AI values will be "complex" rather than "simple." Sure, human values are complex, so AIs with complex values are at least still in the running for alignment. But the space of possible complex value systems is also gigantic, so the narrow target problem still applies.
