TL;DR: Imagine that a person (or a limited group of n people), B, creates an ASI and converts the whole solar system into a simulation of a succession of very happy mental states for a copy of B – i.e., sim-B. Since sim-B is modeled after a human who doesn’t like to feel alone, the responses of other people-in-the-sim towards B are simulated, too – i.e., sim-B lives in a solipsistic world, interacting with “philosophical zombies”. That's what I call the Solipsistic Repugnant Conclusion - for lack of a better name.

We could replace B with a set of people favored by B. Many of us would still regard this as a (morally) catastrophic scenario.

[wow, maybe this explains the plot of Being John Malkovich]

Epistemic status: I’m pretty confident that (a) I am not totally off the mark regarding this problem, (b) this is not its best or most original presentation, and (c) there might be an objection out there that would satisfy me.

[Edit: of course, I should have given more credit to classic thought experiments from physics and philosophy, such as the Boltzmann brain or the brain in a vat]

I think the conclusion in the TL;DR is likely, given some attractive principles and plausible hypotheses many of us embrace:

  1. Impartiality towards simulations: a high-fidelity simulation of an agent B is not axiologically inferior to the original B.
  2. Simulations are cheaper: for any relevant experience of a person in our world, you can produce a high-fidelity simulation of that experience for a tiny fraction of its energy resources.
  3. Z-phil or "NPCs": it’s possible and cheaper to simulate the responses of an agent C towards an agent B without simulating C’s conscious and internal mental states – i.e., through a low-fidelity simulation – and without B ever realizing it’s not dealing with a “real” (or high-fidelity simulated) agent.
  4. Pareto hedonistic sum-utilitarianism: the best possible world displays maximal general happiness… or better: a world w is in the set of the best possible worlds iff w does not display a smaller sum of positive mental states than any other world w' (i.e., w is weakly preferred to every w' in this respect).
  5. The economics of Kolmogorov complexity: simulating one agent is cheaper (it requires a smaller program) than simulating two. [I'm particularly uncertain about this]
  6. Against hedonic treadmill: though humans in the real world have concave utility functions, because of the hedonic treadmill and decreasing marginal returns for consumption, it’s possible and desirable to simulate agents with something like “linear” utility functions. Simulating two identical experiences for one agent would then be equivalent to simulating that experience once for each of two simulated agents.
  7. Selfish preferences, or the Ego Principle: an agent B is allowed to favor their own mental states (or the mental states of those they love) over someone else’s experiences.


Given 1-6, if one wants to maximize the amount of positive hedonic mental states in a simulation, there should be only one agent in the utilitarian simulation (or similarly: there is an upper bound for the population of the optimal simulation), running for a maximally extended subjective time.
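To make the step from premises 1-6 to "only one agent" concrete, here is a minimal toy sketch. Everything in it is an assumption of mine for illustration: the function name, the fixed budget, the per-agent overhead (standing in for premise 5), and the linear valuation of experiences (premise 6). NPC costs (premise 3) are treated as negligible.

```python
def total_happiness(n_agents, budget=1000.0, agent_cost=50.0,
                    experience_cost=1.0, utility_per_experience=1.0):
    """Toy model (all numbers hypothetical).

    Each distinct conscious agent costs a fixed overhead (premise 5).
    Whatever budget is left buys experiences, which are valued
    linearly no matter which agent has them (premise 6).
    """
    remaining = budget - n_agents * agent_cost
    if n_agents < 1 or remaining < 0:
        return 0.0
    experiences = remaining / experience_cost
    return utility_per_experience * experiences


# Search over population sizes from 1 to 20 for the happiest world.
best_n = max(range(1, 21), key=total_happiness)
```

Under these assumptions every extra agent only adds overhead, so the sum is largest with a single agent. If the per-agent overhead fell as the population grew, the optimum could move to a larger but still bounded population.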

I wonder if a different result would come from weaker premises. Even if we drop (6) – a move I find attractive, since the value of my experiences depends a lot on memory, plans and expectations, which possibly leads to something like a concave utility function – it is still desirable to simulate a maximal number of instances of the same agent. Given 7, an agent who is designing a simulation would be morally allowed to populate this simulation only with their own copies (or with their loved ones). Maybe we could weaken (5), too: there might be economies of scale in simulating more agents - but only up to a point. Probably (4) could be replaced with another kind of welfare aggregation, too - but I doubt this would change the overall conclusion.

Even if we drop (7) [1], I am not sure the conclusion would be much better: we should instead identify the agent (or minimal set of agents) x whose repeated simulation would result in the largest amount of positive hedonic states. If B is one of the possible values of x, then B could still be justified in converting everything into "B-verse".

Two types of questions arise:

a) What is (morally) wrong with this scenario? Surely, I don't like it, but maybe that's just my selfish preferences talking; would I change my mind if I could be B? Is my dissatisfaction compensated by the satisfaction of B's preferences? If I knew I wouldn't be simulated anyway (nor those I care about most), would I be indifferent between the "B-verse" and a world with many other agents? And how many is "too few"? A country, a religion, an ethnicity, a generation... a species?

If, from a utilitarian perspective, a world with 10^24 sim-Bs is as good as a world with 10^24 different people, on what grounds can one prefer the latter?

b) Is this an actual risk - and if so, how can we avoid it (if we want to)?


Egalitarian constraints and Faustian pacts

This shouldn’t be a surprise: one of the fears in AI safety / ethics / policy is that AI will increase inequality or optimize for a very limited set of values. This conclusion is just a limiting case of such arguments. Also, the literature on population ethics is now full of impossibility theorems; this might be just an instance of a larger pattern.

My current hypothesis is something like a contractarian / contractualist reasoning: people evolved strong meta-preferences for egalitarian principles in order to avoid competition for resources and coordination failures - to prevent things like prisoner's dilemmas or stag hunts. The scenario derived from (1-7) above implies a possible race to the bottom, where everyone who "could be B" would compete for this position (yeah, think about it as a sophisticated memetic evolutionary conflict to maximize offspring / gene-in-the-loci), and this competition might lead to suboptimal results (shortcuts in AI safety, open war, etc. - pick your preferred analogue of a Hobbesian state of nature). Thus, the relevant agents would have an incentive to reach an agreement. This would answer (a) and (b) above.

My first thought about this Solipsistic Repugnant Conclusion was “I don’t want to be converted into computronium to simulate heaven for a utility monster”. But that’s the wrong line of thought; suppose B manages to avoid affecting the lives of anyone currently alive – maybe they could even leave Earth untouched and convert only the rest of the galaxy into their simulation. I think this is still unfair, and possibly a waste of our cosmic endowment.

In the limit, perhaps B could even BE everyone currently alive: suppose we strike a “Faustian pact” with an ASI and extend our lives for eons in a simulation where only we would have internal states... Though I don’t want to die, I don’t like this idea: I suspect some sort of variability might be a relevant human value. So maybe the need for egalitarian principles to avoid conflict or increase the prospects of cooperation is not the only problem here.


Part II - The vagaries of personal identity

Now things become a bit more metaphysical – this second part is more like an arrangement of observations that would have been better placed on my personal blog, if they were not a natural continuation of the analysis above. Those who are familiar with Reasons and Persons can probably stop here. I just want to remark how confusing some intuitions about personal identity are even without fancy thought-experiments.

  • Indexical identity and origins: it is quite plausible that origins might fix the essential properties of an object – in the sense that if x has a different origin than y, then x ≠ y. So if my parents had had an embryo from different gametes than those that formed me, they would have had a different child, and I wouldn’t exist. That’s the sense of identity behind statements like “I could have been a crazy nazi Christian cook drag queen” – where “could” denotes a logical (modal) possibility. But I couldn’t have had a very different original genome; unfortunately, there’s no possible world where I have hotdog fingers.

(Of course, eventually, the discussion might end up in one of vagueness and borders: if only one chromosome had been different, would this imply my parents would have had a different kid? Notice that this doesn’t imply that your indexical identity is fixed by your DNA – just by your origins)

This might sound obvious, but then:

  • Mental identity and indifference to substrate: some people love (I enjoy it, but I don’t love it) the idea of brain uploading; they think that if someone x copies their mind (memories, personality traits, etc.) into a high-fidelity simulation, then this simulated x would be identical to the original x in the relevant practical senses. That’s one of the reasons behind (1) Impartiality… above – though (1) is weaker than this thesis. This idea is bread and butter in sci-fi, and David Chalmers has been giving philosophical plausibility to it.

In this sense “I could have been a crazy nazi…” strikes me as false: I cannot be identical to an agent with a very different personality – but again, vagueness: how much could you change my personality before I stop identifying myself with the resulting agent?

A possible objection to this is cardinal temporal identity: we usually think that I can be identical with only one object at a time (i.e., me) – that’s the point of Parfit’s thought experiment with a malfunctioning teleport machine. But a simulation could have different instances of the same agent running at (what seems to me) the same time – i.e., instances of sim-x could be having distinct incompatible experiences… Though counter-intuitive, I am not sure this should be a concern – maybe one could attack this objection with the relativity of simultaneity.


  1. ^

    Which leads me to ponder the notion of a separated self as an illusion, or the idea that consciousness is one big thing, and that what distinguishes the different instances of consciousness is not metaphysically and axiologically relevant – it only becomes relevant due to limitations and contingencies of our interactions. In one of his dialogues (damn, I still can’t find the reference… maybe it was in 5000 BC), Raymond Smullyan considers this idea that consciousness could be one thing moving really fast between brains… But perhaps this could raise a problem for conscious agents at astronomical distances.

    This reminds me of Wittgenstein’s controversial reasoning in the Tractatus (5.64) concluding that the solipsistic self must shrink to one extensionless point (the closest I know to a “philosophical singularity”), so that solipsism finally coincides with pure realism.

