SIA > SSA, part 2: Telekinesis, reference classes, and other scandals

by Joe_Carlsmith, 52 min read, 1st Oct 2021

(Cross-posted from Hands and Cities. Previously in sequence: SIA > SSA, part 1: Learning from the fact that you exist.)

This post is the second in a four-part series, explaining why I think that one prominent approach to anthropic reasoning (the “Self-Indication Assumption” or “SIA”) is better than another (the “Self-Sampling Assumption” or “SSA”). This part focuses on objections to SSA. In particular, SSA implies: 

IV. The inevitability of presumptuousness

To get an initial flavor of some basic trade-offs between SIA and SSA, let’s look at the basic dialectic surrounding two versions of God’s extreme coin toss:

God’s extreme coin toss with jackets: God flips a fair coin. If heads, he creates one person with a red jacket. If tails, he creates one person with a red jacket, and a million people with blue jackets.

  • Darkness: God keeps the lights in all the rooms off. You wake up in darkness and can’t see your jacket. What should your credence be on heads?
  • Light+Red: God keeps the lights in all the rooms on. You wake up and see that you have a red jacket. What should your credence be on heads?

(I’ll assume, for simplicity, that the SSA reference class here is “people,” and excludes God. I talk about fancier reference-class footwork below.)

In Darkness, SIA is extremely confident that the coin landed tails, because waking up at all is a million-to-one update towards tails. SSA, by contrast, is 50-50: you’re the same fraction of the reference class either way. In Light+Red, by contrast, SIA is 50-50: there’s only one person in your epistemic situation in each world. SSA, by contrast, is extremely confident that the coin landed heads. On heads, after all, you’re 100% of the reference class; but on tails, you’re a tiny sliver.
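
As a rough check on these four verdicts, here is a minimal Python sketch. It assumes, as above, that the SSA reference class is "people." It encodes SIA as weighting each world by the number of people in your epistemic situation, and SSA as weighting each world by the fraction of the reference class that is in your epistemic situation. The function and variable names are illustrative choices, not anything from Bostrom.

```python
from fractions import Fraction

PRIOR = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}
# Total people created in each world (the assumed SSA reference class).
PEOPLE = {"heads": 1, "tails": 1 + 1_000_000}


def normalize(weights):
    """Rescale non-negative weights so they sum to one."""
    total = sum(weights.values())
    return {world: w / total for world, w in weights.items()}


def sia(prior, in_situation):
    """Weight each world by the number of people in your epistemic situation."""
    return normalize({w: prior[w] * in_situation[w] for w in prior})


def ssa(prior, in_situation, reference_class):
    """Weight each world by the fraction of the reference class in your situation."""
    return normalize(
        {w: prior[w] * Fraction(in_situation[w], reference_class[w]) for w in prior}
    )


# Darkness: nobody can see their jacket, so everyone shares your situation.
darkness = PEOPLE
# Light+Red: only the single red-jacketed person shares your situation.
light_red = {"heads": 1, "tails": 1}

sia_darkness = sia(PRIOR, darkness)
ssa_darkness = ssa(PRIOR, darkness, PEOPLE)
sia_light_red = sia(PRIOR, light_red)
ssa_light_red = ssa(PRIOR, light_red, PEOPLE)

for label, posterior in [
    ("SIA, Darkness", sia_darkness),
    ("SSA, Darkness", ssa_darkness),
    ("SIA, Light+Red", sia_light_red),
    ("SSA, Light+Red", ssa_light_red),
]:
    print(f"{label}: credence on heads = {float(posterior['heads']):.7f}")
```

Exact fractions are used so that the million-to-one ratios are not blurred by floating-point rounding.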

Thus, both views imply an extreme level of confidence in some version of the case. And at least in Bostrom’s dialectic, the most prominent problem cases for each view basically amount to a restatement of this fact. (In particular, for those familiar with Bostrom, the Presumptuous Philosopher is basically just a restatement of SIA’s verdict in Darkness. The Doomsday Argument, Adam and Eve, UN++, and Quantum Joe are all basically just restatements of SSA’s verdict in Light+Red.) I’ll suggest, though, that while such confidence can be made counterintuitive in both cases, SSA’s version is worse.

Let’s start with the Presumptuous Philosopher:

The Presumptuous Philosopher: There are two cosmological theories, T1 and T2, both of which posit a finite world. According to T1, there are a trillion trillion observers. According to T2, there are a trillion trillion trillion observers. The (non-anthropic) empirical evidence is indifferent between these theories, and the scientists are preparing to run a cheap experiment that will settle the question. However, a philosopher who accepts SIA argues that this experiment is not necessary, since T2 is a trillion times more likely to be correct.

It seems strange, in this case, for the philosopher to be so confident about the true cosmology, simply in virtue of the number of observers at stake. After all, isn’t cosmology an empirical science? What’s this philosopher doing, weighing in on an empirical dispute with such confidence, on the basis of no evidence whatsoever save that she exists? Go back to your armchair, philosopher! Leave the science to the scientists!

Indeed, we can make the presumptuous philosopher look even more foolish. We can imagine, for example, the empirical evidence favors T1 a thousand to one. Still, the philosopher bets hard against its prediction about the next experiment, and in favor of T2. Unsurprisingly to the scientists, she loses. Now the evidence favors T1 a million to one. Broke, she mortgages her house to bet again, on the next experiment. Again, she loses. At this point, the scientists are feeling sorry for her. “The presumptuous philosopher,” Bostrom writes, “is making a fool of [her]self” (p. 9).

For many people, this is close to the end of the train for SIA: presumptuousness of this kind (not to mention humiliation — and in front of the scientists, no less) is just too much to handle. And to be clear: I agree that this is a very bad result. For now, though, after nit-picking a little bit about the example as presented, I want to argue that SSA’s implications are (a) just as bad (and presumptuous, unscientific, humiliating, etc), and (b) worse.

Let’s start with the nit-picks. First, the example as usually presented is phrased in terms of observers (I attribute this to the literature’s focus on R-SIA discussed above): but as I discussed in part 1, to be relevant to SIA’s ultimate verdicts, it needs to be phrased in terms of people in your epistemic situation. That is, it needs to be the case that on T2, there are a trillion times more candidates for people who might be you. Suppose, for example, that the cosmologies in question work like this. In both cases, earth sits at the exact center of a giant, finite sphere of space. On T1, the sphere is smaller, and so has fewer non-earth observers; and on T2, it’s bigger, and so has more. In both cases, though, all these non-earth observers can tell that they’re not in the center. In this case, SIA doesn’t care about the observer count, because it’s the same number of people-who-could-be-me either way. Thus, SIA follows the science: just do the experiment. For the case to work, then, the cosmologies in question need to be such that the extra observers could be us. Of course, lots of cosmologies are like this (just because we know we’re on earth doesn’t mean we know “where earth is” in an objective world), so you can in fact make versions of the case that work: hence the label “nit-pick.” But it’s a nit-pick that will become relevant below.

My second nit-pick is that pretty clearly, you shouldn’t be 100% on a given theory of anthropics (see Carl Shulman’s comment here). So while it’s true that these sorts of credences are implied by SIA, they’re not implied by a reasonable-person’s epistemic relationship to SIA.

My third nit-pick is that I think it’s at least a bit unfair, in a debate about the right credences to have in this scenario, to imagine the philosopher losing all these bets. That is, if SIA is right, then it’s not the case that the non-anthropic empirical evidence is the sole relevant guide as to what will result from the experiment — the fact that you exist at all, in your epistemic situation, is also itself a massive update. Indeed, if we take this update seriously, then to even end up in a situation in which the non-anthropic empirical evidence favors T1 by a factor of a thousand seems like it might be positing something very weird having happened — something we might expect to induce the type of model-uncertainty I just mentioned. And more broadly, to SIA, imagining the philosopher losing these bets is similar to imagining someone betting hard against Bob winning the lottery, and losing twice in a row: by hypothesis, it almost certainly wouldn’t happen. (That said, after the first loss, the model uncertainty thing comes into the lottery case as well: e.g., something fishy is going on with Bob…)

All that said, I don’t think these nit-picks, on their own, really take the bite out of the case. The more important point is that SSA gets bitten too.

To see this, return to the version of the case just discussed, in which on both theories, earth is at the center of a giant sphere of space, but on T2, the sphere and observer count are bigger. Let’s say the non-anthropic empirical evidence, here, is 50-50, or a thousand to one in favor of T2, or whatever. As mentioned above, now SIA just follows the science. SSA, though, suddenly jumps into the role of presumptuous philosopher (or at least, it does if we use a reference class that includes the non-earth observers — more on epicycles that try to avoid this below). After all, on T2 and SSA, we are a much smaller fraction of the reference class, and it was hence much less likely that we find ourselves in our epistemic position, on earth. Thus, SSA mortgages the house, goes broke betting with the cosmologists, and so on — just like SIA did in the version of the case where we didn’t know our location (see Grace’s “The Unpresumptuous Philosopher” for more on the parallels here).

Indeed: SSA, famously, can lead to the Doomsday Argument, which is structurally analogous to the case just given. Thus, suppose that you are considering two hypotheses: doom soon, which says that humanity will go extinct after there have been ~200 billion humans, and doom later, which says that humanity will survive and flourish long enough for ~200 trillion humans to live instead. On the basis of the available non-anthropic empirical evidence (for example, re: the level of extinction risk from nuclear war, pandemics, and so on), you start out with 10% on doom soon, and 90% on doom later. But if you use “humans” as the reference class, then you make a hard SSA update in favor of doom soon, and become virtually certain of it (including mortgaging the house, betting with the scientists, etc) — since in a doom soon world, you are a much larger fraction of the reference class as a whole. (The usual doomsday argument appeals to your “birth rank,” but I don’t actually think this is necessary: it’s just about the size of the reference classes). Whether this argument actually goes through in the real world, even conditional on SSA, is a much further question (it depends, in particular, on what reference class we use, and what other hypotheses are in play). But the bare possibility of making such an argument, on SSA, suggests that un-presumptuousness isn’t exactly SSA’s strong suit, either.
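
To make the size of this update concrete, here is a small sketch of the SSA arithmetic using the numbers above. It assumes the reference class is "humans" and that exactly one member of that class is in your epistemic situation in each world, so the SSA likelihood of finding yourself as you is one over the total number of humans. The variable names are illustrative.

```python
from fractions import Fraction

# Non-anthropic prior from the example above.
prior = {"doom_soon": Fraction(1, 10), "doom_later": Fraction(9, 10)}
# Total humans who ever live under each hypothesis.
total_humans = {"doom_soon": 200 * 10**9, "doom_later": 200 * 10**12}

# SSA: the chance of being this particular human is 1 / (reference class size).
weights = {h: prior[h] * Fraction(1, total_humans[h]) for h in prior}
total = sum(weights.values())
ssa_posterior = {h: w / total for h, w in weights.items()}

print(f"SSA credence on doom soon: {float(ssa_posterior['doom_soon']):.4f}")
```

With these particular numbers the factor-of-a-thousand difference in population swamps the nine-to-one prior, leaving doom soon at 1000/1009, or roughly 99%.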

Is SIA’s version of presumptuousness somehow worse? I don’t see much reason to think so in principle. The core counter-intuitive thing, after all, was imagining philosophers (those sorry bastards — is it even a real field?) making extravagant bets with the sober cosmologists, on the basis of whatever the heck anthropics is supposed to be about. Surely, the intuition apparently goes, anthropics can’t, as it were, be a thing. It can’t, as it were, actually tell you stuff. Or at least not, you know, extreme stuff. Sure, maybe it can push you to be a third on heads, if God’s coin toss with one vs. two people; or, like, two-thirds, if you’re SSA and you know stuff about your jacket. But surely you can’t just apply the same reasoning to a more extreme version of the case. That would mean you could make your credence on heads be whatever, just by adding more people/changing the jackets. That would be unfair. That could cause extreme credences on things, and endorsing epistemic principles that imply extreme credences in some cases is not, apparently, the philosopher’s role — especially not in cases that are science-flavored, as opposed to people-in-boxes-with-jackets-flavored.

I’m writing somewhat in jest, here (though I do think that Bostrom’s treatment of Sleeping Beauty suggests this kind of aversion towards extreme credences). But clearly, there is indeed something important to the intuition that the way to do cosmology, in either case, is to bet on the cosmologists, not against them. And to the extent that SIA or SSA would lead to actively bad empirical predictions, this seems like weighty (~decisive?) counter-evidence, whatever the theoretical virtues at stake. 

For now, though, I’m left feeling like SIA and SSA are both presumptuous, including in cases that the scientists have opinions about. So presumptuousness, on its own, doesn’t seem like a good desideratum.

V. Can you move boulders with your mind?

However: I also think that SSA’s brand of presumptuousness is worse. In particular, (a) it involves betting not just against the scientists, but against the objective chances, and (b) it implies that a kind of telekinesis is possible. (As above, I’m going to assume in these cases that we’re using a reference class that includes the relevant large group of observers.)

Let’s start with (a). Consider the following variant on Light+Red above, from Bostrom:

The red-jacketed high-roller: You wake up in a room with a red jacket. God appears before you. He says: “I created one person with a red jacket: you. Now, if this fair coin comes up heads, I won’t create any more people. If it comes up tails, I’ll create a million people with blue jackets.” What should your credence be that the coin will land heads?

One might think: 50% — what with the fair coin thing, the not-having-been-tossed thing, and so on (if we want, for good measure, we can also make it a quantum coin — see Quantum Joe). But SSA is close to certain that the coin will land heads: after all, if it lands tails, then you would be a tiny fraction of the reference class, and would’ve been overwhelmingly likely to be a blue jacketed, post-coin-flip person instead. Thus, in effect, SSA treats your existence pre coin-flip, with a red jacket, as an Oracle-like prediction that the coin will land heads. And presumably (though betting in anthropics-ish cases can get complicated — see part 3), it bets, mortgages the house, etc accordingly.

The philosophy literature has a lot to say about when, exactly, one’s credences should align with the objective chances, and I’m not going to try to unravel the issue here. What’s more, one can imagine arguing that SIA, too, is weird about fair coins: after all, SIA is highly confident on tails, in Darkness above (though SIA’s response here is: that’s because I learned something from the fact that I exist). For now, though, I’ll just note that SSA’s verdict here seems like a really bad result to me — and in particular, a worse result than the Presumptuous Philosopher above. The Presumptuous Philosopher, to me, reads as a “can anthropics actually provide strong evidence about stuff?” type of case. Whereas the red-jacketed high-roller reads as more of a “can anthropics tell you that a fair coin, not yet flipped, is almost certain to land heads?” type of case. The latter is a species of the former, but it seems to me substantially more problematic.

But SSA’s implications get worse. Consider:

Save the puppy: You wake up. In front of you is a puppy. Next to you is a button that says “create a trillion more people.” No one else exists. A giant boulder is rolling inexorably towards the puppy. It’s almost certainly going to crush the puppy. You have to save the puppy. But you can’t reach it. How can you save it? You remember: you’re SSA. You hold in your hands the awesome power of reference classes. You make a firm commitment: if the boulder doesn’t swerve away from the puppy, you will press the button; otherwise, you won’t. Should you now expect the boulder to swerve, and the puppy to live?

(See Bostrom’s UN++ and Lazy Adam for related examples.)

One might think this a very strange expectation. Or more specifically: one might just think, point blank, that this type of move won’t work. You can’t move that boulder with your mind. That puppy is dead meat. But SSA expects the puppy to live. After all, if the puppy dies, then there will be a trillion extra people, and you would’ve been a tiny fraction of the reference class.
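
Here is a rough sketch of how the commitment drives SSA's expectation. The prior chance of a swerve is not specified in the case, so the one-in-a-million figure below is purely an assumption for illustration, as are the variable names. The reference class is taken to be "people," with you as its only member in your epistemic situation.

```python
from fractions import Fraction

# ASSUMPTION: an illustrative non-anthropic chance that the boulder swerves.
p_swerve = Fraction(1, 10**6)
prior = {"swerves": p_swerve, "crushes": 1 - p_swerve}

# Given your commitment, the button is pressed only if the boulder does not swerve.
extra_people = 10**12
population = {"swerves": 1, "crushes": 1 + extra_people}

# SSA: weight each outcome by your share of the reference class in that world.
weights = {o: prior[o] * Fraction(1, population[o]) for o in prior}
total = sum(weights.values())
ssa_posterior = {o: w / total for o, w in weights.items()}

print(f"SSA credence that the puppy lives: {float(ssa_posterior['swerves']):.6f}")
```

On these assumed numbers, the commitment alone moves SSA from one-in-a-million to near-certainty that the boulder will swerve, before anything has happened to the boulder.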

(Note that if we include puppies in the reference class, SSA also updates, upon waking up in this world, towards the puppy being an illusion — since if the puppy was real, then it was only 50% that you were you, instead of the puppy. And if we include the boulder in the reference class…)

We can also imagine versions where the boulder has either already crushed the puppy, or not, but you don’t know which, and you make a commitment to press the button if you learn that the puppy is dead. This version combines the “telekinesis” vibe with a “backwards causation” vibe. That said, this blog isn’t known for its strong stance against backwards causation vibes, so I’ll focus on the forwards version.

Even in the forwards version, though, I can imagine protest: “Joe, I thought you were into zany acausal control stuff. Isn’t this the same?” I don’t think it’s the same. In particular, I’m into acausal control when it works. If a roughly infallible predictor put a force-field around the puppy if and only if they predicted you were going to press the button, then by all means, press. My objection here is that I don’t think SSA’s move in this case is the type of thing that works. Or at least, I think that positing that it works is substantially more presumptuous than positing that anthropics can provide strong evidence about cosmology in general.

VI. Does SIA imply telekinesis, too?

Now we might start to wonder, though: does SIA imply telekinesis, too? After all, SIA likes worlds with lots of people in your epistemic situation. Can’t we use a button that makes lots of those people in particular to manipulate the world, telekinesis-style?

Sort of, but SIA’s version of this, in my opinion, is less bad than SSA’s version. Consider:

Save the puppy as SIA: The boulder is rolling towards the puppy. You set up a machine that will make a trillion copies of you-in-a-sealed-white-room if and only if the boulder swerves. Having set up the machine, you prepare to enter a sealed white room. Should you expect the boulder to swerve, and the puppy to live?

Here, SIA still answers no. To see why, though, it helps to make a move that I expect basically all good anthropic theories will need to make — namely, to treat “you,” for epistemic purposes, as a person-moment, rather than a person-over-time. After all, anthropics is about reasoning about the probabilistic relationship between objective worlds and centered-worlds, and centered-worlds can pick out both an agent and a time (hence, a person-moment) within an objective world. So a fully general theory needs to handle person-moments, too.

Thus, for example, the classic case of Sleeping Beauty is basically just a reformulation of God’s coin toss-type cases, but with person-moments instead:

Sleeping Beauty: Beauty goes to sleep on Sunday night. After she goes to sleep, a fair coin is flipped. If heads, she is woken up once, on Monday. If tails, she is woken up twice: first on Monday, then on Tuesday. However, if tails, Beauty’s memories are altered on Monday night, such that her awakening on Tuesday is subjectively indistinguishable from her awakening on Monday. When Beauty wakes up, what should her credence be that the coin landed heads?

Here, SIA answers 1/3rd, using reasoning that treats you as a person-moment instead of a person-over-time: after all, there are twice as many person-moments-in-your-epistemic-situation given tails than heads. And Bostrom, too, offers and accepts a reformulation of SSA that appeals to person-moments instead of people-over-time (he calls this reformulation the “Strong Self-Sampling Assumption,” or “SSSA”). I’ll generally ignore the differences between person-moments and persons-over-time versions of SIA and SSA, but my background assumption is that person-moments are more fundamental; and bringing this out explicitly can be helpful when thinking about cases like Save the puppy as SIA.

In particular: it’s true, in Save the puppy as SIA, that on SIA, once you’re in a sealed white room, you should expect the boulder to have swerved. After all, there are many more person-moments-in-your-epistemic-situation in “the boulder swerved” worlds than otherwise. But this doesn’t mean that prior to entering the sealed white room, you should expect swerving. Rather, you should expect the boulder to behave normally.
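
The contrast between the two person-moments can be sketched numerically. As before, the one-in-a-million prior on a swerve is an assumption made only for illustration, and the names are hypothetical. SIA is encoded as weighting each world by the number of person-moments in your current epistemic situation.

```python
from fractions import Fraction

# ASSUMPTION: an illustrative non-anthropic chance that the boulder swerves.
p_swerve = Fraction(1, 10**6)
prior = {"swerves": p_swerve, "crushes": 1 - p_swerve}
copies = 10**12


def sia(prior, moments_in_situation):
    """Weight each world by the count of person-moments in your situation."""
    weights = {w: prior[w] * moments_in_situation[w] for w in prior}
    total = sum(weights.values())
    return {w: x / total for w, x in weights.items()}


# Before entering the room: one person-moment like you in either world.
before_room = sia(prior, {"swerves": 1, "crushes": 1})
# Inside a sealed white room: the original plus the copies, if the boulder swerved.
in_room = sia(prior, {"swerves": 1 + copies, "crushes": 1})

print(f"Before the room: {float(before_room['swerves']):.6f}")
print(f"Inside the room: {float(in_room['swerves']):.6f}")
```

The pre-room person-moment keeps the ordinary prior, while the in-room person-moment ends up nearly certain of a swerve, which is the asymmetry the text describes.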

The dynamic, here, is precisely analogous to the way in which, on Sunday, SIA says that Beauty should be 1/2 on heads; but once she wakes up, she should change to 1/3rd. This change can seem counterintuitive, since it can seem like she didn’t gain any new information. But that’s precisely the intuition that SIA denies. On SIA, when she wakes up, she shouldn’t think of herself as Beauty-the-agent-over-time, who was guaranteed to wake up regardless. Rather, she should think of herself as a particular person-moment-in-this-epistemic-situation — a moment that might or might not have existed, and which is more likely to have existed conditional on tails. We can debate whether this is a reasonable way to think, but it’s a core SIA thing.

And note, too, that on Wednesday, after the whole experiment is over, Beauty should be back at 50% on Heads, just like she was on Sunday. This is because there aren’t any extra person-moments-in-a-Wednesday-like-epistemic-situation conditional on heads vs. tails. This means that you can’t use the number of awakenings to e.g. cause Beauty, on Wednesday, to expect to have won the lottery, just by waking her up a zillion times on Monday and Tuesday if she does. And the same holds for Save the Puppy as SIA. Yes, you can get the people-in-the-sealed-white-rooms to expect the boulder to have swerved. But if, before letting any of them leave, you kill off all of them except one, or merge them into one person; or if you make them into Beauty-style awakenings instead of separate people; then the person who leaves the room and re-emerges into the harsh sunlight of this awful thought experiment should expect to see the puppy dead. (This, in my opinion, is also the thing to say about Yudkowsky’s “Anthropic Trilemma.”)
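
The Sunday, awakening, and Wednesday credences can be laid out in a short sketch. It treats SIA as weighting each coin outcome by the number of person-moments in Beauty's current epistemic situation; the stage labels and counts below simply restate the case as described, and the names are illustrative.

```python
from fractions import Fraction

prior = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}

# Person-moments in Beauty's epistemic situation at each stage, per outcome.
stages = {
    "Sunday": {"heads": 1, "tails": 1},
    "Awakening": {"heads": 1, "tails": 2},  # Monday only vs. Monday and Tuesday
    "Wednesday": {"heads": 1, "tails": 1},
}


def sia_heads(prior, moments):
    """SIA credence on heads given counts of matching person-moments."""
    weights = {w: prior[w] * moments[w] for w in prior}
    return weights["heads"] / sum(weights.values())


credences = {stage: sia_heads(prior, moments) for stage, moments in stages.items()}

for stage, credence in credences.items():
    print(f"{stage}: credence on heads = {credence}")
```

Because the Wednesday counts are equal across outcomes, adding more awakenings on tails changes only the middle row and leaves the Wednesday credence at one half.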

That said, it’s true that, if you don’t do any killing/merging etc, and instead let everyone out of their rooms no matter what, then you and all your copies will expect to find the puppy alive. And thus, from the perspective of the person-moment who hasn’t yet gone into the room, it’s predictable in advance that the guy in the room (your next person-moment) is going to become extremely confident that something that isn’t going to happen (e.g., the swerve) has happened; and when he (or more specifically, his next person-moment) emerges into the daylight, he’s in for a grisly surprise. On SIA, the reason for this mistake is just that this person-moment-in-the-room has in fact found itself in an extremely unlikely situation — namely, the situation of having been created, despite so few person-moments-in-this-situation getting created. In this sense, your future person-moment-in-a-room is like the number 672, who finds itself having been pulled from a bucket of 1-1000 — and who therefore updates, wrongly but reasonably, towards worlds where there were lots of pulls (and hence more chances to pull 672). In worlds with only one pull, one sorry sucker has to make this type of mistake.

Shouldn’t SIA be able to guard against this type of mistake, though? For example, shouldn’t you be able to send a message to your likely future self: “dude, don’t believe this SIA bullsh**: the puppy is dead.” Well, whether you want to send a message like that, and force your future self to believe it, depends on who you are counting as your future self — or more specifically, whose beliefs you care about making accurate. In particular, if you only care about accuracy of the original Joe — e.g., the original series of person-moments — rather than the copies, then it’s true that you want to force a “puppy is dead” belief, because the original Joe ends up almost exclusively in “puppy is dead worlds.” But this move has a side effect: it makes a trillion copies (or whatever) of you (plus the original) wrong, in some much-more-than-one-in-a-trillion number of cases. Thus, if you care about the copies, too, you can’t just go writing notes like that willy-nilly. You’ve got broader epistemic responsibilities. Indeed, most of your “epistemic influence,” if we weight by both probability and number of minds-influenced, is funneled towards the “puppy is alive” worlds. That said, once we’re bringing questions about who you care about, and what sorts of pre-commitments (epistemic and otherwise) you want to make, we’re getting into pretty gnarly territory, which I won’t try to disentangle here (see part 3 for a bit more discussion).

For now, I’m happy to acknowledge that SIA isn’t sitting entirely pretty with this sort of case. But I think SSA is sitting uglier. In particular, SSA actively expects this sort of “use the button to save the puppy” thing to work. It will pay to get access to this sort of button; it will start calling in the “puppy saved!” parade even before it enters any kind of sealed-white-room. From SIA’s perspective, by contrast, this sort of button-maneuver, and these sorts of sealed-white-rooms, are much less appealing. Exactly what type of not-appealing depends on factors like whether SIA cares about Joe-copies, but in general, even if in some cases SIA ends up expecting telekinesis to have worked, it will generally avoid, or at least not seek out, cases where it ends up with this belief. SSA, by contrast, believes in telekinesis ahead of time, and goes around looking to use it.

Overall, then, my current view is that (a) SSA is ~as cosmologically presumptuous as SIA, but that (b) SSA endorses wackier stuff, in other cases, in a worse way. On their own, then, I’d be inclined to view the cases thus far as favoring SIA overall. But there’s also more to say.

VII. Against reference classes

Let’s talk about reference classes. In particular, why they’re bad (this section), and why using them to try to get out of the cases above is an un-Bayesian epicycle that doesn’t work anyway (next section).

Why are reference classes bad? Well, for one thing, what even are they? What is the story about reference classes, such that they are a thing — and not just any old thing, but one sufficiently important as to warrant massive updates as to what sorts of world you’re likely living in? SSA’s toy story, as I’ve told it, is that the reference class is the set of beings in a given world such that God, dead set on creating you somehow (according to SSA), randomly “makes you one of them.” But then, of course, SSA doesn’t actually believe this in any literal sense. But what does SSA actually believe? What “way” does the world have to be, in order for SSA’s reference class reasoning to make sense? What could even make it the case that the “true” reference class is one thing vs. another?

I have yet to hear such questions answered. To the contrary, as far as I can tell, for Bostrom the notion of reference class is centrally justified via its utility in getting the answers he wants from various anthropics cases. Indeed, as I’ll discuss in the next section, Bostrom demonstrates a lot of willingness to contort the reference class — sometimes, in my opinion, unsuccessfully — in pursuit of those answers. But we are left with very little sense of what constraints — if any — such contortions need, in principle, to obey.

In the absence of any such underlying metaphysical picture — or indeed, any non-mysterious characterization of reference classes more broadly — one could be forgiven for wondering whether the reference class could, as it were, be anything. Perhaps my reference class consists entirely of Joe, Winston Churchill, the set of 47 pigs that acted in the 1995 comedy-drama Babe (“‘There was,’ Miller admits reluctantly, ‘one animatronic pig’”), five bug-eyed aliens 10^100 light-years away, and a King of France who never existed. When God created this world, he made “me” one of these creatures at random (the relevant King of France happened to not be present in this world). Probably, I was going to be a pig. (In fact, given that I’m Joe, this is evidence that Babe actually involved fewer than 47 pigs…).

What rules out this sort of picture? The natural answer is: its flagrant arbitrariness. But is there some non-arbitrary alternative? We discussed one candidate above: the minimal reference class consisting entirely of “people in your epistemic situation.” We saw, though, that this doesn’t work: it gives the wrong answers in “God’s coin-toss with equal numbers” type cases, and it violates conditionalization to boot.

If we jettison the minimal reference class, the natural next alternative would be something like the “maximal” reference class, which I think of as the reference class consisting of all observer-moments. Bostrom, though, rejects this option, because he wants to use various limitations on the reference class to try to avoid various counterintuitive results, like the Doomsday Argument, The Red-Jacketed High Roller, Save the Puppy, and so on. I’ll say more about why this doesn’t work below. Indeed, my current take is that if you’re going to go for SSA, you should go for the maximal reference class. This is partly because I don’t think Bostrom’s rejection of it gets him what he wants, but centrally because it feels much less arbitrary than something in between minimal and maximal.

Even for the maximal reference class, though, worries about arbitrariness loom. There are, of course, questions about what counts as an observer-moment, especially if you’re not a deep realist about “observers” (though SIA has somewhat related problems about “counting people-in-your-epistemic-situation”). Beyond this, though, if we’re really trying to be maximal, we might wonder: why stop with observer-like things? Why not, for example, throw in some unconscious/inanimate things too? Sure, I know that I’m an observer-like thing. But the whole point of reference classes is to include things I know I’m not. So why not include rocks, galaxies, electrons? Why not the composite object consisting of the moon and my nose? Why not, for that matter, abstract objects, like the natural numbers? Viewed in this light, “things” seems a more maximal reference class than “observer moments” (and perhaps “things” is itself less-than-fully maximal; do the things have to “exist”? Can merely possible things count? What about impossible things?). And if “observer-moments” turns out to be less-than-fully maximal, it loses some of its non-arbitrariness appeal (though perhaps there’s some way of salvaging this appeal — I do think that “observer-moments” is intuitively a more natural reference class than “things.” Maybe we say something about: the “things” you don’t rule out once you realize that you exist and are asking questions? But why that?).

Suppose that following Bostrom, we reject both the minimal and the maximal reference class. Is there anywhere non-arbitrary we could land in between? One option would be to appeal, with some philosophers, to some notion of metaphysical “essence.” Thus, we might say, you couldn’t have been a pig, or an alien, or an electron; perhaps, even, you couldn’t have been someone with different genes. And if you couldn’t have been something, then perhaps God couldn’t have randomly made you that type of thing, either. Indeed, my sense is that sometimes, the notion of “reference classes” is construed in some vaguely-reminiscent-of-metaphysical-essences kind of way (e.g., “but you couldn’t have been an electron; you’re an observer!”), even absent any kind of explicit account of the concept at stake.

But do we really want to bring in stuff about metaphysical essences, here? Really, SSA? Bostrom, at least, seems keen to distance himself from this sort of discourse; and I am inclined to heartily agree. And once we start making cosmological predictions on the basis of whether Saul Kripke would grant that I “could’ve” been a brain emulation, one starts to wonder even more about presumptuousness.

Are there other non-arbitrary reference options, between minimal and maximal? Maybe: humans? But… why? Do they need to be biological? Can they be enhanced? How much? Why or why not? Why not say: creatures in the genus homo? Why not: primates? Why not: intelligences-at-roughly-human-levels? Why not: people-with-roughly-Joe’s-values? Why not: people-with-Joe’s-DNA-in-particular? I’ve yet to hear any answers, here. Indeed, as far as I can tell, we’re basically in the land of just entirely making up whatever we want, subject to whatever constraints on e.g. simplicity, vaguely-intuitiveness, etc that we have the philosophical decency to impose on ourselves. The discourse, that is, is totally untethered. And no surprise: it never had a tether. We never knew what we were trying to talk about in the first place.

What’s more, this untethered quality has real effects on our ability to actually use SSA to say useful or determinate things. We started to get a flavor of this in the discussion above, when we found it necessary to preface different cases with provisos about who is or isn’t in the reference class — e.g., “I’m assuming, here, that God/the puppy/the boulder isn’t part of the reference class, but that the people on the other planets/with the blue jackets/in the doom later world are.” And it becomes even clearer in cases like God’s coin toss with chimpanzees, in which your credence hinges crucially on whether you count chimps in the jungle as in the reference class or not. Indeed, reading over Katja Grace’s overview of her attempt to apply SIA and SSA to reasoning about the Great Filter, I was struck by the contrast between SIA’s comparatively crisp verdicts (“SIA increases expectations of larger future filter steps because it favours smaller past filter steps”), vs. SSA’s greater muddle (“SSA can give a variety of results according to reference class choice. Generally it directly increases expectations of both larger future filter steps and smaller past filter steps, but only for those steps between stages of development that are at least partially included in the reference class.”).

One of Bostrom’s main responses to all of this is to appeal to a kind of “partner in guilt” with the Bayesian’s “prior.” That is, Bostrom acknowledges that even though we can put some constraints on what sorts of reference classes are reasonable, at the end of the day rational people might just disagree about what reference classes to use. But this is plausibly the case with Bayesian priors, too; and still, we can get to agreement about various types of conclusions, because in cases of strong evidence, a wide variety of reasonable priors will converge on similar conclusions. Perhaps, then, we might hope for something similar from anthropics: e.g., some verdicts (e.g., our scientific observations are reliable) will be robust across most reference classes, and others (hopefully: bad ones like the Doomsday Argument, telekinesis, etc) will be less so, and so less “objective.”

I do think this response helps. In particular, I think that seeing reference classes as a mysterious subjective object like the “prior” does put them in somewhat more respectable company. And indeed, some implications of the subjectivity at stake are similar: for example, just as agents with different priors can continue to disagree after sharing all their information, so too can agents with different reference classes, but the same priors. (Which isn’t to say this is a good result; it’s not. But it establishes more kinship with the prior.) Still, though, I think we should view introducing yet another mysterious subjective object of this kind as a disadvantage to a theory — especially when we can’t really give an account of what it’s supposed to represent.

At heart, I think my true rejection of reference classes might just be that they feel janky and made-up. When I look at God’s coin toss with chimpanzees; when I find myself having to say “of course, if there are ten other people watching Sleeping Beauty’s experiment, then depending on whether they’re in the reference class, and how many person-moments they’ve had, Beauty’s credence should actually be X; but let’s bracket that for now…”; when I find myself without any sense of what I’m actually trying to talk about; I have some feeling like: Bleh. Yuck. This is silliness. Someone I know once said, of SSA, something like: “this is repugnant to good philosophical taste.” I’ve found that this characterization has stuck with me, especially with respect to reference classes in between minimal and maximal. When forced to talk about such reference classes, I feel some visceral sense of: ugh, this is terrible, let’s get out of here. SIA is sweet relief.

VIII. Against redraw-the-reference-class epicycles that don’t work anyway

There’s a particular use of reference classes that I’m especially opposed to: namely, redrawing the lines around the reference class to fit whatever conclusion you want in a given case. Here I want to look at a move Bostrom makes, in an effort to avoid cases like Save the Puppy, that has this flavor, for me. I’ll argue that this move is problematically epicyclic (and un-Bayesian); and that it doesn’t work anyway.

To see the structure of Bostrom’s move, recall:

God’s extreme coin toss with jackets: God flips a fair coin. If heads, he creates one person with a red jacket. If tails, he creates one person with a red jacket, and a million people with blue jackets.

  • Darkness: God keeps the lights in all the rooms off. You wake up in darkness and can’t see your jacket. What should your credence be on heads?
  • Light+Red: God keeps the lights in all the rooms on. You wake up and see that you have a red jacket. What should your credence be on heads?

In Darkness and Light + Red, SIA and SSA (respectively) each give extreme verdicts about the toss of a fair coin. These examples served as the templates for other putatively problematic implications of SIA (the Presumptuous Philosopher) and SSA (e.g., the Doomsday Argument, Red-Jacketed High-Roller, Save the Puppy). Bostrom hopes to avoid them both. That is, he hopes to thread some sort of weird needle, which will allow him to be 50% on heads in Darkness, and 50% on heads in Light + Red — despite the fact that Light + Red is just Darkness, plus some information that you didn’t know before (namely, that your jacket is Red). If Bostrom can succeed, he will have banished both forms of presumptuousness. Heads will always be 50%; the scientists will always be right; the puppy will always die; and the world will be safe from anthropics — at least, for now.
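For concreteness, here is a minimal sketch of the arithmetic behind the two extreme verdicts, before any reference-class redrawing. It assumes the SSA reference class is simply "all the people God creates," and the helper name `credence_heads` is my own, purely for illustration.

```python
from fractions import Fraction

def credence_heads(weight_heads, weight_tails):
    """Normalize two unnormalized world-weights into a credence on heads."""
    return weight_heads / (weight_heads + weight_tails)

PRIOR = Fraction(1, 2)   # fair coin
N_BLUE = 10**6           # blue-jacketed people created on tails

# SIA: weight each world by prior * (number of people in your epistemic situation).
sia_darkness = credence_heads(PRIOR * 1, PRIOR * (1 + N_BLUE))
sia_light_red = credence_heads(PRIOR * 1, PRIOR * 1)

# SSA (assumed reference class: everyone created): weight each world by
# prior * (fraction of the reference class in your epistemic situation).
ssa_darkness = credence_heads(PRIOR * 1, PRIOR * Fraction(1 + N_BLUE, 1 + N_BLUE))
ssa_light_red = credence_heads(PRIOR * 1, PRIOR * Fraction(1, 1 + N_BLUE))

print("SIA, Darkness: ", float(sia_darkness))    # extremely confident in tails
print("SSA, Darkness: ", float(ssa_darkness))    # 50-50
print("SIA, Light+Red:", float(sia_light_red))   # 50-50
print("SSA, Light+Red:", float(ssa_light_red))   # extremely confident in heads
```

Bostrom's hoped-for result is 1/2 in both rows, which neither plain SIA nor this plain version of SSA delivers.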

How can we reach such a happy state? As far as I can tell, the idea is: define the reference classes such that you get this result. (See Bostrom (2002), p. 167, and p. 171-2 for fairly explicit comments about this intention.) In particular: claim that your reference class changes when God turns the lights on. That is, in Darkness, your reference class is “person-moments in darkness.” But in Light + Red, your reference class is “person-moments who know they have red jackets.” That is, in both cases, your reference class consists entirely of people in your epistemic situation. Thus, as SSA, you don’t update away from the prior in either case. You start out in Darkness, at 50-50. Then, when the light comes on, rather than updating in the way standard Bayesianism would imply, you “start over” with the whole SSA game, but this time, with a new and improved reference class — a reference class that allows you to not think it was unlikely, conditional on tails, that you ended up with a red jacket. After all, on this new reference class, you “essentially” have a red jacket, and know it; you couldn’t have been someone with a blue jacket (who knows it), granted that you, in the light, have a red. Thus, on tails, your jacket color is no surprise.

Problem solved? Not in my book. The immediate objection is that this move doesn’t seem very Bayesian. Normally, we think that when you learn new information like “my jacket is red,” where this information rules out various tails-world possibilities you had credence on, but no heads-world possibilities, you do this thing where your credence on “I’m in a tails world” ends up changing. Bostrom does a dance, here, about how no, really, his model is (or at least, can be, if you want it to be) kosher Bayes after all, because you’re losing indexical information (e.g., “I’m a person-moment who doesn’t know what their jacket color is”) even as you gain new information (e.g., “my jacket is red and I know it”). I haven’t tried to engage with this dance in detail, but my current take is: I bet it doesn’t work. In particular, my suspicion is that Bostrom’s treatment is going to throw the doors wide open for person-moments to reason in very unconstrained ways even in non-anthropics-y cases (see e.g. Grace’s discussion here); and that more generally, Bostrom is going to end up treating the type of Bayesian reasoning that you should be doing in this sort of case as more different from normal reasoning than it should be.

My higher-level objection, though, is that it seems pretty clear that Bostrom is making this move specifically in order to give a certain set of answers in a certain set of otherwise problematic cases, and that he would have little interest in it otherwise. Indeed, he frames this move as in some sense “optional” — something you can, as it were, get away with, if you want to avoid both e.g. the Presumptuous Philosopher and Save the Puppy, but which you don’t, as it were, have to make. But the fact that in Bostrom’s book you don’t “have” to make this move betrays its lack of independent justification: it’s not a move you’d come up with on your own, for some other reason. If you don’t want to make it (for example, because it seems arbitrary, un-Bayesian, and so on) nothing pushes back — except, that is, the cases-you-might-not-like.

Of course, contorting your fundamental principles to curve-fit a specific (and often artificially-limited) batch of cases, with little regard for other theoretical desiderata, is the bread and butter of a certain type of philosophical methodology. But that’s not to the field’s credit. Indeed, plausibly such a methodology, for all its charms, often sends the philosophers astray — and I expect that trying to use it to say 50% in both Darkness and Light + Red will lead us astray here. At the very least, Bostrom’s version sets off a lot of alarm bells, for me, about over-fitting, epicycles, and the like. And it makes me wonder, as well, about what sorts of limits — if any — are meant to apply to how much we can redraw our reference classes, moment to moment, to suit our epistemic whims. If SSA lets us say 50% in both cases, here, what won’t it let us say? And if our theory can be made to say anything we want, how can we ever learn anything from it? The specter of the reference class’s indeterminacy looms ever larger.

My most flat-footed objection, though, is that this particular move doesn’t work by Bostrom’s own lights. Rather, it runs into problems similar to those that the minimal reference class does (my thanks to Bastian Stern for suggesting this point in conversation). To see this, consider a version of God’s coin toss with equal numbers:

God’s coin toss with equal numbers: God flips a fair coin, and creates a million people either way. If heads, he gives them all red jackets. If tails, he gives one of them a red jacket, and the rest blue jackets.

  • Equal Number Darkness: God keeps all the lights off. You wake up in darkness. What should your credence be on heads?
  • Equal Number Light + Red: God keeps all the lights on. You wake up and see that you have a red jacket. What should your credence be on heads?

Equal Number Light + Red is really similar to the original Light + Red: the only difference is the presence of an extra ~million people with red jackets, conditional on heads. However, Bostrom is committed (I think, rightly) to saying that in Equal Number Light + Red, you should be very confident that the coin landed heads. Indeed, Bostrom thinks that if you can’t say things like that, you can’t do science in big worlds.

But the reference class Bostrom wants to use in the original, non-equal-number Light + Red doesn’t allow him this confidence in the equal-number version. That is, in Light + Red, Bostrom wants to use the reference class “person-moments who know they have red jackets” — that’s why he can stay at 50-50, despite all those know-they-have-blue-jackets people in the tails world. But this means that SSA stays at 50-50 in Equal Number Light + Red, too: after all, in both cases, people in your epistemic situation are 100% of the reference class. But this is a verdict Bostrom explicitly doesn’t want.
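A quick sketch of the tension, assuming a fair coin and treating people rather than person-moments as the unit (a simplification on my part; `ssa_credence_heads` is a hypothetical helper, not anything from Bostrom):

```python
from fractions import Fraction

def ssa_credence_heads(frac_heads, frac_tails, prior_heads=Fraction(1, 2)):
    """SSA posterior on heads, given the fraction of the reference class
    that shares your evidence in the heads world and in the tails world."""
    w_heads = prior_heads * frac_heads
    w_tails = (1 - prior_heads) * frac_tails
    return w_heads / (w_heads + w_tails)

N = 10**6  # people created either way in the equal-numbers case

# Wide reference class ("all people"): red-jacketed people are everyone on
# heads, but only one in a million on tails.
wide = ssa_credence_heads(Fraction(N, N), Fraction(1, N))

# Narrow reference class ("people who know they have red jackets"): people
# with your evidence are 100% of the class in both worlds.
narrow = ssa_credence_heads(Fraction(1), Fraction(1))

print("Equal Number Light + Red, wide class:  ", float(wide))    # near 1
print("Equal Number Light + Red, narrow class:", float(narrow))  # 0.5
```

The narrow class that keeps SSA at 50-50 in the original Light + Red is the same one that blocks the confident verdict here.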

Indeed, I feel confused by Bostrom’s treatment of this issue. After introducing his treatment of the original Light + Red on p. 165 of the book, he goes on, 13 pages later, to discuss why the minimal reference class fails in cases like Equal Number Light + Red, and to suggest that in Equal Number Light + Red, the proper reference class to use is wider than “person-moments who know that they have red jackets” (in particular, he discusses the reference class “all person-moments”). But surely Bostrom doesn’t mean to suggest that we should use “person-moments who know that they have red jackets” in Light + Red, but something wider in Equal Number Light + Red. The cases are basically the same! The only difference is the extra red-jacketed people in heads! Using different reference classes in the two cases would be just … too much. At that point, we should just throw in the towel. We should just say: “the reference class is whatever the heck I need to say it is in order to have the credence I want, which in this particular case is, let me check … 50%.”

To be clear, I don’t actually think that Bostrom would endorse using different reference classes in these two cases. But as far as I can tell, his discussion in the book implies this, and makes it necessary. Maybe I’ve misunderstood something, or missed some other discussion of the issue elsewhere?

Moving beyond Bostrom in particular: my suspicion is that something in the vicinity of these objections is going to apply, in general, to attempts to contort the reference classes to avoid SSA’s problematic implications in cases like Save the Puppy (especially to avoid them in principle, as opposed to in some particularly putatively real-world application). Thus, to avoid telekinesis in Save the Puppy, my sense is that you’ll have to do something un-Bayesian (e.g., not update when you learn that you are the single, pre-boulder squishing/swerving person, rather than one of the possible people created by the button in the no-swerve worlds), epicyclic (it’s going to seem like: what? why?), and in tension with what one would want to say in a nearby, equal-numbers version (though maybe it’s harder to find equal-numbers versions of Save the Puppy? I haven’t thought about it much.)

IX. Is killing epistemically different from non-creation?

I’ll mention one other category of abstract argument for SIA over SSA, which I find quite compelling. Consider two cases:

Coin toss + killing: God tosses a fair coin. Either way, he creates ten people in darkness, and gives one of them a red jacket, and the rest blue. Then he waits an hour. If heads, he then kills all the blue jacketed people. If tails, he kills the red jacketed person. After the killing in either case, he rings a bell to let everyone know that it’s over. You wake up in darkness, sit around for an hour, then hear the bell. What should your credence be that your jacket is red, and hence that the coin landed heads?

Coin toss + non-creation: God tosses a fair coin. If heads, he creates one person with a red jacket. If tails, he creates nine people with blue jackets. You wake up in darkness. What should your credence be that your jacket is red, and hence that the coin landed heads?

(This is a condensed version of an argument from Stuart Armstrong; see also a closely-related version in Dorr (2002), and a related series of cases in Arntzenius (2003)).

Here, SIA gives the same answer in each case: 10%. After all, there are many more people in your epistemic situation in tails worlds.

SSA, by contrast, gives different answers in each case (or at least, it does if you don’t try any of Bostrom’s reference-class shenanigans above). Thus, in Coin toss + non-creation, it gives its standard 50% answer: you were (SSA thinks) guaranteed to exist either way. But in Coin toss + killing, it goes all SIA-ish. In particular, when it first wakes up, but it hasn’t yet heard or not heard the bell, it updates against having a red jacket, to 10%: after all, it’s an equal-numbers case, and most people have blue jackets. Then, because the chance of death is 50% conditional on either having a blue jacket, or a red jacket, it stays at 10% after hearing the bell: survival is no update.
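The numbers in the last few paragraphs can be checked with a short sketch. It assumes the SSA reference class is "everyone God creates" and that no reference-class redrawing is going on; the variable names are mine.

```python
from fractions import Fraction

PRIOR_HEADS = Fraction(1, 2)  # fair coin

# --- Coin toss + non-creation: 1 red-jacketed person on heads, 9 blue on tails.
people_heads, people_tails = 1, 9

# SIA weights each world by prior * number of people in your epistemic situation.
sia_w_heads = PRIOR_HEADS * people_heads
sia_w_tails = (1 - PRIOR_HEADS) * people_tails
sia_non_creation = sia_w_heads / (sia_w_heads + sia_w_tails)

# SSA: people in your situation are 100% of the reference class in both worlds.
ssa_w_heads = PRIOR_HEADS * 1
ssa_w_tails = (1 - PRIOR_HEADS) * 1
ssa_non_creation = ssa_w_heads / (ssa_w_heads + ssa_w_tails)

# --- Coin toss + killing: ten people either way, one red and nine blue.
p_red_on_waking = Fraction(1, 10)
# Each jacket color is spared on exactly one side of the fair coin, so the
# chance of surviving to hear the bell is 1/2 whatever your color.
p_survive_given_red = Fraction(1, 2)
p_survive_given_blue = Fraction(1, 2)
red_and_alive = p_red_on_waking * p_survive_given_red
blue_and_alive = (1 - p_red_on_waking) * p_survive_given_blue
ssa_killing = red_and_alive / (red_and_alive + blue_and_alive)

print("SIA, non-creation:", sia_non_creation)  # 1/10
print("SSA, non-creation:", ssa_non_creation)  # 1/2
print("SSA, killing:     ", ssa_killing)       # 1/10
```

SIA runs the same ordinary update in the killing case, so it lands on 1/10 there too; only SSA's answer moves between the two set-ups.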

But are these cases actually importantly different? Armstrong (at least, circa 2009; he’s since changed his view, for decision-theory reasons) doesn’t think so, and I’m inclined to agree. And note that we can construct a kind of “spectrum” of cases leading from the first case to the second, where it seems quite unclear what would constitute an epistemically-relevant dividing line (see Armstrong’s post for more).

Dorr makes a similar argument in Sleeping Beauty. Consider a version where Beauty is woken up on both Monday and Tuesday conditional on both heads and tails, but then, if it’s heads and Tuesday, she hears a bell after an hour or so. Surely, argues Dorr, Beauty ought to be 50-50 on heads vs. tails prior to hearing-the-bell-or-not, and 25% on each of Heads-Monday, Heads-Tuesday, Tails-Monday, and Tails-Tuesday. Then, after she doesn’t hear the bell, surely she should cross off “Heads-and-Tuesday,” re-normalize, and end up at 1/3rd on heads like a reasonable SIA-er. And indeed, this is what SSA does do (unless, of course, we futz with the reference classes), if Beauty is also woken up in “Heads-and-Tuesday” and can hear this type of bell. But if Beauty isn’t woken up in Heads-and-Tuesday at all, then suddenly SSA is back to halfing. Does this difference matter? It really seems like: no.
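Dorr's cross-off-and-renormalize step is just ordinary conditioning, sketched below under the stated assumption of 25% on each cell before the bell.

```python
from fractions import Fraction

# Credences before hearing-the-bell-or-not: 25% on each (coin, day) cell.
cells = {
    ("heads", "Monday"): Fraction(1, 4),
    ("heads", "Tuesday"): Fraction(1, 4),
    ("tails", "Monday"): Fraction(1, 4),
    ("tails", "Tuesday"): Fraction(1, 4),
}

# The bell rings only on heads-and-Tuesday, so not hearing it rules that cell out.
remaining = {cell: p for cell, p in cells.items() if cell != ("heads", "Tuesday")}

# Re-normalize what is left.
total = sum(remaining.values())
posterior = {cell: p / total for cell, p in remaining.items()}

p_heads = sum(p for (coin, _day), p in posterior.items() if coin == "heads")
print("Credence on heads after no bell:", p_heads)  # 1/3
```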

What we’re seeing in these cases is basically SSA’s “sensitivity to outsiders,” made especially vivid and counter-intuitive. That is, SSA cares a lot about the existence (or non-existence) of people/person-moments you know that you’re not: for example, person-moments who just got killed by God (even though you’re alive), or who heard a bell you didn’t hear, or who are living as chimpanzees in the jungle while you, a human, participate in funky thought experiments. At bottom, this is because if such people exist (and are in the reference class), their existence makes it less likely that you live in their world, because such a world makes it less likely that you’d be you, and not them. That said, I’ve griped about reference classes quite a bit already, and I’m not actually sure that the “what’s up with the relevance of these outsiders?” objection actually adds much to the “what’s up with reference classes in general?” objection (though it definitely prompts in me some sense of: “man, this is janky”).

Indeed, perhaps for some SSA-ers, who hoped to say SIA-like things about various cases, outsiders come as some comfort. This is because (if you use your reference classes right), outsiders can push SSA towards more SIA-like verdicts. Consider, for example, a version of God’s coin toss where if heads, he creates one person in a white room, and if tails, two people in white rooms; but where there are also a million chimps in the jungle either way (and the chimps are in the reference class). In such a case, SSA can actually get pretty close to 1/3-ing: if heads, you had a 1/~1M chance of being in a white room rather than the jungle, and if tails, you had a 2/~1M chance of this, so finding yourself existing in a white room is actually a ~2:1 update in favor of tails. SSA-ers might try to use similar “appeals to outsiders” to try to avoid saying bad things about the doomsday argument. Thus, if there are (finite) tons of observers and they’re all in the reference class, the difference between doom soon and doom later does less to the fraction of people-in-your-reference-class you are.
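To see how the chimps drag SSA towards thirding, here is a rough sketch. It assumes a fair coin and that every chimp counts as one member of the reference class; the function name is illustrative only.

```python
from fractions import Fraction

def ssa_credence_tails(n_chimps):
    """SSA credence on tails after finding yourself in a white room, with
    n_chimps outsiders in the reference class in both worlds."""
    frac_heads = Fraction(1, 1 + n_chimps)  # one white-room person on heads
    frac_tails = Fraction(2, 2 + n_chimps)  # two white-room people on tails
    # The fair-coin prior is the same on both sides, so it cancels.
    return frac_tails / (frac_heads + frac_tails)

no_chimps = ssa_credence_tails(0)
million_chimps = ssa_credence_tails(10**6)

print("No chimps:       ", float(no_chimps))       # 0.5, SSA's usual halfing
print("A million chimps:", float(million_chimps))  # just under 2/3
```

The more outsiders there are, the closer the likelihood ratio gets to the 2:1 that SIA uses from the start.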

I think moves like this might well help to alleviate some of SSA’s bad results in real-world cases (though we’d have to actually work out the details, and no surprise if they get gnarly). But note that they can also be used to give SIA’s counter-examples to SSA. Thus, in the Presumptuous Philosopher, if we add a sufficiently large number of extra observers who we know that we aren’t to T1 and T2, then suddenly the fact that T2 has a trillion times more people-in-our-epistemic-situation makes it the case that in T2, you’re a ~trillion times larger fraction of the reference class. So SSA, too, starts mortgaging the house to bet with the scientists.
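The same mechanism, run on the Presumptuous Philosopher, can be sketched as follows. The base population of 1 and the outsider count of 10**30 are placeholder numbers I chose for illustration; only the trillion-to-one ratio between T2 and T1 comes from the case itself.

```python
from fractions import Fraction

def ssa_likelihood_ratio(n_t1, n_t2, outsiders):
    """How strongly SSA favors T2 over T1: the ratio of the fractions of the
    reference class made up of people in your epistemic situation."""
    frac_t1 = Fraction(n_t1, n_t1 + outsiders)
    frac_t2 = Fraction(n_t2, n_t2 + outsiders)
    return frac_t2 / frac_t1

BASE = 1             # hypothetical count of people like you in T1
TRILLION = 10**12    # T2 has a trillion times more of them

no_outsiders = ssa_likelihood_ratio(BASE, BASE * TRILLION, outsiders=0)
many_outsiders = ssa_likelihood_ratio(BASE, BASE * TRILLION, outsiders=10**30)

print("No outsiders:  ", float(no_outsiders))    # 1.0, no update
print("Many outsiders:", float(many_outsiders))  # approaches a trillion
```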

Beyond this, though, solutions to SSA’s problems that involve futzing with the number of outsiders (or hoping for the right number) feel pretty hacky to me, and not in the original spirit of the view. And regardless, SSA’s bad results in cleaner, more thought-experimental cases (e.g., Save the Puppy) will persist.

X. SSA’s metaphysical mistake

I’ve given a lot of specific counter-examples and counter-arguments to SSA. But I also want to talk about where it feels, at least from SIA’s perspective, like SSA goes wrong from the get-go: namely, it assumes that you exist no matter what, in any world epistemically-compatible with your existence. This is a core shtick for SSA. It’s what allows SSA to not update on the fact that you exist. But at least when viewed in a certain light, it doesn’t really make sense (perhaps other light is more flattering).

To see what I mean, return to a basic version of God’s coin toss, where God creates one person if heads, and a million if tails, all in white rooms. Suppose that the coin has in fact landed tails. You are Bob, one of the million people God has created, and you’re wondering whether the coin landed heads, or tails. As a good SSA-er, you basically reason: “well, I exist. So if it landed heads, I’d exist; and if it landed tails, I’d exist. So: 50-50.” But now consider Alice, in the next room over. She, too, is an SSA-er. So she, too, reasons the same.

But notice: Bob and Alice can’t both be right. In particular, Bob is treating the heads world like it would necessarily create him; and Alice is doing the same; but there ain’t room enough in the heads world for both (thanks to Katja Grace for suggesting this framing). And indeed, we can specify that, had the coin landed heads, the person who would’ve been created is in fact not Bob or Alice but rather Cheryl of all people. And is that so surprising? What mechanism was supposed to guarantee that it would be Bob, or Alice, or any other particular anthropic-reasoner? The mere fact that Bob and Alice found themselves existing in the actual world, and thus were able to wonder about the question? But why would that matter?

This isn’t necessarily a tight argument (indeed, I discuss some possible replies below). But I’m trying to point at some kind of “why reason like that?” energy I can get into in relation to SSA. Maybe this example can make it vivid.

My sense is that for Bostrom, at least, the story here is supposed to hinge centrally on the fact that if you hadn’t existed, you wouldn’t be around to observe your non-existence (see, e.g., his discussion on p. 125). But why, exactly, would this fact license assuming, granted that you do exist, that you would’ve existed no matter what (at least in worlds you can’t currently rule out)? Here I think of classic examples reminiscent of Armstrong’s case above. Suppose you’re one of a hundred people in white rooms. God is going to kill ninety-nine of you, if heads, or one, if tails, then ring his bell either way. His bell rings. If you had been killed, you wouldn’t have been around to hear it. Does this mean you were guaranteed to survive? Does this mean you shouldn’t update towards tails? No. So what’s the story? Why is never-having-been created different from getting killed?
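The update in the hundred-person case is ordinary Bayes, sketched here assuming a fair coin and that God picks his victims uniformly at random:

```python
from fractions import Fraction

N_PEOPLE = 100
PRIOR_HEADS = Fraction(1, 2)  # fair coin

# Chance that you in particular survive to hear the bell.
p_survive_heads = Fraction(N_PEOPLE - 99, N_PEOPLE)  # ninety-nine killed
p_survive_tails = Fraction(N_PEOPLE - 1, N_PEOPLE)   # one killed

w_heads = PRIOR_HEADS * p_survive_heads
w_tails = (1 - PRIOR_HEADS) * p_survive_tails
credence_tails = w_tails / (w_heads + w_tails)

print("Credence on tails after hearing the bell:", credence_tails)  # 99/100
```

You would not have been around to observe the other outcome, and the update towards tails goes through anyway.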

In general, it can feel to me like, because you happen to exist, SSA treats you like some sort of special snow-flake person — some sort of privileged ball that God must’ve gone “fishing for” in the urn, since after all, it got drawn. Or perhaps, on a different framing, SSA treats you like a kind of “ghostly observer,” who has learned, from the fact that it exists and is making observations, that it “would’ve been someone” in any world, and the only question is who. On this framing, it’s not that as Bob, you should assume that God would’ve created Bob in the heads world. Rather, God could’ve created anybody he wanted: but that person would’ve been you regardless — e.g., the ghostly observer would inhabit a different body. That is, in this case, had the coin landed heads, God may well have created Cheryl; but “you,” in that case, would’ve been Cheryl. (And presumably, the same would be true of Alice? You’d both have been Cheryl? Or something?)

Indeed, maybe the most SSA-ish story is something like: look, there’s a world spirit. You’re it. We’re all it. The world spirit, um, experiences a random sample of all the observer-moments in the world, no matter how many of them God creates. Thus, if God created just Cheryl, you’d be experiencing Cheryl. If God created Alice, Bob, etc, you’d be experiencing one of them or other. Thus, you’d be experiencing someone either way, and you shouldn’t update from the fact that you’re experiencing anything at all. (However, if you learn that you’re experiencing Cheryl in a land of chimps, you should update towards the chimps being illusions.) I doubt people will want to put things in these terms, but I think that this picture would in fact make sense of SSA’s reasoning. 

That said, I think that SSA-ers have other options/replies available here as well. In particular, I think that SSA can say something like: “look, I do in fact exist. Thus, if any of these epistemically-possible worlds are actual, then they do in fact contain me. So, it makes sense, in considering what credence to put on these worlds, to condition on my having been created in them — since if they were actual, I would’ve been.” This sort of line does have its own pull, and I think really running to ground some of the differences here might get tricky. In particular it looks like there’s some semantic difference re: “would I have existed if e.g. the coin had come up heads.” The thing I specified re: Cheryl was that on a counterfactual “if,” the answer (when Bob asks) is no. But the SSA-er presumably wants to say that a different “if” is relevant, one more akin to “if the coin came up heads, do I exist?” — and I don’t currently have an especially strong opinion about where debate about the “ifs” here will go.

Overall then, I think it’s probably best to construe the arguments in this section centrally as “here’s a way that someone in an SIA-like mindset can end up looking at SSA and saying: what?” Really pinning down the dialectic would take further work.

XI. SSA’s solipsism

One last dig at SSA: it loves solipsism. If you were the only thing that exists, it would be so likely that you are you. Like, 100%. Compared to these hypotheses where there are all these other people (8 billion of them? 100 billion throughout history? More to come? Come on. Don’t be ridiculous.), and you just happen to be you, SSA thinks that solipsism looks great. Indeed, if there are >100 billion people in the reference class in non-solipsism worlds, that’s a >100 billion to one update in favor of solipsism. And weren’t you way more than one-in-a-hundred-billion on solipsism anyway? Don’t you remember Descartes? How did you really know those other people existed in the first place? It would look the same regardless, you know. And don’t even get me started on the idea that animals are conscious, or that aliens exist. Please.

In fact, while we’re at it, what’s all this about your memories? That sounds like some “other-person-moments in the reference class” bullsh** to me. How many of them did you say there were? What’s that? We never defined any sort of temporal duration for a person-moment because obviously that’s going to be a silly discourse, but apparently we’re going to use the concept anyway and hope the issue never makes a difference? Hmm. Sounds suspicious to me. And sounds like the type of thing that would make it less likely that you were having these experiences in particular. Best to just do without. That 13th birthday party: never happened. And obviously your future, too, is out the window.

I jest, here, but it’s a real dynamic. Just as SIA loves big worlds, if you don’t know who you are, SSA loves small worlds, if you do. And the solipsist’s world is the smallest of all.

Next post in sequence: SIA > SSA, part 3: An aside on betting in anthropics. Or, if you don’t care about betting-related arguments, you can skip to Part 4: In defense of the presumptuous philosopher.
