Epistemic status: Extremely uncertain
In this post I argue against the claim that we should take into account the Anthropic Shadow effect when estimating the risk of catastrophic events.
If we wish to estimate the probability of some catastrophic event occurring in the next 100 years, the first thing that we are likely to try is to count how many times the catastrophe has already occurred in the past. Anthropic Shadow is the idea that this naive approach will lead to an underestimate of the true risk, due to observer selection effects. The lower the probability of human survival following the catastrophe, $Q$, the greater the degree of anthropic bias in our estimate.
I have spent some time trying to understand the logic underpinning the Anthropic Shadow effect, and I am becoming increasingly convinced that it has significant flaws. My personal view would now be that the naive approach to catastrophic risk estimation is probably ok after all. (My all-things-considered view is that a lot of smart people uncritically espouse the Anthropic Shadow argument, so there is a high chance that I am missing something!)
Outline of my argument
The original Anthropic Shadow argument can be summarized as follows (for a toy model), in two steps:
Step 1: Consider a planet where some catastrophe occurs randomly, at an average rate of $\lambda$ times per million years. Whenever it occurs, it causes the permanent extinction of all observers with probability $1 - Q$ (defining $Q$ as in the original paper). If observers still exist after 1 million years, we should expect them to have fewer than $\lambda$ catastrophes in their past, on average. In mathematical notation, writing $n$ for the number of catastrophes in the first million years:
$$E[n \mid \text{observers exist after 1 million years}] < \lambda$$
It is easiest to see why this is true when $Q = 0$. Observers then necessarily have $0$ catastrophes in their past.
Step 2: We are necessarily observers. Therefore, if we base our estimate of catastrophe frequency on the frequency of catastrophes in our past, our estimate will be biased downwards.
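Step 1 can be checked directly under one extra assumption that the toy model does not state explicitly: that catastrophes arrive as a Poisson process, so the number of catastrophes $n$ in one million years is Poisson distributed with mean $\lambda$, and each catastrophe is survived independently with probability $Q$. Under those assumptions, a sketch of the calculation is:

```latex
\begin{align*}
P(n \mid \lambda) &= \frac{e^{-\lambda}\lambda^{n}}{n!},
  \qquad P(\text{survive} \mid n) = Q^{n} \\
P(\text{survive} \mid \lambda)
  &= \sum_{n \ge 0} \frac{e^{-\lambda}\lambda^{n}}{n!}\, Q^{n}
   = e^{-\lambda(1-Q)} \\
E[n \mid \text{survive}, \lambda]
  &= \frac{\sum_{n \ge 0} n\, \frac{e^{-\lambda}\lambda^{n}}{n!}\, Q^{n}}{e^{-\lambda(1-Q)}}
   = \lambda Q \;\le\; \lambda
\end{align*}
```

The inequality is strict whenever $Q < 1$ and $\lambda > 0$, and for $Q = 0$ the conditional expectation is exactly zero, matching the special case above.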
This argument seems intuitively plausible. The first step is certainly true. But does the second step really follow from the first? In the next section I will consider a similar, non-anthropic, toy model for which an analogous argument fails, despite holding some of the same intuitive appeal. I will then spend the rest of the post exploring how the 'anthropicness' of the Anthropic Shadow effect might explain why the two models should be treated differently. Ultimately, I fail to do this convincingly, and conclude that the Anthropic Shadow argument probably does not work either.
More precisely, I am only able to get the Anthropic Shadow argument to work if I make two additional assumptions, not stated explicitly in the original paper:
- There are a very large number of observer-containing planets (or in the case of universe level catastrophes, such as vacuum decay, there are a very large number of observer-containing universes).
- A very specific choice of observer reference class must be made, when applying the anthropic part of the argument.
The first assumption is not so bad. If Anthropic Shadow occurred only under that assumption, we should still take it seriously. But the second assumption is very problematic. In the book Anthropic Bias, Nick Bostrom argues that we should only respect anthropic conclusions if they are relatively robust to the choice of observer reference class. It is on this basis that he rejects the Doomsday Argument (a less popular anthropic argument that also says we underestimate extinction risk). I am arguing here that Anthropic Shadow suffers from the same problem, and should probably be treated with similar suspicion to the Doomsday Argument.
Finally, I consider SIA (a largely reference class independent method of anthropic reasoning) to show that the Anthropic Shadow effect does not occur under that approach to anthropic reasoning either.
The non-anthropic analogy
As above, consider a planet where some catastrophe occurs randomly, at an average rate of $\lambda$ times per million years. But this time, there is no chance of it causing extinction. Instead, whenever it occurs there is a probability $1 - Q$ of it permanently changing the colour of the sky from blue to green. Inhabitants of this planet could still make an analogous argument to Anthropic Shadow:
Step 1: If the sky is still blue after 1 million years, we should expect there to be fewer than $\lambda$ catastrophes in the past, on average. In mathematical notation, writing $n$ for the number of catastrophes in the first million years:
$$E[n \mid \text{blue sky after 1 million years}] < \lambda$$
It is easiest to see why this is true when $Q = 0$. A blue sky then necessarily implies $0$ catastrophes in the past.
Step 2: We see a blue sky after 1 million years. Therefore, if we base our estimate of catastrophe frequency on the frequency of catastrophes in our past, our estimate will be biased downwards.
The only meaningful difference between this argument and Anthropic Shadow is the omission of the word 'necessarily' in the second step. But this argument is wrong. The colour of the sky gives these observers no additional information about the risk of catastrophe, on top of what they already have from the historical record. It is true that if you simulated history on this planet a large number of times, then observers with a blue sky after 1 million years would have biased estimates of catastrophe frequency, on average. But, perhaps counter-intuitively, that does not mean that they should take the colour of the sky into account when making their estimate.
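The "simulate history a large number of times" claim can be illustrated with a small Monte Carlo sketch. It assumes Poisson-distributed catastrophes (an assumption added here, not stated in the toy model), and the function name `simulate` and its parameters `lam` (catastrophe rate) and `q` (chance the sky stays blue through one catastrophe) are hypothetical names chosen for illustration.

```python
import random


def simulate(lam, q, trials, seed=0):
    """Simulate `trials` one-million-year histories of a single planet.

    lam: average number of catastrophes per million years
         (Poisson arrivals are an assumption of this sketch).
    q:   probability the sky stays blue through a single catastrophe.
    Returns (mean count over all histories, mean count over blue-sky histories).
    """
    rng = random.Random(seed)
    total_all = 0
    total_blue = 0
    n_blue = 0
    for _ in range(trials):
        # Count Poisson arrivals in [0, 1) using exponential waiting times.
        n = 0
        t = rng.expovariate(lam)
        while t < 1.0:
            n += 1
            t += rng.expovariate(lam)
        # The sky is still blue only if every catastrophe left it blue.
        blue = all(rng.random() < q for _ in range(n))
        total_all += n
        if blue:
            total_blue += n
            n_blue += 1
    return total_all / trials, total_blue / n_blue


if __name__ == "__main__":
    mean_all, mean_blue = simulate(lam=2.0, q=0.5, trials=100_000)
    print(f"mean catastrophes, all histories: {mean_all:.3f}")
    print(f"mean catastrophes, blue-sky only: {mean_blue:.3f}")
```

Under these assumptions the blue-sky average should come out noticeably below the overall average. That confirms Step 1 of the argument, but it says nothing yet about whether Step 2 follows.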
It is clear that we need to be more precise in order to unpick what is going on. To prove my claim that this non-anthropic analogy argument is false, I take a Bayesian approach. $\lambda$ is no longer a fixed unknown quantity. Instead, it is a random variable, reflecting our subjective uncertainty in catastrophe frequency. We now wish to update our prior distribution on $\lambda$, given the evidence. To make this update, the only quantity that need concern us is the likelihood:
$$P(\text{evidence} \mid \lambda)$$
Values of $\lambda$ with higher likelihood will then have their probability boosted in appropriate proportion by the Bayesian update.
We can now prove that the non-anthropic analogy is false by considering the likelihood, as follows:
$$P(n, \text{blue sky} \mid \lambda) = P(n \mid \lambda)\, P(\text{blue sky} \mid n, \lambda) = P(n \mid \lambda)\, P(\text{blue sky} \mid n)$$
The second factor above is now independent of $\lambda$, and so has no effect on the Bayesian update. The first term is independent of the colour of the sky, so the colour of the sky can have no effect on the Bayesian update. Therefore, the colour of the sky is irrelevant, and the inhabitants of this planet should stick with their naive estimate of catastrophe frequency.
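A numerical sketch of this update may make the point more concrete. It assumes Poisson-distributed catastrophe counts, a flat prior over a grid of candidate rates, and a per-catastrophe probability `q` that the sky stays blue. All of these are illustrative choices, and the function names are hypothetical.

```python
import math


def poisson_pmf(n, mean):
    """Probability of exactly n events for a Poisson variable with this mean."""
    return math.exp(-mean) * mean**n / math.factorial(n)


def posterior(likelihood, grid, prior):
    """Normalised posterior over a grid of candidate catastrophe rates."""
    weights = [likelihood(lam) * p for lam, p in zip(grid, prior)]
    total = sum(weights)
    return [w / total for w in weights]


def compare(n, q, grid):
    """Posterior ignoring the sky colour vs. posterior using the blue sky."""
    prior = [1.0 / len(grid)] * len(grid)  # flat prior, purely illustrative
    naive = posterior(lambda lam: poisson_pmf(n, lam), grid, prior)
    # P(blue sky | n) = q**n does not depend on the rate, so it cancels
    # when the posterior is normalised.
    with_sky = posterior(lambda lam: poisson_pmf(n, lam) * q**n, grid, prior)
    return naive, with_sky


if __name__ == "__main__":
    grid = [0.1 * k for k in range(1, 101)]
    naive, with_sky = compare(n=3, q=0.5, grid=grid)
    print(max(abs(a - b) for a, b in zip(naive, with_sky)))
```

The two posteriors should agree up to floating-point rounding, whatever prior is used, because the extra factor is the same constant for every candidate rate.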
Where does the 'anthropicness' in 'Anthropic Shadow' come in?
We have now shown that the anthropicness of the Anthropic Shadow argument (or the word 'necessarily' in Step 2) must be essential. In hindsight this should not be surprising. 'Anthropic' is in the name after all! But I still found this non-anthropic analogy very helpful to think about. It had not been clear to me when I first read the Anthropic Shadow paper what role the anthropicness was playing. The argument did seem intuitively plausible to me, but I can now see that it was intuitively plausible for the wrong reasons. That same intuition would have led me astray in the non-anthropic analogy above. (Of course, this might say more about me than it does about the argument.) But now that this has been clarified, can we explain, in similar Bayesian language to that used above, how the anthropic shadow argument actually works?
It will be helpful to first explain why the non-anthropic analogy argument fails. When the inhabitants of the planet see a blue sky, they know that they belong to a category of observers who will underestimate catastrophe frequency on average. Why shouldn't that concern them? We can understand why by splitting the likelihood up in a different way:
$$P(n, \text{blue sky} \mid \lambda) = P(\text{blue sky} \mid \lambda)\, P(n \mid \text{blue sky}, \lambda)$$
Split this way, it is now much easier to see where our non-anthropic analogy argument went wrong. The second factor above is where the bias referred to by our argument comes in. This factor favours larger $\lambda$, as the argument said it should. But our argument went wrong by ignoring the first factor. The first factor favours smaller $\lambda$. The two opposite effects must exactly cancel, so that in the end we only need to consider the naive likelihood: $P(n \mid \lambda)$, as already demonstrated.
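The cancellation can be written out explicitly in one special case. The sketch below assumes Poisson-distributed catastrophes with mean $\lambda$ per million years and an independent probability $Q$ that the sky stays blue through each one. The Poisson form is an assumption added for illustration, not something the argument depends on.

```latex
\begin{align*}
P(\text{blue sky} \mid \lambda) &= e^{-\lambda(1-Q)}
  && \text{(decreasing in } \lambda\text{)} \\
P(n \mid \text{blue sky}, \lambda) &= \frac{e^{-\lambda Q}(\lambda Q)^{n}}{n!}
  && \text{(as a function of } \lambda\text{, peaks at } \lambda = n/Q\text{)} \\
P(\text{blue sky} \mid \lambda)\, P(n \mid \text{blue sky}, \lambda)
  &= Q^{n}\, \frac{e^{-\lambda}\lambda^{n}}{n!}
   = Q^{n}\, P(n \mid \lambda)
  && \text{(peaks at } \lambda = n\text{)}
\end{align*}
```

Taken alone, the second factor would push the estimate up to $n/Q$, which is the bias correction the flawed argument asks for. Multiplying by the first factor brings the peak back to the naive value $n$.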
This point is very important, so I will summarise it again in less mathematical terms:
The planet's inhabitants can consider the bias in their historical record, given their blue sky, if they wish, but they must then also consider that the colour of the sky gives them information about $\lambda$ as well. The sky is more likely to be blue when $\lambda$ is smaller. Overall, they will end up with the same conclusions as if they had just ignored the colour of the sky completely.
The next question we need to ask is: why doesn't the same thing apply in the anthropic case? Why shouldn't we also consider the fact that we are more likely to exist today when the catastrophe rate is smaller, precisely cancelling out the effect of anthropic bias in our historical record?
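Let's consider the same equation in the anthropic case:
$$P(n, \text{we exist} \mid \lambda) = P(\text{we exist} \mid \lambda)\, P(n \mid \text{we exist}, \lambda)$$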
To make the Anthropic Shadow argument work, all we need to do is justify why it is ok to ignore the first factor in the anthropic case. That will then leave us with only the anthropic-bias-adjusted second factor. Next I will consider two possible ways that you could try to do this, and why I haven't found either of them to be satisfactory.
Possible solution 1: You should reason as if your existence is guaranteed (as long as it is logically possible)
One way to justify the Anthropic Shadow argument would be to simply assert that $P(\text{we exist} \mid \lambda) = 1$, as a general approach to anthropic reasoning. In other words, we should never be surprised about the fact that we exist. After all, we could never have observed non-existence.
This solution works, but it is very weird. Suppose that you play Russian roulette with one of two guns (chosen randomly), one that fires 999/1000 times and one that fires 1/1000 times. If you pull the trigger and survive, do you really have no information about which gun you picked up? It feels to me like you should be fairly confident that you picked up the second one.
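The intuition in the Russian roulette example corresponds to an ordinary Bayesian update that treats survival as evidence. A minimal sketch, assuming a 50/50 prior over the two guns (the function name and parameters are hypothetical):

```python
from fractions import Fraction


def posterior_safe_gun(p_fire_risky, p_fire_safe, prior_safe=Fraction(1, 2)):
    """Posterior probability of holding the safer gun after surviving one pull.

    Survival is treated as ordinary evidence, i.e. existence is NOT assumed
    to be guaranteed.
    """
    like_safe = 1 - p_fire_safe    # chance of surviving with the safer gun
    like_risky = 1 - p_fire_risky  # chance of surviving with the riskier gun
    numerator = like_safe * prior_safe
    return numerator / (numerator + like_risky * (1 - prior_safe))


if __name__ == "__main__":
    print(posterior_safe_gun(Fraction(999, 1000), Fraction(1, 1000)))
```

With the numbers from the example, this update leaves very little probability on the riskier gun, whereas the "existence is guaranteed" approach would leave the prior unchanged at one half.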
Another way to see how weird this solution is, is to consider planetary and cosmological fine tuning. It is well known that various conditions on our planet, and universe, seem fine tuned to allow the existence of life. One popular explanation of this is anthropic. In the case of our planet, it is obvious how this works. There are a very large number of planets in the universe and it makes sense that some of them are, by chance, suited to life. Of course we would have to find ourselves on one of these. At the cosmological level, the same solution could work, but usually it is stated that this requires some kind of multiverse.
Taking the $P(\text{we exist} \mid \lambda) = 1$ approach to anthropic reasoning would be equivalent to saying that the multiverse, or the existence of large numbers of planets, is not necessary to explain fine tuning. Under this approach, the fine tuning of the cosmological constants would not constitute evidence of a multiverse. This is because our existence should never be surprising anyway.
This solution seems too far-fetched to me.
In the original draft of this post, this section concluded here. However, Jonas Moss left some helpful comments below challenging these arguments. It was interesting to learn that their intuitions on the Russian roulette thought experiment were different to mine. They initially defended the idea that you really do have no information about which gun is which, after firing a gun once and surviving. Here is my more sophisticated challenge to that idea, which does not purely rely on an appeal to intuition:
Suppose that you really do have no information after firing a gun once and surviving. Then, if told to play the game again, you should be indifferent between sticking with the same gun, or switching to the different gun. Let's say you settle on the switching strategy (maybe I offer you some trivial incentive to do so). I, on the other hand, would strongly favour sticking with the same gun. This is because I think I have extremely strong evidence that the gun I picked is the less risky one, if I have survived once.
Now let's take an outside view. We imagine an outside observer watching the game, betting on which one of us is more likely to survive through two rounds. Obviously they would favour me over you. My odds of survival are approximately 50% (it more or less just depends on whether I pick the safe gun first or not). Your odds of survival are approximately 1 in 1000 (you are guaranteed to have one shot with the dangerous gun).
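The outside observer's comparison can be computed exactly. This sketch assumes the first gun is picked uniformly at random and that each pull is independent. The function names are hypothetical.

```python
from fractions import Fraction


def survive_two_rounds_stick(p_fire_risky, p_fire_safe):
    """Chance of surviving two pulls when the same gun is used both times."""
    s_safe = 1 - p_fire_safe
    s_risky = 1 - p_fire_risky
    # Half the time the safer gun was picked first, half the time the riskier.
    return Fraction(1, 2) * s_safe**2 + Fraction(1, 2) * s_risky**2


def survive_two_rounds_switch(p_fire_risky, p_fire_safe):
    """Chance of surviving two pulls when switching guns after round one."""
    # Whichever gun comes first, each gun is fired exactly once.
    return (1 - p_fire_safe) * (1 - p_fire_risky)


if __name__ == "__main__":
    risky, safe = Fraction(999, 1000), Fraction(1, 1000)
    print(float(survive_two_rounds_stick(risky, safe)))
    print(float(survive_two_rounds_switch(risky, safe)))
```

The sticking strategy survives roughly half the time and the switching strategy roughly one time in a thousand, in line with the approximate figures quoted above.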
This doesn't prove that the $P(\text{we exist}) = 1$ approach to formulating anthropic probabilities is wrong, but if we are ultimately interested in using probabilities to inform our decisions, I think this suggests that an alternative approach is better.
We could also imagine applying a similar chain of reasoning in the more relevant case of planetary catastrophes. If there were two planets in the solar system, and we somehow knew that one of them experienced catastrophes at a much higher rate than the other, which one should we decide to live on? Should we stay on the one we evolved on, with relatively few catastrophes in our past, or should we switch to the other? If we try to take the Anthropic Shadow effect into account then we could end up making a bad decision here.
Possible solution 2: There are a large number of planets containing life
This solution is inspired by the analogy with the anthropic explanation of fine tuning.
Suppose that instead of one observer-containing planet experiencing a catastrophe at a rate of $\lambda$ times per million years, there are a very large number of such observer-containing planets. If the number of planets is sufficiently large, then it is true that:
$$P(\text{observers exist on at least one planet after 1 million years} \mid \lambda) \approx 1$$
You could now try to argue for the Anthropic Shadow effect along the same lines as the anthropic explanation of fine tuning. It seems plausible to assert that in general
If this does not seem obvious to you, recall that this is precisely how the anthropic explanation of fine tuning works (and if you are still not convinced, you might believe in SIA instead, to be discussed in the Appendix). With this assumption it now looks like we almost have the result we want.
But there is a problem with this solution as well. The statement above is one thing, but what we actually need to show is this:
Or if we cannot show this, we at least need to show that the LHS is independent of $\lambda$. But crucially, the validity of this equation depends upon our choice of reference class (the set of observers that you consider yourself to be a sample from). It is well known that anthropic conclusions are often very sensitive to the choice of reference class, and that certainly applies here. If we take our reference class to be all the observers who exist at 1 million years into their planet's history, then it is true that the above equation holds and that the Anthropic Shadow argument does work. But this is a very specific, arbitrary, choice. It is also not immediately clear how to generalise this choice to the real world, outside of this toy example. Should our reference class be the observers who exist 4.5 billion years into their planet's development, or 3.7 billion years into life's development, or 100,000 years into the development of language possessing observers?
To see that Anthropic Shadow only occurs with this very specific reference class, we can imagine choosing a more general one instead. For example, we might choose the reference class of all observers who exist at 1 million years or prior, or the reference class of all observers who ever exist on one of these planets. We then have to consider the possibility that we might have existed at a different time. The probability of us finding ourselves at any particular point in history is equal to the proportion of observers in our reference class who live at that point in history.
For simplicity, since we are already using a toy model, let's assume that each planet has the same population, and that this population is constant over time. Then, in general, in the limit of a large number of planets, we would have:
For Anthropic Shadow to go through as originally argued, we need the right hand side to be independent of $\lambda$, but it is clearly not. In general, it is going to be a very complicated beast, with a strong dependence on the chosen reference class. We can ask instead whether this likelihood will consistently lead to an underestimate of catastrophic risk. In that case we might say that the Anthropic Shadow argument is still qualitatively correct, if not quantitatively (in a reference class insensitive sense). But no, the numerator is now once again proportional to the likelihood of a particular planet surviving for 1 million years, which was exactly the factor we saw that we needed in order to negate the anthropic bias in our historical record. There is not a lot we can say about the complicated RHS in general, but one of the few things we can say is that the numerator by itself will always cancel the anthropic bias in our historical record.
What about the denominator? It's true that there is some much more complicated dependence down there as well. In fact, the denominator is going to penalise small $\lambda$, especially if the reference class is wide. This is because the denominator is much larger if life on more planets lasts longer. This means that instead of cancelling the effect of anthropic bias in our historical record, this factor could actually make it even worse! But it would not be right to call this Anthropic Shadow. All we have done here is rediscover the Doomsday Argument. This says that the future is likely to be short, because if it were big then it would be unlikely for us to find ourselves so early on in history. This is precisely the same effect which is being captured by the denominator in the above expression.
To sum up, there is certainly a lot going on in the above equation, but there is nothing that you could meaningfully call Anthropic Shadow. If you were going to sum it up in words, it would best be described as: "naive estimate of catastrophe frequency + Doomsday Argument". Even the special reference class where we saw the Anthropic Shadow effect does hold could be re-interpreted in these terms. You can think of the Anthropic Shadow effect in the many-planet case as the Doomsday Argument applied to catastrophe probability estimation where the reference class is all observers across all planets who live at the same point in history as us.
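The reference class sensitivity can be illustrated numerically, with heavy caveats. The sketch below is one possible formalisation of the toy model, not a reproduction of the equation above. It assumes Poisson catastrophes at rate `lam` per million years, a per-catastrophe survival probability `q` strictly between 0 and 1, a constant population on each surviving planet, and an observer who finds themselves exactly 1 million years in with `n` catastrophes on record. The function names and the three labelled cases are hypothetical.

```python
import math


def log_poisson(n, mean):
    """Log-probability of n events for a Poisson variable with this mean."""
    return -mean + n * math.log(mean) - math.lgamma(n + 1)


def log_likelihoods(lam, n, q):
    """Log-likelihood of the rate `lam` under three treatments (assumptions)."""
    r = lam * (1.0 - q)               # extinction hazard per million years
    record = log_poisson(n, lam * q)  # P(n | planet still alive at 1 Myr, lam)
    return {
        # Reference class: only observers alive at exactly 1 Myr.
        "same_time_class": record,
        # Numerator only: planet survival probability times the record term.
        "numerator_only": -r + record,
        # Reference class: all observers who ever live. The fraction alive at
        # 1 Myr is exp(-r) / (1 / r), since expected observer-time is 1 / r.
        "all_time_class": math.log(r) - r + record,
    }


def mle(n, q, grid):
    """Grid-search maximum-likelihood rate for each treatment."""
    names = ("same_time_class", "numerator_only", "all_time_class")
    return {
        name: max(grid, key=lambda lam: log_likelihoods(lam, n, q)[name])
        for name in names
    }


if __name__ == "__main__":
    grid = [k / 100 for k in range(1, 2001)]
    print(mle(n=3, q=0.5, grid=grid))
```

Under these assumptions, only the same-time reference class reproduces the Anthropic Shadow correction (an estimate near `n / q`). Keeping the numerator alone returns the naive estimate `n`. The all-time class lands above the naive estimate for a different reason, namely the Doomsday-style penalty on long-lived worlds.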
In conclusion, I can only make sense of the Anthropic Shadow argument if two assumptions are made:
- There are a large number of planets (or in some cases, universes) containing life.
- Our reference class should only contain observers who live at the 'same' point in history as us. (Although it is not clear to me how to define 'same' outside of toy models, since this reference class must include alien life.)
In the book 'Anthropic Bias', Nick Bostrom says the following after reviewing some successful applications of anthropic reasoning:
"I wish to suggest that insensitivity (within limits) to the choice of reference class is exactly what makes the applications just surveyed scientifically respectable. Such robustness is one hallmark of scientific objectivity."
He goes on to make an analogy with Bayesian statistics: scientifically respectable conclusions should be insensitive (within limits) to your choice of prior. It is on this basis that he refutes the Doomsday Argument, and similar anthropic paradoxes (Adam+Eve paradox), because of their strong sensitivity to reference class choice. In his words:
"These arguments will fail to persuade anybody who doesn't use the particular kind of very inclusive reference class they rely on -indeed, reflecting on these arguments may well lead a reasonable person to adopt a more narrow reference class. Because they presuppose a very special shape of the indexical parts of one's prior credence function, they are not scientifically rigorous."
It seems to me that with its strong dependence on reference class choice, Anthropic Shadow belongs with the Doomsday Argument and Adam+Eve paradox in this category of non-rigorous anthropic argument. It differs from Doomsday and Adam+Eve in that its reference class is extremely exclusive, rather than extremely inclusive, but that is not important. What matters is that it is still extremely sensitive to the reference class. If anything, the reference class dependence is worse for Anthropic Shadow, because it is not even clear how you would define the required reference class outside of toy models.
Mini Appendix: What about SIA?
The Self-Indication Assumption (SIA) is an alternative, popular, approach to anthropic reasoning. It rejects the equation we saw before:
and instead claims that the LHS is more likely in worlds with more observers. Its appeal comes from the fact that it is largely reference class independent, and that it avoids the Doomsday Argument and Adam+Eve conclusions already mentioned (although it comes with unattractive conclusions of its own). I won't discuss it much more here except to say that SIA doesn't contain the Anthropic Shadow effect either. It doesn't matter what reference class we use to demonstrate this, so the simplest way to see it is to return to the final example considered in Possible Solution #2, and consider our final likelihood equation. SIA has the effect of removing the denominator in the likelihood expression, leaving only the numerator, which we already saw had the required form to precisely negate the anthropic shadow effect.
We are here considering events where we don't have good reason to think the base risk per century has changed much over time.