tobycrisford

Don’t Be Comforted by Failed Apocalypses

In the tile case, the observers who see a blue tile are underestimating on average. If you see a blue tile, you then know that you belong to that group, who are underestimating on average. But that still should not change your estimate. That's weird and unintuitive, but true in the coin/tile case (unless I've got the maths badly wrong somewhere).

I get that there is a difference in the anthropic case. If you kill everyone with a red tile, then you're right, the observers on average will be biased, because it's only the observers with a blue tile who are left, and their estimates were biased to begin with. But what I don't understand is, why is finding out that you are alive any different to finding out that your tile is blue? Shouldn't the update be the same?

Don’t Be Comforted by Failed Apocalypses

Thanks for your reply!

If 100 people do the experiment, the ones who end up with a blue tile will, on average, have fewer heads than they should, for exactly the same reason that most observers will live after comparatively fewer catastrophic events.

But in the coin case that still does not mean that seeing a blue tile should make you revise your naive estimate upwards. The naive estimate is still, in Bayesian terms, the correct one.

I don't understand why the anthropic case is different.

Don’t Be Comforted by Failed Apocalypses

I've never understood the Bayesian logic of the anthropic shadow argument. I actually posted a question about this on the EA Forum before, and didn't get a good answer. I'd appreciate it if someone could help me figure out what I'm missing. When I write down the causal diagram for this situation, I can't see how an anthropic shadow effect could be possible.

Section 2 of the linked paper shows that the probability of a catastrophic event having occurred in some time frame in the past *given that we exist now:* P(B_2|E), is smaller than its actual probability of occurring in that time frame, P. The two get more and more different the less likely we are to survive the catastrophic event (they call our probability of survival Q). It's easy to understand why that is true. It is more likely that we would exist now if the event did not occur than if it did occur. In the extreme case where we are certain to be wiped out by the event, then P(B_2|E) = 0.

This means that if you re-ran the history of the world thousands of times, the ones with observers around at our time would have fewer catastrophic events in their past, on average, than is suggested by P. I am completely happy with this.
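To make the "re-run history" picture concrete, here is a minimal sketch. It assumes (my simplification, not necessarily the paper's full model) a single time window in which the event occurs with probability P, observers survive the event with probability Q, and observers always survive if the event does not occur. Under those assumptions Bayes' rule gives P(B_2|E) = PQ / (PQ + 1 - P). The function names are just for illustration.

```python
import random


def shadow_probability(p, q):
    """P(B_2 | E): chance the event is in our past, given that we exist.

    Assumes survival is certain when the event does not occur.
    """
    return p * q / (p * q + (1 - p))


def simulate_histories(p, q, runs=200_000, seed=0):
    """Re-run history many times; return the event frequency among survivors."""
    rng = random.Random(seed)
    survivors = 0
    survivors_with_event = 0
    for _ in range(runs):
        event = rng.random() < p
        # Observers survive the event with probability q, and always survive otherwise.
        alive = (rng.random() < q) if event else True
        if alive:
            survivors += 1
            survivors_with_event += event
    # Note: this divides by zero if nobody ever survives (p = 1 and q = 0).
    return survivors_with_event / survivors
```

With Q = 1 the closed form reduces to P, and with Q = 0 it is 0, matching the extreme case described above. The simulated frequency among surviving histories should land close to the closed form and below P whenever Q < 1.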

But the paper then leaps from this observation to the conclusion that our naive estimate of the frequency of catastrophic events (i.e. our estimate of P) must be biased downwards. This is the point where I lose the chain of reasoning. Here is why.

What we care about here is *not* P(B_2|E). What we care about is our estimate of P itself. We would ideally like to calculate the posterior distribution of *P*, given both B_1,2 (the occurrence/non-occurrence of the event in the past), and our existence, E. The causal diagram here looks like this:

P -> B_2 -> E

This diagram means: P influences B_2 (the catastrophic event occurring), which influences E (our existence). But P does not influence E except through B_2.

*This means if we condition on B_2, the fact we exist now should have no further impact on our estimate of P*

To sum up my confusion: The distribution of (P|B_2,E) should be equivalent to the distribution of (P|B_2). I.e., there is no anthropic shadow effect.
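Spelled out, the step I am relying on is just Bayes' rule plus the conditional independence that the diagram encodes. This is a sketch under the assumption that the diagram is right, i.e. that E depends on P only through B_2:

```latex
\begin{align*}
\Pr(P \mid B_2, E)
  &= \frac{\Pr(E \mid B_2, P)\,\Pr(P \mid B_2)}{\Pr(E \mid B_2)} \\
  &= \frac{\Pr(E \mid B_2)\,\Pr(P \mid B_2)}{\Pr(E \mid B_2)}
     && \text{since } \Pr(E \mid B_2, P) = \Pr(E \mid B_2) \\
  &= \Pr(P \mid B_2).
\end{align*}
```

If the anthropic shadow argument is right, then I think it has to be because this independence assumption fails somewhere, and I can't see where.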

In my original EA forum question I took the messy anthropics out of it and imagined flipping a biased coin hundreds of times and painting a blue tile red with probability 1-Q (extinction) if we ever get a head. If we looked at the results of this experiment, we could estimate the bias of the coin by simply counting the number of heads. The colour of the tile is irrelevant. And we should go with the naive estimate, *even though it is again true that people who see a blue tile will have fewer heads on average than is suggested by the bias of the coin*.

What this observation about the tile frequencies misses is that the tile is more likely to be blue when the probability of heads is smaller (or we are more likely to exist if P is smaller), and we should take that into account too.
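Here is a minimal sketch of the coin/tile calculation, done exactly rather than by simulation. It assumes a uniform prior over a small grid of candidate biases, and it assumes each head independently paints the tile red with probability 1 - Q, so the tile is still blue after k heads with probability Q^k. Both choices are mine for illustration, as are the function names.

```python
from math import comb


def joint(p, k, n, q, prior):
    """P(bias = p, k heads in n flips, tile still blue).

    Assumption: each head independently paints the tile red with probability 1 - q.
    """
    return prior * comb(n, k) * p**k * (1 - p) ** (n - k) * q**k


def posterior_mean(k, n, q, grid, condition_on_blue):
    """Posterior mean of the bias given k heads, optionally also given a blue tile."""
    prior = 1 / len(grid)
    # Setting the survival factor to 1 marginalises over the tile colour.
    survive = q if condition_on_blue else 1.0
    weights = [joint(p, k, n, survive, prior) for p in grid]
    return sum(p * w for p, w in zip(grid, weights)) / sum(weights)


def mean_heads_fraction_given_blue(p, n, q):
    """Expected fraction of heads among blue-tile observers, for a known bias p."""
    weights = [joint(p, k, n, q, 1.0) for k in range(n + 1)]
    return sum(k * w for k, w in enumerate(weights)) / (n * sum(weights))
```

For example, with `grid = [i / 10 for i in range(1, 10)]`, the values `posterior_mean(3, 10, 0.5, grid, True)` and `posterior_mean(3, 10, 0.5, grid, False)` agree, because the factor Q^k does not depend on the bias and cancels in the normalisation. At the same time `mean_heads_fraction_given_blue(0.3, 10, 0.5)` is about 0.176, well below 0.3. So both statements hold together: blue-tile observers see fewer heads on average, and the tile colour still tells you nothing extra about the bias once you have counted the heads.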

Overall it seems like our naive estimate of P based on the frequency of the catastrophic event in our past is totally fine when all things are considered.

I'm struggling at the moment to see why the anthropic case should be different to the coin case.

Is the reasoning of the Repugnant Conclusion valid?

"I would say exactly the same for this. If these people are being freshly created, then I don't see the harm in treating them as identical."

I think you missed my point. How can 1,000 people be identical to 2,000 people? Let me give a more concrete example. Suppose again we have 3 possible outcomes:

(A) (Status quo): 1 person exists at high welfare +X

(B): Original person has welfare reduced to X - 2ε, 1000 new people are created at welfare +X

(C): Original person has welfare reduced only to X - ε, 2000 new people are created, 1000 at welfare ε, and 1000 at welfare X + ε.

And you are forced to choose between (B) and (C).

How do you pick? I think you want to say 1000 of the potential new people are "effectively real", but which 1000 are "effectively real" in scenario (C)? Is it the 1000 at welfare ε? Is it the 1000 at welfare X + ε? Is it some mix of the two?

If you take the first route, (B) is strongly preferred, but if you take the second, then (C) would be preferred. There's ambiguity here which needs to be sorted out.

"Then, supposedly no one is effectively real. But actually, I'm not sure this is a problem. More thinking will be required here to see whether I am right or wrong."

Thank you for finding and expressing my objection for me! This does seem like a fairly major problem to me.

"Sorry, but this is quite incorrect. The people in (C) would want to move to (B)."

No, they wouldn't, because the people in (B) are different to the people in (C). You can assert that you *treat* them the same, but you can't assert that they *are* the same. The (B) scenario with different people and the (B) scenario with the same people are both distinct, possible, outcomes, and your theory needs to handle them both. It can give the same answer to both, that's fine, but part of the set up of my hypothetical scenario is that the people *are* different.

"Isn't the very idea of reducing people to their welfare impersonal?"

Not necessarily. So called "person affecting" theories say that an act can only be wrong if it makes things worse *for someone*. That's an example of a theory based on welfare which is not impersonal. Your intuitive justification for your theory seemed to have a similar flavour to this, but if we want to avoid the non-identity problem, we need to reject this appealing sounding principle. It is possible to make things worse even though there is no one who it is worse *for*. Your 'effectively real' modification does this, I just think it reduces the intuitive appeal of the argument you gave.

The COILS Framework for Decision Analysis: A Shortened Intro+Pitch

Where would unintended consequences fit into this?

E.g. if someone says:

"This plan would cause X, which is good. (Co) X would not occur without this plan, (I) We will be able to carry out the plan by doing Y, (L) the plan will cause X to occur, and (S) X is morally good."

And I reply:

"This plan will also cause Z, which is morally bad, and outweighs the benefit of X"

Which of the 4 categories of claim am I attacking? Is it 'implementation'?

Is the reasoning of the Repugnant Conclusion valid?

You can assert that you consider the 1000 people in (B) and (C) to be identical, for the purposes of applying your theory. That does avoid the non-identity problem in this case. But the fact is that they are not the same people. They have different hopes, dreams, personalities, memories, genders, etc.

By treating these different people as equivalent, your theory has become more *impersonal*. This means you can no longer appeal to one of the main arguments you gave to support it: that your recommendations always align with the answer you'd get if you asked the people in the population whether they'd like to move from one situation to the other. The people in (B) would not want to move to (C), and vice versa, because that would mean they no longer exist. But your theory now gives a strong recommendation for one over the other anyway.

There are also technical problems with how you'd actually apply this logic to more complicated situations where the number of future people differs. Suppose that 1000 extra people are created in (B), but 2000 extra people are created in (C), with varying levels of welfare. How do you apply your theory then? You now need 1000 of the 2000 people in (C) to be considered 'effectively real', to continue avoiding conclusions like those of the non-identity problem, but which 1000? How do you pick? Different ways of picking will give you very different answers, and again your theory is becoming more impersonal, and losing more of its initial intuitive appeal.

Another problem is what to do under uncertainty. What if instead of a forced choice between (B) and (C), the choice is between:

0.1% chance of (A), 99.9% chance of (B)

0.1000001% chance of (A), 99.8999999% chance of (C).

Intuitively, the recommendations here should not be very different to the original example. The first choice should still be strongly preferred. But are the 1000 people still considered 'effectively real' in your theory, in order to allow you to reach that conclusion? Why? They're not guaranteed to exist, and actually, your real preferred option, (A), is more likely to happen with the second choice.

Maybe it's possible to resolve all these complications, but I think you're still a long way from that at the moment. And I think the theory will look a lot less intuitively appealing once you're finished.

I'd be interested to read what the final form of the theory looks like if you do accomplish this, although I still don't think I'm going to be convinced by a theory which will lead you to be predictably in conflict with your future self, even if you and your future self both follow the theory. I can see how that property can let you evade the repugnant conclusion logic while still *sort of* being transitive. But I think that property is just as undesirable to me as non-transitiveness would be.

Is the reasoning of the Repugnant Conclusion valid?

"We minimise our loss of welfare according to the methodology and pick B, the 'least worst' option."

But (B) doesn't minimise our loss of welfare. In (B) we have welfare X - 2ε, and in (C) we have welfare X - ε, so wouldn't your methodology tell us to pick (C)? And this is intuitively clearly wrong in this case. It's telling us not to make a negligible sacrifice to our welfare now in order to improve the lives of future generations, which is the same problematic conclusion that the non-identity problem gives to certain theories of population ethics.

I'm interested in how your approach would tell us to pick (B), because I still don't understand that?

I won't reply to your other comment, just to keep the thread in one place from now on (my fault for adding a P.S., so I'm trying to fix the mistake). But in short, yes, I disagree, and I think that these flaws are unfortunately severe and intractable. The 'forcing' scenario I imagined is more like the real world than the unforced decisions. For most of us making decisions, the fact that people will exist in the future is inevitable, and we have to think about how we can influence their welfare. We are therefore in a situation like (2), where we are going to move from (A) to either (B) or (C) and we just get to pick which of (B) or (C) it will be. Similarly, figuring out how to incorporate uncertainty is also fundamental, because all real-world decisions are made under uncertainty.

Is the reasoning of the Repugnant Conclusion valid?

I understood your rejection of the total ordering on populations, and as I say, this is an idea that others have tried to apply to this problem before.

But the approach others have tried to take is to use the lack of a precise "better than" relation to evade the logic of the repugnant conclusion arguments, while still ultimately concluding that population Z is worse than population A. If you only conclude that Z is not worse than A, and A is not worse than Z (i.e. we should be indifferent about taking actions which transform us from world A to world Z), then a lot of people would still find that repugnant!

Or are you saying that your theory tells us *not* to transform ourselves to world Z? Because we should only ever do anything that will make things actually better?

If so, how would your approach handle uncertainty? What probability of a world Z should we be willing to risk in order to improve a small amount of real welfare?

And there's another way in which your approach still contains some form of the repugnant conclusion. If a population stopped dealing in hypotheticals and actually started taking actions, so that these imaginary people became real, then you could imagine a population going through all the steps of the repugnant conclusion argument process, thinking they were making improvements on the status quo each time, and finding themselves ultimately ending up at Z. In fact it can happen in just two steps, if the population of B is made large enough, with small enough welfare.

I find something a bit strange about it being different when happening in reality to when happening in our heads. You could imagine people thinking

"Should we create a large population B at small positive welfare?"

"Sure, it increases positive imaginary welfare and does nothing to real welfare"

"But once we've done that, they will then be real, and so then we might want to boost their welfare at the expense of our own. We'll end up with a huge population of people with lives barely worth living, that seems quite repugnant."

"It is repugnant, we shouldn't prioritise imaginary welfare over real welfare. Those people don't exist."

"But if we create them they *will* exist, so then we will end up deciding to move towards world Z. We should take action now to stop ourselves being able to do that in future."

I find this situation of people being in conflict with their future selves quite strange. It seems irrational to me!

Is the reasoning of the Repugnant Conclusion valid?

It sounds like I have misunderstood how to apply your methodology. I would like to understand it though. How would it apply to the following case?

Status quo (A): 1 person exists at very high welfare +X

Possible new situation (B): Original person has welfare reduced to X - 2ε, 1000 people are created with very high welfare +X

Possible new situation (C): Original person has welfare X - ε, 1000 people are created with small positive welfare ε.

I'd like to understand how your theory would answer two cases: (1) We get to choose between all of A,B,C. (2) We are forced to choose between (B) and (C), because we know that the world is about to instantaneously transform into one of them.

This is how I had understood your theory to be applied:

- Neither (B) nor (C) is better than (A), because an instantaneous change from (A) to (B) or (C) would reduce real welfare (of the one already existing person).
- (A) is not better than (B) or (C) because to change (B) or (C) to (A) would cause 1000 people to disappear (which is a lot of negative real welfare).
- (B) and (C) are neither better nor worse than each other, because an instantaneous change of one to the other would involve the loss of 1000 existing people (negative real welfare) which is only compensated by the creation of imaginary people (positive imaginary welfare). It's important here that the 1000 people in (B) and (C) are not the same people. This is the non-identity problem.

From your reply it sounds like you're coming up with a different answer when comparing (B) to (C), because both ways round the 1000 people are always considered imaginary, as they don't literally exist in the status quo? Is that right?

If so, that still seems like it gives a nonsensical answer in this case, because it would then say that (C) is better than (B) (real welfare is reduced by less), when it seems obvious that (B) is actually better? This is an even worse version of the flaw you've already highlighted, because the existing person you're prioritising over the imaginary people is already at a welfare well above the 0 level.

If I've got something wrong and your methodology can explain the intuitively obvious answer that (B) is better than (C), and should be chosen in example (2) (regardless of their comparison to A), then I would be interested to understand how that works.

I can see that is a difference between the two cases. What I'm struggling to understand is why that leads to a different answer.

My understanding of the steps of the anthropic shadow argument (possibly flawed or incomplete) is something like this:

You are an observer -> We should expect observers to underestimate the frequency of catastrophic events on average, if they use the frequency of catastrophic events in their past -> You should revise your estimate of the frequency of catastrophic events upwards

But in the coin/tile case you could make an exactly analogous argument:

You see a blue tile -> We should expect people who see a blue tile to underestimate the frequency of heads on average, if they use the frequency of heads in their past -> You should revise your estimate of the frequency of heads upwards.

But in the coin/tile case, this argument is wrong, even though it appears intuitively plausible. If you do the full Bayesian analysis, that argument leads you to the wrong answer. Why should we trust the argument of identical structure in the anthropic case?