The Multiple Stage Fallacy

by tyleralterman, 16th Mar 2016

This was originally written by Eliezer Yudkowsky and posted on his Facebook wall. It is reposted here with permission from the author:

 

In August 2015, renowned statistician and predictor Nate Silver wrote "Trump's Six Stages of Doom" in which he gave Donald Trump a 2% chance of getting the Republican nomination (not the presidency, the nomination).

It's too late now to register an advance disagreement, but now that I've seen this article, I do think that Nate Silver's argument was a clear instance of something that I say in general you shouldn't do - what I used to call the Conjunction Fallacy Fallacy and have now renamed to the Multiple-Stage Fallacy (thanks to Graehl for pointing out the naming problem).

The Multiple-Stage Fallacy is when you list multiple 'stages' that need to happen on the way to some final outcome, assign probabilities to each 'stage', multiply the probabilities together, and end up with a small final answer. In other words, you take an arbitrary event, break it down into apparent stages, and say "But you should avoid the Conjunction Fallacy!" to make it seem very low-probability.

In his original writing, Nate listed 6 things that he thought Trump needed to do to get the nomination - "Trump's Six Stages of Doom" - and assigned 50% probability to each of them.

(Original link here: http://fivethirtyeight.com/…/donald-trumps-six-stages-of-d…/)

We're now past the first four stages Nate listed, and prediction markets give Trump a 74% chance of taking the nomination. On Nate's logic, that should have been a 25% probability. So while a low prior probability might not have been crazy and Trump came as a surprise to many of us, the specific logic that Nate used is definitely not holding up in light of current events.
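As a rough check on the arithmetic above, here is a minimal Python sketch. It assumes only what the post states: six stages at 50% each, four of them already passed, and a 74% market price. The function and variable names are made up for illustration.

```python
# Illustrative arithmetic only, using the numbers quoted in the post.
STAGE_PROBABILITY = 0.5    # the 50% assigned to each stage
TOTAL_STAGES = 6
STAGES_PASSED = 4
MARKET_PROBABILITY = 0.74  # prediction-market figure quoted above


def product_of_stages(p: float, n: int) -> float:
    """Multiply the same per-stage probability p across n stages."""
    return p ** n


initial_estimate = product_of_stages(STAGE_PROBABILITY, TOTAL_STAGES)
remaining_estimate = product_of_stages(
    STAGE_PROBABILITY, TOTAL_STAGES - STAGES_PASSED
)

print(f"all six stages:  {initial_estimate:.4f}")    # 0.0156, i.e. about 2%
print(f"last two stages: {remaining_estimate:.2f}")  # 0.25
print(f"market price:    {MARKET_PROBABILITY:.2f}")  # 0.74
```

The gap between 0.25 and 0.74 is the point of the paragraph above: the per-stage numbers that looked reasonable in August no longer match what the market believes once four stages are behind us.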

On a probability-theoretic level, the three problems at work in the usual Multiple-Stage Fallacy are as follows:

1. First and foremost, you need to multiply *conditional* probabilities rather than the absolute probabilities. When you're considering a later stage, you need to assume that the world was such that every prior stage went through. Nate Silver was probably - though I here critique a man of some statistical sophistication - Nate Silver was probably trying to simulate his prior model of Trump accumulating enough delegates in March through June, not imagining his *updated* beliefs about Trump and the world after seeing Trump be victorious up to March.

1a. Even if you're aware in principle that you need to use conditional probabilities, it's hard to update far *enough* when you imagine the pure hypothetical possibility that Trump wins stages 1-4 for some reason - compared to how much you actually update when you actually see Trump winning! (Some sort of reverse hindsight bias or something? We don't realize how much we'd need to update our current model if we were already that surprised?)

2. Often, people neglect to consider disjunctive alternatives - there may be more than one way to reach a stage, so that not *all* the listed things need to happen. This doesn't appear to have played a critical role in Nate's prediction here, but I've often seen it in other cases of the Multiple-Stage Fallacy.

3. People have tendencies to assign middle-tending probabilities. So if you list enough stages, you can drive the apparent probability of anything down to zero, even if you solicit probabilities from the reader.

3a. If you're a motivated skeptic, you will be tempted to list more 'stages'.
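To make problems 1 and 2 concrete, here is a toy Python sketch. Every number in it is invented for illustration and none comes from Nate's article: it assumes a hidden "strong candidate" trait that makes the stages correlated, with the pass rates chosen so that each stage taken alone still looks like a 50% shot. The last few lines illustrate the disjunctive point with two hypothetical routes that are assumed to be independent.

```python
# Toy model with invented numbers; not a model of the actual primary.
P_STRONG = 0.2          # hypothetical prior that the candidate is "strong"
P_PASS_IF_STRONG = 0.9  # hypothetical per-stage pass rate if strong
P_PASS_IF_WEAK = 0.4    # hypothetical per-stage pass rate if weak


def prob_pass_first(n: int) -> float:
    """Joint probability of passing the first n stages, summed over the hidden trait."""
    return (P_STRONG * P_PASS_IF_STRONG ** n
            + (1 - P_STRONG) * P_PASS_IF_WEAK ** n)


# Problem 1: multiplying absolute (marginal) probabilities.
marginal_per_stage = prob_pass_first(1)    # about 0.5 by construction
naive_product = marginal_per_stage ** 6    # treats the stages as independent
joint_all_six = prob_pass_first(6)         # what the chain rule actually gives

# Conditional probability of the last two stages, given the first four went through.
remaining_given_four = prob_pass_first(6) / prob_pass_first(4)

# Problem 2: a stage reachable by either of two routes (assumed independent).
route_a, route_b = 0.5, 0.5
either_route = 1 - (1 - route_a) * (1 - route_b)

print(f"naive product of marginals: {naive_product:.3f}")
print(f"joint via chain rule:       {joint_all_six:.3f}")
print(f"last two given first four:  {remaining_given_four:.3f}")
print(f"stage with two routes:      {either_route:.2f}")
```

In this toy world the naive product understates the joint probability several times over, and the conditional probability of the last two stages ends up far above the 25% you would get by multiplying two unconditional 50% figures. The size of the effect depends entirely on the invented numbers; the direction is what matters.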

Fallacy #3 is particularly dangerous for people who've read a lot about the dangers of overconfidence. Right now, we're down to two remaining stages in Nate's "six stages of doom" for Trump, accumulating the remaining delegates and winning the convention. The prediction markets assign 74% probability that Trump passes through both of them. So the conditional probabilities must look something like 90% probability that Trump accumulates enough delegates, and then 80% probability that Trump wins the convention given those delegates.

Imagine how overconfident this would sound without the prediction market! Oh, haven't you heard that what people assign 90% probability usually doesn't happen 9 times out of 10?

But if you're not willing to make "overconfident" probability assignments like those, then you can drive the apparent probability of anything down to zero by breaking it down into enough 'stages'. In fact, even if someone hasn't heard about overconfidence, people's probability assignments often trend toward the middle, so you can drive down their "personally assigned" probability of anything just by breaking it down into more stages.
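A minimal Python sketch of this effect follows. The 80% cap and the stage counts are assumptions picked for illustration; only the 74% target comes from the prediction-market figure quoted earlier.

```python
# Sketch of how a cap on "acceptable" confidence interacts with the number of stages.
# The cap and stage counts below are hypothetical.

def capped_product(cap: float, n_stages: int) -> float:
    """Largest final probability reachable if no stage may exceed `cap`."""
    return cap ** n_stages


def required_per_stage(target: float, n_stages: int) -> float:
    """Equal per-stage probability needed for n_stages to multiply out to `target`."""
    return target ** (1.0 / n_stages)


CAP = 0.8      # hypothetical: the most a cautious reader will assign to any stage
TARGET = 0.74  # market probability quoted in the post

for n in (2, 5, 10, 20):
    print(
        f"{n:2d} stages: best case under cap = {capped_product(CAP, n):.3f}, "
        f"per-stage needed for {TARGET:.0%} = {required_per_stage(TARGET, n):.3f}"
    )
```

With more stages, the ceiling under a fixed cap shrinks toward zero while the per-stage confidence needed to land on the same final answer climbs toward 1. That is why listing more stages, by itself, pushes the result down.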

For an absolutely ridiculous and egregious example of the Multiple-Stage Fallacy, see e.g. this page which commits both of the first two fallacies at great length and invites the third fallacy as much as possible: http://www.jefftk.com/p/breaking-down-cryonics-probabilities. To be sure, Nate Silver didn't do anything remotely on THAT order, but it does put all three subfallacies on clearer display than they appear in "Trump's Six Stages of Doom".

From beginning to end, I've never used this style of reasoning and I don't recommend that you do so either. Beware the Multiple-Stage Fallacy!

 

Edit (3/16): Reply from Jeff Kaufman below:

 

a) A more recent version of that post's approach with more people's estimates is http://www.jefftk.com/p/more-cryonics-probability-estimates

b) My estimates were careful to avoid (1) but I didn't make this clear enough in the writeup apparently. All the probabilities I'm using are conditional on the earlier things not happening.

c) In that post I explicitly excluded disjunctive paths because I really do think there's pretty much one path that's far more likely. All the candidates people have offered me for disjunctive paths for cryonics seem much less likely.

d) I'm pretty unhappy about the insinuation that my skepticism is motivated. I'd love it if cryonics was likely to work! I care about so many people that I would be devastated to lose. And when I initially wrote that post two friends (Jim, Michael) had me mostly convinced. Then thinking about why it still seemed unlikely to me I tried a Fermi style estimate and became way more skeptical.

e) I'd like to see more examples of people using this approach and whether it was useful before calling it a fallacy or a reasoning trap.

To defend the critique of Jeff's reasoning a little, I do think Eliezer has a point with the 1a fallacy when he says that it's hard to properly condition on the fact that you've been surprised. For example, according to Jeff's estimates there's a ~3% chance that getting and keeping you frozen goes well. If this does happen you'd be hugely surprised at how well things go for cryonicists. There should be some explanation for that other than pure chance. (The problem is you can't search the space of explanations for each of the ~30 probabilities and adjust them appropriately.) Here's one simple explanation: Cryonics gets big and successful. Perhaps that's unlikely a priori, but given that something very weird happened it becomes plausible. This will strongly mess with the probabilities that determine if something goes wrong with reviving. The biggest one, 'The technology is never developed to extract the information', would certainly be lower. In fact, 9/10 of the probabilities would go down. Sometimes they could also go up.

I doubt that Jeff managed to take all of these possibilities into account. Properly conditioning each of the ~30 events on each of the ones before it going well seems like a pretty daunting task. That doesn't mean Jeff is horrendously wrong but he probably does make mistake 1a because that's just hard to avoid with this type of reasoning.

The name "Multiple Stage Fallacy" seems to encourage equivocation: Is it a fallacy to analyze the probability of an event by breaking it down into multiple stages, or a fallacy to make the mistakes Eliezer points to?

For the Nate Silver example, Eliezer does aim to point out particular mistakes. But for Jeff, the criticism comes sort of between these two possibilities: There's a claim that Jeff makes these mistakes (which seems to be wrong - see Jeff's reply), but it's as if the mere fact of "Multiple Stages" means there's no need to actually make an argument.

Yes, I found the criticism-by-insinuation of Jeff's post unhelpful, because none of these errors were obvious. A more concrete discussion of disagreements might be interesting.

(For what it's worth, Jeff's analysis still looks pretty plausible to me. My biggest disagreement is on the probabilities of something "other" going wrong, which look modestly too large to me after a decent attempt to think about what might fail. It's not clear that's even one of the kinds of errors Eliezer is talking about.)

"My biggest disagreement is on the probabilities of something "other" going wrong, which look modestly too large to me after a decent attempt to think about what might fail."

After a lot of discussion in the original post, I made a new model where I (a) removed steps I thought were very likely to succeed and (b) removed most of the mass from "other" since discussing with people had increased my confidence that we'd been exhaustive: http://lesswrong.com/lw/fz9

Sorry, if I'd done my homework I'd have linked to that and might have said you agreed with me!

I was pointing out the disagreement not to critique, but to highlight that in the piece Eliezer linked to as exhibiting the problems described, it seemed to me like the biggest issue was in fact a rather different problem.

I think this is dumb; I don't see any particular evidence that this happens very often, and I'm much more worried about people being overconfident about things based on tenuous, badly thought out, oversimplified models than I am about them being underconfident because of concerns like these.

According to Eliezer, in the cog sci literature people don't systematically undercorrect for biases like overconfidence when warned to avoid them. Rather, they're systematically miscalibrated about whether they're overcorrecting or undercorrecting. ("Yes, I know, I was surprised too.")

If our cure for moderate cases of overconfidence tends to produce extreme underconfidence, then we can be left worse off than we were originally, especially if there are community norms punishing people more for sounding overconfident than for sounding underconfident.

Hmm, though I agree with the idea that people tend to be overconfident, the critique of this style of reasoning is exactly that it leads to overconfidence. The argument "people tend to be overconfident, not underconfident" does not seem to bear a lot on the truth of this critique.

(E.g. underconfidence is having beliefs that tend towards the middling ranges; overconfidence is having extreme beliefs. Eliezer argues that this style of reasoning leads one to assign extremely low probabilities to events, which should be classified as overconfident.)

But point 3 relies on underconfident estimates of the individual factors.

I'm not sure that addresses Buck's point. I just don't think you can reduce this to "people tend to be overconfident", even if it's a conclusion in a limited domain.

And the discussion on Jeff's FB post: https://www.facebook.com/jefftk/posts/775488981742.

Especially helpful:

"Back when I was doing predictions for the Good Judgement Project this is something that the top forecasters would use all the time. I don't recall it being thought inaccurate and the superforecasters were all pretty sharp cookies who were empirically good at making predictions."

I hate taking something like this and calling it the XXX fallacy. The unsubstantiated swipe at the end is an excellent example of why.

I think the unsubstantiated swipe at the end is a separate problem from calling it the XXX fallacy.

They're linked, though. Labelling it as a fallacy makes it easier to just point to an example of this kind of reasoning with the implication that it's fallacious (so apparently no need to substantiate).

Yeah that's a good point.