This is the first in what might become a bunch of posts picking out issues from statistics and probability of relevance to EA. The format will be informal and fairly bite-size. None of this will be original, hopefully.

**Expectations are not outcomes**

Here we attempt to trim back the intuition that an expected value can be safely thought of as a representative value of the random variable.

**Situation 1**

A Rademacher random variable X takes the value 1 with probability 1/2 and otherwise -1. Its expectation is zero. We will almost surely never see any value other than -1 or 1.

This means that the expected value might not even be a number the distribution *could* produce. We might not even be able to get arbitrarily close to it.
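For anyone who prefers to see it computed, here is a minimal sketch using exact fractions. The helper names `expectation` and `distance_to_support` are made up for illustration and are not from any library.

```python
from fractions import Fraction

# Rademacher distribution written as value -> probability.
rademacher = {-1: Fraction(1, 2), 1: Fraction(1, 2)}

def expectation(dist):
    """Probability-weighted sum of the values of a finite distribution."""
    return sum(value * prob for value, prob in dist.items())

def distance_to_support(point, dist):
    """Smallest gap between `point` and any value the distribution can produce."""
    return min(abs(value - point) for value in dist)

mean = expectation(rademacher)
gap = distance_to_support(mean, rademacher)

# Prints the mean, whether the mean is a possible outcome, and the gap.
print(mean, mean in rademacher, gap)  # 0 False 1
```

The expectation is 0, it is not a possible outcome, and the nearest possible outcome is a full unit away.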

Imagine walking up to a table in a casino and betting that the next roll of a fair six-sided die will be 7/2, its expected value.

**Situation 2**

Researchers create a natural language simulation model. Upon receiving a piece of text as stimulus it outputs a random short story. What is the expectation of the story?

Let’s think about the first word. There will be some implied probability distribution over a dictionary. Its expectation is some *fractional combination of every word* in the dictionary. Whatever that means, and whatever it is useful for, it is not the start of a legible story - and should not be used as such.
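To make "fractional combination" concrete, here is a toy sketch. It assumes words are encoded as one-hot vectors, which is only one possible encoding, and the three-word dictionary and its probabilities are invented for the example.

```python
# Hypothetical three-word dictionary with made-up first-word probabilities.
vocab = ["once", "the", "dragons"]
probs = [0.5, 0.3, 0.2]

def one_hot(index, size):
    """Vector with 1.0 in position `index` and 0.0 everywhere else."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def expected_word_vector(probs):
    """Probability-weighted average of the one-hot vectors of all words."""
    size = len(probs)
    total = [0.0] * size
    for index, p in enumerate(probs):
        for i, component in enumerate(one_hot(index, size)):
            total[i] += p * component
    return total

mean_vector = expected_word_vector(probs)

# Does the expectation coincide with the encoding of any actual word?
is_a_word = any(
    mean_vector == one_hot(i, len(vocab)) for i in range(len(vocab))
)
print(mean_vector, is_a_word)  # [0.5, 0.3, 0.2] False
```

Under this encoding the "expected first word" is just the probability vector itself: half "once", three tenths "the", one fifth "dragons". It is a perfectly good description of the distribution and not something you could print as the opening of a story.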

What is the expected length of the story? What would a solution to that problem mean? Could one, for example, print the expected story?

**Situation 3**

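Some distributions have tails so fat that the expectation does not exist at all. For instance, the Cauchy distribution has an undefined expectation: the average of any number of independent draws from it is just as spread out as a single draw, so the sample mean never settles down.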
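A quick simulation gives a feel for this. The sketch below compares running sample means of standard Cauchy draws with those of standard normal draws, using only the standard library. The seed and the checkpoint sizes are arbitrary choices, and the helper names are made up for illustration.

```python
import math
import random

def cauchy_sample(rng):
    """Standard Cauchy draw via the inverse CDF: tan(pi * (U - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))

def running_means(draw, checkpoints, rng):
    """Sample means after each number of draws listed in `checkpoints`."""
    total, count, means = 0.0, 0, []
    for target in checkpoints:
        while count < target:
            total += draw(rng)
            count += 1
        means.append(total / count)
    return means

checkpoints = [100, 1_000, 10_000, 100_000]
rng = random.Random(0)  # arbitrary seed, only for reproducibility

cauchy_means = running_means(cauchy_sample, checkpoints, rng)
normal_means = running_means(lambda r: r.gauss(0.0, 1.0), checkpoints, rng)

for n, c, g in zip(checkpoints, cauchy_means, normal_means):
    print(f"n={n:>7}  Cauchy mean={c:+.3f}  normal mean={g:+.3f}")
```

The normal column should shrink towards zero as n grows. The Cauchy column has no such obligation: a single enormous draw can drag the average anywhere, at any sample size, and rerunning with a different seed typically gives a quite different picture.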

**Implication**

It is tempting to freely substitute an expectation in as a representative of a random variable. Suppose we used the following procedure in a blanket fashion:

- We are faced with a decision depending on an uncertain outcome.
- We take the expected value of the outcome.
- We use the expectation as a scenario to plan around.

Step three is unsafe in principle - even if sometimes not in practice.
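Here is the three-step procedure applied literally to the casino bet from Situation 1, as a rough sketch. It assumes a fair six-sided die and a bet that pays only if we name the exact face; `win_probability` is a made-up helper.

```python
from fractions import Fraction

# Step 1: the uncertain outcome is the roll of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def expectation(dist):
    """Probability-weighted sum of the values of a finite distribution."""
    return sum(value * prob for value, prob in dist.items())

def win_probability(bet, dist):
    """Chance that the outcome equals the value we bet on."""
    return dist.get(bet, Fraction(0))

# Step 2: take the expected value of the outcome.
planned_bet = expectation(die)

# Step 3: plan around the expectation, i.e. bet on it.
print(planned_bet, win_probability(planned_bet, die))  # 7/2 0

# For comparison, betting on any face the die can actually show.
print(win_probability(3, die))  # 1/6
```

The plan built around the expectation wins with probability zero, while the naive plan of picking any real face wins one time in six.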

If there is a next time (the length of this series is currently fractional) I hope to touch on some scenarios less easily dismissed as the concerns of a pedant.

Hopefully, we'll get there! It'll be mostly Bayesian though :)