Toby_Ord

Shapley values: Better than counterfactuals

While I think the Shapley value can be useful, there are clearly cases where the counterfactual value is superior for an agent deciding what to do. Derek Parfit explains this well in Five Mistakes in Moral Mathematics. He is arguing against the 'share of the total' view, but at least some of his arguments also apply to the Shapley value (which is basically an improved version of 'share of the total'). In particular, the best things you have listed in favour of the Shapley value applied to making a moral decision correctly apply when you and others are all making the decision 'together'. If the others have already committed to their part in a decision, the counterfactual value approach looks better.

e.g. on your first example, if the other party has already paid their $1000 to P, you face a choice between creating 15 units of value by funding P or 10 units by funding the alternative. Simple application of Shapley value says you should do the action that creates 10 units, predictably making the world worse.
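A minimal sketch of this comparison in Python. The game itself is an assumption reconstructed from the numbers above: P is taken to produce its 15 units only if both parties fund it, and the names `shapley_values`, `value` and `ALTERNATIVE` are purely illustrative.

```python
from itertools import permutations

def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orders) for p, t in totals.items()}

# Assumed game: P yields 15 units only if both parties fund it.
def value(coalition):
    return 15.0 if coalition == frozenset({"you", "other"}) else 0.0

shapley = shapley_values(["you", "other"], value)

# Counterfactual value of funding P once the other party has already paid.
counterfactual = value(frozenset({"you", "other"})) - value(frozenset({"other"}))

ALTERNATIVE = 10.0  # units of value from funding the alternative instead
```

Under these assumptions your Shapley value for funding P comes out below the 10 units of the alternative, while the counterfactual value, given that the other party has already paid, is the full 15.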

One might be able to get the best of both methods here if you treat cases like this where another agent has already committed to a known choice as part of the environment when calculating Shapley values. But you need to be clear about this. I consider this kind of approach to be a hybrid of the Shapley and counterfactual value approaches, with Shapley only being applied when the other agents' decisions are still 'live'. As another example, consider your first example and add the assumption that the other party hasn't yet decided, but that you know they love charity P and will donate to it for family reasons. In that case, the other party's decision, while not yet made, is not 'live' in the relevant sense and you should support P as well.

If you are going to pursue what the community could gain from considering Shapley values, then look into cases like this and subtleties of applying the Shapley value further — and do read that Parfit piece.

Are we living at the most influential time in history?

I don't have time to get into all the details, but I think that while your intuition is reasonable (I used to share it), the maths does actually turn out my way, at least on one interpretation of what you mean. I looked into this when wondering if the doomsday argument suggested that the EV of the future must be small. Try writing out the algebra for a Gott-style prior that there is an x% chance we are in the first x%, for all x. You get a Pareto distribution, which is a power law with infinite mean. While there is very little chance on this prior that there is a big future ahead, the size of each possible future compensates for that: each order of magnitude of increasing size of the future contributes an equal expected amount of population to the future, so the sum is infinite.

I'm not quite sure what to make of this, and it may be quite brittle (e.g. if we were somehow certain that there weren't more than 10^100 people in the future, the expected population wouldn't be all that high), but as a raw prior I really think it is both an extreme outside view, saying we are equally likely to live at any relative position in the sequence *and* that there is extremely high (infinite) EV in the future -- not because it thinks there is any single future whose EV is high, but because the series diverges.

This isn't quite the same as your claim (about influence), but does seem to 'save existential risk work' from this challenge based on priors (I don't actually think it needed saving, but that is another story).
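The Gott-style prior described above can be sketched numerically. This is only an illustration under stated assumptions: N is the total number of people who will ever live, n is the number so far, and our relative position n/N is taken to be uniform on (0, 1], which gives the density n/N^2 for N >= n. The function names are hypothetical, and the figure of 10^11 people so far is an assumed round number; with a cap as large as 10^100 the difference between total and future population is negligible.

```python
import math

def prob_total_exceeds(m, n=1.0):
    """P(N > m) for m >= n, if our relative position n/N is uniform on (0, 1]."""
    return n / m

def expected_contribution(lo, hi, n=1.0):
    """Integral of N * (n / N**2) over [lo, hi], which equals n * ln(hi / lo)."""
    return n * math.log(hi / lo)

def mean_given_cap(cap, n=1.0):
    """E[N | N <= cap]: the truncated Pareto mean, finite for any finite cap."""
    return n * math.log(cap / n) / (1.0 - n / cap)

# Each order of magnitude of N contributes the same amount, n * ln(10),
# so the untruncated sum over all orders of magnitude diverges.
per_decade = [expected_contribution(10.0**k, 10.0**(k + 1)) for k in range(6)]

# With an assumed 1e11 people so far and a cap of 1e100, the mean is only
# a couple of hundred times the population to date.
capped_mean = mean_given_cap(1e100, n=1e11)
```

The capped case illustrates the brittleness: the infinite mean comes entirely from the divergence of the series, not from any single large future.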

Are we living at the most influential time in history?

Thanks for this very thorough reply. There are so many strands here that I can't really hope to do justice to them all, but I'll make a few observations.

1) There are two versions of my argument. The weak/vague one is that a uniform prior is wrong and the real prior should decay over time, such that you can't make your extreme claim from priors. The strong/precise one is that it should decay as 1/n^2 in line with a version of LLS. The latter is more meant as an illustration. It is my go-to default for things like this, but my main point here is the weaker one. It seems that you agree that it should decay, and that the main question now is whether it does so fast enough to make your prior-based points moot. I'm not quite sure how to resolve that. But I note that from this position, we can't reach either your argument that from priors this is way too unlikely for our evidence to overturn (and we also can't reach my statement of the opposite of that).

2) I wouldn't use the LLS prior for arbitrary superlative properties where you fix the total population. I'd use it only if the population over time was radically unknown (so that the first person is much more likely to be strongest than the thousandth, because there probably won't be a thousand) or where there is a strong time dependency such that it happening at one time rules out later times.

3) You are right that I am appealing to some structural properties beyond mere superlatives, such as extinction or other permanent lock-in. This is because these things happening in a century would be sufficient for that century to have a decent chance of being the most influential (technically this still depends on the influenceability of the event, but I think most people would grant that conditional on next century being the end of humanity, it is no longer surprising at all if this or next century were the most influential). So I think that your prior setting approach proves too much, telling us that there is almost no chance of extinction or permanent lock-in next century (and even after updating on evidence). This feels fishy. A bit like Bostrom's 'presumptuous philosopher' example. I think it looks even more fishy in your worked example where the prior is low precisely because of an assumption about how long we will last without extinction: especially as that assumption is compatible with, say, a 50% chance of extinction in the next century. (I don't think this is a knockdown blow here: but I'm trying to indicate the part of your argument I think would be most likely to fall and roughly why).

4) I agree there is an issue to do with too many hypotheses, and a related issue with what is the first timescale on which to apply a 1/2 chance of the event occurring. I think these can be dealt with together. You modify the raw LLS prior by some other kind of prior you have for each particular type of event (which you need to have since some are sub-events of others and rationality requires you to assign lower probability to them). You could operationalise this by asking over what time frame you'd expect a 1/2 chance of that event occurring. Then LLS isn't acting as an indifference principle, but rather just as a way of keeping track of how to update your ur prior in light of how many time periods have elapsed without the event occurring. I think this should work out somewhat similarly, just with a stretched PDF that still decays as 1/n^2, but am not sure. There may be a literature on this.

Are we living at the most influential time in history?

I'm sympathetic to the mixture of simple priors approach and value simplicity a great deal. However, I don't think that the uniform prior up to an arbitrary end point is the simplest as your comment appears to suggest. e.g. I don't see how it is simpler than an exponential distribution with an arbitrary mean (which is the max entropy prior over R+ conditional on a finite mean). I'm not sure if there is a max entropy prior over R+ without the finite mean assumption, but 1/x^2 looks right to me for that.

Also, re having a distribution that increases over a fixed time interval giving a peak at the end, I agree that this kind of thing is simple, but note that since we are actually very uncertain over when that interval ends, that peak gets very smeared out. Enough so that I don't think there is a peak at the end at all when the distribution is denominated in years (rather than centiles through human history or something). That said, it could turn into a peak in the middle, depending on the nature of one's distribution over durations.

Are we living at the most influential time in history?

I don't think I'm building in any assumptions about living extremely early -- in fact I think it makes as little assumption on that as possible. The prior you get from LLS or from Gott's doomsday argument says the median number of people to follow us is as many as have lived so far (~100 billion), that we have an equal chance of being in any quantile, and so for example we only have a 1 in a million chance of living in the first millionth. (Though note that since each order of magnitude contributes an equal expected value and there are infinitely many orders of magnitude, the expected number of people is infinite / has no mean.)

Are we living at the most influential time in history?

You are right that having a fuzzy starting point for when we started drawing from the urn causes problems for Laplace's Law of Succession, making it less appropriate without modification. However, note that in terms of people who have ever lived, there isn't that much variation as populations were so low for so long, compared to now.

I see your point re 'arbitrary superlatives', but am not sure it goes through technically. If I could choose a prior over the relative timescale of beginning to the final year of humanity, I would intuitively have peaks at both ends. But denominated in years, we don't know where the final year is and have a distribution over this that smears that second peak out over a long time. This often leaves us just with the initial peak and a monotonic decline (though not necessarily of the functional form of LLS). That said, this interacts with your first point, as the beginning of humanity is also vague, smearing that peak out somewhat too.

Are we living at the most influential time in history?

That's interesting. Earlier I suggested that a mixture of different priors that included some like mine would give a result very different to your result. But you are right to say that we can interpret this in two ways: as a mixture of ur priors or as a mixture of priors we get after updating on the length of time so far. I was implicitly assuming the latter, but maybe the former is better and it would indeed lessen or eliminate the effect I mentioned.

Your suggestion is also interesting as a general approach, choosing a distribution over these Beta distributions instead of debating between certainty in (0,0), (0.5, 0.5), and (1,1). For some distributions over Beta parameters the maths is probably quite tractable. That might be an answer to the right meta-rational approach rather than an answer to the right rational approach, or something, but it does seem nicely robust.

Are we living at the most influential time in history?

Quite high. If you think it hasn't happened yet, then this is a problem for my prior that Will's doesn't have.

More precisely, the argument I sketched gives a prior whose PDF decays roughly as 1/n^2 (which corresponds to the chance of it first happening in the next period after n absences decaying as ~1/n). You might be able to get some tweaks to this such that it is less likely than not to happen by now, but I think the cleanest versions predict it would have happened by now. The clean version of Laplace's Law of Succession, measured in centuries, says there would only be a 1/2,001 chance it hadn't happened before now, which reflects poorly on the prior, but I don't think it quite serves to rule it out. If you don't know whether it has happened yet (e.g. you are unsure of things like Will's Axial Age argument), this would give some extra weight to that possibility.
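A small sketch of the arithmetic behind these figures, assuming the clean version of Laplace's Law of Succession (a uniform prior on the unknown per-period hazard rate). The function names are illustrative only.

```python
from fractions import Fraction

def chance_no_event_yet(n):
    """P(no event in the first n periods) under a uniform prior on the hazard rate."""
    return Fraction(1, n + 1)

def chance_next_given_absent(n):
    """Laplace's rule: P(event in period n+1 | none in the first n)."""
    return Fraction(1, n + 2)

def chance_first_event_after(n):
    """Unconditional P(the first event falls in period n+1)."""
    return chance_no_event_yet(n) * chance_next_given_absent(n)

# 2,000 centuries without the event: the chance of having got this far.
still_waiting = chance_no_event_yet(2000)
```

The unconditional chance is 1/((n+1)(n+2)), which is the roughly 1/n^2 decay of the PDF, while the conditional chance decays as roughly 1/n.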

Are we living at the most influential time in history?

Hi Will,

It is great to see all your thinking on this down in one place: there are lots of great points here (and in the comments too). Explaining your thinking so clearly makes it much easier to see where one departs from it.

My biggest departure is on the prior, which actually does most of the work in your argument: it creates the extremely high bar for evidence, which I agree probably couldn’t be met. I’ve mentioned before that I’m quite sure the uniform prior is the wrong choice here and that this makes a big difference. I’ll explain a bit about why I think that.

As a general rule, if you have a domain like this that extends indefinitely in one direction, the correct prior is one that diminishes as you move further away in that direction, rather than picking a somewhat arbitrary end point and using a uniform prior on that. People do take this latter approach in scientific papers, but I think it is usually wrong to do so. Moreover, in your case in particular, there are also good reasons to suspect that the chance of a century being the most influential should diminish over time, especially because there are important kinds of significant event (such as value lock-in or an existential catastrophe) where early occurrence blocks out later occurrence.

This directly leads to diminishing credence over time. e.g. if there is a known constant chance of such a key event happening in any century *conditional on not happening before that time* then the chance it first happens in any century diminishes exponentially as time goes on. Or if this chance is unknown and could be anything between zero and one, then instead of an exponential decline, it diminishes more slowly (analogous to Weitzman discounting). The most famous model of this is Laplace’s Law of Succession, where if your prior for the unknown constant hazard rate per time period is uniform on the interval between 0 and 1, then the chance it happens in the next period, if it hasn’t in the n periods before, is 1/(n+2), a hyperbola. I think hazard rates closer to zero and one are more likely than those in between, so I prefer the bucket-shaped Jeffreys prior (= Beta(0.5, 0.5) for the maths nerds out there), which gives a different hyperbola of 1/(2n+2) (and makes my case a little bit harder than if I’d settled for the uniform prior).

A raw application of this would say that since Homo sapiens has been around for 2,000 centuries (without, let us suppose, having had such a one-off critical time yet), the chance it happens this century is 1 in 2,002 (or 1 in 4,002). [Actually I’ll just say 1 in 2,000 (or 1 in 4,000), as the +2 is just an artefact of how we cut up the time periods and can be seen to go to zero when we use continuous time.] This is a lot more likely than your 1 in a million or 1 in 100,000. And it gets even more so when you run it in terms of persons or person years (as I believe you should), i.e. measure time with a clock that ticks as each lifetime ends, rather than one that ticks each second. e.g. about 1/20th of all people who have ever lived are alive now, so the next century is not really 1/2,000th of human history but more like 1/20th of it. On this clock and with this prior, one would expect a 1/20 (or 1/40) chance of a pivotal event (first) occurring.
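Both hyperbolas are special cases of the posterior predictive under a Beta(a, b) prior on the hazard rate: after n periods without the event, the chance of it in the next period is a/(a+b+n). A minimal sketch of that arithmetic, with an illustrative function name:

```python
from fractions import Fraction

def next_period_chance(n, a, b):
    """Chance of the event in the next period after n periods without it,
    given a Beta(a, b) prior on the per-period hazard rate."""
    a, b = Fraction(a), Fraction(b)
    return a / (a + b + n)

half = Fraction(1, 2)

# 2,000 centuries of Homo sapiens without such an event.
uniform_prior = next_period_chance(2000, 1, 1)         # Beta(1, 1): 1/(n+2)
jeffreys_prior = next_period_chance(2000, half, half)  # Beta(0.5, 0.5): 1/(2n+2)
```

With n = 2,000 these give 1 in 2,002 and 1 in 4,002 respectively.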

Note that while your model applied a kind of principle of indifference uniformly across time, saying each century was equally likely (a kind of outside view), my model makes similar-sounding assumptions. It assumes that each century is equally likely to have such a high stakes pivotal event (conditional on it not already happening), and if you do the maths, this also corresponds to each order of magnitude of time having an equal (unconditional) chance of the pivotal event happening in it (i.e. instead of equal chance in century 1, century 2, century 3… it is equal chance in centuries 1 to 10, centuries 10 to 100, centuries 100 to 1,000), which actually seems more intuitive to me. Then there is the wrinkle that I don’t assign it across clock time, but across persons or person-years (e.g. where I say ‘century’ you could read it as ‘1 trillion person years’). All these choices are inspired by very similar motivations to how you chose your prior.

*[As an interesting side-note, this kind of prior is also what you get if you apply Richard Gott’s version of the Doomsday Argument to estimate how long we will last (say, instead of the toy model you apply), and this is another famous way of doing outside-view forecasting.]*

I doubt I can easily convince you that the prior I’ve chosen is objectively best, or even that it is better than the one you used. Prior-choice is a bit of an art, rather like choice of axioms. But I hope you see that it does show that the whole thing comes down to whether you choose a prior like you did, or another reasonable alternative. My prior gives a prior chance of HoH of about 5% or 2.5%, which is thousands of times more likely than yours, and can easily be bumped up by the available evidence to probabilities >10%. So your argument doesn’t do well on sensitivity analysis over prior-choice. Additionally, if you didn’t know which of these priors to use and used a mixture with mine weighted in to a non-trivial degree, this would also lead to a substantial prior probability of HoH. And this is only worse if instead of using a 1/n hyperbola like I did, you had arguments that it declined more quickly, like 1/n^2 or an exponential. So it only goes through if you are very solidly committed to a prior like the one you used.

This is a very nice explanation, Ben.

For the record, while I'm perhaps the most prominent voice in EA for our time being one of the most influential there will ever be, I'm also very sympathetic to this approach. For instance, my claim is that this key time period has already been going for 75 years and can't last more than a small number of centuries. This is quite compatible with more important times being 100 years away, and with the arguments that investing for long periods like that could provide a large increase in the expected impact of the resources (even if the time they were spent was not more influential). And of course, I might be wrong about the importance of this time. So I am excited to see more work exploring patient longtermism.