[Epistemic status: Pretty confident. But also, enthusiasm on the verge of partisanship]

One intuitive function which assigns impact to agents is the counterfactual, which has the form:

CounterfactualImpact(Agent) = Value(World) - Value(World/Agent)

which reads "The impact of an agent is the difference between the value of the world with the agent and the value of the world without the agent".

It has been discussed in the effective altruism community that this function leads to pitfalls, paradoxes, or to unintuitive results when considering scenarios with multiple stakeholders. See:

In this post I'll present some new and old examples in which the counterfactual function seems to fail, and how, in each of them, I think that a less known function does better: the Shapley value, a concept from cooperative game theory which has also been brought up before in such discussions. In the first three examples, I'll just present what the Shapley value outputs, and halfway through this post, I'll use these examples to arrive at a definition.

I think that one of the main hindrances in the adoption of Shapley values is the difficulty in its calculation. To solve this, I have written a Shapley value calculator and made it available online: shapleyvalue.com. I encourage you to play around with it.

## Example 1 & recap: Sometimes, the counterfactual impact exceeds the total value.

Suppose there are three possible outcomes:

- P has cost $2000 and gives 15 utility to the world
- Q has cost $1000 and gives 10 utility to the world
- R has cost $1000 and gives 10 utility to the world

Suppose Alice and Bob each have $1000 to donate. Consider two scenarios:

Scenario 1: Both Alice and Bob give $1000 to P. The world gets 15 more utility. Both Alice and Bob are counterfactually responsible for giving 15 utility to the world.

Scenario 2: Alice gives $1000 to Q and Bob gives $1000 to R. The world gets 20 more utility. Both Alice and Bob are counterfactually responsible for giving 10 utility to the world.

From the world's perspective, scenario 2 is better. However, from Alice and Bob's individual perspective (if they are maximizing their own counterfactual impact), scenario 1 is better. This seems wrong, we'd want to somehow coordinate so that we achieve scenario 2 instead of scenario 1.

Source

Attribution: rohinmshah

In Scenario 1:

Counterfactual impact of Alice: 15 utility.

Counterfactual impact of Bob: 15 utility.

Sum of the counterfactual impacts: 30 utility. Total impact: 15 utility.

The Shapley value of Alice would be: 7.5 utility.

The Shapley value of Bob would be: 7.5 utility.

The sum of the Shapley values always adds up to the total impact, which is 15 utility.

In Scenario 2:

Counterfactual impact of Alice: 10 utility.

Counterfactual impact of Bob: 10 utility.

Sum of the counterfactual impacts: 20 utility. Total impact: 20 utility.

The Shapley value of Alice would be: 10 utility.

The Shapley value of Bob would be: 10 utility.

The sum of the Shapley values always adds up to the total impact, which is 10+10 utility = 20 utility.

In this case, if Alice and Bob were each individually optimizing for counterfactual impact, they'd end up with a total impact of 15. If each of them were instead individually optimizing for Shapley value, they'd end up with a total impact of 20, which is higher.
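The numbers above can be reproduced with a small brute-force sketch. It assumes, as the example implies but does not state outright, that a lone $1000 donation to P achieves nothing; the function names are illustrative and are not taken from the calculator mentioned earlier.

```python
from itertools import permutations

def shapley(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            totals[p] += value(coalition | {p}) - value(coalition)
            coalition = coalition | {p}
    return {p: t / len(orders) for p, t in totals.items()}

def counterfactual(players, value):
    """Value of the world with everyone minus the value without the player."""
    everyone = frozenset(players)
    return {p: value(everyone) - value(everyone - {p}) for p in players}

players = ["Alice", "Bob"]

# Scenario 1 (assumption): P needs both donations; one donation alone gives 0.
def scenario_1(coalition):
    return 15 if len(coalition) == 2 else 0

# Scenario 2: Q and R are funded independently, 10 utility each.
def scenario_2(coalition):
    return 10 * len(coalition)

for name, game in [("Scenario 1", scenario_1), ("Scenario 2", scenario_2)]:
    print(name, counterfactual(players, game), shapley(players, game))
```

In Scenario 1 the counterfactual impacts sum to 30 while the Shapley values sum to 15; in Scenario 2 both sum to 20.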

It would seem that we could use a function such as

CounterfactualImpactModified = CounterfactualImpact / NumberOfStakeholders

to solve this particular problem. However, as the next example shows, that sometimes doesn't work. The Shapley value, on the other hand, has the property that it always adds up to total value.

Property 1: The Shapley value always adds up to the total value.

## Example 2: Sometimes, the sum of the counterfactuals is less than total value. Sometimes it's 0.

Consider the invention of Calculus, by Newton and Leibniz at roughly the same time. If Newton hadn't existed, Leibniz would still have invented it, and vice-versa, so the counterfactual impact of each of them is 0. Thus, you can't normalize like above.

The Shapley value doesn't have that problem. It has the property that equal people have equal impact, which together with the requirement that it adds up to total value is enough to assign 1/2 of the total impact to each of Newton and Leibniz.
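As a worked check, normalize the value of calculus to 1 and assume that either inventor alone would have produced it, so that $v(\{N\}) = v(\{L\}) = v(\{N, L\}) = 1$ (this notation is mine, not the post's). Averaging Newton's marginal contribution over the two possible orders of arrival gives:

$$
\phi_{N} = \tfrac{1}{2}\bigl(v(\{N\}) - v(\emptyset)\bigr) + \tfrac{1}{2}\bigl(v(\{N, L\}) - v(\{L\})\bigr) = \tfrac{1}{2}(1 - 0) + \tfrac{1}{2}(1 - 1) = \tfrac{1}{2}
$$

The second term alone, $v(\{N, L\}) - v(\{L\}) = 0$, is Newton's counterfactual impact; by symmetry the same holds for Leibniz.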

Interestingly, GiveWell has Iodine Global Network as a standout charity, but not as a recommended charity, because of considerations related to the above. If it were the case that, had IGN not existed, another organization would have taken its place, its counterfactual value would be 0, but its Shapley value would be 1/2 (of the impact of iodizing salt in developing countries).

Property 2: The Shapley value assigns equal value to equivalent agents.

## Example 3: Order indifference.

Consider Scenario 1 from Example 1 again.

P has cost $2000 and gives 15 utility to the world.

Suppose Alice and Bob each have $1000 to donate. Both Alice and Bob give $1000 to P. The world gets 15 more utility. Both Alice and Bob are counterfactually responsible for giving 15 utility to the world.

Alice is now a pure counterfactual-impact maximizer, but something has gone wrong. She now views Bob adversarially. She thinks he's a sucker, and she waits until Bob has donated to make her own donation. There are no worlds in which he doesn't donate before her, and Alice assigns all 15 utility to herself, and 0 to Bob. Note that she isn't exactly calculating the counterfactual impact, but something slightly different.

The Shapley value doesn't consider any agent to be a sucker, doesn't consider any variables to be in the background, and doesn't care whether people try to donate strategically before or after someone else. Here is a perhaps more familiar example:

Scenario 1:

Suppose that the Indian government creates some big and expensive infrastructure to vaccinate people, but people don't use it. Suppose an NGO then comes in and sends reminders to people to vaccinate their children, and some end up going.

Scenario 2:

Suppose that an NGO could be sending reminders to people to vaccinate their children, but it doesn't, because the vaccination infrastructure is nonexistent, so there would be no point. Then, the government steps in, and creates the needed infrastructure, and vaccination reminders are sent.

Again, it's tempting to say that in the first scenario, the NGO gets all the impact, and in the second scenario the government gets all the impact, perhaps because we take either the NGO or the Indian government to be in the background. To repeat, the Shapley value doesn't differentiate between the two scenarios, and doesn't leave variables in the background. For how this works numerically, see the examples below.

Property 3: The Shapley value doesn't care about who comes first.

## The Shapley value is uniquely determined by simple properties.

These properties:

- Property 1: Sum of the values adds up to the total value (Efficiency)
- Property 2: Equal agents have equal value (Symmetry)
- Property 3: Order indifference: it doesn't matter which order you go in (Linearity). Or, in other words, if there are two steps, Value(Step1 + Step2) = Value(Step1) + Value(Step2).

And an extra property:

- Property 4: Null-player (if in every world, adding a person to the world has no impact, the person has no impact). You can either take this as an axiom, or derive it from the first three properties.

are enough to force the Shapley value function to take the form it takes:
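In the standard notation of cooperative game theory (which this post does not otherwise use: $N$ is the set of $n$ agents and $v(S)$ is the value produced by a coalition $S$, with $v(\emptyset) = 0$), that form is usually written as:

$$
\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr)
$$

Each term is a counterfactual comparison between a coalition with and without agent $i$, weighted by the fraction of orderings in which exactly the members of $S$ arrive before $i$.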

At this point, the reader may want to consult Wikipedia to familiarize themselves with the mathematical formalism, or, for a book-length treatment, *The Shapley value: Essays in honor of Lloyd S. Shapley*. Ultimately, a quick way to understand it is as "the function uniquely determined by the properties above".

I suspect that order indifference will be the most controversial option. Intuitively, it prevents stakeholders from adversarially choosing to collaborate earlier or later in order to assign themselves more impact.

Note that in the case of only one agent the Shapley value reduces to the counterfactual function, and that the Shapley value uses many counterfactual comparisons in its formula. It sometimes just reduces to CounterfactualValue / NumberOfStakeholders (though it sometimes doesn't). Thus, the Shapley value might be best understood as an extension of counterfactuals, rather than as something completely alien.

## Example 4: The Shapley value can also deal with leveraging

Organisations can leverage funds from other actors into a particular project. Suppose that AMF will spend $1m on a net distribution. As a result of AMF’s commitment, the Gates Foundation contributes $400,000. If AMF had not acted, Gates would have spent the $400,000 on something else. Therefore, the counterfactual impact of AMF’s work is:

AMF’s own $1m on bednets plus Gates’ $400,000 on bednets minus the benefits of what Gates would otherwise have spent their $400,000 on.

If Gates would otherwise have spent the money on something worse than bednets, then the leveraging is beneficial; if they would otherwise have spent it on something better than bednets, the leveraging reduces the benefit produced by AMF.

Source: The counterfactual impact of agents acting in concert.

Let's consider the case in which the Gates Foundation would otherwise have spent their $400,000 on something half as valuable.

Then the counterfactual impact of the AMF is $1,000,000 + $400,000 - ($400,000)*0.5 = $1.2m.

The counterfactual impact of the Gates Foundation is $400,000.

And the sum of the counterfactual impacts is $1.6m, which exceeds the total impact, which is $1.4m.

The Shapley value of the AMF is $1.1m.

The Shapley value of the Gates Foundation is $300,000.

Thus, the Shapley value assigns to the AMF part, but not all, of the impact of the Gates Foundation donation. It takes into account their outside options when doing so: if the Gates Foundation would otherwise have invested in something equally valuable, the AMF wouldn't get anything from that.
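A minimal sketch of where these figures come from, using the two-player special case of the Shapley value. The coalition values are my reading of the example (benefits measured in dollars of bednet-equivalent spending), and the helper name is illustrative.

```python
def shapley_two_players(v_a, v_b, v_ab):
    """Two-player Shapley values: average of arriving first and arriving second."""
    phi_a = 0.5 * v_a + 0.5 * (v_ab - v_b)
    phi_b = 0.5 * v_b + 0.5 * (v_ab - v_a)
    return phi_a, phi_b

# Assumed coalition values:
v_amf = 1_000_000             # AMF alone: its own $1m on bednets
v_gates = 400_000 * 0.5       # Gates alone: $400k on something half as valuable
v_both = 1_000_000 + 400_000  # both: all $1.4m goes to bednets

phi_amf, phi_gates = shapley_two_players(v_amf, v_gates, v_both)
cf_amf = v_both - v_gates     # counterfactual impact of the AMF
cf_gates = v_both - v_amf     # counterfactual impact of the Gates Foundation

print(phi_amf, phi_gates, cf_amf, cf_gates)
```

Setting `v_gates = 400_000` (an equally valuable outside option) makes the AMF's Shapley value exactly $1m, which is the limiting case described above.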

## Example 5: The Shapley value can also deal with funging

Suppose again that AMF commits $1m to a net distribution. But if AMF had put nothing in, DFID would instead have committed $500,000 to the net distribution. In this case, AMF funges with DFID. AMF’s counterfactual impact is therefore:

AMF’s own $1m on bednets minus the $500,000 that DFID would have put in plus the benefits of what DFID in fact spent their $500,000 on.

Source

Suppose that the DFID spends their money on something half as valuable.

The counterfactual impact of the AMF is $1m - $500,000 + ($500,000)*0.5 = $750,000.

The counterfactual impact of DFID is $250,000.

The sum of their counterfactual impacts is $1m; lower than the total impact, which is $1,250,000.

The Shapley value of the AMF is, in this case, $875,000.

The Shapley value of the DFID is $375,000.

The AMF is penalized: even though it paid $1,000,000, its Shapley value is less than that. The DFID's Shapley-impact is increased, because it could have invested its money in something more valuable, if the AMF hadn't intervened.

For a perhaps cleaner example, consider the case in which the DFID's counterfactual impact is $0: It can't use the money except to distribute nets, and the AMF got there first. In that scenario:

The counterfactual impact of the AMF is $500,000.

The counterfactual impact of DFID is $0.

The sum of their counterfactual impacts is $500,000. This is lower than the total impact, which is $1,000,000.

The Shapley value of the AMF is $750,000.

The Shapley value of the DFID is $250,000.

The AMF is penalized: even though it paid $1,000,000, its Shapley value is less than that. The DFID shares some of the impact.
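For reference, here is the arithmetic behind both variants, written with coalition values $v(\cdot)$ that are my reading of the example: $v(\text{AMF}) = 1{,}000{,}000$ and $v(\text{DFID}) = 500{,}000$ (DFID funds the nets if the AMF is absent). When DFID's alternative is half as valuable, $v(\text{both}) = 1{,}250{,}000$:

$$
\begin{aligned}
\phi_{\text{AMF}} &= \tfrac{1}{2}(1{,}000{,}000) + \tfrac{1}{2}(1{,}250{,}000 - 500{,}000) = 875{,}000 \\
\phi_{\text{DFID}} &= \tfrac{1}{2}(500{,}000) + \tfrac{1}{2}(1{,}250{,}000 - 1{,}000{,}000) = 375{,}000
\end{aligned}
$$

When DFID has no other use for the money, $v(\text{both}) = 1{,}000{,}000$:

$$
\begin{aligned}
\phi_{\text{AMF}} &= \tfrac{1}{2}(1{,}000{,}000) + \tfrac{1}{2}(1{,}000{,}000 - 500{,}000) = 750{,}000 \\
\phi_{\text{DFID}} &= \tfrac{1}{2}(500{,}000) + \tfrac{1}{2}(1{,}000{,}000 - 1{,}000{,}000) = 250{,}000
\end{aligned}
$$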

## Example 6: The counterfactual value doesn't deal correctly with tragedy of the commons scenarios.

Imagine a scenario in which many people could replicate the GPT-2 model and make it freely available, but the damage is already done once the first person does it. Imagine that 10 people end up doing it, and that the damage done is something big, like -10 million utility.

Then the counterfactual damage done by each person would be 0, because the other nine would have done it regardless.

The Shapley value deals with this by assigning an impact of -1 million utility to each person.
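With ten agents there are too many orderings to enumerate comfortably, so the sketch below uses the equivalent sum over coalitions instead. The game definition (full damage as soon as anyone releases the model) is taken from the example; the function names are illustrative.

```python
from itertools import combinations
from math import factorial

def shapley(players, value):
    """Exact Shapley values via the subset formula (2^(n-1) coalitions per player)."""
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result

DAMAGE = -10_000_000

def release_game(coalition):
    # The full damage occurs as soon as at least one person releases the model.
    return DAMAGE if coalition else 0

people = list(range(10))
everyone = frozenset(people)
counterfactuals = {p: release_game(everyone) - release_game(everyone - {p})
                   for p in people}
values = shapley(people, release_game)

print(counterfactuals[0], values[0])
```

Every counterfactual is 0, while each Shapley value is one tenth of the damage, so the ten values add up to the full -10 million.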

## Example 7: Hiring in EA

Suppose that there was a position in an EA org, for which there were 7 qualified applicants which are otherwise "idle". In arbitrary units, the person in that position in that organization can produce an impact of 100 utility.

The counterfactual impact of the organization is 100.

The counterfactual impact of any one applicant is 0.

The Shapley value of the organization is 87.5.

The Shapley value of any one applicant is 1.79.

As there are more applicants, the value skews more in favor of the organization, and the opposite happens with fewer applicants. If there were instead only 3 applicants, the values would be 75 and 8.33, respectively. If there were only 2 applicants, the Shapley value of the organization would be 66.66, and that of each applicant 16.66. With one applicant and one organization, the impact is split 50/50.

In general, I suspect, but I haven't proved it, that if there are n otherwise idle applicants, the share of the Shapley value assigned to the organization is n/(n+1), which matches the 1/2, 2/3 and 3/4 above. This suggests that a lot of the impact of the position goes to whoever created the position.
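The small cases can be checked by brute force. The sketch below assumes the game implied by the example: the position produces its value only if the organization and at least one applicant are present, and nothing otherwise. The function name and player labels are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def hiring_shapley(n_applicants, position_value=100):
    """Return (organization, single applicant) Shapley values by brute force."""
    players = ["org"] + [f"applicant{i}" for i in range(n_applicants)]

    def value(coalition):
        # Assumption: value needs the organization plus at least one applicant.
        return position_value if "org" in coalition and len(coalition) >= 2 else 0

    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        seen = set()
        for p in order:
            before = value(seen)
            seen.add(p)
            totals[p] += value(seen) - before
    return totals["org"] / len(orders), totals["applicant0"] / len(orders)

for n in (1, 2, 3):
    org, applicant = hiring_shapley(n)
    print(n, float(org), float(applicant))
```

For these three cases the organization's share works out to n/(n+1) of the total: 1/2, 2/3 and 3/4. This is a check of small cases, not a proof of the general pattern.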

## Example 8: The Shapley value makes the price of a life rise with the number of stakeholders.

Key:

- Shapley value vs. counterfactual value (or counterfactual impact).
- Shapley price vs. counterfactual price: the amount of money needed for your Shapley value to be 1 unit of X vs. the amount of money needed to be counterfactually responsible for 1 unit of X.
- Shapley cost-effectiveness vs. counterfactual cost-effectiveness.

Suppose that, in order to save a life, 4 agents have to be there: AMF to save a life, GiveWell to research them, Peter Singer to popularize them and a person to donate $5000. Then the counterfactual impact of the donation would be 1 life, but its Shapley value would be 1/4th. Or, in other words, the Shapley cost of saving a life through a donation is four times higher than the counterfactual cost.

Why is this? Well, suppose that, to save a life, each of the organizations spent $5000. Because all of them are necessary, the counterfactual cost of a life is $5000 for any of the stakeholders. But if you wanted to save an additional life, the amount of money which would have to be spent is $5000*4 = $20,000, because someone would have to go through the four necessary steps.

If, instead of 4 agents there were 100 agents involved, then the counterfactual price stays the same, but the Shapley price rises to 100x the counterfactual price. In general, I've said "AMF", or "GiveWell", as if they each were only one agent, but that isn't necessarily the case, so the Shapley price (of saving a life) might potentially be even higher.
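Spelled out, and assuming that all $m$ agents are strictly necessary (so that symmetry and efficiency give each of them $1/m$ of a life), the two prices for a $5000 contribution are:

$$
\text{counterfactual price} = \frac{\$5000}{1 \text{ life}} = \$5000 \text{ per life}, \qquad
\text{Shapley price} = \frac{\$5000}{\tfrac{1}{m} \text{ life}} = \$5000 \cdot m \text{ per life}
$$

With $m = 4$ this gives $20,000 per life, and with $m = 100$ it gives $500,000 per life, while the counterfactual price stays at $5000.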

This is a problem because if agents are reporting their cost-effectiveness in terms of counterfactuals, and one agent switches to consider their cost-effectiveness in terms of Shapley values, their cost effectiveness will look worse.

This is also a problem if organizations are reporting their cost-effectiveness in terms of counterfactuals, but in some areas there are 100 necessary stakeholders, and in other areas there are four.

## Shapley value and cost effectiveness.

So we not only care about impact, but also about cost-effectiveness. Let us continue with the example in which an NGO sends reminders to undergo vaccination, and let us give it some numbers.

Let's say that a small Indian state with 10 million inhabitants spends $60 million to vaccinate 30% of its population. An NGO which would otherwise be doing something really ineffective (we'll come back to this) comes in and, by sending reminders, increases the vaccination rate to 35%. They do this very cheaply, for $100,000.

The Shapley value of the Indian government would be 32.5%, or 3.25 million people vaccinated.

The Shapley value of the small NGO would be 2.5%, or 0.25 million people vaccinated.

Dividing this by the amount of money spent:

The cost-effectiveness in terms of the Shapley value of the Indian government would be $60 million / 3.25 million vaccinations = $18.46/vaccination.

The cost-effectiveness in terms of the Shapley value of the NGO would be $100,000 / 250,000 vaccinations = $0.4/vaccination.

So even though the NGO's Shapley value is smaller, its cost-effectiveness is higher, as one might expect.
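The Shapley values used here follow from three coalition values, in millions of people vaccinated. These are my reading of the example: the government alone reaches $v(G) = 3.0$, the NGO alone reaches $v(N) = 0$ (reminders are useless without infrastructure, and its outside option is treated as worthless), and together they reach $v(G, N) = 3.5$.

$$
\begin{aligned}
\phi_{G} &= \tfrac{1}{2}\,v(G) + \tfrac{1}{2}\bigl(v(G, N) - v(N)\bigr) = \tfrac{1}{2}(3.0) + \tfrac{1}{2}(3.5 - 0) = 3.25 \\
\phi_{N} &= \tfrac{1}{2}\,v(N) + \tfrac{1}{2}\bigl(v(G, N) - v(G)\bigr) = \tfrac{1}{2}(0) + \tfrac{1}{2}(3.5 - 3.0) = 0.25
\end{aligned}
$$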

If the outside option of the NGO were something which has a similar impact to vaccinating 250,000 people, we're back at the funging/leveraging scenario: because the NGO's outside option is better, its Shapley value rises.

## Cost effectiveness in terms of Shapley value changes when considering different groupings of agents.

Continuing with the same example, consider that, instead of the abstract "Indian government" as a homogeneous whole, there are different subagents which are all necessary to vaccinate people. Consider: The Central Indian Government, the Ministry of Finance, the Ministry of Health and Family Welfare, and within any one particular state: the State's Council of Ministers, the Finance Department, the Department of Medical Health and Family Welfare, etc. And within each of them there are sub-agencies, and sub-subagencies.

In the end, suppose that there are 10 organizations which are needed for the vaccine to be delivered, for a nurse to be there, for a hospital or a similar building to be available, and for there to be money to pay for all of it. For simplicity, suppose that the budget of each of those organizations is the same: $60 million / 10 = $6 million. Then the Shapley-cost effectiveness is different:

The Shapley value of each governmental organization would be 1/10 * (3 million + 10/11 * 0.5 million) = 345,454 people vaccinated.

The Shapley value of the NGO would be 1/11 * 500,000 = 45,454 people vaccinated.

The cost effectiveness of each governmental organization would be ($6 million)/(345,454 vaccinations) = $17 / vaccination.

The cost effectiveness of the NGO would be $100,000 / 45,454 vaccinations = $2.2 / vaccination.

That's interesting. These concrete numbers are all made up, but they're inspired by reality and "plausible", and I was expecting the result to be that the NGO would be less cost-effective than a government agency. It's curious to see that, in this concrete example, the NGO seems to be robustly more cost-efficient than the government under different groupings. I suspect that something similar is going on with 80,000h.
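The grouped figures can be checked with a short sketch. It assumes that nobody is vaccinated unless all ten government bodies take part, and that the NGO adds 0.5 million vaccinations on top of the government's 3 million; the player labels are made up for the illustration.

```python
from itertools import combinations
from math import comb

GOV = [f"gov{i}" for i in range(10)]   # ten equally necessary government bodies
PLAYERS = GOV + ["ngo"]

def vaccinated(coalition):
    # Assumption: nothing happens unless all ten government bodies take part;
    # the NGO's reminders add 0.5 million on top of the 3 million.
    if not all(g in coalition for g in GOV):
        return 0
    return 3_500_000 if "ngo" in coalition else 3_000_000

def shapley_of(player):
    others = [p for p in PLAYERS if p != player]
    n = len(PLAYERS)
    total = 0.0
    for size in range(n):
        # Weight of each coalition of this size: |S|!(n-|S|-1)!/n! = 1/(n*C(n-1,|S|)).
        weight = 1 / (n * comb(n - 1, size))
        for subset in combinations(others, size):
            s = set(subset)
            total += weight * (vaccinated(s | {player}) - vaccinated(s))
    return total

ngo_value = shapley_of("ngo")
gov_value = shapley_of("gov0")
ngo_cost_per_vaccination = 100_000 / ngo_value
gov_cost_per_vaccination = 6_000_000 / gov_value

print(round(gov_value), round(ngo_value))
print(round(gov_cost_per_vaccination, 2), round(ngo_cost_per_vaccination, 2))
```

Changing the length of `GOV` shows how the split moves as the government is divided into more or fewer necessary bodies.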

## Better optimize Shapley.

If each agent individually maximizes their counterfactual impact per dollar, we get suboptimal results, as we have seen above. In particular, consider a toy world in which twenty people can either:

- Each be an indispensable part of a project which has a value of 100 utility, for a total impact of 100 utility
- Each can by themselves undertake a project which has 10 utility, for a total impact of 200 utility.

Then if each person was optimizing for the counterfactual impact, they would all choose the first option, for a lower total impact. If they were optimizing for their Shapley value, they'd choose the second option.
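The comparison can be written out per person, with $\mathrm{CF}_i$ for the counterfactual impact and $\phi_i$ for the Shapley value (the Shapley figure for the first option uses symmetry and efficiency: twenty interchangeable, indispensable people split the 100 equally):

$$
\begin{aligned}
\text{Option 1:}\quad & \mathrm{CF}_i = 100 - 0 = 100, & \phi_i &= \tfrac{100}{20} = 5 \\
\text{Option 2:}\quad & \mathrm{CF}_i = 10, & \phi_i &= 10
\end{aligned}
$$

A counterfactual maximizer compares 100 with 10 and picks the first option; a Shapley maximizer compares 5 with 10 and picks the second.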

Can we make a more general statement? Yes. Agents individually optimizing for cost-effectiveness in terms of Shapley value globally optimize for total cost-effectiveness.

Informal proof: Consider the case in which agents have constant budgets and can divide them between different projects as they like. Then, consider the case in which each $1 is an agent: projects with higher Shapley value per dollar get funded first, then those with less impact per dollar, etc. Total cost-effectiveness is maximized. Because of order indifference, both cases produce the same distribution of resources. Thus, agents individually optimizing for cost effectiveness in terms of Shapley-value globally optimize for total cost-effectiveness.

Note: Thinking in terms of marginal cost-effectiveness doesn't change this conclusion. Thinking in terms of time/units other than money probably doesn't change the conclusion.

## Am I bean counting?

I don't have a good answer to that question.

## Conclusion

The counterfactual impact function is well defined, but it fails to meet my expectations of what an impact function ought to do when considering scenarios with multiple stakeholders.

On the other hand, the Shapley value function flows from some very general and simple properties, and can deal with the examples in which the counterfactual function fails. Thus, instead of optimizing for counterfactual impact, it seems to me that optimizing for Shapley value is less wrong.

Finally, because the Shapley value is not pretty to calculate by hand, here is a calculator.

Question: Is there a scenario in which the Shapley value assigns impacts which are clearly nonsensical, but with which the counterfactual value, or a third function, deals correctly?

## Addendum: The Shapley value is not easily computable.

For large numbers of agents, the exact Shapley value will not be computationally tractable (but approximations might be pretty good), and work on the topic has been done in the area of interpreting machine learning results. See, for example:

This was a very simple example that we’ve been able to compute analytically, but these won’t be possible in real applications, in which we will need the approximated solution by the algorithm. Source: https://towardsdatascience.com/understanding-how-ime-shapley-values-explains-predictions-d75c0fceca5a

Or

The Shapley value requires a lot of computing time. In 99.9% of real-world problems, only the approximate solution is feasible. An exact computation of the Shapley value is computationally expensive because there are 2^k possible coalitions of the feature values and the “absence” of a feature has to be simulated by drawing random instances, which increases the variance for the estimate of the Shapley values estimation. The exponential number of the coalitions is dealt with by sampling coalitions and limiting the number of iterations M. Decreasing M reduces computation time, but increases the variance of the Shapley value. There is no good rule of thumb for the number of iterations M. M should be large enough to accurately estimate the Shapley values, but small enough to complete the computation in a reasonable time. It should be possible to choose M based on Chernoff bounds, but I have not seen any paper on doing this for Shapley values for machine learning predictions. Source: https://christophm.github.io/interpretable-ml-book/shapley.html#disadvantages-13
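The sampling idea in the quote above can be sketched generically: instead of enumerating every ordering, average marginal contributions over a fixed number of random orderings (the `iterations` parameter plays the role of M). This is an illustration of permutation sampling, not the algorithm of either quoted source, and the test game is hypothetical.

```python
import random

def shapley_monte_carlo(players, value, iterations, seed=0):
    """Estimate Shapley values by averaging marginal contributions over random orders."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    order = list(players)
    for _ in range(iterations):
        rng.shuffle(order)
        coalition = set()
        previous = value(coalition)
        for p in order:
            coalition.add(p)
            current = value(coalition)
            totals[p] += current - previous
            previous = current
    return {p: t / iterations for p, t in totals.items()}

# Hypothetical test game: 30 players, value = square of the coalition size.
# By symmetry and efficiency, each exact Shapley value is 30**2 / 30 = 30.
players = list(range(30))
estimate = shapley_monte_carlo(players, lambda c: len(c) ** 2, iterations=2000)

print(min(estimate.values()), max(estimate.values()))
```

Because every sampled ordering distributes exactly the total value, the estimates always sum to the total; only the split between players is noisy, and that noise shrinks as `iterations` grows.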

That being said, here is a nontrivial example:

### Foundations and projects.

Suppose that within the EA community, Open Philanthropy, a foundation whose existence I appreciate, has the opportunity to fund 250 out of 500 projects every year. Say that you also have 10 smaller foundations: Foundation1,..., Foundation10, each of which can afford to fund 20 projects, that there aren't any more sources of funding, and that each project costs the same.

On the other hand, we will also consider the situation in which OpenPhil is a monopoly. In the end, perhaps all these other foundations and centers might be founded by Open Philanthropy themselves. Consider the assumption that OpenPhil has the opportunity to fund 450 projects out of 500, and that there are no other sources in the EA community.

Additionally, we could model the distribution of projects with respect to how much good they do in the world by ordering all projects from 1 to 500, and saying that:

- Impact1 of the k-th project = I1(k) = 0.99^k.
- Impact2 of the k-th project = I2(k) = 2/k^2 (a power law).

With that in mind, here are our results for the different assumptions. Power index = Shapley(OP) / Total Impact.

| Monopoly? | Impact measure | Total Impact | Shapley(OP) | Power index |
|---|---|---|---|---|
| 0 | I(k) = 0.99^k | 97.92 | 7.72 | 7.89% |
| 0 | I(k) = 2/k^2 | 3.29 | 0.028 | 0.86% |
| 1 | I(k) = 0.99^k | 97.92 | 48.96 | 50% |
| 1 | I(k) = 2/k^2 | 3.29 | 1.64 | 50% |

For a version of this table which has counterfactual impact as well, see here.

The above took some time, and required me to beat the formula for the Shapley value into being computationally tractable for this particular case (see here for some maths, which as far as I'm aware, are original, and here for some code).

While I think the Shapley value can be useful, there are clearly cases where the counterfactual value is superior for an agent deciding what to do. Derek Parfit clearly explains this in Five Mistakes in Moral Mathematics. He is arguing against the 'share of the total' view, but at least some of the arguments also apply to the Shapley value (which is basically an improved version of 'share of the total'). In particular, the best things you have listed in favour of the Shapley value applied to making a moral decision correctly apply when you and others are all making the decision 'together'. If the others have already committed to their part in a decision, the counterfactual value approach looks better.

e.g. on your first example, if the other party has already paid their $1000 to P, you face a choice between creating 15 units of value by funding P or 10 units by funding the alternative. Simple application of Shapley value says you should do the action that creates 10 units, predictably making the world worse.

One might be able to get the best of both methods here if you treat cases like this where another agent has already committed to a known choice as part of the environment when calculating Shapley values. But you need to be clear about this. I consider this kind of approach to be a hybrid of the Shapley and counterfactual value approaches, with Shapley only being applied when the other agents' decisions are still 'live'. As another example, consider your first example and add the assumption that the other party hasn't yet decided, but that you know they love charity P and will donate to it for family reasons. In that case, the other party's decision, while not yet made, is not 'live' in the relevant sense and you should support P as well.
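A small sketch of this hybrid on the first example, assuming that a lone $1000 to P achieves nothing and that the alternative yields 10 utility. When the other party's choice is live, both are players; once it is committed, it is folded into the environment, and the remaining agent's Shapley value is just their counterfactual impact. The variable names are illustrative.

```python
def shapley_two_players(v_a, v_b, v_ab):
    """Two-player Shapley values: average of arriving first and arriving second."""
    return 0.5 * v_a + 0.5 * (v_ab - v_b), 0.5 * v_b + 0.5 * (v_ab - v_a)

# Both decisions live: you and the other party are both players.
live_p, _ = shapley_two_players(0, 0, 15)     # both fund P: 15 only if both give
live_q, _ = shapley_two_players(10, 10, 20)   # each funds a separate 10-utility project

# The other party has already paid $1000 to P: they are environment, not a player.
# You are the only player, so your Shapley value is your counterfactual impact.
committed_p = 15 - 0         # P is completed only if you also give
committed_q = (0 + 10) - 0   # P stays unfunded; your alternative yields 10

print(live_p, live_q, committed_p, committed_q)
```

With both decisions live, the alternative scores higher (10 against 7.5); with the other donation already committed, funding P scores higher (15 against 10), which is the choice argued for above.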

If you are going to pursue what the community could gain from considering Shapley values, then look into cases like this and subtleties of applying the Shapley value further — and do read that Parfit piece.

I think the reason summing counterfactual impact of multiple people leads to weird results is not a problem with counterfactual impact but with how you are summing it. Adding together each individual's counterfactual impact by summing is adding the difference between world A where they both act and world B and C where each of them act alone. In your calculus, you then assume this is the same as the difference between world A and D where nobody acts.

The true issue in maximising counterfactual impact seems to arise when actors act cooperatively but think of their actions as an individual. When acting cooperatively you should compare your counterfactuals to world D, when acting individually world B or C.

The Shapley value is not immune to error either. I can see three ways it could lead to poor decision making:

I may have misunderstood Shapley here, so feel free to correct me. Overall I enjoyed the post and think it is well worth reading. Criticism of the underlying assumptions of many EAs' decision-making methods is very valuable.

## 1.

I have thought about this, and I'm actually biting the bullet. I think that a lot of people get impact for a lot of things, and that even smallish projects depend on a lot of other moving parts, in the direction of You didn't build that.

I don't agree with some of your examples when taken literally, but I agree with the nuanced thing you're pointing at with them, e.g., building good roads seems very valuable precisely because it helps other projects, if there is high nurse absenteeism then the nurses who show up take some of the impact...

I think that if you divide the thing's impact by, say 10x, the ordering of the things according to impact remains, so this shouldn't dissuade people from doing high impact things. The interesting thing is that some divisors will be greater than others, and thus the ordering will be changed. I claim that this says something interesting.

## 2.

Not really. If 10 people have already done it, your Shapley value will be positive if you take that bargain. If the thing hasn't been done yet, you can't convince 10 Shapley-optimizing altruists to do the thing for 0.5m each, but you might convince 10 counterfactual impact optimizers. As @casebach mentioned, this may have problems when dealing with uncertainty (for example: what if you're pretty sure that someone is going to do it?).

## 3.

You're right. The example, however, specified that the EAs were to be "otherwise idle", to simplify calculations.

The order indifference of Shapley values only makes sense from a perspective where there is perfect knowledge of what other players will do. If you don't have that, a party that spent a huge amount of money on a project that was almost certainly going to be wasteful, and that was saved only when another party appeared by sheer happenstance, was not making good spending decisions. Similarly, many agents won't be optimising for Shapley value, say a government which spends money on infrastructure just to win political points, not caring about whether it'll be used or not, so they don't properly deserve a share of the gains when someone else intervenes with notifications to make the project actually effective.

I feel that this article presents Shapley value as just plain superior, when instead a combination of both Shapley value and counterfactual value will likely be a better metric. Beyond this, what you really want to use is something more like FDT where you take into account the fact that the decisions of some agents are subjunctively linked to you and that the decisions of some other agents aren't. Even though my current theory is that very, very few agents are actually subjunctively linked to you, I suspect that thinking about problems in this fashion is likely to work reasonably well in practise (I would need to dedicate a solid couple of hours in order to be able to write out my reasons for believing this more concretely)

Hey Chris! It was nice seeing you at the EA Hotel, and I'm glad we could talk about this. I'm writing down some of my notes from our conversations. Is there anything I've forgotten, or which you'd like to add?

## a. What are you using Shapley values / counterfactual values for?

You might want to use different tools depending on what your goal is; three different goals might be: Coordination / Analysis / Reward / Award.

For example, you might want a function which is easier to understand when announcing an award. If you're rewarding a behavior, you might want to make sure you're incentivizing the right thing.

## b. The problem of choosing who to count is more complicated than I originally thought, and you should in fact exclude some agents from your calculations.

The example of: "If a bus driver falls off a cliff and Superman rescues them and brings them safely to their destination, earlier, the bus driver gets half the credit" is silly, but made the thing really crisp for me.

Hearing that, we then thought that:

So the answer would seem to be something like: counting only over people who are broadly similar to you?

## c. Shapley values and uncertainty

How do SVs deal with uncertainty? Can you do expected value over SVs? [Yes, you can]. For example, if you have a 1% chance of a SV of 100, you can say that the E[SV] = 1. Even though the SV formalism is more complicated than the counterfactual, it still works elegantly / is well-defined, etc.

Fair point re: uncertainty. The situation seems pretty symmetric, though: if a politician builds roads just to get votes, and an NGO steps in and does something valuable with that, the politician's counterfactual impact is still the same as the NGO's, so both the Shapley value and counterfactuals have that problem (?). Maybe one can exclude agents according to how close their goals are to yours, e.g., totally exclude a paperclip maximizer from both counterfactual and Shapley value calculations, and apply order indifference to allies only (?). This is something I haven't thought about; thanks for pointing it out.

Fair point re: epistemic status. Changed my epistemic status.

"The situation seems pretty symmetric, though: if a politician builds roads just to get votes, and an NGO steps in and does something valuable with that, the politician's counterfactual impact is still the same as the NGO's" - true, but the NGO's counterfactual impact is reduced when I feel it's fairer for the NGO to be able to claim the full amount (though of course you'd never know the government's true motivations in real life)

I like this angle! It seems useful to compare the Shapley value in this domain to the Banzhaf value. (Brief, dense description: If Shapley value attributes value to pivotal actors during the sequential process of coalition formation (averaged across all permutations of coalition formation orderings), Banzhaf value attributes value to critical actors without which any given coalition would fail. See Shapley-Shubik power index and Banzhaf power index for similar concepts in a slightly different context.)
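As a rough illustration of the pivotal-versus-critical distinction, here is a small sketch that computes both values for a made-up three-player game. The game is an assumption chosen for illustration: the project succeeds with value 1 only if A takes part together with at least one of B or C. The Banzhaf value used is the non-normalized one, which averages a player's marginal contributions uniformly over all coalitions of the other players.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def marginals(players, v, p):
    """Yield (size of S, v(S + p) - v(S)) for every coalition S not containing p."""
    others = [q for q in players if q != p]
    for k in range(len(others) + 1):
        for S in combinations(others, k):
            S = frozenset(S)
            yield k, v(S | {p}) - v(S)

def shapley(players, v):
    # Weights depend on coalition size: they come from averaging over join orders.
    n = len(players)
    return {
        p: sum(Fraction(factorial(k) * factorial(n - k - 1), factorial(n)) * m
               for k, m in marginals(players, v, p))
        for p in players
    }

def banzhaf(players, v):
    # Every coalition of the other players gets the same weight, 1 / 2^(n-1).
    n = len(players)
    return {
        p: sum(Fraction(m, 2 ** (n - 1)) for _, m in marginals(players, v, p))
        for p in players
    }

# Hypothetical game: A is required, plus at least one of B or C.
def v(coalition):
    return 1 if "A" in coalition and len(coalition) >= 2 else 0

players = ["A", "B", "C"]
sv = shapley(players, v)
bv = banzhaf(players, v)
```

In this game the Shapley values are 2/3, 1/6 and 1/6, which sum to the total value of 1, while the Banzhaf values are 3/4, 1/4 and 1/4, which sum to 5/4. So the Banzhaf value does not, in general, split the total exactly among the players.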

This paper has a nice table of properties:

("Additivity" is the same as "linearity" here.)

Focusing on just the properties where they differ:

I'd have to think about this more carefully, but it's not immediately obvious to me which set of properties is better for the purpose at hand.

Is it possible to use Banzhaf values for generic attribution questions outside of voting? If so, can you link to some posts/papers that describe how to use it in such cases? The first set of things that came up are all voting-related.

Unless I'm very confused, yes. Unfortunately, it does seem that almost all of the discussion of it is pretty theoretical and about various axiomatic characterizations. Here's an interesting application paper I found though: The Shapley and Banzhaf values in microarray games. They have a short description of their use of the Banzhaf value (equation 2)---not sure how helpful it is.

Thanks for this post. I'm also pretty enthusiastic about Shapley values, and it is overdue for a clear presentation like this.

The main worry I have is related to the first one GeorgeBridgwater notes: the values seem very sensitive to who one includes as a co-operative counterparty (and how finely we individuate them). As your example with vaccine reminders shows, different (but fairly plausible) accounts of this can change the 'raw' CE estimate by a factor of five.

We may preserve ordering among contributors if we twiddle this dial, but the more typical 'EA problem' is considering different interventions (and thus disjoint sets of counter-parties). Although typical 'EA style' CE estimates likely have expected errors in their exponent rather than their leading digit, a factor of 5 (or maybe more) which can hinge on relatively arbitrary decisions on how finely to individuate who we are working with looks pretty challenging to me.

The Banzhaf value should avoid this problem since it has the property of 2-Efficiency: "The 2-Efficiency property states that the allocation rule that satisfies it is immune against artificial merging or splitting of players."

I'd like to hear more about this if you have the time. It seems to me that it's hard to find a non-arbitrary way of splitting players.

Say a professor and a student work together on a paper. Each of them spends 30 hours on it and the paper would counterfactually not have been written if either of them had not contributed this time. The Shapley values should not be equivalent, because the 'relative size' of the players' contributions shouldn't be measured by time input.

Similarly, in the India vaccination example, players' contribution size is determined by their money spent. But this is sensitive to efficiency: one should not be able to get a higher Shapley value just from spending money inefficiently, right? Or should it, because this worry is addressed by Shapley cost-effectiveness?

(This issue seems structurally similar to how we should allocate credence between competing hypotheses in the absence of evidence. Just because the two logical possibilities are A and ~A does not mean a 50/50 credence is non-arbitrary. Cf. Principle of Indifference.)

This is the best explanation I could find: Notes on a comment on 2-efficiency and the Banzhaf value.

It describes two different kinds of 2-efficiency:

These lead to the corresponding properties:

So basically they're just saying that players can't artificially boost or reduce their assigned values by merging or splitting: the resulting reward is always just the sum of the individual rewards.
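The merging property can be checked on a toy case. The sketch below is an illustration under assumptions: the three-player game (value 1 only if A, B and C all take part) and the merged player name "AB" are made up, and the Banzhaf value is the non-normalized one. It merges A and B into a single player and compares the merged player's value with the sum of the separate values, for both Banzhaf and Shapley.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def attribution(players, v, weight):
    """Weighted sum of marginal contributions; `weight(n, k)` selects the rule."""
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = Fraction(0)
        for k in range(n):
            for S in combinations(others, k):
                S = frozenset(S)
                total += weight(n, k) * (v(S | {p}) - v(S))
        result[p] = total
    return result

def shapley_weight(n, k):
    return Fraction(factorial(k) * factorial(n - k - 1), factorial(n))

def banzhaf_weight(n, k):
    return Fraction(1, 2 ** (n - 1))

# Hypothetical game: worth 1 only if all of A, B and C cooperate.
def v(coalition):
    return 1 if coalition == frozenset("ABC") else 0

# Same game after A and B merge into one player named "AB".
def v_merged(coalition):
    expanded = frozenset(ch for name in coalition for ch in name)
    return v(expanded)

split, merged = ["A", "B", "C"], ["AB", "C"]

banzhaf_split = attribution(split, v, banzhaf_weight)
banzhaf_merged = attribution(merged, v_merged, banzhaf_weight)
shapley_split = attribution(split, v, shapley_weight)
shapley_merged = attribution(merged, v_merged, shapley_weight)
```

Here the Banzhaf value gives A and B 1/4 each and gives the merged player 1/2, so merging changes nothing. The Shapley value gives A and B 1/3 each (2/3 together) but only 1/2 to the merged player, which is the sensitivity to how finely players are individuated discussed above.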

I don't think it directly applies to your professor and student case. The closest analogue would be if the professor and student were working as part of a larger group. Then 2-efficiency would say that the student and professor collectively get X credit whether they submit their work under two names or one.

Sorry for the delayed reply. Does that help at all?

Thanks! Late replies are better than no replies ;)

I don't think this type of efficiency deals with the practical problem of impact credit allocation though! Because there the problem appears to be that it's difficult to find a common denominator for people's contributions. You can't just use man hours, and I don't think the market value of man hours would do that much better (although it gets in the right direction).

I'm skating on thin ice, but I think

1) the discussion is basically correct

2) similar problems have been discussed in evolutionary game theory, chemical reaction/economic/ ecological networks, cooking, and category theory.

3) I find it difficult to wade through examples (ie stories about AMF and the Gates Foundation, or EA hiring) --these remind me of many 'self help' psychology books which explain how to resolve conflicts by going through numerous vignettes involving couples, families, etc--i can't remember all the 'actors' names and roles.

4) i think a classic theorem in game theory (probably by John von Neumann, but maybe by John Nash) shows you can convert Shapley value to counterfactual value very easily. the same issue applies in physics--which can often be thought of as a 'continuous game'.

5) time ordering invariance is not really a problem (except technically)---you can include a time variable as is done in evolutionary game theory. (mathematically its a much more difficult problem but not conceptually).

As I said I'm skating on thin ice, but the theorem says you can convert any positive or negative sum game into a zero sum game. (it's due to von Neumann or Nash, but i think i saw it in books on evolutionary game theory. i think there are analogs in physics, and even ecology, etc.)

Again, i think that may be related to the counterfactual/shapley conversion i 'see' or think exists, but can't prove it----i'd have to look at the definitions again.

To possibly fall through more holes in the ice, i think the prisoner's dilemma might be the simplest example.

(I'm just not fluent in the definitions since i didn't learn them when i was studying some game theory; but i looked at many game theory texts where they did occur--mostly for more complex situations than i was dealing with.)

Also the term 'counterfactual' i only learned from a history book by Niall Ferguson (not a big hero of mine but had what seemed like worthwhile ideas--- he wrote 'counterfactual history'---eg 'what would be state of the world if Germany had won WW2?' )

as noted, i also find examples which use 'vignettes' or 'scenarios', fractions, whole numbers like '7 EA candidates', '$60 million', along with the names of countries (India) and organizations, make it difficult (or time consuming for me) to process. but this is just a stylistic or personal issue.

I wonder if you think an exercise trying to compare the Shapley vs counterfactual value of the 2 cases for WW2 is meaningful---ie would money spent by UK/USA/etc fighting the war have been better spent another way?

i may even put this question to myself to see if it's meaningful in your framework. i spend a bit of time on questionable math/logic problems (some of which have solutions, but i try to find different proofs because i don't understand the existing ones, and occasionally do). Many theorems have many correct proofs which look very different and use different methods, and often have been discovered by many people on different continents at the same time (eg the renormalization group in physics was discovered by Feynman and Nambu (Japan) about the same time). I wish i had a study group who shared my interests in various problems like this one; the few acquaintances i have who work on math/logic basically work on problems that interest them, and don't find mine interesting or relevant.

P.S. I just re-skimmed your article and see you dealt in Scenario 6 with 'tragedy of the commons' which i view as an n-person variant of the 2-person prisoner's dilemma.

also your example 2 (Newton and Leibniz ) is an example which is sort of what i was thinking. The theorem i was thinking of would add to the picture and have something like a 'god' who would create either Newton, Leibniz, or both of them. Shapley value would be the same in all cases. (unless 2 calculus discoveries are better than 1----in sciences sometimes this is seen as true ('replication'), or having 'multiple witnesses' in law as opposed to just an account by one (who is the victim and may not be believed )).

(it's also the case for example that the 3 or 4 or even 5 early versions of quantum mechanics--Schrödinger, Heisenberg, Dirac, Feynman, Bohm---though some say de Broglie anticipated Bohm, and Feynman acknowledged that he found his idea in a footnote in a book by Dirac--although redundant in many ways, each have unique perspectives. the golden rule also has many formulations i've heard)

(In my scenario, with 'god' , i think the counterfactual value of either newton or leibniz would be 1---because without either or both there would be no calculus with shapley value 1. god could have just created nothing---0 rather than 1).

In a way what you seem to be describing is how to avoid the 'neglectedness' problem of EA theory. This overlaps with questions in politics---some people vote for people in a major party who may win anyway, rather than vote for a 'minor party' they may actually agree with more. This might be called the 'glow effect'---similarly some people will support some rock or sports star partly just to be in the 'in crowd'. So they get 'counterfactual value' even if the world is no better off--voting for someone who will win anyway is no better than voting for one who will lose--or rather they actually get additional Shapley value because they are 'happier' being in the 'in crowd' rather than being a less favored minority--but this involves a different calculation for the Shapley value, including 'happiness' and not just 'who won'. But, some people are happier being in 'minorities', so that's another complication in the calculations.

(eg the song by Beck 'i'm a loser' comes to mind. pays to be a loser some times or support an unpopular cause because its actually a neglected one---people just didn't know its actual or Shapley value. )

Thanks for this interesting post. As I argued in the post that you cite and as George Bridgwater notes below, I don't think you have identified a problem in the idea of counterfactual impact here, but have instead shown that you sometimes cannot aggregate counterfactual impact across agents. As you say, CounterfactualImpact(Agent) = Value(World with agent) - Value(World without agent).

Suppose Karen and Andrew have a one night stand which leads to Karen having a baby George (and Karen and Andrew otherwise have no effect on anything). In this case, Andrew's counterfactual impact is:

Value (world with one night stand) - Value (world without one night stand)

The same is true for Karen. Thus, the counterfactual impact of each of them taken individually is an additional baby George. This doesn't mean that the counterfactual impact of Andrew and Karen combined is two additional baby Georges. In fact, the counterfactual impact of Karen and Andrew combined is also given by:

Value (world with one night stand) - Value (world without one night stand)

Thus, the counterfactual impact of Karen and Andrew combined is an additional baby George. There is nothing in the definition of counterfactual impact which implies it can always be aggregated across agents.

This is the difference between "if me and Karen hadn't existed, neither would George" and "If I hadn't existed, neither would George, and if Karen hadn't existed neither would George, therefore if me and Karen hadn't existed, neither would two Georges." This last statement is confused, because the babies referred to in the antecedent are the same.
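The non-additivity described here can be written out as a short sketch. The value function is an assumption that encodes the example as stated: the world contains one additional baby George exactly when both Karen and Andrew are present. The helper `counterfactual_impact` is a hypothetical name that applies the formula Value(world with agents) - Value(world without agents) to a single agent or to a group.

```python
def value(agents):
    """Number of baby Georges: one if both parents are present, else zero."""
    return 1 if {"Karen", "Andrew"} <= set(agents) else 0

def counterfactual_impact(group, world):
    """Value(world) - Value(world with everyone in `group` removed)."""
    return value(world) - value(world - set(group))

world = frozenset({"Karen", "Andrew"})

impact_karen = counterfactual_impact({"Karen"}, world)
impact_andrew = counterfactual_impact({"Andrew"}, world)
impact_both = counterfactual_impact({"Karen", "Andrew"}, world)
```

Each individual impact is 1 and the combined impact is also 1, so adding the two individual impacts (giving 2) does not give the impact of the pair.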

I discuss other examples in the comments to Joey's post.


The counterfactual understanding of impact is how almost all voting theorists analyse the expected value of voting. EAs tend to think that voting is sometimes altruistically rational because of the small chance of being the one pivotal voter and making a large counterfactual difference. On the Shapley value approach, the large counterfactual difference would be divided by the number of winning voters. Firstly, to my knowledge almost no one in voting theory assesses the impact of voting in this way. Secondly, this would, I think, imply that voting is never rational, since in any large election the prospective pay-off of voting would be divided by the potential set of winning voters and so would be >100,000x smaller than on the counterfactual approach.

I don't exactly claim to have identified a problem with the counterfactual function, in itself. The counterfactual is perfectly well defined, and I like it, and it has done nothing wrong. I understand this. It is clear to me that it can't be added just like that. The function, per se, is fine.

What I'm claiming is that, because it can't be aggregated, it is not the right function to think about in terms of assigning impact to people in the context of groups. I am arguing about the area of applicability of the function, not about the function. I am claiming that, if you are optimizing for counterfactual impact in terms of groups, pitfalls may arise.

It's like when you see, for the first time: -1 = sqrt(-1)*sqrt(-1) = sqrt((-1)*(-1)) = sqrt(1) = 1, therefore -1 = 1, and you can't see the mistake. It's not that the sqrt function is wrong, it's that you're using it outside its limited fiefdom, so something breaks. I hope the example proved amusing.

I'm not only making statements about the counterfactual function, I'm also making statements about the concept which people have in their heads which is called "impact", and how that concept doesn't map to counterfactual impact some of the time, and about how, if you had to map that concept to a mathematical function, the Shapley value is a better candidate.

Nice post!

Quick thought on example 2:

I just wanted to point out that what is described with Newton and Leibniz is a very, very simplified example.

I imagine that really, Newton and Leibniz wouldn't be the only ones counted. With Shapley values, all of the other many people responsible for them doing that work and for propagating it would also have shared responsibility. Plus, all of the people who would have invented calculus had the two of them not invented it also would have had some part of the Shapley value.

The phrase "The Shapley assigns equal value to equivalent agents." is quite tricky here, as there's a very specific meaning to "equivalent agents" that probably won't be obvious to most readers at first.

Of course, much of this complexity also takes place with counterfactual value. (As in, Newton and Leibniz aren't counterfactually responsible for all of calculus, but rather some speedup and quality difference, in all likelihood).

This post was awarded an EA Forum Prize; see the prize announcement for more details.

My notes on what I liked about the post, from the announcement:

I think that if this is true, they aren't modelling the counterfactual correctly. If it were the case that all the others were definitely going for the 100 joint utility project no matter what you do, then yes, you should also do that, since the difference in utility is 100 > 20. That's the correct solution in this particular case. If none of the others were pursuing the 100 utility project, then you should pursue the 20 utility one, since 20 > 0. Reality is in-between, since you should treat the counterfactual as a (subjective) probability distribution.

EDIT: "Reality is in-between" was inaccurate. Rather, the situation I presented had all decisions independent. In reality, they are not independent, and you should consider your impact on the decisions of others. See my reply below.

What you say seems similar to a Stag hunt. Consider, though, that if the group is optimizing for their individual counterfactual impact, they'll want to coordinate to all do the 100 utility project. If they were optimizing their Shapley value, they'd instead want to coordinate to do 10 different projects, each worth 20 utility. 20*10 = 200 > 100.

Consider this case: you choose the 20 utility project and single-handedly convince the others to each choose the 20 utility project, or else you convince everyone to do the joint 100 utility project. Now, your own individual counterfactual impact would be 20*10 = 200 > 100.

If you all coordinate and all agree to the 20 utility projects, with the alternative being everyone choosing the joint 100 utility project, then each actor has an impact of 20*10 = 200 > 100. Each of them can claim they convinced all the others.

So, when you're coordinating, you should consider your impact on others' decisions; some of the impact they attribute to themselves is also your own, and this is why you would end up double-counting if you just add up individual impacts to get the group's impact. Shapley values may be useful, but maximizing expected utility still, by definition, leads to the maximum expected utility (ex ante).
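The numbers in this exchange can be reproduced with a small sketch. The setup is an assumption reconstructed from the comments: ten people, a joint project worth 100 that only succeeds if all ten take part, and solo projects worth 20 each. The sketch covers only the simple case where each person's choice is taken as given; it does not model one person convincing the others.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

PEOPLE = frozenset(range(10))

def joint(coalition):
    # Assumed: the 100 utility project needs all ten people.
    return 100 if coalition == PEOPLE else 0

def solo(coalition):
    # Assumed: each person independently produces 20 utility.
    return 20 * len(coalition)

def counterfactual(v, p):
    """Value of the world with p minus value of the world without p."""
    return v(PEOPLE) - v(PEOPLE - {p})

def shapley(v, p):
    """Shapley value of p, summing weighted marginals over coalitions of the others."""
    others = sorted(PEOPLE - {p})
    n = len(PEOPLE)
    total = Fraction(0)
    for k in range(n):
        w = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
        for S in combinations(others, k):
            S = frozenset(S)
            total += w * (v(S | {p}) - v(S))
    return total

results = {
    name: (counterfactual(v, 0), shapley(v, 0))
    for name, v in [("joint", joint), ("solo", solo)]
}
```

Under these assumptions, each person's counterfactual impact is 100 in the joint project and 20 in a solo project, while the Shapley values are 10 and 20 respectively. Summing counterfactual impacts over the ten people gives 1000 for a project worth 100, which is the double-counting mentioned above.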

Good point!

In my mind, that gets a complexity penalty. Imagine that instead of ten people, there were 10^10 people. Then for that hack to work, and for everyone to be able to say that they convinced all the others, there has to be some overhead, which I think that the Shapley value doesn't require.

FWIW, it's as complex as you want it to be since you can use subjective probability distributions, but there are tradeoffs. With a very large number of people, you probably wouldn't rely much on individual information anymore, and would instead lean on aggregate statistics. You might assume the individuals are sampled from some (joint) distribution which is identical under permutations.

If you were calculating Shapley values in practice, I think you would likely do something similar, too. However, if you do have a lot of individual data, then Shapley values might be more useful there (this is not an informed opinion on my part, though).

Perhaps Shapley values could also be useful to guide more accurate estimation, if directly using counterfactuals is error-prone. But it's also a more complex concept for people to understand, which may cause difficulties in their use and verification.

Instead of "counterfactually" should we say "Shapily" now?