Respecting your Local Preferences

Scott Garrabrant

In this post, I give an application of geometric rationality to a toy version of a real problem.

A Conflicted Agent

Let's say you are an agent with two partially conflicting goals. Part of you wants to play a video game, and part of you wants to save the world and tile the multiverse with computronium shaped exactly the way you like it (not paperclips). How should these conflicting interests figure out what to do? (Assume you have not yet had the idea of starting a video game company to save the world.)

We will assume that 2/3 of you wants to save the world, and 1/3 of you wants to play video games.

At first your world-saving-self has the bright idea that 2/3 is bigger than 1/3, and you should therefore devote all your time to saving the world. However, this proposal doesn't stick. Eventually, you end up Nash bargaining with your time, and devoting 2/3 of your time to saving the world and 1/3 of your time to playing video games. This works well for a while, but then your world-saving-self has a new bright idea:

"Let's look around at the world, and see how much ability we expect to have to save it. If it feels like we are in the top 60 percentile of worlds ordered by how much control we have, then we will try to save the world. If we are in the bottom 40 percentile, we will play video games!"

(The 60 percentile is arbitrarily rounding down from 2/3, so that you can both play more video games and save more worlds.)
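The earlier 2/3 versus 1/3 time split can be checked with a minimal sketch. It assumes (this is not stated in the post) that each part's utility is linear in the time it gets, so the weighted Nash product is t^(2/3) · (1 − t)^(1/3) for a fraction t of time spent saving the world. The name `nash_product` and the grid resolution are made up for illustration.

```python
from fractions import Fraction

def nash_product(t):
    """Cube of the weighted geometric mean t^(2/3) * (1 - t)^(1/3).

    Cubing is monotone, so it has the same maximizer.
    Assumption: both utilities are linear in time.
    """
    return t ** 2 * (1 - t)

# Exact grid over [0, 1] in steps of 1/300, so that 2/3 is a grid point.
grid = [Fraction(k, 300) for k in range(301)]
best_t = max(grid, key=nash_product)
print(best_t)  # 2/3
```

Under that linearity assumption, the bargaining solution simply hands each part a share of time equal to its power.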

A Nash Bargaining Model

Let's model this more carefully. Let's say there are five different types of worlds: 1, 2, 3, 4, and 5. In each world, you have two buttons in front of you: the video game button and the save the world button. Each time step, pressing the video game button lets you play some video games, and pressing the save the world button saves the world with probability $\varepsilon \cdot i$ in the world of type $i$.

We have five degrees of freedom. For each $i \in \{1,2,3,4,5\}$, we have $p_i$, which represents the proportion of our time in world $i$ that we spend pressing the save the world button. The rest of the time is spent playing video games.

The part of you that wants to save the world has power $2/3$, and utility equal to $10^{10^{100}} \cdot \varepsilon \cdot (p_1 + 2p_2 + 3p_3 + 4p_4 + 5p_5)$. However, since we are Nash bargaining, the coefficient out in front does not matter.

The part of you that wants to play video games has power $1/3$ and utility equal to $5 - p_1 - p_2 - p_3 - p_4 - p_5$.

We are trying to maximize the weighted geometric mean, $\sqrt[3]{(p_1 + 2p_2 + 3p_3 + 4p_4 + 5p_5)^2 \, (5 - p_1 - p_2 - p_3 - p_4 - p_5)}$, on the cube $[0,1]^5$.

This achieves a maximum when $p_1 = p_2 = 0$, and $p_3 = p_4 = p_5 = 1$. Simple enough.
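This maximum can be confirmed numerically with a coarse grid search. The sketch below is only a sanity check under the model above: it maximizes the cube of the weighted geometric mean (which has the same maximizer), and the names `pooled_objective` and `grid_argmax` and the step size of 1/4 are illustrative choices, not anything from the post.

```python
from itertools import product

def pooled_objective(p):
    """Cube of the weighted geometric mean for a policy p = (p_1, ..., p_5).

    Equals (sum_i i * p_i)^2 * (5 - sum_i p_i).
    """
    save = sum(i * p_i for i, p_i in enumerate(p, start=1))
    play = 5 - sum(p)
    return save ** 2 * play

def grid_argmax(objective, steps=4, dims=5):
    """Brute-force the best grid point of [0, 1]^dims with spacing 1/steps."""
    grid = [k / steps for k in range(steps + 1)]
    return max(product(grid, repeat=dims), key=objective)

best = grid_argmax(pooled_objective)
print(best, pooled_objective(best))  # (0.0, 0.0, 1.0, 1.0, 1.0) 288.0
```

The winning policy spends all of its time saving the world in the three highest-leverage worlds and none in the other two, which is the "top 60 percent" rule from the previous section.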

(Linear is not the best model for the distribution of how much control you have. I am just trying to keep things simple.)

Local Preferences

The main problem with the above analysis, according to me, is that it is not respecting the locality of some of your preferences.

Maybe your desire to save the world is nonlocal. Maybe you care equally about whether this world is saved and whether some hypothetical other world in which you have more or less control over saving the world is saved. Why should you care about this Everett branch more than the other ones? I will grant that your altruistic self thinks that way, but I am guessing your desire to play video games probably doesn't.

The part of me that wants to play video games wants to play video games in this world. If I ask it what it wants in other Everett branches, it doesn't really care much. This does not mean it is making a mistake. Indeed, if I were to go back in time before I observed what world I am in and ask it whether it wanted to play more video games, concentrated in a small number of futures, or fewer total video games equally distributed, it chooses the equal distribution. The (video game part of a) version of me that does not yet know what world it is in is a coalition of many different versions of me that all want to play video games, and are unwilling to trade arbitrary amounts of one of them playing for another one of them playing.

When we combined the 5 different hypothetical parts of you that want to play video games in each world into one big part with a single utility function, we made a mistake. Let us try again.

A Better Nash Bargaining Model

We will keep the same model as before, but we will split up the video game preference.

Instead of one component with power $1/3$, we will have five different components, each with power $1/15$, whose utility functions are $1 - p_i$, for $i \in \{1,2,3,4,5\}$.

We are trying to maximize $\sqrt[15]{(p_1 + 2p_2 + 3p_3 + 4p_4 + 5p_5)^{10} \, (1 - p_1)(1 - p_2)(1 - p_3)(1 - p_4)(1 - p_5)}$ on the cube $[0,1]^5$.

This turns out to be maximized when $p_i = \frac{i-1}{i}$.

As a quick sketch of a proof, observe we can multiply by a constant, and equivalently maximize $\sqrt[15]{(p_1 + 2p_2 + 3p_3 + 4p_4 + 5p_5)^{10} \, (10 - 10p_1)(20 - 20p_2)(30 - 30p_3)(40 - 40p_4)(50 - 50p_5)}$, which can be viewed as the geometric mean of 15 numbers, where $(p_1 + 2p_2 + 3p_3 + 4p_4 + 5p_5)$ is repeated 10 times and the other factors are repeated once. However, observe that regardless of the $p_i$ values, the arithmetic mean of these 15 numbers is 10, since they sum to 150, which means the geometric mean cannot be larger than 10, by the AM-GM inequality. We can achieve a geometric mean of 10 by setting all 15 values equal to 10, by setting $p_i = \frac{i-1}{i}$.
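The arithmetic in this proof sketch is easy to check by machine. The following is a minimal sketch using exact fractions; the helper name `fifteen_terms`, the random seed, and the grid of thousandths are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import prod
import random

def fifteen_terms(p):
    """Return the 15 numbers from the proof sketch for a policy p = (p_1..p_5).

    The world-saving sum appears ten times, followed by 10 * i * (1 - p_i)
    for each world i.
    """
    save = sum(i * p_i for i, p_i in enumerate(p, start=1))
    local = [10 * i * (1 - p_i) for i, p_i in enumerate(p, start=1)]
    return [save] * 10 + local

# Claimed optimum p_i = (i - 1) / i, kept exact with Fraction.
p_star = [Fraction(i - 1, i) for i in range(1, 6)]
terms_star = fifteen_terms(p_star)
print(terms_star == [10] * 15)  # True
print(sum(terms_star))          # 150

# Random policies on a grid of thousandths: the 15 terms always sum to 150,
# and their product never exceeds the product at p_star (which is 10^15).
rng = random.Random(0)
samples = [[Fraction(rng.randint(0, 1000), 1000) for _ in range(5)]
           for _ in range(2000)]
sums_ok = all(sum(fifteen_terms(p)) == 150 for p in samples)
never_beaten = all(prod(fifteen_terms(p)) <= prod(terms_star) for p in samples)
print(sums_ok, never_beaten)    # True True
```

Comparing products rather than 15th roots is enough here, since taking the root is monotone on nonnegative numbers.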

Note that $p_1 = 0$, and in world 1, you don't save the world at all. However, none of the preferences are being exploited here. The part of you in world 1 that wants to save the world is happy you are prioritizing other worlds. If we had $p_i = 1$ for some $i$, on the other hand, this would be a sign that some of your local preferences to play video games were being exploited, since those preferences do not care about other worlds.

The closest we get to that is in world 5, where $p_5 = 0.8$: you are spending 80 percent of your time saving the world and 20 percent of your time playing video games. This is more than 2/3 of your time, which is actually caused by the parts of you in the other worlds having preferences over your actions in world 5, which decrease your video game time, but not all the way to 0.

Note that the above analysis is sort of the most updateless version, where you allow for trade across the different worlds as much as possible. There could be an argument that you should be playing video games even more, because the parts of you in other worlds that care about this world are not actually here to enforce their preferences, but it is hard for me to imagine a good argument that you should be playing less given these preferences.

Note that this argument is about self-cooperation. Maybe you want to cooperate with a much larger collection of potential agents behind a much larger veil of ignorance. I am not arguing against that here. I am only trying to take seriously the argument that you should cooperate with yourself more by putting more effort into saving the world because you have surprisingly high leverage.

Effective Altruism Forum