
In a recent post, Scott Garrabrant gave an application of geometric rationality to the problem of work-life balance. Here's the setup: part of you wants to try to make the world better (I'll be calling this your "altruistic part") and part of you wants to relax and play video games (your "relaxation part"). Geometric rationality suggests doing a Nash bargain between the altruistic part and the relaxation part, across possible worlds that you might have found yourself in. In worlds where you're in an unusually good position to make the world better, the bargain commits you to spend most of your time doing that (satisfying your altruistic part); in return, in worlds where you're not in a good position to make the world better, you spend most of your time playing video games (satisfying your relaxation part).

Scott built a toy mathematical model and tested it with a nice example. The example involved five possible worlds, all equally likely. The way the math worked out, if you ranked the five worlds from least conducive to your altruism to most conducive, the Nash bargain had you spend 0%, 50%, 67%, 75%, and 80% of your time on altruism, respectively, in those five worlds. A pretty nice, intuitively satisfying list of numbers.

 

I recommend reading Scott's post before reading this one. And while I really liked the post, his example was constructed so that the probability distribution over how much altruistic impact you could have was pretty close to a uniform distribution. In the example, the world that was most conducive to your altruistic impact only gave you 1.7x more impact than the median world. But I think impact is much more unevenly distributed than that across possible worlds.

My issue with Scott's post wasn't his model (I quite liked the model!); it was only his example. So I decided to take the model and solve for the optimal bargain in more generality. I wanted to see what the results would be -- whether they'd still be intuitively compelling.

 

The rest of this post will be a bunch of math, but here's the (imprecise) TLDR:

  • If you know that your ability to make the world better is somewhat below average (across all possible worlds you could have found yourself in), you can spend all your time playing video games. If you know it's about average, you should aim for a fairly even mix of video games and altruism. If you know it's substantially above average (>10x, say), then you should spend almost all your time on altruism.
    • I mean "average" in the precise sense of the arithmetic mean -- not the median! There are plausible views under which you are in a top 0.001% world by impact potential, but still have below-average impact potential, because almost all the impact potential is concentrated in the top 0.00000001% of worlds.
  • (Epistemic status: super uncertain.) In practice, you have a huge amount of uncertainty about the distribution of your impact potential across worlds (not to mention uncertainty about how much impact potential you have in this world). While building this uncertainty into the model is hard, my intuition for geometric rationality suggests that a good way to deal with the uncertainty is to consider the probability that you'd assign to the proposition "my impact potential in this world is above average" (where the probability is over your logical uncertainty about the nature of the multiverse), and spend that fraction of your time improving the world.

 

All right, time to get to the math!

The setup

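First, let $a$ be the fraction of you that wants to improve the world (Scott's example uses $a = 2/3$).

Order all worlds based on how much impact potential you have, from least to most. For $p \in [0, 1]$, let $f(p)$ be the impact potential you have in the $p$-th quantile world (so $f$ is an increasing function). (In Scott's example, $f$ is a step function: it is constant on each of $[0, 0.2)$, $[0.2, 0.4)$, $[0.4, 0.6)$, $[0.6, 0.8)$, and $[0.8, 1]$, one interval per world.)

Let $x(p)$ be the fraction of time you'll be devoting toward altruism in the $p$-th quantile world. This is the function that we will be optimizing. Specifically, we will be looking for the function $x$ that maximizes

$$\left(\int_0^1 x(p) f(p)\,dp\right)^{a} \left(\prod_0^1 \bigl(1 - x(p)\bigr)^{dp}\right)^{1-a}.$$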

(This is just the generalization of Scott's equation to a continuous setting.)

...okay, whoa there, what's this thing with the $dp$ in the exponent? You can read Scott's post on the geometric integral if you'd like. But if you (like me) don't have experience with geometric integrals, fear not: it'll go away once we take the log of that expression (which we can do because the function $x$ that maximizes that expression will also be the $x$ that maximizes its logarithm).

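So, taking the log: we are looking for the function $x$ that maximizes

$$V(x) := a \ln\left(\int_0^1 x(p) f(p)\,dp\right) + (1-a) \int_0^1 \ln\bigl(1 - x(p)\bigr)\,dp.$$

($V$ stands for "value".)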
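To make the objective concrete, here is a minimal sketch of its discrete analogue: $n$ equally likely worlds, with the integrals replaced by averages, so that the value is $a \ln(\text{average of } x_i f_i) + (1-a) \cdot (\text{average of } \ln(1 - x_i))$. The function name `log_value` and the impact potentials `[1, 2, 3, 4, 5]` are my own illustrative assumptions, not numbers taken from Scott's post; the "tilted" allocation is the list of percentages quoted at the top of this post.

```python
import math

def log_value(x, f, a):
    """Discrete analogue of V(x) for equally likely worlds.

    x[i]: fraction of time spent on altruism in world i (needs 0 <= x[i] < 1,
          and at least one x[i] > 0 so the first logarithm is defined)
    f[i]: impact potential in world i
    a:    fraction of you that is altruistic
    """
    n = len(f)
    avg_impact = sum(xi * fi for xi, fi in zip(x, f)) / n
    avg_log_leisure = sum(math.log(1 - xi) for xi in x) / n
    return a * math.log(avg_impact) + (1 - a) * avg_log_leisure

# Hypothetical inputs (assumptions for illustration only).
f = [1, 2, 3, 4, 5]
a = 2 / 3
even_split = [0.5] * 5
tilted = [0.0, 0.5, 2 / 3, 0.75, 0.8]

print(log_value(even_split, f, a))
print(log_value(tilted, f, a))
```

Under these assumed inputs, the tilted allocation scores higher than the even split, which is the kind of trade across worlds that the bargain is supposed to produce.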

The next section will be technical. Feel free to skip to the "Drawing conclusions" section below -- you won't lose too much.

 

Solving for the optimal function $x$

The right way to optimize this is with Lagrange multipliers. Unfortunately, I don't know Lagrange multipliers, so I do these sorts of optimization problems informally.

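Let $x^*$ be the optimal choice of $x$. Let's consider making a small change to the value of $x^*$ near a particular value $p$: in particular, for very small $\delta$ and very small positive $\epsilon$, we will consider increasing $x^*(p')$ by $\epsilon$ for $p' \in [p, p + \delta]$. This change will approximately:

  • increase $a \ln\left(\int_0^1 x^*(p) f(p)\,dp\right)$ by $\frac{a \epsilon \delta f(p)}{A}$, where $A := \int_0^1 x^*(p) f(p)\,dp$, and
  • decrease $(1-a) \int_0^1 \ln\bigl(1 - x^*(p)\bigr)\,dp$ by $\frac{(1-a) \epsilon \delta}{1 - x^*(p)}$,

for a net change of $\epsilon \delta \left(\frac{a f(p)}{A} - \frac{1-a}{1 - x^*(p)}\right)$. In order for $x^*$ to be optimal, this change cannot be positive. Thus, one of two things must be true for all $p$: either

  • $\frac{a f(p)}{A} = \frac{1-a}{1 - x^*(p)}$; or
  • $\frac{a f(p)}{A} < \frac{1-a}{1 - x^*(p)}$, but $x^*(p) = 0$, so decreasing $x^*(p)$ is not possible.

(Briefly, we can't have the analogous edge case with $x^*(p) = 1$ because we have $1 - x^*(p)$ in a denominator.)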

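Therefore, rearranging terms, we have

$$x^*(p) = \max\left(0,\ 1 - \frac{(1-a) A}{a f(p)}\right).$$

We find it convenient to define $c := \frac{(1-a) A}{a}$, so that

$$x^*(p) = \max\left(0,\ 1 - \frac{c}{f(p)}\right).$$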

(An aside: the quantity $A$ (which, recall, is defined as $\int_0^1 x^*(p) f(p)\,dp$) has a nice interpretation: it's how much you improve the world, on average across worlds, if you follow the optimal bargain $x^*$. The quantity $c$ also has a nice interpretation: it's how much your relaxation part would need to be satisfied, if it were satisfied as much as your altruistic part, in proportion to how much these two parts exist.)
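Here is a rough numerical sketch of this solution in the discrete setting ($n$ equally likely worlds). It uses the rule $x^*_i = \max(0, 1 - c/f_i)$ together with the self-consistency condition $c = \frac{1-a}{a} \cdot \text{average of } \max(0, f_i - c)$, and finds $c$ by bisection (the right side minus $c$ is decreasing in $c$, so there is a single crossing). The function names, and the choice of impact potentials `[1, 2, 3, 4, 5]` with $a = 2/3$, are my own assumptions, picked because they reproduce the 0%, 50%, 67%, 75%, 80% split quoted at the top of the post.

```python
def solve_c(f, a, iters=200):
    """Find c with c = (1-a)/a * mean(max(0, f_i - c)) by bisection."""
    def gap(c):
        avg_impact = sum(max(0.0, fi - c) for fi in f) / len(f)
        return (1 - a) / a * avg_impact - c

    lo, hi = 0.0, float(max(f))  # gap(lo) >= 0 and gap(hi) <= 0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def optimal_allocation(f, a):
    """Fraction of time on altruism in each world: max(0, 1 - c / f_i)."""
    c = solve_c(f, a)
    return [1 - c / fi if fi > c else 0.0 for fi in f]

# Hypothetical example (assumed inputs, not taken from Scott's post).
allocation = optimal_allocation([1, 2, 3, 4, 5], 2 / 3)
print([round(x, 2) for x in allocation])  # approximately [0.0, 0.5, 0.67, 0.75, 0.8]
```

In this example $c$ comes out to (approximately) $1$, so the least favorable world sits exactly at the threshold where you stop doing any altruism.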

 

Our solution for $x^*$ shows that you should spend all your time playing video games if you find yourself in a world where $f(p) \le c$, and should spend at least some of your time saving the world if in fact $f(p) > c$.

...okay, but what is $c$? While we found $x^*$ in terms of $c$, we've defined $c$ in terms of $x^*$. That means we have an equation to solve! In particular, we have

$$c = \frac{1-a}{a} \int_0^1 x^*(p) f(p)\,dp = \frac{1-a}{a} \int_{p\,:\,f(p) > c} \bigl(f(p) - c\bigr)\,dp.$$

By moving the $c$ out of the integrand and rearranging, we can rewrite this as

$$c = \frac{\int_{p\,:\,f(p) > c} f(p)\,dp}{\frac{a}{1-a} + \int_{p\,:\,f(p) > c} dp}. \qquad \text{(Eq. 1)}$$

This isn't a solution, since there's a $c$ on the right side of the equation. And we won't solve for $c$ exactly, but we will prove a nice lemma that bounds $c$ on both sides in terms of $\int_0^1 f(p)\,dp$ -- which, remember, measures your average impact potential across possible worlds.

 

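Lemma: $(1-a) \int_0^1 f(p)\,dp \le c \le \frac{1-a}{a} \int_0^1 f(p)\,dp$.

Proof: the right-hand inequality is straightforward. We have

$$c = \frac{1-a}{a} \int_0^1 x^*(p) f(p)\,dp \le \frac{1-a}{a} \int_0^1 f(p)\,dp.$$

To prove the left-hand inequality, for convenience we will define $p_0$ to be the value such that $f(p_0) = c$, and $F := \int_{p_0}^1 f(p)\,dp$. Then (from Eq. 1) $c = \frac{F}{\frac{a}{1-a} + 1 - p_0}$. We also have that $\int_0^{p_0} f(p)\,dp \le p_0 c$, i.e. that the average value of $f$ on $[0, p_0]$ is less than $c$, since $f(p_0) = c$ and $f$ is increasing. Therefore, we have

$$\int_0^1 f(p)\,dp = \int_0^{p_0} f(p)\,dp + F \le p_0 c + \left(\frac{a}{1-a} + 1 - p_0\right) c = \frac{c}{1-a}.$$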

This completes the proof.
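As a sanity check on the lemma, here is a small sketch that solves for $c$ numerically in a deliberately heavy-tailed discrete example and compares it against the two bounds $(1-a)\mu$ and $\frac{1-a}{a}\mu$, where $\mu$ is the average impact potential. The six impact potentials (powers of ten) and $a = 2/3$ are made-up assumptions, and `solve_c` is my own helper that finds $c$ by bisection on the discrete version of the equation $c = \frac{1-a}{a} \cdot \text{average of } \max(0, f_i - c)$.

```python
def solve_c(f, a, iters=200):
    """Find c with c = (1-a)/a * mean(max(0, f_i - c)) by bisection."""
    def gap(c):
        avg_impact = sum(max(0.0, fi - c) for fi in f) / len(f)
        return (1 - a) / a * avg_impact - c

    lo, hi = 0.0, float(max(f))
    for _ in range(iters):
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical heavy-tailed example: six equally likely worlds.
f = [10 ** k for k in range(6)]  # 1, 10, ..., 100000
a = 2 / 3

mu = sum(f) / len(f)             # average impact potential
c = solve_c(f, a)
lower, upper = (1 - a) * mu, (1 - a) / a * mu

print(f"lower bound = {lower:.1f}, c = {c:.1f}, upper bound = {upper:.1f}")
print([round(1 - c / fi, 3) if fi > c else 0.0 for fi in f])
```

In this example $c \approx 7857$, which sits between the bounds of roughly $6173$ and $9259$; only the top two worlds have $f_i > c$, so the other four spend no time on altruism.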

 

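Now, recall that $x^*(p) = \max\left(0, 1 - \frac{c}{f(p)}\right)$. Let $\mu := \int_0^1 f(p)\,dp$. By the lemma, for any particular value of $p$, we have

$$\max\left(0,\ 1 - \frac{1-a}{a} \cdot \frac{\mu}{f(p)}\right) \le x^*(p) \le \max\left(0,\ 1 - (1-a) \cdot \frac{\mu}{f(p)}\right).$$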

This gives us some nice bounds to work with when interpreting .

 

Drawing conclusions

Recall that $a$ is the fraction of you that's altruistically minded. In this section, I recommend having a particular value of $a$ in mind. Scott's choice of $a = 2/3$ is a reasonable one.

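The optimal function $x^*$ is $x^*(p) = \max\left(0, 1 - \frac{c}{f(p)}\right)$. If you skipped the previous section, you don't need to worry about what $c$ is; all you need to know is that we've bounded it: for every value of $p$, we have

$$\max\left(0,\ 1 - \frac{1-a}{a} \cdot \frac{\mu}{f(p)}\right) \le x^*(p) \le \max\left(0,\ 1 - (1-a) \cdot \frac{\mu}{f(p)}\right),$$

where $\mu := \int_0^1 f(p)\,dp$ is your average impact potential across worlds.

If $p$ is such that $f(p) \le c$, the model tells us to always play video games. We are guaranteed that this is the case whenever

$$\frac{f(p)}{\mu} \le 1 - a.$$

The quantity on the left has a really natural interpretation: it is the factor by which you have more impact potential in world $p$ than you do in the average world. So if you're in a somewhat below-average impact world, you can safely spend all your time playing video games (or so the model says).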

On the other hand, what if this factor is instead pretty large -- at least $10$, let's say? Then our bound tells us that $x^*(p) \ge 1 - \frac{1-a}{10a}$ (which is $0.95$ when $a = 2/3$) -- that is, you should spend nearly all your time working to improve the world.

And if for some reason you think that the factor is pretty close to $1$ -- let's say it's exactly $1$ -- then the time you should spend improving the world is somewhere between $\max\left(0, \frac{2a-1}{a}\right)$ and $a$. (If $a = 2/3$, this gives a lower bound of 50% and an upper bound of 67%.) So you'd probably spend some intermediate fraction of your time improving the world.
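The three cases above can be tabulated with a few lines of code. This is only a sketch of the bounds as I understand them: the lower bound $\max(0, 1 - \frac{1-a}{a r})$ and the upper bound $\max(0, 1 - \frac{1-a}{r})$, where $r$ is the factor by which your impact potential exceeds the average. The function name `allocation_bounds` and the sample factors are my own choices.

```python
def allocation_bounds(ratio, a):
    """Bounds on the altruism fraction, given ratio = f(p) / (average of f)."""
    lower = max(0.0, 1 - (1 - a) / (a * ratio))
    upper = max(0.0, 1 - (1 - a) / ratio)
    return lower, upper

a = 2 / 3
for ratio in (0.3, 1.0, 10.0):
    lower, upper = allocation_bounds(ratio, a)
    print(f"factor {ratio:>4}: between {lower:.0%} and {upper:.0%} of your time")
```

With $a = 2/3$, a factor of $0.3$ gives 0% on both sides, a factor of $1$ gives the 50% to 67% range, and a factor of $10$ gives roughly 95% to 97%.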

 

Analysis and takeaways

First: how satisfying is this solution?

There's a sense in which I find it satisfying, which is that it accords with my intuition of what should have happened in the model. It seems like a reasonable way for the bargain to go down.

Then again, a sense in which I find it dissatisfying is that if you think (as I do) that the distribution of impact potential across possible worlds has a huge spread, then the bargain has almost all copies of you either spending all their time on video games or spending nearly all their time on improving the world. Which isn't really the takeaway that Scott was going for in his original post.

 

Is there a way to salvage this -- to restore the intuition that in fact you should have a reasonable work-life balance? I think the answer is yes! That's because you're probably very uncertain about what world you live in: you don't have a great idea about what the shape of $f$ is, nor what quantile world you're in (i.e. what $p$-value your world corresponds to). You should probably assign a decent chance to being in an above-average world in terms of impact potential, and a decent chance to being in a below-average world in terms of impact potential.[1]

Faced with this uncertainty, what do you do? If I understand the spirit of geometric rationality correctly, an important lesson is that you can do a Nash bargain not just between you-in-possible-worlds, but between different possibilities with respect to your epistemic state. For example, if you think there's a 90% chance that the Riemann hypothesis is true, you can bargain between you-if-the-Riemann-hypothesis-is-true and you-if-the-Riemann-hypothesis-is-false, even if there's an objective fact of the matter to which one is the actual you.

You can try to build this uncertainty into the model, but I think this would be really hard.[2] On the other hand, here's a basic fact about Nash bargaining: suppose there are two options, $A$ and $B$, and you assign $x$ credence that $A$ has utility $1$ and $B$ has utility $0$, and $1 - x$ credence that $A$ has utility $0$ and $B$ has utility $1$. Then the Nash bargain between these two epistemic states says to do $A$ with probability $x$ and $B$ with probability $1 - x$.
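Here is a quick numerical check of that fact, under assumptions that I should spell out: each epistemic state is weighted by its credence, each state's utility is $1$ for its preferred option and $0$ for the other (with $0$ as the disagreement point), and so the bargain picks the probability $q$ of doing $A$ that maximizes $q^{x} (1-q)^{1-x}$. The function name and the grid search are mine; it's a sketch, not a general Nash bargaining solver.

```python
import math

def nash_bargain_probability(x, steps=10_000):
    """Grid-search the probability q of doing A that maximizes
    x * ln(q) + (1 - x) * ln(1 - q), the log of the weighted Nash product."""
    best_q, best_val = None, -math.inf
    for i in range(1, steps):  # skip q = 0 and q = 1, where the log blows up
        q = i / steps
        val = x * math.log(q) + (1 - x) * math.log(1 - q)
        if val > best_val:
            best_q, best_val = q, val
    return best_q

for credence in (0.25, 0.7, 0.9):
    print(credence, nash_bargain_probability(credence))
```

In each case the maximizing probability lands on (or within one grid step of) the credence itself, matching the calculus: setting the derivative $\frac{x}{q} - \frac{1-x}{1-q}$ to zero gives $q = x$.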

And if you squint, this is sort of like what we have going on. Suppose you think that you're either in a substantially above-average world for impact potential (with probability $q$), or a substantially below-average one (with probability $1 - q$). In the first epistemic state, "spend almost all your time altruistically" (option $A$) is much better than "spend all your time on video games" (option $B$); in the second case, option $B$ is much better than option $A$. And so -- if I'm not losing too much accuracy with my hand-wavy comparisons -- the Nash bargain between these two epistemic states would have you spend a $1 - q$ fraction of your time on video games.

 

So suppose you think there's a 25% chance you live in a substantially below-average world in terms of impact potential, a 70% chance you live in a substantially above-average world, and a 5% chance of a close-to-average world -- well then, perhaps you should spend about 70% of your time working to improve the world!

...well, if you buy the basic model, and if you buy the hand-waving about Nash bargaining over epistemic states that I just did (which I'm not sure I buy)

...but, putting all these caveats to the side, I find this conclusion pretty satisfying.

 

  1. ^

    That last bit -- that maybe you're in a below average impact world -- is perhaps contrarian, but I stand by it. You may well think that you are in the top one-billionth of possible worlds in terms of impact potential. But it could be that among those worlds that are even higher than this one in terms of your impact potential, 1% of them have  times more potential beings than ours, and that this completely dominates the average.

  2. ^

    You'd need to have the domain of optimization be the set of your epistemic states instead of the set of world states. But your "epistemic state" includes way more than just a guess about what $f$ looks like across worlds (or even a probability distribution over what $f$ looks like across worlds). You need to include -- in each individual epistemic state -- your guesses about (what your probability distribution over what $f$ looks like across worlds) looks like in all the other possible epistemic states. And you'll need to include your guesses about what those look like in all the other possible epistemic states. And so on. There are probably nice simplifying assumptions you could make to make this optimization problem tractable; I haven't thought much about this.

Comments (2)

Interesting to think about! 

But for this kind of bargain to work, wouldn't you need confidence that the you in other worlds would uphold their end of the bargain? 

E.g., if it looks like I'm in videogame-world, it's probably pretty easy to spend lots of time playing videogames. But can I be confident that my counterpart in altruism-world will actually allocate enough of their time towards altruism?

(Note I don't know anything about Nash bargains and only read the non-maths parts of this post, so let me know if this is a basic misunderstanding!)

Great question -- you absolutely need to take that into account! You can only bargain with people who you expect to uphold the bargain. This probably means that when you're bargaining, you should weight "you in other worlds" in proportion to how likely they are to uphold the bargain. This seems really hard to think about and probably ties in with a bunch of complicated questions around decision theory.
