Rob Wiblin: One really important consideration that plays into Open Phil’s decisions about how to allocate its funding — and also it really bears importantly on how the effective altruism community ought to allocate its efforts — is worldview diversification. Yeah, can you explain what that is and how that plays into this debate?
Alexander Berger: Yeah, the central idea of worldview diversification is that the internal logic of a lot of these causes might be really compelling and a little bit totalizing, and you might want to step back and say, “Okay, I’m not ready to go all in on that internal logic.” So one example would be just comparing farm animal welfare to human causes within the remit of global health and wellbeing. One perspective on farm animal welfare would say, “Okay, we’re going to get chickens out of cages. I’m not a speciesist and I think that a chicken-day suffering in the cage is somehow very similar to a human-day suffering in a cage, and I should care similarly about these things.”
Alexander Berger: I think another perspective would say, “I would trade an infinite number of chicken-days for any human experience. I don’t care at all.” If you just try to put probabilities on those views and multiply them together, you end up with this really chaotic process where you’re likely to either be 100% focused on chickens or 0% focused on chickens. Our view is that that seems misguided. It does seem like animals could suffer. It seems like there’s a lot at stake here morally, and that there’s a lot of cost-effective opportunities that we have to improve the world this way. But we don’t think that the correct answer is to either go 100% all in where we only work on farm animal welfare, or to say, “Well, I’m not ready to go all in, so I’m going to go to zero and not do anything on farm animal welfare.”
Alexander Berger: We’re able to work on multiple things, and the effective altruism community is able to work on multiple things. A lot of the idea of worldview diversification is to say, even though the internal logic of some of these causes might be so totalizing, so demanding, ask so much of you, that being able to preserve space to say, “I’m going to make some of that bet, but I’m not ready to make all of that bet,” can be a really important move at the portfolio level for people to make in their individual lives, but also for Open Phil to make as a big institution.
Rob Wiblin: Yeah. It feels so intuitively clear that when you're to some degree picking these numbers out of a hat, you should never go 100% or 0% based on stuff that's basically just guesswork. I guess the challenge here seems to have been trying to make that philosophically rigorous, and it does seem like coming up with a truly philosophically grounded justification for that has proved quite hard. But nonetheless, we've decided to go with something that's a bit more cluster thinking, a bit more embracing common sense and refusing to do something that obviously seems mad.
This is also how I think about the meat eater problem. I have a lot of uncertainty about the moral weight of animals, and I see funding/working on both animal welfare and global development as a compromise position that is good across all worldviews. (How seriously you take the meat eater problem can reduce how much you want to fund global development on the margin, but not eliminate it altogether.)
If by weight you meant probability, then placing 100% of that in anything is not implied by a discrete matrix, which must use expected values (i.e., for each possibility, multiply its probability by the impact conditional on that possibility, then sum). One could mentally replace each number with a range for which the original number is the average.
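As a toy illustration of that expected-value step (the numbers below are mine, purely for illustration, not anything claimed in the thread): even a coarse two-row worldview matrix yields an interior expected weight rather than a 0% or 100% verdict.

```python
# Toy worldview "matrix": (probability of the worldview, moral weight of a
# chicken-day relative to a human-day under that worldview).
# All numbers are illustrative assumptions, not claims from the thread.
worldviews = [
    (0.7, 0.0),   # 70%: chicken suffering doesn't matter at all
    (0.3, 0.02),  # 30%: a chicken-day counts for 2% of a human-day
]

expected_weight = sum(p * w for p, w in worldviews)
print(expected_weight)  # 0.006 -- small but nonzero, so neither 0% nor 100%
```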
(It is true that my comment presupposes a certain weighting, and humans should not update on implied premises, except when forming beliefs about what may be good to investigate, in order to avoid outside-view cascades.)
I think beliefs about risk-aversion are probably where the crux between us is.
Uncertainty alone does not imply that one should act in proportion to one's probabilities.[1]
I don't know what is meant by 'risk averse' in this context. More precisely, I claim risk aversion must either (i) follow instrumentally from one's values, or (ii) not be the option that does the most good under one's own values.[2]
Example of (i), where acting in a way that looks risk-averse is instrumental to fulfilling one's actual values: the Kelly criterion.
In a simple positive-EV bet, such as a fair coinflip at 1:2 odds, if one continually bets all of their resources, the probability that they eventually lose everything approaches 1, because all of the gains become concentrated into an increasingly unlikely series of events: there are many possible worlds where they have nothing and one where they have a huge amount of resources. The average resources across all possible worlds is highest under this all-in strategy.
Under my values, that set of outcomes is actually much worse than the available alternatives (due to the diminishing value of additional resources within a single possible world). To avoid it, we can apply something called the Kelly criterion, or more generally bet sums that are substantially smaller than the full amount of one's current resources.
This lets us choose the distribution of resources over possible worlds that our values actually prefer to result from resource-positive-EV bets; we can accept a lower average for a more even distribution.
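A minimal simulation sketch of that contrast, assuming '1:2 odds' means a win returns twice the stake (so the full Kelly fraction here would be about 25% of the bankroll); the trial counts and numbers are illustrative assumptions, not anything from the original discussion.

```python
import random

def simulate(fraction, rounds=100, trials=10_000, p_win=0.5, payout=2.0):
    """Repeatedly bet `fraction` of the current bankroll on a coinflip that
    pays `payout` times the stake on a win. Returns the mean final bankroll
    across trials and the share of trials that ended (effectively) broke."""
    finals = []
    for _ in range(trials):
        bankroll = 1.0
        for _ in range(rounds):
            stake = bankroll * fraction
            if random.random() < p_win:
                bankroll += stake * payout
            else:
                bankroll -= stake
        finals.append(bankroll)
    mean = sum(finals) / trials
    broke = sum(f < 1e-9 for f in finals) / trials
    return mean, broke

random.seed(0)
for f in (1.0, 0.25):  # bet everything vs. the ~25% Kelly fraction
    mean, broke = simulate(f)
    print(f"fraction={f:.2f}  mean final bankroll={mean:,.1f}  broke={broke:.0%}")

# With fraction=1.0, essentially every sampled trial ends broke: the huge
# mathematical expectation lives in worlds too rare to ever sample. The Kelly
# fraction gives up some of that expectation for a far more even distribution
# of resources across possible worlds.
```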
Similarly, if presented with a series of positive-EV bets about things you find morally valuable in themselves, I claim that if you Kelly bet in that situation, it is actually because your values are more complex than {linearly valuing those things} alone.
As an example, I would prefer {a 90% chance of saving 500 good lives} to {a certainty of saving 400} in a world that already had many lives, but if those 500 lives were all the lives that exist, I would switch to preferring the latter - a certainty of only 100 of the 500 dying - even if the resulting population then became the permanent maximum (no creation of new minds possible, so we can't say the choice actually results in a higher expected number of lives).
This is because I have other values that require only that some lives exist in order to be satisfied, including, vaguely, 'the unfolding of a story', and 'the light of life/curiosity/intelligence continuing to make progress in understanding metaphysics until no more is possible'.
Another way to say this is that our values are effectively concave over the thing in question, and we're distributing that thing across possible futures.
This is importantly not what we are doing when we make a choice in an already large world that we're not affecting all of - then we're choosing between, e.g., {90%: 1,000,500, 10%: 1,000,000} and {100%: 1,000,400}. (And notably, we are in a very large world, even beyond Earth.)
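A toy sketch of how a concave value over total lives reproduces that flip; the logarithmic value function and the 1,000,000-person background population are arbitrary stand-ins I'm introducing for illustration, not anything claimed above.

```python
import math

def expected_value(outcomes, value=lambda lives: math.log1p(lives)):
    """Probability-weighted average of a concave value function (here log)
    applied to the total number of lives in each outcome."""
    return sum(p * value(lives) for p, lives in outcomes)

# Small world: the 500 lives at stake are all the lives there are.
gamble_small  = [(0.9, 500), (0.1, 0)]  # 90%: all 500 survive, 10%: none do
certain_small = [(1.0, 400)]            # 400 survive for certain
print(expected_value(gamble_small) > expected_value(certain_small))  # False: prefer certainty

# Large world: the same stakes on top of 1,000,000 lives we aren't affecting.
gamble_large  = [(0.9, 1_000_500), (0.1, 1_000_000)]
certain_large = [(1.0, 1_000_400)]
print(expected_value(gamble_large) > expected_value(certain_large))  # True: take the gamble
```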
At least my own values are over worlds themselves, rather than over the local effects of my actions. Maybe the latter framing leads to mistaken Kelly-like tradeoffs[3], and to acting as if one assigns value to the fact of being net-positive in itself.
(I expanded on this section about Kelly in a footnote at first, then had it replace example (i) in the main post. I think it might make the underlying principle clear enough to make example (ii) unnecessary, so I've moved (ii) to a footnote instead.)[4]
There are two relevant posts from Yudkowsky's sequences that come to mind here. I could only find one of them, 'Circular Altruism'. The other was about a study wherein people bet on multiple outcomes at once in proportion to the probability of each outcome, rather than placing their full bet on the most probable outcome, in a simple scenario where the latter was incentivized.
(Not including edge-cases where an agent values being risk-averse)
It just struck me that some technical term should be used instead of 'risk aversion' here, because the latter in everyday language includes things like taking a moment to check if you forgot anything before leaving home.
Example of (ii), where I seem to act in a non-risk-averse way
I'm offered the option to press a dubious button. This example ended up very long, because there is more implied uncertainty than just the innate chances of the button being of either kind, but maybe the extra detail will help show what I mean / provide more surface area for a cruxy disagreement to be exposed.
I think (66%) it's a magic artifact my friends have been looking for, in which case it {saves 1 vegan[5] who would have died} when pressed. But I'm not sure; it might also be (33%) a cursed decoy, in which case it {causes 1 vegan[5] who would not have died to die} when pressed instead.
The button also can't kill the presser. These stipulations are unrealistic, but they mean I don't have to reason about how at-risk vegans are less likely to be alignment researchers than the non-at-risk vegans I risk killing, or how I might be saving people who don't want to live, or how those at risk of death would have more prepared families, or how my death could cut short a series of bad presses - anything like that.
In this case, I first wonder what it means to 'save a life', and reason that it must mean preventing a death that would otherwise occur. I notice that if no one is going to die, then no additional lives can be saved. I notice that there is some true quantity of vegans who will die absent any action, and I would like to press the button exactly that many times, but I don't know that true quantity, so I have to reason about it under uncertainty.
So I try to reason about what that quantity is by estimating how many lives are at various levels of risk; and though my estimates are very uncertain (I don't know what portion of the population is vegan, nor how likely different people are to die), I still try.
In the end I have a wide probability distribution that is not very concentrated at any particular point, and which is not the one an ideal reasoner would produce, and because I cannot do any better, I press the button exactly as many times as the average number of deaths in that distribution[6].
More specifically, I stop once a press has a ≤ 50% chance of {saving an additional life, conditional on it already being a life-saving button}, because anything less, multiplied by the 66% chance that it is a life-saving button, would give a total chance of saving a life below 33%, which no longer outweighs the 33% chance that each press ends a life instead. The last press has only a very slightly positive EV, and one press further would have a very slightly negative EV.
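A minimal sketch of that stopping rule. The 66%/33% split is from the example above; the Poisson-shaped guess over the true number of at-risk deaths (mean 10) is a stand-in I'm introducing for the wide distribution described, purely for illustration.

```python
# Stopping rule sketch: keep pressing while
#   P(saving button) * P(the n-th press still prevents a death | saving button)
# exceeds
#   P(cursed decoy) * 1  (each press of the decoy causes one death).
import math

P_SAVING = 0.66
P_DECOY = 0.33

def poisson_pmf(k, mean=10.0):
    # computed in log space to stay numerically safe for large k
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def p_press_saves(n, max_k=200):
    """P(at least n at-risk deaths exist | saving button): the chance the
    n-th press still prevents a death rather than doing nothing."""
    return sum(poisson_pmf(k) for k in range(n, max_k))

n = 1
while P_SAVING * p_press_saves(n) > P_DECOY:
    n += 1
print(f"press the button {n - 1} times")
# With this stand-in distribution, P(>= n deaths | saving button) drops below
# 50% just past the median of 10, so the rule says to press 10 times.
```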
Someone following a 'risk averse principle' might stop pressing once their distribution says an additional press scores less than 60% on that conditional, or something. They may reason, "Pressing it only so many times seems likely to do good across the vast majority of worldviews in the probability distribution," and that would be true.
In my view, that's just accepting the opposite trade: declining a 60% chance of preventing a death in return for a 40% chance of preventing a death.
I don't see why this simple case would not generalize to reasoning about real-world actions under uncertainty about other things, like how bad the experience of being a factory-farmed animal is. But it would be good for me to learn of such reasons if I'm missing something.
(To avoid, in the thought experiment, the very problem this post is about)
(Given the setup's simplifying assumptions. In reality, there might be a huge average number that mostly comes from tail-worlds, let alone probable environment hackers.)