Fanatical EAs should support very weird projects

Derek Shiller

72

Even if you try to follow an unbounded utility function (which has deep mathematical problems, but set those aside for now), these conclusions don't follow.

Generally the claims here fall prey to the fallacy of unevenly applying the possibility of large consequences to some acts where you highlight them and not to others, such that you wind up neglecting more likely paths to large consequences.

For instance, in an infinite world (including infinities created by infinite branching faster than you can control) with infinite copies of you, any decision, e.g. eating an apple, has infinite consequences on decision theories that account for the fact that all copies must make the same (distribution of) decisions. If perpetual motion machines or hypercomputation or baby universes are possible, then making a much more advanced and stable civilization is far more promising for realizing things related to that than giving in to religions where you have very high likelihood ratios that they don't feed into cosmic consequences.

Any plan for infinite/cosmic impact that has an extremely foolish step in it (like Pascal's Mugging) is going to be dominated by less foolish plans.

There will still be implications of unbounded utility functions that are weird and terrible by the standards of other values, but they would have to follow from the most sophisticated analysis, and wouldn't have foolish instrumental irrationalities or uneven calculation of possible consequences.

A lot of these scenarios are analogous to someone caricaturing the case for aid to the global poor as implying that people should give away all of the food they have (sending it by FedEx) to famine-struck regions, until they themselves starve to death. Yes, cosmopolitan concern for the poor can elicit huge sacrifices of other values like personal wellbeing or community loyalty, but that hypothetical is obviously wrong on its own terms as an implication.

Derek Shiller

4y

11

Generally the claims here fall prey to the fallacy of unevenly applying the possibility of large consequences to some acts where you highlight them and not to others, such that you wind up neglecting more likely paths to large consequences.

Could you be more specific about the claims that I make that involve this fallacy? This sounds to me like a general critique of Pascal's mugging, which I don't think fits the case that I've made. For instance, I suggested that the simple MWI has a probability ~ and would mean that it is trivially possible, if true, to generate $2^{1000} v$ in value, where $v$ is all the value currently in the world. The expected value of doing things that might cause 1000 successive branchings is ~ $10^{283} v$. Do you think that there is a higher probability way to generate a similar amount of value?
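As a rough arithmetic check on the two figures quoted above (a sketch, not part of the original argument): 1000 successive doublings multiply the world's value by $2^{1000}$, and the snippet below only back-solves what credence those two figures jointly imply. The credence itself is not stated here, and the variable names are illustrative.

```python
import math

BRANCHINGS = 1000            # successive doublings, as in the comment above
LOG10_EXPECTED_VALUE = 283   # the quoted expected value is ~10^283 v

# log10 of the multiplier 2^1000 applied to the world's current value v
log10_multiplier = BRANCHINGS * math.log10(2)

# Expected value = credence * multiplier * v, so in log10 terms:
implied_log10_credence = LOG10_EXPECTED_VALUE - log10_multiplier

print(f"2^1000 is about 10^{log10_multiplier:.2f}")
print(f"implied credence is about 10^{implied_log10_credence:.2f}")
```

Working in logarithms keeps the numbers readable; the same check could be done with exact integers.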

then making a much more advanced and stable civilization is far more promising for realizing things related to that.

I suppose your point might be something like, absurdist research is promising, and that is precisely why we need humanity to spread throughout the stars. Just think of how many zany long-shot possibilities we'll get to pursue! If so, that sounds fair to me. Maybe that is what the fanatic would want. It's not obvious that we should focus on saving humanity for now and leave the absurd research for later. Asymmetries in time might make us much more powerful now than later, but I can see why you might think that. I find it a rather odd motivation though.

CarlShulman

4y

9

Here's one application. You posit a divergent 'exponentially splitting' path for a universe. There are better versions of this story with baby universes (which work better on their own terms than counting branches equally irrespective of measure, which assigns ~0 probability to our observations).

But in any case you get some kind of infinite exponentially growing branching tree ahead of you regardless. You then want to say that having two of these trees ahead of you (or a faster split rate) is better. Indeed, on this line you're going to say that something that splits twice as fast is so much more valuable as to drive the first tree to ~nothing. Our world very much looks not-optimized for that, but it could be, for instance, a simulation or byproduct of such a tree, with a constant relationship of such simulations to the faster-expanding tree (and any action we take is replicated across the endless identical copies of us therein).
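To make the "drives the first tree to ~nothing" step concrete, here is a minimal sketch that assumes branches are simply counted equally (the very assumption questioned above). The rates 2 and 4 per step are hypothetical stand-ins for "splits" and "splits twice as fast".

```python
from fractions import Fraction

def slower_tree_share(steps, slow_rate=2, fast_rate=4):
    """Fraction of all branches that belong to the slower-splitting tree after `steps` steps."""
    slow = slow_rate ** steps
    fast = fast_rate ** steps
    return Fraction(slow, slow + fast)

# The slower tree's share of the total shrinks exponentially with the number of steps.
for steps in (1, 10, 100):
    print(steps, float(slower_tree_share(steps)))
```

With these rates the share is exactly $1/(1 + 2^{t})$ after $t$ steps, so whichever process splits fastest ends up carrying almost all of the counted value.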

Or you can say we're part of a set of parallel universes that don't split but which is as 'large' as the infinite limit of the fastest splitting process.

I suppose your point might be something like, absurdist research is promising, and that is precisely why we need humanity to spread throughout the stars. Just think of how many zany long-shot possibilities we'll get to pursue! If so, that sounds fair to me. Maybe that is what the fanatic would want. It's not obvious that we should focus on saving humanity for now and leave the absurd research for later. Asymmetries in time might make us much more powerful now than later, but I can see why you might think that. I find it a rather odd motivation though.

Personally, I think we should have a bounded social welfare function (and can't actually have an unbounded one), but place finite utility on doing a good job picking low-hanging fruit on these infinite scope possibilities. But that's separate from the question of what efficient resource expenditure on those possibilities looks like.

SamuelKnoche

4y

3

I give the MWI a probability of greater than 0.5 of being correct, but as far as I can tell, there isn't any way to generate more value out of it. There isn't any way to create more branches. You can only choose to be intentional and explicit about creating new identifiable branches, but that doesn't mean that you've created more branches. The branching happens regardless of human action.

Someone with a better understanding of this please weigh in.

Michael St Jules 🔸

4y

9

(Edited to remove some bits.)

Pursuing (or influencing others to pursue) larger cardinal numbers of value, e.g. creating or preventing the existence of possible beings, seems sufficiently neglected relative to extinction risk reduction and the chances of value-lock-in are high enough that increasing or decreasing the expected amount of resources used to generate such higher cardinals of (dis)value or improving their quality conditional on an advanced stable civilization looks at least roughly as promising as extinction risk reduction for a scope-sensitive expected value maximizer. (However, plausibly you should just be indifferent to everything, if you aggregate value before taking differences rather than after.)

JP Addison🔸

4y

22

Mod note: I've enabled agree-disagree voting on this thread. This is the EA Forum's first experiment with this feature, which was developed by the LessWrong team (thanks!). I'm very interested in your feedback. Leave a comment here, or email me.

Guy Raveh

4y

2

I noticed this on my own comment, came back to look at other comments here, and can say I'm already confused. But I already said I was against the idea, and maybe it's just about getting used to change.

So as more practical feedback: the meaning of the different karma types could be explained better in the hover texts. Currently they're presented as "agreement" vs. "overall" karma - it's not clear what the latter means. And the "agreement" hover text basically tries to explain both:

How much do you agree with this, separate from whether you think it's a good comment?

I would only put "How much do you agree with this?" there, and put "How good is this comment?" (or maybe some clearer but short explanation) in the regular karma hover text.

dark

4y

17

Just to clarify: is the core argument here roughly, "I'm suspicious of things that look like a Pascal's mugging"?

If this is your argument, then I agree with you (to an extent). But reading through your examples, I feel unsure whether the Pascal's mugging aspect is your crux, or if the weirdness of the conclusion is your crux. To test this concretely: if we were close to 100% confident that we do live in a universe where we could, e.g., produce quantum events that trigger branchings, would you want a lot of effort going into triggering such branchings? (For what it's worth, I would want this.)

Derek Shiller

4y

4

I don't love the EA/rationalist tendency to dismiss long shots as Pascal's muggings. Pascal's mugging raises two separate issues: 1) what should we make of long shots with high expected value? and 2) what evidence does testimony by itself provide for highly implausible hypotheses (particularly compared with other salient possibilities)? Considerations around (2) seem sufficient to be wary of Pascal's mugging, regardless of what you think of (1).

I definitely think that if you were 100% confident in the simple MWI view, that should really dominate your altruistic concern. Every time the world splits, the number of pigs in gestation crates (at least) doubles! How can you not see that as something you should really care about? It might be a lonely road, but how can you pass up such high returns? (Of course it is bad for there to be pigs in gestation crates -- I assume it is outweighed by good things, but those good things must be really good to outweigh such bads, so we should really want to double them. If they're not outweighed, we should really try to stop branchings.)

For what it's worth, I think I'd be inclined to think that the simple MWI should dominate our considerations even at a 1 in a thousand probability. Not sure about the 1 in a million range.

I think this post is the result of three motivations.

1.) I think the expected value of weird projects really is ludicrously high.

2.) I don't want to work on them, or feel like I should be working on them. I get the impression that many, even most, EAs would agree.

3.) I'd bet I'm not going to win a fight about the rationality of fanaticism with Yoaav Isaacs or Hayden Wilkinson.

Lukas_Gloor

4y

9

I definitely think that if you were 100% confident in the simple MWI view, that should really dominate your altruistic concern. Every time the world splits, the number of pigs in gestation crates (at least) doubles! How can you not see that as something you should really care about?

If you google terms like "measure," "reality fluid" or "observer fluid" you find long discussions on Lesswrong related to how "the number of pigs in gestation crates (at least) doubles!" is probably a confused way of thinking. I don't understand these issues at all, but there's definitely a rabbit hole to delve into from here.

Derek Shiller

4y

10

Lesswrong related to how "the number of pigs in gestation crates (at least) doubles!" is probably a confused way of thinking.

Sure, but how small is the probability that it isn't? It has to be really small to counteract the amount of value doubling would provide.

Lukas_Gloor

4y

18

Ah, reading your post and comments more closely, I realize you're aware of the picture probably being a different one, but, in your example, you focus on "branching doubles the things that matter" because it leads to these fanatical conclusions. That makes sense.

It has to be really small to counteract the amount of value doubling would provide.

It depends what you compare it to. Sure, if you compare a case where no branching happens at all (i.e., no MWI) and one in which branching happens and you treat it as "branching doubles the amount of stuff that matters," then yes, there's a wager in favor of the second.

However, if you compare "MWI where branching doubles the amount of stuff that matters" to "MWI where there's an infinite sea of stuff and within that sea, there's objective reality fluid or maybe everything's subjective and something something probabilities are merely preferences over simplicity," then it's entirely unclear how to compare these two pictures. (Basically, the pictures don't even agree on what it means to exist, let alone how to have impact.)

Derek Shiller

4y

1

However, if you compare "MWI where branching doubles the amount of stuff that matters" to "MWI where there's an infinite sea of stuff and within that sea, there's objective reality fluid or maybe everything's subjective and something something probabilities are merely preferences over simplicity," then it's entirely unclear how to compare these two pictures. (Basically, the pictures don't even agree on what it means to exist, let alone how to have impact.)

I'm not sure I really understand the response. Is it that we shouldn't compare the outcomes between, say, a Bohmian interpretation and my simplistic MW interpretation, but between my simplistic MW interpretation and a more sophisticated and plausible MW interpretation, and those comparisons aren't straightforward?

If I've got you right, this seems to me to be a sensible response. But let me try to push back a little. While you're right that it may be difficult to compare different metaphysical pictures considered as counterfactual, I'm only asking you to compare metaphysical pictures considered as actual. You know how great it actually is to suck on a lollipop? That's how great it is to suck on a lollipop whether you're a worm navigating through branching worlds or a quantum ghost whose reality is split across different possibilities or a plain old Bohmian hunk of meat. Suppose you're a hunk of meat, how great would it be if you were instead a worm? Who knows and who cares! We don't have to make decisions for metaphysical possibilities that are definitely not real and where sucking on a lollipop isn't exactly this great.

Lukas_Gloor

4y

9

I'm not sure I really understand the response. Is it that we shouldn't compare the outcomes between, say, a Bohmian interpretation and my simplistic MW interpretation,

I’m not saying you can’t compare those two. You can – the simplistic MW interpretation will win because it has more impact at stake, as you say, so it wins under expected utility theory, even if you assign it low credence.

However, if you’re going down the road of “what speculative physics interpretation produces the largest utilities under expected utility theory?” you have to make sure to get the biggest one, the one where the numbers grow the most. This is Carl's point above. My point is related: if we're considering the many worlds interpretation, it seems more plausible that there are infinite* branches** already than that branching doubles the amount of stuff that matters.

So, comparing infinite many worlds to your many worlds with some finite but ever-growing number of branches, it seems unclear which picture to focus on as expected utility maximizers. If there's an infinite sea of worlds/branches all at once and all our actions have infinite consequences across infinite copies of ourselves in different worlds/branches, that's more total utility at stake than in your example, arguably. I say "arguably" because the concept of infinity is contested by some, there's the infinitarian paralysis argument that says all actions that affect infinities of the same order are of equal value, and there are philosophical issues around what it could possibly mean for something to "exist" if there's an infinite number of everything you can logically describe (this goes slightly further than many worlds – "if everything we can coherently describe can exist, what would it even mean for something not to exist? Can some things exist more than others?").***

In short, the picture becomes so strange that "Which of these speculative physics scenarios should I focus on as an expected utility maximizer?" becomes more a question about the philosophy of "What do we mean by having impact in a world with infinities?" and less about straightforwardly comparing the amounts of utility at stake.

*I might butcher this (I remember there's something about how the probabilities you get for branching "splits" may change based on arbitrary-seeming assumptions about which "basis" to use, or something like that? I found this section on Wikipedia on the preferred basis problem), but I think one argument for infinities in the MWI goes as follows. Say you have a quantum split and it’s 50-50, meaning 50% that the cat in the box is dead, 50% it’s alive. In this situation, it seems straightforward to assume that one original world splits into two daughter worlds. (Or maybe the original splits into four worlds, half with dead cat, half with an alive cat. It's already a bit disconcerting that we maybe couldn't distinguish between one world splitting into two and one world splitting into four?)

Now let’s assume there’s a quantum split, but the observed probabilities are something weird like 2/7. Easy, you say. "Two worlds with a dead cat, five worlds with an alive cat."

Okay. But here comes the point where this logic breaks apart. Apparently, some quantum splits happen with probabilities that are irrational numbers – numbers that cannot be expressed as fractions. Wtf. :S (I remember this from somewhere in Yudkowsky's quantum physics sequence, but here's a discussion on a physics forum where I found the same point. I don't know how reliable that source is.)
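The counting logic in this footnote can be sketched in a few lines. This is only an illustration under the naive assumption that worlds are counted as whole numbers; `branch_counts` is a hypothetical helper, and $1/\sqrt{2}$ is just an example of an irrational probability.

```python
from fractions import Fraction
import math

def branch_counts(probability):
    """Smallest whole-number split (worlds with the outcome, worlds without) for a rational probability."""
    return probability.numerator, probability.denominator - probability.numerator

print(branch_counts(Fraction(1, 2)))  # the 50-50 cat
print(branch_counts(Fraction(2, 4)))  # Fraction reduces 2/4 to 1/2, so two-vs-four worlds look the same
print(branch_counts(Fraction(2, 7)))  # the 2/7 example

# For an irrational probability no whole-number split is exact. The float below is
# itself only an approximation, so this shows the trend rather than proving anything.
target = 1 / math.sqrt(2)
for max_worlds in (10, 100, 1000):
    approx = Fraction(target).limit_denominator(max_worlds)
    print(max_worlds, branch_counts(approx), abs(float(approx) - target))
```

Allowing more worlds gives a closer fit, but each larger budget suggests a different split, which is one way of seeing why finite branch counting looks suspect.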

**[Even more speculative than the other points above.] Perhaps the concept of "branching" isn't exactly appropriate, and there's some equivalence between the MW quantum multiverse and a single universe with infinite spatial extent, where there are also infinite copies of you with each copy being extremely far apart from each other. (In an infinitely spatially extended universe with fixed physical laws and random initial conditions, macroscopic patterns would start to repeat themselves eventually at a far enough distance, so you'd have infinite exact copies of yourself and infinite nearly-exact copies.) Maybe what we experience/think of as "branching" is just consciousness moments hopping from one subjectively indistinguishable location to the next. This sounds wild, but it's interesting that when you compare different ways for there to be a multiverse, the MWI and the infinitely spatially expanded universe have the same laws of physics, so there's some reason to assume that maybe they're two ways of describing the same thing. By contrast, inflationary cosmology, which is yet another way you can get a "multiverse," would generate universe bubbles with different laws of physics for each bubble. (At least, that's what I remember from the book The Hidden Reality.) (I found a paper that discusses the hypothesis that the infinitely spatially extended multiverse is the same as the quantum multiverse – it attributes the idea to Tegmark and Aguirre, but I first heard it from Yudkowsky. The paper claims to argue that the idea is false, for what it's worth.)

***To elaborate on "philosophical issues around what it means for something to exist." Consider the weird idea that there might be these infinite copies of ourselves out there, some of which should find themselves in bizarre circumstances where the world isn't behaving predictably. (If there are infinite copies of you in total, you can't really say "there are more copies of you in environments where the furniture doesn't turn into broccoli the next second than there are copies in environments where it does." After all, there are infinite copies in both types of environment!) So, this raises questions like "Why do things generally appear lawful/predictable to us?" and "How much should we care about copies of ourselves that find themselves in worlds where the furniture turns into broccoli?" So, people speculate whether there's some mysterious "reality fluid" that could be concentrated in worlds that are simpler and therefore they appear more normal/predictable to us. (One way to maybe think of this is that the universe is a giant automaton that's being run, and existence corresponds not just to whether there's a mathematical description of the patterns that make up you and your environment, but also somehow to "actually being run" or "(relative?) run-time.") Alternatively, there's a philosophical view that we may call "existence anti-realism." We start by noting that the concept of "existence" looks suspicious. David Chalmers coined the term bedrock concepts for concepts that we cannot re-formulate in non-question-begging terminology (terminology from another domain). So these concepts are claimed to be "irreducible." Concepts like "moral" or "conscious" are other contenders for bedrock concepts.
Interestingly enough, when we investigate purported bedrock concepts, many of them turn out to be reducible after all (e.g., almost all philosophers think concepts like "beautiful" are reducible; many philosophers think moral concepts are reducible; a bunch of philosophers are consciousness anti-realists, etc.). See this typology of bedrock concepts I made, where existence anti-realism is the craziest tier. It takes the sort of reasoning that is common on Lesswrong the furthest you can take it. It claims that whether something "exists" is a bit of a confused question, that our answers to it depend on how our minds are built, like what priors we have over worlds or what sort of configurations we care about. I don't understand it, really. But here's a confusing dialogue on the topic.

As I said in my earlier comment, it's a rabbit hole.

Derek Shiller

4y

2

Thanks for clarifying! I think I get what you're saying. This certainly is a rabbit hole. But to bring it back to the points that I initially tried to make, I'm kind of struggling to figure out what the upshot would be. The following seem to me to be possible take-aways:

1.) While the considerations in the ballpark of what I've presented do have counterintuitive implications (if we're spawning infinite divisions every second, that must have some hefty implications for how we should and shouldn't act, mustn't it?), fanaticism per se doesn't have any weird implications for how we should be behaving because it is fairly likely that we're already producing infinite amounts of value and so long shots don't enter into it.

2.) Fanaticism per se doesn't have any weird implications for how we should be behaving, because it is fairly likely that the best ways to produce stupendous amounts of value happen to align closely with what commonsense EA suggests we should be doing anyway. (I like Michael St. Jules's approach to this, which says we should promote the long-term future of humanity so we have the chance to research possible transfinite amounts of value.)

3.) These issues are so complicated that there is no way to know what to do if we're going fanatical, so even if trying to create branches appears to have more expected utility than ordinary altruistic actions, we should stick to the ordinary altruistic actions to avoid opening up that can of worms.

Guy Raveh

4y

4

I definitely think that if you were 100% confident in the simple MWI view, that should really dominate your altruistic concern.

TBH I don't think this makes sense. Every decision you make in this scenario, including the one to promote or stop branching, would be a result of some quantum processes (because everything is a quantum process), so the universe where you decided to do it would be complemented by one where you didn't. None of your decisions have any effect on the amount of suffering etc., if it's taken as a sum over universes.

kokotajlod

4y

10

My own thoughts on this subject.

Also relevant: Impossibility results for unbounded utility functions.

You say:

Isaacs, Beckstead & Thomas, and Wilkinson point out how weird it would be to adopt a complete and consistent decision theory that wasn't fanatical. It would involve making arbitrary distinctions between minute differences of the probability of different wagers or evaluating packages of wagers differently than one evaluates the sum of the wagers individually. Offered enough wagers, non-fanatics must make some distinctions that they will be very hard-pressed to justify.

Idk, bounded utility functions seem pretty justifiable to me.* Just slap diminishing returns on everything. Yes, more happy lives are good, but if you already have a googolplex of them, it's not so morally important to make more. Etc. As for infinities, well, I think we need a measure over infinities anyway, so let's say that our utility function is bounded by 1 and -1, with 1 being the case where literally everything that happens across infinite spacetime is as good as possible--the best possible world--and -1 is the opposite, and in between we have various cases in which good things happen with some measure and bad things happen with some measure.
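A minimal sketch of what "slap diminishing returns on everything" could look like, covering only the positive side of the [-1, 1] range. The functional form $n / (n + k)$ and the `half_point` constant are assumptions chosen for illustration, not something the comment commits to.

```python
from fractions import Fraction

def bounded_value(happy_lives, half_point=10**6):
    """Hypothetical bounded value: rises with each life but never reaches 1."""
    return Fraction(happy_lives, happy_lives + half_point)

def marginal_value(happy_lives, half_point=10**6):
    """Extra value from adding one more happy life to an existing population."""
    return bounded_value(happy_lives + 1, half_point) - bounded_value(happy_lives, half_point)

# A googolplex is far too large to write out; a googol already makes the point.
GOOGOL = 10**100
print(float(marginal_value(0)))       # first life: small but noticeable
print(float(marginal_value(GOOGOL)))  # one more life on top of a googol: vanishingly small
```

Exact fractions are used because ordinary floats would round the value at a googol lives to exactly 1 and hide the remaining marginal value.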

*I totally feel the awkwardness/counterintuitiveness in certain cases, as the papers you link point out. E.g. when it's about suffering. But it feels much less bad than the problems with unbounded utility functions. As you say, it seems like people with unbounded utility functions should be fanatical (or paralyzed, I'd add) and fanatics... well, no one I know is willing to bite the bullet and actually start doing absurdist research in earnest. People might claim, therefore, to have unbounded utility functions, but I doubt their claims.

Derek Shiller

4y

19

Big fan of your sequence!

I'm curious how you think about bounded utility functions. It's not something I've thought about much. The following sort of case seems problematic.

Walking home one night from a lecture on astrophysics where you learned about the latest research establishing the massive size of the universe, you come across a child drowning in a pond. The kid is kicking and screaming trying to stay above the water. You can see the terror in his eyes and you know that it's going to get painful when the water starts filling his lungs. You see his mother, off in the distance, screaming and running. Something just tells you she'll never get over this. It will wreck her marriage and her career. There's a life preserver in easy reach. You could save the child without much fuss. But you recall from your lecture the oodles and oodles of people living on other planets and figure that we must be very near the bound of total value for the universe, so the kid's death can't be of more than the remotest significance. And there's a real small chance that solipsism is true, in which case your whims matter much more (we're not near the bounds) and satisfying them will make a much bigger difference to total value. The altruistic thing to do is to not make the effort, which could be mildly unpleasant, even though it very likely means the kid will die an agonizing death and his mother will mourn for decades.

That seems really wrong. Much more so than thinking that fanaticism is unreasonable.
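Here is one way the numbers in this case could be set up, as a sketch only. Every constant below is hypothetical, the value function $w / (w + k)$ is just one bounded function of aggregate welfare, and the verdict depends entirely on those choices.

```python
from fractions import Fraction

HALF_POINT = 10**6                 # hypothetical scale of the bounded value function
BIG_WORLD_WELFARE = 10**30         # hypothetical total welfare in a vast universe
SOLIPSIST_WELFARE = 50             # hypothetical total welfare if only you exist
P_SOLIPSISM = Fraction(1, 10**9)   # hypothetical tiny credence in solipsism
RESCUE_BENEFIT = 100               # hypothetical welfare gained by the child and mother
EFFORT_COST = Fraction(1, 100)     # hypothetical mild unpleasantness of the rescue

def bounded_value(total_welfare):
    """Bounded, increasing function of aggregate welfare; approaches 1 from below."""
    welfare = Fraction(total_welfare)
    return welfare / (welfare + HALF_POINT)

# If the universe is vast, rescuing adds the benefit minus the effort.
gain_if_big = (bounded_value(BIG_WORLD_WELFARE + RESCUE_BENEFIT - EFFORT_COST)
               - bounded_value(BIG_WORLD_WELFARE))

# If solipsism is true, there is no child; only the effort cost is real.
gain_if_solipsism = (bounded_value(SOLIPSIST_WELFARE - EFFORT_COST)
                     - bounded_value(SOLIPSIST_WELFARE))

expected_gain_from_rescue = ((1 - P_SOLIPSISM) * gain_if_big
                             + P_SOLIPSISM * gain_if_solipsism)

print(float(gain_if_big), float(gain_if_solipsism), float(expected_gain_from_rescue))
```

Under these made-up numbers the rescue comes out with negative expected value, because the gain in the vast world sits on the flat part of the curve while the small cost in the solipsist world sits on the steep part. Different constants, or a bound defined over something other than aggregate welfare, could reverse the result.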

kokotajlod

4y

3

Ooof, yeah, I hadn't thought about the solipsism possibility before. If the math checks out then I'll keep my bounded utility function but also maybe add in some nonconsequentialist-ish stuff to cover this case and cases like it. (or, you can think of it as just specifying that the utility function should assign significant negative utility to you doing unvirtuous acts like this.)

That said, I'm skeptical that the math works out for this example. Just because the universe is very big doesn't mean we are very near the bound. We'd only be very near the bound if the universe was both very big and very perfect, i.e. suffering, injustice, etc. all practically nonexistent as a fraction of things happening.

So we are probably nowhere near either end of the bound, and the question is how much difference saving one child makes in a very big universe.

For reasons related to noncausal decision theory, the answer is "a small but non-negligible fraction of all the things that happen in this universe depend on what you do in this case. If you save the child, people similar to you in similar situations all across the multiverse will choose to save similar children (or alien children, or whatever)."

The question is whether that small but non-negligible positive impact is outweighed by the maybe-solipsism-is-true-and-me-enjoying-this-ice-cream-is-thus-somewhat-important possibility.

Intuitively it feels like the answer is "hell no" but it would be good to see a full accounting. I agree that if the full accounting says the answer is "yes" then that's a reductio.

Note that the best possible solipsistic world is still vastly worse than the best possible big world.

(Oops, didn't realize you were the same person that talked to me about the sequence, shoulda put two and two together, sorry!)

Derek Shiller

4y

7

Just because the universe is very big doesn't mean we are very near the bound. We'd only be very near the bound if the universe was both very big and very perfect, i.e. suffering, injustice, etc. all practically nonexistent as a fraction of things happening.

My thought was that you'd need a large universe consisting of people like us to be very near the bound, otherwise you couldn't use boundedness to get out of assigning a high expected value to the example projects I proposed. There might be ways of finessing the dimensions of boundedness to avoid this sort of concern, but I'm skeptical (though I haven't thought about it much).

I also find it methodologically dubious to adjust your value function to fit what actions you think you should do. It feels to me like your value function should be your value function, and you should adjust your decision rules if they produce a bad verdict. If your value function is bounded, so be it. But don't cut it off to make expected value maximization more palatable.

If the math checks out then I'll keep my bounded utility function but also maybe add in some nonconsequentialist-ish stuff to cover this case and cases like it.

I can see why you might do this, but it feels strange to me. The reason to save the child isn't because it's a good thing for the child not to drown, but because there's some rule that you're supposed to follow that tells you to save the kid? Do these rules happen to require you to act in ways that basically align with what a total utilitarian would do, or do they have the sort of oddities that afflict deontological views (e.g. don't lie to the murderer at the door)?

Michael St Jules 🔸

4y

5

There's a paper by Tarsney on solipsistic swamping for some specific social welfare functions, like average utilitarianism, just considering moral patients on our Earth so far: https://www.tandfonline.com/doi/full/10.1080/00048402.2021.1962375

Your utility function can instead be bounded wrt the difference you make relative to some fixed default distribution of outcomes ("doing nothing", or "business as usual") or in each pairwise comparison (although I'm not sure this will be well-behaved). For example, take all the differences in welfare between the two random variable outcomes corresponding to two options, apply some bounded function of all of these differences, and finally take the expected value.
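One reading of the recipe above, as a hedged sketch: compare two options state by state, squash each welfare difference with a bounded function, and take the expectation. The use of `tanh`, the `scale` parameter, and the example numbers are all assumptions made for illustration.

```python
import math

def bounded_difference_value(option_a, option_b, probabilities, scale=1.0):
    """Expected value of a bounded function of statewise welfare differences (A minus B)."""
    return sum(
        p * math.tanh((a - b) / scale)
        for a, b, p in zip(option_a, option_b, probabilities)
    )

# Hypothetical welfare totals in three equally likely states of the world.
probabilities = [1 / 3, 1 / 3, 1 / 3]
business_as_usual = [10.0, 20.0, 30.0]
intervention = [12.0, 20.0, 31.0]

score = bounded_difference_value(intervention, business_as_usual, probabilities)
print(score)  # positive means the intervention is favored over the default
```

Only the differences enter the calculation, so any background welfare shared by both options cancels out, which is the feature that keeps the size of the universe from mattering.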

Derek Shiller

4y

9

Your utility function can instead be bounded wrt the difference you make relative to some fixed default distribution of outcomes ("doing nothing", or "business as usual") or in each pairwise comparison (although I'm not sure this will be well-behaved). For example, take all the differences in welfare between the two random variable outcomes corresponding to two options, apply some bounded function of all of these differences, and finally take the expected value.

Consider the following amended thought experiment: (changes in bold)

Walking home one night from a lecture on astrophysics where you learned about the latest research establishing the massive size of the universe, you come across a child drowning in a pond. The kid is kicking and screaming trying to stay above the water. You can see the terror in his eyes and you know that it's going to get painful when the water starts filling his lungs. You see his mother, off in the distance, screaming and running. Something just tells you she'll never get over this. It will wreck her marriage and her career. There are two buttons near you. Pressing either will trigger an event that adds really good lives to the universe. (The buttons will create the exact same lives and only function once.) The second also causes a life preserver to be tossed to the child. The second button is slightly further from you, and you'd have to strain to reach it. And there's a real small chance that solipsism is true, in which case your whims matter much more (we're not near the bounds) and satisfying them will make a much bigger difference to total value. The altruistic thing to do is to not make the additional effort to reach the further button, which could be mildly unpleasant, even though it very likely means the kid will die an agonizing death and his mother will mourn for decades.

Michael St Jules 🔸

4y

3

Good example! At least this isn't solipsistic egoism, but I agree the results seem too egoistic.

What you could do is rearrange the two probability distributions of aggregate welfares statewise in non-decreasing order (or in a way that minimizes some distance between the two distributions), take the difference between the two resulting random variables, apply a bounded monotonically increasing function to the difference, and then take the expected value.

Unfortunately, I suspect this pairwise comparison approach won't even be transitive.
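To make the rearrangement proposal concrete, here is a minimal sketch. It assumes each option is represented by a list of aggregate welfares in equally likely sampled states, and it uses `tanh` as the bounded increasing function. Both choices, and the names `bounded` and `difference_making_value`, are illustrative assumptions and not something specified in the comment.

```python
import math


def bounded(x, scale=1.0):
    # tanh is an assumed choice: bounded in (-1, 1), strictly increasing, odd.
    return math.tanh(x / scale)


def difference_making_value(welfare_a, welfare_b, scale=1.0):
    """Score option A against option B.

    Each list holds aggregate welfare in equally likely sampled states.
    Sorting both lists pairs the outcomes quantile by quantile, which is
    the non-decreasing rearrangement for equiprobable states.
    """
    if len(welfare_a) != len(welfare_b) or not welfare_a:
        raise ValueError("need two non-empty samples of equal size")
    pairs = zip(sorted(welfare_a), sorted(welfare_b))
    # Expected value of the bounded difference over the paired states.
    return sum(bounded(a - b, scale) for a, b in pairs) / len(welfare_a)
```

Because the function is applied to each difference before averaging, a single astronomically good state can contribute at most its probability to the score, so a tiny chance of a huge payoff cannot swamp the comparison.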

Michael St Jules 🔸

4y

2

Given an intransitive relation over options (distributions over outcomes), you can use voting methods like beatpath to define a similar transitive relation or choose among options even when there's intransitivity in a choice set. Using beatpath on the specific actual option sets you face in particular will mean violating the independence of irrelevant alternatives, which I'm pretty okay with giving up, personally.

This is done in this paper:

https://globalprioritiesinstitute.org/teruji-thomas-the-asymmetry-uncertainty-and-the-long-term/

You could apply beatpath to the set of all conceivable options, even those not actually available to you in a given choice situation, but I imagine you'll get too much indifference or incomparability.
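For readers unfamiliar with beatpath (the Schulze method), the sketch below shows one way it can turn a cyclic pairwise relation into a choice. It assumes the pairwise comparison is summarised as a matrix `margin`, where `margin[i][j] > 0` means option `i` beats option `j` with that strength. The matrix encoding and the function name are assumptions for illustration, and this is not necessarily how the linked paper sets things up.

```python
def beatpath_winners(margin):
    """margin[i][j] > 0 means option i beats option j by that strength."""
    n = len(margin)
    # Direct path strengths: keep only the winning direction of each pair.
    strength = [
        [margin[i][j] if i != j and margin[i][j] > margin[j][i] else 0
         for j in range(n)]
        for i in range(n)
    ]
    # Widest-path variant of Floyd-Warshall: a path is as strong as its
    # weakest link, and we keep the strongest path between each pair.
    for k in range(n):
        for i in range(n):
            if i == k:
                continue
            for j in range(n):
                if j == i or j == k:
                    continue
                via_k = min(strength[i][k], strength[k][j])
                if via_k > strength[i][j]:
                    strength[i][j] = via_k
    # An option wins if no other option has a strictly stronger path to it.
    return [
        i for i in range(n)
        if all(strength[i][j] >= strength[j][i] for j in range(n) if j != i)
    ]
```

With a cycle of unequal strengths, such as A beating B by 3, B beating C by 2, and C beating A by 1, the weakest link is overridden and A is selected. With a perfectly symmetric cycle every option ties, which is a small-scale version of the indifference worry raised above.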

Owen Cotton-Barratt

4y

2

Re. non-consequentialist stuff, I notice that I expect societies to go better if people have some degree of extra duty towards (or caring towards) those closer to them. That could be enough here?

(i.e. Boundedly rational agents shouldn't try to directly approximate their best guess about the global utility function.)

Will Aldred

4y

10

There's a bit of a history of estimating how low the probabilities are that we can ignore.

See also "Exceeding Expectations: Stochastic Dominance as a General Decision Theory" (Tarsney, 2020); West (2021) summarises this Tarsney paper, which is pretty technical. A key sentence from West:

Tarsney argues that we should use an alternative decision criterion called stochastic dominance which agrees with EV in non-Pascallian situations, but, when combined with the above argument about uncertainty, disagrees with EV in Pascallian ones.

Ramiro

4y

6

This reminds me of Ole Peters' alternative time resolution of the St. Petersburg paradox. I'd really appreciate more summaries of abstruse technical papers on alternatives to expected utility in weird scenarios.

Jacy

4y

8

This is super interesting. Thanks for writing it. Do you think you're conflating several analytically distinct phenomena when you say (i) "Fanaticism is the idea that we should base our decisions on all of the possible outcomes of our actions no matter how unlikely they are ... EA fanatics take a roughly maximize expected utility approach" and (ii) "Fanaticism is unreasonable"?

For (i), I mainly have in mind two approaches "fanatics" could be defined by: (ia) "do a quick back-of-the-envelope calculation of expected utility and form beliefs based solely on its output," and (ib) "do what you actually think maximizes expected utility, no matter whether that's based on a spreadsheet, heuristic, intuition, etc." I think (ia) isn't something basically anyone would defend, while (ib) is something I and many others would (and it's how I think "fanaticism" tends to be used). And for (ib), we need to account for heuristics like, (f) quick BOTE calculations tend to overestimate the expected utility of low probabilities of high impact, and (g) extremely large and extremely small numbers should be sandboxed (e.g., capped in the influence they can have on the conclusion). This is a (large) downside of these "very weird projects," and I think it makes the "should support" case a lot weaker.

For (ii), I mainly have in mind three claims about fanaticism: (iia) "Fanaticism is unintuitive," (iib) "Fanaticism is absurd (a la reductio ad absurdum)," and (iic) "Fanaticism breaks some utility axioms." These each have different evidence. For example, (iia) might not really matter if we don't think our intuitions, which have been trained through evolution and life experience, are reliable for such unusual questions like maximizing long-run aggregate utility.

Did you have some of these in mind? Or maybe other operationalizations?

Derek Shiller

4y

2

I meant to suggest that our all-things-considered assignments of probability and value should support projects like the ones I laid out. Those assignments might include napkin calculations, but if we know we overestimate those, we should adjust accordingly.

(g) extremely large and extremely small numbers should be sandboxed (e.g., capped in the influence they can have on the conclusion)

This sounds to me like it is in line with my takeaways. Perhaps we differ on the grounds for sandboxing? Expected value calculations don't involve capping influence of component hypotheses. Do you have a take on how you would defend that?

For (ii), I mainly have in mind three claims about fanaticism: (iia) "Fanaticism is unintuitive," (iib) "Fanaticism is absurd (a la reductio ad absurdum)," and (iic) "Fanaticism breaks some utility axioms."

I don't mean to say that fanaticism is wrong. So please don't read this as a reductio. Interpreted as a claim about rationality, I largely am inclined to agree with it. What I would disagree with is a normative inference from its rationality to how we should act. Let's not focus less on animal welfare or global poverty because of farfetched high-value possibilities, even if it would be rational to do so.

Ramiro

4y

8

Thanks for the post - I'd like to see more people thinking about the consequences of "fanaticism". But I should note that discussions about Pascal's Wagers have been running for a long time among rationalists, decision theorists and philosophers - and even in this community.

I disagree a bit with the conclusions. Sorry if this is too brief:

(1) is probably right, but I'm not sure this can be based on the reductio presented in this post;

(2) is probably wrong. I think the best theory of rationality probably converges with the best theory of reasonableness - it would show why bounded rational cooperators should display this trait. But it's an interesting distinction to have in mind.

(3) I guess most consequentialists would agree that expected utility shouldn't always guide behaviour directly. They might distinguish between the correctness of an action (what you should do) and its value; your case fails to point out why the value of a Pascal's Wager-like scenario shouldn't be assessed with expected utility. But what's really weird is that the usual alternative to expected value is common sense deontic reasoning, which is often taken to claim that you should / could do a certain action A, no matter what its consequences or chances are: pereat mundus, fiat justitia. I fail to see why this shouldn't be called "fanatical", too.

(4) I'm very inclined to agree with this when we are dealing with very uncertain subjective probability distributions, and even with objective probabilities with very high variance (like Saint Petersburg Paradox). I'm not sure the same would apply to well-defined frequencies - so I wouldn't proscribe a lottery with a probability of 10^(-12).

That being said, it's been a long time since I last checked on the state of the matter... but the main lesson I learned about PW was that ideas should "pay rent" to be in our heads (I think Yudkowsky mentioned it while writing about a PW's scenario). So the often neglected issue with PW scenarios is that it's hard to account for their opportunity costs - and they are potentially infinite, precisely because it's so cheap to formulate them. For instance, if I am willing to assign a relevant credence to a random person who tries to Pascal-mug me, then not only can I be mugged by anyone, I also have to assign some probability to events like:

The world will become the ultimate Paradise / Hell iff I voice a certain sequence of characters in the next n seconds.

Maybe there's a shy god around waiting for our prayer.

Derek Shiller

4y

3

That being said, it's been a long time since I last checked on the state of the matter... but the main lesson I learned about PW was that ideas should "pay rent" to be in our heads (I think Yudkowsky mentioned it while writing about a PW's scenario). So the often neglected issue with PW scenarios is that it's hard to account for their opportunity costs - and they are potentially infinite, precisely because it's so cheap to formulate them.

Pascal's wager is somewhat fraught, and what you should make of it may turn on what you think about humility, religious epistemology, and the space of plausible religions. What's so interesting about the MWI project is that it isn't like this. It isn't some theory concocted from nothing and assigned a probability. There's at least some evidence that something in the ballpark of the theory is true. And it's not easy to come up with an approximately as plausible hypothesis that suggests that the actions which might cause branchings might instead prevent them, or that alternative choices we have might lead to massive amounts of value in other ways.

If you grant that MWI is coherent, then I think you should be open to the possibility that it isn't unique, and there are other hypotheses that suggest possible projects that are much more likely to create massive amounts of value than prevent it.

Ramiro

4y

3

If you grant that MWI is coherent, then I think you should be open to the possibility that it isn't unique, and there are other hypotheses that suggest possible projects that are much more likely to create massive amounts of value than prevent it.

Actually, I didn't address your argument from MWI because I suspect we couldn't make any difference. Maybe I'm wrong (it's way beyond my expertise), but quantum branching events would be happening all the time, so either (i) there are (or will be) infinite worlds, whatever we do, and then the problem here is more about infinite ethics than Fanaticism, or (ii) there is a limit to the number of possible branches - which I guess will (most likely) be achieved whatever we do. So it's not clear to me that we would gain additional utility by creating more branching events.

[And yet, the modus ponens of one philosopher is the modus tollens of another one... rat/EAs have actually been discussing the potential implications of weird physics: here, here...]

However, I'm not sure the problem I identified with PW's (i.e., take opportunity costs seriously) wouldn't apply here, too... if we are to act conditioned on MWI being true, then we should do the same for every theory that could be true with similar odds. But how strong should this "could" be? Like "we could be living in a simulation"? And how long until you face a "basilisk", or just someone using motivated reasoning?

As Carl Shulman points out, this might be a case of "applying the possibility of large consequences to some acts where you highlight them and not to others, such that you wind up neglecting more likely paths to large consequences."

Michael_Wiebe

4y

5

What do you think of the Bayesian solution, where you shrink your EV estimate towards a prior (thereby avoiding the fanatical outcomes)?
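As a rough illustration of what shrinking an estimate towards a prior can look like, here is a sketch of the standard normal-normal update. It assumes both the prior over a project's value and the error in the explicit EV estimate are normally distributed. The function name and the example numbers are hypothetical and are not taken from the linked proposal.

```python
def shrink_estimate(prior_mean, prior_var, estimate, estimate_var):
    """Combine a normal prior with a noisy normal estimate.

    Returns the posterior mean and variance. The posterior mean is a
    precision-weighted average, so a very noisy estimate barely moves it.
    """
    prior_precision = 1.0 / prior_var
    estimate_precision = 1.0 / estimate_var
    posterior_var = 1.0 / (prior_precision + estimate_precision)
    posterior_mean = posterior_var * (
        prior_mean * prior_precision + estimate * estimate_precision
    )
    return posterior_mean, posterior_var


# A modest, well-measured estimate moves the prior halfway.
modest = shrink_estimate(0.0, 1.0, 10.0, 1.0)
# A huge estimate with a huge error variance is shrunk almost entirely.
fanatical = shrink_estimate(0.0, 1.0, 1e9, 1e18)
```

Under these assumptions the outcome depends entirely on how much error variance one assigns to the explicit calculation, which is the input that is hardest to justify for Pascalian cases.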

Derek Shiller

4y

5

Thanks for sharing this. My (quick) reading is that the idea is to treat expected value calculations not as gospel, but as if they are experiments with estimated error intervals. These experiments should then inform, but not totally supplant, our prior. That seems sensible for GiveWell’s use cases, but I don’t follow the application to Pascal’s mugging cases or better supported fanatical projects. The issue is that they don’t have expected value calculations that make sense to regard as experiments.

Perhaps the proposal is that we should have a gut estimate and a gut confidence based on not thinking through the issues much, and another estimate based on making some guesses and plugging in the numbers, and we should reconcile those. I think this would be wrong. If anything, we should take our Bayesian prior to be our estimate after thinking through all the issues, (but perhaps before plugging in all of the exact numbers). If you’ve thought through all the issues above, I think it is appropriate to allow an extremely high expected value for fanatical projects even before trying to make a precise calculation. Or at least it is reasonable for your prior to be radically uncertain.

Thomas Kwa🔹

4y

4

There are ways to deal with Pascal's Mugger with leverage penalties, which IIRC deal with some problems but are not totally satisfying in extremes.

Michael St Jules 🔸

4y

5

I think it's plausible for symmetric utilitarian views lexically sensitive to the differences between different infinite cardinals of value that reducing extinction risk is among the best ways of achieving cardinally larger infinities of value, since it buys us more time to do so, and plausibly it will be worked on anyway if we don't go extinct.

However, with a major value lock-in event on its way, e.g. AGI or space colonization, increasing the likelihood of and amount of work with which these larger infinities are pursued in the future seems at least as important as reducing extinction risk, since the default amount of resources for it seems low to me, given how neglected it is.

I'd expect that doubling the expected amount of resources used by our descendants to generate higher infinities conditional on non-extinction is about as good as halving extinction risk, and the former is far far more neglected, so easier to achieve.

For fanatical suffering-focused views, preventing such higher infinities would instead be a top priority.

Michael St Jules 🔸

4y

2

If you aggregate before taking differences, conditional on the universe/multiverse already being infinite, larger cardinalities of (dis)utilities should already be pursued with high probability, and without a way to distinguish between different outcomes with the same cardinal number of value-bearers of the same sign, it seems like the only option that makes any difference to the aggregate utility in expectation is aiming to ensure that for a given cardinal, there are fewer than that many utilities that are negative. But I'm not sure even this makes a difference. If you take expectations over the size of the universe before taking differences, the infinities dominate anyway, so you can ignore the possibility of a finite universe.

If you're instead sensitive to the difference you make (i.e. you estimate differences before aggregating, either over individuals or the probability), then pursuing or preventing larger infinities matters again, and quality improvements may matter, too. Increasing or decreasing the probability of the universe/multiverse being infinite at all could still look valuable.

Derek Shiller

4y

1

Is there any plausible path to producing (or even $\aleph_{1}$) amounts of value with the standard metaphysical picture of the world we have? Or are you thinking that we may discover that it is possible and so should aim to position ourselves to make that discovery?

Michael St Jules 🔸

4y

2

Affecting (and $\aleph_{2}$, assuming the continuum hypothesis is false, i.e. $\aleph_{2} \leq |\mathbb{R}|$) utilities seems possible in a continuous spacetime universe with continuous quantum branching but counting and aggregating value discretely, indexing and distinguishing moral patients by branches (among other characteristics), of which there are $|\mathbb{R}|$. I think continuity of the universe is still consistent with current physics, and the Planck scale is apparently not the lowest we can probe in particular (here's a paper making this claim in its abstract and background in section 1; you can ignore the rest of the paper). Of course, a discrete universe is also still consistent with current physics, and conscious experiences and other things that matter are only practically distinguishable discretely, anyway.

I mostly have in mind trying to influence the probability (your subjective probability) that there will be $\aleph_{\alpha}$ moral patients at all under discrete counting or enough of their utilities that an aggregate you use, if any, will be different, and I don't see any particular plausible paths to achieving this with the (or a) standard picture, but I am thinking "we may discover that it is possible and so should aim to position ourselves to make that discovery" and use it. I don't have any particular ideas for affecting strictly more than $|\mathbb{R}|$ moral patients without moving away from the standard picture, either.

Michael St Jules 🔸

4y

2

Are you sure the differences between the versions of the many worlds interpretations aren't really just normative? I think some would just claim you should treat the quantum measure as the actual measure of amounts of value and disvalue over which you aggregate, so everything gets normalized, and you get back to maximizing expected value as if the MWIs are false and there's no branching at all.

For dealing with normative uncertainty, if you're into maximizing expected choiceworthiness and believe in intertheoretic utility comparisons between the different normative interpretations of MWI, then the expanding version of MWI ~~should~~ (EDIT) could dominate, although there are infinite "amplifications" of the more standard measure-based interpretation that could compete or dominate instead, as described in Carl's comment:

Or you can say we're part of a set of parallel universes that don't split but which is as 'large' as the infinite limit of the fastest splitting process.

Amplified theories are discussed further in the Moral Uncertainty book chapter I linked above.

Maximizing expected choiceworthiness is not my preferred way to deal with normative uncertainty, anyway, though; I prefer moral parliament/proportional resource allocation the most, and then structural normalization like variance voting next. But these preferences are largely due to my distaste for fanaticism.

SamuelKnoche

4y

1

My reading of this post is that it attempts to gesture at the valley of bad rationality.

tobycrisford 🔸

4y

1

The problem with neglecting small probabilities is the same problem you get when neglecting small anything.

What benefit does a microlitre of water bring you if you're extremely thirsty? Something so small it is equivalent to zero? Well if I offer you a microlitre of water a million times and you say 'no thanks' each time, then you've missed out! The rational way to value things is for a million microlitres to be worth the same as one litre. The 1000th microlitre doesn't have to be worth the same as the 2000th, but their values have to add to the value of 1 litre. If they're all zero then they can't.

I think the same logic applies to valuing small probabilities. For instance, what is the value of one vote from the point of view of a political party? The chance of it swinging an election is tiny, but they'll quickly go wrong if they assign all votes zero value.
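The additivity point can be checked numerically. The sketch below assumes a concave value function for water (a square root, chosen purely for illustration) and compares summing the true marginal values of a million microlitres with summing them after rounding every small marginal value down to zero. All names and the threshold are hypothetical.

```python
import math

MICROLITRES_PER_LITRE = 1_000_000


def value_of_water(litres):
    # Assumed concave value function: early sips matter more than later ones.
    return math.sqrt(litres)


def marginal_values(n=MICROLITRES_PER_LITRE):
    """Value added by each successive microlitre, up to one litre."""
    step = 1.0 / n
    return [
        value_of_water((i + 1) * step) - value_of_water(i * step)
        for i in range(n)
    ]


def total_value(marginals, ignore_below=0.0):
    """Sum marginal values, treating anything under the threshold as zero."""
    return math.fsum(m for m in marginals if m >= ignore_below)
```

Here the marginal values differ from one microlitre to the next, yet they sum to the value of the full litre. An agent who treats every increment below some threshold as worthless ends up valuing the whole litre at zero.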

I'm not sure what the solution to Pascal's mugging/fanaticism is. It's really troubling. But maybe it's something like penalising large effects with our priors? We don't ignore small probabilities, we instead become extremely sceptical of large impacts (in proportion to the size of the claimed impact).
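One simple way to formalise scepticism in proportion to the claimed impact is a prior credence that falls off as 1/N for a claimed impact of size N, in the spirit of the leverage penalties mentioned elsewhere in this thread. The sketch below is only an illustration: the 1/N form, the baseline numbers, and the function names are all assumptions.

```python
def penalised_credence(claimed_impact, baseline_impact=1.0, baseline_credence=1e-3):
    """Prior credence that a claimed impact is real.

    Claims up to the baseline size get the baseline credence. Larger
    claims are discounted in proportion to how much they exceed it.
    """
    if claimed_impact <= baseline_impact:
        return baseline_credence
    return baseline_credence * baseline_impact / claimed_impact


def expected_value(claimed_impact, **kwargs):
    # Credence times impact; under the 1/N penalty this stays bounded.
    return penalised_credence(claimed_impact, **kwargs) * claimed_impact
```

With this prior no probability is ever rounded to zero, but the expected value of a claim stops growing once the claimed impact passes the baseline, so a mugger gains nothing by naming a bigger number.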
