Summary: EAs who accept fanaticism (the idea that we should pursue long shots with enough expected value) should favor some pretty weird projects. E.g. trying to create quantum branches, converting sinners to the one true religion, or researching other out-there cause areas. This is unreasonable, if not irrational. We should be somewhat wary of expected value calculations for supporting such weird projects.
Fanaticism is the idea that we should base our decisions on all of the possible outcomes of our actions no matter how unlikely they are. Even extremely unlikely outcomes may sway what we should do if they are sufficiently good or bad.
Traditional decision theories tell us to maximize expected utility in some form or other. This is fanatical, for it may be that the actions that maximize expected utility produce inordinate amounts of value at extremely low probabilities. There are other ways to be fanatical, but I'll assume here that EA fanatics take a roughly expected-utility-maximizing approach.
Fanaticism isn't a weird idea. It's a sensible and straightforward way of building a decision theory. But it has weird implications in practice. The weirdness of what fanaticism suggests that EAs should do should make us suspicious of it.
Potential Fanatical EA Projects
1.) Quantum Branching
Some simple versions of the Many Worlds interpretation of quantum mechanics say that the universe regularly branches during quantum events, producing multiple universes that differ from each other in the behavior of the state of some particles. If the universe branches in this way, the value of all subsequent events might multiply. There could be twice as much joy and twice as much suffering every time the universe branches in two.
We have the power to produce these quantum events. We can chain them one after another, potentially doubling, then redoubling, then again redoubling all the value and disvalue in the world. These quantum events happen all the time as well without our deliberate choice, but our decisions would make a difference to how many branchings occur. This gives us the power to pretty trivially exponentially increase the total amount of value (for better or worse) in the world by astronomical numbers.
The interpretations of quantum mechanics that allow for branchings like this are almost surely wrong. Almost. There is a small sliver of chance that this is the right way to think about quantum phenomena. Quantum phenomena are weird. We should be humble. The interpretations are logically coherent. They deserve, I think, at least a one in a quintillion probability of being right (and possibly a lot higher).
A small sliver of probability in a simple Many Worlds interpretation is enough to make the expected value of spending our time and money producing quantum events that might trigger branchings very high. It doesn't much matter how low of a probability you assign. (If you like, add a couple hundred 0s in back of that quintillion, and the expected value of attempting to produce branches will still be enormous.) Doubling the value of our world a thousand times successively would effectively multiply the amount of value by a factor of $2^{1000}$. If this interpretation of quantum mechanics is true, then there must already be a number of branches being continuously created. There must be a ton of value in all of the lives in all those branches. Additional divisions would effectively multiply the number of branches created in the future. Multiplying the amount of value by a factor of $2^{1000}$ would mean we multiply a tremendous amount of value by a huge factor. This suggests an expected utility inconceivably greater than any current EA project.
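The arithmetic behind this claim can be checked in a few lines. What follows is a minimal sketch, not part of the original argument: it works in log space, approximates the expected multiplier as credence times $2^{1000}$ (dropping the negligible term for the case where the hypothesis is false), and reads "a couple hundred 0s" as a credence of $10^{-218}$, which is an assumption. The function name is hypothetical.

```python
import math

def log10_expected_multiplier(log10_probability: float, doublings: int) -> float:
    """Return log10 of (probability * 2**doublings).

    Working in log space keeps the numbers readable, since the raw
    quantities span hundreds of orders of magnitude.
    """
    return log10_probability + doublings * math.log10(2)

# One in a quintillion (10^-18), as in the text, and 1000 successive doublings.
base_case = log10_expected_multiplier(-18, 1000)

# "Add a couple hundred 0s": taken here to mean 10^-218 (an assumed reading).
pessimistic_case = log10_expected_multiplier(-218, 1000)

print(f"credence 1e-18:  expected multiplier ~ 10^{base_case:.0f}")         # ~ 10^283
print(f"credence 1e-218: expected multiplier ~ 10^{pessimistic_case:.0f}")  # ~ 10^83
```

On these assumptions, even the far more pessimistic credence leaves an expected multiplier of roughly $10^{83}$.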
In one of the very first applications of probabilistic decision theory, Pascal argued that we should attempt to believe in religious teachings in case they should make the difference in where we spend our eternal afterlife. Many religions suggest that our actions here on Earth will make an infinitely significant difference to our wellbeing after we die. It is natural to apply this idea to charity. If we really want to help other people, we should aim to secure their eternal afterlife in a good place.
No religion promising eternal damnation or salvation based on our earthly actions is particularly plausible, but they can't be totally ruled out either. Religious views are coherent. Every major religion has some extremely intelligent people who believe in it. It would be irrationally self-confident to not give such religions even a remote chance of being correct.
Insofar as we are concerned with everyone's wellbeing, the prospect of infinite afterlife should make it extremely important that we get as many people into heaven and out of hell as possible, even if we think such outcomes are extremely unlikely. Making the difference for one person would account for a greater difference in value than all of the secular good deeds ever performed. Saving the soul of one individual would be better than forever ending factory farming. It would be better than ensuring the survival of a trillion generations of human beings.
There are significant complications to Pascal's argument: it isn't clear which religion is right, and any choice with infinite rewards on one view may incur infinite punishments on another which are hard to compare. This gets us deep into infinite ethics, a tricky subject.
Whatever we end up doing with them, I still think Pascal was probably right that religious considerations should swamp all known secular considerations. If we substitute sufficiently large finite numbers for the infinite values of heaven and hell, and if considerations aren't perfectly balanced, they will dominate expected utilities.
We should perhaps base our charitable decisions entirely on which religions are the least implausible; which promise the greatest rewards; which have the clearest paths to getting into heaven, etc, and devote time and money to evangelizing.
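The claim that large finite stand-ins will dominate unless considerations are perfectly balanced can be illustrated with a toy calculation. This is only a sketch under made-up numbers: the payoff, the credences, and the secular benchmark are all hypothetical placeholders, and the model assumes exactly two rival views, one rewarding an act and one punishing it by the same amount.

```python
from fractions import Fraction

def net_expected_value(credence_for, credence_against, payoff):
    """Expected value of an act that one view rewards by `payoff`
    and a rival view punishes by the same amount."""
    return (credence_for - credence_against) * payoff

# All numbers below are hypothetical placeholders.
PAYOFF = 10**40    # large finite stand-in for an infinite afterlife
SECULAR = 10**10   # value of the best secular alternative, treated as certain

# Perfectly balanced credences cancel exactly.
balanced = net_expected_value(Fraction(1, 10**12), Fraction(1, 10**12), PAYOFF)

# A one-in-a-trillion tilt toward one view leaves a huge net expected value.
tilted = net_expected_value(Fraction(2, 10**12), Fraction(1, 10**12), PAYOFF)

print(balanced)          # 0
print(tilted > SECULAR)  # True
```

Exact fractions are used so that the balanced case cancels to exactly zero rather than to floating-point noise.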
3.) Absurdist Research
The previous two proposals sketch ways we might be able to create tremendous amounts of value relatively easily. If the Many Worlds interpretation is correct, creating quantum branches is much easier than outlawing gestation crates. If Jonathan Edwards was right about God, saving a few people from eternal damnation is a lot easier than solving the alignment problem. These proposals involve far-fetched ideas about the way the world works. It may be that by thinking about more and more absurd hypotheticals, we can find other remote possibilities with even greater payoffs.
Searching through absurd hypotheticals for possible cause areas is an extremely neglected task. No one, as far as I'm aware, is actively trying to work out what prospects there are for producing inordinate amounts of value at probabilities far less than one in a trillion. Human intellectual activity in general has a strong bias towards figuring out what we have strong evidence for, not for figuring out what we can't conclusively rule out. We don't have good epistemic tools for distinguishing one in a trillion hypotheses from one in a googol hypotheses, or for saying when considerations are perfectly balanced for and against and when there are minuscule reasons to favor some options over others.
The research needed to identify remote possibilities for creating extraordinarily large amounts of value itself could be treated as a cause area, for it is only after such possibilities are recognized that we can act on them. If there is a one in a quintillion probability that we find a proposal meriting a one in a quintillion probability, that, if true, would entail that we can trivially exponentially raise the value of the universe, it is worth devoting all our fanatical attention to looking for it.
There are some reasons to be optimistic. There are a huge number of possibilities and the vast majority are extremely unlikely and have never before been considered. The recognized values of available charitable projects are generally pretty small in the grand scheme of things. There may be ways, such as with duplication via branching or with creating whole new universes, to produce vast amounts of value. If there are such remote possibilities, then they could easily dominate expected utilities.
1.) Fanaticism is unreasonable
I think this is pretty clear from the above examples. I feel reasonably confident that fanatical EAs should be working on one of those three things -- certainly not anything mainstream EAs are currently doing -- and I lean toward absurdist research. Maybe I've mistaken how plausible these projects are, or there are some better options I'm missing. The point is, the fanatic's projects will look more like these than space governance or insect suffering. The point is not just that fanatical EAs would devote some time to these absurd possibilities, but rather they are the only things that fanatical EAs would see as worth pursuing.
2.) Rationality can be unreasonable
Isaacs, Beckstead & Thomas, and Wilkinson point out how weird it would be to adopt a complete and consistent decision theory that wasn't fanatical. It would involve making arbitrary distinctions between minute differences of the probability of different wagers or evaluating packages of wagers differently than one evaluates the sum of the wagers individually. Offered enough wagers, non-fanatics must make some distinctions that they will be very hard-pressed to justify.
I take it that a rational decision procedure must be complete and consistent. If you're rational, you have a pattern of making decisions that is coherent come what wagers may. That pattern can't involve arbitrary differences, such as refusing a wager at one probability for one penny while accepting the same wager at .0000000000001% greater probability at the cost of your whole life savings. Isaacs et al. are right that it is rational to follow a decision procedure that is fanatical and irrational to follow a decision procedure that is not.
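The kind of arbitrary difference at issue can be made concrete with a toy decision rule. This is a sketch of one simple way to be non-fanatical (a hard probability cutoff), not a rule that any of the cited authors endorse; the threshold, reward, and costs are hypothetical numbers chosen only to expose the discontinuity.

```python
from fractions import Fraction

def accepts(probability, reward, cost, threshold):
    """A non-fanatical rule: treat any probability below `threshold` as zero,
    otherwise accept exactly when expected reward exceeds cost."""
    if probability < threshold:
        return False
    return probability * reward > cost

# Hypothetical numbers chosen only to show the discontinuity.
THRESHOLD = Fraction(1, 10**9)
REWARD = 10**20
PENNY = Fraction(1, 100)
LIFE_SAVINGS = 10**6

# A probability that falls short of the cutoff by only 10^-15.
just_below = THRESHOLD - Fraction(1, 10**15)

print(accepts(just_below, REWARD, PENNY, THRESHOLD))        # False
print(accepts(THRESHOLD, REWARD, LIFE_SAVINGS, THRESHOLD))  # True
```

The rule turns down the wager for a penny and then stakes a million on a nearly identical one, which is the pattern described above.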
However, I don't think this challenges the fact that it is clearly unreasonable to be fanatical. If you must decide between devoting your life to spreading the gospel for some religion that you think is almost certainly wrong and making an arbitrary distinction between two remote wagers that you will never actually be offered, the reasonable way to go is the latter.
This shows that sometimes it is unreasonable to be rational. There are plenty of cases where it is unfortunate to be rational (e.g. Newcomb's paradox). This goes a step further. Reasonability and rationality are separate concepts that often travel together, but not always.
3.) Expected value shouldn't determine our behavior
Where rationality and reasonability come apart, I'd rather be reasonable, and I hope you would too. Insofar as fanaticism is unreasonable, we should ignore some small probabilities. We shouldn't work on these projects. We should also be wary about more benign appeals to very low-probability but high-value possibilities. There is no obvious cutoff where it becomes reasonable to ignore small probabilities. We should probably not ignore probabilities on the scale of one in a thousand. But one in a million? One in a billion?
4.) We should ignore at least some probabilities on the order of one in a trillion, no matter how much value they promise
There's a bit of a history of estimating how low the probabilities are that we can ignore.
I'm not sure precisely how plausible the simplistic Many Worlds interpretation or evangelical religions are, but I can see a case to be made that the relevant probabilities are as high as one in a trillion. Even so, I think it would be unreasonable to devote all of EA's resources to these projects. It follows that at least some probabilities on that order should be ignored.
It doesn't follow from the fact that we should ignore some one in a trillion probabilities that we should ignore all probabilities on that order, but I'd hope there would be a good story about why some small probabilities should be ignored and some equally small probabilities shouldn't.
That story might distinguish between probabilities that are small because they depend on absurd metaphysical postulates and probabilities that are small because they depend upon lots of mundane possibilities turning out just right, but I worry that drawing such a distinction is really just a way for us to save face. We don't want to have to evangelize (at least I don't), so we tell a story that lets us off the hook.
A more promising alternative might distinguish between kinds of decisions that humans must make over and over, where the outcomes are independent and decisions that are one-offs or dependent on each other. Collectively ignoring relatively small probabilities on a large number of independent wagers will very likely get us into trouble. Small probabilities add up.
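How quickly small probabilities add up across independent wagers follows from the standard formula $1 - (1 - p)^n$. The sketch below assumes identical, fully independent wagers, which is an idealization, and the function name is hypothetical. It uses `log1p` and `expm1` because computing $(1 - p)^n$ directly loses precision when $p$ is tiny.

```python
import math

def prob_at_least_one(p: float, n: int) -> float:
    """Probability that at least one of n independent wagers, each with
    success probability p, pays off: 1 - (1 - p)**n, computed stably."""
    return -math.expm1(n * math.log1p(-p))

# A single one-in-a-trillion wager is negligible on its own.
print(prob_at_least_one(1e-12, 1))

# A trillion such wagers: the chance that at least one pays off is about 63%.
print(prob_at_least_one(1e-12, 10**12))
```

So a policy of ignoring each one-in-a-trillion wager individually amounts, over a trillion independent wagers, to ignoring something more likely than not.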
More formally: for any wager with probability greater than 0 and a finite cost, there is a possible reward value for winning that makes it rational to accept the wager. The terminology comes from Hayden Wilkinson's In Defense of Fanaticism. See also Smith's Is Evaluative Consistency a Requirement of Rationality, Isaacs's Probabilities cannot be rationally neglected, Monton's How to Avoid Maximizing Expected Utility, Beckstead and Teruji Thomas's A paradox for tiny probabilities and enormous values, and Russell's On Two Arguments for Fanaticism. Kokotajlo's sequence on remote possibilities is also great. ↩︎
If value is bounded by a ceiling, expected utility maximization doesn't entail fanaticism. There may be nothing that could occur at a small probability that would be sufficiently valuable to be worth paying some cost. Bounded value functions for moral values are rather strange, however, and I don't think this is a plausible way to get around the issues. ↩︎
We might discount the expected value of low probability prospects, but only by a reasonable factor. Even quite generous discounting will allow us to draw unreasonable conclusions from fanaticism. ↩︎
See Schwitzgebel's How To Create Immensely Valuable New Worlds By Donning Your Sunglasses ↩︎
There are different ways we could evaluate separate branches that are independent of how we think of the metaphysics of these branches. It is plausible, but not obvious, that we should treat the separate branches in the same way we treat separate situations in the same branch. ↩︎
There would be differences between the two branches, which might grow quite large as time goes on. So bifurcation wouldn't strictly speaking double all value, but on average we should expect bifurcations to approximately double value. ↩︎
One in a quintillion is equivalent to getting three one in a million results in a row. If we think that there is a one in a million chance that the Many Worlds interpretation is true, and a one in a million chance that, given that, the simple version formulated here is true, and a one in a million chance that, given both, value in such universes would effectively double after division, then we should allow this hypothesis a one in a quintillion probability. ↩︎
Or thwarting, for the pessimists. ↩︎
Skeptical of these numbers? There's an argument against even pausing to consider where the argument goes wrong. Each additional bifurcation makes the universe so much better. At best you figure out that it isn't worth your time and are down a few hours. At worst you miss a chance to multiply all the value on Earth many times over. ↩︎
The traditional route to rejecting fanaticism comes by way of evaluating the St. Petersburg game. I find the examples here more convincing since they don't rely on infinite structures of payoffs and they are genuine options for us. ↩︎
Wilkinson suggests positronium research on the grounds that it might some day enable an infinite amount of computation, letting us produce an infinite number of good lives. I'm not sure if this was intended as a serious proposal, but it strikes me as less promising than the proposals I put forward here. Even if it is possible, there’s a case to be made that it is better to create many quantum branches with the expectation we’ll figure out positronium computers in a bunch of them. ↩︎
Even if you try to follow an unbounded utility function (which has deep mathematical problems, but set those aside for now) these don't follow.
Generally the claims here fall prey to the fallacy of unevenly applying the possibility of large consequences to some acts where you highlight them and not to others, such that you wind up neglecting more likely paths to large consequences.
For instance, in an infinite world (including infinities created by infinite branching faster than you can control) with infinite copies of you, any decision, e.g. eating an apple, has infinite consequences on decision theories that account for the fact that all must make the same (distribution of) decisions. If perpetual motion machines or hypercomputation or baby universes are possible, then making a much more advanced and stable civilization is far more promising for realizing things related to that than giving in to religions where you have very high likelihood ratios that they don't feed into cosmic consequences.
Any plan for infinite/cosmic impact that has an extremely foolish step in it (like Pascal's Mugging) is going to be dominated by less foolish plans.
There will still be implications of unbounded utility functions that are weird and terrible by the standards of other values, but they would have to follow from the most sophisticated analysis, and wouldn't have foolish instrumental irrationalities or uneven calculation of possible consequences.
A lot of these scenarios are analogous to someone caricaturing the case for aid to the global poor as implying that people should give away all of the food they have (sending it by FedEx) to famine-struck regions, until they themselves starve to death. Yes, cosmopolitan concern for the poor can elicit huge sacrifices of other values like personal wellbeing or community loyalty, but that hypothetical is obviously wrong on its own terms as an implication.
Could you be more specific about the claims that I make that involve this fallacy? This sounds to me like a general critique of Pascal's mugging, which I don't think fits the case that I've made. For instance, I suggested that the simple MWI has a probability ~ $1/10^{18}$ and would mean that it is trivially possible if true to generate $2^{1000}v$ in value, where $v$ is all the value currently in the world. The expected value of doing things that might cause 1000 successive branchings is ~$10^{283}v$ where $v$ is all the value in the world. Do you think that there is a higher probability way to generate a similar amount of value?
I suppose your point might be something like, absurdist research is promising, and that is precisely why we need humanity to spread throughout the stars. Just think of how many zany long-shot possibilities we'll get to pursue! If so, that sounds fair to me. Maybe that is what the fanatic would want. It's not obvious that we should focus on saving humanity for now and leave the absurd research for later. Asymmetries in time might make us much more powerful now than later, but I can see why you might think that. I find it a rather odd motivation though.
I give the MWI a probability of greater than 0.5 of being correct, but as far as I can tell, there isn't any way to generate more value out of it. There isn't any way to create more branches. You only can choose to be intentional and explicit about creating new identifiable branches, but that doesn't mean that you've created more branches. The branching happens regardless of human action.
Someone with a better understanding of this please weigh in.
Here's one application. You posit a divergent 'exponentially splitting' path for a universe. There are better versions of this story with baby universes (which work better on their own terms than counting branches equally irrespective of measure, which assigns ~0 probability to our observations).
But in any case you get some kind of infinite exponentially growing branching tree ahead of you regardless. You then want to say that having two of these trees ahead of you (or a faster split rate) is better. Indeed, on this line you're going to say that something that splits twice as fast is so much more valuable as to drive the first tree to ~nothing. Our world very much looks not-optimized for that, but it could be, for instance, a simulation or byproduct of such a tree, with a constant relationship of such simulations to the faster-expanding tree (and any action we take is replicated across the endless identical copies of us therein).
Or you can say we're part of a set of parallel universes that don't split but which is as 'large' as the infinite limit of the fastest splitting process.
Personally, I think we should have a bounded social welfare function (and can't actually have an unbounded one), but place finite utility on doing a good job picking low-hanging fruit on these infinite scope possibilities. But that's separate from the question of what efficient resource expenditure on those possibilities looks like.
(Edited to remove some bits.)
Pursuing (or influencing others to pursue) larger cardinal numbers of value, e.g. creating or preventing the existence of ℵ5 possible beings, seems sufficiently neglected relative to extinction risk reduction and the chances of value-lock-in are high enough that increasing or decreasing the expected amount of resources used to generate such higher cardinals of (dis)value or improving their quality conditional on an advanced stable civilization looks at least roughly as promising as extinction risk reduction for a scope-sensitive expected value maximizer. (However, plausibly you should just be indifferent to everything, if you aggregate value before taking differences rather than after.)
Mod note: I've enabled agree-disagree voting on this thread. This is the EA Forum's first experiment with this feature, which was developed by the LessWrong team (thanks!). I'm very interested in your feedback. Leave a comment here, or email me.
I noticed this on my own comment, came back to look at other comments here, and can say I'm already confused. But I already said I was against the idea, and maybe it's just about getting used to change.
So as more practical feedback: the meaning of the different karma types could be explained better in the hover texts. Currently they're presented as "agreement" vs. "overall" karma - it's not clear what the latter means. And the "agreement" hover text basically tries to explain both:
I would only put "How much do you agree with this?" there, and put "How good is this comment?" (or maybe some clearer but short explanation) in the regular karma hover text.
Just to clarify: is the core argument here roughly, "I'm suspicious of things that look like a Pascal's mugging"?
If this is your argument, then I agree with you (to an extent). But reading through your examples, I feel unsure whether the Pascal's mugging aspect is your crux, or if the weirdness of the conclusion is your crux. To test this concretely: if we were close to 100% confident that we do live in a universe where we could, e.g., produce quantum events that trigger branchings, would you want a lot of effort going into triggering such branchings? (For what it's worth, I would want this.)
I don't love the EA/rationalist tendency to dismiss long shots as Pascal's muggings. Pascal's mugging raises two separate issues: 1) what should we make of long shots with high expected value? and 2) what evidence does testimony by itself provide to highly implausible hypotheses (particularly compared with other salient possibilities). Considerations around (2.) seem sufficient to be wary of Pascal's mugging, regardless of what you think of (1.).
I definitely think that if you were 100% confident in the simple MWI view, that should really dominate your altruistic concern. Every time the world splits, the number of pigs in gestation crates (at least) doubles! How can you not see that as something you should really care about? It might be a lonely road, but how can you pass up such high returns? (Of course it is bad for there to be pigs in gestation crates -- I assume it is outweighed by good things, but those good things must be really good to outweigh such bads, so we should really want to double them. If they're not outweighed we should really try to stop branchings)
For what it's worth, I think I'd be inclined to think that the simple MWI should dominate our considerations even at a 1 in a thousand probability. Not sure about the 1 in a million range.
I think this post is the result of three motivations.
1.) I think the expected value of weird projects really is ludicrously high. 2.) I don't want to work on them, or feel like I should be working on them. I get the impression that many, even most, EAs would agree. 3.) I'd bet I'm not going to win a fight about the rationality of fanaticism with Yoaav Isaacs or Hayden Wilkinson.
TBH I don't think this makes sense. Every decision you make in this scenario, including the one to promote or stop branching, would be a result of some quantum processes (because everything is a quantum process), so the universe where you decided to do it would be complemented by one where you didn't. None of your decisions have any effect on the amount of suffering etc., if it's taken as a sum over universes.
If you google terms like "measure," "reality fluid" or "observer fluid" you find long discussions on Lesswrong related to how "the number of pigs in gestation crates (at least) doubles!" is probably a confused way of thinking. I don't understand these issues at all, but there's definitely a rabbit hole to delve into from here.
Sure, but how small is the probability that it isn't? It has to be really small to counteract the amount of value doubling would provide.
Ah, reading your post and comments more closely, I realize you're aware of the picture probably being a different one, but, in your example, you focus on "branching doubles the things that matter" because it leads to these fanatical conclusions. That makes sense.
It depends what you compare it to. Sure, if you compare a case where no branching happens at all (i.e., no MWI) and one in which branching happens and you treat it as "branching doubles the amount of stuff that matters," then yes, there's a wager in favor of the second.
However, if you compare "MWI where branching doubles the amount of stuff that matters" to "MWI where there's an infinite sea of stuff and within that sea, there's objective reality fluid or maybe everything's subjective and something something probabilities are merely preferences over simplicity," then it's entirely unclear how to compare these two pictures. (Basically, the pictures don't even agree on what it means to exist, let alone how to have impact.)
I'm not sure I really understand the response. Is it that we shouldn't compare the outcomes between, say, a Bohmian interpretation and my simplistic MW interpretation, but between my simplistic MW interpretation and a more sophisticated and plausible MW interpretation, and those comparisons aren't straightforward?
If I've got you right, this seems to me to be a sensible response. But let me try to push back a little. While you're right that it may be difficult to compare different metaphysical pictures considered as counterfactual, I'm only asking you to compare metaphysical pictures considered as actual. You know how great it actually is to suck on a lollipop? That's how great it is to suck on a lollipop whether you're a worm navigating through branching worlds or a quantum ghost whose reality is split across different possibilities or a plain old Bohmian hunk of meat. Suppose you're a hunk of meat, how great would it be if you were instead a worm? Who knows and who cares! We don't have to make decisions for metaphysical possibilities that are definitely not real and where sucking on a lollipop isn't exactly this great.
I’m not saying you can’t compare those two. You can – the simplistic MW interpretation will win because it has more impact at stake, as you say, so it wins under expected utility theory, even if you assign it low credence.
However, if you’re going down the road of “what speculative physics interpretation produces the largest utilities under expected utility theory?” you have to make sure to get the biggest one, the one where the numbers grow the most. This is Carl's point above. My point is related, it's that it seems more plausible for there to be infinite* branches** already if we're considering the many worlds interpretation, as opposed to branching doubling the amount of stuff that matters.
So, comparing infinite many worlds to your many worlds with some finite but ever-growing number of branches, it seems unclear which picture to focus on as expected utility maximizers. If there's an infinite sea of worlds/branches all at once and all our actions have infinite consequences across infinite copies of ourselves in different worlds/branches, that's more total utility at stake than in your example, arguably. I say "arguably" because the concept of infinity is contested by some, there's the infinitarian paralysis argument that says all actions that affect infinities of the same order are of equal value, and there are philosophical issues around what it could possibly mean for something to "exist" if there's an infinite number of everything you can logically describe (this goes slightly further than many worlds – "if everything we can coherently describe can exist, what would it even mean for something not to exist? Can some things exist more than others?").***
In short, the picture becomes so strange that "Which of these speculative physics scenarios should I focus on as an expected utility maximizer?" becomes more a question about the philosophy of "What do we mean by having impact in a world with infinities?" and less about straightforwardly comparing the amounts of utility at stake.
*I might butcher this (I remember there's something about how the probabilities you get for branching "splits" may change based on arbitrary-seeming assumptions about which "basis" to use, or something like that? I found this section on Wikipedia on the preferred basis problem), but I think one argument for infinities in the MWI goes as follows. Say you have a quantum split and it’s 50-50, meaning 50% that the cat in the box is dead, 50% it’s alive. In this situation, it seems straightforward to assume that one original world splits into two daughter worlds. (Or maybe the original splits into four worlds, half with dead cat, half with an alive cat. It's already a bit disconcerting that we maybe couldn't distinguish between one world splitting into two and one world splitting into four?)
Now let’s assume there’s a quantum split, but the observed probabilities are something weird like 2/7. Easy, you say. "Two worlds with a dead cat, five worlds with an alive cat."
Okay. But here comes the point where this logic breaks apart. Apparently, some quantum splits happen with probabilities that are irrational numbers – numbers that cannot be expressed as fractions. Wtf. :S (I remember this from somewhere in Yudkowsky's quantum physics sequence, but here's a discussion on a physics forum where I found the same point. I don't know how reliable that source is.)
**[Even more speculative than the other points above.] Perhaps the concept of "branching" isn't exactly appropriate, and there's some equivalence between the MW quantum multiverse and a single universe with infinite spatial extent, where there are also infinite copies of you with each copy being extremely far apart from each other. (In an infinitely spatially extended universe with fixed physical laws and random initial conditions, macroscopic patterns would start to repeat themselves eventually at a far enough distance, so you'd have infinite exact copies of yourself and infinite nearly-exact copies.) Maybe what we experience/think of as "branching" is just consciousness moments hopping from one subjectively indistinguishable location to the next. This sounds wild, but it's interesting that when you compare different ways for there to be a multiverse, the MWI and the infinitely spatially expanded universe have the same laws of physics, so there's some reason to assume that maybe they're two ways of describing the same thing. By contrast, inflationary cosmology, which is yet another way you can get a "multiverse," would generate universe bubbles with different laws of physics for each bubble. (At least, that's what I remember from the book The Hidden Reality.) (I found a paper that discusses the hypothesis that the infinitely spatially extended multiverse is the same as the quantum multiverse – it references the idea to Tegmark and Aguirre, but I first heard it by Yudkowsky. The paper claims to argue that the idea is false, for what it's worth.)
***To elaborate on "philosophical issues around what it means for something to exist." Consider the weird idea that there might be these infinite copies of ourselves out there, some of which should find themselves in bizarre circumstances where the world isn't behaving predictably. (If there are infinite copies of you in total, you can't really say "there are more copies of you in environments where the furniture doesn't turn into broccoli the next second than there are copies in environments where it does." After all, there are infinite copies in both types of environment!) So, this raises questions like "Why do things generally appear lawful/predictable to us?" and "How much should we care about copies of ourselves that find themselves in worlds where the furniture turns into broccoli?" So, people speculate whether there's some mysterious "reality fluid" that could be concentrated in worlds that are simpler and therefore they appear more normal/predictable to us. (One way to maybe think of this is that the universe is a giant automaton that's being run, and existence corresponds not just whether there's a mathematical description of the patterns that make up you and your environment, but also somehow of "actually being run" or "(relative?) run-time.") Alternatively, there's a philosophical view that we may call "existence anti-realism." We start by noting that the concept of "existence" looks suspicious. David Chalmers coined the term bedrock concepts for concepts that we cannot re-formulate in non-question-begging terminology (terminology from another domain). So these concepts are claimed to be "irreducible." Concepts like "moral" or "conscious" are other contenders for bedrock concepts. 
Interestingly enough, when we investigate purported bedrock concepts, many of them turn out to be reducible after all (e.g., almost all philosophers think concepts like "beautiful" are reducible; many philosophers think moral concepts are reducible; a bunch of philosophers are consciousness anti-realists, etc.) See this typology of bedrock concepts I made, where existence anti-realism is the craziest tier. It takes the sort of reasoning that is common on LessWrong the furthest you can take it. It claims that whether something "exists" is a bit of a confused question, that our answers to it depend on how our minds are built, like what priors we have over worlds or what sort of configurations we care about. I don't understand it, really. But here's a confusing dialogue on the topic.
As I said in my earlier comment, it's a rabbit hole.
Thanks for clarifying! I think I get what you're saying. This certainly is a rabbit hole. But to bring it back to the points that I initially tried to make, I'm kind of struggling to figure out what the upshot would be. The following seem to me to be possible take-aways:
1.) While the considerations in the ballpark of what I've presented do have counterintuitive implications (if we're spawning infinite divisions every second, that must have some hefty implications for how we should and shouldn't act, mustn't it?), fanaticism per se doesn't have any weird implications for how we should be behaving because it is fairly likely that we're already producing infinite amounts of value and so long shots don't enter into it.
2.) Fanaticism per se doesn't have any weird implications for how we should be behaving, because it is fairly likely that the best ways to produce stupendous amounts of value happen to align closely with what commonsense EA suggests we should be doing anyway. (I like Michael St. Jules's approach to this, which says we should promote the long-term future of humanity so we have the chance to research possible transfinite amounts of value.)
3.) These issues are so complicated that there is no way to know what to do if we're going fanatical, so even if trying to create branches appears to have more expected utility than ordinary altruistic actions, we should stick to the ordinary altruistic actions to avoid opening up that can of worms.
What do you think of the Bayesian solution, where you shrink your EV estimate towards a prior (thereby avoiding the fanatical outcomes)?
Thanks for sharing this. My (quick) reading is that the idea is to treat expected value calculations not as gospel, but as if they are experiments with estimated error intervals. These experiments should then inform, but not totally supplant, our prior. That seems sensible for GiveWell’s use cases, but I don’t follow the application to Pascal’s mugging cases or better supported fanatical projects. The issue is that they don’t have expected value calculations that make sense to regard as experiments.
Perhaps the proposal is that we should have a gut estimate and a gut confidence based on not thinking through the issues much, and another estimate based on making some guesses and plugging in the numbers, and we should reconcile those. I think this would be wrong. If anything, we should take our Bayesian prior to be our estimate after thinking through all the issues, (but perhaps before plugging in all of the exact numbers). If you’ve thought through all the issues above, I think it is appropriate to allow an extremely high expected value for fanatical projects even before trying to make a precise calculation. Or at least it is reasonable for your prior to be radically uncertain.
There are ways to deal with Pascal's Mugger with leverage penalties, which IIRC deal with some problems but are not totally satisfying in extremes.
I think it's plausible for symmetric utilitarian views lexically sensitive to the differences between different infinite cardinals of value that reducing extinction risk is among the best ways of achieving cardinally larger infinities of value, since it buys us more time to do so, and plausibly it will be worked on anyway if we don't go extinct.
However, with a major value lock-in event on its way, e.g. AGI or space colonization, increasing the likelihood of and amount of work with which these larger infinities are pursued in the future seems at least as important as reducing extinction risk, since the default amount of resources for it seems low to me, given how neglected it is.
I'd expect that doubling the expected amount of resources used by our descendants to generate higher infinities conditional on non-extinction is about as good as halving extinction risk, and the former is far far more neglected, so easier to achieve.
For fanatical suffering-focused views, preventing such higher infinities would instead be a top priority.
If you aggregate before taking differences, conditional on the universe/multiverse already being infinite, larger cardinalities of (dis)utilities should already be pursued with high probability, and without a way to distinguish between different outcomes with the same cardinal number of value-bearers of the same sign, it seems like the only option that makes any difference to the aggregate utility in expectation is aiming to ensure that for a given cardinal, there are fewer than that many utilities that are negative. But I'm not sure even this makes a difference. If you take expectations over the size of the universe before taking differences, the infinities dominate anyway, so you can ignore the possibility of a finite universe.
If you're instead sensitive to the difference you make (i.e. you estimate differences before aggregating, either over individuals or the probability), then pursuing or preventing larger infinities matters again, and quality improvements may matter, too. Increasing or decreasing the probability of the universe/multiverse being infinite at all could still look valuable.
Is there any plausible path to producing ℵ2 (or even ℵ1) amounts of value with the standard metaphysical picture of the world we have? Or are you thinking that we may discover that it is possible and so should aim to position ourselves to make that discovery?
Affecting ℵ1 (and ℵ2, assuming the continuum hypothesis is false, i.e. ℵ2≤|R|) utilities seems possible in a continuous spacetime universe with continuous quantum branching but counting and aggregating value discretely, indexing and distinguishing moral patients by branches (among other characteristics), of which there are |R|. I think continuity of the universe is still consistent with current physics, and the Planck scale is apparently not the lowest we can probe in particular (here's a paper making this claim in its abstract and background in section 1; you can ignore the rest of the paper). Of course, a discrete universe is also still consistent with current physics, and conscious experiences and other things that matter are only practically distinguishable discretely, anyway.
I mostly have in mind trying to influence the probability (your subjective probability) that there will be ℵα moral patients at all under discrete counting or enough of their utilities that an aggregate you use, if any, will be different, and I don't see any particular plausible paths to achieving this with the (or a) standard picture, but I am thinking "we may discover that it is possible and so should aim to position ourselves to make that discovery" and use it. I don't have any particular ideas for affecting strictly more than |R| moral patients without moving away from the standard picture, either.
See also "Exceeding Expectations: Stochastic Dominance as a General Decision Theory" (Tarsney, 2020); West (2021) summarises this Tarsney paper, which is pretty technical. A key sentence from West:
This reminds me of Ole Peters's alternative time resolution of the St. Petersburg paradox. I'd really appreciate more summaries of abstruse technical papers on alternatives to expected utility in weird scenarios.
My own thoughts on this subject.
Also relevant: Impossibility results for unbounded utility functions.
Idk, bounded utility functions seem pretty justifiable to me.* Just slap diminishing returns on everything. Yes, more happy lives are good, but if you already have a googolplex of them, it's not so morally important to make more. Etc. As for infinities, well, I think we need a measure over infinities anyway, so let's say that our utility function is bounded by 1 and -1, with 1 being the case where literally everything that happens across infinite spacetime is as good as possible--the best possible world--and -1 is the opposite, and in between we have various cases in which good things happen with some measure and bad things happen with some measure.
*I totally feel the awkwardness/counterintuitiveness in certain cases, as the papers you link point out. E.g. when it's about suffering. But it feels much less bad than the problems with unbounded utility functions. As you say, it seems like people with unbounded utility functions should be fanatical (or paralyzed, I'd add) and fanatics... well, no one I know is willing to bite the bullet and actually start doing absurdist research in earnest. People might claim, therefore, to have unbounded utility functions, but I doubt their claims.
Big fan of your sequence!
I'm curious how you think about bounded utility functions. It's not something I've thought about much. The following sort of case seems problematic.
That seems really wrong. Much more so than thinking that fanaticism is unreasonable.
Ooof, yeah, I hadn't thought about the solipsism possibility before. If the math checks out then I'll keep my bounded utility function but also maybe add in some nonconsequentialist-ish stuff to cover this case and cases like it. (or, you can think of it as just specifying that the utility function should assign significant negative utility to you doing unvirtuous acts like this.)
That said, I'm skeptical that the math works out for this example. Just because the universe is very big doesn't mean we are very near the bound. We'd only be very near the bound if the universe was both very big and very perfect, i.e. suffering, injustice, etc. all practically nonexistent as a fraction of things happening.
So we are probably nowhere near either end of the bound, and the question is how much difference saving one child makes in a very big universe.
For reasons related to noncausal decision theory, the answer is "a small but non-negligible fraction of all the things that happen in this universe depend on what you do in this case. If you save the child, people similar to you in similar situations all across the multiverse will choose to save similar children (or alien children, or whatever)."
The question is whether that small but non-negligible positive impact is outweighed by the maybe-solipsism-is-true-and-me-enjoying-this-ice-cream-is-thus-somewhat-important possibility.
Intuitively it feels like the answer is "hell no" but it would be good to see a full accounting. I agree that if the full accounting says the answer is "yes" then that's a reductio.
Note that the best possible solipsistic world is still vastly worse than the best possible big world.
(Oops, didn't realize you were the same person that talked to me about the sequence, shoulda put two and two together, sorry!)
There's a paper by Tarsney on solipsistic swamping for some specific social welfare functions, like average utilitarianism, just considering moral patients on our Earth so far: https://www.tandfonline.com/doi/full/10.1080/00048402.2021.1962375
Your utility function can instead be bounded wrt the difference you make relative to some fixed default distribution of outcomes ("doing nothing", or "business as usual") or in each pairwise comparison (although I'm not sure this will be well-behaved). For example, take all the differences in welfare between the two random variable outcomes corresponding to two options, apply some bounded function of all of these differences, and finally take the expected value.
Consider the following amended thought experiment: (changes in bold)
Good example! At least this isn't solipsistic egoism, but I agree the results seem too egoistic.
What you could do is rearrange the two probability distributions of aggregate welfares statewise in non-decreasing order (or in a way that minimizes some distance between the two distributions), take the difference between the two resulting random variables, apply a bounded monotonically increasing function to the difference, and then take the expected value.
Unfortunately, I suspect this pairwise comparison approach won't even be transitive.
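The rearrangement procedure described above can be sketched as follows, under strong simplifying assumptions: both options have the same finite number of equally likely states, and `tanh` stands in for the bounded monotonically increasing function. The function name and the example numbers are hypothetical.

```python
import math

def bounded_difference_value(outcomes_a, outcomes_b, scale=1.0):
    """Expected bounded difference (A minus B) after sorting each option's outcomes."""
    if len(outcomes_a) != len(outcomes_b):
        raise ValueError("sketch assumes the same number of equally likely states")
    # Sorting both lists pairs the worst state of A with the worst of B, and so on.
    paired = zip(sorted(outcomes_a), sorted(outcomes_b))
    # tanh caps how much any single paired difference can contribute.
    return sum(math.tanh((a - b) / scale) for a, b in paired) / len(outcomes_a)

# Hypothetical options: a long shot with one astronomical payoff vs. a modest sure thing.
long_shot = [0.0, 0.0, 1e30]
sure_thing = [1.0, 1.0, 1.0]
value = bounded_difference_value(long_shot, sure_thing)
```

A positive result favours the first option and a negative one the second. Here the astronomical payoff contributes at most 1 to the sum, so the sure thing comes out ahead. The sketch says nothing about whether the resulting relation is transitive.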
Given an intransitive relation over options (distributions over outcomes), you can use voting methods like beatpath to define a similar transitive relation or choose among options even when there's intransitivity in a choice set. Using beatpath on the specific actual option sets you face in particular will mean violating the independence of irrelevant alternatives, which I'm pretty okay with giving up, personally.
This is done in this paper:
You could apply beatpath to the set of all conceivable options, even those not actually available to you in a given choice situation, but I imagine you'll get too much indifference or incomparability.
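For concreteness, here is a minimal sketch of beatpath (the Schulze method) applied to a small intransitive set of options. How the pairwise strengths in `margin` would be derived from a bounded pairwise comparison is left open and is an assumption of this sketch, as are the option names and numbers.

```python
def beatpath_winners(options, margin):
    """Return the options that no other option beats via a stronger path.

    margin[(x, y)] > 0 means x is preferred to y, with that strength,
    in the direct pairwise comparison. Missing pairs count as 0.
    """
    strength = {
        (x, y): max(margin.get((x, y), 0.0), 0.0)
        for x in options for y in options if x != y
    }
    # Widest-path variant of Floyd-Warshall: a path is as strong as its weakest link.
    for k in options:
        for i in options:
            if i == k:
                continue
            for j in options:
                if j == i or j == k:
                    continue
                via_k = min(strength[(i, k)], strength[(k, j)])
                if via_k > strength[(i, j)]:
                    strength[(i, j)] = via_k
    return [
        x for x in options
        if all(strength[(x, y)] >= strength[(y, x)] for y in options if y != x)
    ]

# Hypothetical cycle: A beats B, B beats C, C beats A, with different strengths.
cycle = {("A", "B"): 3.0, ("B", "C"): 2.0, ("C", "A"): 1.0}
winners = beatpath_winners(["A", "B", "C"], cycle)  # ["A"]
```

The cycle is broken at its weakest link (C over A). Dropping an option from the choice set can change the winner, which is the violation of independence of irrelevant alternatives mentioned above.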
Re. non-consequentialist stuff, I notice that I expect societies to go better if people have some degree of extra duty towards (or caring towards) those closer to them. That could be enough here?
(i.e. Boundedly rational agents shouldn't try to directly approximate their best guess about the global utility function.)
My thought was that you'd need a large universe consisting of people like us to be very near the bound, otherwise you couldn't use boundedness to get out of assigning a high expected value to the example projects I proposed. There might be ways of finessing the dimensions of boundedness to avoid this sort of concern, but I'm skeptical (though I haven't thought about it much).
I also find it methodologically dubious to adjust your value function to fit what actions you think you should do. It feels to me like your value function should be your value function, and you should adjust your decision rules if they produce a bad verdict. If your value function is bounded, so be it. But don't cut it off to make expected value maximization more palatable.
I can see why you might do this, but it feels strange to me. The reason to save the child isn't because it's a good thing for the child not to drown, but because there's some rule that you're supposed to follow that tells you to save the kid? Do these rules happen to require you to act in ways that basically align with what a total utilitarian would do, or do they have the sort of oddities that afflict deontological views (e.g. don't lie to the murderer at the door)?
This is super interesting. Thanks for writing it. Do you think you're conflating several analytically distinct phenomena when you say (i) "Fanaticism is the idea that we should base our decisions on all of the possible outcomes of our actions no matter how unlikely they are ... EA fanatics take a roughly maximize expected utility approach" and (ii) "Fanaticism is unreasonable"?
For (i), I mainly have in mind two approaches "fanatics" could be defined by: (ia) "do a quick back-of-the-envelope calculation of expected utility and form beliefs based solely on its output," and (ib) "do what you actually think maximizes expected utility, no matter whether that's based on a spreadsheet, heuristic, intuition, etc." I think (ia) isn't something basically anyone would defend, while (ib) is something I and many others would (and it's how I think "fanaticism" tends to be used). And for (ib), we need to account for heuristics like, (f) quick BOTE calculations tend to overestimate the expected utility of low probabilities of high impact, and (g) extremely large and extremely small numbers should be sandboxed (e.g., capped in the influence they can have on the conclusion). This is a (large) downside of these "very weird projects," and I think it makes the "should support" case a lot weaker.
For (ii), I mainly have in mind three claims about fanaticism: (iia) "Fanaticism is unintuitive," (iib) "Fanaticism is absurd (a la reductio ad absurdum)," and (iic) "Fanaticism breaks some utility axioms." These each have different evidence. For example, (iia) might not really matter if we don't think our intuitions, which have been trained through evolution and life experience, are reliable for such unusual questions as maximizing long-run aggregate utility.
Did you have some of these in mind? Or maybe other operationalizations?
I meant to suggest that our all-things-considered assignments of probability and value should support projects like the ones I laid out. Those assignments might include napkin calculations, but if we know we overestimate those, we should adjust accordingly.
This sounds to me like it is in line with my takeaways. Perhaps we differ on the grounds for sandboxing? Expected value calculations don't involve capping influence of component hypotheses. Do you have a take on how you would defend that?
I don't mean to say that fanaticism is wrong. So please don't read this as a reductio. Interpreted as a claim about rationality, I largely am inclined to agree with it. What I would disagree with is a normative inference from its rationality to how we should act. Let's not focus less on animal welfare or global poverty because of farfetched high-value possibilities, even if it would be rational to do so.
Thanks for the post - I'd like to see more people thinking about the consequences of "fanaticism". But I should note that discussions about Pascal's Wagers have been running for a long time among rationalists, decision theorists and philosophers - and even in this community.
I disagree a bit with the conclusions. Sorry if this is too brief:
(1) is probably right, but I'm not sure this can be based on the reductio presented in this post;
(2) is probably wrong. I think the best theory of rationality probably converges with the best theory of reasonableness - it would show why bounded rational cooperators should display this trait. But it's an interesting distinction to have in mind.
(3) I guess most consequentialists would agree that expected utility shouldn't always guide behaviour directly. They might distinguish between the correctness of an action (what you should do) and its value; your case fails to point out why the value of a Pascal's Wager-like scenario shouldn't be assessed with expected utility. But what's really weird is that the usual alternative to expected value is common sense deontic reasoning, which is often taken to claim that you should / could do a certain action A, no matter what its consequences or chances are: pereat mundus, fiat justitia. I fail to see why this shouldn't be called "fanatical", too.
(4) I'm very inclined to agree with this when we are dealing with very uncertain subjective probability distributions, and even with objective probabilities with very high variance (like Saint Petersburg Paradox). I'm not sure the same would apply to well-defined frequencies - so I wouldn't proscribe a lottery with a probability of 10^(-12).
That being said, it's been a long time since I last checked on the state of the matter... but the main lesson I learned about PW was that ideas should "pay rent" to be in our heads (I think Yudkowsky mentioned it while writing about a PW scenario). So the often neglected issue with PW scenarios is that it's hard to account for their opportunity costs - and they are potentially infinite, precisely because it's so cheap to formulate them. For instance, if I am willing to assign a relevant credence to a random person who tries to Pascal-mug me, then not only can I be mugged by anyone, I also have to assign some probability to events like:
The world will become the ultimate Paradise / Hell iff I voice a certain sequence of characters in the next n seconds.
Maybe there's a shy god around waiting for our prayer.
Pascal's wager is somewhat fraught, and what you should make of it may turn on what you think about humility, religious epistemology, and the space of plausible religions. What's so interesting about the MWI project is that it isn't like this. It isn't some theory concocted from nothing and assigned a probability. There's at least some evidence that something in the ballpark of the theory is true. And it's not easy to come up with an approximately as plausible hypothesis that suggests that the actions which might cause branchings might instead prevent them, or that alternative choices we have might lead to massive amounts of value in other ways.
If you grant that MWI is coherent, then I think you should be open to the possibility that it isn't unique, and there are other hypotheses that suggest possible projects that are much more likely to create massive amounts of value than prevent it.
Actually, I didn't address your argument from MWI because I suspect we couldn't make any difference. Maybe I'm wrong (it's way beyond my expertise), but quantum branching events would be happening all the time, so either (i) there are (or will be) infinite worlds, whatever we do, and then the problem here is more about infinite ethics than Fanaticism, or (ii) there is a limit to the number of possible branches - which I guess will (most likely) be achieved whatever we do. So it's not clear to me that we would gain additional utility by creating more branching events.
[And yet, the modus ponens of one philosopher is the modus tollens of another one... rat/EAs have actually been discussing the potential implications of weird physics: here, here...]
However, I'm not sure the problem I identified with PW's (i.e., take opportunity costs seriously) wouldn't apply here, too... if we are to act conditioned on MWI being true, then we should do the same for every theory that could be true with similar odds. But how strong should this "could" be? Like "we could be living in a simulation"? And how long until you face a "basilisk", or just someone using motivated reasoning?
As Carl Shulman points out, this might be a case of "applying the possibility of large consequences to some acts where you highlight them and not to others, such that you wind up neglecting more likely paths to large consequence."
Are you sure the differences between the versions of the many worlds interpretations aren't really just normative? I think some would just claim you should treat the quantum measure as the actual measure of amounts of value and disvalue over which you aggregate, so everything gets normalized, and you get back to maximizing expected value as if the MWIs are false and there's no branching at all.
For dealing with normative uncertainty, if you're into maximizing expected choiceworthiness and believe in intertheoretic utility comparisons between the different normative interpretations of MWI, then the expanding version of MWI could (EDIT: I originally wrote "should") dominate, although there are infinite "amplifications" of the more standard measure-based interpretation that could compete or dominate instead, as described in Carl's comment:
Amplified theories are discussed further in the Moral Uncertainty book chapter I linked above.
Maximizing expected choiceworthiness is not my preferred way to deal with normative uncertainty, anyway, though; I prefer moral parliament/proportional resource allocation the most, and then structural normalization like variance voting next. But these preferences are largely due to my distaste for fanaticism.
My reading of this post is that it attempts to gesture at the valley of bad rationality.
The problem with neglecting small probabilities is the same problem you get when neglecting small anything.
What benefit does a microlitre of water bring you if you're extremely thirsty? Something so small it is equivalent to zero? Well if I offer you a microlitre of water a million times and you say 'no thanks' each time, then you've missed out! The rational way to value things is for a million microlitres to be worth the same as one litre. The 1000th microlitre doesn't have to be worth the same as the 2000th, but their values have to add to the value of 1 litre. If they're all zero then they can't.
I think the same logic applies to valuing small probabilities. For instance, what is the value of one vote from the point of view of a political party? The chance of it swinging an election is tiny, but they'll quickly go wrong if they assign all votes zero value.
I'm not sure what the solution to Pascal's mugging/fanaticism is. It's really troubling. But maybe it's something like penalising large effects with our priors? We don't ignore small probabilities, we instead become extremely sceptical of large impacts (in proportion to the size of the claimed impact).
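A toy version of that last suggestion, as a sketch only: assume our credence in a claim falls off in inverse proportion to the impact it promises, once the impact exceeds some "typical" size. The function name, the 1/impact form and the parameter values are all assumptions made for illustration.

```python
def penalised_expected_value(claimed_impact: float,
                             base_credence: float = 0.01,
                             typical_impact: float = 1.0) -> float:
    """Expected value when credence shrinks in proportion to the claimed impact."""
    # Credence is never zero, so small probabilities are not ignored,
    # but it falls as fast as the claimed impact grows.
    credence = base_credence * min(1.0, typical_impact / claimed_impact)
    return credence * claimed_impact

modest_claim = penalised_expected_value(1.0)
mugger_claim = penalised_expected_value(1e100)
```

Under this penalty the mugger's astronomically larger claim ends up with roughly the same expected value as the modest one, so inflating the promised payoff buys nothing. Whether a penalty of exactly this shape can be justified is, of course, the open question.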