
What are the theoretical obstacles to abandoning expected utility calculations regarding extremities like x-risk from a rogue AI system in order to avoid biting the bullet on Pascal’s Mugging? Does Bayesian epistemology really require that we assign a credence to any proposition at all, and if so, shouldn’t we reject this framework in order to avoid fanaticism? It does not seem rational to me that we should assign credences to, e.g., the success of specific x-risk mitigation interventions when there are so many unknown unknowns governing the eventual outcome.

 

I hope you can help me sort out this confusion.



2 Answers

Attempts to reject fanaticism necessarily lead to major theoretical problems, as described for instance here and here.

However, questions about fanaticism are not that relevant for most questions about x-risk. The x-risks of greatest concern to most long-termists (AI risk, bioweapons, nuclear weapons, climate change) all have reasonable odds of occurring within the next century or so, and even if we care only about humans living in the next century or so, we would find that these risks are valuable to prevent. This is mostly a consequence of the huge number of people alive today.

I think timidity, as described in your first link, e.g. with a bounded social welfare function, is basically okay, but it's a matter of intuition (similarly, discomfort with Pascalian problems is a matter of intuition). However, it does mean giving up separability in probabilistic cases, and it may instead support x-risk reduction (depending on the details).

I would also recommend https://globalprioritiesinstitute.org/christian-tarsney-the-epistemic-challenge-to-longtermism/ https://globalprioritiesinstitute.org/christian-tarsney-exceeding-expectations-sto... (read more)

Thanks for your answer. I don't think I understand what you're saying, though. As I understand it, it makes a huge difference to the resource distribution that longtermism recommends, because if you allow for e.g. Bostrom's 10^52 happy lives to be the baseline utility, avoiding x-risk becomes vastly more important than if you just consider the 10^10 people alive today. Right?

djbinder
In principle I agree, although in practice there are other mitigating factors which mean it doesn't seem to be that relevant. This is partly because the 10^52 number is not very robust. In particular, once you start postulating such large numbers of future people, I think you have to take the simulation hypothesis much more seriously, so that the large size of the far future may in fact be illusory. But even on a more mundane level we should probably worry that achieving 10^52 happy lives might be much harder than it looks. It is also partly because, at a practical level, the interventions long-termists consider don't rely on the possibility of 10^52 future lives, but are good even over just the next few hundred years. I am not aware of many things that have smaller impacts and yet still remain robustly positive, such that we would only pursue them due to the 10^52 future lives. This is essentially for the reasons that asolomonr gives in their comment.

In short:

  1. Bayesianism is largely about how to assign probabilities to things; it is not an ethical/normative doctrine like utilitarianism that tells you how you should prioritize your time. And as a (non-naïve) utilitarian will emphasize, when doing so-called “utilitarian calculus” (and related forms of analysis) is inefficient/less effective than using intuition, then you should rely on intuition.
  2. Especially when dealing with facially implausible/far-fetched claims about extremely high risk, I think it’s helpful to fight dubious fire with similarly dubious fire and then trim off the ashes: if someone says “there’s a slight (0.001%) chance that this (weird/dubious) intervention Y could prevent extinction, and that’s extremely important,” you might be able to argue that it is equally or even more likely that doing Y backfires or that doing Y prevents you from doing intervention Z which plausibly has a similar (unlikely) chance of preventing extinction. (See longer illustration block below)

In the end, these two points are not the only things to consider, but I think they tend to be the most neglected/overlooked whereas the complementary concepts are decently understood (although I might be forgetting something else).

Regarding 2 in more detail: Take for example classic Pascal's mugging-type situations, like "A strange-looking man in a suit walks up to you and says that he will warp up to his spaceship and detonate a super-mega nuke that will eradicate all life on earth if and only if you do not give him $50 (which you have in your wallet), but he will give you $3^^^3 tomorrow if and only if you give him $50." We could technically/formally suppose the chance he is being truthful is nonzero (e.g., 0.0000000001%), but still abide by rational expectation theory if you suppose that there are indistinguishably likely cases that cause the opposite expected value -- for example, the possibility that he is telling you the exact opposite of what he will do if you give him the money (for comparison, see the philosopher God response to Pascal's wager), or the possibility that the "true" mega-punisher/rewarder is actually just a block down the street and if you give your money to this random lunatic you won't have the $50 to give to the true one (for comparison, see the "other religions" response to the narrow/Christianity-specific Pascal's wager). More realistically, that $50 might be better donated to an X-risk charity. Add in the fact that stopping and thinking through this entire situation would be a waste of time that you could perhaps be using to help avert catastrophes in some other way (e.g., making money to donate to X-risk charities), and you’ve got a pretty strong case for not even entertaining the fantasy for a few seconds, and thus not getting paralyzed by naive application of expected value theory.
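To make the cancelling-out step concrete, here is a minimal sketch of the arithmetic. All numbers are hypothetical: the credence `p` is made up, and `HUGE` is only a stand-in payoff, since 3^^^3 is far too large to represent. It uses exact fractions so the tiny probabilities do not get lost to rounding.

```python
from fractions import Fraction

def ev_of_paying(p_truthful, p_reverse, payoff, cost):
    """Expected value of handing over the money, given credences in the
    mugger being truthful versus doing the exact opposite of what he says."""
    outcomes = [
        (p_truthful, payoff),                   # he rewards you as promised
        (p_reverse, -payoff),                   # he does the reverse
        (1 - p_truthful - p_reverse, 0),        # nothing happens either way
    ]
    return sum(p * v for p, v in outcomes) - cost

p = Fraction(1, 10**12)   # hypothetical tiny credence
HUGE = 10**100            # stand-in for an astronomically large payoff
COST = 50

# Indistinguishably likely opposite hypotheses cancel: only the cost remains.
ev_symmetric = ev_of_paying(p, p, HUGE, COST)       # exactly -50

# If the reverse hypothesis is only half as likely, nothing cancels.
ev_asymmetric = ev_of_paying(p, p / 2, HUGE, COST)
```

The second case shows that the argument depends entirely on the opposing hypotheses really being about equally likely; once one side is even slightly more plausible, the huge payoff dominates again.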

Thanks for your reply. A follow-up question: when I see the 'cancelling out'-argument, I always wonder why it doesn't apply to the x-risk case itself. It seems to me that you could just as easily argue that halting biotech research in order to enter the Long Reflection might backfire in some unpredictable way, or that aiming at Bostrom's utopia would ruin the chances of ending up in a vastly better state that we had never even dreamt of - and so on and so forth.

Isn't the whole case for longtermism so empirically uncertain as to be open to the 'cancelling out'-argument as well?

 

Hope it makes sense what I'm saying.

Marcel D
I do understand what you are saying, but my response (albeit as someone who is not steeped in longtermist/X-risk thought) would be "not necessarily (and almost certainly not entirely)." The tl;dr version is "there are lots of claims about X-risks and interventions to reduce x-risks that are reasonably more plausible than their reverse-claim." E.g., there are decent reasons to believe that certain forms of pandemic preparations reduce x-risk more than they increase x-risk. I can't (yet) give full, formalistic rules for how I apply the trimming heuristic, but some of the major points are discussed in the blocks below.

One key to using/understanding the trimming heuristic is that it is not meant to directly maximize the accuracy of your beliefs; rather, it's meant to improve the effectiveness of your overall decision-making *in light of constraints on your time/cognitive resources.* If we had infinite time to evaluate everything, even possibilities that seem like red herrings, it would probably (usually) be optimal to do so, but we don't have infinite time, so we have to make decisions as to what to spend our time analyzing and what to accept as "best-guesstimates" for particularly fuzzy questions. Here, intuition (including "when should we rely on various levels of intuition/analysis") can be far more effective than formalistic rules.

I think another key is to understand the distinction between risk and uncertainty: (to heavily simplify) risk refers to confidently verifiable/specific probabilities (e.g., a 1/20 chance of rolling a 1 on a standard 20-sided die), whereas uncertainty refers to when we don't confidently know the specific degree of risk (e.g., the chance of rolling a 1 on a confusingly-shaped 20-sided die which has never rolled a 1 yet, but perhaps might eventually).

In the end, I think my 3-4-ish conditions or at least factors for using the trimming heuristic are: 1. There is a high degree of uncertainty associated with the claim (e.g., it is not a well
2 Comments

It does and we should. I wrote a post about this you might find useful: https://vmasrani.github.io/blog/2021/the_credence_assumption/

(I'm putting this as a comment and not an answer to reflect that I have a few tentative thoughts here but they're not well developed)

A really useful source that explains a Bayesian method of avoiding Pascal's mugging is this GiveWell post. TL;DR: much of the variation in EV estimates for situations that we know very little about comes from "estimate error", so we'd have very low credence in these estimates. Even if the most likely EV estimate for an action seems very positive, if there's extremely high variance due to having very little evidence on which to base that estimate, then we wouldn't be very surprised if the actual value is net zero or even negative. The post argues that we should also incorporate some sort of prior about the probability distribution of impacts that we can expect from actions. This basically makes us more skeptical the more outlandish the claim is. As a result, we're actually less persuaded to take an action if it is motivated by an extremely high but unfounded EV estimate versus an action that is equally unfounded but has a less extreme EV estimate and so falls closer to our prior about what is generally plausible. This seems to avoid Pascal's mugging. (This was my read of the post; it's completely possible that I misunderstood something and/or that persuasive critiques of this reasoning exist and I haven't encountered them so far.)
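As a rough illustration of that shrinkage effect, here is a minimal sketch. It assumes a normal prior over impact and normally distributed estimate error (a standard conjugate model chosen for simplicity, not necessarily the post's exact setup), and it assumes that the error of an "unfounded" estimate scales with the size of the claim. All numbers are hypothetical.

```python
def posterior(prior_mean, prior_var, estimate, estimate_var):
    """Normal-normal update: the posterior mean is a precision-weighted
    average of the prior mean and the noisy estimate."""
    prior_precision = 1.0 / prior_var
    estimate_precision = 1.0 / estimate_var
    post_var = 1.0 / (prior_precision + estimate_precision)
    post_mean = post_var * (prior_mean * prior_precision
                            + estimate * estimate_precision)
    return post_mean, post_var

# Hypothetical prior: most actions have impact near 0, with variance 1.
PRIOR_MEAN, PRIOR_VAR = 0.0, 1.0

# Two equally unfounded claims: the error's standard deviation is assumed
# to be as large as the claimed value itself.
modest_mean, _ = posterior(PRIOR_MEAN, PRIOR_VAR, 10.0, 10.0 ** 2)
extreme_mean, _ = posterior(PRIOR_MEAN, PRIOR_VAR, 1000.0, 1000.0 ** 2)

print(modest_mean, extreme_mean)
```

Under these assumptions the posterior mean for a claim of size e works out to e / (e² + 1), so the more extreme claim ends up with the smaller posterior estimate, which matches the intuition described above.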

I think that another point here is whether the very promising but difficult to empirically verify claims that you're talking about are made with consideration of a representative spectrum of the possible outcomes for an action. As a bit of a toy example (and I'm not criticizing any actual viewpoint here, this is just hopefully rhetorically illustrative), if you think that improving institutional decision making is really positive, your basic reasoning might look like: taking some action to teach decision makers about rationality has x small probability of reaching a person who now has y small probability of being in a position to decide something that has z hugely positive impact if decided with consideration of rational principles. Therefore the EV of me taking this action is x * y * z = really big positive number. This only considers the most positive direction in which this action could unfold, since it's assumed that within the much bigger 1 - x * y probability there are only basically neutral outcomes. It's at least plausible, however, that some of those outcomes are actually quite bad (say that you teach a decision maker an incorrect principle, or that you present the idea badly and so, through idea inoculation, you dissuade someone from becoming more rational and this leads to some significant negative outcome). The likelihood of doing something bad is probably not that high, but say there's k chance that your action leads to m very bad outcome; then the actual EV is (x * y * z) - (k * m), which might be much lower than if we only considered the positive outcomes of this action. This might suggest that the EV estimates of the types of x-risk mitigation actions you're expressing some skepticism about could be forgetting to account for the possibility that they have a negative impact, which could meaningfully lower their EV. That said, people may already be factoring in such considerations and just not necessarily making that explicit.
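The toy calculation can be written out with made-up numbers. The variable names x, y, z, k and m follow the paragraph above; every value below is purely hypothetical and only meant to show how a small chance of a bad outcome changes the total.

```python
def expected_value(x, y, z, k=0.0, m=0.0):
    """EV = (x * y * z) - (k * m).

    x: chance the action reaches the right person
    y: chance that person ends up making the key decision
    z: size of the positive impact if they do
    k: chance the action leads to a bad outcome instead
    m: size of that bad outcome
    """
    return x * y * z - k * m

# Hypothetical values, for illustration only.
x, y, z = 0.01, 0.01, 1_000_000
k, m = 0.001, 50_000

upside_only = expected_value(x, y, z)          # ignores the downside
with_downside = expected_value(x, y, z, k, m)  # includes the downside

print(upside_only, with_downside)
```

With these particular numbers the downside term removes about half of the upside-only estimate, even though k is far smaller than 1 - x * y.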
