Cause prioritization for downside-focused value systems

Lukas_Gloor

57 min read · Jan 31, 2018

Comments 11

Mihnea

This is impressively good.

Denkenberger🔸

Wow, impressive piece of work. This is longer than most journal articles; maybe a record for the EA Forum? You had good links to other people discussing the long-term impacts of global catastrophic risk mitigation. I think a major impact of alternate foods is to make the collapse of civilization less likely, meaning less of the nastiness that could get locked into AI. But of course preventing the risk would mean none of that nastiness. It sounds like you're talking about a different effect, which is that civilization could still be lost, but alternate foods would mean that the minimum population would be higher, meaning a faster recovery. Some of the alternate foods require civilization, but some could be done on a small scale, so this is possible. In this case, alternate foods would still reduce the nastiness, but because recovery would be quicker, I guess it is more likely that the nastiness would not decay out before we get AI.

dust

"downside focused" is a fantastic term, worthy of replacing both "negative" (not descriptive) and "suffering focused" (too intense for some contexts).

I think I'll call myself a downside focused utilitarian from here on.

Jamie_Harris

Note that no one should quote the above map out of context and call it “The likely future” or something like that, because some of the scenarios I listed may be highly improbable and because the whole map is drawn with a focus on things that could go wrong. If we wanted a map that also tracked outcomes with astronomical amounts of happiness, there would in addition be many nodes for things like “happy subroutines,” “mindcrime-opposite,” “superhappiness-enabling technologies,” or “unaligned AI trades with aligned AI and does good things after all.” There can be futures in which several s-risk scenarios come to pass at the same time, as well as futures that contain s-risk scenarios but also a lot of happiness (this seems pretty likely).

I like this map! Do you know of anything that attempts to assign probabilities (even very vague/ballpark) to these different outcomes?

As someone who is not particularly "downside-focused," one thing I find difficult in evaluating the importance of prioritising s-risks vs extinction risks (and then different interventions that could be used to address them) is just not being able to get my head around which sorts of outcomes seem most likely. Given my lack of knowledge about the different risk factors, I mostly just treat each of the different possible outcomes on your map and the hypothetical "map that also tracked outcomes with astronomical amounts of happiness" as being roughly equal in probability.

Lukas_Gloor

Sorry for the delayed answer; I had this open but forgot.

I like this map! Do you know of anything that attempts to assign probabilities (even very vague/ballpark) to these different outcomes?

Not in any principled way, no. I think the action threshold ("How large/small would the probability have to be in order to make a for-me-actionable difference?") is quite low if you're particularly suffering-focused, and quite high if you have a symmetrical/upside-focused view. (This distinction is crude and nowadays I'd caveat that some plausible moral views might not fit on the spectrum.) So in practice, I'd imagine that cruxes are rarely about the probabilities of these scenarios. Still, I think it could be interesting to think about their plausibility and likelihood in a systematic fashion.

Given my lack of knowledge about the different risk factors, I mostly just treat each of the different possible outcomes on your map and the hypothetical "map that also tracked outcomes with astronomical amounts of happiness" as being roughly equal in probability.

At the extremes (very good outcomes vs. very bad ones), the good outcomes seem a lot more likely, because future civilization would want to intentionally bring them about. For the very bad outcomes, things don't only have to go wrong, but do so in very specific ways.

For the less extreme cases (moderately good vs. moderately bad), I think most options are defensible and treating them as similarly likely certainly seems reasonable.

SummaryBot

Executive summary: This post analyzes cause prioritization for downside-focused value systems, arguing that reducing suffering risks, particularly through AI alignment, should be prioritized over utopia creation to mitigate potential long-term disvalue.

Key points:

Distinction between downside-focused and upside-focused value systems, where the former emphasizes reducing disvalue and the latter emphasizes creating significant positive outcomes.
Downside-focused views prioritize the reduction of suffering risks (s-risks) over the creation of utopian futures due to the potential for catastrophic disvalue.
Extinction risk reduction is generally not favorable for downside-focused value systems as it may inadvertently increase s-risks associated with space colonization and technological advancements.
AI alignment is likely beneficial for downside-focused perspectives by preventing the creation of superintelligent AI that could generate vast amounts of suffering, despite high uncertainty in outcomes.
Effective altruism portfolios should incorporate interventions that are valuable from both downside- and upside-focused perspectives, with a strong emphasis on AI safety and strategic cooperation.
Addressing moral uncertainty and fostering cooperation are crucial for maximizing positive outcomes and minimizing harms across diverse value systems, recommending a focus on universally beneficial interventions.

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

MichaelPlant

Hello Lukas,

I'm struggling to wrap my head around the difference between upside and downside focused morality. I tried to read the rest of the document, but I kept thinking "hold on, I don't understand the original motivation" and going back to the start.

I’m using the term downside-focused to refer to value systems that in practice (given what we know about the world) primarily recommend working on interventions that make bad things less likely.

If I understand it, the project is something like "how do your priorities differ if you focus on reducing bad things over promoting good things?" but I don't see how you can go on to draw any conclusions about that because downside (as well as upside) morality covers so many different things.

Here are 4 different ways you might come to the conclusion you should work on making bad things less likely. Quoting Ord:

"Absolute Negative Utilitarianism (NU). Only suffering counts.

Lexical NU. Suffering and happiness both count, but no amount of happiness (regardless of how great) can outweigh any amount of suffering (no matter how small).

Lexical Threshold NU. Suffering and happiness both count, but there is some amount of suffering that no amount of happiness can outweigh.

Weak NU. Suffering and happiness both count, but suffering counts more. There is an exchange rate between suffering and happiness or perhaps some nonlinear function which shows how much happiness would be required to outweigh any given amount of suffering."

This would lead you to give more weight to suffering at the theoretical level. Or, fifth, you could be a classical utilitarian - happiness and suffering count equally - and decide, for practical reasons, to focus on reducing suffering.

As I see it, the problem is that all of them will and do recommend different priorities. A lexical or absolute NU should, perhaps, really be trying to blow up the world. Weak NU and classical U will be interested in promoting happiness too and might want humanity to survive and conquer the stars. It doesn't seem useful or possible to conduct analysis along the lines of "this is what you should do if you're more interested in reducing bad things" because the views within downside-focused morality won't agree on what you should do or why you should do it.

More broadly, this division seems unhelpful. Suppose we have four people in a room: a lexical NU, a very weak NU, a classical U, and a lexical positive utilitarian (any happiness outweighs all suffering). It seems like, on your view, the first two should be downside-focused and the latter two upside-focused. However, it could be that both the classical U and the very weak NU agree that the best way to do good is focusing on suffering reduction, so they're downside. Or they could agree the best way is happiness promotion, so they're upside. In fact, the weak NU and classical U have much more in common with each other - they will nearly always agree on the value of states of affairs - than either of them does with the lexical NU or lexical PU. Hence they should really stick together, and sorting views by whether they, practically speaking, focus on producing good or reducing bad doesn't seem to yield a category that helps our analysis.

It might be useful to hear you say why you think this is a useful distinction.

Lukas_Gloor

If I understand it, the project is something like "how do your priorities differ if you focus on reducing bad things over promoting good things?"

This sounds accurate, but I was thinking of it with empirical cause prioritization already factored in. For instance, while a view like classical utilitarianism can be called "symmetrical" when it comes to normatively prioritizing good things and bad things (always with some element of arbitrariness because there are no "proper units" of happiness and suffering), in practice the view turns out to be upside-focused because, given our empirical situation, there is more room for creating happiness/good things than there is future expected suffering left to prevent. (Cf. the astronomical waste argument.)

This would go the other way if we had good reason to believe that the future will be very bad, but I think the classical utilitarians who are optimistic about the future (given their values) are right to be optimistic: If you count the creation of extreme happiness as not-a-lot-less important than the prevention of extreme suffering, then the future will in expectation be very valuable according to your values (see footnote [3]).

but I don't see how you can go on to draw any conclusions about that because downside (as well as upside) morality covers so many different things.

My thinking is that when it comes to interventions that affect the long-term future, different normative views tend to converge roughly into two large clusters for the object-level interventions they recommend. If the future will be good for your value system, reducing extinction risks and existential risk related to "not realizing full potential" will be most important. If your value system makes it harder to attain vast amounts of positive value through bringing about large (in terms of time and/or space) utopian futures, then you want to focus specifically on (cooperative ways of) reducing suffering risks or downside risks generally. The cut-off point is determined by what the epistemically proper degree of optimism or pessimism is with regard to the quality of the long-term future, and to what extent we can have an impact on that. Meaning, if we had reason to believe that the future will be very negative and that efforts to make the future contain vast amounts of happiness are very, very unlikely to ever work, then even classical utilitarianism would count as "downside-focused" according to my classification.

Some normative views simply don't place much importance on creating new happy people, in which case they kind of come out as downside-focused by default (except for the consideration I mention in footnote 2). (If these views give a lot of weight to currently existing people, then they can be both downside-focused and give high priority to averting extinction risks, which is something I pointed out in the third-last paragraph in the section on extinction risks.)

Out of the five examples you mentioned, I'd say they fall into the two clusters as follows: Downside-focused: absolute NU, lexical NU, lexical threshold NU and a "negative-leaning" utilitarianism that is sufficiently negative-leaning to counteract our empirical assessment of how much easier it will be to create happiness than to prevent suffering. The rest is upside-focused (maybe with some stuck at "could go either way"). How much is "sufficiently negative-leaning"? It becomes tricky because there are not really any "proper units" of happiness and suffering, so we have to first specify what we are comparing. See footnote 3: My own view is that the cut-off is maybe very roughly at around 100, but I mentioned "100 or maybe 1,000" to be on the conservative side. And these refer to comparing extreme happiness to extreme suffering. Needless to say, it is hard to predict the future and we should take such numbers with a lot of caution, and it seems legitimate for people to disagree. Though I should qualify that a bit: Say, if someone thinks that classical utilitarians should not work on extinction risk reduction because the future is too negative, or if someone thinks even strongly negative-leaning consequentialists should have the same ranking of priorities as classical utilitarians because the future is so very positive, then both of these have to explain away strong expert disagreement (at least within EA; I think outside of EA, people's predictions are all over the place, with economists generally being more optimistic).

Lastly, I don't think proponents of any value system should start to sabotage other people's efforts, especially since there are other ways to create value according to your own value system that are altogether much more positive-sum. Note that this – the dangers of naive/Machiavellian consequentialism – is a very general problem that reaches far deeper than just value differences. Say you have two EAs who both think creating happiness is 1/10th as important as reducing suffering. One is optimistic about the future, the other has become more pessimistic after reading about some new arguments. They try to talk out the disagreement, but do not reach agreement. Should the second EA now start to sabotage the efforts of the first one, or vice versa? That seems ill-advised; no good can come from going down that path.

Dawn Drescher

Just FYI, Simon Knutsson has responded to Toby Ord.

Siebe

I would therefore say that large-scale catastrophes related to biorisk or nuclear war are quite likely (~80–90%) to merely delay space colonization in expectation.[17] (With more uncertainty being not on the likelihood of recovery, but on whether some outlier-type catastrophes might directly lead to extinction.)

You seem to be highly certain that humans will recover from near-extinction. Is this based solely on the arguments in the text and footnote, or is there more? It seems to rest on the assumption that population growth/size is the only bottleneck, and that key technologies and infrastructures will be developed anyway.

Lukas_Gloor

There isn't much more except that I got the impression that people in EA who have thought about this a lot think recovery is very likely, and I'm mostly deferring to them. The section about extinction risk is the part of my post where I feel the least knowledgeable. As for additional object-level arguments, I initially wasn't aware of points such as crops and animals already being cultivated/domesticated, metals already being mined, and there being alternatives to rapid growth induced by fossil fuels, one of which is slow but steady growth over longer time periods. The way cultural evolution works is that slight improvements from innovations (which are allowed to be disjunctive rather than having to rely on developing a very specific technology) spread everywhere, which makes me think that large populations plus a lot of time should go far enough eventually. Note also that if all-out extinction is simply very unlikely to ever happen, then you have several attempts left to reach technological maturity again.
