Effective Altruism Forum
EA Forum

Hide table of contents

Comment Permalink

Ryan GreenblattDec 20 20236

That said, I still think both topics are important, since I think many EAs seem to have a faulty background chain of reasoning that flows from their views about human disempowerment risk, concluding that such risks override most other concerns.

For example, I suspect either a majority or a substantial minority of EAs would agree with the claim that it is OK to let 90% of humans die (e.g. of aging), if that reduced the risk of an AI catastrophe by 1 percentage point. By contrast, I think that type of view seems to naively prioritize a concept of "the human species" far above actual human lives in a way that is inconsistent with careful utilitarian reasoning, empirical evidence, or both. And I do not think this logic merely comes down to whether you have person-affecting views or not.

I currently don't think these normative arguments make much of a difference in prioritization or decision making in practice. So, I think this probably isn't that important to argue about.

Perhaps the most important case in which they would lead to very different decision making is the case of pausing AI (or trying to speed it up). Strong longtermists likely want to pause AI (at the optimal time) until the reduction in p(doom) per year is around the same as the exogenous doom. (This includes the chance of societal disruption which makes the situation worse and then results in doom. For instance, nuclear war induced societal collapse which results in building AI far less safely. Gradual changes in power over time also seem relevant, e.g. china.) I think the go-ahead-point for longtermists probably looks like 0.1% to 0.01% reduction in p(doom) per year of delay for longtermists, but this might depend on how optimistic you are about other aspects of society. Of course, if we could coordinate sufficiently to also eliminate other sources of risk, the go-ahead-point might lower considerably.

ETA: note that waiting until the reduction in p(doom) per year of delay is 0.1% does not imply that the final p(doom) is 0.1%. It's probably notably higher, maybe over an order of magnitude higher.

[Low confidence] If we apply the preferences of typical people (but gloomier empirical views about AI), then it seems very relevant that people broadly don't seem care that much about saving the elderly, life extension, or getting strong versions of utopia for themselves before they die. But, they do care a lot about avoiding societal collapse and ruin. And they care some about the continuity of human civilization. So, the go-ahead-point in reduction in doom per year if we use the preferences of normal people might look pretty similar to longtermists (though it's a bit confusing to apply somewhat incoherant preferences). I think it's probably less than a factor of 10 higher, maybe a factor of 3 higher. Also, normal people care about the absolute level of risk: if we couldn't reduce risk below 20%, then it's plausible that the typical preferences of normal people never want to build AI because they care more about not dying in a catastrophe than not dying of old age etc.

If we instead assume something like utilitarian person-affecting views (let's say only caring about humans for simplicity), but with strongly diminishing returns (e.g. logarithmic) above the quality of live of current americans and with similarly diminishing returns after 500 years of life, then I think you end up do 1% reduction in P(doom) per year of delay as the go-ahead point. This probably leads to pretty similar decisions in most cases.

(Separately, pure person-affecting views seem super implausible to me. Indifference to the torture of arbitrary number of new future people seems strong from my perspective. If you have asymmetric person-affecting views, then you plausible get dominated by the potential for reducing suffering in the long run.)

The only views which seem to lead to a pretty different conclusions are views with radically higher discount rates, e.g. pure person-affecting views where you care mostly about the lives of short lived animals or perhaps some views where you care about fulfilling the preferences of current humans (who might have high discount rates on their preferences?). But it's worth noting that these views seem indifferent to the torture of an arbitrary number of future people in a way that feels pretty implausible to me.

By contrast, I think that type of view seems to naively prioritize a concept of "the human species" far above actual human lives in a way that is inconsistent with careful utilitarian reasoning, empirical evidence, or both.

I don't think this depends on the concept of "the human species". Personally, I care about the overall utilization of resources in the far future (and I imagine many people with a similar perspective agree with me here). For instance, I think literal extinction in the event of AI takeover is unlikely and also not very importantly worse relative to full misaligned AI takeover without extinction. Similarly, I would potentially be happier to turn over the universe to aliens instead of AIs.

Separately, I think scope-sensitive/linear-returns person-affecting views are likely dominated by the potential for using a high fraction of future resources to simulate huge number of copies of existing people living happy lives. In practice, no one goes here because the actual thing people mean when they say "person-affecting views" is more like caring about the preferences of currently existing humans in a diminishing returns way. I think the underlying crux isn't well described as person-affecting vs non-person-affecting and is better described as diminishing returns.

See in context

[ Question ]

What is the current most representative EA AI x-risk argument?

by Matthew_Barnett

Dec 15 20233 min read10 answers 25