I tend to disagree with most EAs about existential risk from AI. Unfortunately, my disagreements are all over the place. It's not that I disagree with one or two key points: there are many points in the standard argument where I diverge, and depending on the audience, I don't know which of my disagreements people will consider most important.
I want to write a post highlighting all the important areas where I disagree and offering my own counterarguments as an alternative. The post would benefit from responding to an existing piece, along the same lines as Quintin Pope's article "My Objections to "We're All Gonna Die with Eliezer Yudkowsky"". Unlike that piece, however, mine would be intended to address the EA community as a whole, since I'm aware many EAs already disagree with Yudkowsky even if they buy the basic arguments for AI x-risk.
My question is: what is the current best single article (or set of articles) that provides a well-reasoned and comprehensive case for believing that there is a substantial (>10%) probability of an AI catastrophe this century?
I was considering replying to Joseph Carlsmith's article, "Is Power-Seeking AI an Existential Risk?", since it seemed reasonably comprehensive and representative of the concerns EAs have about AI x-risk. However, I'm a bit worried that the article is not very representative of EAs who assign a high probability to doom, since Carlsmith originally estimated the total risk of catastrophe at only 5% before 2070. In May 2022, Carlsmith changed his mind and reported a higher probability, but I'm not sure whether this was because he had been exposed to new arguments or because he simply came to regard the stated arguments as stronger than he originally thought.
I suspect I have both significant moral disagreements and significant empirical disagreements with EAs, and I want to include both in such an article, while mainly focusing on the empirical points. For example, I have the feeling that I disagree with most EAs about:
- How bad human disempowerment would likely be from a utilitarian perspective, and what "human disempowerment" even means in the first place
- Whether there will be a treacherous turn event, during which AIs violently take over the world after previously having been behaviorally aligned with humans
- How likely AIs are to coordinate near-perfectly with each other as a unified front, leaving humans out of their coalition
- Whether we should expect AI values to be "alien" (like paperclip maximizers) in the absence of extraordinary efforts to align them with humans
- Whether the AIs themselves will be significant moral patients, on par with humans
- Whether there will be a qualitative moment when "the AGI" is created, rather than systems incrementally getting more advanced, with no clear finish line
- Whether we get only "one critical try" to align AGI
- Whether "AI lab leaks" are an important source of AI risk
- How likely AIs are to kill every single human if they are unaligned with humans
- Whether there will be a "value lock-in" event soon after we create powerful AI that causes values to cease their evolution over the coming billions of years
- How bad problems related to "specification gaming" will be in the future
- How society is likely to respond to AI risks, and whether they'll sleepwalk into a catastrophe
However, I also disagree with points made by many other EAs who have argued against the standard AI risk case. For example:
- I think AIs will eventually become vastly more powerful and smarter than humans, and so will eventually be able to "defeat all of us combined"
- I think a benign "AI takeover" event is very likely even if we align AIs successfully
- I think AIs will likely be goal-directed in the future. I don't think, for instance, that we can just "not give the AIs goals" and then everything will be OK
- I think it's highly plausible that AIs will end up with substantially different values from humans (although I don't think this will necessarily cause a catastrophe)
- I don't think we currently have strong evidence that deceptive alignment is an easy problem to solve
- I think it's plausible that AI takeoff will be relatively fast, and that the world will be dramatically transformed over a period of several months or a few years
- I think short timelines, meaning a dramatic transformation of the world within 10 years from now, are pretty plausible
I'd like to elaborate on as many of these points as possible, preferably by responding to direct quotes from the representative article arguing for the alternative, more standard EA perspective.
To be clear, my argument would be that the go-ahead point for longtermists likely looks much higher, something like a 10% total risk of catastrophe. Actually, that's not exactly how I'd frame it, since what matters more is how much we can reduce the risk of catastrophe by delaying, not just the total risk of a catastrophe. But I'd likely consider a world where we delay AI until the total risk falls below 0.1% to be intolerable from several perspectives.
I guess one way of putting my point here is that you probably think of "human disempowerment" as a terminal state that is astronomically bad, and probably far worse than "all currently existing humans die". But I don't really agree with this. Human disempowerment just means that the species Homo sapiens is disempowered, and I don't see why we should draw the relevant moral boundary around our species. We can imagine drawing other boundaries, such as around "our current cultural and moral values", which I think would drift dramatically over time even if the human species remained.
I'm just not really attached to the general frame here. I don't identify much with "human values" in the abstract, as opposed to other salient characteristics of intelligent beings. I think the standard EA framing around "humans" is flawed in a way that matters for these arguments (and this includes most attempts I've seen to broaden the standard arguments by removing references to humans). Even when an EA insists their concern isn't about the human species per se, I typically end up disagreeing on some other fundamental point that seems like roughly the same thing I'm pointing at. Unfortunately, I consistently have trouble conveying this point to people, so I'm not likely to be understood here unless I give a very thorough argument.
I suspect it's a bit like the arguments vegans have with non-vegans about whether animals are OK to eat because they're "not human". There's a conceptual leap from "I care a lot about humans" to "I don't necessarily care a lot about the human species boundary" that people don't reliably find intuitive, except perhaps after a lot of reflection. Most ordinary arguments between vegans and non-vegans won't lead people to successfully cross this conceptual gap. It's just a counterintuitive concept for most people.
Perhaps as a brief example to help illustrate my point: it seems very plausible to me that I would identify more strongly with a smart behavioral LLM clone of me, trained on my personal data, than with the human species as a whole. This holds even allowing for imperfections in the behavioral clone arising from failures to generalize perfectly from my data (though excluding extreme cases in which the entity fails to capture any significant behavioral properties at all). Even if this clone were not aligned with humanity in the strong sense often meant by EAs, it is not obvious to me that giving it power would be bad, even at the expense of empowering "real humans".
On top of all of this, I think I disagree with your argument about discount rates, since you seem to be ignoring the case for high discount rates based on epistemic uncertainty rather than on pure time preference.