
I need to know why "AI" tends to be the top cause area for reducing s-risks, when human health and animal welfare also seem promising to me. Is it because of "longtermism"? I've read some essays on longtermism, but most of them talk about x-risks, not s-risks. There is a lot of debate about whether reducing x-risks is the right priority, but I haven't seen any discussion of long-term s-risks. Reducing digital-sentience suffering seems very abstract to me, and I don't know how to persuade others that reducing AI s-risks is important, especially while AI still doesn't have sentience. (Should we postpone research on AI s-risks until AI has sentience?)
(My naive opinion) Although the future might be vast (around 10^50 lives), I don't think that by itself means the expected value of longtermism is higher, because the tractability might be close to 0. The future is vast because many lives across a wide range of time will shape it, and AI may evolve on its own unless there is a value lock-in. The future also depends on many factors: maybe the AIs will be killed by aliens, maybe the universe will end early... The future is extremely unpredictable. I don't feel we can affect even 1/10^50 of it, so the expected value might be low. Longtermism sounds a little overconfident about our ability to influence the future. Are there essays arguing or demonstrating why long-term suffering is more important?
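The structure of this worry can be made explicit with toy numbers. Every figure below is a made-up illustrative assumption (not an estimate from any source); the point is only that scale and tractability multiply, so a vast future doesn't settle the comparison on its own:

```python
# Toy expected-value comparison; all numbers are illustrative assumptions.
future_lives = 1e50        # stipulated scale of the long-term future
shortterm_lives = 1e10     # rough number of lives affectable this century

p_longterm = 1e-45         # assumed chance a longtermist intervention "sticks"
p_shortterm = 1e-2         # assumed chance a short-term intervention works

ev_longterm = future_lives * p_longterm
ev_shortterm = shortterm_lives * p_shortterm
print(ev_longterm, ev_shortterm)
```

With these assumptions the short-term intervention wins by three orders of magnitude; with a tractability only slightly less pessimistic, the long-term one wins. The crux is the probability estimate, not the size of the future.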





Hiii! I’d break this down into two questions and give my answers to each. I don’t know whether my takes are still the most widely shared ones, but they’re the arguments that were most convincing to me.

Why focus on AI when it comes to s-risks: Tobias Baumann's book and his blog posts are probably the best online sources on this topic. The basic idea is that during multipolar takeoffs, i.e., if several groups on Earth build two or more similarly powerful AIs (or for acausal reasons, or if AIs meet aliens at some later point), it seems likely that by default these AIs will be at war. I'll defer to Tobias's book on all the specific scenarios that can occur here and cause suffering.

The urgency of this subarea of s-risks rests a lot on its tractability for me. AI alignment requires that (1) we find a solution to inner alignment, (2) we find a solution to outer alignment, (3) we either implement corrigibility (not sure if that’s the latest term?) or something akin to human values using those solutions, and (4) we convince every last group building AIs to do the same forever. I’m pessimistic that all of that can succeed, but it seems that people smarter than me are still somewhat optimistic, so I could easily be missing something.

But if we aim at the bigger target of just avoiding wars (zero-sum conflicts) between AIs, the particular values of the AIs don't matter. Whatever its values, every rational AI would rather resolve a conflict in a positive-sum way than a zero-sum way.

CLR has already identified a bunch of problems, e.g., that the bargaining solutions that allow for positive-sum resolutions don't work at all when different sides use different ones; everyone needs to agree on the same bargaining solution. But that is again an example of a problem where every AI is on our side. No AI that already has values wants to be aligned with different values, but every AI, regardless of its values, will want to pick a bargaining solution such that it can resolve conflicts in a mutually beneficial rather than internecine way. Perhaps it'll be enough to publish a bunch of arguments where, say, Nash is the Schelling-point bargaining solution; if those end up in the training data used for most AIs, that'll convince them that Nash really is the Schelling point among AIs trained on that data. But that's all part of the research that still needs to be done.
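A minimal sketch of why agreement on *which* bargaining solution matters: on the same toy utility frontier (my own illustrative choice, not CLR's), the Nash and Kalai-Smorodinsky solutions pick different points, so two AIs each using their preferred "fair" rule would fail to converge on a single outcome:

```python
# Toy bargaining problem: feasible frontier u2 = 1 - u1^2, disagreement
# point (0, 0). The frontier is an illustrative assumption.
N = 100000
points = [(i / N, 1 - (i / N) ** 2) for i in range(N + 1)]

# Nash bargaining solution: maximize the product of gains over disagreement.
nash = max(points, key=lambda p: p[0] * p[1])

# Kalai-Smorodinsky solution: equal fractions of each side's ideal payoff
# (both ideals are 1 here), i.e. the frontier point where u1 == u2.
ks = min(points, key=lambda p: abs(p[0] - p[1]))

print(nash, ks)  # two different "fair" answers to the same conflict
```

Both rules are individually defensible, which is exactly why the Schelling-point question (which one everyone converges on) does real work here.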

Personally, I also disvalue certain intense forms of suffering at least 10^8 times as much as death (for short durations where I have intuitions), but that varies between people.

You could say that neglectedness also favors s-risks from AI over x-risk from AI, since there are probably hundreds of people working on alignment but only some 30 or fewer working on cooperative AI. But really, both are ridiculously neglected.

When it comes to other sources of s-risks, "AI war"-type agential s-risks seem worse to me than natural ones and easier to avert than incidental ones.

Tractability of long-term interventions: I don't think s-risks are special here, and I don't have any novel thoughts on the topic. They're one dystopian lock-in among other, less bad ones. My favorite resource here is Tarsney's "The Epistemic Challenge to Longtermism." It makes a lot of very conservative assumptions, but even so it can't rule out that we can have a sizeable effect over hundreds to thousands of years. An AI war or its repercussions lasting for hundreds or thousands of years is sufficiently bad in my book, but the conditions under which they occur will be so different from ours today that I find it hard to tell whether the effect will be more or less enduring. Time probably passes more slowly for AIs (which think so much faster than us), so they may also be subject to more drift. But if you send otherwise dumb von Neumann probes out into space at high speeds so no one can catch up with them, I would intuitively guess that they could keep going for a long time before they all fail.
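The persistence worry above can be sketched with a toy washout model (my own illustrative numbers, not Tarsney's): if an intervention's effect survives each century with probability 1 - r, its expected value is a geometric series that sums to roughly v / r, so everything hinges on how small the washout rate r really is:

```python
# Toy persistence model; v and r are illustrative assumptions.
v = 1.0      # value delivered per century while the effect persists
r = 0.002    # assumed 0.2%-per-century chance exogenous events erase it

# Expected value: v * sum over t of (1 - r)^t, which converges to v / r.
ev = sum(v * (1 - r) ** t for t in range(100000))
print(ev)  # close to v / r = 500 centuries' worth of value
```

Halving r doubles the expected value, which is why arguments about lock-in and drift matter so much more than the headline size of the future.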

Hey Jack! In support of your view, I think you'd like some of Magnus Vinding's writings on the topic. Like you, he expresses some skepticism about focusing on narrower long-term interventions like AI safety research (vs. broader interventions like improved institutions).

Against your view, you could check out these two (i, ii) articles from CLR.

Feel free to message me if you'd like more resources. I'd love to chat further :)
