I was chatting recently with someone who had difficulty knowing how to orient to x-risk work (despite being a respected professional working in that field). They expressed that they didn't find it motivating at a gut level in the same way they did poverty or animal work; and, relatedly, that they felt some disconnect between the arguments they intellectually believed and what they felt to be important.
I think that existential security (or something like it) should be one of the top priorities of our time. But I usually feel good about people in this situation paying attention to their gut scepticism or nagging doubts about x-risk (or about parts of particular narratives about x-risk, or whatever). I’d encourage them to spend time trying to name their fears, and see what they think then. And I’d encourage them to talk about these things with other people, or write about the complexities of their thinking.
Partly this is because I don't expect people who use intellectual arguments to override their gut to do a good job of consistently tracking, at a micro-scale, what the most important things to do are. So it would be good to get the different parts of them more in sync.
And partly it's because exploring and writing about these things seems like a public good. Either their gut is onto something with parts of its scepticism, in which case it would be great to have that articulated; or their gut is wrong, but if other people have similar gut reactions then playing out that internal dialogue in public could be pretty helpful.
It's a bit funny to make this point about x-risk in particular, because of course all of the above applies to any topic. But I think people normally grasp it intuitively, and somehow that's less universal around x-risk. I guess this may be because people don't have any first-hand experience with x-risk, so their introductions to it all come via explicit arguments. It's true that this is a domain where we should be unusually unwilling to trust our gut takes without hearing the arguments. But it seems to me that people are unusually likely to forget that they can know things which bear on the questions without those things already being explicit (and perhaps the social environment, in encouraging people to take explicit arguments seriously, can accidentally overstep and end up discouraging people from taking anything else seriously). These dynamics seem especially strong in the case of AI risk — which I regard as the most serious source of x-risk, but also the one where I most wish people spent more time exploring their nagging doubts.
I resonate a lot with this post. Thank you for writing it and for giving people like me an opportunity to express our thoughts on the topic. I'm writing from an anonymous account because publicly stating things like 'I'm not sure it would be bad for humanity to be destroyed' seems dangerous for my professional reputation. I don't like being non-transparent, but the risks here seem too great.
I currently work at an organization dedicated to reducing animal suffering. I've recently wondered a lot whether I should go work on reducing x-risks from AI: it seems there's work in AI safety where I could be counterfactually useful. But after having had about a dozen discussions with people from the AI safety field, I still don't have the gut feeling that reducing x-risks from AI deserves my energy more than reducing animal suffering in the short term.
I am not at all an expert on issues around AI, so take what follows as 'the viewpoint of someone outside the world of AI safety / x-risks trying to form an opinion on these issues, with the constraint of having a limited amount of time to do so'.
The reasons are:
Ultimately, the two questions I would like to answer are:
Interesting points!
Is AI not being aligned at all really a live option? Pre-training relies on lots of human data, so it alone leads to some alignment with humanity. Then I would say that current frontier models, post-alignment, already have better values than a random human, so I assume alignment tech...