My understanding is that most AI safety work that plausibly reduces some s-risks may also reduce extinction risks. I also think that some futures in which we go extinct because of AI (especially with a single AI taking over) wouldn't involve astronomical suffering, provided the AI has no (or sufficiently little) interest in consciousness or suffering, whether
- terminally,
- because consciousness or suffering is useful to some goal (e.g. it might simulate suffering incidentally or for the value of information), or
- because there are other agents who care about suffering that it has to interact with or whose preferences it should follow (they could all be gone, ruling out s-risks from conflicts).
I am interested in how people are weighing (or defeating) these considerations against the s-risk reduction they expect from (particular) AI safety work.
EDIT: Summarizing:
- AI safety work (including s-risk-focused work) also reduces extinction risk.
- Reducing extinction risk increases some s-risks, especially non-AGI-caused s-risks, but also possibly AGI-caused s-risks.
So AI safety work may increase s-risks, depending on tradeoffs.
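One rough way to make "depending on tradeoffs" concrete is the toy decomposition below. It is only a sketch under strong simplifying assumptions, and all of the symbols are my own illustrative choices rather than anything standard: E is AI-caused extinction, S is astronomical suffering, p is the probability of E, s is the probability of S given no extinction, delta is how much a piece of safety work lowers p, and epsilon is how much it lowers s. It also assumes that extinction futures involve no astronomical suffering at all, which is stronger than the claim above that only some of them don't.

```latex
% Toy model; every symbol here is an illustrative assumption.
% Assume P(S | E) = 0, so before the safety work:
P(S) = P(S \mid \neg E)\, P(\neg E) = s\,(1 - p)

% After safety work: p \to p - \delta, \quad s \to s - \epsilon
% (with \delta, \epsilon \ge 0):
P'(S) = (s - \epsilon)\,(1 - p + \delta)

% Net change in s-risk:
\Delta = P'(S) - P(S) = s\,\delta - \epsilon\,(1 - p + \delta)

% So, in this toy model, the work increases s-risk exactly when
s\,\delta > \epsilon\,(1 - p + \delta)
```

Read this way, the question is whether the s-risk added by making survival more likely (the left-hand side) outweighs the s-risk removed within the futures where we survive (the right-hand side).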
I would think there's something of a continuum across the three options, and AI safety work shifts the distribution, making outcomes closer to (C) less likely and outcomes closer to (A) more likely. More or fewer of our values could be represented, which could be good or bad, and which is related to the risk of extinction. It's not actually clear to me that moving in this direction is preferable from an s-risk perspective, since there could be more interest in creating more sentience overall and greater risks from conflict with others.