In the last few years, people have proposed various AI takeover scenarios. We think this type of scenario building is great, since it gives us more concrete ideas of what AI takeover could realistically look like. That said, we have been confused for a while about how the different scenarios relate to each other and what different assumptions they make. These posts might be helpful for anyone who has similar confusions.
We define AI takeover to be a scenario where the most consequential decisions about the future get made by AI systems with goals that aren’t desirable by human standards.
In the first post, we focus on explaining the differences between seven prominent scenarios: the ‘Brain-in-a-box’ scenario, ‘What failure looks like’ part 1 (WFLL 1), ‘What failure looks like’ part 2 (WFLL 2), ‘Another (outer) alignment failure story’ (AAFS), ‘Production Web’, ‘Flash economy’ and ‘Soft takeoff leading to decisive strategic advantage’. While these scenarios do not capture all the risks from transformative AI, participants in a recent survey of leading AI safety/governance researchers estimated that the first three of these scenarios cover 50% of existential catastrophes from AI.
Here you can see a summary table that describes the distinctive characteristics of each of these seven scenarios.
In the second post, we investigate the different assumptions of each of these scenarios in four key areas:
- Crucial decisions: the specific (human) decisions necessary for takeover
- Competitive pressures: the strength of incentives to deploy AI systems despite the dangers they might pose
- Takeover capabilities: how powerful the systems executing the takeover are
- Hackability of alignment: the difficulty of correcting misaligned behaviour through incremental fixes
Here you can see a summary table that describes the key assumptions of each of these seven scenarios.
We think these posts might be useful for:
- Those who want to know what different informed people believe about AI takeover
- Those who are trying to form their own views about AI takeover, and are unsure what scenarios are already under discussion
- Those who already have a general understanding of takeover scenarios, but would like to dig deeper into their assumptions
We hope these posts can serve as a starting point for productive discussion around AI takeover scenarios.
Again, here are the links: the first post is at https://www.lesswrong.com/posts/qYzqDtoQaZ3eDDyxa/distinguishing-ai-takeover-scenarios and the second post is at https://www.lesswrong.com/posts/zkF9PNSyDKusoyLkP/investigating-ai-takeover-scenarios.