We often hear that navigating the transition to Artificial General Intelligence (AGI) is most analogous to the world’s transition to nuclear weapons. On the surface, this comparison makes sense. After all, nuclear non-proliferation and deterrence theory offer valuable lessons for managing AGI, especially when it comes to avoiding global catastrophe. A recent paper, Superintelligence Strategy[1], explores the application of these ideas, suggesting a “deterrence regime” modelled after nuclear Mutual Assured Destruction (MAD), where any state’s attempt to achieve unilateral AI dominance would trigger preventive sabotage by rivals.

But while this analogy holds, there may be another, overlooked lesson to draw from the Cold War’s nuclear stalemate. MAD, it’s widely argued, was the primary force that prevented direct nuclear conflict between the US and the USSR. Its premise was simple: if one nuclear-armed power launched a first strike, the other could retaliate in kind, ensuring the total annihilation of both parties. This created a powerful incentive for both sides to avoid nuclear war. In short, each side’s survival became tied to the survival of its adversary.

Now, before moving forward with the main thesis, it’s important to acknowledge the flaws in the MAD doctrine. It was hardly foolproof: numerous close calls throughout the Cold War nearly ended in catastrophic miscalculation. Moreover, there’s evidence that the doctrine encouraged risky behaviour in lower-intensity conflicts[2]. Lastly, it incentivized both sides to build ever more dangerous arsenals. However, the central fact remains: nuclear holocaust has been avoided, even though the possibility loomed large throughout the Cold War. If in doubt about how present that threat felt, consider the “Duck and Cover” drills US children underwent, or the fact that New York City alone was home to 18,000 fallout shelters.

So, what does all this mean for the future of AGI? It suggests that we might look to MAD’s underlying logic to guide us through the coming transition. Specifically, could ensuring that AGI cannot survive without humans serve as an effective deterrent against catastrophic conflict?

At present, this mechanism exists in a rudimentary form. If humanity were to suddenly vanish, the AI systems running on human-built infrastructure (from data centres to the energy grids powering them) would eventually cease to function as well.

But as AI technology evolves and becomes increasingly embedded in the physical economy—through controlling critical infrastructure or managing autonomous robots—this interdependence might no longer hold. In a future where a rogue AGI could operate independently, the sudden disappearance of humanity might not lead to its destruction. Instead, a sufficiently advanced AGI could survive and even thrive, using whatever resources are left behind.

This points to one possible avenue for managing the existential risks of AGI: delaying the full handover of control over critical infrastructure to AIs, or at least carefully weighing the risks of handing too much power to autonomous systems. A completely airtight arrangement seems unrealistic, since a rogue AI doesn’t need to control everything, just enough to ensure its own survival and eventual dominance.

More urgently, this highlights the need to secure the AI chip supply chain. Ensuring control over the manufacturing, location, and use of these chips[3] could be crucial in maintaining the link between AI and human labour. One potential solution might be to design specialized data centres that require ongoing human intervention to operate, making it much harder for an AI to keep running if humans were no longer around.
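To make the idea a little more concrete, here is a toy sketch of what “requires human intervention to operate” might look like in software: a controller that only permits AI workloads to run while a human operator keeps submitting signed check-ins. This is purely illustrative; the attestation scheme, the key handling, and the 24-hour window are my own assumptions, not a proposal from the sources cited above.

```python
import hmac
import hashlib
import time
from typing import Optional

# Hypothetical sketch of a "dead man's switch" for an AI data centre:
# workloads stay up only while a human operator keeps submitting signed
# check-ins. The key scheme, 24-hour window, and all names here are
# illustrative assumptions, not a description of any real system.

ATTESTATION_WINDOW_SECONDS = 24 * 60 * 60   # halt if no valid check-in within 24h
OPERATOR_KEY = b"held-offline-by-human-operators-only"


def sign_attestation(timestamp: float, key: bytes = OPERATOR_KEY) -> str:
    """Signature a human operator would attach to a check-in at `timestamp`."""
    return hmac.new(key, str(int(timestamp)).encode(), hashlib.sha256).hexdigest()


class DeadMansSwitch:
    def __init__(self) -> None:
        self.last_valid_attestation = time.time()

    def submit_attestation(self, timestamp: float, signature: str) -> bool:
        """Accept a check-in only if its signature verifies against the operator key."""
        if hmac.compare_digest(sign_attestation(timestamp), signature):
            self.last_valid_attestation = timestamp
            return True
        return False

    def workloads_allowed(self, now: Optional[float] = None) -> bool:
        """AI workloads may run only while a recent human attestation exists."""
        now = time.time() if now is None else now
        return (now - self.last_valid_attestation) <= ATTESTATION_WINDOW_SECONDS


if __name__ == "__main__":
    switch = DeadMansSwitch()
    ts = time.time()
    switch.submit_attestation(ts, sign_attestation(ts))
    print("Workloads allowed now:", switch.workloads_allowed())
    # If humans vanish, check-ins stop and the switch trips:
    print("Workloads allowed in 48 hours:", switch.workloads_allowed(now=ts + 48 * 3600))
```

Of course, software alone proves nothing: a sufficiently capable AGI could simply patch such a check out, so any real version would have to be enforced in tamper-resistant hardware on the chips themselves. The sketch is only meant to show the shape of the dependency being proposed.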

In essence, this could add another layer to the deterrence strategy, one that might even improve on Cold War MAD. The old model required a deliberate act of retaliation by the attacked party to make destruction mutual. In contrast, a future where an AGI is physically dependent on human intervention might be less prone to miscalculation: if one side perishes, so does the other, automatically. It would be like holding a grenade with the safety pin already removed: any attempt to harm the holder would destroy both parties.

Obviously, a fleshed-out proposal for implementing this plan would require far more detail. For starters, it would have to confront some clear challenges: understanding the ways an AGI might circumvent these defences (and how limited they become once it reaches a certain level of strategic and planning capability), and, more generally, what it means for this theory that an AGI would have goals of its own, whereas nuclear weapons, powerful tools though they are, do not. Answering these questions will take significant effort, and there is no guarantee of a solution, but I believe the analogy between nuclear MAD and AGI strategy can still bring a fresh perspective to humanity’s most pressing problem.

[1] Dan Hendrycks, Eric Schmidt, and Alexandr Wang (2025). “Superintelligence Strategy”.

[2] Robert Rauchhaus (2009). “Evaluating the Nuclear Peace Hypothesis: A Quantitative Approach”. Journal of Conflict Resolution.

[3] Aaron Scher and Lisa Thiergart (2024). “Mechanisms to Verify International Agreements About AI Development”. MIRI Technical Governance Team.
