I don't think world dystopia is entirely necessary, but a successful long stop for AI (the ~30+ years it'll probably take) is probably going to require knocking over a couple of countries that refuse to play ball. It seems fairly hard to keep even small countries from setting up datacentres and chip factories except by threatening or using military force.
To be clear, I think that's worth it. Heck, nuclear war would be worth it if necessary, although I'm not sure it will be - I rate the PRC in particular as >50% likely to a) agree to a stop and/or b) get destroyed in a non-AI-related nuclear war in the next few years.
Deceptive alignment is a convergent instrumental subgoal. If an AI is clearly misaligned while its creator still has the ability to pull the plug, the plug will be pulled; ergo, pretending to be aligned is worthwhile ~regardless of terminal goal.
Thus, the prior would seem to be that all sufficiently-smart AIs appear aligned, but only a proportion X of them are truly aligned, where X is the chance of a randomly-selected value system being aligned; the remaining 1-X are deceptively aligned.
GPT-4 being the smartest AI we have and also appearing aligned is not really evidence against this; it's plausibly smart enough in the specific domain of "predicting humans" for its apparent alignment to be deceptive.
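To spell out why apparent alignment is weak evidence here, a rough Bayes sketch may help. The function name and the example prior of 0.1 are purely illustrative assumptions, not estimates; the only substantive assumption carried over from the argument above is that a sufficiently smart misaligned AI appears aligned with probability close to 1.

```python
def posterior_aligned(prior_aligned, p_appear_if_aligned=1.0, p_appear_if_misaligned=1.0):
    """P(truly aligned | appears aligned) via Bayes' rule.

    prior_aligned is the X from the argument above (hypothetical value supplied by the caller).
    p_appear_if_misaligned is how reliably a misaligned AI manages to look aligned.
    """
    numerator = p_appear_if_aligned * prior_aligned
    denominator = numerator + p_appear_if_misaligned * (1.0 - prior_aligned)
    return numerator / denominator


x = 0.1  # illustrative prior only, not a claim about the real value of X

# If deception is perfect, observing "appears aligned" leaves the prior unchanged.
print(posterior_aligned(x))

# If a misaligned AI only fools us half the time, the same observation is real evidence.
print(posterior_aligned(x, p_appear_if_misaligned=0.5))
```

The observation only moves the estimate to the extent that a misaligned system would have been caught; if it's smart enough to never slip, the posterior stays at X.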
Drone swarms do take time to build. Also, nuclear war is "only" going to kill a large percentage of your country's citizens; if you're sufficiently convinced that any monkey getting the banana means Doom, then even nuclear war is worth it.
I think getting the great powers on-side is plausible; the Western and Chinese alliance systems already cover the majority. Do I think a full stop can be implemented without some kind of war? Probably not. But not necessarily WWIII (though IMO that would still be worth it).