
This post is part of a series by Convergence Analysis. An earlier post introduced scenario planning for AI risk. In this post, we argue for the importance of a related concept: theories of victory.

"In order to improve your game you must study the endgame before everything else; for, whereas the endings can be studied and mastered by themselves, the middlegame and the opening must be studied in relation to the endgame." — José Raúl Capablanca, Last Lectures, 1966

Overview

The central goal of AI governance should be achieving existential security: a state in which existential risk from AI is negligible either indefinitely or for long enough that humanity can carefully plan its future. We can call such a state an AI governance endgame. A positive framing for AI governance (that is, achieving a certain endgame) can provide greater strategic clarity and coherence than a negative framing (that is, avoiding certain outcomes).

A theory of victory for AI governance combines an endgame with a plausible and prescriptive strategy to achieve it. It should also be robust across a range of future scenarios, given uncertainty about key strategic parameters.

Nuclear risk provides a relevant case study. Shortly after the development of nuclear weapons, scientists, policymakers, and public figures proposed various theories of victory for nuclear risk, some of which are striking in their similarity to theories of victory for AI risk. Proposals included an international moratorium on nuclear development, as well as unilateral enforcement of a monopoly on nuclear weapons.

We discuss three potential theories of victory for AI governance. The first is an AI moratorium, in which international coordination indefinitely prevents AI development beyond a certain threshold. This would require an unprecedented level of global coordination and tight control over access to compute. Key challenges include the strong incentives to develop AI for strategic advantage, and the historical difficulty of achieving this level of global coordination.  

The second theory of victory is an "AI Leviathan" — a single well-controlled AI system or AI-enhanced agency that is empowered to enforce existential security. This could arise from either a first-mover unilaterally establishing a moratorium, or voluntary coordination between actors. Challenges include the potential for dystopic lock-in if mistakes are made in its construction, and significant ambiguity on key details of implementation.

The third theory of victory is defensive acceleration. The goal is an endgame where the defensive applications of advanced AI sustainably outpace offensive applications, mitigating existential risk. The strategy is to coordinate differential technology development to preferentially advance defensive and safety-enhancing AI systems. Challenges include misaligned incentives in the private sector and the difficulty of predicting the offensive and defensive applications of future technology.

We do not decide between these theories of victory. Instead, we encourage actors in AI governance to make their preferred theories of victory explicit — and, when appropriate, public. An open discussion and thorough examination of theories of victory for AI governance is crucially important to humanity’s long-term future.

Introduction

What is the goal of AI governance?

The goal of AI governance is (or should be) to bring about a state of existential security from AI risks. In such a state, AI existential risk would be indefinitely negligible, allowing humanity to carefully design its long-term relationship with AI. 

It might be objected that risk management should be conceived of as a continual process in which risk is never negligible. However, this approach is not sufficient for existential risk, where any non-negligible level is unsustainable. If an existential risk is non-negligible, then, given a sufficient period of time, an existential catastrophe is all but certain.
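To make this concrete, here is a minimal back-of-the-envelope illustration. It assumes, for simplicity, a constant annual probability of catastrophe p; the 0.1% figure is ours, chosen purely for illustration:

$$
P(\text{catastrophe within } T \text{ years}) = 1 - (1 - p)^{T} \longrightarrow 1 \text{ as } T \to \infty, \text{ for any } p > 0.
$$

Even an annual risk of just 0.1% implies roughly a 63% chance of catastrophe within 1,000 years.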

Existential security as a positive goal

AI governance has often framed the problem of existential security negatively. That is, in order to achieve existential security, we must mitigate several risks, each described by a distinct threat model — for example, loss of control, or catastrophic misuse.

However, we can also frame the problem positively: if we can describe a state of existential security, then we can work to bring about that state.[1] This gives us something to work towards, rather than just several outcomes to avoid. Note that while we are taking a positive approach to existential security, we are not addressing positive goals beyond existential security: an endgame is not a description of human flourishing — it only preserves optionality to enable flourishing.

So how do we achieve the positive goal of existential security from AI?

In warfare, the role of a positive goal is filled by a theory of victory. A theory of victory describes the conditions for victory, as well as a big-picture strategy for achieving those conditions.

This term has also sometimes been used in the context of AI governance. An AI governance theory of victory is a description of AI existential security and a story about how to bring it about.

Motivation

Existential security is a difficult standard to satisfy. Therefore, it’s worthwhile to describe and evaluate the options we have. In other words, if we want to make an informed decision about the long-term future of humanity, we need to identify the scenarios in which humanity has a long-term future.

Framing AI governance in terms of theories of victory might also have some practical benefits. The first benefit is strategic clarity: in general, it’s easier to conceptualize how to bring about some particular outcome than it is to avoid several different outcomes.

As an example, consider the opening moves of a chess game. There are several opening traps to which new players often fall victim. Instead of learning how to avoid each trap separately, it's often easier for new players to learn an opening system: an arrangement of their pieces that they can construct without thinking much about how to respond to each of their opponent’s moves.

One popular example is the London System, in which White develops the same solid piece setup against most of Black's replies.

Similarly, it might be easier to pursue a theory of victory for AI governance rather than intervene in each AI threat model separately — and what’s easier might be more likely to succeed.

Framing AI governance in terms of theories of victory might also minimize strategic incoherence among AI governance actors. For example, a strategy intended to intervene in a particular threat model might accidentally undermine strategies for intervening in other threat models. More broadly, different actors in AI governance might be implicitly (or explicitly) pursuing incompatible theories of victory.

Criteria for a theory of victory

Endgames and strategies

A theory of victory involves two components: some desired outcome (an endgame), and a story about how we get there (a strategy). Both components can be evaluated according to several criteria.

Evaluating endgames. An endgame should reduce existential risk to a negligible level, maintain (or further reduce) that level over a very long time, and not lead to an obviously undesirable future for humanity. For example, we might require that the “expected time to failure” for AI risk be at least several hundred or several thousand years. In other words, an endgame should be existentially secure.
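Using the same constant-annual-risk simplification as above (our assumption, made only for illustration), the expected time to failure is simply the reciprocal of the annual risk:

$$
\mathbb{E}[\text{time to failure}] = \frac{1}{p},
$$

so requiring an expected time to failure of at least 1,000 years corresponds to an annual existential risk of at most 0.1%.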

To avoid a clearly undesirable future, we also want humanity to be able to “keep its options open.” That is, we don’t want to lock in a dystopic (or even just a mediocre) future, even if it is existentially secure. Instead, we might want to enable a period of “long reflection” in which we spend time figuring out what we want our long-term relationship with AI to be. As Katja Grace has pointed out, even if AI doesn’t pose an existential risk, the “first” (i.e. the default) future isn’t necessarily the best future. This is also true of the “first” AI safety endgame. So, we also want an endgame to preserve optionality.

Evaluating strategies. First, a strategy should be plausible. For example, a strategy to convince all major AI labs and corporations to immediately and voluntarily stop developing advanced AI isn’t particularly plausible.

A strategy should also sufficiently prescribe action. For example, “getting lucky” (e.g., an unforeseen technical barrier happens to indefinitely halt AI development) isn’t prescriptive.

Scenario robustness

A theory of victory should satisfy the criteria above across all plausible scenarios. A scenario is a particular combination of strategic parameters (for example, timeline to TAI, takeoff speed, difficulty of technical safety, etc.). In other words, a theory of victory should be robust.

A robust theory of victory likely isn’t just one endgame and strategy pair, but rather a set of conditionals (e.g., if scenario 1 holds, then pursue endgame A with strategy X; if scenario 2 holds, then pursue endgame B with strategy Y).[2]
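As a purely illustrative sketch (the scenario, endgame, and strategy labels below are generic placeholders mirroring the example above, not recommendations), such a conditional theory of victory amounts to a simple mapping from scenarios to endgame-strategy pairs:

```python
# Toy sketch: a robust theory of victory as a set of conditionals,
# mapping each scenario to an (endgame, strategy) pair.
# All labels are generic placeholders, not actual recommendations.
CONDITIONAL_TOV = {
    "scenario 1": ("endgame A", "strategy X"),
    "scenario 2": ("endgame B", "strategy Y"),
}

def plan_for(scenario: str) -> tuple[str, str]:
    """Return which endgame to pursue, and how, under the given scenario."""
    # Fall back to option-preserving interventions if the scenario is unanticipated.
    return CONDITIONAL_TOV.get(scenario, ("undetermined", "pursue option-preserving interventions"))

print(plan_for("scenario 1"))  # ('endgame A', 'strategy X')
```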

Evaluating interventions

We can evaluate governance interventions with respect to how well they support one or more theories of victory (ToVs). Some interventions might support several ToVs. Some might support only one. Some might support some ToVs at the expense of others.

In the absence of a clearly-preferable theory of victory, we might prefer interventions which support several theories of victory. Such interventions would tend to give you more options later on — even though you don’t know exactly what those options are. In the case of AI risk, these might be interventions like raising public awareness of AI risk and expanding the capabilities of AI governance organizations.

If no (or few) such interventions exist, then we might instead prefer to “go all in” on one theory of victory, or a small group of compatible theories of victory.

Existential risk not from AI

Mitigating existential risk from AI is an imperfect proxy for mitigating existential risk in general. There are several plausible sources of existential risk, and the theory of victory which best mitigates existential risk from AI might not be the strategy which best mitigates existential risk overall.

For example, access to advanced AI might help humanity secure itself against nuclear, biological, and climate risks. Therefore, if two theories of victory mitigate AI risk equally well, we might prefer the one that allows humanity to access the benefits of advanced AI sooner.

Nuclear risk as a case study

The richest precedent for an emerging technology presenting existential risk is the development of nuclear weapons.

Scientists, policymakers, and public figures proposed various theories of victory for nuclear risk, several of which are striking in their similarity to theories of victory for AI risk. These theories of victory are informative both in their similarities to and their differences from the AI case.

Let us first note that the world has not reached a state of existential security from nuclear risk. This is because the current deterrent to nuclear war, mutually assured destruction (MAD), does not satisfy the criteria for existential security. MAD mitigates nuclear risk only to the extent that nuclear powers can be modeled as rational agents.[3]

International coordination to prevent nuclear development

Perhaps the simplest theory of victory for nuclear risk would have been international coordination to refrain from creating nuclear weapons. Had nuclear technology not developed in the context of the second world war, this theory of victory might not have been doomed to fail from the start. One Manhattan Project scientist, Otto Frisch, commented on the pressure the war exerted:

“Why start on a project which, if it was successful, would end with the production of a weapon of unparalleled violence, a weapon of mass destruction, such as the world had never seen? The answer was very simple. We were at war, and the idea was reasonably obvious; very probably some German scientists had had the same idea and were working on it.” (Rhodes 1986, p. 325)

However, at the end of the second world war in 1945, only a single actor — the United States — had the ability to produce nuclear weapons. It wasn’t until 1949 that the Soviet Union would successfully test its first nuclear weapon. It wouldn't be until the development of the hydrogen bomb and the dramatic expansion of both countries' arsenals that nuclear weapons would present a genuine existential risk.

Therefore, it’s possible that existential risk could have been averted by preventing the expansion of nuclear arsenals even after the initial development of nuclear weapons. This would have required strong international coordination — and, in the extreme, the establishment of world government. Even Edward Teller — the “father of the hydrogen bomb” — at one point advocated for world government:

"We must now work for world law and world government. . . . Even if Russia should not join imme­diately, a successful, powerful, and patient world government may secure their cooperation in the long run. . . . We [scientists] have two clear-cut du­ties: to work on atomic energy and to work for world government which alone can give us freedom and peace." (Rhodes 1986, p. 766)

Albert Einstein also advocated for world government, arguing that the other available options — preventative war and an armed peace among nuclear-armed states — were intolerable. He wrote:

“The first two suggested policies lead inevitably to a war which would end with the total collapse of our traditional civilization. The third indicated policy may bring about the acceptance by the Soviet bloc of the offer of federation. If they will not accept federation, we lose nothing not already lost. If as seems probable, the world has a period of an armed peace, time and events may bring about a change in their policy. 

We have then the choice of acceptance, in the first two cases, of the inevitability of war or, in the latter case, of the possibility of peace. Confronted by such alternatives we believe that all constructive lines of action must be in keeping with the need of establishing a Federal World Government.”

Niels Bohr, another eminent physicist, attempted to facilitate coordination between the US and the USSR in order to avoid a nuclear arms race. He initially succeeded in interesting the US president, FDR, in the merits of his approach, but ultimately failed after an ill-fated meeting with Winston Churchill, to whom FDR deferred.

In principle, however, nuclear risk exhibits some features that would have made preventing nuclear development through international coordination a promising theory of victory. These features probably account for why nuclear disarmament remains a leading theory of victory for nuclear risk organizations.

Nuclear weapons are a single-use technology. In contrast to dual-use technologies like AI, much (though not all) of the technological development that went into the creation of nuclear weapons had only a singular, destructive use. Accordingly, many scientists would likely have had reservations about working on nuclear weapon development in peacetime. State actors might also have faced significant resistance, both internal and external, to devoting significant resources to nuclear weapon development.

The proliferation of nuclear weapons is limited. Even today, nuclear weapon development is restricted to well-resourced state actors, which limits the number of possible developers. Nuclear weapons development is also a relatively visible process in peacetime (albeit imperfectly and fragilely so), for example to satellite imagery and external inspectors.

A unilateral monopoly on nuclear weapons

During the four years in which it was the only nuclear power, the United States possessed (at least the threat of) a decisive strategic advantage over the rest of the world, and might have been able to leverage that advantage to prevent other actors from developing nuclear weapons.

Shortly after the bombings of Hiroshima and Nagasaki, Bertrand Russell proposed unilateral action as a possible theory of victory in an essay titled The Bomb and Civilization. The endgame of this theory of victory is the same as that of international coordination, namely centralized control of nuclear weapons technology:

Either war or civilization must end, and if it is to be war that ends, there must be an international authority with the sole power to make the new bombs. All supplies of uranium must be placed under the control of the international authority, which shall have the right to safeguard the ore by armed forces. As soon as such an authority has been created, all existing atomic bombs, and all plants for their manufacture, must be handed over. And of course the international authority must have sufficient armed forces to protect whatever has been handed over to it. If this system were once established, the international authority would be irresistible, and wars would cease.

However, the strategy Russell proposes for achieving this endgame is not international coordination, but rather unilateral action by the United States:

The power of the United States in international affairs is, for the time being, immeasurably increased; a month ago, Russia and the United States seemed about equal in warlike strength, but now this is no longer the case. This situation, however, will not last long, for it must be assumed that before long Russia and the British Empire will set to work to make these bombs for themselves. [...]

It would be possible for Americans to use their position of temporary superiority to insist upon disarmament, not only in Germany and Japan, but everywhere except in the United States, or at any rate in every country not prepared to enter into a close military alliance with the United States, involving compulsory sharing of military secrets. During the next few years, this policy could be enforced; if one or two wars were necessary, they would be brief, and would soon end in decisive American victory. In this way a new League of Nations could be formed under American leadership, and the peace of the world could be securely established.

Einstein, while ultimately rejecting this option, describes it as “preventive war”:

The first policy is that of the preventive war. It calls for an attack upon the potential enemy at a time and place of our own choosing while the United States retains the monopoly of the atomic bomb.

He also lays out the (rather compelling) reasons why such a war should, at best, be a last resort:

No military leader has suggested that we could force a Russian surrender without a costly ground force invasion of Europe and Asia. Even if victory were finally achieved after colossal sacrifices in blood and treasure, we would find Western Europe in a condition of ruin far worse than that which exists in Germany today, its population decimated and overrun with disease. We would have for generations the task of rebuilding Western Europe and of policing the Soviet Union. This would be the result of the cheapest victory we could achieve. Few responsible persons believe in even so cheap a victory.

Comparing nuclear risk and AI risk

Decisive strategic advantage. One of the most important similarities between nuclear weapons and transformative AI is that both offer a decisive strategic advantage to those who control them over those who don’t. The effect of this similarity is that the development of both nuclear weapons and TAI can create racing dynamics.

Threat models. The only plausible threat model for existential risk from nuclear weapons is intentional use, whether malicious or the result of a prior commitment made for the sake of deterrence. This threat model is exacerbated by structural pressures such as arms races and imperfect information about adversaries’ intentions and capabilities.

Nuclear weapons do not pose an intrinsic risk of existential accident — in other words, they are well-controlled. While Manhattan Project scientists speculated that a nuclear detonation could ignite the atmosphere in a self-sustaining reaction, this worry proved to be unfounded. Prospective nuclear powers do not need to worry about existential risk resulting from loss of control during the development of nuclear weapons.

In contrast, a major threat model for AI risk involves loss of control over a superintelligent system. Depending on the likelihood of such loss of control, it may not be in the best interest of any particular actor to develop transformative AI, even though it might grant them a decisive strategic advantage.

Single use, dual use, and general purpose. Nuclear weaponry is a “single use” technology — which means it has one purpose (and not that nuclear weapons can’t be reused, although that’s also true). In contrast, transformative AI is “dual use” — it can be used for both creative and destructive ends — and trivially so. That’s because transformative AI will likely be general purpose: whatever you want to do, transformative AI can help you do it better.

One implication of this difference is that there are reasons to develop transformative AI apart from its effects on the strategic balance — for example, advances in medical technology or economic productivity. All else being equal, then, it would be more difficult to make a case against the development of transformative AI than it would have been to make a case against the development of nuclear weapons.

Another implication is that transformative AI might offer defensive capabilities unavailable to nuclear weaponry. A nuclear weapon can’t defend against another that has already been deployed — it can only deter it from being deployed by way of MAD.[4] In contrast, transformative AI can enhance defensive as well as offensive capabilities.

Distribution. Like nuclear weapons, the development of the first transformative AI systems will likely be limited to well-resourced actors like large corporations and nations. Current frontier AI models cost hundreds of millions of dollars to train, and that cost is projected to increase.

Unlike nuclear weapons, however, AI models can be copied and distributed after being trained. The first transformative AI systems will potentially be the target of cyber-theft — if the developer doesn’t open-source the model to begin with. Transformative AI systems therefore have the potential to proliferate much faster than nuclear weapons.

Theories of victory for AI governance

AI development moratorium

Description

The first theory of victory for AI risk we’ll discuss is an AI development moratorium, which would leverage international coordination and technical governance to indefinitely prevent AI development beyond a specified threshold or of particular kinds.

Anthony Aguirre lays out the case for an AI development moratorium in his essay, “Close the Gates to an Inhuman Future: How and why we should choose to not develop superhuman general-purpose artificial intelligence”:

1. We are at the threshold of creating expert-competitive and superhuman GPAI[5] systems in a time that could be as short as a few years.

2. Such “outside the Gate” systems pose profound risks to humanity, ranging from at minimum huge disruption of society, to at maximum permanent human disempowerment or extinction.

3. AI has enormous potential benefits. However, humanity can reap nearly all of the benefits we really want from AI with systems inside the Gate, and we can do so with safer and more transparent architectures.

4. Many of the purported benefits of SGPAI are also double-edged technologies with large risk. If there are benefits that can only be realized with superhuman systems, we can always choose, as a species, to develop them later once we judge them to be sufficiently – and preferably provably – safe. Once we develop them there is very unlikely to be any going back.

5. Systems inside the Gate will still be very disruptive and pose a large array of risks – but these risks are potentially manageable with good governance.

6. Finally, we not only should but can implement a “Gate closure”: although the required effort and global coordination will be difficult, there are dynamics and technical solutions that make this much more viable than it might seem.

Similarly, in an essay for Time, Eliezer Yudkowsky describes what an AI development moratorium might require:

The moratorium on new large training runs needs to be indefinite and worldwide. There can be no exceptions, including for governments or militaries. If the policy starts with the U.S., then China needs to see that the U.S. is not seeking an advantage but rather trying to prevent a horrifically dangerous technology which can have no true owner and which will kill everyone in the U.S. and in China and on Earth. [...]

Shut down all the large GPU clusters (the large computer farms where the most powerful AIs are refined). Shut down all the large training runs. Put a ceiling on how much computing power anyone is allowed to use in training an AI system, and move it downward over the coming years to compensate for more efficient training algorithms. No exceptions for governments and militaries. Make immediate multinational agreements to prevent the prohibited activities from moving elsewhere. Track all GPUs sold. If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.

Discussion

As with a nuclear development moratorium, the case for an AI development moratorium is clear: don’t build the technology that presents existential risk. As long as it’s successful, a moratorium would guarantee existential security from AI risk.

It would also preserve optionality. Limiting the development of AI that presents existential risk would still allow humanity to realize the vast, as-yet-unrealized benefits of less powerful systems. At some point, humanity might also decide to lift the moratorium if robust safeguards have been established.

The challenge, however, is successfully establishing and indefinitely enforcing a moratorium. Potential developers have strong reason to develop advanced AI in order to prevent adversaries from gaining a decisive strategic advantage. Even if they didn’t, advanced AI has enormous positive potential, making a moratorium less appealing.

Ensuring that every actor with the capability to continue developing AI refrains from doing so would require an unprecedented level of global coordination. At a minimum, it would require establishing strong international institutions with effective control over access to and use of compute.

The historical record is mixed on the likelihood of successful international coordination. For example, nuclear nonproliferation has been largely successful since the end of the Cold War (with the exception of North Korea), but disarmament has been less successful. Similarly, we might take fossil fuels to be an analogue to compute, insofar as both are resources whose unrestricted use poses catastrophic risk. This would make the failure of global coordination on climate change particularly relevant.

In both of these cases there are also reasons for some optimism. For example, humanity did eventually stop testing increasingly destructive bombs and has collectively reduced its global arsenal since. Additionally, investment in renewable energy has led to declining CO2 emissions in many countries.

However, these successes are arguably due to features of nuclear and climate risk not shared by AI risk. The use of individual weapons did not present existential risk, and the horror of Hiroshima and Nagasaki turned global opinion against their use. Likewise, the gradual buildup of climate risk has allowed enough time for the development of expert consensus and renewable forms of energy.

As a strategy, an AI development moratorium faces some significant practical challenges. However, it remains plausible and prescriptive.

AI Leviathan

Description

The second theory of victory we’ll discuss is an AI Leviathan. This theory of victory describes an endgame in which TAI itself is used to enforce a monopoly on TAI development. The AI Leviathan might be created by the first actor to develop TAI, analogous to Bertrand Russell’s suggestion that the US could have established a monopoly on nuclear weapons development during the period in which it was the sole nuclear power. Or, several actors might establish an AI Leviathan voluntarily in order to avoid the perils of a multipolar scenario.

The idea of an “AI Leviathan” is explored by Samuel Hammond in a 3-part series, AI and Leviathan. In Part I he writes:

[...] the coming intelligence explosion puts liberal democracy on a knife edge. On one side is an AI Leviathan; a global singleton that restores order through a Chinese-style panopticon and social credit system. On the other is state collapse and political fragmentation [...]

Nick Bostrom explores the idea of an AI Leviathan as a global enforcement agency in his book Superintelligence. He writes:

In order to be able to enforce treaties concerning the vital security interests of rival states, the external enforcement agency would in effect need to constitute a singleton: a global superintelligent Leviathan. One difference, however, is that we are now considering a post-transition situation, in which the agents that would have to create this Leviathan would have greater competence than we humans currently do. These Leviathan-creators may themselves already be superintelligent. This would greatly improve the odds that they could solve the control problem and design an enforcement agency that would serve the interests of all the parties that have a say in its construction.

Discussion

An AI Leviathan could solve the enforcement challenges of an AI development moratorium. An actor with access to TAI would have a decisive strategic advantage over actors without TAI, and would presumably be able to effectively and indefinitely enforce such a moratorium.

The first actor to develop TAI could also circumvent coordination challenges by unilaterally establishing a moratorium (by force or through superintelligent diplomacy).

However, the relative merit of an AI Leviathan depends on the likelihood of the agentic threat model. If the likelihood that the first TAI system pursues problematic goals is greater than the likelihood that an AI development moratorium succeeds without an AI Leviathan, then an AI Leviathan isn't worth it.

Another challenge of an AI Leviathan is the potential for lock-in. That is, once an AI Leviathan has been empowered, if humanity has made mistakes in its construction, it might be impossible to change course — eliminating optionality. Hammond argues for a “bounded leviathan” that would allow humanity to retain control over such an entity — but this limitation might trade off against effective enforcement of a moratorium. Control over a moratorium-enforcing Leviathan would introduce a point of failure that could be gamed by malicious (or unwise) actors.

Nonetheless, creating an AI Leviathan is plausible, at least in scenarios in which the problem of technical control is solved. A carefully created AI Leviathan could resolve coordination challenges across individual humans and groups of humans if properly empowered to do so.

It is also a prescriptive strategy — the prescription being that a single actor develop well-controlled TAI and leverage it to enforce AI governance mechanisms. However, such a plan is clearly underspecified. How would we ensure that a TAI-empowered enforcement agency represented humanity’s collective interests? What are our collective interests? Who should build it? When and how should it be empowered?

Defensive acceleration

Description

The final theory of victory for AI risk we’ll examine is defensive acceleration. The strategy proposed by defensive acceleration is to leverage advanced AI to develop defensive technologies (for example, AI-empowered technical safety research, cybersecurity, or biosecurity), leading to an endgame in which the defensive applications of TAI outpace its offensive applications.

The term “defensive acceleration” (or d/acc) was introduced by Vitalik Buterin, the founder of Ethereum, in a 2023 essay. However, defensive acceleration takes inspiration from differential technological development, a principle proposed by Nick Bostrom in 2002 and more recently taken up by Jonas Sandbrink, Hamish Hobbs, Jacob Swett, Allan Dafoe, and Anders Sandberg in a 2022 paper.

The basic idea of differential technology development (DTD) is to use a specific pathway of technological development to ensure a desirable endgame. The authors of the paper write that the approach:

involves considering opportunities to affect the relative timing of new innovations to reduce a specific risk across a technology portfolio. For instance, it may be beneficial to delay or halt risk-increasing technologies and preferentially advance risk-reducing defensive, safety, or substitute technologies.

As Buterin notes, the core idea across DTD and d/acc is “that some technologies are defense-favoring and are worth promoting, while other technologies are offense-favoring and should be discouraged.” To make this point, Buterin shares Figure 1 from the DTD paper.


Discussion

As an endgame, successful defensive acceleration is existentially secure. If our AI development pipeline ensures that the offense-defense balance indefinitely favors defense, then existential risk from AI might be made negligible. It also preserves optionality. Some offensive technologies would need to be deferred until the appropriate defensive technological infrastructure is developed — but no technology is ultimately off-limits.

As a strategy, however, defensive acceleration faces challenges to its plausibility. While it is certainly possible that technologists and governments could successfully coordinate to favor defensive technologies indefinitely, it is neither the default nor an easily achievable outcome. For example, if defensive acceleration requires international coordination to slow down offensive AI development, then it faces the same game-theoretic challenges as an AI development moratorium.

It’s possible that an actor with a significant technological lead could unilaterally develop and proliferate defensive technologies to ensure a defensive balance. However, that’s not the default outcome, since several private companies currently play the largest role in advancing AI capabilities. Private companies left to their profit-seeking incentives should not be expected to carefully shape a technological development pipeline that favors robustly good defensive technologies.

Fortunately, there are fairly straightforward prescriptive elements to defensive acceleration which, if applied, could mitigate this concern about the incentives of technological development in the marketplace. The prescription is for governments to intervene to incentivize the development of defensive technologies. This might include large public investments in defensive R&D, financial incentives for companies to pursue similar lines of research, and the outright prohibition of offensive technologies.

Unfortunately, it isn’t always clear which defensive technologies should be prioritized. What’s more, sometimes it isn’t even clear in advance whether a technology will be primarily offensive or defensive. 

Overall, defensive acceleration is plausible, but challenging. It might also be a better complement to another theory of victory (for example, an AI development moratorium) than a stand-alone theory of victory.  

Conclusion

We’ve argued that theories of victory can be important concepts for AI governance. Thinking in terms of theories of victory could aid strategic clarity and minimize strategic incoherence. A theory of victory for existential security should include two features: a state of existential security that preserves optionality, and a plausible and prescriptive story for how we get there.

The history of nuclear weapons development provides two examples of theories of victory in a domain of existential risk. First, international coordination might have been able to prevent the development of nuclear weapons, or to intervene in the expansion of nuclear arsenals before they presented existential risk. Second, the US could have established a monopoly on nuclear weapons development during the period in which it was the sole nuclear power.

We discuss three possible theories of victory for AI governance: an AI development moratorium, an AI Leviathan, and defensive acceleration. The first two are analogous, respectively, to the two theories of victory for nuclear risk above. The third relies on a key distinction between AI and nuclear weapons: the former can be used to expand defensive as well as offensive capabilities.

Perhaps the most important takeaway is that we encourage actors in AI governance to make their preferred theories of victory explicit — and, when appropriate, public. An open discussion and thorough examination of theories of victory for AI governance is crucially important. In particular, we’d be excited about follow-up work that:

  • Proposes novel theories of victory.
  • Thoroughly describes and evaluates a particular proposed theory of victory.
  • Identifies strategic compatibilities and incompatibilities between theories of victory.
  • Maps the explicit or implicit theories of victory of different actors in AI governance.

Acknowledgements: thank you to Elliot McKernon, Zershaaneh Qureshi, Deric Cheng, and David Kristoffersson for feedback

  1. ^

     A related (though distinct) concept is that of “success stories.” For example, see here and here.

  2. ^

     We could also say that we should have a robust portfolio of theories of victory. The distinction isn’t important.

  3. ^

     Though perhaps nations with access to TAI could be sufficiently well modeled as rational agents.

  4. ^

     That being said, MAD has so far prevented widespread nuclear destruction: no nuclear weapons have been deployed for destructive purposes since 1945.

  5. ^

     Aguirre uses the term superhuman general-purpose AI (SGPAI) as a synonym for what the AI safety community usually means by AGI or ASI.

     

     

Comments

I'd also consider a scenario where the world is multi-polar, but not vastly so and the actors are able to avoid conflict either due to being able to get along or mutually assured destruction. Then again, this may not be sufficiently robust for you to include it.

That's really really not robust since it strongly implies races toward greater intelligence and power. You can't assume the multi-polar detente will hold across increasing levels of technology and machine intelligence. This would, furthermore, be expected to be a rapid process. So this situation would turn into some other situation after only a few years.

Executive summary: AI governance needs explicit "theories of victory" that describe desired end states of existential security and strategies to achieve them, with three potential approaches being an AI moratorium, an "AI Leviathan", or defensive acceleration.

Key points:

  1. A theory of victory for AI governance should include an existentially secure endgame that preserves optionality, and a plausible strategy to achieve it.
  2. An AI moratorium would prevent development beyond a threshold, but faces major coordination challenges.
  3. An "AI Leviathan" would use the first transformative AI to enforce a monopoly, but risks lock-in of mistakes.
  4. Defensive acceleration aims to outpace offensive AI capabilities with defensive ones, but requires careful technological development.
  5. Nuclear weapons history offers relevant precedents, though AI differs in being dual-use and potentially self-improving.
  6. AI governance actors should make their preferred theories of victory explicit to enable open discussion and examination.

 

 

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

Maybe we can have a "theory of failure".

That said, since ASI is basically bound to override humans, the only real option is to figure out how to adapt to that.

A theory of victory approach won't work for AI. Theories of victory are borne out of a study of what hasn't worked in warfare.  You've got nothing to draw from in order to create an actual theory of victory. Instead, you appear to be proposing a few different strategies, which don't appear to be very well thought out.

You argue that the U.S. could have established a monopoly on nuclear weapons development. How? The U.S. lost its monopoly to Russia due to acts of Russian espionage that took place at Los Alamos. How do you imagine that could have been prevented?

AI is software, and in software security, offense always has the advantage over defense. There is no network that cannot be breached w/ sufficient time and resources because software is inherently insecure. 

While the USSR was indeed able to exfiltrate secrets from Los Alamos to speed up its nuclear program, it took a few more years for it to actually develop a nuclear weapon.

Russell (and we don't necessarily agree here) argued that the US could have established a monopoly on nuclear development through nuclear coercion. That strategy doesn't have anything to do with preventing espionage.

Once the genie is out of the bottle, it doesn't matter, does it? Much of China's current tech achievements began with industrial espionage. You can't constrain a game-changing technology while excluding espionage as a factor. 

It's exactly the same issue with AI. 

While you have an interesting theoretical concept, there's no way to derive a strategy from it that would lead to AI safety that I can see. 

The idea for this particular theory of victory is that, if some country (for example, the US) develops TAI first, it could use TAI to prevent other countries (for example, China) from developing TAI as well — including via espionage.

If TAI grants a decisive strategic advantage, then it follows that such a monopoly could be effectively enforced (for example, it’s plausible that TAI-enabled cybersecurity would effectively protect against non-TAI cyberoffense).

Again, I’m not necessarily endorsing this ToV. But it does seem plausible.
