Components of Strategic Clarity [Strategic Perspectives on Long-term AI Governance, #2]

MMMaas

This is post 2 of an in-progress draft report called Strategic Perspectives on Long-term AI Governance (see sequence).

In the five years since Allan Dafoe’s research agenda appeared, the field of Long-term AI Governance has seen many developments. However, the field is still young and remains open-ended.

Working definitions

There is uncertainty over key terms within the field, concerning both the different types of advanced AI and the practice and field of AI governance itself. Each has been defined in different ways [see Appendix 1].

For this discussion, I define Transformative AI as:

“AI systems that have extreme and practically irreversible effects on the long-term trajectory of society, including disruption comparable to the industrial revolution, and/or potential existential risks.”

I define the field of Long-term AI Governance as:

“The study and shaping of local and global governance systems—including norms, policies, laws, processes, politics, and institutions—that affect the research, development, deployment, and use of existing and future AI systems in ways that positively shape societal outcomes into the long-term future.”

I define a Strategic Perspective as:

“A cluster of correlated views on long-term AI governance, encompassing (1) broadly shared assumptions about the key technical and governance parameters of the challenge; (2) a loose theory of victory or impact story about what solving this problem would look like; (3) a set of historical analogies to provide comparison, grounding, inspiration, or guidance; and (4) a set of intermediate strategic goals to be pursued, and near-term interventions or actions that contribute to reaching them.”

Challenges for Long-term AI Governance

The Long-term AI Governance community faces a series of challenges. It remains an incipient field with a small pool of active researchers from just a handful of institutions. It remains at least partially pre-paradigmatic, with underdeveloped or unpursued research lines. There are also challenges around intra-community legibility: many researchers are not aware of what others are working on, or of the assumptions, views, or cruxes that drive different people’s choices about what to research and investigate.

In search of strategic clarity

The lack of clarity is a problem, because the community of Long-term AI Governance identifies strongly as an impact-focused project. We are not just intellectually curious about advanced AI; we are motivated to find ways to make this future go well.

However, as has been noted, the community lacks strategic clarity around which intermediate goals or policy actions should be pursued. Specifically, there is pervasive uncertainty not just about the technical landscape of TAI, but also about robust nearer-term activities or even goals for policy. As a consequence, various people have emphasized the importance of high-quality research to ‘disentangle’ the field, scope out key parameters for governance, and identify interventions that are robustly good.

Is strategic clarity what we need most? There could be debate over whether an effective Long-term AI Governance community requires:

strategic clarity - i.e. a sensible and grounded theory of change, providing a detailed, even ‘gears-level’ model of both the technical landscape and the policy world, with a resulting clear roadmap for near-term or intermediate interventions;
strategic consensus - i.e. (almost) everyone in the Long-term AI Governance community shares the same roadmap or perspective, that is, the same model of strategic clarity; or
strategic coherence - i.e. interventions by different people or communities in the Long-term AI Governance community don't catastrophically interfere with or erode one another.

It is unclear whether achieving strategic clarity would be enough to create strategic consensus at the expert or community level, although the two are likely correlated. If they can be decoupled, it is not clear whether the Long-term AI Governance community necessarily requires, or would currently gain from, achieving full strategic consensus.

On the one hand, translating strategic clarity into strategic consensus could ensure full alignment of the community, and avoid disagreements and tensions over interventions;
On the other hand, entertaining a portfolio of different perspectives that lack consensus, but which each have internal strategic clarity, could be an acceptable or even preferable meta-strategy for the purposes of Long-term AI Governance, so long as we ensure minimal strategic coherence or non-interference amongst the interventions pursued by different groups.

In either case, to pursue any long-term-oriented interventions for TAI, and even to begin exploring questions of strategic consensus and coherence (and their tradeoffs), the Long-term AI Governance community requires at least one account of the world that provides strategic clarity for action, given its background world assumptions.

How existing work contributes to the components of strategic clarity

What is holding us back from gaining strategic clarity? One background cause is the wide range of philosophical, technical, political, and other views that people hold on the subject of advanced AI.

Strategic clarity in Long-term AI Governance requires several ingredients:

A detailed and gears-level account of the strategic parameters of the problem;
An understanding of all available or potential options (e.g. assets, levers, interventions) that could contribute solutions;
A theory of impact or victory for comparing and prioritizing amongst these governance solutions, based on an account of the strategic parameters of the problem.

Existing work in the field has already contributed to these components (see table 2; Appendix 2).

Type	Focus of work	Includes work on:
1st order	Understanding strategic parameters of the long-term AI governance problem	Technical parameters: Technical landscape: Timelines; Architectures (e.g. AGI, CAIS, PASTA); Pathways (e.g. scaling hypothesis) vs. barriers; Takeoff speeds; Historical analogies for disjunctive development; Epistemic terrain: (lack of) advance warning signs of capability breakthroughs; Distribution of AGI programs, and of relevant inputs; Trends in relevant inputs; [...]. Direct existential threat models: 'Superintelligence' alignment failure; 'What Failure Looks Like' (1&2); War; Misuse; Intersection with other risks (e.g. nuclear escalation); Multipolar failures; Suffering risks; [...]. Indirect effects on existential risk factors: Intermediate political impacts; Effects on epistemic security; Effects on coordination architectures; [...]. Technical alignment approaches [various overviews]. Governance parameters: Structural features of the TAI governance challenge, as: Global Public Good problem; Collective Action problem; Technology Race; Involving risks from accident, misuse, structure; [...]. Likely prevailing governance conditions: Global perceptions of AI; policymaker perceptions of AI; Existing or likely near-term governance regimes which will affect TAI; [...]. Historical precedents and lessons for AI governance. Desiderata for ideal AI governance. [...]
2nd order	Understanding potential options for long-term AI governance solutions	Mapping distribution of current assets: Topology of institutions active in the space; Distribution of talent; Funding landscape; [...]. Mapping individual career options. Mapping key TAI actors to influence: The relative relevance of influencing different types of actors (e.g. firms vs. governments vs. academia); The relative relevance of influencing particular actors (e.g. US, China, EU, ...). Mapping possible levers for pursuing goals: Sets of tools available to different actors to shape TAI (e.g. export control regulation, arms control treaties, lab policies, defence-in-depth, compute governance, [...]); The different pathways by which these interventions might be realized and implemented. Articulating specific proposals for long-term-relevant AI intervention 'products'. [...]
3rd order	Articulating individual theories of impact / victory (allowing the selection or prioritization amongst solutions, given a particular view of the problem)	In technical AI alignment, e.g.: Wei Dai’s ‘AI Safety Success Stories’; Neel Nanda’s ‘Overview of the AI Alignment Landscape’ and ‘A Longlist of Theories of Impact for Interpretability’; Evan Hubinger’s ‘A positive case for how we might succeed at prosaic AI alignment’; [...]. In long-term AI governance, e.g.: Allan Dafoe's 'Asset-decision' model; Jade Leung's 'decision-influence' model; Ben Garfinkel's 'Pathways for Impact'; Seth Baum's 'affecting the future of AI governance' [etc.] …
4th order	Mapping and comparing strategic perspectives (to prioritize amongst or coordinate between work pursuing different theories of impact)	(this project)

Table 2: different types of Long-term(ist) AI Governance work, on distinct components of strategic clarity (see also Appendix 2).

Such work has been rich and nuanced. However, it does not yet appear to have led to significantly increased strategic consensus, at a level where it has won over a large part of the Long-term AI Governance community. Moreover, the positions that have been informally articulated do not exhaust the space of possible positions. As such, achieving any or all of strategic clarity, consensus, or coherence in Long-term AI Governance can gain from an additional component:

A mapping of strategic perspectives: a comparative understanding of how each given theory of victory relates to, conflicts with, or supports other contending theories and perceptions.

The purpose of this project is to provide a sketch for such a mapping.
