Summary & Information About Applying
While many people are interested in long-term AI governance, there is currently no scalable introduction to the field that offers substantial breadth, depth, context, accountability, and information about relevant career opportunities. Anecdotally, finding even a few of those things can be tough. Aiming to improve this state of affairs, I’m excited to introduce this AI governance course. This program seeks to efficiently bring people up to speed on long-term issues in AI governance through an 11-week virtual course. It consists of 8 weeks of readings, facilitated group discussions, speaker sessions, and a 4-week capstone project, for a total of ~3-4 hours per week.
Collaborators, advisors, and acknowledgements:
In creating the curriculum, I drew gratefully and extensively from eight older AI governance reading lists, collaboration with Richard Ngo (a former machine learning research engineer at DeepMind, now working on the policy team at OpenAI), generously detailed feedback from Ben Garfinkel (The Centre for the Governance of AI’s Acting Director) and Luke Muehlhauser (Open Philanthropy’s AI Governance and Policy Program Officer), and additional useful advice from Jenny Xiao (Columbia University), Sam Clarke (CSER), a researcher from CSET/FHI, and two affiliates of Concordia Consulting. (This does not constitute an endorsement from any of these advisors or their organizations.)
Professionals with relevant experience in the Future of Humanity Institute, the UK Office for AI, the Center for Security and Emerging Technology, the Centre for the Study of Existential Risk, the Center for Human-Compatible Artificial Intelligence, and the Stanford Institute for Human-Centered Artificial Intelligence have already expressed interest in facilitating course discussions, as have some students with relevant context.
Logistically, the course will be run as another track alongside the technical AI alignment track of the AGI Safety Fundamentals Programme (after which this program and post are modeled, if the name didn’t give it away), in collaboration with Dewi Erwan and Jamie Bernardi, with additional support from Vince Huang and Will Aldred.
If you're interested in joining the next version of the course (taking place January - March 2022), apply here to be a participant or here to be a (compensated) facilitator. Applications are open to anyone and close December 15th. Note that:
- This is the same application form as the one used for the technical AI alignment track of the AGI Safety Fundamentals programme; you can choose to apply for the governance track or the technical track. (The AGI Safety Fundamentals Programme is what we’re currently calling the umbrella program that hosts both tracks.)
- If you've already applied to the AGI Safety Fundamentals programme and selected the governance track, then you've already applied for this program—no need to do anything else.
- We will offer honoraria of £800 (~$1,070) to facilitators for their time, through Cambridge EA CIC.
- The curriculum is intended to be accessible to people with a wide range of backgrounds—backgrounds in computer science, AI, or social sciences are not necessary.
I encourage you to apply if:
- You are potentially interested in eventually doing work aimed at improving the trajectory of AI through AI governance research/policy (or full-time work that could indirectly help a lot, like technical AI safety research or recruiting people to work on pressing problems).
- And you’d like to learn more about whether to pursue a career in this field, and/or you’d like to get background context that would be a useful step toward contributing.
- And the syllabus seems interesting to you, including its focus on long-term, global risks.
This post contains an overview of the course and an abbreviated version of the curriculum; the full version (which also contains optional readings and notes/context on each core reading, and future discussion prompts and project ideas) can be found here. Comments and feedback are very welcome, either on this post or in the full curriculum document; suggestions of new exercises, prompts, or readings would be particularly helpful. I'll continue to make updates until shortly before the program starts.
See the final section of this post for an FAQ.
Participants are divided into groups of 4-6 people, matched based on their prior knowledge of long-term AI risks and governance. The course consists of 8 weeks of readings, plus a final project. Broadly speaking, the first half of the course explores potential risks, while the second half focuses on strategic considerations and possibilities for good governance. Each week (apart from week 0), each group and their discussion facilitator will meet for 1.5 hours to discuss the readings and exercises. After Week 7, participants will have several weeks to work on projects of their choice, to present at the final session.
Each week's curriculum contains:
- Key ideas for that week
- Core readings
- Optional readings
- Two exercises (participants should pick one to do each week)
- Further notes on the readings
- Discussion prompts for the weekly session
Week 0 replaces the small group discussions with a lecture plus live group exercises, since it's aimed at getting people with little machine learning knowledge up to speed quickly.
Some high-level approaches that informed the syllabus design:
- Long-term: A focus on long-term risks that future AI systems may pose
- Problem-first: An emphasis on understanding relevant problems and risk scenarios, for better generating and prioritizing among paths to impact
- Pluralistic: (Attempted) Inclusion of a range of the (very different) views that are prominent in the long-term AI governance community
- Foundational: An emphasis on relevant concepts, context, and big-picture ideas, to help prepare participants to themselves discover new strategic considerations and promising policy options
Topics for each week:
This syllabus is structured into three parts:
- Before the main readings, there is a recommended week of background context:
- Week 0 (Recommended Background): AI, Machine Learning, and the Importance of Their Long-Term Impacts
- Part 1 dives into the risks, with the motivating idea that thoroughly understanding problems is very helpful for both identifying potential solutions and prioritizing among them.
- Week 1: Introduction to Governing World-Transforming AI
- Week 2: Deep Dive - The Alignment Problem
- Week 3: Potential Sources of AI Existential Risk
- Part 2 dives into the questions of how governance decisions can help address AI risks.
- Week 4: Avoiding a Race to the Bottom
- Week 5: Corporate Actors & Levers
- Week 6: (Inter)Governmental Actors & Levers, With Historical Case Studies
- Week 7: Career Advice & Opportunities for People Who Are Interested in Helping
Some limitations of this program and suggested mindsets:
- In general, long-term AI governance remains at least partly pre-paradigmatic, and explanation of existing ideas has been limited. Somewhat more concretely:
- Many ideas in long-term AI strategy and governance have never been properly written up and published.
- Viewpoints on important questions (e.g. timelines) often change pretty rapidly.
- Many arguments and ideas have yet to be proposed and scrutinized in much depth.
- There is much disagreement and uncertainty about:
What abstractions are useful
What topics are useful to think about
How large different risks are
Other critical object-level questions
- In other words, borrowing from Olah and Carter’s essay on “Research Debt,” the long-term AI governance field currently has substantial limitations from limited exposition of important ideas, undigested ideas, and (debatably) bad abstractions.
- Some long-term AI governance research focuses on preparing for advances in AI, and some strategies involve drawing on advances in AI safety technical research. However, we don’t know precisely what any of these advances will be, or in precisely what (e.g. political) context they will take place. So some strategic clarity and precision might only come with time. More generally though, big-picture strategic research has much worse feedback loops than some other fields.
- While the above state of affairs need not be a reason to despair (the field is still very young and has arguably made significant progress), it means that it can be useful to keep these things in mind (much more so than in an average course):
- Unfortunately, public materials often:
- Give impressions of views that are outdated by a few years.
- Include little of the thinking of researchers who write and publish less often.
- Include little material on some important topics.
- Facilitators/mentors (and organizers) are still in the process of learning, especially if they're new to the area.
- Overall, this content is not a settled map of the AI governance landscape.
- There is very little “established wisdom” or informed consensus.
- Skeptical and questioning mindsets are especially valuable.
Abbreviated curriculum (only key ideas and core readings)
This is (an abbreviated version of) the current version of the curriculum, as of writing. It will be further edited leading up to the start of the program.
Week 0 (Recommended Background): AI, Machine Learning, and the Importance of Their Long-Term Impacts
This week aims to provide background context on AI and machine learning, as well as on why this course focuses on their long-term impacts.
We’ll look at: what is modern AI like? How are machine learning models structured and trained? Learning about these things will hopefully help you form your own informed views on relevant questions, be able to reason about AI’s impacts with more concrete examples and a higher awareness of current trends, and be better able to sound like you know what you are talking about.
This course also makes the somewhat unconventional choice of focusing on the long-term impacts of AI. That choice requires some context.
- Why governing AI is our opportunity to shape the long-term future (Leung, 2020)
- A short introduction to machine learning (Ngo, 2021)
- But what is a neural network? (3Blue1Brown, 2017)
- More perspectives on focusing on especially long-term and high-stakes impacts:
Week 1: Introduction to Governing World-Transforming AI
This week, we’ll see an overview of AI governance research. Additionally, we’ll examine several ideas that, at least for some researchers, are core motivations for focusing on AI governance: a mix of historical, economic, and other empirical arguments which suggest that AI will drastically transform the world, likely starting this century. (That is a strong, controversial claim—worth scrutinizing.)
- AI Strategy, Policy, and Governance (Dafoe, 2019)
- On just how world-transforming AI could be:
- A few shorter pieces aiming to better understand and forecast advanced AI:
Week 2: Deep Dive - The Alignment Problem
Last week, we saw arguments for the view that AI will drastically transform the world, very possibly this century. We should arguably expect such transformative impacts to come with many long-term risks and opportunities. This week, we will examine one potential source of risk that some researchers see as especially pressing: the AI alignment problem.
We’ll ask, “Why, precisely, might it be hard to align AI systems’ objectives with humans’ values, and how can people help solve this problem?” The purpose of this is to help us build a solid foundation for thinking clearly about (a) what risks, if any, the alignment problem poses, and (b) what these risks mean for AI governance. We take this focus because (as we will see next week) the alignment problem is at the center of many (though far from all) worries about how AI may cause long-lasting harms, and existing publications on these risks are relatively detailed.
Arguably, being able to reason independently about the alignment problem is especially important because just how much risk this problem poses is one of the (many) open questions in AI strategy and governance. To paraphrase, even among leading researchers focused on long-term AI risks, views range from “the alignment problem is the main source of AI existential risk, and it is very likely to cause existential catastrophe,” to “the alignment problem is a minor source of AI existential risk, and it is very unlikely to cause existential catastrophe.” Many researchers also have intermediate shades of worry.
- Why AI alignment could be hard with modern deep learning (Cotra, 2021)
- Specification gaming: the flip side of AI ingenuity (Krakovna et al., 2020)
- “Inner Alignment: Explain like I'm 12 Edition” (Harth, 2020)
- AI alignment landscape (Christiano, 2020)
Week 3: Potential Sources of AI Existential Risk
Researchers have proposed several paths by which advances in AI may cause especially long-lasting harms—harms that are especially worrying from the perspective that the long-term impacts of our actions are very important. This week, we’ll consider: What are some of the best arguments for and against various claims about how AI poses existential risk?
A 2020 survey of AI safety and governance researchers (Clarke et al., 2021) found high levels of uncertainty and disagreement about what the most plausible AI existential risk scenarios are. This implies that many researchers must be wrong about important questions, which arguably makes skeptical and questioning mindsets (with skepticism toward both your own view and others’) especially valuable.
- What Failure Looks Like (Christiano, 2019)
- Is power-seeking AI an existential risk? (Carlsmith, 2021) (Excerpt)
- AI Governance: Opportunity and Theory of Impact (Dafoe, 2020)
- Sharing the World with Digital Minds (Shulman and Bostrom, 2020) (Excerpt)
Week 4: Avoiding a Race to the Bottom
This week, we shift gears from assessing potential AI risks to considering potential ways to mitigate these risks, while seeking to better understand strategic considerations that can help us identify new opportunities for steering toward better outcomes.
While many questions in AI governance remain unclear, researchers often agree that an AI-related “race to the bottom”—a situation in which intense, AI-related competition pushes AI developers to neglect important values (such as safety) or be out-competed—would be bad. Some researchers see the main risk here as under-investment in AI alignment research, while other researchers have more general concerns about such competition. Additionally, some researchers have argued that inadequately constrained competition poses other threats. This week, we’ll examine several (though far from all) strategic considerations that may help us find ways to avoid such risky dynamics. Many of the issues covered this week have potential implications for both government and corporate policies—areas we will examine individually in the two weeks that follow.
- Cooperation, conflict and transformative AI: Sections 1 & 2 (Clifton, 2019)
- Cooperative AI: machines must learn to find common ground (Dafoe et al., 2021)
- AI, the space race, and prestige (Barnhart, 2021) (Excerpts)
- On AI hardware supply chains:
Week 5: Corporate Actors & Levers
Corporations are the leading developers of AI, and there is some historical precedent of early corporate standards influencing later government regulation (Leung, 2018), so corporate AI developers’ governance choices may be of great significance for the future of AI. How can, and how should, corporate AI developers shape the trajectory of AI?
- Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims (Brundage et al., 2020) (Excerpt)
- The Offense-Defense Balance of Scientific Knowledge: Does Publishing AI Research Reduce Misuse? (Shevlane and Dafoe, 2020)
- Various short readings:
- OpenAI Charter (2018) and OpenAI LP announcement (2019)
- DeepMind reportedly lost a yearslong bid to win more independence from Google (Vincent, 2021)
- How do a corporation's shareholders influence its Board of Directors? (Ross, 2021)
- Information security careers for GCR reduction (Zabel and Muehlhauser, 2019) (Excerpt)
Week 6: (Inter)Governmental Actors & Levers, With Historical Case Studies
Governments have unique authorities to engage in acts like international coordination and commercial regulation, and there is historical precedent (Leung, 2019) of governments taking major roles in the governance of strategic general-purpose technologies. For these reasons, governments may be highly influential actors in shaping the future of AI. How can, and how should, governments and intergovernmental organizations shape the trajectory of AI?
This week emphasizes historical case studies, since these seem useful for informing intuitions about at least some of the future impacts of (and political/organizational reactions to) AI.
- China’s Current Capabilities, Policies, and Industrial Ecosystem in AI (Ding, 2019)
- Compilation: 7+ historical case studies of technology governance and international agreements (various authors)
- “Policy Making for the Long Term in Advanced Democracies” (Jacobs, 2016) (Excerpt)
- Various short readings:
- Global AI Talent Tracker (MacroPolo, c. 2019)
- Twitter thread summary of the Evans et al. paper “Truthful AI” (Evans, 2021)
- Twitter thread summary of Clark and Whittlestone paper “Why and How Governments Should Monitor AI Development” (Clark, 2021)
- What is the EU AI Act and why should you care about it? (2021)
Week 7: Career Advice & Opportunities for People Who Are Interested in Helping
There is a growing ecosystem of professionals and organizations aiming to positively shape the long-term impacts of AI. This week, we’ll dive into the question: What is happening in this ecosystem, and—if you want to join the efforts to positively shape AI’s trajectory—how can you do so?
Like AI governance research, AI governance career opportunities are rapidly changing, so current opportunities and advice are likely far from all that will ever be available.
- A personal take on longtermist AI governance (Muehlhauser, 2021)
- AGI Safety Fundamentals - Further Resources
- Description of some organizations in or adjacent to long-term AI governance (non-exhaustive) (2021)
- Read about whichever of these AI governance career paths you’re interested in:
- AI governance research:
- Unfortunately, I’m not aware of detailed, published career guidance for entering AI governance research. Here is some information about some specific opportunities: [...]
- US policy work aimed at improving AI governance:
- “How to pursue a career in US AI public policy,” section from 80,000 Hours’ article on US AI policy
- Pick and read at least one of the US policy readings from the “Additional Recommendations” section.
- Read these bullet points about some specific opportunities: [...]
- China-based work aimed at improving AI governance
- China-related AI safety and governance paths, 80,000 Hours career review (2021)
- The “China and AI” section of the “Further Readings” section lists relevant reading lists. Pick and read at least one reading from one of these, for some additional context on AI governance in China.
- Europe/UK-based work aimed at improving AI governance:
- AI governance research:
- Some AI governance research ideas (Anderljung and Carlier, 2021)
Week 8: AI Governance Projects
The final part of this course will be projects where you get to dig into something related to the course. The project is a chance for you to explore your interests, so try to find something you’re excited about! The goal of this project is to help you practice taking an intellectually productive stance toward AI governance - to go beyond just reading and discussing existing ideas, and take a tangible step towards contributing to the field yourself. This is particularly valuable because it’s such a new field, with lots of room to explore.
This project might be especially valuable for learning more about how much you like AI governance research, identifying research areas you’re excited about, and in some cases producing directly useful work or a useful writing sample. (This is partly because people in the field often take such projects more seriously than, say, academics often do.) Still, a useful mindset for research in the field is often, "I'm trying to figure out what I should believe and explain it clearly"; that will often produce better (and, ironically, more impressive) work than directly aiming to look impressive.
We’ve allocated four weeks between the last week of the curriculum content and the sessions where people present their projects. As a rough guide, spending 5-10 hours on the project during that time seems reasonable, but we’re happy for participants to be flexible about spending more or less time on it. You may find it useful to write up a rough project proposal in the first week of working on your project and to send it to your cohort for feedback.
The format of the project is very flexible. The default project will probably be a piece of writing, roughly the length and scope of a typical blog post; we’d encourage participants to put these online after finishing them (although this is entirely optional). Projects in the form of presentations are also possible, but we slightly discourage them; we’d prefer if you spend more time creating some piece of writing, then just casually talk through it with your cohort, rather than spending time on trying to make a polished presentation. We expect most projects to be individual ones; but feel free to do a collaborative project if you’d like to.
To find additional materials about topics you’re interested in (or to find interesting topics), you may find this compilation of AI governance readings by topic useful.
Click here for the full version of the curriculum, which contains additional readings, notes/context on each core reading, and project ideas, and will include exercises and discussion prompts.
This is the current version of the curriculum, as of writing. It will be further edited leading up to the start of the program.
Why is this valuable if there's a mentorship bottleneck in AI governance?
A view that seems common in the long-term AI governance field is that the research field’s growth is bottlenecked by capacity to train/mentor junior researchers. We might worry that, because of this, introducing more people to the field is of little value (or maybe even harmful, if it means more people will be frustrated by the difficulty of breaking into the field). I think there are several reasons why this program is valuable anyway:
- There is a good chance the mentorship bottleneck will soon be eased.
- From talking with a few people at relevant organizations, there seems to be significant interest in easing the mentorship bottleneck, as well as some promising, scalable ideas for doing so.
- Mentorship is not a bottleneck in all important areas of the field.
- My sense is that mentorship (more specifically, mentorship from members of this community) is not as severe a bottleneck for ladder-climbing in government, which also seems important.
- Even under mentorship bottlenecks, efficient introductions to the field are useful.
- It’s convenient for people to be efficiently introduced to the field, both for saving their time and for lowering mentorship costs.
As an “outside view” consideration, relevant professionals have generally been encouraging.
What should I do if I already did the AGI Safety Fundamentals programme and am interested in this program?
If you’re interested after checking out this program’s syllabus, we’d encourage you to apply! The governance track is sufficiently different in focus that you're still likely to learn a lot of new content, and the foundational ideas within AGI safety are complex and confusing enough to warrant more discussion and evaluation.
What if I want to do both the technical alignment and the governance tracks?
Since this would mean a roughly doubled time commitment, we mainly just recommend this to people who have a lot of free time or are at a particularly pivotal point in their career (such that learning more about these fields soon would inform time-sensitive decisions). If you’d like to apply to do both tracks simultaneously, there’s an option to indicate that preference in the “choose a track” question of the application form.
I didn’t receive a confirmation email—did you receive my application?
We likely did—our version of the software used to process applications doesn’t send confirmation emails.
What if I already applied to the upcoming AGI Safety Fundamentals program and indicated an interest in the governance track?
Then you’re all set! No need to submit an additional application here.
What if I already did an AI governance reading group?
Take a look at the syllabus and decide from there! If you previously participated in one of the smaller AI governance reading groups I helped run, I think you’ll find lots of new content in this syllabus.
What if I have feedback on the syllabus, or ideas for exercises/discussion prompts/etc?
I’d love to hear your ideas! Please let me know by commenting below or on the linked syllabus, messaging me through the forum, or emailing stanfordeateam [at] gmail [dot] com (although that is checked less frequently).
I’ll be editing the syllabus and adding exercises and discussion prompts until the program starts in January.
For context on my own background, among other hats, I currently run the Stanford Existential Risks Initiative’s AI governance and US policy programming. I’ve only been learning about this field for about a year, so I’m especially thankful for the 20+ years of collective expertise our advisors brought to this curriculum.
If you are listed here and would prefer a different description/additional caveats/etc., please let me know!
Some specific content gaps in current public materials, as flagged by one reviewer:
- A good taxonomy of plausible sources of existential risk from AI
- More thorough analysis of some potential AI-related existential risks to which misalignment is non-central
- A good overview of different views on “theories of victory” and cruxes between people who disagree on them
- (I don’t have a very precise sense of what this reviewer has in mind with this term.)
- (There are probably many more things.)
As an example of researchers having different views on which abstractions to use, researchers have proposed and used a slew of different terms and concepts for thinking about future AI capabilities: AGI, human-level AI, advanced AI, strong AI, superintelligence (collective, speed, quality), transformative AI, oracle AI, genie AI, sovereign AI, tool AI, agentic AI, optimizer, mesa-optimizer, comprehensive AI services, and probably others. As another example, definitions/framings of “outer alignment,” “inner alignment,” and related terms are contested (you can find relevant in-the-weeds discussion e.g. here, here, here, and here) (thanks to Sam Clarke for flagging these to me).