The Intro to ML Safety course covers foundational techniques and concepts in ML safety, with a focus on empirical research. We think it's a good fit for people with ML backgrounds who want to pursue empirical research careers in AI safety.
Apply to be a participant by January 29th, 2023
About the Course
Intro to ML Safety is an 8-week virtual course that aims to introduce students with a deep learning background to the latest empirical AI safety research. The program introduces foundational ML safety concepts such as robustness, alignment, monitoring, and systemic safety.
The course takes at least 5 hours a week and consists of a mixture of:
- Assigned readings and lecture videos (publicly available at course.mlsafety.org)
- Homework and coding assignments
- A facilitated discussion session with a TA and weekly optional office hours
The course will be virtual by default, though in-person sections may be offered at some universities.
The Intro to ML Safety curriculum
The course covers:
- Hazard Analysis: an introduction to concepts from the field of hazard analysis and how they can be applied to ML systems; and an overview of standard models for modelling risks and accidents.
- Robustness: Robustness focuses on ensuring models behave acceptably when exposed to abnormal, unforeseen, unusual, highly impactful, or adversarial events. We cover techniques for generating adversarial examples and making models robust to them; benchmarks for measuring robustness to distribution shift; and approaches to improving robustness via data augmentation, architectural choices, and pretraining techniques.
- Monitoring: We cover techniques to identify malicious use, hidden model functionality and data poisoning, and emergent behaviour in models; metrics for OOD detection; confidence calibration for deep neural networks; and transparency tools for neural nets.
- Alignment: We define alignment as reducing inherent model hazards. We cover measuring honesty in models; power aversion; an introduction to ethics; and imposing ethical constraints in ML systems.
- Systemic Safety: In addition to directly reducing hazards from AI systems, AI can be used to make the world better equipped to handle the development of AI by improving sociotechnical factors like decision-making ability and safety culture. We cover using ML for improved epistemics; ML for cyberdefense; and ways in which AI systems could be made to better cooperate.
- Additional X-Risk Discussion: The last section of the course explores the broader importance of the concepts covered: namely, existential risk and possible existential hazards. We cover specific ways in which AI could potentially cause an existential catastrophe, such as weaponization, proxy gaming, treacherous turns, deceptive alignment, value lock-in, and persuasive AI. We also introduce some considerations for influencing future AI systems and research on selection pressures.
How is this program different from AGISF?
If you are interested in an empirical research career in AI safety, then you are in the target audience for this course. The ML Safety course does not overlap much with AGISF, so we expect participants to get a lot out of Intro to ML Safety whether or not they have previously done AGISF.
Intro to ML Safety is focused on ML empirical research rather than conceptual work. Participants are required to watch recorded lectures and complete homework assignments that test their understanding of the technical material.
You can read more about the ML safety approach in Open Problems in AI X-risk.
The program will last 8 weeks, beginning on February 20th and ending on April 21st. Participants are expected to commit at least 5 hours per week. This includes ~1 hour of recorded lectures, ~1-2 hours of readings, ~1-2 hours of written assignments, and 1 hour of discussion.
We understand that 5 hours per week is a large time commitment, so to make our program more inclusive and remove any financial barriers, we will provide a $500 stipend upon completion of the course.
The prerequisites for the course are:
- Familiarity with deep learning (e.g. a college course)
- Watch this deep learning review to check your level of knowledge.
- Linear algebra or introductory statistics (e.g. AP Statistics)
- Multivariate differential calculus