
In a nutshell: To mitigate the risks posed by the development of artificial intelligence, it’s imperative to research how to solve technical challenges and design problems to ensure that powerful AI systems do what we want — and are beneficial — without any catastrophic unintended consequences.

If you are well suited to this career, it may be the best way for you to have a social impact.

Review status

Based on a medium-depth investigation 

Why might working on AI safety research be high impact?

As we’ve argued, in the next few decades, we might see the development of powerful machine learning algorithms with the potential to transform society. This could have major upsides and downsides, including the possibility of catastrophic risks.

Besides strategy and policy work discussed in this career review, another key way to limit these risks is research into the technical challenges raised by powerful AI systems, such as the alignment problem. In short, how do we design powerful AI systems so they’ll do what we want, and not have unintended consequences?

This field of research has started to take off, and there are now major academic centres and AI labs where you can work on these issues, such as Mila in Montreal, the Future of Humanity Institute at Oxford, the Center for Human-Compatible Artificial Intelligence at Berkeley, DeepMind in London, and OpenAI in San Francisco. The Machine Intelligence Research Institute in Berkeley has been working in this area since 2005 and has an unconventional perspective and research agenda relative to the other labs. We’ve advised over 100 people on this path, several of whom already work at the institutions above.

There is plenty of funding available for talented researchers — including academic grants and philanthropic donations from major grantmakers like Open Philanthropy. It’s also possible to get funding for your PhD programme. The main need of the field is more people capable of using this funding to carry out the research.

What does this path involve?

In this path, the aim is to get a position at one of the top AI safety research centres — either in industry, nonprofits, or academia — and then try to work on the most pressing questions, with the eventual aim of becoming a research lead overseeing safety research.

Broadly, AI safety technical positions can be divided into (i) research and (ii) engineering. Researchers direct the research programme. Engineers create the systems and do the analysis needed to carry out the research.

Although engineers have less influence over the high-level research goals, it is still important that engineers are concerned about safety, as they’ll better understand the ultimate goals of the research (and so prioritise better), be more motivated, shift the culture towards safety, and use the career capital they gain to benefit other safety projects in the future. This means that engineering can be a good alternative for those who don’t want to be a research scientist.

It can also be useful to have people who understand the challenges of AI safety working in AI research teams that aren’t directly focused on AI safety. Working on these teams can put you in a position to help promote concern for safety in general, especially if you end up in a management position with influence over the organisation’s priorities.

We’d also be excited to see more people build expertise to do AI safety work in or related to China — read more in our career review on China-related AI safety and governance paths, some of which take the form of technical research.

Examples of people pursuing this path


Catherine Olsson

Catherine started her PhD at NYU, working on computational models of human vision. Eventually, she decided to work directly on AI safety and got a job at OpenAI, then at Google Brain, before moving to Anthropic.


Daniel Ziegler

After dropping out of his machine learning PhD at Stanford, Daniel — who’d always enjoyed building things and wanted to help shape the development of AI — decided to apply to OpenAI. He spent six weeks preparing for the interview, and landed the job. His PhD, by contrast, might have taken six years. Daniel thinks this highly accelerated path may be possible for many others.


Chris Olah

Chris has had a fascinating and unconventional path, with neither a PhD nor even an undergraduate degree. After dropping out of university to help defend an acquaintance who was facing unfair criminal charges, Chris started independently working on machine learning research, and eventually got an internship at Google Brain.

How to assess your fit

The most impactful AI technical safety research will probably be done by people in the top jobs listed earlier. So to decide if this path is a good fit for you, it’s important to consider whether you have a reasonable chance of getting those jobs.

  • Do you have a chance of getting into a top-five graduate school in machine learning? This can be a good test for whether you could get a job at a top AI research centre, though it’s not a requirement.
  • Are you convinced of the importance of long-term AI safety?
  • Are you a software or machine learning engineer who’s worked at a FAANG or similarly competitive company? You may be able to train for a research position, or otherwise take an engineering position.
  • Do you have a chance at making a contribution to a relevant research question? For instance, are you highly interested in the topic, have ideas for questions to look into, and can’t resist pursuing them? Read more about how to tell if you’re a good fit for working in research.

How to enter this field

The first step on this path is usually to pursue a PhD in machine learning at a good school. It’s possible to enter this field without a PhD, but one is likely to be required for research roles at academic centres and DeepMind, which make up a large fraction of the best positions. A PhD in machine learning also opens up options in AI policy, applied AI, and earning to give, so this path has good backup options if you later decide AI technical safety isn’t for you.

However, if you want to pursue engineering over research, then the PhD is not necessary. Instead, you can do a master’s programme or train up in industry.

It’s also possible to enter this path from neuroscience (especially computational neuroscience), so if you already have a background in that area, you may not have to return to study.

If you have a lot of familiarity already with AI safety as a problem area, our top recommendation is to look at this step-by-step guide to pursuing a career in technical AI safety, by Charlie Rogers-Smith.

Recently, opportunities have also opened up for social scientists to contribute to AI safety.

You can find much more detail in the resources listed below.

  • AI Safety Support works to reduce existential and catastrophic risk from AI by supporting everyone who wants to work on this problem, with a focus on helping new and aspiring AI safety researchers through career advice and community building.
  • Alignment Research Center is a nonprofit research organisation working to align future machine learning systems with human interests. Its current work focuses on developing an ‘end-to-end’ alignment strategy that could be adopted in industry today while scaling gracefully to future machine learning systems. See current vacancies.
  • Anthropic is an AI safety and research company that’s working to build reliable, interpretable, and steerable AI systems. Their multidisciplinary team’s research interests include natural language, human feedback, scaling laws, reinforcement learning, code generation, and interpretability. See current vacancies.
  • The Center for Human-Compatible Artificial Intelligence aims to develop the conceptual and technical wherewithal to reorient the general thrust of AI research towards provably beneficial systems. See current vacancies.
  • The Center on Long-term Risk addresses worst-case risks from the development and deployment of advanced AI systems. It is currently focused on conflict scenarios as well as technical and philosophical aspects of cooperation. Their work includes conducting interdisciplinary research, making and recommending grants, and building a community of professionals and other researchers around these priorities. See current vacancies.
  • DeepMind is probably the largest research group developing general machine intelligence in the Western world. We’re only confident about recommending DeepMind roles working specifically on safety, ethics, policy, and security issues. See current vacancies.
  • The Future of Humanity Institute is a multidisciplinary research institute at the University of Oxford. Academics at FHI bring the tools of mathematics, philosophy, and social sciences to bear on big-picture questions about humanity and its prospects.
  • The Machine Intelligence Research Institute was, in the early 2000s, one of the first groups to become concerned about the risks from machine intelligence, and it has published a number of papers on safety issues and how to resolve them. See current vacancies.
  • OpenAI was founded in 2015 with the goal of conducting research into how to make AI safe. It has received over $1 billion in funding commitments from the technology community. We’re only confident in recommending opportunities in their policy, safety, and security teams. See current vacancies.
  • Redwood Research conducts applied research to help align future AI systems with human interests. See current vacancies.

Want one-on-one advice on pursuing this path?

Because this is one of our priority paths, if you think this path might be a great option for you, we’d be especially excited to advise you on next steps, one-on-one. We can help you consider your options, make connections with others working in the same field, and possibly even help you find jobs or funding opportunities.


This work is licensed under a Creative Commons Attribution 4.0 International License.


“The first step on this path is usually to pursue a PhD in machine learning at a good school.”

Just want to add that I've heard from a few folks that this may be less true today — most recently in this 80k episode with Jan Leike:

Jan Leike: There’s a lot of different backgrounds that are applicable here. Machine learning PhDs have been the traditional way people get into the field, especially if you want to do something more researchy. I don’t think you need that at all. And in fact, if you’re thinking about starting a PhD now, I don’t know if you’ll have that much time. You should just go work on the problem now.

I wonder if it might be worth adding a bit more context about other options, and the opportunity costs of starting a PhD at this point given plausible AI timelines.
