Levelling Up in AI Safety Research Engineering

GabeM

Summary: A level-based guide for independently up-skilling in AI Safety Research Engineering that aims to give concrete objectives, goals, and resources to help anyone go from zero to hero.

Cross-posted to LessWrong. View a pretty Google Docs version here.

Introduction

I think great career guides are really useful for guiding and structuring the learning journey of people new to a technical field like AI Safety. I also like role-playing games. Here’s my attempt to use levelling frameworks and break up one possible path from zero to hero in Research Engineering for AI Safety (e.g. jobs with the “Research Engineer” title) through objectives, concrete goals, and resources. I hope this kind of framework makes it easier to see where one is on this journey, how far they have to go, and some options to get there.

I’m mostly making this to sort out my own thoughts about my career development and how I’ll support other students through Stanford AI Alignment, but hopefully, this is also useful to others! Note that I assume some interest in AI Safety Research Engineering—this guide is about how to up-skill in Research Engineering, not why (though working through it should be a great way to test your fit). Also note that there isn’t much abstract advice in this guide (see the end for links to guides with advice), and the goal is more to lay out concrete steps you can take to improve.

For each level, I describe the general capabilities of someone at the end of that level, some object-level goals to measure that capability, and some resources to choose from that would help get there. The categories of resources within a level are listed in the order you should progress, and resources within a category are roughly ordered by quality. There’s some redundancy, so I would recommend picking and choosing between the resources rather than doing all of them. Also, if you are a student and your university has a good class on one of the below topics, consider taking that instead of one of the online courses I listed.

As a very rough estimate, I think each level should take at least 100-200 hours of focused work, for a total of 700-1400 hours. At 10 hours/week (quarter-time), that comes to around 16-32 months of study but can definitely be shorter (e.g. if you already have some experience) or longer (if you dive more deeply into some topics)! I think each level is about evenly split between time spent reading/watching and time spent building/testing, with more reading earlier on and more building later.
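The arithmetic behind that estimate can be checked with a short sketch. The function name `months_of_study` and the 52/12 weeks-per-month conversion are illustrative assumptions, not part of the original guide.

```python
def months_of_study(total_hours, hours_per_week, weeks_per_month=52 / 12):
    """Convert a total study budget into calendar months at a given weekly pace."""
    weeks = total_hours / hours_per_week
    return weeks / weeks_per_month

levels = 7
# 100-200 hours per level across seven levels gives 700-1400 hours in total.
low_total, high_total = levels * 100, levels * 200
low_months = months_of_study(low_total, hours_per_week=10)
high_months = months_of_study(high_total, hours_per_week=10)

print(f"{low_total}-{high_total} hours at 10 h/week is about "
      f"{low_months:.0f}-{high_months:.0f} months")
```

Changing `hours_per_week` shows how sensitive the timeline is to pace: doubling the weekly hours halves the calendar time.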

Confidence: mid-to-high. I am not yet an AI Safety Research Engineer (but I plan to be)—this is mostly a distillation of what I’ve read from other career guides (linked at the end) and talked about with people working on AI Safety. I definitely haven’t done all these things, just seen them recommended. I don’t expect this to be the “perfect” way to prepare for a career in AI Safety Research Engineering, but I do think it’s a very solid way.

Level 1: AI Safety Fundamentals

Objective

You are familiar with the basic arguments for existential risks due to advanced AI, models for forecasting AI advancements, and some of the past and current research directions within AI alignment/safety. You have an opinion on how much you buy these arguments and whether you want to keep exploring AI Safety Research Engineering.

Why this first? Exposing yourself to these fundamental arguments and ideas is useful for testing your fit for AI Safety generally, but that isn’t to say you should “finish” this Level first and move on. Rather, you should be coming back to these readings and keeping up to date with the latest work in AI Safety throughout your learning journey. It’s okay if you don’t understand everything on your first try—Level 1 kind of happens all the time.

Goals

Complete an AI Safety introductory reading group fellowship.
Write a reflection distilling, recontextualizing, or expanding upon some AI Safety topic and share it with someone for feedback.
Figure out how convinced you are of the arguments for AI risk.
Decide if you want to continue learning about AI Safety Research Engineering, Theoretical AI Alignment, AI Policy and Strategy, or another field.

Resources

AI Safety Reading Curriculum (Choose 1)
Additional Resources

Level 2: Software Engineering

Objective

You can program in Python at the level of an introductory university course. You also know some other general software engineering tools/skills like the command line, Git/GitHub, documentation, and unit testing.

Why Python? Modern Machine Learning work, and thus AI Safety work, is almost entirely written in Python. Python is also an easier language for beginners to pick up, and there are plenty of resources for learning it.

Goals

Solve basic algorithmic programming problems with Python.
Know the basics of scientific computing with Python, including NumPy and Jupyter/Colab/IPython notebooks.
Create a new Git repository on GitHub, clone it, and add/commit/push changes to it for a personal project.
Know other software engineering skills like how to use the command line, write documentation, or make unit tests.
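As one possible illustration of the documentation and unit-testing goals, here is a minimal sketch using Python's built-in `unittest` module. The function `moving_average` and its tests are hypothetical examples chosen for brevity, not something the guide prescribes.

```python
import unittest

def moving_average(values, window):
    """Return the simple moving average of `values` over a sliding `window`."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    if window > len(values):
        return []
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

class MovingAverageTest(unittest.TestCase):
    def test_basic_window(self):
        self.assertEqual(moving_average([1, 2, 3, 4], 2), [1.5, 2.5, 3.5])

    def test_window_longer_than_input(self):
        self.assertEqual(moving_average([1, 2], 5), [])

    def test_rejects_non_positive_window(self):
        with self.assertRaises(ValueError):
            moving_average([1, 2, 3], 0)

# Build and run the suite explicitly so this also works inside a notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(MovingAverageTest)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

In a real project the tests would usually live in their own file and be run from the command line; running the suite inline here just keeps the sketch self-contained.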

Resources

Level 3: Machine Learning

Objective

You have the mathematical context necessary for understanding Machine Learning (ML). You know the differences between supervised and unsupervised learning and between classification and regression. You understand common models like linear regression, logistic regression, neural networks, decision trees, and clustering, and you can code some of them in a library like PyTorch or JAX. You grasp core ML concepts like loss functions, regularization, bias/variance, optimizers, metrics, and error analysis.

Why so much math? Machine learning at its core is basically applied statistics and multivariable calculus. It used to be that you needed to know this kind of math really well, but now with techniques like automatic differentiation, you can train neural networks without knowing much of what’s happening under the hood. These foundational resources are included for completeness, but you can probably spend a lot less time on math (e.g. the first few sections of each course) depending on what kind of engineering work you intend to do. You might want to come back and improve your math skills for understanding certain work in Levels 6-7, though, and if you find this math really interesting, you might be a good fit for theoretical AI alignment research.

Goals

Understand the mathematical basis of Machine Learning, especially linear algebra and multivariable calculus.
Write out the differences between supervised and unsupervised learning and between classification and regression.
Train and evaluate a simple neural network on a standard classification task like MNIST or a standard regression task like a Housing Dataset.
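Before reaching for a neural network library, the core ideas in these goals (a loss function, gradients, an optimizer loop, and held-out evaluation) can be sketched in plain Python. This is a toy under stated assumptions: the data is synthetic (a noisy line, not MNIST or a housing dataset), the model is one-variable linear regression rather than a neural network, and the learning rate and epoch count are arbitrary choices.

```python
import random

random.seed(0)

# Synthetic regression data: y = 3x + 2 plus a little noise (an assumed toy problem).
xs = [random.uniform(-1, 1) for _ in range(200)]
ys = [3.0 * x + 2.0 + random.gauss(0, 0.1) for x in xs]
train_x, test_x = xs[:160], xs[160:]
train_y, test_y = ys[:160], ys[160:]

def mse(w, b, data_x, data_y):
    """Mean squared error of the line w*x + b on a dataset."""
    return sum((w * x + b - y) ** 2 for x, y in zip(data_x, data_y)) / len(data_x)

w, b, lr = 0.0, 0.0, 0.1
n = len(train_x)
for epoch in range(500):
    # Gradients of the MSE loss with respect to w and b, averaged over the training set.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(train_x, train_y)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(train_x, train_y)) / n
    # Plain (full-batch) gradient descent update.
    w -= lr * grad_w
    b -= lr * grad_b

test_loss = mse(w, b, test_x, test_y)
print(f"w={w:.2f}, b={b:.2f}, test MSE={test_loss:.4f}")
```

The learned `w` and `b` should land close to the true slope and intercept, and the test loss should sit near the noise level. The same loop structure carries over to PyTorch or JAX, where automatic differentiation replaces the hand-written gradients.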

Resources

Basic Calculus (Choose 1)
1. Essence of calculus - 3Blue1Brown
Probability (Choose 1)
Linear Algebra (Choose 1)
1. Essence of linear algebra - 3Blue1Brown
2. Linear Algebra - MIT OpenCourseWare
3. Georgia Tech’s course (parts 1, 2, 3, 4)
4. Linear Algebra - Foundations to Frontiers - edX
5. Linear Algebra Done Right - Sheldon Axler
Multivariable Calculus (Choose 1)
Introductory Machine Learning (Choose 1-2)
Additional Resources

Level 4: Deep Learning

Objective

You’ve dived deeper into Deep Learning (DL) through the lens of at least one subfield such as Natural Language Processing (NLP), Computer Vision (CV), or Reinforcement Learning (RL). You now have a better understanding of ML fundamentals, and you’ve reimplemented some core ML algorithms “from scratch.” You’ve started to build a portfolio of DL projects you can show others.

Goals

Be able to describe in moderate detail a wide range of modern deep learning architectures, techniques, and applications such as long short-term memory networks (LSTM) or convolutional neural networks (CNN).
Gain a more advanced understanding of machine learning by implementing autograd, backpropagation, and stochastic gradient descent “from scratch.”
Complete 1-3 deep learning projects, taking 10–20 hours each, in 1 or more sub-fields like NLP, CV, or RL.
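To give a flavour of what implementing autograd and backpropagation "from scratch" can look like, here is a minimal scalar sketch in the spirit of micrograd. It is not the code from any of the listed resources: the `Value` class, its two supported operations, and the one-parameter fitting problem are all illustrative assumptions.

```python
class Value:
    """A scalar that records how it was computed so gradients can flow backward."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule from the output back.
        order, seen = [], set()
        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)
        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()

# Fit y = w * x to a single point (x=2, y=6) with plain gradient descent.
w = Value(0.0)
for step in range(50):
    error = w * 2.0 + (-6.0)
    loss = error * error
    w.grad = 0.0          # reset the accumulated gradient before each backward pass
    loss.backward()
    w.data -= 0.05 * w.grad

print(round(w.data, 3))
```

The loop drives `w` toward 3, the value that makes `w * 2` equal 6. A fuller implementation would add more operations (subtraction, powers, nonlinearities) and sample minibatches to make the descent stochastic.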

Resources

General Deep Learning (Choose 1)
Advanced Machine Learning
1. Studying (Choose 1-2)
  1. Backpropagation - CS231n Convolutional Neural Networks for Visual Recognition
  2. A Recipe for Training Neural Networks - Andrej Karpathy
2. Implementing (Choose 1)
  1. MiniTorch (reimplement the core of PyTorch, self-study tips here)
  2. building micrograd - Andrej Karpathy
  3. Autodidax: JAX core from scratch
Natural Language Processing (Choose 1 Or Another Sub-Field)
Computer Vision (Choose 1 Or Another Sub-Field)
1. Deep Learning for Computer Vision - UMich
2. CS231n: Deep Learning for Computer Vision - Stanford University (lecture videos here)
Reinforcement Learning (Choose 1 Or Another Sub-Field)
Additional Resources

Level 5: Understanding Transformers

Objective

You have a principled understanding of self-attention, cross-attention, and the general transformer architecture along with some of its variants. You are able to write a transformer like BERT or GPT-2 “from scratch” in PyTorch or JAX (a skill I believe Redwood Research looks for), and you can use resources like 🤗 Transformers to work with pre-trained transformer models. Through experimenting with deployed transformer models, you have a decent sense of what transformer-based language and vision models can and cannot do.

Why transformers? The transformer architecture is currently the foundation for State of the Art (SOTA) results on most deep learning benchmarks, and it doesn’t look like it’s going away soon. Much of the newest ML research involves transformers, so AI Safety organizations focused on prosaic AI alignment or conducting research on current models practically all focus on transformers for their research.

Goals

Play around with deployed transformer models and write up some things you notice about what they can and cannot do. See if you can get them to do unexpected or interesting behaviors.
Read and take notes about how transformers work.
Use 🤗 Transformers to import, load the pre-trained weights of, and fine-tune a transformer model on a standard NLP or CV task.
Implement basic transformer models like BERT or GPT-2 from scratch and test that they work by loading pre-trained weights and checking that they produce the same results as the reference model or generate interesting outputs.
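The core operation behind these goals, and the habit of checking outputs against a reference, can be sketched in plain Python. This is a simplified illustration under stated assumptions: it is single-head causal scaled dot-product attention with no learned projections, so it is far from a full BERT or GPT-2, and the names `causal_self_attention` and `allclose` as well as the toy inputs are made up for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_self_attention(queries, keys, values):
    """Single-head scaled dot-product attention; position i only sees positions <= i."""
    d_k = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        # Scores against the current and earlier positions only (the causal mask).
        scores = [sum(qj * kj for qj, kj in zip(q, keys[t])) / math.sqrt(d_k)
                  for t in range(i + 1)]
        weights = softmax(scores)
        # Each output is the attention-weighted average of the visible value vectors.
        out = [sum(wt * values[t][c] for t, wt in enumerate(weights))
               for c in range(len(values[0]))]
        outputs.append(out)
    return outputs

def allclose(a, b, tol=1e-6):
    """Compare two equally shaped nested lists of floats within a tolerance."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

# Toy inputs: 3 positions, 2 dimensions. Because every key is identical, all visible
# positions get equal weight, so each output is a running mean of the values so far.
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
v = [[1.0, 0.0], [3.0, 0.0], [5.0, 6.0]]

out = causal_self_attention(q, k, v)
reference = [[1.0, 0.0], [2.0, 0.0], [3.0, 2.0]]  # hand-computed running means
print(allclose(out, reference))
```

When testing a real reimplementation, the hand-computed reference would be replaced by the outputs of a trusted model loaded with the same pre-trained weights, and the comparison would typically use a looser tolerance to allow for floating-point differences.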

Resources

Experiment With Deployed Transformers (Choose 1-3)
Study The Transformer Architecture (Choose 2-3)
Using 🤗 Transformers (Choose 1-2)
1. Hugging Face Course
2. CS224U: Natural Language Understanding - Stanford University (Supervised Sentiment Analysis unit only)
Implement Transformers From Scratch (Choose 1-2)
1. MLAB-Transformers-From-Scratch - Redwood Research (refactored by Gabriel Mukobi)
2. deep_learning_curriculum/1-Transformers - Jacob Hilton
Compare Your Code With Other Implementations
1. BERT (Choose 1-3)
  1. pytorchic-bert/models.py - dhlee347 (PyTorch)
  2. BERT - Google Research (TensorFlow)
  3. How to Code BERT Using PyTorch - neptune.ai (PyTorch)
  4. nlp-tutorial/BERT.py - graykode (PyTorch)
  5. Transformer-Architectures-From-Scratch/BERT.py - ShivamRajSharma (PyTorch)
2. GPT-2 (Choose 1-3)
  1. Transformer-Architectures-From-Scratch/GPT_2.py - ShivamRajSharma (PyTorch)
  2. gpt-2/model.py - openai (TensorFlow)
  3. minGPT/model.py - Andrej Karpathy (PyTorch)
  4. The Annotated GPT-2 - Aman Arora (PyTorch)
Additional Resources
1. Study Transformers More
2. Other Transformer Models You Could Implement

Level 6: Reimplementing Papers

Objective

You can read a recently published AI research paper and efficiently implement the core technique they present to validate their results or build upon their research. You also have a good sense of the latest ML/DL/AI Safety research. You’re pretty damn employable now—if you haven’t started applying for Research Engineering jobs/internships, consider getting on that!

Why papers? I talked with research scientists or engineers from most of the empirical AI Safety organizations (i.e. Redwood Research, Anthropic, Conjecture, Ought, CAIS, Encultured AI, DeepMind), and they all said that being able to read a recent ML/AI research paper and efficiently implement it is both a signal of a strong engineering candidate and a good way to build useful skills for actual AI Safety work.

Goals

Learn how to efficiently read Computer Science research papers.
Learn tips on how to implement papers and learn efficiently by doing so.
Reimplement the key contribution and evaluate the key results of 5+ AI research papers in topics of your choosing.

Resources

How to Read Computer Science Papers (Choose 1-3)
How to Implement Papers (Choose 2-4)
Implement Papers (Choose 5+, look beyond these)
1. General Lists
  1. Machine Learning Reading List - Ought
  2. Some fun machine learning engineering projects that I would think are cool - Buck Shlegeris
2. Interpretability
3. Robustness/Anomaly Detection
  1. Agreement-on-the-Line: Predicting the Performance of Neural Networks under Distribution Shift - Baek et al.
4. Value/Preference Learning
  1. Deep Reinforcement Learning from Human Preferences - OpenAI
  2. Fine-Tuning Language Models from Human Preferences - OpenAI
5. Reinforcement Learning
  1. Key Papers in Deep RL - OpenAI

Level 7: Original Experiments

Objective

You can now efficiently grasp the results of AI research papers and come up with novel research questions to ask as well as empirical ways to answer them. You might already have a job at an AI Safety organization and have picked up these skills as you got more Research Engineering experience. If you can generate and test these original experiments particularly well, you might consider Research Scientist roles, too. You might also want to apply for AI residencies or Ph.D. programs to explore some research directions further in a more structured academic setting.

Goals

Write an explanation of what research directions fit your tastes.
Create 5+ concrete research questions you might want to explore. These can be from lists like those below, the future research sections at the ends of ML papers, or your own brainstorming.
Conduct AI Safety research and publish or share your results.

Resources

Research Advice
General Lists of Open Questions to Start Researching
Open Questions in Interpretability
1. Ten experiments in modularity, which we'd like you to run! - TheMcDouglas, Lucius Bushnaq, Avery
2. A Mechanistic Interpretability Analysis of Grokking#Future Directions - Neel Nanda, Tom Lieberum
Open Questions in Robustness/Anomaly Detection
1. Benchmark for successful concept extrapolation/avoiding goal misgeneralization - AlignedAI
2. Neural Trojan Attacks and How You Can Help - Sidney Hough
Open Questions in Adversarial Training
1. Final Project Guidelines - MLSS

Epilogue: Risks

Embarking on this quest brings with it a few risks. By keeping these in mind, you may be less likely to fail in these ways:

Capabilities

There is a real risk of novel AI Safety research advancing AI capabilities, reducing AI timelines, and giving us less time to solve alignment. If you go through this or other career guides, you may produce work that inadvertently leads to capabilities externalities.
That said, I expect most of the work from following this guide to be directly harmless, at least until you get to Level 6. Before then, it’s probably fine (and helpful motivation) to share what you are working on with others! After that, if you think you might have produced research that could be dangerous, err on the side of secrecy and ask for advice from a trusted and willing AI Safety researcher.
As a general guideline, consider building the habit of marking how hazardous your work is, evaluating other researchers’ work, and reading other AI Safety researchers’ evaluations of other AI work (e.g. in the comments of the AI Alignment Forum) in order to grow your sense of what various forms of dangerous work “smells” like.

Difficulty

Following this career guide alone will be extremely difficult, and there’s a high chance you could fail not because of your own lack of skill, but just because you don’t have the right support structures behind you.
To stay supported, find a mentor. This could be a willing AI Safety researcher or anyone else you trust who is skilled in machine learning with whom you could meet every so often to ask conceptual questions, get feedback, or discover new learning resources.
To stay supported, find coworkers. These could be friends who are also up-skilling in AI Safety with whom you could meet once or a few times per week to discuss things you are learning, collaborate on projects, or set accountable goals (e.g. “If I don’t read and take notes on this paper by next Monday, I owe you $20.”).
Also, keep in mind that this guide is kind of fake: breaking up the challenge of developing Research Engineering skills into a series of levels is especially reductive of the reality of learning. You probably shouldn’t do every suggested thing in a level once, move on to the next level, and never look back—rather, I think a better approach involves frequently revisiting previous levels, exploring other resources not listed here, and moving on from things if the marginal benefit you’d get from immediately spending extra time on those things doesn’t outweigh the opportunity costs of learning new things.
With all that said, this guide can be a way to test your fit. If you give programming or machine learning an honest effort and find you really don’t like any of it, then maybe Research Engineering just isn’t for you, and that’s okay! If you still think AI Safety is important, consider exploring AI Policy and Strategy.

Early Over-Specialization

Don’t be afraid to try new things! One failure mode I imagine here is diving deep into a small niche sub-field of AI Safety early in your journey and then either never making any useful contributions or getting bored and quitting. You actually might have been better at or derived more enjoyment from a different sub-field.
This could be important for getting jobs in AI Safety. Unless you get real lucky and happen to choose right, you might over-specialize in a narrow band of skills and lack knowledge in other domains that are important for the jobs you are interested in.
AI in general and even AI Safety in particular are very diverse fields with too many different things for any one person to specialize in, but you can still aim for a breadth of knowledge early on. When choosing projects in Levels 3-5, consider trying things you haven’t tried yet. It might feel scary, but breaking out of your comfort zone in this way is probably better for efficient learning and exploration.
Somewhat relatedly, AI moves super fast, and soon many of the resources here might become outdated or new promising research areas might emerge. I intend to keep this guide updated, but I encourage you to look beyond what’s here to explore new and interesting things in the future!
On a meta-level, it’s also possible to over-specialize in Research Engineering. Consider exploring other impactful career paths—many of the skills here could be really useful for Theoretical AI Alignment, AI Policy and Strategy, Information Security, and even Operations Management.

Late Over-Generalization

That said, you should eventually find some areas you particularly like and dig deeper into them (between Levels 6-7). I just think it’s worth exploring broadly before then so you can have a good taste of areas to choose from. Specialization is good; early over-specialization is bad.
This is also important if you intend to work in an AI Safety research lab: A common failure mode here is skimming across many different areas at a surface level but never diving into any enough to gain deeper insights and produce tangible results.
This failure mode seems to be pretty common with Ph.D. students, many of whom repeatedly hear from their advisors to make their projects even narrower in scope.

Sources

Here are some of the other great career guides and resources I used in the making of this. Most of the guides here also have good general advice that would be useful to read even if you don’t do the other things they suggest. Consider checking them out!

How to pursue a career in technical AI alignment - Charlie Rogers-Smith
deep_learning_curriculum - Jacob Hilton
ML Safety Scholars Program - CAIS
ML engineering for AI Safety & robustness: a Google Brain engineer's guide to entering the field - 80,000 Hours
ML for Alignment Bootcamp (MLAB 2) - Redwood Research (and the public GitHub repo)
Machine Learning Reading List - Ought
Careers in alignment - Adam Gleave
Talking with various AI Safety Research Engineers and Scientists

Many thanks to Jakub Nowak, Peter Chatain, Thomas Woodside, Erik Jenner, Jacy Reese Anthis, and Konstantin Pilz for review and suggestions!

Jay Bailey, Sep 2 2022

This is a fantastic resource, and I'm really glad to have it!

My own path has been a little more haphazard - I completed Level 2 (Software Engineering) years ago, and am currently working on AI safety (1), mathematics (3) and research engineering ability (4) simultaneously. Having just completed the last goal of 4 (Completing 1-3 RL projects) I was planning to jump right into 6 at this point, since transformers haven't yet appeared in my RL perusal, but I'm now rethinking those plans based on this document - perhaps I should learn about transformers first.

All in all, the first four levels (The ones I feel qualified to write about, having gone through some or all of them) seem extremely good.

The thing that most surprised me about the rest of the document was Level 6. Specifically, the part about being able to reimplement a paper's work in 10-20 hours. This seems pretty fast compared to other resources I've seen out there, though most of these resources are RL-focused. For instance, this post (220 hours). This post from DeepMind about job vacancies a few months ago also says:

"As a rough test for the Research Engineer role, if you can reproduce a typical ML paper in a few hundred hours and your interests align with ours, we’re probably interested in interviewing you."

Thus, I don't think it's necessary to be able to replicate a paper in 10-20 hours. Replicating papers is a great idea according to my own research, but I think that one can be considerably slower than that and still be at a useful standard.

If you have other sources that suggest otherwise I'd be very interested to read them - it's always good to improve my idea of where I'm heading towards!

GabeM, Sep 2 2022

Thanks for sharing your experiences, too! As for transformers, yeah it seems pretty plausible that you could specialize in a bunch of traditional Deep RL methods and qualify as a good research engineer (e.g. very employable). That's what several professionals seem to have done, e.g. Daniel Ziegler.

But maybe that's changing, and it's worth it to start learning things. It seems like most of the new RL papers incorporate some kind of transformer encoder in the loop, if not basically being a straight-up Decision Transformer.

Jay Bailey, Sep 2 2022

Interesting. Do you have any good examples?

GabeM, Sep 2 2022

Sure!

Thanks, that's a good point! I was very uncertain about that, it was mostly a made-up number. I do think the time to implement an ML paper depends wildly on how complex the paper is (e.g. a new training algorithm paper necessitates a lot more time to test it than a post-hoc interpretability paper that uses pre-trained models) and how much you implement (e.g. rewrite the code but don't do any training vs evaluate the key result to get the most important graph vs try to replicate almost all of the results).

I now think my original 10-20 hours per paper number was probably an underestimate, but it feels really hard to come up with a robust estimate here and I'm not sure how valuable it would be, so I've removed that parenthetical from the text.

quinn, Sep 30 2022

Epistemic status: a few projects in ML, technically "a professional ML engineer" right now but I don't think I'm good enough to get hired by one of the big EA-ML orgs right now.

Two points re the start of your math sequence

A lot of calculus is geared toward engineers and the way engineering is taught, leaving a gap in the basic language of mathematics that IME can be filled by discrete math. This pays dividends when you're reading wikipedia and research papers later. It doesn't take long to get ahold of sets and logic with trevtutor.
Single variable calculus may require exercises. I find it really odd that you think 3b1b gets the job done; it probably doesn't. Luckily, single variable calculus is well before you've maxed out khanacademy's wonderful exercises widget. Some ways of doing the rote calculations are a waste of time (like professors who don't seem to realize that if you've derivative'd one polynomial you've derivative'd every polynomial), but the rote computations really pay dividends in your overall sophistication level (from the symbolic find-and-replace game to mystery solving to even thinking on your feet about applications).

Joseph Quevedo, Feb 4 2024

I am currently going through the Data Analysis with Python course on FreeCodeCamp.

I have to say, just watching a video and then answering a question is not very interactive, and has made it hard for me to keep engaging with the course.

When I was doing the certification projects for Scientific Computing with Python in December, I was interested in progressing with my project. Still, lately, I've been dreading or uninterested in watching one more video of this guy going through a Jupyter notebook.

Another thing I disliked about it is that the "exercises" already have the answers in them.

I searched for this link in Reddit and found someone recommending a FreeCodeCamp course on Data Analysis with Python with Jovian, it supposedly being more interactive (you do have to pay for submitting assignments), but I was able to see the assignments, so it might be useful in case someone wants more practice and hands-on things to do.

I think I won't switch courses halfway, I don't want to get stuck in tutorial hell.

Joseph Quevedo, Mar 8 2024

So I finished the Data Analysis with Python course on FCC. I have to say, the certification projects may use some library functions that were not shown in the course videos (as of today). One example is SciPy's linregress: you won't see an explanation of it, but you'll be required to do a linear regression. Good luck if you have no math background in functions.

Joseph Quevedo, Jan 26 2024

This is a great resource, and one I've been using myself. Even though I'm a software engineer, I've been doing the Python FreeCodeCamp courses because I love FCC and I don't use Python daily at work.

I think there should be a Slack workspace or a Discord server for people doing this - I've felt that getting the maths in or the pre-requisite Machine Learning concepts is not a matter of just watching a few YouTube videos.

Vael Gates, Oct 12 2022

Thanks Gabriel-- super useful step-by-step guide, and also knowledge/skill clarification structure! I usually gesture around vaguely when talking about my skills (I lose track of how much I know compared to others-- the answer is I clearly completed Levels 1-3 then stopped) and trying to hire other people with related skills. It feels useful to be able to say to someone e.g. "For this position, I want you to have completed Level 1 and have a very surface level grasp of Levels 2-4"!

GabeM, Oct 13 2022

Ha thanks Vael! Yeah, that seems hard to standardize but potentially quite useful to use levels like these for hiring, promotions, and such. Let me know how it goes if you try it!

jmsdao, Sep 8 2022

Great list!

Probably an important skillset that's missing is working with cloud computing services, which you may need if you want to train models that require more resources than your local machine or Google Colab provides.

GabeM, Sep 10 2022

Thanks! Forgot about cloud computing, added a couple of courses to the Additional Resources of Level 4: Deep Learning.

PabloAMC 🔸, Sep 2 2022

The HuggingFace RL course might be an alternative in the Deep Learning - RL discussion above: https://github.com/huggingface/deep-rl-class

Good find, added!

[anonymous], Sep 30 2022

For Level 3: Machine Learning, this document might be useful. It provides a quick summary/recap of a lot of the math required for ML.

Joseph Quevedo, Dec 6 2024

I think there is also a similar book, "Mathematics For Machine Learning" by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, Cambridge University Press (2020)

https://mml-book.com

Emrik, Sep 3 2022

Many thanks to Jakub Nowak, Peter Chatain, Thomas Woodside, Erik Jenner, Jacy Reese Anthis, Ludwig Wittgenstein, and Konstantin Pilz for review and suggestions!

GabeM, Sep 3 2022

Oh lol I didn't realize that was a famous philosopher until now, someone commented from a Google account with that name! Removed Ludwig.

Emrik, Sep 3 2022

I didn't mean for you to remove it! I was just happy to see that he still has some influence from beyond the grave.

Artyom K, Jan 10 2023

That's a great resource to navigate my self study in ML! Thank you for compiling this list.

I wonder if a pull request to some popular machine learning library or tool counts as a step towards AI Safety Research. Say a PR implements some safety feature for PyTorch, e.g. in interpretability, adversarial training, or another area. Would it go in Level 6, since it is reimplementing papers? Making a PR arguably takes more effort than just reimplementing a paper, as it needs to fit into a tool.

Effective Altruism Forum