A huge thank you to Stephen Casper, Rachel Freedman, Lewis Hammond, and Pablo Moreno for taking the time to join our panel and answer questions!
On November 7, 2021, AI Safety Support hosted a Q&A panel with four PhD students working on AI safety. They discussed numerous topics including: finding a supervisor (especially when a university doesn’t have faculty working on AI safety), self-studying, prioritizing direct AI safety work versus working with a good advisor, studying AI safety from a philosophy versus quantitative perspective, and more!
Below is a transcript of the conversation, edited for brevity. We hope this can be a useful resource for students. You can also find a recording of the panel on our event page or view it directly via this link.
Additionally, we encourage students to check out our resources page for Graduate Studies which has a compilation of advice from others in the field, funding opportunities, project ideas, and a list of schools that have students or faculty working on AI Safety (this list is not exhaustive and is based on our current database).
Finally, if you have a different answer to any of the questions below, it would be wonderful to hear your perspective. Please feel free to leave a comment, thank you!
Questions and Answers
How do you choose supervisors if your university doesn't have anyone directly interested in AI safety/AI alignment?
I work at a university where no one really knows about AI safety. However, when choosing a supervisor, I think there are a few things that are very important. Most importantly, you should simply find a good supervisor. This can make a huge difference in avoiding burnout and having adequate support. I would also look especially for supervisors who have an open mind about new topics or who would give you some freedom to work on different things. This includes allowing you to go to AI safety camps, conferences, or things of that nature, really allowing you to get in touch with other AI safety researchers. I recommend asking a potential supervisor’s current students how happy they are, how much freedom they have, and just generally gauging their satisfaction. Additionally, if there is little opportunity to work directly on AI safety during your graduate studies, you can choose a supervisor who will help you build skills for working on AI safety in the future (such as reinforcement learning, large language models, etc.).
I would say that numerous schools have supervisors doing AI-safety-relevant research, even if they don't necessarily consider themselves AI safety researchers in the way effective altruists do. Most schools where people are working on deep networks, AI policy, or algorithms and justice are trying to make those systems work better for us, which is broadly in line with AI safety research. The only difference between something I might call an explicit AI safety paper and other papers is some of the problem framing and what is being emphasized. As a result, I think finding a supervisor who is explicitly working on AI safety can be over-prioritized relative to the importance of finding a good advisor. Something that is relatively under-emphasized is the importance of the other people you may be working with in the lab. Even if there's not an advisor who's explicitly focused on AI safety, if they're focused on a broad range of topics and their students are interesting, relevant people you can collaborate with, then that's a great reason to consider that programme.
Implicit within this question is an assumption that part of the point of doing a PhD is to do useful AI safety research. I think this is a model that many people aim for, and it's something that can be done. However, I also know a bunch of people who are strongly interested in working on AI safety in the future and are doing AI PhDs, but are not directly doing safety work in their PhDs. They're just trying to do good, impactful, maybe safety-adjacent research and trying to skill up as much as possible so that they are in a strong position to pursue safety research later on.
How hard is it to get a useful job in AI safety after graduating from a related master's programme?
There certainly are jobs out there, such as research engineer or machine learning engineer positions at larger companies, that don't require PhDs. If you want to work as a research scientist you might be expected to have a PhD, but DeepMind, for example, regularly recruits people with master's, or sometimes even bachelor's, degrees. I know people who have joined their safety team with those backgrounds and are doing important, impactful work, although it can perhaps be harder than if you have a PhD. Overall, I think if you're competent, and especially if you can code and build things well, people are more than happy to hire you out of a master's for certain positions.
Instead of working on AI Safety, could pursuing an ML or AI PhD with the intention of earning-to-give be effective?
My intuition is that working directly on AI safety is more important than earning to give, and that seems to be a general intuition within the community. For example, Open Philanthropy has lots of money they're willing to give away to people who work on safety, so presumably they consider it quite important.
Given how competitive admissions are, is it worth trying to get into a top programme? If you don't get in, is it worth waiting and reapplying?
I think it depends a lot on what your goals are. Stephen and Lewis have already mentioned that you don't necessarily need to be at a top AI safety place in order to do useful AI safety work. When I was applying, I really wanted to get into one of the top US programmes, such as Berkeley and Stanford, because I was aware of people doing safety-relevant research there. But in just the past few years, there has been a proliferation of people interested in safety-relevant topics and a general acceptance of doing AI safety work at labs that aren't explicitly safety-oriented. Thus, I think what you should be aiming for is a lab and a research group that's going to work really well for you, which is one where you have good collaborators, a good adviser, and can work on topics relevant to your interests. That does not necessarily mean that you have to go to one of the top four programmes, so it’s worth the effort to get into a top programme only if all the labs you want to work at are there, which was the case for me a few years ago. I don't know if that would be the case now.
With respect to waiting a year and reapplying, it depends on what you would do with that year. In general, if you don't get admitted to PhD programmes and want to spend a year strengthening your application, you need to be doing research. If you have some publications in the pipeline that are going to come out in the next year, or there's some sort of relevant research position you could take in that time, that's likely worthwhile. If during the year you're going to be doing something that's not related to research, it's probably not going to make your application stronger, and so it's probably not worth waiting. I should say, however, that my own backup plan was an extremely rare exception to this. If I didn't get in anywhere, my plan was to do a technical master's and then reapply, which I wouldn’t recommend to most people. I thought this could make sense in my case because I didn't have a technical background but I did have a research background. I had publications, but I hadn't taken a linear algebra course and thought that some universities might be concerned by this.
I think Rachel answered as well as I ever could with respect to waiting a year. I want to add that I think there are two reasons one would not necessarily want to focus on getting into a few particular programmes. Firstly, AI safety is still somewhat of a wild west, which can’t be said of most other academic disciplines. For example, you can meaningfully shape people's thinking on research directions just by posting something really interesting on LessWrong. That happens multiple times a year. In that sense, someone doesn't necessarily need to be surrounded by a particular group of people or in a particular programme to make useful progress. Secondly, long-distance collaboration is common and useful, and probably even easier now, after the pandemic. Fun fact: I have a co-author whom I've never met, and we've worked together on two papers. I don’t know how tall he is, but we have a really good relationship and we frequently talk about research topics. You can find great collaborators in the community without being at a top programme or in a big AI safety lab.
I’ll also say, when I applied for a master's, I didn't have a good sense of how strong my application was. For example, I applied to both Cambridge and Oxford and a few other places around Europe. I got into them despite coming from a relatively unknown university in Spain, which was kind of a surprise. This should tell you something about the difficulty of self-assessment. I would recommend trying for top programmes because you have very little to lose.
For undergraduate research, would you prioritize working on problems tied to AI safety, or working with good advisors on topics you can make the most progress on?
If you're forced to make a trade-off like this, I think choosing a really good advisor and making some really good progress is useful. As far as building capital and experience, I would generally suggest erring on the side of the good advisor; however, that doesn't mean you can't also do AI safety related work on the side. You can do meaningful things independently by posting on LessWrong or writing a survey paper, which is less true in other fields. AI safety is still the kind of place where a blog post might be relevant enough for a resume.
I think you should fairly strongly prioritise working with good, important advisors on good projects as opposed to something that's AI safety relevant. If you're doing undergrad research, then your goal is likely to do work that gets you into a competitive masters or PhD later. In that case, it just makes sense to work with the best advisor you can find.
When approaching supervisors, do you recommend openly stating an interest in AI safety/alignment? What if they’ve never stated an interest in the topic?
Some professors don’t want to be approached by students until they’re accepted by the university, so be mindful of that. You can typically check whether this is the case on a professor’s website. That said, when you approach a supervisor or any potential research collaborator, you should look at their recent work and try to pitch your interests in relation to or as an extension of their recent work.
I totally support what Rachel said. I think it is most important to show potential supervisors that you understand what they are working on. When reading their papers, try to answer the question, why is this paper important? And then if you can make some connection between what they’re doing and what you're interested in, that's super great.
I think when it comes to most AI safety relevant topics, you can frequently frame them as solving near-term problems. When talking about your research interests, I recommend being just as good at talking about AGI as you are about self-driving cars or debugging biased models, things like that.
What criteria for choosing a school did you not think of, but should have?
Something I didn't think about much is the composition of the lab that you might be joining and who else is there, beyond the adviser. This is important since a lot of your time will be spent with other students, maybe postdocs, and so on. In some labs, if the professor is particularly busy or high profile, you might end up being predominantly supervised by the more senior people in your lab (i.e. postdocs). Chatting with them and finding out who they are and the general flavour of the lab is something I think I should have done.
I didn't realise that the funding resources of my school would influence my experience, although I don’t think it would have swayed my decision. When I was applying, I just looked at the specific programmes or labs and their resources. But there are some situations where the resources of the university come into play. I happen to be in one of the most underfunded public university departments in the US on a per-student level (UC Berkeley), so this has come up quite a lot in the past couple of years. Sometimes wild things happen. For example, for a while we just didn't have enough desks for everyone. For most of this semester, there hasn't been usable Wi-Fi. COVID has definitely worsened the situation, but it is worth noting that better-funded schools like Stanford have provided better resources to their students through this crisis. And even beforehand, there was a notable difference in resources at public versus private universities, at least in the US.
One thing to consider is that PhDs are typically a lot longer in the US when compared to the UK, for instance. When I was applying, the idea of a shorter PhD appealed to me quite a bit; however, it can be good or bad depending on how you want to use your PhD. I am now about halfway through my PhD and I think it is way too short. I'm trying to drag it out as much as possible, but this is very hard to do in the UK as you generally need a concrete reason and can’t just extend your research.
"They don't have any publications, but they ____, so we should accept them." - says the admissions committee. What can go in the blank?
The admissions committee wants to know that you're able to do research; that's why they focus so much on publications. If you don’t have any, the next best thing you can do is demonstrate your knowledge of the topic and show that you’re capable of research.
One good thing could be having a research job or experience doing research-like things, even if it didn't lead to a publication. Aside from a job, there are non-paper deliverables from research, such as a really good blog post that shows a strong understanding of what you were doing. A third thing could be connecting with a professor at a school you’re interested in and trying to pitch yourself and your ideas (e.g. maybe they’re EA-aligned or sympathetic and would be interested in your LessWrong posts). Having such a professor vouch for you will definitely make an application to that school stronger. Finally, just because you don't currently have a clear opportunity to publish doesn’t mean you can’t write something on your own in the next year. Even if you don’t have access to compute, you can still write a survey paper and put it on arXiv. That's something people would pay attention to.
How do you self-study math and other things, such that you can do productive work at CHAI? And what is CHAI actually doing, specifically?
So first, self-studying math and other things. There are lots of great resources online; I used MIT OpenCourseWare pretty heavily. Berkeley also has a number of courses with lectures on YouTube, such as a deep reinforcement learning course. It's not too difficult to find the resources to study these things; what is really difficult is finding the time and mental energy. Taking a break between finishing my undergraduate degree and applying to PhDs helped give me extra time for self-study.
What is CHAI doing? Our research agenda is a little bit amorphous. Since PhD students are very independent, we have a lot of freedom to set our own agendas and motivations, so CHAI is kind of a collection of half-formed PhD student research agendas. That said, the most popular topics are probably related to reward modelling, learning from human feedback, or learning from human interaction. This can broadly be defined to include things like Cooperative Inverse Reinforcement Learning (CIRL), assistance games, and inverse reinforcement learning. There's also at least one person working on interpretability of neural networks and at least one person focusing on game theory. There are also researchers associated with CHAI who are not PhD students, but I can't speak directly to their work.
I’ll share what I consider a recipe for doing a deep dive into a subfield of the AI safety literature. If you ignore everything else, this is likely the most valuable thing I’ll say. A deep dive has three really important components: first, choosing a subfield of the academic literature you're interested in; second, reading 40 or 50 papers or more and taking notes on all of them; and third, producing some sort of end deliverable. You can choose any topic you're interested in as long as it's not too broad or too narrow. For example, you could choose safe exploration in reinforcement learning, adversarial attacks on neural networks, interpretability of neural networks, bias within neural systems, etc. Once you’ve taken notes on 40 or 50 papers, you have an understanding comparable to the people doing active research in that area. Following up with a synthesis project will solidify this, whether it's a research proposal, blog post, survey paper, application, repository, etc. I can attest to having done a couple of deep dives, and I would trade any class I've ever taken for one of them.
How can you pitch a non standard background on your application?
In my statement of purpose, I really emphasized the research that I had done and also a reading group. Regarding independent study, I was really explicit about how everything I had researched or learned related to what I proposed to do in my PhD. Because most of my undergraduate coursework was not related to my PhD, I ignored it and focused on my independent study. In the interview, when asked about my unusual background, I did a similar thing. Additionally, my letters of recommendation were from technical researchers I had worked with, and I asked them to emphasize my technical knowledge; your letters of recommendation are a great place to include evidence of whatever your application doesn’t show. For example, if you've submitted a research paper to a conference but you're not going to hear back before the application deadline, get one of your letter writers to say that you've submitted it. In my case, my thesis advisor was aware of the independent study I had done, so I asked him to include that in his reference letter. My general advice is to think explicitly about what you want to be doing in a research programme and then explain how everything you've done has prepared you for that.
Most importantly, when proposing research topics in my statement of purpose, I showed more than a superficial understanding, which is what they care about. They don't care whether you got that from a class or from independent study; they just care that you have a technical understanding, can frame research topics, and can understand research that's already been done. That should be a decent part of your statement of purpose.
How much AI safety work is philosophical vs. mathematical vs. computational? (Would it make sense to specialize in philosophy, e.g. decision theory?)
I would strongly favour mathematical and computational degrees. I think that philosophy is great for highlighting the importance of AI safety, but solving concrete problems requires you to be a bit technical.
Having done degrees in philosophy and mathematics, AI, and now computer science, I want to echo what Pablo has said. Although a philosophy background can be helpful for addressing interesting questions in AI safety, it isn’t clearly relevant to specific AI safety problems. If your goal is to be a technical AI safety researcher, going so far as to specialize in philosophy is probably less helpful than being fairly well versed in the field. Philosophy may also be comparatively easier to self-study than technical topics, so that’s another factor. Additionally, having a solid background in maths is quite helpful, as it’s difficult to learn as you go. I feel very glad to have started with maths before moving into computer science rather than the other way around.