SERI ML Alignment Theory Scholars Program 2022

Ryan Kidd; Victor Warlop; Oliver Z

The Stanford Existential Risks Initiative (SERI) recently opened applications for the second iteration of the ML Alignment Theory Scholars (MATS) Program, which aims to help aspiring alignment researchers enter the field by pairing them with established research mentors and fostering an academic community in Berkeley, California over the summer. Current mentors include Alex Gray, Beth Barnes, Evan Hubinger, John Wentworth, Leo Gao and Stuart Armstrong. Applications close on May 15 and include a written response to mentor-specific selection questions, viewable on our website.

Who is this program for?

Our ideal applicant has:

an understanding of the AI alignment research landscape equivalent to having completed EA Cambridge’s AGI Safety Fundamentals course;
previous experience with technical research (e.g. ML, CS, maths, physics, neuroscience, etc.);
strong motivation to pursue a career in AI alignment research.

For the first stage of the program, we asked each alignment researcher to provide a set of questions sufficient to select candidates they would be happy to mentor. Applicants can apply to multiple mentors, but will have to complete each mentor’s selection questions.

What will this program involve?

Over four weeks, the participants will develop an understanding of a research agenda at the forefront of AI alignment through online readings and cohort discussions, averaging 10 h/week from Jun 6 to Jul 1. After this initial upskilling period, the scholars will be paired with an established AI alignment researcher for a two-week “research sprint” to test fit from Jul 4 to Jul 15. Assuming all goes well, scholars will be accepted into an eight-week intensive research program in Berkeley, California over the US summer break (Jul 25 to Sep 16).

Participants will receive a $6,000 grant for completing the training and research sprint and $16,000 at the conclusion of the program. Furthermore, all expenses will be covered, including accommodation, office space and networking events with the Bay Area alignment community. We are happy to continue providing funding to promising scholars after the two-month period, at the discretion of our research mentors. International students can apply to the program, and will arrive in the US on a B1 visa.

We hope to run another iteration of the program in the winter, and possibly in the fall. If you are not able to apply for the summer program, we encourage you to apply for the fall or winter. We may be able to offer different types of visas in future iterations.

Theory of change

This section is intended to explain the reasoning behind our program structure and is not required reading for any applicant. SERI MATS’ theory of change is as follows:

We believe that AI alignment research is pre-paradigmatic, with a diversity of potentially promising research agendas. Therefore, we aim to support many different alignment research agendas to decorrelate failure. We also aim to accelerate the development of scholars into researchers capable of pursuing original agendas and mentoring further scholars.
We believe that working 1:1 with a mentor is the best and quickest way to develop the ability to conduct alignment theory research, and that reading a curriculum alone is worse for a large number of participants. Moreover, we believe that our target scholars might be able to produce value directly for the mentors by acting as research assistants. For the first few months, we are generally more excited about mentees working on an established mentor’s research agenda than on their own.
We believe that our limiting constraint is mentor time. This means we wish to have strong filtering mechanisms (e.g. candidate selection questions) to ensure that each applicant is suitable for each mentor. We’d rather risk rejecting a strong participant than admit a weak one. Mentors are free to leave the program at any time.
We believe that MATS should be a “mentor-centered” program, in that we are willing to be very flexible in accommodating mentors’ preferences about the structure and implementation of the program.
We believe that there exists a large population of potential alignment researchers who are held back not by some innate lack of talent, but by more mundane barriers, which we can address:
1. Lack of networking within the community to find mentors;
2. Lack of peers and a cohort to discuss research with;
3. Lack of financial stability; or
4. Low risk tolerance.
We believe that creating a strong alignment theory community, where scholars share housing and offices, could be extremely beneficial for the development of new ideas. We have already seen promising results of alignment theory collaboration at the office space and housing we provided for the first iteration of SERI MATS and hope to see more!

We are happy to hear any feedback on our aims or strategy. If you would like to become a mentor or join MATS as a program organiser for future program iterations, please send us an email at vwarlop@stanford.edu.

electroswing, May 7, 2022

I worry that the current format of this program might filter out promising candidates who are risk averse. Specifically, the fact that candidates are only granted the actual research opportunity "Assuming all goes well" is a lot of risk to take on. For driven undergraduates, a summer opportunity falling through is costly, and they might not apply just because of this uncertainty.

Currently your structure is like PhD programs which admit students to a specific lab (who may be dropped from that lab if they're not a good fit, and in that case, will have to scramble to find an alternative placement).

Maybe a better model for this program is PhD programs that admit a strong cohort of students. Instead of one two-week research sprint, maybe you have 2-3 shorter research sprints ("rotations"). From a student perspective this would probably lower the probability of them being dropped (since all of the mentors would have to dislike them for this to happen).

What you're currently doing seems like a fine option for you with little downside for the students if:

1) "Assuming all goes well" means >90% of students continue on with research

2) The projects are sufficiently disjoint that it's unlikely a student is going to be a good fit for more than one project (I think this is probably false but you know more than me, and maybe you think it's true)

3) 2-week research sprints are much more valuable than 1-week research sprints (I am not convinced of this but maybe you are)

If not all of these are the case, I argue it might be better to do rotations / find other ways to make this less risky for candidates.

Another idea to avoid filtering out risk-averse candidates: you could promise that if they don't get matched with a mentor, they can at least do <some other project>; for example, they could be paid to distill AI Safety materials.
