MATS Summer 2023 Retrospective

utilistrutil

32 min read · Dec 2, 2023

Comments 3

NickLaing

Thanks, that's a thorough report.

Quick note: "postmortem" kind of sounds like something bad happened that triggered the report, which wasn't the case. Perhaps "review" or "roundup" or similar might be a more positive word to use.

Ryan Kidd

Cheers, Nick! We decided to change the title to "retrospective" based on this and some LessWrong comments.

SummaryBot

Executive summary: The ML Alignment & Theory Scholars (MATS) program supported 60 AI safety scholars with mentorship, training, housing, infrastructure, and funding. Scholars improved technical ability, research taste, and knowledge breadth, and reported many positive connections with peers and researchers. Scholars and mentors form part of a talent pipeline for AI safety.

Key points:

- 60 scholars studied AI safety for 3 months with 15 mentors. Scholars rated mentors highly (8/10) and are likely to recommend MATS (8.9/10).
- Scholars improved technical research skills (self-rated 7.2/10 vs counterfactual summer), knowledge breadth (+1.75/10), research taste (5.9-6.9/10), and made 10 professional connections on average.
- Scholars faced fewer career obstacles after MATS, but lack of publications remained an issue. Mentors strongly endorsed 94% of scholars to continue research.
- Scholars valued community, seminars and Scholar Support coaching in addition to mentorship. Scholar Support meetings were valued at $750-$3700 in grant equivalent.
- MATS will improve applicant screening, support technical skills and research management, and reduce seminars for the next cohort.

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

Examples of selection questions that some mentors included in their application:

- Spend 2-4 hours using language models to develop a dataset of question-answer pairs on which models trained by RLHF will likely give friendly but incorrect responses. Then record yourself for 2-4 hours evaluating models on your dataset and plotting scaling laws. Bonus points if your dataset shows inverse scaling.

- I would like you to spend ~10 hours (for fairness, max 20) trying to make research progress on a small open problem in mechanistic interpretability, and show me what progress you’ve made. Note that I do not expect you to solve the problem, to have a background in mech interp, get impressive results, or really to feel that you know what you’re doing.

- Is “Bayesian expected utility maximizer” a “True Name” for a generally powerful intelligence? Why/why not?

The trainees who did not progress were concentrated among two mentors, who informed their trainees about their prospects for getting into the Research Phase prior to the Training Phase. The primary determinant of these mentors’ decisions was scholars’ performance in a research sprint at the end of the Training Phase. Knowing their odds, the trainees in these streams elected to participate, in part due to the high value of a guaranteed month working with their mentor. Jay Bailey, a trainee from Winter 2022-23, attests,

“Neel [Nanda]'s training process in mechanistic interpretability was a great way for me to test my fit in the field and collaborate with a lot of smart people. Neel's stream is demanding, and he expects it to be an environment that doesn't work for everyone, but is very clear that there's no shame in this. While I didn't end up getting selected for the in-person phase, going through the process helped me understand whether I wanted to pursue mechanistic interpretability in the long term, and firm up my plans around how best to contribute to alignment going forward.”

This question is commonly used to calculate a “net promoter score” (NPS) which is standard in many industries. Based on our respondents, the NPS for MATS is +69.
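
For readers unfamiliar with the metric, NPS is conventionally computed from 0-10 "how likely are you to recommend" ratings as the percentage of promoters (ratings of 9 or 10) minus the percentage of detractors (ratings of 0 through 6). The following is a minimal sketch of that standard calculation; the function name and the example ratings are hypothetical and are not the MATS survey data.

```python
def net_promoter_score(ratings):
    """Return the NPS for a list of 0-10 ratings.

    NPS = (% of ratings that are 9-10) - (% of ratings that are 0-6).
    Ratings of 7-8 ("passives") count only toward the total.
    """
    if not ratings:
        raise ValueError("ratings must be non-empty")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical responses for illustration only, not actual MATS data.
example_ratings = [10, 9, 9, 8, 7, 10, 6, 9, 10, 8]
print(net_promoter_score(example_ratings))
```

Because passives dilute the score without adding to it, the result ranges from -100 (all detractors) to +100 (all promoters).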

This survey question and the following one come from Lynette Bye’s 2020 review of her coaching impact.

The following histograms exclude a scholar who increased from 0 to 999 available professionals and potential collaborators, explaining, “at the start I basically felt like I could message / call up no-one and now I feel like I can message / call up anyone quite literally.”

The question elaborated, “If you want a more specific scoping: how many professionals would you feel able to contact for 30 minutes of career advice? Factors that influence this include what professionals you know and whether you have sufficient connection that they'll help you out. A rough estimate is fine, and this question isn't just about people you met in MATS!”

The question elaborated, “Imagine you had some research project idea within your alignment field of interest. How many people that you know could plausibly be collaborators? A rough estimate is fine, and this question isn't just about people you met in MATS!”

MATS Summer 2023 Retrospective

Summary

Theory of Change

Overview of MATS Summer 2023 Program

Schedule Overview

Program Elements

Scholar Support

Workshops

Networking Events

Seminars

Community Health

Mentor Selection

Scholar Selection

Educational attainment of scholars

Counterfactual summers

Training Phase (June 5 - June 30)

Research Phase (July 10 - Sep 1)

Mentorship styles

Scholar Support

Research Milestones

Workshops, Seminars, Networking Events

Community Health

Extension Phase (Sep 18 - Jan 1)

Evaluation of MATS Summer 2023

Evaluating Program Elements

Alignment 201

Overall Program

Value of Mentorship

Value of Scholar Support

Workshops & Seminars

Community Health

Evaluating Outcomes

Research Milestones

Research ability

Networking

Career Obstacles

Lessons and Changes for Future Programs

1. Filtering better during the application stage

2. Providing more technical support

3. More proactively supporting professional development beyond mentorship

4. Offering research management help to mentors

5. Reducing the number of seminars

6. Improving communication with scholars during the program

7. Developing more robust internal systems

8. Frontloading social events

Acknowledgements