All of Max Nadeau's Comments + Replies

I'd love to hear his thoughts on defensive measures for "fuzzier" threats from advanced AI, e.g. manipulation, persuasion, "distortion of epistemics", etc. Since it seems difficult to delineate when these sorts of harms are occuring (as opposed to benign forms of advertising/rhetoric/expression), it seems hard to construct defenses.

This is a related concept mechanisms for collective epistemics like prediction markets or community notes, which Vitalik praises here. But the harms from manipulation are broader, and could route through "superstimuli", addictiv... (read more)

1
Roman Leventov
3mo
And also: about the "AI race" risk a.k.a. Moloch a.k.a. https://www.lesswrong.com/posts/LpM3EAakwYdS6aRKf/what-multipolar-failure-looks-like-and-robust-agent-agnostic

Disclaimer: I joined OP two weeks ago in the Program Associate role on the Technical AI Safety team. I'm leaving some comments describing questions I wanted to know to assess whether I should take the job (which, obviously, I ended up doing).

What sorts of personal/career development does the PA role provide? What are the pros and cons of this path over e.g. technical research (which has relatively clear professional development in the form of published papers, academic degrees, high-status job titles that bring public credibility)?

For me personally, research and then grantmaking at Open Phil has been excellent for my career development, and it's pretty implausible that grad school in ML or CS, or an ML engineering role at an AI company, or any other path I can easily think of, would have been comparably useful. 

If I had pursued an academic path, then assuming I was successful on that path, I would be in my first or maybe second year as an assistant professor right about now (or maybe I'd just be starting to apply for such a role). Instead, at Open Phil, I wrote less-academic re... (read more)

Disclaimer: I joined OP two weeks ago in the Program Associate role on the Technical AI Safety team. I'm leaving some comments describing questions I wanted to know to assess whether I should take the job (which, obviously, I ended up doing).

How inclined are you/would the OP grantmaking strategy be towards technical research with theories of impact that aren’t “researcher discovers technique that makes the AI internally pursue human values” -> “labs adopt this technique”. Some examples of other theories of change that technical research might have:

  • Provi
... (read more)
9
Ajeya
6mo
I'm very interested in these paths. In fact, I currently think that well over half the value created by the projects we have funded or will fund in 2023 will go through "providing evidence for dangerous capabilities" and "demonstrating emergent misalignment;" I wouldn't be surprised if that continues being the case.

Disclaimer: I joined OP two weeks ago in the Program Associate role on the Technical AI Safety team. I'm leaving some comments describing questions I wanted to know to assess whether I should take the job (which, obviously, I ended up doing).

How much do the roles on the TAIS team involve engagement with technical topics? How do the depth and breadth of “keeping up with” AI safety research compare to being an AI safety researcher?

2
Ajeya
6mo
The way I approach the role, it involves thinking deeply about what technical research we want to see in the world and why, and trying to articulate that to potential grantees (in one-on-one conversations, posts like this one, RFPs, talks at conferences, etc) so that they can form a fine-grained understanding of how we're thinking about the core problems and where their research interests overlap with Open Phil's philanthropic goals in the space. To do this well, it's really valuable to have a good grip on the existing work in the relevant area(s).

Disclaimer: I joined OP two weeks ago in the Program Associate role on the Technical AI Safety team. I'm leaving some comments describing questions I wanted to know to assess whether I should take the job (which, obviously, I ended up doing).

What does OP’s TAIS funding go to? Don’t professors’ salaries already get paid by their universities? Can (or can't) PhD students in AI get no-strings-attached funding (at least, can PhD students at prestigious universities)?

3
Ajeya
6mo
Professors typically have their own salaries covered, but need to secure funding for each new student they take on, so providing funding to an academic lab allows them to take on more students and grow (it's not always the case that everyone is taking on as many students as they can manage). Additionally, it's often hard for professors to get funding for non-student expenses (compute, engineering help, data labeling contractors, etc) through NSF grants and similar, which are often restricted to students.

Disclaimer: I joined OP two weeks ago in the Program Associate role on the Technical AI Safety team. I'm leaving some comments describing questions I wanted to know to assess whether I should take the job (which, obviously, I ended up doing).

Is it way easier for researchers to do AI safety research within AI scaling labs (due to: more capable/diverse AI models, easier access to them (i.e. no rate limits/usage caps), better infra for running experiments, maybe some network effects from the other researchers at those labs, not having to deal with all the log... (read more)

3
Ajeya
6mo
I think this is definitely a real dynamic, but a lot of EAs seem to exaggerate it a lot in their minds and inappropriately round the impact of external research down to 0. Here are a few scattered points on this topic: * Third party researchers can influence the research that happens at labs through the normal diffusion process by which all research influences all other research. There's definitely some barrier to research insight diffusing from academia to companies (and e.g. it's unfortunately common for an academic project to have no impact on company practice because it just wasn't developed with the right practical constraints in mind), but it still happens all the time (and some types of research, e.g. benchmarks, are especially easy to port over). If third party research can influence lab practice to a substantial degree, then funding third party research just straightforwardly increases the total amount of useful research happening, since labs can't hire everyone who could do useful work.   * It will increasingly be possible to do good (non-interpretability) research on large models through APIs provided by labs, and Open Phil could help facilitate that and increase the rate at which it happens. We can also help facilitate greater compute budgets and engineering support. * The work of the lab-external safety research community can also impact policy and public opinion; the safety teams at scaling labs are not their only audience. For example, capability evaluations and model organisms work both have the potential to have at least as big an impact on policy as they do on technical safety work happening inside labs. * We can fund nonprofits and companies which directly interface with AI companies in a consulting-like manner (e.g. red-teaming consultants); I expect an increasing fraction of our opportunities to look like this. * Academics and other external safety researchers we fund now can end up joining scaling labs later (as e.g. Ethan Perez and Colli

Sampled from my areas of personal interest, and not intended to be at all thorough or comprehensive:

AI researchers (in no particular order):

  • Prof. Jacob Steinhardt: author of multiple fascinating pieces on forecasting AI progress and contributor/research lead on numerous AI safety-relevant papers.
  • Dan Hendrycks: director of the multi-faceted and hard-to-summarize research and field-building non-profit CAIS.
  • Prof. Sam Bowman: has worked on many varieties of AI safety research at Anthropic and NYU
  • Ethan Perez: researcher doing fascinating work to display and add
... (read more)

+1 on David Thorstad

Best of luck with your new gig; excited to hear about it! Also, I really appreciate the honesty and specificity in this post.

From the post: "We plan to have some researchers arrive early, with some people starting as soon as possible. The majority of researchers will likely participate during the months of December and/or January."

Artir Kel (aka José Luis Ricón Fernández de la Puente) at Nintil wrote an essay broadly sympathetic to AI risk scenarios but doubtful of a particular step in the power-seeking stories Cotra, Gwern, and others have told. In particular, he has a hard time believing that a scaled-up version of present systems (e.g. Gato) would learn facts about itself (e.g. that it is an AI in a training process, what its trainers motivations would be, etc) and incorporate those facts into its planning (Cotra calls this "situational awareness"). Some AI safety researchers I'v... (read more)

2
Jose Luis Ricon
1y
I wonder too!

It is possible but unlikely that such a person would be a TA. Someone with little prior ML experience would be a better fit as a participant.

We intended that sentence to be read as: "In addition to people who plan on doing technical alignment, MLAB can be valuable to other sorts of people (e.g. theoretical researchers)".