Ryan Kidd

Co-Director @ SERI MATS
356 karma · Joined Sep 2021 · Working (0-5 years) · Berkeley, CA, USA
serimats.org

Bio


Co-Director at SERI MATS (2022-current)

Ph.D. in Physics from the University of Queensland (2022)

Group organizer at Effective Altruism UQ (2017-2021)

Comments (19)

Speaking on behalf of MATS, we offered support to the following AI governance/strategy mentors in Summer 2023: Alex Gray, Daniel Kokotajlo, Jack Clark, Jesse Clifton, Lennart Heim, Richard Ngo, and Yonadav Shavit. Of these people, only Daniel and Jesse decided to be included in our program. After reviewing the applicant pool, Jesse took on three scholars and Daniel took on zero.

I think that one's level of risk aversion in grantmaking should depend on the upside and downside risk of grantees' action space. I see a potentially high upside to AI safety standards or compute governance projects that are specific, achievable, and verifiable, and that are rigorously determined by AI safety and policy experts. I see a potentially high downside to low-context, high-bandwidth efforts to slow down AI development that are unspecific, unachievable, or unverifiable, and that generate controversy or opposition that could negatively affect later, better efforts.

One might say, "If the default is pretty bad, surely there are more ways to improve the world than harm it, and we should fund a broad swathe of projects!" I think that the current projects to determine specific, achievable, and verifiable safety standards and compute governance levers are actually on track to be quite good, and we have a lot to lose through high-bandwidth, low-context campaigns.

Thanks Joseph! Adding to this, our ideal applicant has:

  • an understanding of the AI alignment research landscape equivalent to having completed the AGI Safety Fundamentals course;
  • previous experience with technical research (e.g. ML, CS, maths, physics, neuroscience, etc.), ideally at a postgraduate level;
  • strong motivation to pursue a career in AI alignment research, particularly to reduce global catastrophic risk.

MATS alumni have gone on to publish safety research (LW posts here), join alignment research teams (including at Anthropic and MIRI), and found alignment research organizations (including a MIRI team, Leap Labs, and Apollo Research). Our alumni spotlight is here.

  • We broadened our advertising approach for the Summer 2023 Cohort, including a Twitter post and a shout-out on Rob Miles' YouTube and TikTok channels. We expected some lowering of average applicant quality as a result but have yet to see a massive influx of applicants from these sources. We additionally focused more on targeted advertising to AI safety student groups, given their recent growth. We will publish updated applicant statistics after our applications close.
  • In addition to applicant selection and curriculum elements, our Scholar Support staff, introduced in the Winter 2022-23 Cohort, supplement the mentorship experience by providing 1-1 research strategy and unblocking support for scholars. This program feature aims to:
    • Supplement and augment mentorship with 1-1 debugging, planning, and unblocking;
    • Allow air-gapping of evaluation and support, improving scholar outcomes by resolving issues they would not take to their mentor;
    • Solve scholars’ problems, giving more time for research.
  • Defining "good alignment research" is very complicated and merits a post of its own (or two, if you also include the theories of change that MATS endorses). We are currently developing scholars' research ability through curriculum elements focused on breadth, depth, and epistemology (the "T-model of research").
  • Our Alumni Spotlight includes an incomplete list of projects we highlight. Many more past scholar projects seem promising to us but have yet to meet our criteria for inclusion here. Watch this space.
  • Since Summer 2022, MATS has explicitly been trying to parallelize the field of AI safety as much as is prudent, given the available mentorship and scholarly talent. In longer-timeline worlds, more careful serial research seems prudent, as growing the field rapidly is a risk for the reasons outlined in the above article. We believe that MATS' goals have grown more important as timelines have shortened (though MATS management has not updated much on timelines, as they already seemed fairly short in our estimation).
  • MATS would love to support senior research talent interested in transitioning into AI safety! Postdocs generally make up about 10% of our scholars, and we would like this number to rise. Our current advertising strategy assumes that the broader AI safety community is adequately reaching these populations (which seems false), so it might change for future cohorts.

Copying over the Facebook comments I just made.

Response to Kat, intended as a devil's advocate stance:

  1. As Tyler said, funders can already query other funders regarding projects they think might have been rejected. I think the unilateralist's curse argument holds if the funding platform has at least one risk-averse big spender. I'm particularly scared about random entrepreneurs with typical entrepreneurial risk tolerance entering this space and throwing money at projects without concern for downsides, not about e.g. Open Phil + LTFF + Longview accessing a central database (though such a database should be administered by those orgs, probably).
  2. I'm very open to hearing solutions to the risk of a miscommunication-induced unilateralist's curse. I think a better solution than a centralized funding database would be a centralized query database: any risk-averse funder could submit a request for information to a trusted third party, who knows every proposal that was rejected by any funder and can connect the prospective funder with the funders who rejected the proposal for more information. This reduces the chances that potentially risk-tolerant funders get pinged with every grant proposal, but increases the chances that risk-averse funders request information that might help them reject too-risky proposals. I know it's complicated, but it seems like a much better mechanism design if one is risk-averse. (A minimal sketch of this mechanism appears after this list.)
  3. It seems pretty unlikely that small projects will get noticed or critiqued on the EA Forum, but low-quality small projects might be bad en masse. Future Fund gave a lot of money to projects and people that got low visibility but might have contributed to "mass movement building concerns" around "diluted epistemics", "counterfactually driving the wheel of AI hype and progress," and "burning bridges for effective outreach."
  4. Open-sourcing funding analysis is a trade-off between false positives and false negatives for downside risk. Currently, I'm much more convinced that Open Phil and other funders are catching the large-downside projects than I am that an open-source model would avoid them. False alarms seem safer than downside risk to me too, but this might be because I have a particularly low opinion of entrepreneurial risk tolerance and feel particularly concerned about "doing movement building right" (happy to discuss MATS' role in this, btw).
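To make the query-database idea in point 2 concrete, here is a minimal, purely illustrative sketch in Python. The names QueryRegistry, register_rejection, and request_information are hypothetical (this is not an existing system); it just models the information flow in which funders log rejections privately with a trusted third party, and risk-averse funders query a specific proposal rather than having proposals broadcast to everyone:

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Rejection:
    funder: str          # org that rejected the proposal
    contact: str         # how to reach that org for follow-up
    reason_summary: str  # brief note, kept private by the third party


@dataclass
class QueryRegistry:
    """Trusted third party: stores rejections, reveals only contacts on query."""
    _rejections: dict = field(default_factory=lambda: defaultdict(list))

    def register_rejection(self, proposal_id: str, rejection: Rejection) -> None:
        # Funders log rejections privately; nothing is broadcast.
        self._rejections[proposal_id].append(rejection)

    def request_information(self, proposal_id: str, requesting_funder: str) -> list[str]:
        # A risk-averse funder asks about a specific proposal and gets back only
        # the contacts of funders who rejected it; reasons stay with the registry.
        # (In a real system the requester's identity would be verified and logged.)
        return [r.contact for r in self._rejections.get(proposal_id, [])]


# Example flow: Funder A logs a rejection; Funder B, considering the same
# proposal, asks the registry whom to talk to before deciding.
registry = QueryRegistry()
registry.register_rejection(
    "proposal-123",
    Rejection(funder="Funder A", contact="grants@funder-a.example",
              reason_summary="downside risk"),
)
print(registry.request_information("proposal-123", requesting_funder="Funder B"))
```

The point of the design is that information flows only on request: proposals are never pushed to every funder, but any risk-averse funder can pull the relevant contacts before committing.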

A few key background claims:

  1. AI safety is hard to do right, even for the experts. Doing it wrong while looking successful at it just makes AI products more marketable; it doesn't avert AGI tail risks (the scary ones).
  2. The market doesn't solve AI risk by default and probably makes it worse, even if it is composed of (ineffectively) altruistic entrepreneurs. Silicon Valley's optimism bias can be antithetical to a "security mindset." "Deploy MVP + iterate" fails if we have to get it right on the first real try. Market forces cannot distinguish between AI "saints" and "sycophants" unaided.
  3. Big AI x-risk funders are generally anchoring on the "sign-value" of impact rather than "penny-pinching" when rejecting projects. Projects might sometimes get submaximal funding because the risk/benefit ratio increases with scale.

We hope to hold another cohort starting in Nov. However, applying for the summer cohort might be good practice, and if the mentor is willing, you could just defer to winter!

I'm not advocating a stock HR department with my comment. I used "HR" as a shorthand for "community health agent who is focused on support over evaluation." This is why I didn't refer to HR departments in my post. Corporate HR seems flawed in obvious ways, though I think it's probably usually better than nothing, at least for tail risks.

In my management role, I have to juggle these responsibilities. I think an HR department should generally exist, even if management is really fair and only wants the best for the world, we promise (not bad faith, just humour).

This post is mainly explaining part of what I'm currently thinking about regarding community health in EA and at MATS. If I think of concrete, shareable examples of concerns regarding insufficient air-gapping in EA or AI safety, I'll share them here.

Yeah, I think that EA is far better at encouraging and supporting disclosure to evaluators than, for example, private industry. I also think EAs are more likely to genuinely report their failures (and I take pride in doing this myself, to the extent I'm able). However, I feel there is still room for more support in the EA community that is decoupled from evaluation, for individuals who might benefit from it.
