AI Governance Program Associate @ Open Philanthropy
1329 karma · Joined Jan 2022 · Working (0-5 years)


(Posting in a personal capacity unless stated otherwise.) I help allocate Open Phil's resources to improve the governance of AI with a focus on avoiding catastrophic outcomes. Formerly co-founder of the Cambridge Boston Alignment Initiative, which supports AI alignment/safety research and outreach programs at Harvard, MIT, and beyond, co-president of Harvard EA, Director of Governance Programs at the Harvard AI Safety Team and MIT AI Alignment, and occasional AI governance researcher. I'm also a proud GWWC pledger and vegan.


The shape of my updates has been something like:

Q2 2023: Woah, looks like the AI Act might have a lot more stuff aimed at the future AI systems I'm most worried about than I thought! Making that go well now seems a lot more important than it did when it looked like it would mostly be focused on pre-foundation model AI. I hope this passes!

Q3 2023: As I learn more about this, it seems like a lot of the value is going to come from the implementation process: depending on how the standard-setting organizations and member states operationalize it, the same text in the actual Act could wind up either specifically requiring things that meaningfully reduce the risks, or just imposing a lot of costs at a lot of points in the process without aiming at the most important parts. But still, for that to happen at all, it needs to pass without the general-purpose AI provisions being removed.

November 2023: Oh no, France and Germany want to take out the stuff I was excited about in Q2. Maybe this will not be very impactful after all.

December 2023: Oh good, actually it seems like they've figured out a way to focus the costs France/Germany were worried about on the very most dangerous AIs and this will wind up being more like what I was hoping for pre-November, and now highly likely to pass!

The text of the Act is mostly determined, but it delegates tons of very important detail to standard-setting organizations and implementation bodies at the member-state level.

(Cross-posting from LW)

Thanks for these thoughts! I agree that advocacy and communications are an important part of the story here, and I'm glad you added some detail on that in your comment. I’m also sympathetic to the claim that serious thought about “ambitious comms/advocacy” is especially neglected within the community, though I think it’s far from clear that the effort that went into the policy research that identified these solutions, or the work on the ground in Brussels, should have been shifted at the margin to the kinds of public communications you mention.

I also think Open Phil’s strategy is pretty bullish on supporting comms and advocacy work, but it has taken us a while to acquire the staff capacity to gain context on those opportunities and begin funding them, and perhaps there are specific opportunities that you're more excited about than we are. 

For what it’s worth, I didn’t seek significant outside input while writing this post and think that's fine (given the alternative of writing it quickly, posting it here, disclaiming my non-expertise, and getting additional perspectives and context from commenters like yourself). However, I have spoken with about a dozen people working on AI policy in Europe over the last couple months (including one of the people whose public comms efforts are linked in your comment) and would love to chat with more people with experience doing policy/politics/comms work in the EU.

We could definitely use more help thinking about this stuff, and I encourage readers who are interested in contributing to OP’s thinking on advocacy and comms to do any of the following:

  • Write up these critiques (we do read the forums!); 
  • Join our team (our latest hiring round specifically mentioned US policy advocacy as a specialization we'd be excited about, but people with advocacy/politics/comms backgrounds more generally could also be very useful, and while the round is now closed, we may still review general applications); and/or 
  • Introduce yourself via the form mentioned in this post.

It uses the language of "models that present systemic risks" rather than "very capable," but otherwise, a decent summary, bot.

(I began working for OP on the AI governance team in June. I'm commenting in a personal capacity based on my own observations; other team members may disagree with me.)

OpenPhil sometimes uses its influence to put pressure on orgs to not do things that would disrupt the status quo

FWIW I really don’t think OP is in the business of preserving the status quo. People who work on AI at OP have a range of opinions on just about every issue, but I don't think any of us feel good about the status quo! People (including non-grantees) often ask us for our thoughts about a proposed action, and we’ll share if we think some action might be counterproductive, but many things we’d consider “productive” look very different from “preserving the status quo.” For example, I would consider the CAIS statement to be pretty disruptive to the status quo and productive, and people at Open Phil were excited about it and spent a bunch of time finding additional people to sign it before it was published.

Lots of people want to work there; replaceability

I agree that OP has an easier time recruiting than many other orgs, though perhaps a harder time than frontier labs. But at risk of self-flattery, I think the people we've hired would generally be hard to replace — these roles require a fairly rare combination of traits. People who have them can be huge value-adds relative to the counterfactual!

pretty hard to steer OP from within

I basically disagree with this. There are areas where senior staff have strong takes, but they'll definitely engage with the views of junior staff, and they sometimes change their minds. Also, the AI world is changing fast, and as a result our strategy has been changing fast, and there are areas full of new terrain where a new hire could really shape our strategy. (This is one way in which grantmaker capacity is a serious bottleneck.)

Nitpick: I would be sad if people ruled themselves out for e.g. being "20th percentile conscientiousness," since in my impression the popular tests for the Big Five (OCEAN) traits are very sensitive to the implicit reference class the test-taker is using.

For example, I took one a year ago and got third percentile conscientiousness, which seems pretty unlikely to be true given my abilities to e.g. hold down a grantmaking job, get decent grades in grad school, successfully run 50-person retreats, etc. I think the explanation is basically that this is how I respond to "I am often late for my appointments": "Oh boy, so true. I really am often rushing to my office for meetings and often don't join until a minute or two after the hour." And I could instead be thinking, "Well, there are lots of people who just regularly completely miss appointments, don't pay bills, lose jobs, etc. It seems to me like I'm running late a lot, but I should be accounting for the vast diversity of human experience and answer 'somewhat disagree'." But the first thing is way easier; you kinda have to know about this issue with the test to do the second thing.

(Unless you wouldn't hire someone because they were only ~1.3 standard deviations more conscientious than I am, which is fair I guess!)
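As an aside on the arithmetic here: percentiles map to standard deviations (z-scores) nonlinearly, so modest-sounding percentile gaps near the tail correspond to sizable z-score gaps. A quick sketch of the conversion using Python's standard library, under the usual standard-normal assumption behind trait percentile norms (the specific percentiles are just the ones mentioned above):

```python
from statistics import NormalDist

# Standard normal model (mean 0, SD 1), the typical assumption
# behind reporting personality-trait scores as percentiles.
nd = NormalDist()

z_3rd = nd.inv_cdf(0.03)   # z-score at the 3rd percentile
z_20th = nd.inv_cdf(0.20)  # z-score at the 20th percentile

print(f"3rd percentile:  z = {z_3rd:.2f}")   # -1.88
print(f"20th percentile: z = {z_20th:.2f}")  # -0.84
print(f"gap: {z_20th - z_3rd:.2f} SD")       # about 1.04
```

Because the mapping is steep in the tails, a shift in the test-taker's implicit reference class can plausibly move a self-report score by a whole percentile band.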

Reposting my LW comment here:

Just want to plug Josh Greene's great book Moral Tribes here (disclosure: he's my former boss). Moral Tribes basically makes the same argument in different/more words: we evolved moral instincts that usually serve us pretty well, and the tricky part is realizing when we're in a situation that requires us to pull out the heavy-duty philosophical machinery.

Huh, it really doesn't read that way to me. Both are pretty clear causal paths to "the policy and general coordination we get are better/worse as a result."

Most of these have the downside of not giving the accused the chance to respond and thereby giving the community the chance to evaluate both the criticism and the response (which as I wrote recently isn't necessarily a dominant consideration, but it is an upside of the public writeup).

Fwiw, seems like the positive performance is more censored in expectation than the negative performance: while a case that CH handled poorly could either be widely discussed or never heard about again, I'm struggling to think of how we'd all hear about a case that they handled well, since part of handling it well likely involves the thing not escalating into a big deal and respecting people's requests for anonymity and privacy.

It does seem like a big drawback that the accused don't know the details of the accusations, but it also seems like there are obvious tradeoffs here, and it would make sense for this to be very different from the criminal justice system given the difference in punishments (loss of professional and financial opportunities and social status vs. actual prison time).

Agreed that a survey seems really good.
