AI Governance Program Associate @ Open Philanthropy
1363 karma, joined Jan 2022, working (0-5 years)


(Posting in a personal capacity unless stated otherwise.) I help allocate Open Phil's resources to improve the governance of AI with a focus on avoiding catastrophic outcomes. Formerly co-founder of the Cambridge Boston Alignment Initiative, which supports AI alignment/safety research and outreach programs at Harvard, MIT, and beyond; co-president of Harvard EA; Director of Governance Programs at the Harvard AI Safety Team and MIT AI Alignment; and occasional AI governance researcher. I'm also a proud GWWC pledger and vegan.


Interesting! I actually wrote a piece on "the ethics of 'selling out'" in The Crimson almost 6 years ago (jeez) that was somewhat more explicit in its EA justification, and I'm curious what you make of those arguments.

I think randomly selected Harvard students (among those who have the option to do so) deciding to take high-paying jobs and donate double-digit percentages of their salary to places like GiveWell is very likely better for the world than the random-ish other things they might have done, and for that reason I strongly support this op-ed. But I think for undergrads who are really committed to doing the most good, there are two things I would recommend instead. Both route through developing a solid understanding of the most important and tractable problems in the world, via reading widely, asking good questions of knowledgeable people, doing their own writing and seeking feedback, and probably aggressively networking among the people working on these problems.

This enables much more effective earning to give — I think very plugged-in and reasonably informed donors can outperform even top grantmaking organizations in various ways, including helping organizations diversify their funding, moving faster, spotting opportunities that the grantmakers don't, etc. 

And it's also basically necessary for doing direct work on the world's most important problems. I think the generic advice to earn to give misses the huge variation in performance between individuals in direct work; if I understand correctly, 80k agrees with this and thinks this should have been much more emphasized in their early writing and advice. Many Harvard students, in my view, could relatively quickly become excellent in roles like think tank research in AI policy or biosecurity or operations at very impactful organizations. A smaller but nontrivial number could be excellent researchers on important philosophical or technical questions. I think it takes a lot of earning potential to beat those.

I object to calling funding two public defenders "strictly dominating" being one yourself; while public defender isn't an especially high-variance role with respect to performance compared to e.g. federal public policy, it doesn't seem that crazy that a really talented and dedicated public defender could be more impactful than the 2 or 3 marginal PDs they'd fund while earning to give.

The shape of my updates has been something like:

Q2 2023: Woah, looks like the AI Act might have a lot more stuff aimed at the future AI systems I'm most worried about than I thought! Making that go well now seems a lot more important than it did when it looked like it would mostly be focused on pre-foundation model AI. I hope this passes!

Q3 2023: As I learn more about this, it seems like a lot of the value is going to come from the implementation process. Depending on how the standard-setting orgs and member states operationalize it, the same text in the actual Act could wind up either specifically requiring things that could meaningfully reduce the risks or just imposing a lot of costs at a lot of points in the process without actually aiming at the most important parts. But still, for that to happen at all it needs to pass and not have the general-purpose AI stuff removed.

November 2023: Oh no, France and Germany want to take out the stuff I was excited about in Q2. Maybe this will not be very impactful after all.

December 2023: Oh good, actually it seems like they've figured out a way to focus the costs France/Germany were worried about on the very most dangerous AIs and this will wind up being more like what I was hoping for pre-November, and now highly likely to pass!

The text of the Act is mostly determined, but it delegates tons of very important detail to standard-setting organizations and implementation bodies at the member-state level.

(Cross-posting from LW)

Thanks for these thoughts! I agree that advocacy and communications are an important part of the story here, and I'm glad you added some detail on that with your comment. I’m also sympathetic to the claim that serious thought about “ambitious comms/advocacy” is especially neglected within the community, though I think it’s far from clear that the effort that went into the policy research that identified these solutions or work on the ground in Brussels should have been shifted at the margin to the kinds of public communications you mention.

I also think Open Phil’s strategy is pretty bullish on supporting comms and advocacy work, but it has taken us a while to acquire the staff capacity to gain context on those opportunities and begin funding them, and perhaps there are specific opportunities that you're more excited about than we are. 

For what it’s worth, I didn’t seek significant outside input while writing this post and think that's fine (given the alternative of writing it quickly, posting it here, disclaiming my non-expertise, and getting additional perspectives and context from commenters like yourself). However, I have spoken with about a dozen people working on AI policy in Europe over the last couple months (including one of the people whose public comms efforts are linked in your comment) and would love to chat with more people with experience doing policy/politics/comms work in the EU.

We could definitely use more help thinking about this stuff, and I encourage readers who are interested in contributing to OP’s thinking on advocacy and comms to do any of the following:

  • Write up these critiques (we do read the forums!); 
  • Join our team (our latest hiring round specifically mentioned US policy advocacy as a specialization we'd be excited about, but people with advocacy/politics/comms backgrounds more generally could also be very useful, and while the round is now closed, we may still review general applications); and/or 
  • Introduce yourself via the form mentioned in this post.

It uses the language of "models that present systemic risks" rather than "very capable," but otherwise, a decent summary, bot.

(I began working for OP on the AI governance team in June. I'm commenting in a personal capacity based on my own observations; other team members may disagree with me.)

OpenPhil sometimes uses its influence to put pressure on orgs to not do things that would disrupt the status quo

FWIW I really don’t think OP is in the business of preserving the status quo. People who work on AI at OP have a range of opinions on just about every issue, but I don't think any of us feel good about the status quo! People (including non-grantees) often ask us for our thoughts about a proposed action, and we’ll share if we think some action might be counterproductive, but many things we’d consider “productive” look very different from “preserving the status quo.” For example, I would consider the CAIS statement to be pretty disruptive to the status quo and productive, and people at Open Phil were excited about it and spent a bunch of time finding additional people to sign it before it was published.

Lots of people want to work there; replaceability

I agree that OP has an easier time recruiting than many other orgs, though perhaps a harder time than frontier labs. But at risk of self-flattery, I think the people we've hired would generally be hard to replace — these roles require a fairly rare combination of traits. People who have them can be huge value-adds relative to the counterfactual!

pretty hard to steer OP from within

I basically disagree with this. There are areas where senior staff have strong takes, but they'll definitely engage with the views of junior staff, and they sometimes change their minds. Also, the AI world is changing fast, and as a result our strategy has been changing fast, and there are areas full of new terrain where a new hire could really shape our strategy. (This is one way in which grantmaker capacity is a serious bottleneck.)

Nitpick: I would be sad if people ruled themselves out for e.g. being "20th percentile conscientiousness" since in my impression the popular tests for OCEAN are very sensitive to what implicit reference class the test-taker is using. 

For example, I took one a year ago and got third percentile conscientiousness, which seems pretty unlikely to be true given my abilities to e.g. hold down a grantmaking job, get decent grades in grad school, successfully run 50-person retreats, etc. I think the explanation is basically that this is how I respond to "I am often late for my appointments": "Oh boy, so true. I really am often rushing to my office for meetings and often don't join until a minute or two after the hour." And I could instead be thinking, "Well, there are lots of people who just regularly completely miss appointments, don't pay bills, lose jobs, etc. It seems to me like I'm running late a lot, but I should be accounting for the vast diversity of human experience and answer 'somewhat disagree'." But the first thing is way easier; you kinda have to know about this issue with the test to do the second thing.

(Unless you wouldn't hire someone because they were only ~1.3 standard deviations more conscientious than I am, which is fair I guess!)

Reposting my LW comment here:

Just want to plug Josh Greene's great book Moral Tribes here (disclosure: he's my former boss). Moral Tribes basically makes the same argument in different/more words: we evolved moral instincts that usually serve us pretty well, and the tricky part is realizing when we're in a situation that requires us to pull out the heavy-duty philosophical machinery.

Huh, it really doesn't read that way to me. Both are pretty clear causal paths to "the policy and general coordination we get are better/worse as a result."
