AI Governance Program Associate @ Open Philanthropy
1461 karma · Joined · Working (0-5 years)


(Posting in a personal capacity unless stated otherwise.) I help allocate Open Phil's resources to improve the governance of AI with a focus on avoiding catastrophic outcomes. Formerly co-founder of the Cambridge Boston Alignment Initiative, which supports AI alignment/safety research and outreach programs at Harvard, MIT, and beyond; co-president of Harvard EA; Director of Governance Programs at the Harvard AI Safety Team and MIT AI Alignment; and occasional AI governance researcher. I'm also a proud GWWC pledger and vegan.


It seems like you might be underweighting the cumulative amount of resources. Even if you apply a pretty heavy decay rate (and it's unclear that you should, since we usually think of philanthropic investments as compounding over time), avoiding nuclear war was a top global priority for decades, and it feels like we have a lot of intellectual and policy "legacy infrastructure" from that.

Yeah, this is all pretty compelling, thanks!


I think some of the AI safety policy community has over-indexed on the visual model of the "Overton Window" and under-indexed on alternatives like the "ratchet effect," "poisoning the well," "clown attacks," and other models where proposing radical changes can make you, your allies, and your ideas look unreasonable.

I'm not familiar with a lot of systematic empirical evidence on either side, but it seems to me that the more effective actors in the DC establishment are much more in the habit of looking for small wins that are both good in themselves and shrink the size of the ask for their ideal policy than of pushing for their ideal vision and then making concessions. Possibly an ideal ecosystem has both strategies. But policymakers are very bandwidth-constrained and lean strongly on signals of credibility and consensus when quickly evaluating policy options, so it seems possible that at least some versions of "Overton Window-moving" strategies, as executed in practice, associate their "side" with unreasonable-sounding ideas, and that this negative effect is larger than the positive effects of increasing the odds of ideal policy and improving the framing for non-ideal but pretty good policies.

In theory, the Overton Window model is just a description of what ideas are taken seriously, so it can indeed accommodate backfire effects where you argue for an idea "outside the window" and this actually makes the window narrower. But I think the visual imagery of "windows" actually struggles to accommodate this -- when was the last time you tried to open a window and accidentally closed it instead? -- and as a result, people who rely on this model are more likely to underrate these kinds of consequences.

Would be interested in empirical evidence on this question (ideally actual studies from the psych, political science, sociology, econ, etc. literatures, rather than specific case studies, due to reference-class-tennis-type issues).

Yes, some regulations backfire, and this is a good flag to keep in mind when designing policy, but to actually make the reference-class argument here work, you'd have to show that this is what we should expect from AI policy, which would include showing that failures like NEPA are either much more relevant for the AI case or more numerous than other, more successful regulations, like (in my opinion) the Clean Air Act, Sarbanes-Oxley, bans on CFCs or leaded gasoline, etc. I know it's not quite as simple as "I would simply design good regulations instead of bad ones," but it's also not as simple as "some regulations are really counterproductive, so you shouldn't advocate for any." Among other things, this assumes that nobody else will be pushing for really counterproductive regulations!

This post correctly identifies some of the major obstacles to governing AI, but ultimately makes an argument for "by default, governments will not regulate AI well," rather than the claim implied by its title, which is that advocating for (specific) AI regulations is net negative -- a type of fallacious conflation I recognize all too well from my own libertarian past.

Interesting! I actually wrote a piece on "the ethics of 'selling out'" in The Crimson almost 6 years ago (jeez) that was somewhat more explicit in its EA justification, and I'm curious what you make of those arguments.

I think randomly selected Harvard students (among those who have the option to do so) deciding to take high-paying jobs and donate double-digit percentages of their salary to places like GiveWell is very likely better for the world than the random-ish other things they might have done, and for that reason I strongly support this op-ed. But I think for undergrads who are really committed to doing the most good, there are two things I would recommend instead. Both route through developing a solid understanding of the most important and tractable problems in the world, via reading widely, asking good questions of knowledgeable people, doing their own writing and seeking feedback, and probably aggressively networking among the people working on these problems.

This enables much more effective earning to give — I think very plugged-in and reasonably informed donors can outperform even top grantmaking organizations in various ways, including helping organizations diversify their funding, moving faster, spotting opportunities that the grantmakers don't, etc. 

And it's also basically necessary for doing direct work on the world's most important problems. I think the generic advice to earn to give misses the huge variation in performance between individuals in direct work; if I understand correctly, 80k agrees with this and thinks this should have been much more emphasized in their early writing and advice. Many Harvard students, in my view, could relatively quickly become excellent in roles like think tank research in AI policy or biosecurity or operations at very impactful organizations. A smaller but nontrivial number could be excellent researchers on important philosophical or technical questions. I think it takes a lot of earning potential to beat those.

I object to calling funding two public defenders "strictly dominating" being one yourself; while public defender isn't an especially high-variance role with respect to performance compared to e.g. federal public policy, it doesn't seem that crazy that a really talented and dedicated public defender could be more impactful than the 2 or 3 marginal PDs they'd fund while earning to give.

The shape of my updates has been something like:

Q2 2023: Woah, looks like the AI Act might have a lot more stuff aimed at the future AI systems I'm most worried about than I thought! Making that go well now seems a lot more important than it did when it looked like it would mostly be focused on pre-foundation model AI. I hope this passes!

Q3 2023: As I learn more about this, it seems like a lot of the value is going to come from the implementation process. Depending on how the standard-setting orgs and member states operationalize it, the same text in the actual Act could wind up either specifically requiring things that could meaningfully reduce the risks or just imposing a lot of costs at a lot of points in the process without actually aiming at the most important parts. But still, for that to happen at all, it needs to pass and not have the general-purpose AI stuff removed.

November 2023: Oh no, France and Germany want to take out the stuff I was excited about in Q2. Maybe this will not be very impactful after all.

December 2023: Oh good, actually it seems like they've figured out a way to focus the costs France/Germany were worried about on the very most dangerous AIs and this will wind up being more like what I was hoping for pre-November, and now highly likely to pass!

The text of the Act is mostly determined, but it delegates tons of very important detail to standard-setting organizations and implementation bodies at the member-state level.

(Cross-posting from LW)

Thanks for these thoughts! I agree that advocacy and communications are an important part of the story here, and I'm glad you added some detail on that with your comment. I’m also sympathetic to the claim that serious thought about “ambitious comms/advocacy” is especially neglected within the community, though I think it’s far from clear that the effort that went into the policy research that identified these solutions, or into work on the ground in Brussels, should have been shifted at the margin to the kinds of public communications you mention.

I also think Open Phil’s strategy is pretty bullish on supporting comms and advocacy work, but it has taken us a while to acquire the staff capacity to gain context on those opportunities and begin funding them, and perhaps there are specific opportunities that you're more excited about than we are. 

For what it’s worth, I didn’t seek significant outside input while writing this post and think that's fine (given the alternative of writing it quickly, posting it here, disclaiming my non-expertise, and getting additional perspectives and context from commenters like yourself). However, I have spoken with about a dozen people working on AI policy in Europe over the last couple months (including one of the people whose public comms efforts are linked in your comment) and would love to chat with more people with experience doing policy/politics/comms work in the EU.

We could definitely use more help thinking about this stuff, and I encourage readers who are interested in contributing to OP’s thinking on advocacy and comms to do any of the following:

  • Write up these critiques (we do read the forums!); 
  • Join our team (our latest hiring round specifically mentioned US policy advocacy as a specialization we'd be excited about, but people with advocacy/politics/comms backgrounds more generally could also be very useful, and while the round is now closed, we may still review general applications); and/or 
  • Introduce yourself via the form mentioned in this post.