AI Governance Program Associate @ Open Philanthropy
1204 karma · Joined Jan 2022 · Working (0-5 years)


(Posting in a personal capacity unless stated otherwise.) I help allocate Open Phil's resources to improve the governance of AI with a focus on avoiding catastrophic outcomes. Formerly co-founder of the Cambridge Boston Alignment Initiative, which supports AI alignment/safety research and outreach programs at Harvard, MIT, and beyond, co-president of Harvard EA, Director of Governance Programs at the Harvard AI Safety Team and MIT AI Alignment, and occasional AI governance researcher. GWWC pledger.


Huh, it really doesn't read that way to me. Both are pretty clear causal paths to "the policy and general coordination we get are better/worse as a result."

Most of these have the downside of not giving the accused the chance to respond and thereby giving the community the chance to evaluate both the criticism and the response (which as I wrote recently isn't necessarily a dominant consideration, but it is an upside of the public writeup).

Fwiw, seems like the positive performance is more censored in expectation than the negative performance: while a case that CH handled poorly could either be widely discussed or never heard about again, I'm struggling to think of how we'd all hear about a case that they handled well, since part of handling it well likely involves the thing not escalating into a big deal and respecting people's requests for anonymity and privacy.

It does seem like a big drawback that the accused don't know the details of the accusations, but it also seems like there are obvious tradeoffs here, and it would make sense for this to be very different from the criminal justice system given the difference in punishments (loss of professional and financial opportunities and social status vs. actual prison time).

Agreed that a survey seems really good.

Thanks for writing this up!

I hope to write a post about this at some point, but since you raise some of these arguments, I think the most important cruxes for a pause are:

  1. It seems like in many people's models, the reason the "snap back" is problematic is that the productivity of safety research is much higher when capabilities are close to the danger zone, both because the AIs that we're using to do safety research are better and because the AIs that we're doing the safety research on are more similar to the ones in the danger zone. If the "snap back" reduces the amount of calendar time during which we think AI safety research will be most productive in exchange for giving us more time overall, this could easily be net negative. On the other hand, a pause might just "snap back" to somewhere on the capabilities graph that's still outside the danger zone, and lower than it would've been without the pause for the reasons you describe.
  2. A huge empirical uncertainty I have is: how elastic is the long-term supply curve of compute? If, on one extreme end, the production of computing hardware for the next 20 years is set in stone, then at the end of the pause there would be a huge jump in how much compute a developer could use to train a model, which seems pretty likely to produce a destabilizing/costly jump. At the other end, if compute supply were very responsive to expected AI progress, and a pause meant a big cut to e.g. Nvidia's R&D budget and TSMC shelving plans for a leading-node fab or two as a result, the jump would be much less worrying in expectation. I've heard that the industry plans pretty far in advance because of how much time and money it takes to build a fab (and how much coordination is required between the different parts of the supply chain), but it seems like at this point a lot of the future expected revenue to be won from designing the next generations of GPUs comes from their usefulness for training huge AI systems, so it seems like there should at least be some marginal reduction in long-term capacity if there were a big regulatory response.

Agree, basically any policy job seems to start teaching you important stuff about institutional politics and process and the culture of the whole political system!

Though I should also add this important-seeming nuance I gathered from a pretty senior policy person who said basically: "I don't like the mindset of, get anywhere in the government and climb the ladder and wait for your time to save the day; people should be thinking of it as proactively learning as much as possible about their corner of the government-world, and ideally sharing that information with others."

Suggestion for how people go about developing this expertise from ~scratch, in a way that should be pretty adaptable to e.g. the context of an undergraduate or grad-level course, or independent research (a much better/stronger version of things I've done in the past, which involved lots of talking and take-developing but not a lot of detail and publication, which I think are both really important):

  1. Figure out who, both within the EA world and not, would know at least a fair amount about this topic -- maybe they just would be able to explain why it's useful in more context than you have, maybe they know what papers you should read or acronyms you should familiarize yourself with -- and talk to them, roughly in increasing order of scariness/value of their time, such that you've at least had a few conversations by the time you're talking to the scariest/highest-time-value people. Maybe this is like a list of 5-10 people?
  2. During these conversations, take note of what's confusing you, ideas that you have, connections you or your interlocutors draw between topics, takes you find yourself repeating, etc.; you're on the hunt for a first project.
  3. Use the "learning by writing" method and just try to write "what you think should happen" in this area, as in, a specific person (maybe a government agency, maybe a funder in EA) should take a specific action, with as much detail as you can, noting a bunch of ways it could go wrong and how you propose to overcome these obstacles.
  4. Treat this proposal as a hypothesis that you then test (meaning, you have some sense of what could convince you it's wrong), and you seek out tests for it, e.g. talking to more experts about it (or asking them to read your draft and give feedback), finding academic or non-academic literature that bears on the important cruxes, etc., and revise your proposal (including scrapping it) as implied by the evidence.
  5. Try to publish something from this exercise -- maybe it's the proposal, maybe it's "hey, it turns out lots of proposals in this domain hinge on this empirical question," maybe it's "here's why I now think [topic] is a dead end." This gathers more feedback and importantly circulates the information that you've thought about it a nonzero amount.

Curious what other approaches people recommend!

A technique I've found useful in making complex decisions where you gather lots of evidence over time -- for example, deciding what to do after your graduation, or whether to change jobs, etc., where you talk to lots of different people and weigh lots of considerations -- is to make a spreadsheet of all the arguments you hear, each with a score for how much it supports each decision.

For example, this summer, I was considering the options of "take the Open Phil job," "go to law school," and "finish the master's." I put each of these options in columns. Then, I'd hear an argument like "being in school delays your ability to take a full-time job, which is where most of your impact will happen"; I'd add a row for this argument. I thought this was a very strong consideration, so I gave the Open Phil job 10 points, law school 0, and the master's 3 (since it was one more year of school instead of 3 years). Later, I'd hear an argument like "legal knowledge is actually pretty useful for policy work," which I thought was a medium-strength consideration, and I'd give these options 0, 5, and 0.

I wouldn't take the sum of these as a final answer, but it was useful for a few reasons:

  • In complicated decisions, it's hard to hold all of the arguments in your head at a time. This might be part of why I noticed a strong recency bias, where the most recent handful of considerations raised to me seemed the most important. By putting them all in one place, I could feel like I was properly accounting for all the things I was aware of.
  • Relatedly, it helped me avoid double-counting arguments. When I'd talk to a new person, and they'd give me an opinion, I could just check whether their argument was basically already in the spreadsheet; sometimes I'd bump a number from 4 to 5, or something, based on them being persuasive, but sometimes I'd just say, "Oh, right, I guess I already knew this and shouldn't really update from it."
  • I also notice a temptation to simplify the decision down to a single crux or knockdown argument, but usually cluster thinking is a better way to make these decisions, and the spreadsheet helps aggregate things such that an overall balance of evidence can carry the day.
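
The spreadsheet described above can be sketched in a few lines of Python. This is only an illustration, not the actual spreadsheet: the `DecisionSheet` class and its method names are hypothetical, the scores are the ones from the example above, and matching arguments by exact text is a crude stand-in for judging whether a new argument is "basically already in the spreadsheet."

```python
from dataclasses import dataclass, field


@dataclass
class DecisionSheet:
    """Hypothetical stand-in for the spreadsheet: one column per option, one row per argument."""
    options: tuple
    rows: dict = field(default_factory=dict)  # argument text -> tuple of scores, one per option

    def record(self, argument, scores):
        """Add an argument row, or overwrite its scores if it is already listed.

        Returns True if the argument was new, False if it was already in the sheet
        (which is the cue not to count it a second time).
        """
        if len(scores) != len(self.options):
            raise ValueError("need exactly one score per option")
        is_new = argument not in self.rows
        self.rows[argument] = tuple(scores)
        return is_new

    def totals(self):
        """Sum each option's column across all recorded arguments."""
        return {
            option: sum(scores[i] for scores in self.rows.values())
            for i, option in enumerate(self.options)
        }


sheet = DecisionSheet(("Open Phil job", "law school", "master's"))

# Strong consideration: being in school delays full-time work.
sheet.record("school delays full-time work", (10, 0, 3))
# Medium-strength consideration: legal knowledge helps with policy work.
sheet.record("legal knowledge is useful for policy", (0, 5, 0))

# Hearing the same argument again from a persuasive person: adjust the
# existing row instead of adding a duplicate.
was_new = sheet.record("legal knowledge is useful for policy", (0, 5, 0))

totals = sheet.totals()
for option, score in totals.items():
    print(f"{option}: {score}")
```

The totals are meant as a way to see the overall balance of evidence in one place, not as a final answer.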

As of August 24, 2023, I no longer endorse this post for a few reasons.

  1. I think university groups should primarily be focused on encouraging people to learn a lot of things and becoming a venue/community for people to try to become excellent at things that the world really needs, and this will mostly look like creating exciting and welcoming environments for co-working and discussion on campus. In part this is driven by the things that I think made HAIST successful, and in part it's driven by thinking there's some merit to the unfavorable link to this post in "University EA Groups Need Fixing."
  2. I also think retreats are more costly than I realized when writing, and (relatedly) if you're going to organize a retreat or workshop or whatever, it should probably have a theory of change and be driven by a target audience's area of interest and background (e.g., "early-career people interested in AI policy who haven't spent time in DC come and meet AI policy people in DC") rather than general-purpose uni group bonding.
  3. I also think "retreat" is basically the wrong word for what these events are; at least the ones I've run have generally had enough subject-matter-driven content that "workshop" is a more appropriate term.
  4. That said, I do still think university groups should consider doing retreats/workshops, depending on their capacity, the specific needs of their group, and the extent to which they buy the arguments for/against prioritizing them over other programs.

Edited the top of the post to reflect this.

Was going to make a very similar comment. Also, even if "someone else in Boston could have" done the things, their labor would have funged from something else; organizer time/talent is a scarce resource, and adding to that pool is really valuable.

Yep, all sounds right to me re: not deferring too much and thinking through cause prioritization yourself, and then also that the portfolio is too broad, though these are kind of in tension.

To answer your question, I'm not sure I update that much on having changed my mind, since I think that if people had listened to me and done AISTR, this would have been a better use of time, even for a governance career, than basically anything besides AI governance work (and of course there's a distribution within each of those categories for how useful a given project is; lots of technical projects would've been more useful than the median governance project).
