Open Phil announced two weeks ago that we’re hiring for over 20 roles across our teams working on global catastrophic risk reduction — and we’ll answer questions at our AMA starting tomorrow. Ahead of that, I wanted to share some information about the roles I’m hiring for on my team (Technical AI Safety). This team aims to think through what technical research could most help us understand and reduce AI x-risk, and to build thriving fields in high-priority research areas by making grants to great projects and research groups.

First of all — since we initially listed roles on Sep 29, we’ve added three new roles in Technical AI Safety that you might not have seen yet if you only saw the original announcement! In addition to the (Senior) Program Associate role that was there originally, we added an Executive Assistant role last week — and yesterday we added a (Senior) Research Associate role and a Senior Program Associate role specializing in a particular subfield of AI safety research (e.g. interpretability, alignment theory, etc.). Check those out if they seem interesting! The Executive Assistant role in particular requires a very different, less technical skill set.

Secondly, before starting to answer AMA questions, I wanted to highlight that our technical AI safety giving is far below where it should be at equilibrium, that there is considerable room to grow, and that hiring more people is likely to lead quickly to more and better grants. My estimate is that last year we recommended around $25M in grants to technical AI safety,[1] and so far this year I’ve recommended a similar amount. With more capacity for grant evaluation, research, and operations, we think this could pretty readily double or more.

All of our GCR teams (Technical AI Safety led by me, Capacity Building led by Claire Zabel, AI Governance and Policy led by Luke Muehlhauser, and Biosecurity led by Andrew Snyder-Beattie) are heavily capacity constrained right now — especially the teams that do work related to AI, given the recent boom in interest and activity in that area. I think my team currently faces even more severe constraints than other program teams. Compared to other teams, my team:

  • Is much smaller: Until literally last week, it was just me focusing primarily on technical AI safety (although Claire’s team sometimes funds technical AI safety work, primarily upskilling). Last week, Max Nadeau joined as my first Program Associate. In contrast, the capacity building team has eight people, and the biosecurity and AI governance teams each have five people.
  • Likely has worse “coverage” of its field: 
    • Ideally, a robust and committed grantmaking team in a given field would:
      • Maintain substantive relationships with the most impactful / promising (say) 5-30% of existing grantees, potential grantees, and key non-grantee players (e.g. people working on AI safety in industry labs) in their field.
      • Have pretty robust systems for hearing about most of the plausible potential new grantees in their field (via e.g. application forms or strong referral networks).
      • Have the bandwidth to give non-trivial consideration to a large fraction of plausible potential grantees, in order to make an informed, explicit decision about whether to fund them and how much.
      • Have the bandwidth to retrospectively evaluate what came out of large grants or important categories of grants.
    • My team has absolutely nowhere near that level of coverage (for example, we haven’t had the time to open application forms or to get to know academics who could work on safety). While all our GCR program areas could use a lot more “field coverage,” my guess is that our coverage in technical AI safety is considerably worse than the coverage that at least Claire and Andrew get in their fields. Not only does this team have fewer people to cover its field, but the set of plausible potential players also feels like it could well be larger, since large numbers of technical people have started to get a lot more interested in AI safety recently.
  • Has a more nascent strategy: While we’ve been funding technical AI safety research in one form or another since 2015, the program area has switched leadership and strategic direction multiple times,[2] and the current iteration is pretty close to a fresh slate — we’ve closed out most of our old programs and are looking to build out a fresh stable of grantmaking initiatives from the ground up.
    • One reason our strategy is up in the air is that the team in its current iteration is very new, and advances in AI capabilities are rapidly changing the landscape of tractable research projects. I’ve led the program area for less than a year, and most of the grants I’ve made have been to new groups that didn’t exist before 2021 and/or to research projects that weren’t even practically feasible to do before the last couple of years. In contrast, other program leads have been building out a strategy for a few years or more.
    • Another big reason is that we have a huge number of unanswered questions about what technical projects we most want to see, what kind of results would most change our mind about key questions or move the needle on key safety techniques, and how we should prioritize between different streams of object-level work. For example, better answers to questions like these could change what research areas we go big on and what we pitch to potential grantees:
      • How can we tell how promising an interpretability technique is? What are the best “internal validity” measures of success? What are the best downstream tasks to measure?
      • What are the elements of an ideal model organism for misalignment, and what are the challenges to creating such a model?
      • What is the most compelling theory of change / path to impact for research on adversarial attacks and defenses, and what is the most exciting version of that kind of research?
      • Are there some empirical research directions inspired by the assistance games / reward uncertainty tradition which could be helpful even in a language model paradigm?

If you join the technical AI safety team in this round, you could help relieve some severe bottlenecks while building this new iteration of the program area from the ground up. If this sounds exciting to you, I strongly encourage you to apply!
 

  1. Interestingly, these figures are considerably larger than our annual technical AI safety giving in the several years before that, even though we had fewer full-time-equivalent staff working in the area in 2022 and 2023 than we did in 2015-2021.

  2. Initially, our program was led by Daniel Dewey. By around 2019, Catherine Olsson had joined the team, and eventually (I think by 2020-2021) it transitioned to being a team of three run by Nick Beckstead, who managed Catherine and Daniel, as well as Asya Bergal at half-time. In 2021, all three of Daniel, Catherine, and Nick left for other roles. For an interim period, there was no single point person: Holden was personally handling bigger grants (e.g. Redwood Research), and Asya was handling smaller grants (e.g. an RFP that Nick originally started, and our PhD fellowship). Holden then moved on to direct work, and Asya went full-time on capacity building. I began doing grantmaking in Oct 2022, and quickly ended up working full-time on FTXFF bailout grants. Since late January 2023 or so, I’ve been presiding over a more normal program area.

Comments



Was there some blocker that caused this to happen now, rather than 6 months / 1 year ago?

I only got into grantmaking less than a year ago (in November 2022), and shortly after I unburied myself from FTXFF-collapse-related grants around January, I started hiring in a private round which led to Max joining (a private round is generally much less of a logistical lift than a big public round). I'm now joining this big public round along with other OP GCR teams because combining hiring rounds makes it easier on the back-end. See Luke's AMA answers here and here for more detail on the "Why are you hiring now rather than previously?" question, and my comment here for more color on my personal working situation over the last ten months or so.

[anonymous]

Excited to see this team expand! A few [optional] questions:

  1. What do you think were some of your best and worst grants in the last 6 months?
  2. What are your views on the value of "prosaic alignment" relative to "non-prosaic alignment?" To what extent do you think the most valuable technical research will look fairly similar to "standard ML research", "pure theory research", or other kinds of research?
  3. What kinds of technical research proposals do you think are most difficult to evaluate, and why?
  4. What are your favorite examples of technical alignment research from the past 6-12 months?
  5. What, if anything, do you think you've learned in the last year? What advice would you have for a Young Ajeya who was about to start in your role?