
Open Philanthropy is launching a big new Request for Proposals for technical AI safety research, with plans to fund roughly $40M in grants over the next 5 months, and available funding for substantially more depending on application quality. 

Applications start with a simple 300-word expression of interest (EOI) and are open until April 15, 2025.

Apply now

Overview

We're seeking proposals across 21 different research areas, organized into five broad categories:

  1. Adversarial Machine Learning
    • *Jailbreaks and unintentional misalignment
    • *Control evaluations
    • *Backdoors and other alignment stress tests
    • *Alternatives to adversarial training
    • Robust unlearning
  2. Exploring sophisticated misbehavior of LLMs
    • *Experiments on alignment faking
    • *Encoded reasoning in CoT and inter-model communication
    • Black-box LLM psychology
    • Evaluating whether models can hide dangerous behaviors
    • Reward hacking of human oversight
  3. Model transparency
    • Applications of white-box techniques
    • Activation monitoring
    • Finding feature representations
    • Toy models for interpretability
    • Externalizing reasoning
    • Interpretability benchmarks
    • More transparent architectures
  4. Trust from first principles
    • White-box estimation of rare misbehavior
    • Theoretical study of inductive biases
  5. Alternative approaches to mitigating AI risks
    • Conceptual clarity about risks from powerful AI
    • New moonshots for aligning superintelligence

We’re willing to make several types of grants, including:

  • Research expenses (compute, APIs, etc.)
  • Discrete research projects (typically lasting 6-24 months)
  • Academic start-up packages
  • Support for existing nonprofits
  • Funding to start new research organizations or new teams at existing organizations

The full RFP provides much more detail on each research area, including eligibility criteria, example projects, and nice-to-haves. 

Read more

We want the bar to be low for submitting expressions of interest: even if you're unsure whether your project fits perfectly, we encourage you to submit an EOI. This RFP is partly an experiment to understand the demand for funding in AI safety research.

Please email aisafety@openphilanthropy.org with questions, or just submit an EOI.

Comments (3)



Has Open Phil (or others) conducted a comprehensive analysis for both understanding and building the AI safety field? 

If yes, could you share some leads to add to my research? 

If not, would Open Phil consider funding such work? (either under the above or other funds)

Here is a recent example: Introducing SyDFAIS: A Systemic Design Framework for AI Safety Field-Building

I'm new to applying for an AIS grant, so I have some common questions that might have been answered elsewhere:

(1) what are some failure modes that I might need to consider when writing a proposal, specifically for a research project?

(2) will research expenses include stipends for the researchers?

(3) can I write a grant to do a research project with my university AI safety group? I'm not sure if this will be considered a field-building or a technical AI safety grant.

Some common failure modes:

  • Not reading the eligibility criteria
  • Not clearly distinguishing your project from prior work on the topic you're interested in
  • Not demonstrating a good understanding of prior work (would be good to read some/all of the papers we link to in this doc for whatever section you're applying within)
  • Not demonstrating that you/your team has prior experience doing ML projects. If you don't have such experience, then it's good to work with/be mentored by someone who does. 

"Research expenses" does not include stipends, but you can apply for a project grant, which does.

If you're looking for money to spend on ML experiments or to pay people who are spending their time doing ML research, then that may fall within this RFP. If you're looking for money to do other things (e.g. reading groups, events, etc), then that may fall under the capacity-building team's RFPs.
