
Overview

In late July I posted Announcing the AI Safety Field Building Hub, a new effort to run AI safety field-building (AISFB) projects and to provide mentorship and funding. This new organization, funded by FTXFF, set out to work on AI safety outreach projects aimed at AI researchers, with funding to mentor new people and the flexibility to take on projects suggested by the community. As we’re now nearing the end of the FTXFF funding, the AISFB Hub is finishing up all existing projects and closing down. It’s been an exciting six months! Here’s a retrospective of everything the AISFB Hub has been up to, as well as potential future work we’d pursue if we secure new funding.

 

What AISFB Hub has done

The work has fallen into three major categories: follow-up work on my interview series, the pilot of an outreach survey, and everything else.

Interview Series

In February-March 2022, I conducted 97 interviews with AI researchers (who had papers accepted to NeurIPS / ICML 2021). Interviewees were asked about their perceptions of artificial intelligence now and in the future, with a special focus on risks from advanced AI systems. I presented the alignment problem and the idea of instrumental incentives, and asked them questions such as whether they thought we’d ever achieve AGI, and whether they’d be interested in working on AI alignment. I released 11 transcripts of those interviews at the time, posted a talk on the preliminary results, and promised future analysis. Now, for what the AISFB Hub did!

  1. Created a website to display all of the below: https://ai-risk-discussions.org/interviews (EAF post)
  2. Anonymized and released 72 more transcripts (i.e. everyone who gave permission), which brings us up to 83/97 transcripts available.
  3. Completed a quantitative analysis of these interviews, especially focused on how people responded to the core AI safety questions (EAF post), with a separate writeup on predicting researchers' interest in alignment (EAF post).
  4. Created an interactive walkthrough of common perspectives in the interviews, as well as counterarguments to some of the common objections to AI safety arguments (EAF post).

Outreach Survey

In this project, we sent AI researchers (with a paper accepted at NeurIPS / ICML / ICLR 2021) a roughly five-hour survey asking them to engage critically with a set of AI safety readings and answer questions about them. 28 researchers completed the pilot survey, and a partial writeup of those pilot results is available on EAF / LW as “What AI Safety Materials Do ML Researchers Find Compelling?” We’re interested in continuing this project with significant modifications (drawn from lessons in the pilot study) if we receive further funding (as a scalable project, it’s the one that benefits most from funding).

A few writeups were commissioned for use in this project; some were used and some were not.

We also did some additional work setting up for future surveys, which is described later.

Miscellaneous

Write-ups not listed above:

There was a lot of logistics, including setting up under a fiscal sponsor, hiring and working with lots of people on individual projects, and never-ending reimbursements!

I also talked to various members of the community about AI safety field-building, and acquired a healthy amount of confusion about AI field-building strategy (and about our familiar friend, clawbacks).

 

People involved in AISFB Hub

So many people worked with the AISFB Hub, or helped out with various projects! Thanks so much to all of them. 

  • ai-risk-discussions.org: Lukas Trötzmüller (interactive walkthrough, website), Maheen Shermohammed (quantitative analysis), Michael Keenan (website)
  • Outreach survey: Collin Burns
  • Data collection, data organizing, text cleaning, copyediting, and ops (alphabetical order): Rauno Arike, Angelica Belo, Tom Hutton, Ash Jafari, Aashish Khmiasia, Harvey LeNar, Kitt Morjanova, Jonathan Ng, Nicole Nohemi, Cleyton Pires, David Spearman, Kelsey Theriault, Stephen Thomas, Lukas Trötzmüller, Austin Witte
  • Writing: check out the linked EA Forum posts above for names
  • Interviews: Zi Cheng (Sam) Huang (tagging), Mary Collier Wilks (advising), Andrew Critch (idea suggestion), Tobi Gerstenberg (support)
  • Broader community: Many people not listed here provided helpful advice or feedback, did a short trial with me, or otherwise contributed. I have Akash Wasil, Michael Chen, and Vaniver listed in my notes, but let me know if I lost track and you should be listed here!
  • (If you’re wondering what my role in this org is: I do a lot of the direct work – writing / ops / data analysis etc. – and also manage / hire people to work on projects.)

Funding: FTX Future Fund, Stanford University, two anonymous donors, and LTFF

 

What AISFB Hub did not do

Of the Specific Projects

There were some projects listed on the original AISFB Hub post that I decided not to pursue further. 

  • Helping with internal OpenAI / DeepMind field-building efforts → My guess after talking to people is that they mostly need internal people rather than external people, and while there’s a possibility of involvement, it’s pretty niche.
  • AI safety-oriented film → I talked to a number of people attempting to make films, but didn’t think any were pursuing the fully-fledged feature-length film I hoped for. (However, one organization that I didn’t end up talking to is perhaps doing this!) It’s a very difficult vision, though. I’m stepping out of this now since I don’t see a clear path to contribution.
  • Projects developed by Center for AI Safety → I referred some people to CAIS, but didn’t end up taking on any of their projects myself. 

Of the Stated Aims

More broadly, I also fell short on two of the major stated aims of the AI Safety Field-Building Hub.

  • First: I basically failed to take on community-suggested field-building projects. Community suggestions rarely seemed more promising to me than the projects I already wanted to do (perhaps unsurprising in retrospect). I was also quite busy with existing projects, and felt bottlenecked on people I was happy delegating high-level projects to. I did help match 1-2 field-building suggestions with people who were ready to execute, but that was rare.
  • Second: I failed to mentor people new to AI safety field-building. As it turns out, I find the [hiring / evaluating / training] circuit stressful, and prefer to work with a couple of people whose world-models and skills are quite close to mine. This meant I was mostly working in relatively peer-like relationships, or in evaluative rather than mentoring relationships.

Given the above, if I secure additional funding, I plan to substantially restructure this org. I’ll rebrand (probably to “Arkose”, rather than “AI Safety Field Building Hub”) to allow other groups to take the more ambitious, far-reaching name. I’ll have a narrower organizational focus, and as such won’t take public suggestions for field-building projects. I also won’t take on mentorship roles outside of my usual capacity as an EA (though people are welcome to contact me in that capacity, especially to talk about field-building). I still aim to work on AI safety field-building projects aimed at ML researchers, with a smaller team of people working on individual projects!

Of note, a lot of the anticipated changes above are due to personal fit. I find that field-building can be quite draining, especially when orienting on a sense of impact, and even before the FTXFF situation changed the funding environment and cut off my growth aims, I had been trying lots of pivots to make the experience feel sustainable. To my surprise and pleasure, at this point I’ve settled into a workflow I feel actively happy with. I got to try lots of different things during this AISFB Hub experiment! I really appreciate having had the opportunity to get feedback (from reality and others) about the quality of various ideas, and what kind of tasks, environments, and mindsets feel engaging to me.

Impact Assessment

Finally, a last thing the AISFB Hub did not do: an accursed impact assessment. Impact assessments are so correct and good, and I think I’m basically too lazy to do one or to figure out how. (As an aside, “making up scrappy processes with no external review” has been a recurring refrain of this new org.) Regardless, some notes are below.

My overall goal is to encourage ML researchers to be more interested in AI alignment, given that I think this is an important problem that needs more attention. I'm interested in changing the overall perception of the field to be more pro-safety, in addition to encouraging specific people to work on it if they're interested. The final output I’m looking for is something like “how many counterfactual people became interested enough to do AI alignment research”. One way to measure this is at the level of individuals – who became more involved, who became less involved, etc. The other measure is more nebulous, and I think it needs to incorporate the fact that many of the AISFB Hub's outputs are “non-peer-reviewed research output”-shaped. It’s also worth noting that a lot of my work feels pretty preliminary: vaguely promising small-scale work that could set up for large-scale outreach if it goes well (which is definitely still in question).

Here’s some data on individuals, and we’ll end there.

Interview series

On 7/29/22 (about 5-6 months after the interviews, which took place in February-early March 2022), 86/97 participants were emailed; 82/86 responded to the email or a reminder email. They were asked:

  • “Did the interview have a lasting effect on your beliefs (Y/N)?” 
    • 42/82 (51%) responded Y.
  • “Did the interview cause you to take any new actions in your work (Y/N)?” 
    • 12/82 (15%) responded Y.

Note, however, that the interviews themselves took place before the AISFB Hub was established.

Outreach survey (pilot) 

  • 2/28 (7%) seemed actively aggravated with AI alignment post-survey (possibly 3/30 (10%))
  • 2/28 (7%) showed a high degree of interest in AI alignment post-survey (possibly 2/30 (7%))
  • 8/28 (29%) showed a high degree of interest (n=2) or seemed pretty interested (n=6) in AI alignment post-survey

This ratio is not great, and we would want better numbers before progressing. 

 

What’s next? 

I’m interested in continuing to work on AI safety field-building, aimed at ML researchers. AISFB Hub is closing down, but if I secure additional funding, then I’d want to start a new rebranded org (probably “Arkose”) with a more focused mode of operation. I’d likely focus on a “survey + interview(?) field-building org aimed at ML researchers”, where I’d hire a couple of people to help work on individual projects and still do a bunch of direct work myself. While these ideas are going to need to be more fleshed out before applying for funding, the directions I’m most excited about are: 

Surveys

I like surveys because they’re scalable, and because a lot of people haven’t heard of AI alignment (only 41% of my interviewees had heard of it in any capacity), so surveys are a nice way to introduce people to the ideas. (I also personally enjoy the tasks involved in running surveys.) We completed a pilot outreach survey, which went all right, and we have ideas for how to restructure it so that it goes better.

Once we have that better survey, I’m also interested in modifying it so that it can be run in China. I continue to be interested in AI safety in China, and have since talked to a lot of the involved parties there, who seem tentatively interested in me running such a survey. 

We also did a fair amount of work to prepare for larger-scale deployment of surveys – if our pilot had gone better and funding had remained, we would have likely started scaling. A lot of the field-building insights from previous posts will be useful for focusing on specific populations, and we’ve done some work with respect to scalable survey logistics. 

Interviews

One-on-one conversations with AI researchers offer an unparalleled degree of personalization. During the AISFB Hub period, I was most interested in training other people to conduct these interviews, since I find conducting interviews to be pretty tiring as an introvert. There were a couple of potentially good candidates, but ultimately I don’t think anything was firmly set in motion. 

However, people continue to be excited about interviews, and there are a couple of different ways I could see this progressing: 

  • More structured interviews than in my previous interview series (and more technically focused). This might make it more sustainable for me personally to conduct interviews. (It might also increase trainability, but the technical requirements would be higher…)
  • I continue to be interested in doing a pairing program between AI safety researchers and AI researchers for one-on-ones. I haven’t had time to invest in this, but have some preliminary interest and plans. 

Pipeline

I also remain firmly interested in “building the ML researcher pipeline” activities, despite not having concrete plans here. These wouldn’t be focuses of the org, but they would be something I’m often thinking about when developing surveys and interviews.

  • After ML researchers are introduced to AI alignment, where do interested researchers go next? Is there something they can do before the AGISF AI alignment curriculum?
  • We probably need new materials aimed specifically at AI researchers.
    • We haven’t tested all the existing introductory material, but the intended audience does seem to matter a lot for reception, and I’d definitely take more options.
    • One next-step need: If an ML researcher comes from [x subfield] and is interested in working on an alignment research project, what are the existing current papers they should read?
  • Prizes for working on AI alignment projects are something I haven’t looked into, but they still seem potentially quite good – how can those be integrated better into the pipeline? (CAIS’s work)
  • Having AI safety workshops at the major conferences also seems good – is that covered adequately? (CAIS and others)
  • Other ideas I’ve been hearing around: peer review for AI alignment papers, prediction markets / forecasting training

Interested in funding me? 

The above thus constitutes my rough plans for a “survey + interview(?) field-building org aimed at ML researchers”! I’m going to put together a more detailed plan in the next month and apply to Open Phil and probably SFF. (I’ve talked with Open Phil some already; they’re interested in seeing a full grant proposal from me.) 

However, I’m also interested in talking to individual donors who want more of this work done, or who have specific projects they’d be excited about funding. This is not a strong bid at all – I’ll get things sorted one way or another, and I’m personally financially stable. But if you happen to be interested in this work, I have a 501(c)(3) set up so that donations are tax-deductible, and I’d be happy to chat about what you think the field-building priorities are and how that might intersect with my plans.

Timeline: I anticipate taking a pause from this work for ~2-5 months pretty soon. This is mostly to explore some different field-building / ops experiences while I wait for funding evaluations. After that, I’ll be making a choice about whether to continue AISFB Hub-like activities, or work for a different org where I can do AI safety fieldbuilding. While my future plans are quite uncertain and subject to revision, my current best guess is that I’ll want to return to running this org, and ease and continuity of funding is likely to be a major crux.

 

Conclusion

And that’s all, folks. It’s been a really cool ride for me – thanks so much to everyone who contributed to the AISFB Hub! Looking forward to whatever comes next, and very happy to discuss.

Comments



Thanks for sharing and for all your work!

What you have done already has been quite useful to me and I am excited to see what you do next.

Thanks Peter!
