
Acknowledgements

Thank you to Frances Lorenz for writing this post and Daniel Braun for his comments.

 

Introduction
AI Safety has a pool of research talent working via three main avenues: industry, academia, and independent research. Unfortunately, independent researchers and, depending on the lab, those in academia can be physically isolated from others in the community. As a result, they lose out on incidental feedback: bumping into coworkers at lunch or labmates in the office and having ongoing, casual discussions about research ideas, progress updates, points of uncertainty, etc. Alternative sources of feedback are limited and can be intimidating: you can share drafts with people, write a blog post, email or message back and forth, or schedule video calls with other researchers. These sources can be difficult to find and typically follow a “provide context, receive feedback” structure that prevents deep exploration of particular ideas or avenues. Unless scheduled regularly, they also lack the familiarity that can elevate the quality of a discussion.

 

Building community through feedback events

AI Safety Support (AISS) would like to try hosting feedback/discussion events. Ideally, these would run for a longer duration (perhaps 90 minutes), with participants free to leave at any time (there could be a scheduled break after 45 minutes so that those who feel uncomfortable leaving mid-session can do so then). The event would involve a host individual or group briefly presenting their work, followed by free-form discussion between participants and the hosts, rather than one-way feedback. If there is some rough continuity in the group of people who frequent such discussions, there is increased potential for familiarity between participants and hosts. Additionally, AISS relieves hosts of the burden of requesting feedback and can help encourage a friendly, collaborative atmosphere.

 

The challenge of feedback events

In particular, AISS would like participants to have a working knowledge of AI Safety research to facilitate high-level discussions. One challenge with this is convincing such participants that feedback events are worth their time. This is difficult because: 

  • Participants are providing most of the value in these events. Though AI Safety researchers may provide incidental feedback regularly throughout their working lives, committing to a scheduled Zoom session seems to be perceived as a greater time cost than such encounters. This could perhaps be attributed to a lack of familiarity or context (you may not know the hosts or the quality of the project).
  • For participants to see such an event as worthwhile they need:
    • Some amount of confidence in the project and those working on it.
    • Some amount of confidence that other participants are knowledgeable and will ask higher-level questions rather than seek clarifications.

This presents a real challenge for getting feedback events up and running, but we believe they have the potential to be quite valuable to the field as a whole. Our first event went well: both participants and hosts seemed quite engaged, with ample back-and-forth. The hosts concluded that the event was encouraging and stimulating. One of our participants provided the following feedback:

“The format of having authors present their work live is very refreshing - I'm tired of reading all the time. I'm also hungry for discussions because I have too little contact with AI safety people.”

Conversely, we had another participant state that they couldn’t regularly attend such events due to the time commitment. We hope to determine whether a significant portion of researchers relate to the first participant; thus, we want to give feedback events a real shot. 


 

We would like to kindly emphasize that your decision to attend feedback events as a participant will (in part) determine the success of such events. We would be very happy if readers like you could consider giving them a try, even if your initial instinct says it’s not worth it given the difficulties we’ve stated above. If, after doing so, you feel the events are not worthwhile, we will have an anonymous feedback form available for you to share your thoughts.


What can you do?

Join our mailing list and receive invites to feedback events

Please fill out this short form to join the mailing list (you can always unsubscribe later):

https://airtable.com/shrerPv4HjXof0oUU

*Signing up is not a commitment to anything and comes with no obligations; we will simply add you to our mailing list for feedback events. Every time we host one, we’ll email you an invitation, and you can decide whether you’d like to attend. You can always unsubscribe.

(Note: We’re particularly seeking non-beginners with concrete knowledge of AI Safety.)


 

Sign up to host a feedback event

If you’d like to host a feedback event and present your work (with AISS taking on all the organization and facilitation), please fill out this form: 

https://airtable.com/shreRd4ctQcaFmMds

We will email you with the suggested date and details for your feedback event. If you approve, we’ll email our mailing list to invite participants. 


 

Thank you!

Finally, if you have any ideas on how to best run feedback events, please let us know!

Comments (3)



I think moderated video calls are my favorite format, as boring as that is. I.e. you have a speaker and also a moderator who picks people to ask questions, cuts people off or prompts them to keep talking depending on their judgment, etc.

Another thing I like, if it seems like people are interested in talking about multiple different things after the main talk / Q&A / discussion, is splitting up the discussion into multiple rooms by topic. I think Discord is a good application for this. Zoom is pretty bad at this but can be cajoled into having the right functionality if you make everyone a co-host. I think Microsoft Teams is fine but other people have problems, and other people think GatherTown is fine but I have problems.

Kudos for the initiative! I think it makes sense to crosspost this to LessWrong.

Good idea :) thank you!
