Bio

I currently lead EA Funds.

Before that, I worked on improving epistemics in the EA community at CEA (as a contractor), as a research assistant at the Global Priorities Institute, on community building, and on global health policy.

Unless explicitly stated otherwise, opinions are my own, not my employer's.

You can give me positive and negative feedback here.

A separate cluster of threat models worth disentangling is coordination tech creating more surface area for coordination against human users within the economy, particularly if it's much easier for smart, misaligned AI systems to coordinate with relatively stupid, corrigible AI systems (e.g., Opus 4.7). The arguments for an AI <> AI coordination advantage (over AI <> human) are quite intuitive to me, but I don't think you actually need an asymmetry here to put society in a more vulnerable state than the current one. I don't have a great sense of how this washes out, but it feels like a crux for evaluating the net benefit of coordination tech.

Similar to how the shift from traditional to digital banking probably creates more surface area for exploitation by hackers, it's probably very good to have primitive computers, rather than more modern ones, touching nukes.
 

I thought that part of the core thesis was that as we go through the intelligence explosion, coordination tech becomes increasingly valuable (maybe critical). Are you saying that it's plausible we'll get "good enough" coordination tech out of agents that are much less powerful than the frontier during the IE? E.g. coordination tech generally uses Opus 4.7, even in the Opus 6-8 era, when coordination tech seems most (?) valuable but we also have much more legitimate concerns about scheming capabilities?

The dual-use concerns you raise are framed around bad human actors: corporations colluding, coup plotters, criminals. But the coordination infrastructure you're sketching could also create significant attack surfaces for AI systems themselves. If AI delegates are negotiating on behalf of humans, running arbitration, doing confidential monitoring, and profiling preferences, then a misaligned or adversarially manipulated AI layer sitting inside all of that coordination infrastructure seems like it could be quite a powerful lever for influence or control.

Curious if you have thoughts on this class of concerns?

Thanks for sharing this. Did your team make and test simple prototypes for any of these ideas? If not, I'm curious why not, from a research/writing perspective. I would have thought that you could get quite a lot of signal very quickly with Claude Code on the feasibility and difficulty of some of these ideas.

I think the "strong default" framing overstates the case, for a few reasons.

The argument (IIUC) hinges on one actor gaining decisive, uncontested control before anyone else can respond. But that assumption does a lot of work, and I'm not sure it holds:

  • We currently have dozens of serious actors across multiple adversarial jurisdictions racing simultaneously, which looks more like a setup for messy multipolarity than a clean monopoly.
  • Extreme military advantage hasn't historically guaranteed political control - the US had overwhelming superiority in Vietnam, Afghanistan, and Iraq and still couldn't convert that into stable governance. Even on fast take-offs, bridging the gap between "ASI achieved internally" and actually running a society requires human cooperation, and sustaining that loyalty is very hard.
  • The same inference ("extreme capability asymmetry, therefore inevitable authoritarianism") was made about nuclear weapons. What emerged was contested, ugly, and dangerous, but not totalitarian. That of course doesn't mean ASI follows the same path, but it's worth asking whether you would have predicted that outcome in advance.
  • Even within a single ASI-controlling organisation, individuals have interests, and defection, whistleblowing, and sabotage are historically common responses to illegitimate power grabs from within institutions. The DARPA director scenario assumes a level of internal cohesion that imo rarely holds in practice.

I'd put the more likely default as a messy, contested outcome that preserves more democratic structure than your title implies, even if it falls well short of anything we'd be happy with.

Zooming out slightly, I'm not sure what you are actually imagining ASI looks like here, so maybe I'm talking past you. I suspect that either:
 

  1. You're imagining a "god-like" AI which has intellectual and physical capabilities that far exceed the aggregate yearly output and total resources of the current USA.
    1. In which case, even aggressive ASI timelines should be measured in a low number of decades rather than years. (Edit: I should have said 5-15 years here; "low number of decades" makes it sound like 30 years. I still think the general point about democratic societies having time to adapt stands.)
  2. You're imagining a "country of geniuses in a datacenter" and little more (perhaps you also get a significant number of automated military drones).
    1. In which case, I don't think there is a strong case for the kind of overwhelming loss of democratic control you're describing. The data centres will still rely on their host country for energy, human resources, etc.

I don't think they are trying to convert the EA community into something else - they are pretty clearly creating separate spaces for their movement/community. [1]

Describing their post as using "applause lights" seems at best uncharitable, and "absolute nonsense" is just rude. There are several well-received posts on the forum around "[a]ugmenting decision-making with meditative (e.g. mindfulness) [practices]", like this one and this one. It's fine to dislike their principles, but I think it's worth making an effort to be encouraging when fellow altruists try to build on the "project" of Effective Altruism.

  1. ^

    e.g. they say "That being said, we’re also aware of the danger of potential zero-sum dynamics between int/a and EA, and would like to avoid them as much as possible. One thing we are afraid of is int/a gravitating towards the “just bitching about EA” attractor state, which is definitely not the vibe we’re going for. Another concern is “taking people away from EA”. We don’t intend to dissuade people from doing impactful work by EA lights, in fact many of us in the movement are doing incredibly canonical EA jobs." and have run many events themselves under their own banner.

     

I found this post hard to engage with, though I'm not quite sure why. I think it's pointing at some important areas, so I've tried to write out some of my confusions.

I don't understand why you believe that these problems won't be solved by "ASI" or "human-level AI" - presumably, if they are tractable for humans, they'll be tractable for human-level AIs. I agree that making sure these systems are actually used on these other problems is important, and that a lot of that work is "solving the alignment problem".

I think you might be using terms like AGI and ASI in non-standard ways, e.g. "Approach 4: Research how to steer ASI toward solving non-alignment problems [like philosophy]". It's plausible that very powerful AI systems are less good at philosophy than they are at tasks that are cheaper to evaluate - but they'll almost definitionally be better at philosophy than current humans. I think this is concerning for a bunch of reasons (including doing good alignment research in the run-up to ASI), but I'm not very worried about situations where we succeed at aligning ASI and then can't get good philosophy research out of it (at least by human standards of good) for capability reasons.

Also, approach 3 (pause at human-level AI) probably does help with misalignment risks relative to the counterfactual of just proceeding to ASI, for reasons like AI control and the much stronger evidence we have for our ability to control human-level intelligences than superintelligences.

I agree with some of the early parts of the post - I definitely feel the community has a lot of researchers and not enough people doing other things. That said, I suspect that many of the other things people will imagine while reading this post are also not very useful for the non-alignment AI problems you described.

 

"no fieldbuilding programs exclusively dedicated to biosecurity"
 


Minor, but in case you aren't aware, the Cambridge Biosecurity Hub is a fieldbuilding program exclusively dedicated to biosecurity (they are running the AI x Bio stream at ERA, helped start a bio stream at SPAR, and are running their annual conference in a few weeks in Cambridge UK). I think it's funded by CG's Biosec team, but it may be funded by your team (the Biosec team is at least aware of them). People interested in making more stuff happen in this space should consider reaching out to them!

In any case, I'm also excited about more biosecurity fieldbuilding work happening and agree that it's been extremely neglected for a long time - thanks for writing this!

This is great to see! I've enjoyed reading Gergo's Substack and have really appreciated his work on AIS fieldbuilding - excited to see what you do with more resources!

I found this text particularly useful for working out what the program is.

When
 

  • Program Timeline: 1 December 2025 – 1 March 2026 (3 months)
  • Rolling applications begin 31 October 2025
  • If you are accepted prior to December 1st, you can get an early start with our content!
  • Extension options: You have the option to extend the program for yourself for up to two additional months – through to 1 June 2026. (We expect several cohort participants to opt for at least 1-month extensions.)

Where

 

  • Remote, with all chats and content located on the Supercycle.org community platform
  • Most program events will happen in the European evenings / North American mornings. Other working groups can reach consensus on their own times.

 

How much

 

$750 per month (over 3 months)
