Bio

"EA-Adjacent" now I guess.

🔸 10% Pledger.

Likes pluralist conceptions of the good.

Dislikes Bay Culture being in control of the future.

Sequences
3

Against the overwhelming importance of AI Safety
EA EDA
Criticism of EA Criticism

Comments
370

Note - this was written kinda quickly, so it might be a bit less tactful than it would be if I had more time.

Making a quick reply here after binge-listening to three Epoch-related podcasts in the last week, and I basically think my original perspective was vindicated. It was kinda interesting to see which points were repeated or phrased a different way - would recommend if you're interested in the topic.

  • The initial podcast with Jaime, Ege, and Tamay. This clearly positions the Epoch brain trust as between traditional academia and the AI Safety community (AISC). tl;dr - academia has good models but doesn't take AI seriously, and AISC the opposite (from Epoch's PoV).
  • The 'debate' between Matthew and Ege. This should have clued people in, because while full of good content, by the last hour or hour and a half it almost seemed to turn into 'openly mocking and laughing' at AISC, or at least its traditional arguments. I also don't buy those arguments, but I feel like Matthew and Ege's reaction shows that they just don't buy the root AISC claims.
  • The recent Dwarkesh podcast with Ege & Tamay. This is the best of the three, but probably also best listened to after the first two, since Dwarkesh actually pushes back on quite a few claims, which means Ege & Tamay flesh out their views more - a personal highlight was what the reference class for AI Takeover actually means.

Basically, the Mechanize cofounders don't agree at all with 'AI Safety Classic', I am very confident that they don't buy the arguments at all, that they don't identify with the community, and somewhat confident that they don't respect the community or its intellectual output that much. 

Given that their views are: a) AI will be a big deal soon (~a few decades), b) returns to AI will be very large, c) Alignment concerns/AI risks are overrated, and d) Other people/institutions aren't on the ball, then starting an AI Start-up seems to make sense.

What is interesting to note, and something I might look into in the future, is just how much these differences in expectations about AI depend on differences in worldview rather than on differences in technical understanding of ML or of how these systems actually work.

So why are people upset?

  • Maybe they thought the Epoch people were more part of the AISC than they actually were? That seems like the fault of the people who believed this, not Epoch or the Mechanize founders.
  • Maybe people are upset that Epoch was funded by OpenPhil, and this seems to have led to 'AI acceleration'? I think that's plausible, but Epoch has still produced high-quality reports and information, which OP presumably wanted them to do. And equating EA with OP, or with anyone funded by OP, doesn't seem like a useful concept to me.
  • Maybe people are upset at any progress in AI capabilities. But that assumes Mechanize will be successful in its aims, which is not guaranteed. It also seems to reify the concept of 'capabilities' as one big thing, which I don't think makes sense. Making a better Stockfish, or a better AI for FromSoft bosses, does not increase x-risk, for instance.
  • Maybe people think that the AI Safety Classic arguments are just correct, and are therefore upset at people taking actions that go against them. But then many actions would seem bad by this criterion all the time, so it's odd that this case would provoke such a reaction. I also don't think EA should hang its hat on 'AI Safety Classic' arguments being correct anyway.

Probably it's some mix of these. I personally remain not that upset because a) I didn't really class Epoch as 'part of the community', b) I'm not really sure I'm 'part of the community' either, and c) my views are at least somewhat similar to the Epoch set above, though maybe not as far in their direction, so I'm not as concerned on the object level either.

I'm not sure I feel as concerned about this as others. tl;dr - They have different beliefs from Safety-concerned EAs, and their actions are a reflection of those beliefs.

It seems broadly bad that the alumni from a safety-focused AI org

Was Epoch ever a 'safety-focused' org? I thought they were trying to understand what's happening with AI, not taking a position on Safety per se.

 ...have left to form a company which accelerates AI timelines

I think Matthew and Tamay think this is positive, since they think AI is positive. As they say, they think explosive growth can be translated into abundance. They don't think that the case for AI risk is strong, or significant, especially given the opportunity cost they see from leaving abundance on the table.

Also important to note is what Epoch boss Jaime says in this very comment thread.

As I learned more and the situation unfolded I have become more skeptical of AI Risk.

The same thing seems to be happening with me, for what it's worth.

People seem to think that there is an 'EA Orthodoxy' on this stuff, but either there isn't as much of one as people think, or the people who disagree with it are no longer EAs. I really don't think it makes sense for EA to make clamping down on 'doing anything to progress AI' a hill to die on.

Note: I'm writing this for the audience as much as a direct response

The use of Evolution to support this metaphor is not really warranted. I think Quintin Pope's 'Evolution provides no evidence for the sharp left turn' (which won a prize in an OpenPhil Worldview contest) convincingly argues against it. Zvi wrote a response from the "LW Orthodox" camp that wasn't convincing, and Quintin responds to it here.

On "Inner vs Outer" framings for misalignment is also kinda confusing and not that easy to understand when put under scrutiny. Alex Turner points this out here, and even BlueDot have a whole "Criticisms of the inner/outer alignment breakdown" in their intro which to me gives the game away by saying "they're useful because people in the field use them", not because their useful as a concept itself.

Finally, a lot of these concerns revolve around the idea of there being set, fixed 'internal goals' that these models have and represent internally, but which are themselves immune from change, or can be hidden from humans, etc. This kind of strong 'Goal Realism' is a key part of the case for 'Deception'-style arguments, whereas I think Belrose & Pope show an alternative way to view how AIs work: 'Goal Reductionism'. In that framing, the imagined issues no longer seem certain, as AIs are better understood as having 'contextually-activated heuristics' rather than Terminal Goals. For more along these lines, you can read up on Shard Theory.

I've become a lot more convinced of these criticisms of "Alignment Classic" by diving into them. Of course, people don't have to agree with me (or the authors), but I'd highly encourage EAs reading the comments on this post to realise that Alignment Orthodoxy is neither uncontested nor settled. If you see people making strong cases based on arguments and analogies that don't seem solid to you, you're probably right, and you should decide for yourself rather than accepting that the truth has already been found on these issues.[1]

  1. ^

    And this goes for my comments too

I'm glad someone wrote this up, but I actually don't see much evaluation here from you, apart from "it's too early to say" - but then, Zhou Enlai pointed out that you could say that about the French Revolution,[1] and I think we can probably say some things. I generally have you mapped to the "right-wing Rationalist" subgroup, Arjun,[2] so it'd actually be interesting to get your opinion instead of trying to read between the lines on what you may or may not believe. I think there was a pretty strong swing in Silicon Valley / Tech Twitter & TPOT / Broader Rationalism towards Trump, and I think this isn't turning out well, so I'd actually be interested to see people saying what they actually think - be that "I made a huge mistake", "It was a bad gamble but Harris would've been worse", or even "This is exactly what I want".

  1. ^

    I know it's apocryphal but it's a good quote

  2. ^

    Let me know if this is wrong and/or you don't identify this way

Hey Cullen, thanks for responding! So I think there are object-level and meta-level thoughts here, and I was just using Jeremy as a stand-in for the polarisation of Open Source vs AI Safety more generally.

Object Level - I don't want to spend too long here as it's not the direct focus of Richard's OP. Some points:

  • On 'elite panic' and 'counter-enlightenment', he's not directly comparing FAIR to them, I think. He's saying that previous attempts to avoid the democratisation of power in the Enlightenment tradition have had these flaws. I do agree that it is escalatory, though.
  • I think, from Jeremy's PoV, that centralization of power is the actual ballgame and what Frontier AI Regulation should be about. So one mention on page 31 probably isn't good enough for him. That's a fine reaction to me, just as it's fine for you and Marcus to disagree on the relative costs/benefits and write the FAIR paper the way you did.
  • On the actual points, though, I actually went back and skim-listened to the webinar on the paper in July 2023, which Jeremy (and you!) participated in, and man, I am so much more receptive and sympathetic to his position now than I was back then, and I don't really find you and Marcus that convincing in rebuttal - but as I say, I only did a quick skim-listen, so I hold that opinion very lightly.

Meta Level - 

  • On the 'escalation' in the blog post, maybe his mind hardened over the year? There's probably a difference between ~July-23 Jeremy and ~Nov-23 Jeremy, which he may view as an escalation from the AI Safety side in doubling down on these kinds of legislative proposals. While it's before SB 1047, I see Wiener had introduced an earlier intent bill in September 2023.
  • I agree that "people are mad at us, we're doing something wrong" isn't an airtight inference, but as you say it's a good prompt to think "should I have done something different?", and (not saying you're doing this) I think the absolute disaster zone that was the SB 1047 debate and discourse can't be fully attributed to e/acc or a16z or something. I think the backlash I've seen to the AI Safety/x-risk/EA memeplex over the last few years should prompt anyone in these communities, especially those trying to influence the policy of the world's most powerful state, to really consider Cromwell's rule.
  • On this "you will just in fact have pro-OS people mad at you, no matter how nicely your white papers are written." I think there's some sense in which it's true, but I think that there's a lot of contigency about just how mad people get, how mad they get, and whether other allies could have been made on the way. I think one of the reasons they got so bad is because previous work on AI Safety has understimated the socio-political sides of Alignment and Regulation.[1]
  1. ^

    Again, not saying that this is referring to you in particular

I responded well to Richard's call for More Co-operative AI Safety Strategies, and I like the call toward more sociopolitical thinking, since the Alignment problem really is a sociological one at heart (always has been). Things which help the community think along these lines are good imo, and I hope to share some of my own writing on this topic in the future.

Whether or not I agree with Richard's personal politics is kinda beside the point of this as a message. Richard's allowed to have his own views on things, and other people are allowed to criticise them (I think David Mathers' comment is directionally where I lean too). I will say that not appreciating arguments from open-source advocates, who are very concerned about the concentration of power from powerful AI, has led to a completely unnecessary polarisation against the AI Safety community from them. I think, while some tensions do exist, it wasn't inevitable that things would get as bad as they are now, and in the end the polarisation was a particularly self-defeating one. Again, by doing the kind of thinking Richard is advocating for (you don't have to co-sign his solutions - he's even calling for criticism in the post!), we can hopefully avoid these failures in the future.

On the bounties, the one that really interests me is the OpenAI board one. I feel like I've been living in a bizarro-world with EAs/AI Safety people ever since it happened, because it seemed such a colossal failure, either of legitimacy or of strategy (most likely both), and it's a key example of the "un-cooperative strategy" that Richard is concerned about, imo. The combination of extreme action and ~0 justification, either externally or internally, remains completely bemusing to me and was a big wake-up call for my own perception of 'AI Safety' as a brand. I don't think people should underestimate the second-order impact this had on both 'AI Safety' and EA, coming about a year after FTX.

Piggybacking on this comment because I feel like the points have been well-covered already:

Given that the podcast is going to have a tighter focus on AGI, I wonder if the team is giving any consideration to featuring more guests who present well-reasoned scepticism toward 80k's current perspective (broadly understood). While some sceptics might be so sceptical of AGI or hostile to EA that they wouldn't make good guests, I think there are many thoughtful experts who could present a counter-case that would make for a useful episode (or episodes).

To me, this comes from a case for epistemic hygiene, especially given the prominence that the 80k podcast has. To outside observers, 80k's recent pivot might appear less as "evidence-based updating" and more as "surprising and suspicious convergence" without credible demonstrations that the team actually understands opposing perspectives and can respond to the obvious criticisms. I don't remember the podcast featuring many guests who present a counter-case to 80k's AGI-bullishness, as opposed to marginal critiques, and I don't particularly remember those arguments/perspectives being given much time or care.

Even if the 80k team is convinced by the evidence, I believe that many in both the EA community and 80k's broader audience are not. From a strategic persuasion standpoint, even if you believe the evidence for transformative AI and x-risk is overwhelming, interviewing primarily those within the AI Safety community who are already convinced will likely fail to persuade those who don't already find that community credible. Finally, there's also significant value in "pressure testing" your position through engagement with thoughtful critics, especially if your theory of change involves persuading people who are either sceptical themselves or just unconvinced.

Some potential guests who could provide this perspective (note: I don't 100% endorse the people below, but they point in the direction of guests who might do a good job at the above):

  • Melanie Mitchell
  • François Chollet
  • Kenneth Stanley
  • Tan Zhi-Xuan
  • Nora Belrose
  • Nathan Lambert
  • Sarah Hooker
  • Timothy B. Lee
  • Krishnan Rohit

I don't really get the framing of this question.

I suspect that, for any increment of time one could take during EA's existence, more 'harm' was done in the rest of the world during that time. EA simply isn't big enough to counteract the moral actions of the rest of the world. Wild animals suffer horribly, people constantly die of preventable diseases, and formal wars and violent struggles affect the lives of millions. The sheer scale of the world outweighs EA many, many times over.

So I suspect you're making a more direct comparison to Musk/DOGE/PEPFAR? But again, I feel like anyone wielding the awesome executive power of the United States Government should expect to have larger impacts on the world than EA.

I think this is downstream of a lot of confusion about what 'Effective Altruism' really means, and I realise I don't have a good definition any more. In fact, because all of the below can be criticised, it sort of explains why EA gets seemingly infinite criticism from all directions.

  • Is it explicit self-identification?
  • Is it explicit membership in a community?
  • Is it implicit membership in a community?
  • Is it if you get funded by OpenPhilanthropy?
  • Is it if you are interested or working in some particular field that is deemed "effective"?
  • Is it if you believe in totalising utilitarianism with no limits?
  • Is it to always justify your actions with quantitative cost-effectiveness analyses where your chosen course of action is the top-ranked one?
  • Is it if you behave a certain way?

Because in many ways I don't count as EA based off the above. I certainly feel less like one than I have in a long time.

For example:

I think a lot of EAs assume that OP shares a lot of the same beliefs they do.

I don't know if this refers to some gestalt 'belief' that OP might have, or Dustin's beliefs, or some kind of 'intentional stance' regarding OP's actions. While many EAs share some beliefs (I guess), there's also a whole range of variance within EA itself, and the fundamental issue is that I don't know if there's something that binds it all together.

I guess I think the question should be less "public clarification on the relationship between effective altruism and Open Philanthropy" and more "what does 'Effective Altruism' mean in 2025?"

I mean, I just don't take Ben to be a reasonable actor regarding his opinions on EA? I doubt you'll see him open up and fully explain a) who the people he's arguing with are or b) what the explicit change in EA to an "NGO patronage network" was - with names, details, and public evidence of the above - while being willing to change his mind in response to counter-evidence.

He seems to have been connected to Leverage Research, maybe in the original days?[1] There was a big falling out there, and many people linked to the original Leverage hate "EA" with the fire of a thousand burning suns. Then he linked up with Samo Burja at Bismarck Analysis and also with Palladium, which definitely links him to the emerging Thielian tech-right, kinda what I talk about here. (Ozzie also had a good LW comment about this here.)

In the original tweet Emmett Shear replies, and then it spiralled into loads of fractal discussions, and I'm still not really clear what Ben means. Maybe you can get more clarification in Twitter DMs, rather than in a public argument where he'll want to dig into his position?

  1. ^

    For the record, a double Leverage & Vassar connection seems pretty disqualifying to me - especially as I'm very Bay-sceptical anyway.
