Bio

Pro-pluralist, pro-bednet, anti-Bay EA. 🔸 10% Pledger.

Sequences
3

Against the overwhelming importance of AI Safety
EA EDA
Criticism of EA Criticism

Comments
344

Sharing some planned Forum posts I'm considering, mostly as a commitment device, but welcome thoughts from others:

  • I plan to add another post in my "EA EDA" sequence analysing Forum trends in 2024. My pre-registered prediction is that we'll see 2023 patterns continue: declining engagement (though possibly plateauing) and AI Safety further cementing its dominance across posts, karma, and comments.
  • I'll also try to do another end-of-year Forum awards post (see here for last year's) though with slightly different categories.
  • I'm working on an analysis of EA's post-FTX reputation using both quantitative metrics (Forum engagement, Wikipedia traffic) and qualitative evidence (public statements from influential figures inside and outside EA). The preliminary data suggests more serious reputational damage than the recent Pulse survey found. If that difference is meaningful (as opposed to methodological, or just a mistake on my part), I suspect it highlights the gap between public and elite perception.
  • I recently finished reading former US General Stanley McChrystal's book Team of Teams. Ostensibly it's a book about his command of JSOC in the Iraq War, but it's really about the concept of Auftragstaktik as a method of command, and more than one passage struck me as relevant to Effective Altruism (especially for what "Third Wave" EA might mean). This one is a stretch though: I'm not sure how interested the Forum would be in it, or whether it would be the right place to post it.

My focus for 2025 will be to work towards developing my position on AI Safety, and to share it through a series of posts in my AI Safety sequence.[1] The concept of AGI went mainstream in 2024, and it does look like we will see significant technological and social disruption in the coming decades due to AI development. Nevertheless, I find myself increasingly skeptical of traditional narratives and arguments about what Alignment is, the likelihood of risk, and what ought to be done about it. Instead, I've come to view "Alignment" primarily as a political philosophy rather than a technical field of computer science. I could very well be wrong on most or all of these ideas, though, and I think getting critical discussion from the community will be good both for myself and (I hope) for the Forum readership.[2]

As such, I'm considering doing a deep-dive on the Apollo o1 report given the controversial reception it's had.[3] I think this is the most unlikely one though, as I'd want to research it as thoroughly as I could, and time is at a premium since Christmas is around the corner, so this is definitely a "stretch goal".

Finally, I don't expect to devote much more time[4] to adding to the "Criticism of EA Criticism" sequence. I often finish the posts well after the initial discourse has died down, and I'm not sure what effect they really have.[5] Furthermore, I've started to notice my own views on a variety of topics diverging from "EA Orthodoxy", so I'm not sure I'd make a good defender. This change may itself warrant a future post, though again I'm not committing to that yet.

  1. ^

    Which I will rename

  2. ^

    It may be more helpful for those without technical backgrounds who are concerned about AI, but I'm not sure. I also think having a somewhat AGI-sceptical perspective represented on the Forum might be useful for intellectual diversity purposes, but I don't want to claim that too strongly. I'm very uncertain about the future of AI and could easily see myself being convinced to change my mind.

  3. ^

    I'm slightly leaning towards the skeptical interpretation myself, as you might have guessed

  4. ^

    if any at all, unless an absolutely egregious but widely-shared example comes up

  5. ^

    Does Martin Sandbu read the EA Forum, for instance?

I think this is, to a significant extent, definitionally impossible with longtermist interventions, because the 'long-term' part excludes having an empirical feedback loop quick enough to update our models of the world.

For example, if I'm curious about whether malaria net distribution or vitamin A supplementation is more 'cost-effective', I can fund interventions and run RCTs, and then model the resulting impact according to some metric like the DALY. This isn't cast-iron evidence, but it is at least causally connected to the result I care about.
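To make the contrast concrete, here is a minimal sketch (in Python, with entirely made-up placeholder numbers rather than real trial estimates) of the kind of cost-per-DALY comparison that RCT evidence can feed: every input is an empirically estimable quantity that new trial data can update, which is exactly what is unavailable in the longtermist case.

```python
# Hypothetical illustration only: toy placeholder numbers, not real RCT estimates.
# The point is that every input is (in principle) empirically measurable,
# so the model can be updated as trial evidence comes in.

def cost_per_daly_averted(cost_per_unit: float, dalys_averted_per_unit: float) -> float:
    """Programme cost per DALY averted, given per-unit cost and per-unit effect."""
    return cost_per_unit / dalys_averted_per_unit

# Made-up placeholder figures for two interventions
bednets = cost_per_daly_averted(cost_per_unit=5.0, dalys_averted_per_unit=0.05)
vitamin_a = cost_per_daly_averted(cost_per_unit=1.5, dalys_averted_per_unit=0.01)

print(f"Bednets:   ~${bednets:.0f} per DALY averted")
print(f"Vitamin A: ~${vitamin_a:.0f} per DALY averted")
```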

For interventions that target the long-run future of humanity, this is impossible. We can't run counterfactuals of the future or the past, and I at least can't wait 1,000 years to see the long-term impact of certain decisions on the civilizational trajectory of the world. Thus, longtermist interventions cannot really get empirical feedback on the parameters of action, and must mostly rely on subjective human judgement about them.

To their credit, the EA Long-Term Future Fund says as much on their own web page:

Unfortunately, there is no robust way of knowing whether succeeding on these proxy measures will cause an improvement to the long-term future.

For similar thoughts, see Laura Duffy's thread on empirical vs reason-driven EA

One potential weakness is that I'm curious if it promotes the more well-known charities due to the voting system. I'd assume that these are somewhat inversely correlated with the most neglected charities.

I guess this isn't necessarily a weakness if the more well-known charities are more effective? I can see the case that: a) they might not be neglected in EA circles, but may be very neglected globally compared to their impact, and b) there is often an inverse relationship between tractability/neglectedness and importance/impact of a cause area or charity. Not saying you're wrong, but it's not necessarily a problem.

Furthermore, my anecdotal take from the voting patterns, as well as the comments on the discussion thread, is that neglectedness is often high on voters' minds - though I admit that commenters on that thread are a biased sample of all those voting in the election.

It can be a bit underwhelming if an experiment to try to get the crowd's takes on charities winds up determining to, "just let the current few experts figure it out." 

Is it underwhelming? I guess if you want the donation election to be about spurring lots of donations to small, spunky EA start-ups working in weirder cause areas, it might be, but that's not what I understand the intention of the experiment to be (though I could be wrong).

My take is that the election is an experiment in EA democratisation, where we get to see what the community values when we use a roughly 1-person-1-ballot system instead of the those-with-the-money-decide system, which is how things work right now. The takeaways seem to be:

  • The broad EA community values Animal Welfare a lot more than the current major funders
  • The broad EA community sees value in all 3 of the 'big cause areas' with high-scoring charities in Animal Welfare, AI Safety, and Global Health & Development.

But you haven't provided any data 🤷

Like you could explain why you think so without de-anonymising yourself, e.g. sammy shouldn't put EA on his CV in US policy because:

  • Republicans are in control of most positions, and they see EA as heavily democrat-coded and aren't willing to consider hiring people associated with it
  • The intelligentsia who hire for most US policy positions see EA as cult-like and/or disgraced after FTX
  • People won't understand what EA is on a CV and will discount sammy's chances compared to putting down "ran a discussion group at university" or something like that
  • You think EA is doomed/likely to collapse and sammy should pre-emptively disassociate their career from it

I feel it would be interesting and useful to hear your perspective on that, to the extent you can share information about it. Otherwise, jumping in with strong (and controversial?) opinions from anonymous accounts just serves to pollute the epistemic commons, in my opinion.

Right but I don't know who you are, or what your position in the US Policy Sphere is, if you have one at all. I have no way to verify your potential background or the veracity of the information you share, which is one of the major problems with anonymous accounts.

You may be correct (though again, the lack of explanation doesn't give detail or a mechanism for why, or help sammy that much, since as you said it depends on the section), but that isn't really the point. The only data point you provide is "intentionally anonymous person on the EA Forum states opinion without supporting explanation", which is honestly pretty weak sauce.

I don't find comments like these helpful without explanations or evidence, especially from throwaway accounts

Yeah again I just think this depends on one's definition of EA, which is the point I was trying to make above.

Many people have turned away from EA (the beliefs, the institutions, and the community) in the aftermath of the FTX collapse. Even Ben Todd seems not to be an EA by some definitions any more, be that via association or identification. Who is to say Leopold is any different, or has not gone further? What then is the use of calling him EA, or of using his views to represent the 'Third Wave' of EA?

I guess from my PoV what I'm saying is that I'm not sure there's much 'connective tissue' between Leopold and myself, so when people use phrases like "listen to us" or "How could we have done" I end up thinking "who the heck is we/us?"

I'm not sure to what extent the Situational Awareness Memo or Leopold himself are representatives of 'EA'

On the pro-side:

  • Leopold thinks AGI is coming soon, will be a big deal, and that solving the alignment problem is one of the world's most important priorities
  • He used to work at GPI & FTX, and formerly identified with EA
  • He almost certainly personally knows lots of EA people in the Bay

On the con-side:

  • EA isn't just AI Safety (yet), so having short timelines/high importance on AI shouldn't be sufficient to make someone an EA?[1]
  • EA shouldn't also just refer to a specific subset of the Bay Culture (please), or at least we need some more labels to distinguish different parts of it in that case
  • Many EAs have disagreed with various parts of the memo, e.g. Gideon's well-received post here
  • Since his time at those EA institutions, he moved to OpenAI (mixed)[2] and now runs an AGI investment firm.
  • By self-identification, I'm not sure I've seen Leopold identify as an EA at all recently.

This again comes down to the nebulousness of what 'being an EA' means.[3] I have no doubt at all that, given what Leopold thinks is the way to have the most impact, he'll be very effective at achieving it.

Further, on your point, I think there's reason to suspect that something like Situational Awareness went viral in a way that, say, Rethink Priorities' Moral Weight project didn't - the promise many people see in powerful AI is power itself, and that's always going to be interesting for people to follow. So I'm not sure that Situational Awareness becoming influential makes it more likely that other 'EA' ideas will.

  1. ^

    Plenty of e/accs hold these two beliefs as well; they just expect alignment by default, for instance

  2. ^

    I view OpenAI as tending implicitly/explicitly anti-EA. I don't think there was an explicit 'purge'; rather, the culture/vision of the company changed such that card-carrying EAs didn't want to work there any more

  3. ^

    The 3 big definitions I have (self-identification, beliefs, actions) could all easily point in different directions for Leopold

I sort of bounced off this one, Richard. I'm not a professor of moral philosophy, so some of what I say below may seem obviously wrong/stupid/incorrect - but I think that were I a philosophy professor, I would be able to shape it into a stronger objection than it might appear at first glance.

Now, when people complain that EA quantifies things (like cross-species suffering) that allegedly “can’t be precisely quantified,” what they’re effectively doing is refusing to consider that thing at all.

I don't think this would pass an ideological Turing Test. I think what people who make this claim are saying is often that previous attempts to quantify the good precisely have ended up having morally bad consequences. Given this history, perhaps our takeaway shouldn't be "they weren't precise enough in their quantification" and should be more "perhaps precise quantification isn't the right way to go about ethics".

Because the realistic alternative to EA-style quantitative analysis is vibes-based analysis: just blindly going with what’s emotionally appealing at a gut level.

Again, I don't think this is true. Would you say that, before the publication of Famine, Affluence, and Morality, all moral philosophy was just "vibes-based analysis"? I think, instead, that all moral reasoning is in some sense 'vibes-based', and that the quantification in EA is often a way of presenting arguments for the EA position.

To state it more clearly: what we care about is moral decision-making, not the quantification of moral decisions. Most decisions that have ever been made were made without quantification. What matters is the moral decisions we make, and the reasons we have for those decisions/values, not what quantitative value we place on them.

the question that properly guides our philanthropic deliberations is not “How can I be sure to do some good?” but rather, “How can I (permissibly) do the most (expected) good?”

I guess I'm starting to bounce off this because I now view it as a big moral commitment which I think goes beyond simple beneficentrism. Another view, for example, would be contractualism, where what 'doing good' means is substantially different from what you describe here, but perhaps that's a more fundamental metaethical debate.

It’s very conventional to think, “Prioritizing global health is epistemically safe; you really have to go out on a limb, and adopt some extreme views, in order to prioritize the other EA stuff.” This conventional thought is false. The truth is the opposite. You need to have some really extreme (near-zero) credence levels in order to prevent ultra-high-impact prospects from swamping more ordinary forms of do-gooding.

I think this is confusing two forms of 'extreme'. Like in one sense the default 'animals have little-to-no moral worth' view is extreme for setting the moral value of animals so low as to be near zero (and confidently so at that). But I think the 'extreme' in your first sentence refers to 'extreme from the point of view of society'.

Furthermore, if we argue that quantifying expected value in quantitative models is the right way to do moral reasoning (as opposed to it sometimes being a useful tool), then I still don't have to accept "even a 1% chance is enough": I could just decline to find acceptable a tool that produces such dogmatism at 1%. You could counter with "your default/status-quo morality is dogmatic too", which, sure. But that doesn't convince me to accept strong longtermism any further, and I've already read a fair bit about it (though I accept probably not as much as you).

While you’re at it, take care to avoid the conventional dogmatism that regards ultra-high-impact as impossible.

One man's "conventional dogmatism" could be reframed as "the accurate observation that people with totalising philosophies promising ultra-high-impact have a very bad track record that have often caused harm and those with similar philosophies ought to be viewed with suspicion"


Sorry if the above was a bit jumbled. It just seemed this post was very unlike your recent Good Judgement with Numbers post, which I clicked with a lot more. This one seems to be you actually going "all in" on quantitative reasoning, instead of rejecting the 'All or Nothing' Assumption. Perhaps it was the tone with which it was written, but it really didn't seem to engage with why people have an aversion to the over-quantification of moral reasoning.

Thanks for sharing your thoughts. I'll respond in turn to what I think are the two main parts of it, since as you said this post seems to be a combination of suffering-focused ethics and complex cluelessness.

On Suffering-focused Ethics: To be honest, I've never seen the intuitive pull of suffering-focused theories, especially since my read of your paragraphs here seems to tend towards a lexical view where the amount of suffering is the only thing that matters for moral consideration.[1] 

Such a moral view doesn't really make sense to me, to be honest, so I'm not particularly concerned by it, though of course everyone has different moral intuitions so YMMV.[2] Even if you're convinced of SFE though, the question is how best to reduce suffering, which runs into the cluelessness considerations you point out.

On complex cluelessness: On this side, I think you're right about a lot of things, but that's a good thing not a bad one!

  • I think you're right about the 'time of perils' assumption, but you really should increase your scepticism of any intervention which claims to have "lasting, positive effects over millennia", since we can't get feedback on the millennia-long impact of our interventions.
  • You are right that radical uncertainty is humbling, and it can be frustrating, but it is also the default state everyone is in, and there's no use beating yourself up over it.
  • You can only decide how to steer humanity toward a better future with the knowledge and tools that you have now. It could be something very small, and doesn't have to involve you spending hundreds of hours trying to solve the problems of cluelessness.

I'd argue that reckoning with the radical uncertainty should point towards moral humility and pluralism, but I would say that, since that's the perspective in my wheelhouse! I also hinted at such considerations in my last post about a Gradient-Descent approach to doing good, which might be a more cluelessness-friendly attitude to take.

  1. ^

    You seem to be asking, e.g., "will lowering existential risk increase the expected amount of future suffering?" rather than "will lowering existential risk increase the total amount of preferences satisfied/non-frustrated?"

  2. ^

    To clarify, this sentence refers specifically to lexical suffering views, not to all forms of SFE that are less strong in their formulation
