Pronouns: she/her or they/them.
I got interested in effective altruism back before it was called effective altruism, back before Giving What We Can had a website. Later on, I got involved in my university EA group and helped run it for a few years. Now I’m trying to figure out where effective altruism can fit into my life these days and what it means to me.
I write on Substack, and used to write on Medium.
There's an expert consensus that tobacco is harmful, and there is a well-documented history of tobacco companies engaging in shady tactics. There is also a well-documented history of government propaganda being misleading and deceptive, and if you asked anyone with relevant expertise — historians, political scientists, media experts, whoever — they would certainly tell you that government propaganda is not reliable.
But just lumping in "AI accelerationist companies" with that is not justified. "AI accelerationist" just means anyone who works on making AI systems more capable and doesn't agree with the AI alignment/AI safety community's peculiar worldview. In practice, that means you're saying most people with expertise in AI are compromised and not worth listening to, but you are willing to listen to this weird random group of people, some of whom, like Yudkowsky, have no technical expertise in contemporary AI paradigms (i.e. deep learning and deep reinforcement learning). This seems like a recipe for disaster, like deciding that capitalist economists are all corrupt and that only Marxist philosophers are worth trusting.
A problem with motivated reasoning arguments, when stretched to this extent, is that anyone can accuse anyone else on the thinnest pretext. Rather than engaging with people's views and arguments in any serious, substantive way, the discussion just turns into a lot of finger-pointing.
Yudkowsky's gotten paid millions of dollars to prophesy AI doom. Many people have argued that AI safety/AI alignment narratives benefit the AI companies and their investors. The argument goes like this: Exaggerating the risks of AI exaggerates AI's capabilities. Exaggerating AI's capabilities makes the prospective financial value of AI much higher than it really is. Therefore, talking about AI risk or even AI doom is good business.
I would add that exaggerating risk may be a particularly effective way to exaggerate AI's capabilities. People tend to be skeptical of anything that sounds like pie-in-the-sky hope or optimism. On the other hand, talking about risk sounds serious and intelligent. Notice what goes unsaid: many near-term AGI believers think there's a high chance of some unbelievably amazing utopia just on the horizon. How many times have you heard someone imagine that utopia? One? Zero? And how many times have you heard various AI doom or disempowerment stories? Why would no one ever bring up this amazing utopia they think might happen very soon?
Even if you're very pessimistic and think there's a 90% chance of AI doom, a 10% chance of utopia is still pretty damn interesting. And many people are much more optimistic, thinking there's around a 1-30% chance of doom, which implies a 70%+ chance of utopia. So, what gives? Where's the utopia talk? Even when people talk about the utopian elements of AGI futures, they emphasize the worrying parts: if intelligent machines produce effectively unlimited wealth, how will we organize the economy? What policies will we need to implement? How will people cope? We need to start worrying about this now! When I think about what would happen if I won the lottery, my mind does not go to worrying about the downsides.
I think the overwhelming majority of people who express views on this topic are true believers. I think they are sincere. I would only be willing to accuse someone of possibly doing something underhanded if, independently, they had a track record of deceptive behaviour. (Sam Altman has such a track record, and generally I don't believe anything he says anymore. I have no way of knowing what's sincere, what's a lie, and what's something he's convinced himself of because it suits him to believe it.) I think the specific accusation that AI safety/AI alignment is a deliberate, conscious lie cooked up to juice AI investment is silly. It's probably true, though, that people at AI companies have some counterintuitive incentive or bias toward talking up AI doom fears.
However, my general point is that just as it's silly to accuse AI safety/alignment people of being shills for AI companies, it also seems silly to me to say that AI companies (or "AI accelerationist" companies, which is effectively all major AI companies and almost all startups) are the equivalent of tobacco companies, and you shouldn't pay attention to what people at AI companies say about AI. Motivated reasoning accusations made on thin grounds can put you into a deluded bubble (e.g. becoming a Marxist) and I don't think AI is some clear-cut, exceptional case like tobacco or state propaganda where obviously you should ignore the message.
There's a fine line between steelmanning people's views and creating new views that are facially similar to those views but are crucially different from the views those people actually hold. I think what you're describing is not steelmanning, but developing your own views different from Yudkowsky and Soares' — views that they would almost certainly disagree with in strong terms.
I think it would be constructive for you to publish the views you developed after reading Yudkowsky and Soares' book. People might find that useful to read. That could give people something interesting to engage with. But if you write that Yudkowsky and Soares' claim about alien preferences is wrong, many people will disagree with you (including Yudkowsky and Soares, if they read it). So, it's important to get very clear on what different people in a discussion are saying and what they're not saying. Just to keep everything straight, at least.
I agree the alien preferences thing is not necessarily a crux of AI doom arguments more generally, but it is certainly a crux of Yudkowsky and Soares' overall AI doom argument specifically. Yes, you can change their overall argument into some other argument that doesn't depend on the alien preferences thing anymore, but then that's no longer their argument, that's a different argument.
I agree that Yudkowsky and Soares (and their book) are not fully representative of the AI safety community's views, and probably no single text or person (or pair of people) are. I agree that it isn't really reasonable to say that if you can refute Yudkowsky and Soares (or their book), you refute the AI safety community's views overall. So, I agree with that critique.
So, if the best version of Yudkowsky and Soares' argument is not the one made in their book, what is the best version? Can you explain how that version of the argument, which they made previously elsewhere, is different from the version in the book?
I can't tell if you're saying:
a) that the alien preferences thing is not a crux of Yudkowsky and Soares' overall argument for AI doom (it seems like it is) or if
b) the version of the specific argument about alien preferences they gave in the book isn't as good as previous versions they've given (which is why I asked what version is better) or if
c) you're saying that Yudkowsky and Soares' book overall isn't as good as their previous writings on AI alignment.
I don't know that academic reviewers of Yudkowsky and Soares' argument would take a different approach. The book is supposed to be the most up-to-date version of the argument, and one the authors took a lot of care in formulating. It doesn't feel intuitive to go back and look at their earlier writings and compare different versions of the argument, which aren't obviously different at first glance. (Will MacAskill and Clara Collier both complained the book wasn't sufficiently different from previous formulations of the argument, i.e. wasn't updated enough in light of advancements in deep learning and deep reinforcement learning over the last decade.) I think an academic reviewer might just trust that Yudkowsky and Soares' book is going to be the best thing to read and respond to if they want to engage with their argument.
You might, as an academic, engage in a really close reading of many versions of a similar argument made by Aristotle in different texts, if you're a scholar of Aristotle, but this level of deep textual analysis doesn't typically apply to contemporary works by lesser-known writers outside academia.
The academic philosopher David Thorstad is writing a blog series in response to the book. I haven't read it yet, so I don't know whether he draws on Yudkowsky and Soares' writings other than the book itself. However, I think it would be perfectly fine for him to just focus on the book, and not seek out other texts from the same authors that make the same argument in maybe a better form.
If what you're saying is that there are multiple independent (and mutually incompatible) arguments for the AI safety community's core claims, including ones that Yudkowsky and Soares don't make, then I agree with that. I agree you can criticize that sentence in the Mechanize co-founders' essay if you believe Yudkowsky's views and arguments don't actually unify (or adequately represent) the views and arguments of the AI safety community overall. Maybe you could point out what those other arguments are and who has formulated them best. Maybe the Mechanize co-founders would write a follow-up piece engaging with those non-Yudkowsky arguments as well, to engage more fully with the AI safety community's worldview.
I think the claim that Yudkowsky's views on AI risk are meaningfully influenced by money is very weak.
To be clear, I agree. I also agree with your general point that other factors are often more important than money. Some of these factors include the allure of millennialism, or the allure of any sort of totalizing worldview or "ideology".
I was trying to make a general point against accusations of motivated reasoning related to money, at least in this context. If two sets of people are each getting paid to work on opposite sides of an issue, why only accuse one side of motivated reasoning?
This is indicated by the hundreds of comments, tweets, in-person arguments, DMs, and posts from at least 2023 onward in which I expressed skepticism about AI risk arguments and AI pause proposals.
Thanks for describing this history. Evidence of a similar kind lends strong credence to the idea that Yudkowsky, too, formed his views independent of the influence of money.
My general view is that reasoning is complex, motivation is complex, people's real psychology is complex, and that the forum habit of accusing someone of exhibiting bias X is probably a misguided pop-science simplification of the relevant scientific knowledge. For instance, when people engage in distorted thinking, the actual underlying reasoning often seems to be a surprisingly complicated multi-step sequence.
The essay above that you co-wrote is incredibly strong. I was the one who originally sent it to Vasco and, since he is a prolific cross-poster and I don't like to cross-post under my name, encouraged him to cross-post it. I'm glad more people in the EA community have now read it. I think everyone in the EA community should read it. It's regrettable that there's only been one object-level comment on the substance of the essay so far, and so many comments about this (to me) relatively uninteresting and unimportant side point about money biasing people's beliefs. I hope more people will comment on the substance of the essay at some point.
I think part of where the angsty energy comes from is that Yudkowsky and Soares are incredibly brazen and insulting when they express their views on AI. For instance, Yudkowsky recently said that people with AGI timelines longer than 30 years are no "smarter than a potted plant". Yudkowsky has publicly said, on at least two occasions, that he believes he's the smartest person in the world — at least on AI safety and maybe just in general — and there's no second place that's particularly close. Yudkowsky routinely expresses withering contempt, even for people who are generally "on his side" and trying to be helpful. It's really hard to engage with this style of "debate" (as it were) and not feel incredibly pissed off.
When I was running an EA university group, if anyone had behaved like Yudkowsky routinely behaves, they would have been banned from the group, and I'm sure the members of my group would have unanimously agreed the behaviour was unacceptable. The same applies to any other in-person group, community, or social circle I've been a part of. It would scarcely be more acceptable than a man in an EA group repeatedly telling the women he just met there how hot they are. People generally don't tolerate this kind of thing. I think many people would prefer not to reward this behaviour with attention, but given that Yudkowsky (and Soares) have already successfully gotten a lot of attention, it's necessary to write replies like this essay (the one above, by the Mechanize co-founders).
Privately, some people in the LessWrong community, where Yudkowsky is deeply revered, have said they find Yudkowsky's style of engagement unpleasant and regrettable (in stronger words than that). Some have said it publicly. (Soares, too, has been publicly criticized for his demeanor toward people who are "on his side" and trying to be helpful, let alone people he disagrees with, or thinks he does.)
I think it's close to impossible not to feel angsty when engaging with Yudkowsky (and Soares), unless you happen to be one of those people who revere him and treat him as a role model (or, I don't know, you're a Zen Buddhist master). I agree that it's regrettable for the debates to become as heated as they often get. I agree it would be more interesting to have intellectual discussions based in civility, mutual respect, curiosity about the other person's opinion, intellectual generosity, and so on. But if someone isn't willing to play ball, I think you've gotta either just ignore them, bite your tongue and be artificially polite (in which case some amount of angst will probably still be revealed), or write angry refutations.
I agree with Ben Stewart's response that this is not a helpful thing to say. You are making some very strange and unintuitive claims. I can't imagine how you would persuade a reasonable, skeptical, well-informed person outside the EA/LessWrong (or adjacent) bubble that these are credible claims, let alone that they are true. (Even within the EA Forum bubble, it seems like significantly more people disagree with you than agree.)
To pick on just one aspect of this claim: it is my understanding that Yudkowsky has no meaningful technical proficiency with deep learning-based or deep reinforcement learning-based AI systems. In my understanding, Yudkowsky lacks the necessary skills and knowledge to perform the role of an entry-level AI capabilities researcher or engineer at any AI company capable of paying multi-million-dollar salaries. If there is evidence that shows my understanding is mistaken, I would like to see that evidence. Otherwise, I can only conclude that you are mistaken.
I think the claim that an endorsement is worth billions of dollars is also wrong, but it's hard to disprove a claim about what would happen in the event of a strange and unlikely hypothetical. Yudkowsky, Soares, and MIRI have an outsized intellectual influence in the EA community (and obviously on LessWrong). There is some meaningful level of influence on the community of people working in the AI industry in the Bay Area, but it's much less. Among the sort of people who could make decisions that would realize billions or tens of billions in value, namely the top-level executives at AI companies and investors, the influence seems pretty marginal. I would guess the overwhelming majority of investors either don't know who Yudkowsky and Soares are or do but don't care what their views are. Top-level executives do know who Yudkowsky is, but in every instance I've seen, they tend to be politely disdainful or dismissive toward his views on AGI and AI safety.
Anyway, this seems like a regrettably unproductive and unimportant tangent.
Help me understand what you're saying here. Are you saying that Yudkowsky and Soares' argument is just so obviously wrong that it's almost uninteresting to discuss why it's wrong? That you find the Mechanize co-founders' refutation of the Yudkowsky and Soares argument disappointing because you found that argument so weak to begin with?
If so, I'm not saying that's a wrong view — not at all. But it's worth noting how controversial that view is in the EA community (and other communities that talk a lot about AGI). Essays like this need to be written because so many people in this community (and others) believe Yudkowsky and Soares' argument is correct. If my impression of the EA community is off base and there actually is a community consensus that Yudkowsky and Soares' argument is wrong, then more people should talk about this, because otherwise it's really easy to get the wrong impression.
I think it's also worth discussing what happens if AGI turns out to have generally human-like motivations and psychology: what dangers might it pose? How would it behave? But not every relevant and worthy question can be addressed in a single essay.
We should generally be skeptical of corporations (or even non-profits!) releasing pre-prints that look like scientific papers but might not pass peer review at a scientific journal. We should indeed view such pre-prints as somewhere between research and marketing. OpenAI's pre-prints or white papers are a good example.
I think it's hard to claim that a pre-print like Sparks of AGI is insincere (it might be, but how could we support that claim?), but this doesn't undermine the general point. Suppose employees at Microsoft Research wanted to publish a similar report arguing that GPT-4's seeming cognitive capabilities are actually just a bunch of cheap tricks and not sparks of anything. Would Microsoft publish that report? It's not just about how financial or job-related incentives shape what you believe (although that is worth thinking about), it's also about how they shape what you can say out loud. (And, importantly, what you are encouraged to focus on.)