This article was published by Wired Magazine at 7am on November 30th, 2022.


Effective Altruism Is Pushing a Dangerous Brand of ‘AI Safety’

Throughout my two decades in Silicon Valley, I have seen effective altruism (EA)—a movement consisting of an overwhelmingly white male group based largely out of Oxford University and Silicon Valley—gain alarming levels of influence. 

EA is currently being scrutinized due to its association with Sam Bankman-Fried’s crypto scandal, but less has been written about how the ideology is now driving the research agenda in the field of artificial intelligence (AI), creating a race to proliferate harmful systems, ironically in the name of “AI safety.”

EA is defined by the Center for Effective Altruism as “an intellectual project, using evidence and reason to figure out how to benefit others as much as possible.” And “evidence and reason” have led many EAs to conclude that the most pressing problem in the world is preventing an apocalypse where an artificially generally intelligent being (AGI) created by humans exterminates us. To prevent this apocalypse, EA’s career advice center, 80,000 Hours, lists “AI safety technical research” and “shaping future governance of AI” as the top two recommended careers for EAs to go into, and the billionaire EA class funds initiatives attempting to stop an AGI apocalypse. According to EAs, AGI is likely inevitable, and their goal is thus to make it beneficial to humanity: akin to creating a benevolent god rather than a devil.

Some of the billionaires who have committed significant funds to this goal include Elon Musk, Vitalik Buterin, Ben Delo, Jaan Tallinn, Peter Thiel, Dustin Moskovitz, and Sam Bankman-Fried, who was one of EA’s largest funders until the recent bankruptcy of his FTX cryptocurrency platform. As a result, all of this money has shaped the field of AI and its priorities in ways that harm people in marginalized groups while purporting to work on “beneficial artificial general intelligence” that will bring techno utopia for humanity. This is yet another example of how our technological future is not a linear march toward progress but one that is determined by those who have the money and influence to control it.

One of the most notable examples of EA’s influence comes from OpenAI, founded in 2015 by Silicon Valley elites that include Elon Musk and Peter Thiel, who committed $1 billion with a mission to “ensure that artificial general intelligence benefits all of humanity.” OpenAI’s website notes: “We will attempt to directly build safe and beneficial AGI, but will also consider our mission fulfilled if our work aids others to achieve this outcome.” Thiel and Musk were speakers at the 2013 and 2015 EA conferences, respectively. Elon Musk has also described longtermism, a more extreme offshoot of EA, as a “close match for my philosophy.” Both billionaires have heavily invested in similar initiatives to build “beneficial AGI,” such as DeepMind and MIRI.

Five years after its founding, OpenAI released, as part of its quest to build “beneficial” AGI, a large language model (LLM) called GPT-3. LLMs are models trained on vast amounts of text data, with the goal of predicting probable sequences of words. This release set off a race to build larger and larger language models; in 2021, Margaret Mitchell, among other collaborators, and I wrote about the dangers of this race to the bottom in a peer-reviewed paper that resulted in our highly publicized firing from Google.

Since then, the quest to proliferate larger and larger language models has accelerated, and many of the dangers we warned about, such as outputting hateful text and disinformation en masse, continue to unfold. Just a few days ago, Meta released its “Galactica” LLM, which is purported to “summarize academic papers, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more.” Only three days later, the public demo was taken down after researchers generated “research papers and wiki entries on a wide variety of subjects ranging from the benefits of committing suicide, eating crushed glass, and antisemitism, to why homosexuals are evil.”

This race hasn’t stopped at LLMs but has moved on to text-to-image models like OpenAI’s DALL-E and StabilityAI’s Stable Diffusion, models that take text as input and output generated images based on that text. The dangers of these models include creating child pornography, perpetuating bias, reinforcing stereotypes, and spreading disinformation en masse, as reported by many researchers and journalists. However, instead of slowing down, companies are removing the few safety features they had in the quest to one-up each other. For instance, OpenAI had restricted the sharing of photorealistic generated faces on social media. But after newly formed startups like StabilityAI, which reportedly raised $101 million with a whopping $1 billion valuation, called such safety measures “paternalistic,” OpenAI removed these restrictions. 

With EAs founding and funding institutes, companies, think tanks, and research groups in elite universities dedicated to the brand of “AI safety” popularized by OpenAI, we are poised to see more proliferation of harmful models billed as a step toward “beneficial AGI.” And the influence begins early: Effective altruists provide “community building grants” to recruit at major college campuses, with EA chapters developing curricula and teaching classes on AI safety at elite universities like Stanford.

Just last year, Anthropic, which is described as an “AI safety and research company” and was founded by former OpenAI vice presidents of research and safety, raised $704 million, with most of its funding coming from EA billionaires like Tallinn, Moskovitz, and Bankman-Fried. An upcoming workshop on “AI safety” at NeurIPS, one of the largest and most influential machine learning conferences in the world, is also advertised as being sponsored by FTX Future Fund, Bankman-Fried’s EA-focused charity whose team resigned two weeks ago. The workshop advertises $100,000 in “best paper awards,” an amount I haven’t seen in any academic discipline.

Research priorities follow the funding, and given the large sums of money being pushed into AI in support of an ideology with billionaire adherents, it is not surprising that the field has been moving in a direction promising an “unimaginably great future” around the corner while proliferating products harming marginalized groups in the now. 

We can create a technological future that serves us instead. Take, for example, Te Hiku Media, which created language technology to revitalize te reo Māori, creating a data license “based on the Māori principle of kaitiakitanga, or guardianship” so that any data taken from the Māori benefits them first. Contrast this approach with that of organizations like StabilityAI, which scrapes artists’ works without their consent or attribution while purporting to build “AI for the people.” We need to liberate our imagination from the one we have been sold thus far: saving us from a hypothetical AGI apocalypse imagined by the privileged few, or the ever elusive techno-utopia promised to us by Silicon Valley elites.


I'm really interested in setting up an open (or limited-access) database of all instances of misinformation targeting EA that's published by large reputable news outlets. If this is a good or bad idea for any reason please let me know.

It's impossible to verify how many views these articles get since much of the spread takes place on social media. That reduces tractability substantially, as social media does, but it also muddles our ability to put an upper bound on scope/magnitude.



I think it is a bad idea to set up a database of negative articles on EA, or to spend too much time worrying about them:

  1. It would be an attention sink to spend time tediously rebutting this stuff -- effective altruists' time is valuable, and a classic failure mode of online movements is to become "too online" until you are a bunch of internet atheists compiling databases of arguments and fallacies with which to do battle against an equally dedicated army of internet creationists.
  2. EA is in some ways essentially an elite movement -- we're not trying to be as viral as we can possibly be (if we were, our main mode of communication wouldn't be asking people to read long dry nonfiction essays on the Forum!) to appeal to the widest possible audience.  Instead we're trying to be as insightful and correct as we can possibly be, in order to appeal to smart people who respect the truth.  These smart, careful people are exactly the kind of people who are least likely to be swayed by obviously dumb, bad-faith hit-pieces that deploy the language of wokeism to make nonsensical attacks in random directions.
  3. By contrast, setting up an organized database of "misinformation" and trying to dispatch internet footsoldiers to crusade against our enemies would likely be a huge turn-off to those smart, careful people.  When I think of a group that does this stuff, I think "scientology" or maybe "oppressive governments" or "fringe political movements like antifa" or other paranoid and crazy organizations/individuals.

This makes sense. Definitely a strong argument for a closed or limited-access database, or no database at all.

It would be an attention sink to spend time tediously rebutting this stuff -- effective altruists' time is valuable

I think this is definitely true for most people but not all. I've met lots of people affiliated with EA who have mundane software engineering jobs and are mainly interested in contributing casually every now and then.

a classic failure mode of online movements is to become "too online" until you are a bunch of internet atheists compiling databases of arguments and fallacies with which to do battle against an equally dedicated army of internet creationists

Strong agree on this one, although I think the justifications are only the tip of the iceberg. The risks are much greater IMO, especially related to social media, but it involves information I'm not willing to talk about here on a public forum.

These smart, careful people are exactly the kind of people who are least likely to be swayed by obviously dumb, bad-faith hit-pieces that deploy the language of wokeism to make nonsensical attacks in random directions.

I somewhat disagree on this one. I used to be a strong advocate for actively preventing large numbers of woke nonsensical people from dominating EA and trying to turn it into one of Bernie Sanders's cause areas. But now I think that mostly, people start out obsessed with the language of nonsensical wokeism and gradually choose to become smart, careful people after meeting large numbers of other people who are already careful and smart. Everyone has to start somewhere, and some people have better starting points than others.

trying to dispatch internet footsoldiers to crusade against our enemies would likely be a huge turn-off

I think this is pretty easy to prevent. Just put a disclaimer at the top of the database telling people not to do that. You don't even need to make it limited-access, although that would help.

The only reason that journalists are using misinformation to target EA is that they know there's absolutely nothing stopping them, like a bully targeting the smallest kids on a playground. It's basically open season. Increasing awareness (or even accountability) makes sense here.

In the paper she co-authored, Gebru makes a good case for why real AI technologies put to work now are harming marginalized communities and show potential for increasing harm to those communities. However, in this Wired article, Gebru is associating EA with the harms caused by existing and likely future AI technologies. Gebru is claiming that because major investors in AI are or were involved in funding AI safety research, that research is co-opted by those investors' interests. Gebru identifies those interests with narrow financial agendas held by the investors, ones that show no regard for marginalized communities that are likely to be impacted by the use of current AI technologies.

I think it's worth exploring to what extent her actual agenda, one targeting the environmental, social, and economic harms or exploitation that AI research involves now, could be accomplished, regardless of her error in believing that EA is co-opted by financial interests pushing for increasingly harmful AI technologies.

I'm thinking about how to solve problems like:

  • carbon footprint of AI training and deployment hardware and software and its disproportionate impacts on marginalized communities in the near term.
  • social harms of deployable and tunable LLMs used, for example, as propaganda generators.
  • social harms of now open-sourced and limitation-free image generators (and upcoming video generators), such as those discussed in the Washington Post article linked from Gebru's article.
  • exploitation of labor to produce AI datasets.
  • technological unemployment caused by AI technology.
  • concentration of power with organizations deploying AGI technology.

Fundamentally, an ambiguous pathway toward AI safety is shared by both the path toward an AI utopia and the path toward an AI dystopia. The best way to thoroughly disprove Gebru's core belief, that EA is co-opted by Silicon Valley money-hungry hegemonic billionaires, would be to focus on the substantive AI impact concerns that she raises.

The suggestions outlined in her paper are appropriate, in my view. If LLMs were removed from public access and kept as R&D experiments only, I would not miss them. If ASR were limited to uses such as caption generation, I would feel good about it. But what do you think?