AI safety advocates have for the most part taken an exclusively–and potentially excessively–friendly and cooperative approach toward AI firms–and especially OpenAI. I am as guilty of this as anyone[1]–but after the OpenAI disaster, it is irresponsible not to update on what happened and on the new situation. While it still makes sense to emphasize and prioritize friendliness and cooperation, it may be time to also adopt some of the lowercase “p” political advocacy tools used by other civil society organizations and groups.[2]

As far as is known publicly, the OpenAI disaster began with Sam Altman attempting to purge the board of serious AI safety advocates, and it ended with him successfully purging the board of serious AI safety advocates. (As well as with him gaining folk hero status for his actions, while AI safety advocacy was roundly defamed, humiliated, and shown to be powerless.) During the events in between, various stakeholders flexed their power to bring about this outcome.

  • Sam Altman began implementing plans to gut OpenAI, first privately and then with the aid of Microsoft.
  • Satya Nadella and Microsoft began implementing a plan to defund and gut OpenAI.
  • Employees–and especially those with an opportunity to cash out their equity–threatened to resign en masse and go to Microsoft–risking the existence of OpenAI.
  • Various prominent individuals used their platforms to exert pressure and spin up social media campaigns–especially on Twitter.
  • Social media campaigns on Twitter exercised populist power to severely criticize and, in some cases, harass the safety advocates on the board. (And AI safety advocacy and advocates more broadly.)

If there is reason to believe that AI will not be safe by default–and there is–AI safety advocates need the ability to exercise some influence over the actions of major actors–especially labs. We bet tens of millions of dollars and nearly a decade of ingratiating ourselves with OpenAI (and avoiding otherwise valuable actions for fear they might be seen as hostile) in the belief that board seats could provide this influence. We were wrong, and wrong in a way that blew up in our faces terribly.

It is important that we not overreact and alienate groups and people we need to be able to work with, but it is also important that we demonstrate that we are–like Sam, Satya, OpenAI’s employees, prominent individuals, social media campaigns, and most civil society groups everywhere–a constituency that has lowercase “p” power and a place at the negotiating table for major decisions. (You will note that these other parties are brought to the table not despite their willingness to exercise some amount of coercive power, but at least in part because of it.)

I believe the first move in implementing a more typical civil society advocacy approach is to push back in a measured way against OpenAI–or better yet Sam Altman. The comment section below might be a good location to brainstorm. 

Some tentative ideas:

  • An open letter–with prominent signatories, but also open for signatures from thousands of others–raising concerns about Sam’s well-documented scheming and deception to remove AI safety advocates from OpenAI’s board of directors, and about his betrayal of OpenAI, its mission, and his own alleged values[3] in his subsequent attempt to destroy the organization as a personal vendetta for being let go. This is the type of thing FLI is excellent at championing. I could imagine Scott Alexander successfully doing this as well.
  • A serious deep dive into the long list of allegations of misconduct by Sam at OpenAI and elsewhere, written up and published. Something like the recent Nonlinear post–but focused on Sam–would likely have far, far higher EV. Alternatively, a donor could hire a professional firm to do this. (If someone is interested in funding this but is low on time, please DM me; I’d be happy to manage such a project.)
  • Capital “P” political pressure. AI safety advocates might consider nudging various actors in government to subject OpenAI and Microsoft to more scrutiny. Given the existing distrust of OpenAI and similar firms in DC, it might not take much to do this. After OpenAI’s sudden and shocking purge of its oversight mechanism, it makes sense to bring in some new eyes that cannot be removed so easily by malfeasance.[4]
  • Interpersonal social pressure. Here is the list of OpenAI signatories who demanded the board step down–while threatening to gut and destroy OpenAI–without waiting to learn the reason for the board’s actions. If the CEO of ExxonMobil had an undisclosed conflict with the company’s internal environmental oversight board, and I had a friend who publicly threatened to resign if the environmental board was not fired–while not knowing the reason for the conflict–it would badly undermine my confidence in the morality of my friend. I know many people at OpenAI who signed this letter, and though it is awkward, I intend to have a gentle but probing conversation with each of them. Much like with advocacy, I don’t intend to push hard enough to harm our long-term relationship. Social pressure is one of the most powerful tools civil society actors can wield.

AI safety advocates are good at hugboxing. We should lean into this strength and continue to prioritize hugboxing. But we can’t only hugbox. Getting this right is too important for us to hide in our comfort zone while more skilled, more serious political actors take over the space and purge AI safety mechanisms and advocates.

  1. ^

    My job involves serving as a friendly face of AI safety. Accordingly, I am in a bad position to unilaterally take a strong public stand. I imagine many others are in a similar position. However, with social cover, I believe the amount of pressure we could exert on firms would snowball as more of us could deanonymize.

  2. ^

    Our interactions with firms are like iterated games. Having the ability and willingness to play tit for tat is likely necessary to secure–or re-secure–some amount of cooperation.

  3. ^

  4. ^

    This could also reestablish the value to firms of oversight boards with real authority and genuinely independent members. The genuine independence and commitment of AI safety advocates could again be seen as an asset and not just a liability.



Comments

I’d think very carefully before pursuing this. Sam is a very experienced political player, and two quite senior EAs just got outplayed. In many worlds, I expect our attempts to backfire. Don’t pursue this unless you have good reason to believe that you can compete on his level. Otherwise, I’d suggest picking easier fights first.

Also, some of these don’t really seem like gentle pushback; they’re actually rather aggressive. Maybe we should be aggressive, but if so, we should own it.

They got outplayed in the context of the internal politics of OpenAI, where there were A LOT of people with profit (Microsoft) or career (the employees) incentives to race ahead. But there seems to be an emerging public consensus in favor of more regulation, so I would expect that e.g. smart, ambitious politicians have quite different incentives. 

Thanks for this provocative and timely post. 

I agree that EAs have been far too friendly to AI companies, too eager to get hired within these companies as internal AI safety experts, too willing to give money to support their in-house safety work, and too wary of upsetting AI leaders and developers.

This has diluted our warnings about extinction risks from AI. I've noticed that on social media like X, ordinary folks get very confused about EA attitudes towards AI. If we really think AI is extraordinarily dangerous, why would we be working with AI companies to advance capabilities, safety-wash their advances, and serve as their PR props to convince the public that they're being cautious and responsible? 

If rapid AI development is really an extinction risk, and EAs want to minimize extinction risks, it's puzzling that we would see the AI industry as our allies rather than our enemies. 

We've talked a lot over the years about the benefits of 'engagement' with the AI industry, 'being in the room' when they make decisions, having insider tracks to monitor and nudge their safety policies, etc. But, as this post points out, the OpenAI debacle might mark the end of that era. The voices for AI safety at OpenAI were decisively pushed out, in favor of maximum-speed commercialization and AGI development.

So, I think EAs need a new strategy for AI safety that is more confrontational, more political, and savvier about the cynicism, greed, and power of the AI industry. My essay on moral stigmatization of AI outlined one possible path. There might be other viable strategies, such as those outlined in this post.

As I've said many times over the last year or so, it's time to stop playing nice with the AI industry. Especially since, following this recent OpenAI shakeup, they stopped playing nice with us.

“Something like the recent Nonlinear post–but focused on Sam–would likely have far, far higher EV.”

I felt really uncomfortable reading this.

Not to frame everything as a nail for my favorite hammer, but I would suggest people train themselves in conversational techniques (Deep Canvassing, Smart Politics, and Street Epistemology). I think classical argumentation is likely to have only very limited effects unless it is handled with extremely good rapport and over very long timespans.

Note that at least one person disagrees with me on this, but I think acting methodically is still better than doing so spontaneously.
