Edit: Wow, it seems like a lot of people misconstrued this post as saying that we shouldn't criticize EAs who work on cutting-edge AI capabilities. I included some confusing wording in the original version of this piece and have crossed it out. To be utterly clear, I am talking about people who work on AI safety at large AI labs.
While I was at a party in the Bay Area during EAG, I overheard someone jokingly criticizing their friend for working at a large AI safety org. The reasoning went: since the org is increasing AI capabilities, anyone who works there is "selling out" and increasing x-risk.
Although this interaction was a (mostly) harmless joke, I think it reflects a concerning and possibly growing dynamic in the EA community, and my aim in writing this post is to nip it in the bud before it becomes a serious problem. While it is fine to criticize organizations in the EA community for actions that may cause harm, EAs should avoid scrutinizing other community members' personal career choices unless those individuals ask them for feedback. This is for a few reasons:
- People take jobs for a variety of reasons. For example, they might find the compensation packages at OpenAI and Anthropic appealing relative to those they could get at other AI safety organizations. They might also have altruistic reasons: they might sincerely believe that the organization they are working for has a good plan to reduce x-risk, or that their own work would be beneficial even if the org as a whole causes harm. If you don't know a person well, you don't have much visibility into which factors they're considering or how they're weighing them as they choose a job. Therefore, it is not your place to judge them. Rather than passing judgment, you can ask them why they decided to take a certain job and try to understand their motivations (cf. "Approach disagreements with curiosity").
- Relatedly, people don't respond well to unsolicited feedback. For instance, I have gotten a lot of unsolicited advice throughout my adult life, and I find it grating because it reflects a lack of understanding of my specific needs and circumstances. I do seek out advice, but only from people I trust, such as my advisor at 80,000 Hours. It is more polite to ask a person before giving them individual feedback, or to refrain from giving feedback unless they ask for it. You can also phrase advice more humbly, such as "doing X works well for me because Y" rather than "you should do X because Y" (cf. "Aim to explain, not persuade").
- Finally, pitting ~~"AI capabilities" people~~ people who work on safety at big AI labs against "true" AI safety people creates unnecessary division in the EA community. Different AI safety orgs have different strategies for ensuring AGI will be safe, and we don't know which ones will work. In the face of this uncertainty, I think we should be kind and cooperative toward everyone who is trying in good faith to reduce AI risk. In particular, while we can legitimately disagree with an AI org's strategy, we shouldn't pass judgment on individuals who work for those organizations or ostracize them from the community.
Setting aside the question of the actual impact of working at these companies, it seems to me that this post prioritizes the warmth and collegiality of the EA community over the effects our actions could have on the entire rest of the planet, in a way that makes me pretty nervous. If we're trying in good faith to do the most good, and someone takes a job we think is harmful, the question should be "how can I express my beliefs in a way that is likely to be heard, to find truth, and not to alienate the person?" rather than "is it polite to express these beliefs at all?" At least the first two reasons listed would also imply that we shouldn't criticize people in really obviously harmful jobs like cigarette advertising.
It also seems quite dangerous to avoid passing judgment on individuals within the EA community based on our impressions of their work, which, unless I'm missing something, is what this post implies we should do. Saying we should "be kind and cooperative toward everyone who is trying in good faith to reduce AI risk" rather misses the point, because much of the evidence that they are "trying in good faith" comes from our observations of their actions. And if it seems to me that someone's actions make the world worse, the obvious next step is to see what happens when they're presented with an argument to that effect. If their responses make sense to me, they're more likely to be acting in good faith. If they don't, that's a significant red flag that they're not trustworthy, regardless of their inner motivations: either factors besides the social impact of their actions are dominating their decisions, or their judgment is bad; either way, it's hard to trust them. I don't get this information just by asking open-ended questions; I get it by telling them what I think, in a polite and safe-feeling way.
I think the norms proposed in this post would have resulted in people not passing judgment on the individuals working at FTX, which in turn would have led to trusting those individuals and the institution they ran. (Indeed, I'm confused by the post's separation between criticizing the decisions and strategies of institutions and criticizing the individuals who make those decisions and choose to further those strategies.) If people had suspicions that FTX was committing fraud or otherwise acting unethically, confronting individuals at FTX with those suspicions -- and forming judgments of the individuals and of FTX -- could have been incredibly valuable.
Weaving these points together: if you think leading AGI labs are acting recklessly, telling this to individuals who work at these labs (in a socially competent way) and critically evaluating their responses seems like a very important thing to do. Preserving a norm of non-criticism also denies these people the information that (1) you think their actions are net-negative and (2) you and others might be forming judgments of them in light of this. If they are acting in good faith, it seems extremely important that they have this information -- worth the risk of an awkward conversation or hurt feelings, both of which are mitigable with social skills.
(I realize it would be hypocritical for me not to say this, so I'll add: if you're working on capabilities at an AGI lab, I do think you're probably making us less safe, and you could do a lot of good by switching to, well, nearly anything else, but especially safety research.)