This is a special post for quick takes by Agrippa. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

Why do I keep meeting so many damned capabilities researchers and AI salespeople? 
I thought that we agreed capabilities research was really bad. I thought we agreed that increasing the amount of economic activity in capabilities was really bad. To me it seems like the single worst thing that I could even do! 

This really seems like a pretty consensus view among EA orthodoxy. So why do I keep meeting so many people who, as far as I can tell, are doing the single worst thing that it's even in their power to do? If there is any legal thing, other than sexual misconduct, that could get you kicked out of EA spaces, wouldn't it be this?

I'm not even talking about people who maintain that safety/alignment research requires advancing capabilities or might do so. I'm just talking about people who do regular OpenAI or OpenAI competitor shit. 

If you're supposed to be high status in EA for doing good, aren't you supposed to be low status if you do the exact opposite? It honestly makes me feel like I'm going insane. Do EA community norms really demand that I'm supposed to act like something is normal and okay even though we all seem to believe that it really isn't okay at all? 

And yes I think there is a strong argument for ostracization. It seems like you would ostracize somebody for being a nuclear arms industry lobbyist. This seems worse. It's not behaviorally clear that these people care about anything except mild fun and status incentives, so IDK why in the community we would at all align fun and status with doing the most evil thing you can do.

Of course it does seem like 80k is somewhat to blame here since they continue to promote regular-ass jobs at OpenAI on the jobs board, as far as I know. Not very clear to me why they do this.

For a lot of people, working on capabilities is the best way to gain skills before working on safety. And if, across your career, you spend half your effort on each goal, that is probably much better than not working on AI at all.

It would be nice to know more about how many EAs are pursuing this plan and how many end up working in safety. I don't have the sense that most of them get to the safety half. I also think it is reasonable to believe that no amount of safety research can prevent armageddon, because the outcome of the research may just be "this is not safe", as EY seems to report, and that result may have no impact (the capabilities researchers don't care, or the fact that we aren't safe yet means they feel they need to keep working in capabilities so that they can help with the safety problem). 

You seem frustrated that some EAs are working at leading AI labs, because you see that as accelerating AI timelines when we are not ready for advanced AI.

Here are some cruxes that might explain why working at leading AI labs might be a good thing:

Crux 1: We are uncertain of the outcomes of advanced AI

AI can be used to solve many problems, including e.g. poverty and health. It is plausible that we would be harming people who would benefit from this technology by delaying it. 

Also, advanced AI could accelerate space colonization, which can ultimately give you access to a vast amount of resources that we would otherwise not be able to physically reach because of the expansion of the universe. Under some worldviews (which I don't personally share), this is a large penalty to waiting.

Crux 2: Having people concerned about safety in leading AI labs is important to ensure a responsible deployment

If EAs systematically avoid working for top AI labs, they will be replaced by less safety-conscious staff. 

Safety-conscious researchers and engineers have done incredible work setting up safety teams at OpenAI and DeepMind. 

I expect they will also be helpful for coordinating a responsible deployment of advanced AI in the future.

Crux 3: Having a large lead might be helpful to avoid race dynamics

If multiple labs are on the brink of transformative AI, they will be incentivized to cut corners to be the first to cross the finish line. Having fewer leaders can help them coordinate and delay deployment.

Crux 4: There might not be much useful safety research to be done now

Plausibly, AI safety research will need some experimentation and knowledge of future AI paradigms. So there might just not be much you can do to address AI risk right now.


Overall I think crux 2 is very strong, and I lend some credence to crux 1 and crux 3. I don't feel very moved by crux 4 - I think it's too early to give up on current safety research, even if only because the current DL paradigm might scale to TAI already.

In any case, I am enormously glad to have safety-conscious researchers in DM and OpenAI. I think ostracizing them would be a huge error.

I agree "having people on the inside" seems useful. At the same time, it's  hard for me to imagine what an "aligned" researcher could have done at the Manhattan Project to lower nuclear risk. That's not meant as a total dismissal, it's just not very clear to me.

> Safety-conscious researchers and engineers have done incredible work setting up safety teams at OpenAI and DeepMind. 

I don't know much about what successes here have looked like, but I agree this is a relevant and important case study.

> I think ostracizing them would be a huge error.

My other comments better reflect my current feelings here.

You know in some sense I see EA as a support group for crazies. Normie reality involves accepting a lot of things as OK that are not OK. If you care a lot in any visceral sense about x risk, or animal welfare, then you are in for a lot of psychic difficulty coping with the world around you. Hell, even just caring about the shit that isn't remotely weird, like effective poverty interventions, is enough to cause psychic damage trying to cope with the way that your entire environment claims to care about helping people and behaviorally just doesn't.

So when I see the same patterns and norms applied to capabilities research that, outside of EA, just get applied to everything ("oh you work in gain of function? that sounds neat"), it gives me the jeebs. 

This doesn't invalidate the kind of math @richard_ngo is doing, a la "well if we get 1 safety researcher for each 5 capabilities researchers we tolerate/enable, that seems worth it". But I would like less jeebs. 

Is ostracization strategically workable? It seems like the safety community is much smaller than the capabilities community, and so ostracization (except of the most reckless capabilities researchers) could lead to capabilities people reacting in such a way that net turns people away from alignment work, or otherwise hurts the long-term strategic picture.

As a recent counterpoint to some collaborationist messages: https://forum.effectivealtruism.org/posts/KoWW2cc6HezbeDmYE/greg_colbourn-s-shortform?commentId=Cus6idrdtH548XSKZ

"It was disappointing to see that in this recent report by CSET, the default (mainstream) assumption that continued progress in AI capabilities is important was never questioned. Indeed, AI alignment/safety/x-risk is not mentioned once, and all the policy recommendations are to do with accelerating/maintaining the growth of AI capabilities! This coming from an org that OpenPhil has given over $50M to set up."

I'm comfortable publicly criticising big orgs (I feel that I am independent enough for this), but would be less comfortable publicly criticising individual researchers (I'd be more inclined to try and persuade them to change course toward alignment; I have been trying to sow some seeds in this regard recently with some people keen on creating AGI that I've met).

yeah this is really alarming and aligns with my least charitable interpretation of my feelings / data.

it would help if i had a better picture of the size of the EA -> capabilities pipeline relative to the not-EA -> capabilities pipeline.

to this point, why don't we take the opposite strategy? [even more] celebration of capabilities research and researchers. this would probably do a lot to ingratiate us. 

> It seems like the safety community is much smaller than the capabilities community

my model is that EAs are the coolest and smartest people in the world and that status among them matters to people. so this argument seems weird to me for the same reason that it would be weird if you argued that young earth creationists shouldn't be low status in the community since there are so many of them. 

i mean there seems to be a very considerable EA to capabilities pipeline, even.

i mean if i understand your argument, it can just be applied to anything. shitheads are in the global majority on like any dimension. 

EAs may be the smartest people in your or my social circle, but they are likely not the smartest people in the social circles of top ML people, for better or for worse. I suspect "coolest" is less well-defined and less commonly shared as a concept, as well.

yes i don't actually think that EAs are the highest status group in the world. my point here is that local status among EAs does matter to people; absolute numbers of "people in the world who agree with x" seems like a consideration that can be completely misleading in many cases. an implicit theory of change probably needs to be quite focused on local status.

i mean there's a compelling argument that i'm vegan due to social pressure from the world's smartest and coolest people. i want the smartest and coolest people in the world to like me and being vegan sure seems to matter there. i don't buy an argument that the smartest and coolest people in the world should do less to align status among them with animal welfare. they seem to be quite locally effective at persuading people. 

like if you think about the people you personally know, who seem to influence people around them (including yourself) to be much more ethical, i would be quite surprised to learn that hugbox norms got them there.

To me the core tension here is: even if in a direct impact sense pure capabilities work is one of the most harmful things you can do (something which I feel fairly uncertain about), it's still also one of the most valuable things you can do, in an upskilling sense. So at least until the point where it's (ballpark) as effective and accessible to upskill in alignment by doing alignment directly rather than by doing capabilities, I think current charitability norms are better than the ostracism norms you propose. (And even after that point, charitability may still be better for talent acquisition, although the tradeoffs are more salient.)

I think this might be reasonable under a charitability-vs-ostracism dichotomy.

However I think we can probably do better. I run a crypto venture group and we take "founders pledge" type stuff very seriously. We want to make strong, specific commitments before it's time to act on them (specifically, all upside past $2M post-tax for any member has to go towards EA crap).

Furthermore, when we talk to people, we don't really expect them (normatively speaking) to think we are aligned unless we emphasize these commitments. I would say we actively push the norm that we shouldn't receive charitability without a track record. 

I would really advocate for the same thing here, if anything it seems of greater importance.

That's not to say it's obvious what these commitments should be, since it's more straightforward when the goal is making money. 

My real point is that in normie land, charitability vs ostracism is the dichotomy. But I think in many cases EA already achieves more nuance: the norms demand proof of altruism in order to cash in on status.

Does that make sense? I think charitability is too strong of a norm and makes it too easy to be evil. I don't even apply it to myself! Even if there are good reasons to do things that are indistinguishable from just being bad, that doesn't mean everyone should just get the benefit of the doubt. I do think that specific pledges matter. The threat of conditional shunning matters.

I can only see this backfiring and pushing people further away.

So much for open exchange of ideas

This is a very good point. I think the current sentiment comes from two sides:

  • Not wanting to alienate or make enemies of AI researchers, because alienating them from safety work would be even more catastrophic (this is a good reason)
  • Being intellectually fascinated by AI, and finding it really cool in a nerdy way (this is a bad reason, and I remember someone remarking that Bostrom's book might have been hugely net-negative because it made many people more interested in AGI)

I agree that the current level of disincentives for working on capabilities is too low, and I resolve to tell AI capabilities people that I think their work is very harmful, while staying cordial with them.

I also basically feel like the norm is that I can't even begin to have these conversations bc it would violate charity norms. 

I don't think charity norms are good for talking to gain of function researchers or nuclear arms industry lobbyists. Like there are definitely groups of people that, if you just apply charity to them, you're gonna thoughtkill yourself, because they are actually doing bad shit for no good reason. 

I don't wanna be in an environment where I meet gain of function researchers at parties and have to act like they don't scare the shit out of me. 

Maybe I'm just off here about the consensus and nobody cares about what I understand to be the Yudkowsky line. In which case I'd have to ask why people think it's cool to do capabilities work without even a putative safety payoff. IDK I'd just expect at least some social controversy over this crap lol. 

like if at least 20% of the community thinks mundane capabilities work is actually really terrible (and at least 20% does seem to think this, to me), you would think that there would be pretty live debate over the topic? seems pressing and relevant? 

maybe the phrase i'm looking for is "missing moods" or something. it would be one thing if there was a big fight, everyone drew lines in the sand, and then agreed to get along. but nothing like that happened, i just talked to somebody tonight about their work selling AI and basically got a shrug in response to any ethical questions. so i'm going crazy.

I, for one, am really glad you raised this.

It seems plausible that some people caught the “AI is cool” bug along with the “EA is cool and nice and well-resourced” bug, and want to work on whatever they can that is AI-related. A justification like “I’ll go work on safety eventually” could be sincere or not.

Charity norms can swing much too far.

I’d be glad to see more 80k and forum talk about AI careers that points to the concerns here.

And I’d be glad to endorse more people doing what Richard mentioned — telling capabilities people that they think their work could be harmful while still being respectful.

Well, Holden says in his Appendix to his last post:

 

I don't get it. https://www.lesswrong.com/posts/N6vZEnCn6A95Xn39p/are-we-in-an-ai-overhang?commentId=o58cMKKjGp87dzTgx 

I won't associate with people doing serious capabilities research.

https://www.openphilanthropy.org/focus/global-catastrophic-risks/potential-risks-advanced-artificial-intelligence/openai-general-support 

To me at this point the expected impact of the EA phenomenon as a whole is negative. Hope we can right this ship, but things really seem off the rails.

Eliezer said something similar, and he seems similarly upset about it: https://twitter.com/ESYudkowsky/status/1446562238848847877

(FWIW I am also upset about it, I just don't know that I have anything constructive to say)

Eliezer's tweet is about the founding of OpenAI, whereas Agrippa's comment is about a 2017 grant to OpenAI (OpenAI was founded in 2015, so this was not a founding grant). It seems like to argue that Open Phil's grant was net negative (and so strongly net negative as to swamp other EA movement efforts), one would have to compare OpenAI's work in a counterfactual world where it never got the extra $30 million in 2017 (and Holden never joined the board) with the actual world in which those things happened. That seems a lot harder to argue for than what Eliezer is claiming (Eliezer only has to compare a world where OpenAI didn't exist vs the actual world where it does exist).

Personally, I agree with Eliezer that the founding of OpenAI was a terrible idea, but I am pretty uncertain about whether Open Phil's grant was a good or bad idea. Given that OpenAI had already disrupted the "nascent spirit of cooperation" that Eliezer mentions and was going to do things, it seems plausible that buying a board seat for someone with quite a bit of understanding of AI risk is a good idea (though I can also see many reasons it could be a bad idea).

One can also argue that EA memes re AI risk led to the creation of OpenAI, and that therefore EA is net negative (see here for details). But if this is the argument Agrippa wants to make, then I am confused why they decided to link to the 2017 grant.

Has Holden written any updates on outcomes associated with the grant? 

> One can also argue that EA memes re AI risk led to the creation of OpenAI, and that therefore EA is net negative (see here for details). But if this is the argument Agrippa wants to make, then I am confused why they decided to link to the 2017 grant.

I am not making this argument but certainly I am alluding to it. EA strategy (weighted by impact) has been to do things that in actuality accelerate timelines, and even cooperate with doing so under the "have a good person standing nearby" theory.

I don't think that lobbying against OpenAI, or other adversarial action, would have been that hard. But Open Phil and other EA leadership of the time decided to ally and hope for the best instead. This seems off the rails to me.

> Has Holden written any updates on outcomes associated with the grant?

Not to my knowledge.

> I don't think that lobbying against OpenAI, or other adversarial action, would have been that hard.

It seems like once OpenAI was created and had disrupted the "nascent spirit of cooperation", even if OpenAI went away (like, the company and all its employees magically disappeared), the culture/people's orientation to AI stuff ("which monkey gets the poison banana" etc.) wouldn't have been reversible. So I don't know if there was anything Open Phil could have done to OpenAI in 2017 to meaningfully change the situation in 2022 (other than like, slowing AI timelines by a bit). Or maybe you mean some more complicated plan like 'adversarial action against OpenAI and any other AI labs that spring up later, and try to bring back the old spirit of cooperation, and get all the top people into DeepMind instead of spreading out among different labs'.

I don't mean to say anything pro DeepMind and I'm not sure there is anything positive to say re: DeepMind.

I think that once the nascent spirit of cooperation is destroyed, you can indeed take the adversarial route. It's not hard to imagine successful lobbying efforts that lead to regulation, among other things known to slow progress and hinder organizations -- most people are in fact skeptical of tech giants wielding tons of power using AI! It is beyond me why such things are so rarely discussed or considered. I'm sure that Open Phil's and 80k's open cooperation with OpenAI has played a big part in shaping the narrative away from this kind of thing.

This post includes some great follow-up questions for the future. Has anything been posted re: these follow-up questions?

[x-post from a comment]

You know in some sense I see EA as a support group for crazies. Normie reality involves accepting a lot of things as OK that are not OK. If you care a lot in any visceral sense about x risk, or animal welfare, then you are in for a lot of psychic difficulty coping with the world around you. Hell, even just caring about the shit that isn't remotely weird, like effective poverty interventions, is enough to cause psychic damage trying to cope with the way that your entire environment claims to care about helping people and behaviorally just doesn't.

So when I see the same patterns and norms applied to capabilities research that, outside of EA, just get applied to everything ("oh you work in gain of function? that sounds neat"), it gives me the jeebs. 

This doesn't invalidate the kind of math @richard_ngo is doing, a la "well if we get 1 safety researcher for each 5 capabilities researchers we tolerate/enable, that seems worth it". But I would like less jeebs. 

[original comment: https://forum.effectivealtruism.org/posts/qjsWZJWcvj3ug5Xja/agrippa-s-shortform?commentId=bgf3BJZEyYik9gCti]

As far as I can tell liberal nonviolence is a very popular norm in EA. At the same time I really cannot think of anything more mortally violent I could do than to build a doomsday machine. Even if my doomsday machine is actually a 10%-chance-of-doomsday machine or 1% or etcetera (nobody even thinks it's lower than that). How come this norm isn't kicking in? How close to completion does the 10%-chance-of-doomsday machine have to be before gentle kindness is not the prescribed reaction? 

My favorite thing about EA has always been the norm that in order to get cred for being altruistic, you actually are supposed to have helped people. This is a great property: just align incentives. But now re: OpenAI I so often hear people say that gentle kindness is the only way; if you are openly adversarial, then they will just do the opposite of what you want even more. So much for aligning incentives.
