Yale EA organizer and AI safety researcher.


Pragmatic AI Safety



What actions most effective if you care about reproductive rights in America?

I wasn't intending to single out you or any specific person when asking that question. My point was more that the community overall seems to have collectively responded differently (judging by up/downvotes). Because different people see different posts, it's hardly a controlled experiment, so it could have been just chance who happened to see the post first and form a first impression.

What actions most effective if you care about reproductive rights in America?

I notice a similarity to this post.

Somebody writes about an issue that happens to be a popular mainstream cause and asks, "how can I be most effective at doing good, given that I want to work specifically on this cause?"

I'm not saying the two issues are remotely equivalent. Obviously, to argue "this should be an EA cause area" would require very different arguments, and one might be much stronger than the other. With Ukraine, maybe you could justify it as being adjacent to nuclear risk, but the post wasn't talking about nuclear risk. Maybe close to being about preventing great power conflict, but the post wasn't talking about that, either. So, like this post, it is outside of the "standard" EA cause areas.

This comment seems to imply that if somebody is posting about a cause that isn't within the "standard" cause areas, then they need to justify why working on it would be better than working on other cause areas. They cannot "leave that exercise to the reader." The first paragraph of this comment makes a meta-level point that suggests people shouldn't even post about an issue and let readers debate it in the comments (which, in fairness, is not what the author of this post did, when explicitly asking for it not to be debated in the comments after this comment was written). Instead, the author themselves must make a case for the object-level merits of the cause.

It seems others might agree, given that this comment has more karma than the original post (edit: this may or may not be currently true, but it was true at the time of this comment). If people on the forum have these beliefs about meta-level discussion norms, then I ask: why apply them to abortion and not Ukraine?

I strongly suspect that the answer is that people are letting their object-level opinions of issues subtly motivate their meta-level opinions of discussion norms. I'd rather that not happen.

20 Critiques of AI Safety That I Found on Twitter

I think it's essential to ask some questions first:

  • Why do people hold these views? (Is it just their personality, or did somebody in this community do something wrong?)
  • Is there any truth to these views? (As can be seen here, anti-AI safety views are quite varied. For example, many are attacks on the communities that care about them rather than the object-level issues.)
  • Does it even matter what these particular people think? (If not, then leave them be.)

Only then should one even consider engaging in outreach or efforts to improve optics.

Ways money can make things worse

Wanted to make a very small comment on a very small part of this post.

An assistant professor in AI wants to have several PhDs funded. Hearing about the abundance of funding for AI safety research, he drafts a grant proposal arguing why the research topic his group would be working on anyway helps not only with AI capabilities, but also with AI alignment. In the process he convinces himself this is the case, and as a next step convinces some of his students.

Yes, this certainly might be an issue! This particular issue can be mitigated by having funders do lots of grant followups to make sure that differential progress in safety, rather than capabilities, is achieved.

X-Risk Analysis by Dan Hendrycks and Mantas Mazeika provides a good roadmap for doing this. There are also some details in this post (edit since my connection may not have been obvious: I work with Dan and I'm an author of the second post).

Dialectic of Enlightenment

Curious why people are downvoting this? If it's some substantive criticism of the work I'd be interested in hearing it.

If it's just because it's not very thought through, then what do you think the "not front page" function of the forum is for? (This might sound accusatory but I mean it genuinely).

One of the reasons I posted was because I wanted to hear thoughts/criticisms of the work overall, since I felt I didn't have a good context. Or maybe to find somebody who knew it better. But downvotes don't help with this.

The totalitarian implications of Effective Altruism

This reminds me of Adorno and Horkheimer's The Dialectic of Enlightenment, which argues, for some of the same reasons you do, that "Enlightenment is totalitarian." A piece that feels particularly related:

For the Enlightenment, whatever does not conform to the rule of computation and utility is suspect.

They would probably say "alienation" rather than "externalization," but have some of the same criticisms.

(I don't endorse the Frankfurt School or critical theory. I just wanted to note the similarities.)

One thing to consider is moral and epistemic uncertainty. The EA community already does this to some extent (for instance, MacAskill's Moral Uncertainty, Ord's Moral Parliament, the unilateralist's curse, etc.), but there is an argument that it could be taken more seriously.

Introducing the ML Safety Scholars Program

This document will include all of that information (some of it isn't ready yet).

You Don't Need To Justify Everything

This is a good point which I don't think I considered enough. This post describes this somewhat.

I do think the signal for which actions are best to take has to come from somewhere. You seem to be suggesting the signal can't come from the decisionmaker at all since people make decisions before thinking about them. I think that's possible, but I still think there's at least some component of people thinking clearly about their decision, even if what they're actually doing is trying to emulate what those around them would think.

We do want to generate actual signal for what is best, and maybe we can do this somewhat by seriously thinking about things, even if there is certainly a component of motivated reasoning no matter what.

A leaderboard on the forum, ranking users by (some EA organization's estimate of) their personal impact could give rise to a whole bunch of QALYs.

If this estimate is based on social evaluations, won't the people making those evaluations have the same problem with motivated reasoning? It's not clear this is a better source of signal for which actions are best for individuals.

If signal can never truly come from subjective evaluation, it seems like the problem wouldn't be solved by moving to social evaluation. One thing that would seem to help is concrete, measurable metrics, but these seem much harder to come by in some fields than others.

You Don't Need To Justify Everything

Yes, people will always have motivated reasoning, for essentially every explanation of their actions they give. That being said, I expect it to be weaker for the small set of things people actually think about deeply, rather than things they're asked to explain after the fact that they didn't think about at all. Though I could be wrong about this expectation.


EA groups often get criticized by university students for "not doing anything." The answer usually given (which I think is mostly correct!) is that the vast majority of your impact will come from your career, and university is about gaining the skills you need to be able to do that. I usually say that EA will help you make an impact throughout your life, including after you leave college; the actions people usually think of as "doing things" in college (like volunteering), though they may be admirable, don't.

Which is why I find it strange that the post doesn't mention the possibility of becoming a lifeguard.

In this story, the lifeguards aren't noticing. Maybe they're complacent. Maybe they don't care about their jobs very much. Maybe they just aren't very good at noticing. Maybe they aren't actually lifeguards at all, and they just pretend to be lifeguards. Maybe the entire concept of "lifeguarding" is just a farce.

But if it's really just that they aren't noticing, and you are noticing, you should think about whether it really makes sense to jump into the water and start saving children. Yes, the children are drowning, but no, you aren't qualified to save them. You don't know how to swim that well, you don't know how to carry children out of the water, and you certainly don't know how to do CPR. If you really want to save lives, go get some lifeguard training and come back and save far more children.

But maybe the children are dying now, and this is the only time they're dying, so once you become a lifeguard it will be too late to do anything. Then go try saving children now!

Or maybe going to lifeguard school will destroy your ability to notice drowning children. In that case, maybe you should try to invent lifeguarding from scratch.

But unless all expertise is useless and worthless, which it might be in some cases, it's at least worth considering whether you should be focused on becoming a good lifeguard.
