Which AI Safety Org to Join?

Yonatan Cale

Long TL;DR: You’re an engineer, you want to work on AI Safety, you’re not sure which org to apply to, so you’re going to apply to all of them. But - oh no - some of these orgs may actively be causing harm, and you don’t want to do that. What’s your alternative? Study AI Safety for 2 years before you apply? In this post I suggest a quicker way to collect the info you want: skim a few specific posts about each org.

Why do I think some orgs might be [not helping] or [actively causing harm]?

Example link. (Help me out in the comments with more?)

My suggestion:

1. Open the tag of the org you want on LessWrong

How: Search for a post related to that org. You’ll see tags at the top of the post. Click the tag with the org name.

2. Sort by “newest first”

3. Open 2-3 posts

(Don’t read the post yet!)

4. In each post, look at the top 2-3 most upvoted comments

What I expect you’ll find sometimes

A post by the org, with comments trying to politely say “this is not safe”, heavily upvoted.

Bonus: Read the comments

Or even crazier: Read the post! [David Johnston thinks this is a must!]

Ok, less jokingly, this seems to me like a friendly way to start to see the main arguments without having to read too much background material (unless you find, for example, a term you don’t know).
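For readers who think in code, steps 1-4 can be sketched roughly like this. It is only an illustration: the `Post` and `Comment` classes, the function name, and the sample data are all made up, and nothing here talks to LessWrong's actual site or API. You would still collect the posts by hand.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List


@dataclass
class Comment:
    author: str
    upvotes: int
    text: str


@dataclass
class Post:
    title: str
    posted: date
    comments: List[Comment] = field(default_factory=list)


def top_comments_of_newest_posts(
    posts: List[Post], n_posts: int = 3, n_comments: int = 3
) -> Dict[str, List[Comment]]:
    """Map each of the newest posts to its most-upvoted comments."""
    # Step 2: sort by "newest first"; step 3: keep only the first few posts.
    newest = sorted(posts, key=lambda p: p.posted, reverse=True)[:n_posts]
    # Step 4: for each post, keep only the most-upvoted comments.
    return {
        post.title: sorted(post.comments, key=lambda c: c.upvotes, reverse=True)[:n_comments]
        for post in newest
    }


if __name__ == "__main__":
    # Hypothetical posts under one org's tag (step 1 is done by hand).
    tagged_posts = [
        Post("Our new plan", date(2022, 9, 1), [
            Comment("reader_a", 40, "I worry this is not safe because..."),
            Comment("reader_b", 3, "Nice post!"),
        ]),
        Post("Older update", date(2021, 5, 1), [
            Comment("reader_c", 12, "Some context on this..."),
        ]),
    ]
    for title, comments in top_comments_of_newest_posts(tagged_posts).items():
        print(title)
        for c in comments:
            print(f"  [{c.upvotes}] {c.author}: {c.text}")
```

The sketch ranks comments by upvotes alone, which is exactly the shortcut the steps describe; it says nothing about whether the most-upvoted comment is right.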

Extra crazy optimization: Interview before research

TL;DR: First apply to lots of orgs, and then, once you know which orgs are interested^[1], do your^[2] research only on those orgs.

Am I saying this idea for vetting AI Safety orgs is perfect?

No, I am saying it is better than the alternative of “apply to all of them (and do no research)”, assuming you resonate with my premise of “there’s a lot of variance in effectiveness of orgs” and “that matters”.

I also hope that by posting my idea, someone will comment with something even better.

^[1] However you choose to define "interested". Maybe research the orgs that didn't reject your CV? Maybe only research the ones that accepted you? Your call.
^[2] Consider sharing your thoughts with the org. Just remember, whoever is talking to you was chosen as a person that convinces candidates to join. They will, of course, think their org is great. Beware of reasons like "the people saying we are causing harm are wrong, but we didn't explain publicly why". The whole point is letting the community help you with this complicated question.

Comments (21)

David Johnston · Oct 12 2022 · 17

I don’t endorse judging AI safety organisations by LessWrong consensus alone - I think you should at least read the posts!

Yonatan Cale · Oct 12 2022 · 2

Thanks for the pushback!

Added this to the post.

Neel Nanda · Oct 12 2022 · 13

I think this is fairly bad advice - LessWrong commenters are wrong about a lot of things. I think this is an acceptable way to get a vibe for what the LessWrong bubble thinks, though. But idk, for most of these questions the hard part is figuring out which bubble to believe. Most orgs will have some groups who think they're useless, some who think they're great, and probably some who think they're net negative. Finding one bubble that believes one of these three doesn't tell you much!

Yonatan Cale · Oct 12 2022 · 2

Thanks for the pushback!

Do you have an alternative suggestion?

Guy Raveh · Oct 13 2022 · 2

I personally interpret Neel's comment as saying this is ~not better (perhaps worse) than going in blindly. So I just wanted to highlight that a better alternative is not needed for the sake of arguing this (even if it's a good idea to have one for the sake of future AI researchers).

Yonatan Cale · Oct 13 2022 · 3

Do you think that going to do capabilities work at DeepMind or OpenAI is just as impactful as going to whatever the LessWrong community recommends (as presented by their comments and upvotes)?

Guy Raveh · Oct 13 2022 · 2

Possibly. As we've discussed privately, I think some AI safety groups which are usually lauded are actually net negative 🙃

But I was trying to interpret Neel and not give my own opinion.

Yonatan Cale · Oct 13 2022 · 2

My meta-opinion is that it would be better to see what others think about working on capabilities in top labs, compared to going there without even considering the downsides. What do you think? (A)

And also that before working at "AI safety groups which are usually lauded [but] are actually net negative", it would be better to read comments of people like you. What do you think? (B)

Guy Raveh · Oct 13 2022 · 2

I somewhat disagree with both statements.

(A) Sure, it'd be good to have opinions from relevant people, but on the other hand it's non-trivial to figure out who "relevant people" are, and "the general opinion on LW" is probably not the right category. I'd look more at what (1) people actually working in the field, and (2) the broad ML community, think about an org. So maybe the Alignment Forum.

(B) I can only answer on my specific views. My opinion on [MIRI] probably wouldn't really help individuals seeking to work there, since they probably know everything I know and have their own opinions. My opinions are more suitable for discussions on the general AI safety community culture.

Yonatan Cale · Oct 12 2022 · 2

By the way, I personally resonate with your advice on forming an inside view and am taking that path, but it doesn't fit everyone. Some people don't want all that homework; they want to get into a company and write code, and, to be clear, it is common for them to apply to all orgs [whose names they see in EA spaces] or something like that (very wide, many orgs). This is the target audience I'm trying to help.

JJ Balisan · Oct 18 2022 · 3

I would probably just tell people to work in another field, rather than explicitly encouraging them to Goodhart their way to trying to have a positive impact in an area with extreme variance.

alex lawsen · Oct 12 2022 · 7

Thinking about where to work seems reasonable, and listening to others' thoughts on where to work seems reasonable; this post advises both.

This post also pretty strongly suggests that LessWrong comments are the best choice of others' thoughts, and I would like to see that claim made explicit and then argued for rather than slipped in. As a couple of other comments have noted, LessWrong is far from a perfect signal of the alignment space.

Yonatan Cale · Oct 12 2022 · 2

Thanks (also) for the pushback part!

Do you have an alternative to LessWrong comments that you'd suggest?

WilliamKiely · Oct 12 2022 · 5

Seems like there is a big gap between "Study AI Safety for 2 years before you apply" and reading posts, rather than just the most up-voted comments.

WilliamKiely · Oct 12 2022 · 4

Other feedback: I don't understand why you call some of your suggestions "crazy"/"crazier". Also, when you wrote "less jokingly", I had missed the joke. Perhaps your suggestions could be rewritten to be clearer without these words.

Yonatan Cale · Oct 12 2022 · 2

The joke is supposed to be that "reading the post" isn't actually that crazy. I see this wasn't understood (oops!). I'm going to start by trying to get the content right (since lots of people pushed back on it) and then try fixing this too.

Yonatan Cale · Oct 12 2022 · 2

I didn't understand, could you say this in other words please?

Or did you mean it like this:

Seems like there is a big gap between ["Study AI Safety for 2 years before you apply" and reading posts], rather than [just the most up-voted comments].

WilliamKiely · Oct 12 2022 · 2

I meant it like that, yes. Seems like there is a big gap between ["Study AI Safety for 2 years before you apply" and reading posts].

Yonatan Cale · Oct 13 2022 · 2

So what would you suggest? Reading a few posts about that org?

Rubi J. Hudson · Oct 12 2022 · 5

Seems worth asking in interviews "I'm concerned about advancing capabilities and shortening timelines; what actions is your organization taking to prevent that?", with the caveat that you will be BSed.

Bonus: You can turn down roles explicitly because they're doing capabilities work, which if it becomes a pattern may incentivize them to change their plan.

Yonatan Cale · Oct 12 2022 · 2

I agree, see footnote 2.