Yudkowsky writes:

the "OpenAI" launch trashed humanity's chances of survival... Nobody involved with OpenAI's launch can reasonably have been said to have done anything else of relative importance in their lives. The net impact of their lives is their contribution to the huge negative impact of OpenAI's launch, plus a rounding error.

That's in a thread explicitly about debating the social status of billionaires, but if you take his comments seriously, they seem to apply not only to Elon, but also to longtermism and AI Safety as a whole. Whether or not you were directly involved in the launch of OpenAI, if you take Yudkowsky's view seriously, the small marginal impact of having anything to do with popularizing AI Safety dominates any good these movements may have produced.

Does that sound too outlandish and contrarian? It shouldn't. Here's Alex Berger, co-CEO of Open Philanthropy, in his recent 80,000 Hours interview:

[Michael Nielsen] thinks one of the biggest impacts of EA concerns with AI x-risk was to cause the creation of DeepMind and OpenAI, and to accelerate overall AI progress. I’m not saying that he’s necessarily right, and I’m not saying that that is clearly bad from an existential risk perspective, I’m just saying that strikes me as a way in which well-meaning increasing salience and awareness of risks could have turned out to be harmful in a way that has not been… I haven’t seen that get a lot of grappling or attention from the EA community.

Until recently, you might have argued that OpenAI was clearly good for x-risk. They were taking safety seriously, had hired many top safety researchers, etc. Then in May of this year, there was a mass exodus, including many of the people supposedly there to keep an eye on things. As Scott Alexander summarized:

most of OpenAI’s top alignment researchers, including Dario Amodei, Chris Olah, Jack Clark, and Paul Christiano, left en masse for poorly-understood reasons

Do you have a strong reason to think that OpenAI remains dedicated to safety? Speculation aside, here's OpenAI CEO Sam Altman in his own words:

First of all, we’re not directed at preventing an event. We’re directed at making a good event happen more than we are at preventing a negative event. It is where I spend most of my time now.


I think that there are parts of it that are scary. There are parts of it that are potential downsides. But the upside of this is unbelievable.

So really, why aren't you freaking out? At what point would you start? What is your fire alarm if not GPT-3?

One objection might be that "freaking out" simply isn't tractable or productive. That's fair, but if you were freaking out, here are some of the things you might do:

  • Stop giving OpenAI money (They received $30M from Open Phil in 2017)
  • Stop endorsing non-safety jobs at OpenAI (They're prominent on the 80k job board with several recent postings)

Or, if you were really serious (read: cared at all), you might:

  • Organize Microsoft employees to stop funding OpenAI, and to stop offering them compute resources (this isn't outlandish: Google employees have successfully organized against military contracts, and right-wing apps have been denied hosting)
  • Organize other AI orgs to commit to refusing to hire anyone still working at OpenAI in a non-safety role after January 2022

To be clear, I'm not advocating any of this. I'm asking why you aren't. I'm seriously curious and want to understand which part of my mental model of the situation is broken. Is it that you're confident the Holden Karnofsky board seat will be enough to hold everything together, even as the actual safety researchers flee? Or is it that you don't want to antagonize our new overlords? Is it that I'm out of touch, missing recent news, and OpenAI has recently convincingly demonstrated their ongoing commitment to safety?


For what it's worth, on Yudkowsky's original point about Musk, you might feel comforted by the fact that Musk was eventually removed from the board due to conflicts of interest after hiring away OpenAI researcher Andrej Karpathy. That's somewhat fair, except that Shivon Zilis still sits on the board, is simultaneously a Director at Neuralink, and was previously "Project Director, Office of the CEO" at Tesla.


3 Answers

I think this post and Yudkowsky's Twitter thread that started it are probably harmful to the cause of AI safety.

OpenAI is one of the top AI labs worldwide, and the difference between their cooperation and antagonism to the AI safety community means a lot for the overall project. Elon Musk might be one of the top private funders of AI research, so his cooperation is also important.

I think that both this post and the Twitter thread reduce the likelihood of cooperation without accomplishing enough in return. I think that the potential to do harm to potential cooperation is about the same for a well-researched, well-considered comment as for an off-the-cuff comment, but the potential to do good is much higher for comments of the first type than the second. So, for comments that might cause offense, the standard for research and consideration should be higher than usual.

This post: it's extremely hard to understand what exactly OpenAI is being accused of doing wrong. Your sentence "The small marginal impact of having anything to do with popularizing AI Safety dominates any good these movements may have produced." reads to me as an argument that Yudkowsky is wrong, and that the fact that the launch led indirectly to more AI safety discourse means that it was a positive. However, this doesn't match the valence of your post.

Your second argument, that most of their safety researchers left, is indeed some cause for concern (edit: although seemingly quite independent from your first point). However, surely it is perfectly possible to ask the departed safety researchers whether they themselves think that their departures should be taken as a signal of no confidence in OpenAI's commitment to safety before advocating actions to be taken against them. To clarify: you may or may not get a helpful response, but I think that this is an easy thing to do, is clearly a reasonable step if you are wondering what these departures mean, and I think you should take such easy & reasonable steps before advocating a position like this.

If OpenAI is pursuing extremely risky research without proper regard to safety, then the argument set out here ought to be far stronger. If not, then it is inappropriate to advocate doing harm to OpenAI researchers.

The Twitter thread: To an outsider, it seems like the concerns regarding the language employed at OpenAI's launch were resolved quickly in a manner that addressed the concerns of safety advocates. If the resolution did not address their concerns, and safety advocates think that this should be widely known, then that should be explained clearly, and this thread did no such thing.

It looked to me like Yudkowsky was arguing, as he often likes to, that contributions to AI risk are cardinally greater than contributions to anything else when assessing someone's impact on something. It is not obvious to me that he intended this particular episode to have more impact than his many other statements to this effect. Nonetheless, it seems to have done so (at least, I'm seeing it pop up in several different venues), and I, at least, would appreciate it if he could clarify whether there is an ongoing issue here and, if so, what it is.

Unfortunately we may be unlikely to get a statement from a departed safety researcher beyond mine (https://forum.effectivealtruism.org/posts/fmDFytmxwX9qBgcaX/why-aren-t-you-freaking-out-about-openai-at-what-point-would?commentId=WrWycenCHFgs8cak4), at least currently.

Thanks for the recommendation. I spent about an hour looking for contact info, but was only able to find 5 public addresses of ex-OpenAI employees involved in the recent exodus. I emailed them all, and provided an anonymous Google Form as well. I'll provide an update if I do hear back from anyone.

Is it that I'm out of touch, missing recent news, and OpenAI has recently convincingly demonstrated their ongoing commitment to safety?

This turns out to be at least partially the answer. As I'm told, Jan Leike joined OpenAI earlier this year and does run an alignment team.

I also noticed this post. It could be that OpenAI is more safety-conscious than the ML mainstream. That might not be safety-conscious enough. But it seems like something to be mindful of if we're tempted to criticize them more than we criticize the less-safety-conscious ML mainstream (e.g. does Google Brain have any sort of safety team at all? Last I checked they publish way more papers than OpenAI. Then again, I suppose Google Brain doesn't brand themselves as trying to discover AGI--but I'm also not sure how correlated a "trying to discover AGI" bra...

Steven Byrnes (8mo):
Vicarious [https://en.wikipedia.org/wiki/Vicarious_(company)] and Numenta [https://en.wikipedia.org/wiki/Numenta] are both explicitly trying to build AGI, and neither does any safety/alignment research whatsoever. I don't think this fact is particularly relevant to OpenAI, but I do think it's an important fact in its own right, and I'm always looking for excuses to bring it up. :-P Anyone who wants to talk about Vicarious or Numenta in the context of AGI safety/alignment, please DM or email me. :-)
In the absence of rapid public progress, my default assumption is that "trying to build AGI" is mostly a marketing gimmick. There seem to be several other companies like this, e.g. Generally Intelligent (https://generallyintelligent.ai/). But it is possible they're just making progress in private, or might achieve some kind of unexpected breakthrough. I guess I'm just less clear about how to handle these scenarios. Maybe by tracking talent flows, which is something the AI Safety community has been trying to do for a while.
Google does claim to be working on "general purpose intelligence" (https://www.alignmentforum.org/posts/bEKW5gBawZirJXREb/pathways-google-s-agi). I do think we should be worried about DeepMind, though OpenAI has undergone more dramatic changes recently, including restructuring into a for-profit, losing a large chunk of the safety/policy people, taking on new leadership, etc.

This post seems to be making basic errors (in the opening quote, Eliezer Yudkowsky, a rationalist-associated public figure involved in AI safety, is complaining about the dynamics of Musk at the creation of OpenAI, not recent events or increasing salience). It is hard to tell if the OP has a model of AI safety or insight into what the recent org dynamics mean, all of which are critical to his post having meaning.

There’s somewhat more discussion here on LessWrong.

Also relevant to OpenAI and safety (differential progress?), see the discussion in the AMA by Paul Christiano, formerly of OpenAI. This gives one worldview/model for why increasing salience and openness is useful.

Some content from the AMA copied and pasted for convenience below:

dynamics of Musk at the creation of OpenAI, not recent events or increasing salience

Thanks, this is a good clarification.

It is hard to tell if the OP has a model of AI safety or insight into what the recent org dynamics mean, all of which are critical to his post having meaning.

You're right that I lack insight into what the recent org dynamics mean, this is precisely why I'm asking if anyone has more information. As I write at the end:

To be clear, I'm not advocating any of this. I'm asking why you aren't. I'm seriously curious and want to understand which part of my mental model of the situation is broken.

The quotes from Paul are helpful; I don't read LW much and must have missed the interview, so thanks for adding these. Having said that, if you see u/irving's comment below, I think it's pretty clear that there are good reasons for researchers not to speak up too loudly and shit-talk their former employer.

11 comments

Unfortunately, a significant part of the situation is that people with internal experience and a negative impression feel both constrained and conflicted (in the conflict of interest sense) for public statements. This applies to me: I left OpenAI in 2019 for DeepMind (thus the conflicted).

Note that Eliezer Yudkowsky's argument in the opening link is that OpenAI's damage was done by fragmenting the AI Safety community on its launch.

This damage is done - and I am not sure it bears much relation to what OpenAI is trying to do going forward.

(I am not sure I agree with Eliezer on this one, but I lack details to tell if OpenAI's launch really was net negative)

I’m a complete outsider of all this, but I get the feeling that it may be impolitic of me to write this comment for reasons I don’t know. If so, please warn me and I can remove it.

Here are my impressions as an observer over the years. I don’t know what’s going on with OpenAI at the moment – just to preempt disappointment – but I remember what it was like in 2015 when it launched.

  1. Maybe 2015? Elon Musk was said to have read Superintelligence. I was pleasantly surprised because I liked Superintelligence.
  2. Late 2015:
    1. OpenAI was announced and I freaked out. I’m a bit on the mild-mannered side, so my “freaking out” might’ve involved the phrase, “this seems bad.” That’s a very strong statement for me. Also this was probably in my inner dialogue only.
    2. Gleb Tsipursky wrote a long open letter also making the case that this is bad. EY asked him not to publish it (further), and now I can’t find it anymore. (I think his exact words were, “Please don’t.”) I concluded that people must be trying to salvage the situation behind the scenes and that verbal hostility was not helping.
    3. Scott Alexander wrote a similar post. I was confused as to why he published it when Gleb wouldn’t? Maybe Scott didn’t ask? In any case, I was glad to have an article to link to when people asked why I was freaking out about OpenAI.
  3. Early 2016: Bostrom’s paper on openness was posted somewhere online or otherwise made accessible enough that I could read it. I didn’t learn anything importantly new from it, so it seemed to me that those were cached thoughts of Bostrom’s that he had specifically reframed to address openness to get it read by OpenAI people. (The academic equivalent of “freaking out”?) It didn’t seem like a strong critique to me, but perhaps that was the strategically best move to try to redirect the momentum of the organization into a less harmful direction without demanding that it change its branding? I wondered whether EY really didn’t contribute to it or whether he had asked to be removed from the acknowledgements.
    1. I reread the paper a few years later together with some friends. One of them strongly disliked the paper for not telling her anything new or interesting. I liked the paper for being a sensible move to alleviate the risk from OpenAI. That must’ve been one of those few times when two people had completely opposite reactions to a paper without even disagreeing on anything about it.
  4. March 30 (?), 2017: It was April 1 when I read a Facebook post to the effect that Open Phil had made a grant of $30m to OpenAI. OpenAI seemed clearly very bad to me and $30m was way more than all previous grants, so my thoughts were almost literally, “C’mon, an April Fools has to be at least remotely plausible for people to fall for it!” I think I didn’t even click the link that day. Embarrassing. I actually quickly acknowledged that getting a seat on the board of the org to try to steer it into a less destructive direction had to be worth a lot (and that $30m wasn’t so much for OpenAI that it would greatly accelerate their AGI development). So after my initial shock had settled, I congratulated Open Phil on that bold move. (Internally. I don’t suppose I talk to people much.)
  5. Later I learned that Paul Christiano and other people I trusted or who were trusted by people I trusted had joined OpenAI. That further alleviated my worry.
  6. OpenAI went on to not publish some models they had generated, showing that they were backing away from their dangerous openness focus.

When Paul Christiano left OpenAI, I heard or read about it in some interview where he also mentioned that he’s unsure whether that’s a good decision on balance but that there are safety-minded people left at OpenAI. On the one hand I really want him to have the maximal amount of time available to pursue IDA and other ideas he might have. But on the other hand, his leaving (and mentioning that others left too) did rekindle that old worry about OpenAI.

I can only send hopes and well wishes to all safety-minded people who are still left at OpenAI!

Is Holden still on the board?

He is listed on the website:

> OpenAI is governed by the board of OpenAI Nonprofit, which consists of OpenAI LP employees Greg Brockman (Chairman & CTO), Ilya Sutskever (Chief Scientist), and Sam Altman (CEO), and non-employees Adam D’Angelo, Holden Karnofsky, Reid Hoffman, Shivon Zilis, Tasha McCauley, and Will Hurd.

It might not be up to date, though.

It can’t be up to date, since they recently announced that Helen Toner joined the board, and she’s not listed.

The website now lists Helen Toner, but does not list Holden, so it seems he is no longer on the board.

That's pretty wild, especially considering getting Holden on the board was a major condition of Open Philanthropy's $30,000,000 grant: https://www.openphilanthropy.org/focus/global-catastrophic-risks/potential-risks-advanced-artificial-intelligence/openai-general-support#Details_on_Open_Philanthropy8217s_role

Though it also says the grant was for 3 years, so maybe it shouldn't be surprising that his board seat only lasted that long.

Holden might have agreed to have Helen replace him. She used to work at Open Phil, too, so Holden probably knows her well enough. Open Phil bought a board seat, and it's not weird for them to fill it as they see fit, without having it reserved only for a specific individual.

There is now some meta-discussion on LessWrong.

Happy to see they think this should be discussed in public! Wish there was more on questions #2 and #3.

Also very helpful to see how my question could have been presented in a less contentious way.