
Habryka

20862 karma

Bio

Project lead of LessWrong 2.0; I also often help the EA Forum team with various site issues. If something is broken on the site, there's a good chance it's my fault (sorry!).

Comments

I think in the eyes of many (including a substantial fraction of the public), participating in such an investigation will be seen as importantly heroic. I think it's wrong to assume that people cannot reward and understand the difficulty of such a choice, and cannot assign respect appropriately.

This thread feels like a fine place for people to express their opinion as a stakeholder.

Like, I don't even know how to engage with 80k staff on this on the object level, and it seems like the first thing to do is to just express my opinion (and like, they can then choose to respond with arguments).

we've similarly seen OpenAI change its stated policies, such as removing restrictions on military use.

As I've discussed in the comments on a related post, I don't think OpenAI meaningfully changed any of its stated policies with regard to military usage. I don't think OpenAI ever really promised anyone it wouldn't work with militaries, and framing this as violating a past promise weakens our ability to hold them accountable for promises they actually made.

What OpenAI did was allow more users to use its product. It's similar to LessWrong allowing crawlers or jurisdictions that we previously blocked to now access the site. I certainly wouldn't consider myself to have violated some promise by allowing crawlers or companies that I had previously blocked to access LessWrong. (For a closer analogy: let's say we were currently blocking AI companies from crawling LW for training purposes, and I then changed my mind and allowed them to do that. I would not consider myself to have broken any kind of promise or policy.)

Yeah, this paragraph seems reasonable (I disagree, but like, that's fine, it seems like a defensible position).

I don't regard the norms as being about withholding negative information, but about trying to err towards presenting friendly frames while sharing what's pertinent, or something?

I agree with some definitions of "friendly" here, and disagree with others. I think there is an attractor here towards Orwellian language that is intentionally ambiguous about what it's trying to say, in order to seem friendly or non-threatening (because in some sense it is), and that kind of "friendly" seems pretty bad to me.

I think the paragraph you have would strike me as somewhat too Orwellian, though it's not too far off from what I would say. Something closer to what seems appropriate to me: 

OpenAI is a frontier AI company, and as such it's responsible for substantial harm by assisting in the development of dangerous AI systems, which we consider among the biggest risks to humanity's future. In contrast to most of the jobs on our job board, we consider working at OpenAI more similar to working at a large tobacco company, hoping to reduce the harm that the tobacco company causes, or leveraging this specific tobacco company's expertise with tobacco to produce more competitive and less harmful variations of tobacco products.

To its credit, it has repeatedly expressed an interest in safety and has multiple safety teams, which are attempting to reduce the likelihood of catastrophic outcomes from AI systems.

However, many people leaving the company have expressed concern that it is not on track to handle AGI safely, that it wasn't giving its safety teams the resources they had been promised, and that the leadership of the company is untrustworthy. Moreover, it has a track record of putting inappropriate pressure on people leaving the company to sign non-disparagement agreements. [With links]

We explicitly recommend against taking any roles not in computer security or safety at OpenAI, and consider those substantially harmful under most circumstances (though exceptions might exist).

I feel like this is currently a bit too "edgy" or something, and I would spend longer massaging some of the sentences, but it captures the more straightforward style that I think would be less likely to cause people to misunderstand the situation.

Maybe it's something like: I think the norms prevailing in society say that in this kind of situation you should be a bit courteous in public. That doesn't mean being dishonest, but it does mean shading the views you express towards generosity, and sometimes gesturing at complaints rather than flatly expressing them.

I don't really think these are the prevailing norms, especially not with regard to an adversary who has leveraged illegal threats of destroying millions of dollars of value to prevent negative information from getting out.

Separately from whether these are the norms, I think the EA community plays a role in society where being honest and accurate about our takes on other people is important. A lot of people took what the EA community said about SBF and FTX seriously, and this caused enormous harm. In many ways the EA community (and 80k in particular) is playing the role of a rating agency, and as a rating agency you need to be able to express negative ratings, otherwise you fail at your core competency.

As such, even if there are some norms in society about withholding negative information here, I think the EA and AI-safety communities in particular cannot hold themselves to those norms within the domains of their core competencies and responsibilities.

But I do think you should acknowledge that you will have casual readers who will form impressions from a quick browse, and think it's worth doing something to minimise the extent to which they come away misinformed.

Yeah, I agree with this. I like the idea of having different kinds of sections, and I am strongly in favor of making things true at an intuitive glance as well as on closer reading (I like something in the vicinity of "The Onion Test" here).

Separately, there is a level of blunt which you might wisely avoid being in public. Your primary audience is not your only audience. If you basically recommend that people treat a company as a hostile environment, then the company may reasonably treat the recommender as hostile, so now you need to recommend that they hide the fact they listened to you (or reveal it with a warning that this may make the environment even more hostile) ... I think it's very reasonable to just skip this whole dynamic.

I feel like this dynamic is just fine? I definitely don't think you should recommend that they hide the fact that they listened to you; that seems very deceptive. I think you tell people your honest opinion, and then if the other side retaliates, you take it. I definitely don't think 80k should send people to work at organizations as some kind of secret agent, and I think responding by protecting OpenAI's reputation and not disclosing crucial information about the role feels like straightforwardly giving in to an unjustified threat.

I am generally very wary of treating your audience as unsophisticated in this way. I think 80k taking on the job of recommending the most impactful jobs, according to the best of their judgement and using the full nuance and complexity of their models, is much clearer and more straightforward than a recommendation which is something like "the most impactful jobs, except when we don't like being associated with something, or where the case for it is a bit more complicated than for our other jobs, or where our funders asked us not to include it, etc.".

I do think that doing this well requires the ability to sometimes say harsh things about an organization. Communicating accurately about job recommendations will inevitably require being able to say "we think working at this organization might be really miserable and might involve substantial threats and adversarial relationships, and you might cause substantial harm if you are not careful, but we still think it's overall a good choice if you take that into account". And I think those judgements need to be made on an organization-by-organization level (and can't easily be captured by generic statements in the context of the associated career guide).

These still seem like potentially very strong roles with the opportunity to do very important work. We think it’s still good for the world if talented people work in roles like this! 

I think given that these jobs involved being pressured, via extensive legal blackmail, into signing secret non-disparagement agreements that forced people to never criticize OpenAI, at great psychological cost to those involved and at substantial cost to many outsiders who were trying to assess OpenAI, I don't agree with this assessment.

Safety people have been substantially harmed by working at OpenAI, and safety work at OpenAI can have substantial negative externalities.

There was a talk by Will and Toby about the history of effective altruism. I couldn't find it quickly when I wrote the above comment, but have now found it:
