272 karma · Joined Mar 2019


I'm very supportive of this post. Also I will shamelessly share here a sequence I posted in February called "The Engineer's Interpretability Sequence". One of the main messages of the sequence could be described as how existing mechanistic interpretability research is not on the ball. 


I very badly want to delay timelines, especially because doing so gives us more time to develop responses, governance strategies, and tools to handle rapid changes. I think this is underemphasized. And lately, I have been thinking that the most likely thing that could make me shift my focus is the appeal of work that makes it harder to build risky AI or that improves our ability to respond to or endure threats. This contrasts with my current work which is mostly about making alignment easier.  

I believe this is a big improvement. 

I work on AI safety tools. FWIW, I believe this might be the most important thing for someone like me to do. I think AI doom is not likely, but likely enough to be my personal top priority. But when I give money away, I give to GiveWell charities for reasons involving epistemic humility, moral uncertainty, and my belief in the importance of a balanced set of EA priorities.

Does that also apply to any post about e.g. animal welfare and climate change?

This would apply to a post titled "Reducing carbon emissions by X may be equivalent to 500M in donations to GiveWell charities."

On the question of deleting

  • I don't think this post will be particularly good at sparking good conversations. 
  • I think it would be better to have a different post that makes more effort in the estimation proposed and clearly asks a question in the title.
  • Relatedly, I think the large majority of the potential downside of this post comes from the title. Someone like Torres may have no interest in reading the actual post or taking any nuances into account when commenting on it. They likely wouldn't even read anything beyond the title. They'd just do their thing and be a pundity troll, and the title gives exactly the kind of ammunition they want. 

I generally agree, but not in this specific case for two reasons. First, I think there are more thorough, less provocative, strictly better discussions of this kind of thing already. See writing from Beckstead, Bostrom, etc. Second, I think there are specific direct harms this post could have. See my latest reply to the OP on the other branch of this thread. 

Is that the same as There's significantly less than a 1% risk from AGI for lives that morally matter (which I agree is my main uncertainty), or is it a different consideration?

I believe so. This post is about one day of delayed extinction. Not about preventing it. Not tryna split hairs tho.

What would make friends and not enemies?

Not using x-risks to imply that donating to GiveWell charities is of trivial relative importance. It's easy to talk about the importance of x-risks without making poverty and health charities the direct comparison. 

I am mostly worried about real people in the real world that (maybe) suffer from a real large risk.

I still presume you care about people who suffer from systemic issues in the world. This kind of post would not be the kind of thing that would make anyone like this feel respected. 

A case for deletion. Consider a highly concrete and pretty likely scenario. Émile Torres finds out about this post, tweets about it along with a comment about moral rot in EA, and gets dozens of retweets and a hundred likes. Then Timnit Gebru retweets it along with another highly negative comment and gets hundreds of retweets and a thousand likes. This post contributes to hundreds or more people more actively disliking EA, especially because it's on the actual EA forum and not a more ignorable comment from someone in a lower-profile space.

I would recommend weighing the possible harms of this post getting tons of bad press against how likely you think it is to positively change anyone's mind or lead to high-quality discussion. My belief here is that deleting it might be very positive in EV.

[Edit: this post has been updated, and this comment applies substantially less now. See this thread for details. ]

As a longtermist, I think this post is bad and harmful. I strongly dislike this framing, and I think it's very unhealthy for an altruistic community. 

First, I think the Fermi estimate here is not good, principally for a lack of any discounting and for failing to try to incorporate the objections raised in the post into the actual estimate. But I'll leave the specifics of the back-of-the-envelope estimate aside in favor of putting emphasis on what I think is the most harmful thing.

Pitting x-risks against other ways of making the world better (1) is extremely unlikely to convince anyone to work on x-risk who isn't already doing so, (2) hedges on very unlikely risky scenarios without incorporating principles involving discounting, epistemic humility, or moral uncertainty, (3) is certain to alienate people and is the kind of thing that makes enemies, not friends, which reduces the credibility and sociopolitical capital of longtermism, and (4) is very disrespectful toward real people in the real world who suffer from real, large problems that GiveWell charities try to address.

I would encourage deleting this post. 

No disagreements here. I guess I imagine AIS&L work, along with work on the neartermist examples I mentioned, as a Venn diagram with healthy overlap. I'm glad for the AIS&L community, and I think it tackles some truly unique problems. By "separate" in the title, I essentially meant "disjoint."

+1 I think it's very worthwhile to emphasize neartermist reasons to care about work that may be primarily longtermism-oriented. 
