221 · Joined Mar 2019


I very badly want to delay timelines, especially because doing so gives us more time to develop responses, governance strategies, and tools to handle rapid changes. I think this is underemphasized. And lately, I have been thinking that the most likely thing that could make me shift my focus is the appeal of work that makes it harder to build risky AI or that improves our ability to respond to or endure threats. This contrasts with my current work, which is mostly about making alignment easier.

I believe this is a big improvement. 

I work on AI safety tools. I believe this might be the most important thing for someone like me to do, FWIW. I think AI doom is not likely, but likely enough to be my personal top priority. But when I give money away, I give to GiveWell charities for reasons involving epistemic humility, moral uncertainty, and my belief in the importance of a balanced set of EA priorities.

Does that also apply to any post about e.g. animal welfare and climate change?

This would apply to a post titled "Reducing carbon emissions by X may be equivalent to 500M in donations to GiveWell charities."

On the question of deleting:

  • I don't think this post will be particularly good at sparking good conversations. 
  • I think it would be better to have a different post that makes more effort in the estimation proposed and clearly asks a question in the title.
  • Relatedly, I think the large majority of the potential downside of this post comes from the title. Someone like Torres may have no interest in reading the actual post or taking any nuances into account when commenting on it. They likely wouldn't even read anything beyond the title. They'd just do their thing and be a pundity troll, and the title gives exactly the kind of ammunition they want. 

I generally agree, but not in this specific case for two reasons. First, I think there are more thorough, less provocative, strictly better discussions of this kind of thing already. See writing from Beckstead, Bostrom, etc. Second, I think there are specific direct harms this post could have. See my latest reply to the OP on the other branch of this thread. 

Is that the same as There's significantly less than a 1% risk from AGI for lives that morally matter (which I agree is my main uncertainty), or is it a different consideration?

I believe so. This post is about one day of delayed extinction. Not about preventing it. Not tryna split hairs tho.

What would make friends and not enemies?

Not using x-risks to imply that donating to GiveWell charities is of trivial relative importance. It's easy to talk about the importance of x-risks without making poverty and health charities the direct comparison. 

I am mostly worried about real people in the real world that (maybe) suffer from a real large risk.

I still presume you care about people who suffer from systemic issues in the world. This kind of post would not be the kind of thing that would make anyone like this feel respected. 

A case for deletion. Consider a highly concrete and pretty likely scenario. Émile Torres finds out about this post, tweets about it along with a comment about moral rot in EA, and gets dozens of retweets and a hundred likes. Then Timnit Gebru retweets it along with another highly negative comment and gets hundreds of retweets and a thousand likes. This post contributes to hundreds or more people more actively disliking EA--especially because it's on the actual EA forum and not a more ignorable comment from someone in a lower-profile space.

I would recommend weighing the possible harms of this post getting tons of bad press against how likely you think it is to positively change anyone's mind or lead to high-quality discussion. My belief here is that deleting it might be very positive in EV.

[Edit: this post has been updated, and this comment applies substantially less now. See this thread for details.]

As a longtermist, I think this post is bad and harmful. I strongly dislike this framing, and I think it's very unhealthy for an altruistic community. 

First, I think the Fermi estimate here is not good, principally for a lack of any discounting and for failing to try to incorporate the objections raised in the post into the actual estimate. But I'll leave the specifics of the back-of-the-envelope estimate aside in favor of putting emphasis on what I think is the most harmful thing.

Pitting x-risks against other ways of making the world better (1) is extremely unlikely to convince anyone to work on x-risk who isn't already doing so, (2) hedges on very unlikely risky scenarios without incorporating principles involving discounting, epistemic humility, or moral uncertainty, (3) is certain to alienate people and is the kind of thing that makes enemies--not friends--which reduces the credibility and sociopolitical capital of longtermism, and (4) is very disrespectful toward real people in the real world who suffer from real, large problems that GiveWell charities try to address.

I would encourage deleting this post. 

No disagreements here. I guess I imagine AIS&L work along with work on the neartermist examples I mentioned as a Venn diagram with healthy overlap. I'm glad for the AIS&L community, and I think it tackles some truly unique problems. By "separate" I essentially meant "disjoint" in the title.

+1 I think it's very worthwhile to emphasize neartermist reasons to care about work that may be primarily longtermism-oriented. 

Thanks for the comment. I hope you think this is interesting content. 

I'm not sure I understand what the argument is.

The most important points we want to argue with this post are that (1) if a system itself is made to be safe, but it's copycatted and open-sourced, then the safety measures were not effective, (2) it is bad when developers like OpenAI publish incomplete or overly convenient analyses of the risks of what they develop that, for example, ignore copycatting, and (3) the points from "What do we want?" and "What should we do?"

"...we are not convinced of any fundamental social benefits that film provides aside from, admittedly, being entertaining..."

Yes, entertainment has value, but I don't think that the entertainment value of text-to-image models is or will be commensurate with that of film. I could also very easily list a lot of non-entertainment uses of film involving stuff like education, communication, etc. And I think someone from 1910 could easily think of these as well. What uses like this would you predict from text-to-image diffusion models?

So, what justifies this casual dismissal of the entertainment value of text-to-image models?

We don't. We argue that it's unlikely to outweigh harms. 

Your primary conclusion is that "the AI research community should curtail work on risky capabilities".

I wouldn't say this is our primary conclusion. See my first response above. Also, I don't think this is obvious. Sam Altman, Demis Hassabis, and many others strongly disagree with this. 

The problem is coordinating our behavior. If OpenAI decides not to work on it, someone else will

We disagree that the counterfactual to OpenAI not working on some projects like DALLE2 or GPT4 would be similar to the status quo. We discussed this in the paragraph that says "...On one hand, AI generators for offensive content were probably always inevitable. However..."

...Government bans? Do you realize how hard it would be to prevent text-to-image models from being created and shared...

Yes. We do not advocate for government bans. My answer to this is essentially what we wrote in the "The role of AI governance" section. I don't have much to add beyond what we already wrote. I recommend rereading that section. In short, there are regulatory tools that can be used. For example, the FTC may have a considerable amount of power in some cases.

Are we talking about 100 people a year being victimized? That would indeed be sad, but compared to potential human extinction from AI, probably not as big of a deal.

Where did the number 100 come from? In the post, we cite one article about a study from 2019 that found ~15,000 deepfakes online. That was in 2019, when image and video generation were much less developed than today. And in the future, deepfakes may be much more widespread because of open-source tools based on SD that are easy to use.

Another really important point, I think, is that we argue in the post that trying to avoid dynamics involving racing toward TAI, copycatting, and open-sourcing of models will LESSEN x-risk. You wrote your comment as if we are trying to argue that preventing sex crimes is more important than x-risk. We don't say this. I recommend rereading the "But even if one does not view the risks specific to text-to-image models as a major concern..." paragraph and the "The role of AI researchers" section.

Finally, and I want to put a star on this point -- we all should care a lot about sex crime. And I'm sure you do. Writing off problems like this by comparing them to x-risk (1) isn't valid in this case because we argue for improving the dev ecosystem to address both of these problems, (2) should be approached with great care and good data if it needs to be done, and (3) is one type of thing that leads to a lot of negativity and bad press about EA.

I think this is probably even more true for your comments on entertainment value and whether that might outweigh the harms of deepfake sex crimes. First, I'm highly skeptical that we will find uses for text-to-image models that are so widely usable and entertaining that their value would be commensurate with the harms of diffusion-deepfake sex crime. But even if we could be confident that entertainment would hypothetically outweigh sex crimes on pure utilitarian grounds, in the real world with real politics and EA critics, I do not think this position would be tenable. It could serve to undermine support for EA and end up being very negative if widespread.
