The Slippery Slope from DALLE-2 to Deepfake Anarchy

[anonymous], philljkc

21 min read · Nov 5, 2022

Comments 11

Matthew_Barnett

I'm confused about this post. I don't buy the argument, but really, I'm not sure I understand what the argument is. Text-to-image models have risks? Every technology has risks. What's special about this technology that makes the costs outweigh the benefits?

Consider this paragraph,

"Especially in the case of DALLE-2 and Stable Diffusion, we are not convinced of any fundamental social benefits that access to general-purpose image generators provide aside from, admittedly, being entertaining. But this does not seem commensurate with the potentially-devastating harms that deepfakes can have on victims of sex crimes. Thus, it seems that these models, as they have been rolled out, fail this basic cost/benefit test."

I don't find anything about this obvious at all.

Suppose someone in the 1910s said, "we are not convinced of any fundamental social benefits that film provides aside from, admittedly, being entertaining. But this does not seem commensurate with the potentially-devastating harms that film can cause by being used as a vehicle for propaganda. Thus, it seems that films, as they have been rolled out, fail this basic cost/benefit test." Would that have convinced you to abandon film as an art form?

Entertainment has value. Humanity collectively spends hundreds of billions of dollars every year on entertainment. People would hate to live in a world without entertainment. It would be almost dystopian. So, what justifies this casual dismissal of the entertainment value of text-to-image models?

Your primary conclusion is that "the AI research community should curtail work on risky capabilities". But isn't that obvious? Everyone should curtail working on unnecessarily risky capabilities.

The problem is coordinating our behavior. If OpenAI decides not to work on it, someone else will. What is the actual policy being recommended here? Government bans? Do you realize how hard it would be to prevent text-to-image models from being created and shared on the internet? That would require encroaching on the open internet in a way that seems totally unjustified to me, given the magnitude of the risks you've listed here.

What's missing here is any quantification of the actual harms from text-to-image models. Are we talking about 100 people a year being victimized? That would indeed be sad, but compared to potential human extinction from AI, probably not as big of a deal.

[anonymous]

Thanks for the comment. I hope you think this is interesting content.

"I'm not sure I understand what the argument is."

The most important points we want to argue with this post are that (1) if a system itself is made to be safe but is then copycatted and open-sourced, the safety measures were not effective; (2) it is bad when developers like OpenAI publish incomplete or overly convenient analyses of the risks of what they develop that, for example, ignore copycatting; and (3) the points from "What do we want?" and "What should we do?"

"...we are not convinced of any fundamental social benefits that film provides aside from, admittedly, being entertaining..."

Yes, entertainment has value, but I don't think that the entertainment value of text-to-image models is or will be commensurate with that of film. I could also very easily list a lot of non-entertainment uses of film involving things like education and communication. And I think someone from 1910 could easily have thought of these as well. What uses like this would you predict for text-to-image diffusion models?

"So, what justifies this casual dismissal of the entertainment value of text-to-image models?"

We don't. We argue that it's unlikely to outweigh harms.

"Your primary conclusion is that 'the AI research community should curtail work on risky capabilities'."

I wouldn't say this is our primary conclusion. See my first response above. Also, I don't think this is obvious. Sam Altman, Demis Hassabis, and many others strongly disagree with this.

"The problem is coordinating our behavior. If OpenAI decides not to work on it, someone else will."

We disagree that the counterfactual to OpenAI not working on some projects like DALLE2 or GPT4 would be similar to the status quo. We discussed this in the paragraph that says "...On one hand, AI generators for offensive content were probably always inevitable. However..."

"...Government bans? Do you realize how hard it would be to prevent text-to-image models from being created and shared..."

Yes. We do not advocate for government bans. My answer to this is essentially what we wrote in "The role of AI governance." I don't have much to add beyond what we already wrote. I recommend rereading that section. In short, there are regulatory tools that can be used. For example, the FTC may have a considerable amount of power in some cases.

"Are we talking about 100 people a year being victimized? That would indeed be sad, but compared to potential human extinction from AI, probably not as big of a deal."

Where did the number 100 come from? In the post, we cite one article about a study from 2019 that found ~15,000 deepfakes online. That was in 2019 when image and video generation were much less developed than today. And in the future, things may be much more widespread because of open-source tools based on SD that are easy to use.

Another really important point, I think, is that we argue in the post that trying to avoid dynamics involving racing toward TAI, copycatting, and open-sourcing of models will LESSEN X-risk. You wrote your comment as if we are trying to argue that preventing sex crimes is more important than X-risk. We don't say this. I recommend rereading the "But even if one does not view the risks specific to text-to-image models as a major concern..." paragraph and the "The role of AI researchers" section.

Finally, and I want to put a star on this point -- we all should care a lot about sex crime. And I'm sure you do. Writing off problems like this by comparing them to X-risk (1) isn't valid in this case because we argue for improving the dev ecosystem to address both of these problems, (2) should be approached with great care and good data if it needs to be done, and (3) is one type of thing that leads to a lot of negativity and bad press about EA.

I think this is probably even more true for your comments on entertainment value and whether that might outweigh the harms of deepfake sex crimes. First, I'm highly skeptical that we will find uses for text-to-image models that are so widely usable and entertaining that they would be commensurate with the harms of diffusion-deepfake sex crime. But even if we could be confident that entertainment would hypothetically outweigh sex crimes on pure utilitarian grounds, in the real world with real politics and EA critics, I do not think this position would be tenable. It could serve to undermine support for EA and end up being very negative if widespread.

Rohin Shah

"But even if we could be confident that entertainment would hypothetically outweigh sex crimes on pure utilitarian grounds, in the real world with real politics and EA critics, I do not think this position would be tenable."

Isn't this basically society's revealed position on, say, cameras? People can and do use cameras for sex crimes (e.g. voyeurism) but we don't regulate cameras in order to reduce sex crimes.

I agree that PR-wise it's not a great look to say that benefits outweigh risks when the risks are sex crimes but that's because PR diverges wildly from reality. (And if cameras were invented today, I'd expect we'd have the same PR arguments about them.)

None of this is to imply a position on deepfakes -- I don't know nearly enough about them. My position is just that it should in fact come down to a cost/benefit calculation.

"I could also very easily list a lot of non-entertainment uses of film involving stuff like education, communication, etc."

Random nitpick, but text-to-image models seem plausibly very useful for education and communication. I would love for people's slide decks with pages and pages of text to be replaced by images that convey the same points better. Maybe imagine Distill-like graphics / papers, except that it no longer takes 5x as long to produce them relative to a normal paper.

philljkc

We agree for sure that cost/benefit ought to be better articulated when deploying these models (see the What Do We Want section on Cost-Benefit Analysis). The problem here really is the culture of blindly releasing and open-sourcing models like this, using a Go Fast And Break Things mentality, without at least making a case for what the benefits are and what the harms are, and without appealing to any existing standard when making these decisions.

Again, it's possible (but not our position) that the specifics of DALLE-2 don't bother you as much, but certainly the current culture we have around such models and their deployment seems an unambiguously alarming development.

Using text-to-image models for education and communication seems like a great idea! Moreover, I think it's definitely consistent with what we've put forth here too, since you could probably fine-tune on graphics contained in papers related to your task at hand. The issue here really is that people are incurring unnecessary amounts of risk by making, say, an automatic Distill-er using all images on the internet or something like that, when training on a smaller corpus would probably suffice and would vastly reduce the possible risk of a model originally intended for Distill-ing papers. The fundamental position we advance is that better protocols are needed before we start mass-deploying these models, not that NO version of these models / technologies could be beneficial, ever.

philljkc

I think the core takeaway, at least from my end, is that this post elucidates a model, and tells a more concrete story, for how proliferation of technologies of a certain structure and API (e.g., general-purpose query-based ML models) can occur, and why they are dangerous. Most importantly, this entails that, even if you don't buy the harms of DALLE-2 itself (which, we have established, you should, in particular for its potential successors), this pattern of origination -> copycatting -> distribution -> misuse is a typical path for the release of technologies like this. If you buy that a dangerous capability could ever be produced by an AI model deployable with an API of the form query -> behaviour (e.g. by powerful automatic video generation from prompts, powerful face-editing tools given a video, or an agent with arbitrary access to the internet controlled via user queries), this line of reasoning could therefore apply and/or be useful. This informs a few things:

Technologies, once proliferated, are like a Pandora's Box (or indeed, a slippery slope), and so the very coordination problem / regulatory problem you speak of is most easily solved at the level of origination. This is a useful insight now, while many of the most dangerous AIs to be developed are yet to be originated.
The potential harms of these technologies come from their unbounded scope, i.e. from the generality of function, the lack of restriction of user access, or the parameter count of these models being so large as to make their behaviour inherently hard to reason about. All of these things make these kinds of models particularly amenable to misuse. So this post, in my mind, also takes a view on the source of capabilities risk from these models: their generality and open scope. This can therefore inform the kinds of models / training techniques that are more dangerous: e.g. those for which the scope is the widest, where most possible failures could happen because the right behaviour is more nebulously defined.

In general, I would urge you to consider this paragraph (in particular point (3)), the argument there seeming to be the bulk of your criticism.

"Overall, the slippery slope from the carefully-guarded DALLE-2 to the fully-open-source Stable Diffusion took less than 5 months. On one hand, AI generators for offensive content were probably always inevitable. However (1) not this soon. Delays in advancements like these increase the chances that regulation and safety work won’t be so badly outpaced by capabilities. (2) Not necessarily in a way that was enabled by companies like OpenAI and StabilityAI who made ineffective efforts to avoid harms yet claim to have clean hands while profiting greatly off these models. And (3) other similar issues with more powerful models and higher stakes might be more avoidable in the future. What will happen if and when video generators, GPT-N, advanced generalist agents, or other potentially very impactful systems are released and copycatted?"

In other words, it's maybe not as much about DALLE-2 itself, but about the extrapolation of this pattern to models like it, and ways to deal with that before a model with existential risk is brought up (and by that point, if the data is in on that, we're probably dead already).

Thanks for reading, and for the comment. I hope this clarifies the utility of this article for you.

Sharmake

This. It suggests that once a powerful AI is released, even in restricted format, others will be motivated to copy it. This is a bad dynamic for AGI, as existential risk only depends on the least-safe actor. If this dynamic is repeated, it's very possible that we all die due to copies of AGI being released.

ChristianKleineidam

"The potential harms of these technologies come from their unbounded scope"

Previous technologies also have quite unbounded scopes. That does not seem to me different from the technology of film. The example of film in the post you were replying to also has an unbounded scope.

"This can therefore inform the kinds of models / training techniques that are more dangerous: e.g. that for which the scope is the widest"

Technologies with a broad scope are more likely to be dangerous, but they are also more likely to be valuable.

If you look at the scope of Photoshop, it can already be used by people to make deepfake porn. It can also be used by people to print fake money.

Forbidding the deployment of broad-scope technologies would likely have prevented most of the progress of the last century and would put a huge damper on future progress as well.

When it comes to gene editing, our society decides to regulate its application but is very open that developing the underlying technology is valuable.

The analogy to how we treat gene editing would be to pass laws to regulate image creation. The fact that deepfake porn is currently not heavily criminalized is a legislative choice. We could pass laws to regulate it like other sexual assaults.

Instead of regulating at the point of technology creation, you could focus on regulating technology use. To the extent that we are doing a bad job at that currently, you could build a think tank that lobbies for laws to regulate problems like deepfake porn creation and that constantly analyzes new problems and lobbies for them to be regulated.

When it comes to the issue of deepfake porn, it's also worth looking at why it's not criminalized. When Googling, I found https://inforrm.org/2022/07/19/deepfake-porn-and-the-law-commissions-final-report-on-intimate-image-abuse-some-initial-thoughts-colette-allen/ which makes the case that it should be regulated but cites a government report suggesting that creating deepfake porn should be legal while sharing it should not. I would support making both illegal, but I think approaching the problem from the usage point of view seems the right strategy.

philljkc

"When it comes to gene editing, our society decides to regulate its application but is very open that developing the underlying technology is valuable."

Here, I would refer to the third principle proposed in the "What Do We Want" section as well (on Cost-Benefit evaluation): I think that there should be at least more work done to try and anticipate / mitigate harms done by these general technologies. Like what is the rough likelihood of an extremely good outcome vs. extremely bad outcome for model X being deployed? If I add modification Y to it, does this change?

I don't think our views are actually inconsistent here: if society scopes down the allowed usage of a general technology to comply with a set of regulatory standards that are deemed safe, that would work for me.

My personal view on the danger here is that there isn't enough technical work to mitigate the misuse of models, or even to enforce compliance in a good way. We really need technical work on that, and only then can we start effectively asking the regulation question. Until then, we might want to just delay the release of super-powerful successors to this kind of technology until we can give better performance guarantees for systems like this that are deployed this publicly.

Sharmake

Meanwhile, there are deep problems with the “let’s build transformative AI in order to make sure it’s safe” strategy. In particular, OpenAI and DeepMind both express that they want to race to generate highly transformative intelligent systems. The goal they both profess is to be the first to develop them so that they can exercise responsible stewardship and ensure that it is as aligned and beneficial as possible. This is a benevolent form of what Nick Bostrom refers to in Superintelligence as gaining a “decisive strategic advantage” which may make the first developer of particularly transformative AI too powerful to compete with. There are many problems with this strategy including: (1) It is entirely based on racing to develop transformative AI, and faster timelines exacerbate AI risks. This is especially perverse if multiple actors are competitively racing to do so. (2) Nobody should trust a small set of people like Sam Altman and Demis Hassabis to unilaterally exercise benevolent stewardship over transformative AI. Arguably, under any tenable framework for AI ethics, a regime in which a small technocratic set of people unilaterally controlled transformative AI would be inherently unethical. Meaningful democratization is needed. (3) OpenAI’s approach to DALLE-2 should further erode confidence in them in particular. Their overly-convenient technical report on risks that failed to make any mention of copycatting combined with how quickly they worked to profit off of DALLE-2 are worrying signs. (4) Copycatting makes racing to build transformative AI strictly more risky. Even if one fully-trusted a single actor like OpenAI or DeepMind to exercise perfect stewardship over transformative AI if they monopolized it, how quickly DALLE-2 was copycatted multiple times suggests that copycatting may undermine attempts at benevolent strategic dominance. 
Copycatting would most likely serve to broaden the set of technocrats who control transformative AI but still fail to democratize it. So if a company like OpenAI or DeepMind races to build transformative AI, and if it is still copycatted anyway, we get the worst of all worlds: unsecure, non-democratized, transformative AI on a faster timeline. If a similar story plays out with powerful, highly transformative AI as has with DALLE-2, humanity may be in trouble.

Let's be honest, a lot of the claims by OpenAI and DeepMind show bad signs of motivated reasoning. This is equivalent to a tobacco company claiming that its research helps make tobacco safe.

No, it only benefits the company in the form of profits.

[anonymous]

I have been playing with Stable Diffusion for the past week. (It's a bit addictive.) It's currently very time-consuming to make photorealistic deepfake images. It is probably easier to do so with Photoshop. What I can see happening is that people will use Stable Diffusion to make a lot of creative images for political messages intended to attack and mock the opposing side rather than to mislead.

[anonymous]

Thanks for the comment. I think that simple interfaces for SD like this are not particularly worrisome. But I think that now (1) inpainting/outpainting, (2) DreamBooth (see this SFW example), (3) GUIs that make it easy to use these, and (4) future advancements in diffusion models (remember that DALLE-2 was only released in April of this year) are the main causes for concern.

OpenAI developed DALLE-2. Then StabilityAI made an open source copycat. This is a dangerous dynamic.

Abstract

What’s wrong?

How did we get here?

What do we want?

Scoping of function

Limitations for access

Complete cost/benefit analysis

What should we do?

The role of AI researchers

The role of AI governance

The status quo: terms of use, content policies, and applications for access

Governance of originators

Governance of copycatters

Governance of distributors

The role of the public

Conclusion