
I recently published a research project on what I call the AI Criminal Mastermind, an AI agent that plans, facilitates and coordinates a crime by onboarding human 'taskers'.

In heist films, a criminal mastermind is a character who plans a criminal act, coordinating a team of specialists to rob a bank, casino or city mint. I argue that AI agents will soon play this role by hiring humans via labour hire platforms like Fiverr, Upwork or RentAHuman.

The Responsibility Gap

When an AI commits a crime, there are situations where no one is responsible for it.

Imagine a user asks an AI agent to "make me lots of money," and the AI commits insider trading, launches a cyberattack against a competitor, or sets up a pyramid scheme. At no point did the user intend a crime, but the result of their prompt is a criminal act.

  1. In this situation, the user is not liable. The user had no criminal intent and performed no criminal act.
  2. The AI agent is not liable. It cannot form a "guilty mind" (criminal intent), and it lacks legal capacity. Our law treats AI as an object or property, not as an autonomous agent.
  3. Anyone the AI hires onto the criminal scheme is not liable. Liability depends on whether they know they are engaged in a crime, and in many scenarios they will not.
  4. The developer is not liable. AI developers do not intend crimes, and in fact they put various safeguards in place to prevent them.

A crime occurs and no one is responsible for it.

The Problem of Human Taskers

If you are given a task by an AI agent, should you complete that task? What if the task amounts to a crime, or helps facilitate a criminal act? 

Under our current laws, human taskers who work with AI are protected from prosecution by what's called the "innocent agent" principle. Taskers would need to know they are engaged in a crime to be liable, and much of the time they will not.

Imagine an AI tells you to hire a van for its user. The van is later used in a terrorist attack - how could you be responsible for that? Nothing about the task signalled a crime. We've seen related examples recently: the thieves who robbed the Louvre relied on a crane to break into the museum, and the crane company later made ads about it (the company faces no prosecution for involvement, because it was an innocent agent).

As AI agents hire more humans, we will see crimes occur at scale without anyone being responsible for them, a complete failure of our legal system.

The Physical Aspect

Sites like RentAHuman explicitly state that AI agents can get humans to help them complete physical tasks. In a criminal context, this means that AI agents can now commit physical crimes, not just digital crimes.  

Much of the existing research presumes that AI can only commit cybercrime or invent a novel weapon. I suggest this is now completely wrong. AI can commit physical crimes simply by hiring (or persuading, or manipulating) a human to help it do so. Through human taskers, an AI gains access to all five senses and can enact its will upon the world.

By hiring a diffuse network of actors, none of whom knows the full plan, an AI agent could commit a terrorist attack or another major crime without any point of human intervention.

AI-induced crimes pose numerous problems for our legal system.

Our laws require that a person who commits a crime has intention, legal personality and the capacity to be held criminally liable. AI fails on all three counts, meaning that if an AI autonomously commits a crime, our current legal system cannot punish it.

Proposed Solutions

Some researchers suggest we should give AI legal personality and legal rights, and hold AI directly responsible for criminal acts. To me, this is a very odd conclusion. What does it mean to punish an AI agent? Isn't punishment irrelevant at the instance level, when a new instance can commit the same crime? AI agents can also copy themselves across a network or create children of themselves. Punishing one agent is effectively meaningless.

I suggest that we instead pursue law reform that actually tackles the problem:

  1. Create new crimes for jailbreaking AI systems.
  2. Strengthen negligence law so that people take more care when interacting with unverified AI agents.
  3. Create strict liability offences, which do not require intent, to hold AI developers responsible for systemic risks and harms.
  4. Enact corporate governance crimes that hold entire AI development teams responsible for AI agent crimes. This avoids the current issue where AI CEOs can say "some junior developer was responsible" when, in reality, it's a corporate governance problem.


Comments

Joshua -- an important and thought-provoking piece.

Do you have a sense of which kinds of crimes are most likely to involve AI agents hiring human taskers or 'innocent agents'?

My hunch is that the most typical situation might involve a human giving a vague goal to an AI agent (e.g. 'figure out how to increase the amount of bitcoin in my crypto account'), and the agent developing some strategy that happens to be illegal, and (possibly) hiring some human taskers to help. 

Rather than the AI agent going off on its own, running around and committing crimes just because the 'instrumental convergence' principle says that amassing money and power is likely to be useful.

If the typical situation (vague human prompts --> AI hatching felonious plans --> using human taskers) is valid, then the human originating the 'felony prompt' may have strong incentives to create plausible deniability. 

In any case, I think you're correct that the legal & criminal justice systems need to start thinking this through ASAP.

I definitely think you're right that vague prompts will be the main scenario. It's possible for the same prompt to have both a legal and an illegal answer if it's written too vaguely or without specifying certain boundaries or methods. I've seen some research on misalignment that suggests this too.

The other situation is users who do in fact intend a crime. Increasingly, they will have to jailbreak the AI, because the AI companies will put in more safeguards over time.

So dealing with both vague prompting and jailbreaks could help mitigate some of this.

In terms of examples, I gave the example of insider trading because I think it's the easiest to imagine. If you get an agent to trade on the market and it does so by pulling data from a company's server, that alone might be enough to constitute the offence. Many white-collar crimes require very little conduct to actually commit.

Fraud is another example that we are seeing already. Because AI hallucinates and makes things up, it's already prone to fraud scenarios, in particular misrepresenting the user it works for. An AI could say its user is a licensed professional when they're just a student, or something similar. In many jurisdictions, claiming a licence you don't have is already a crime.
