On entities and events, AI alignment, responsibility and control, and consciousness in machines.
When someone says, for instance, that God exists, one way to proceed is by first establishing the definition of “God” and what it means to exist. While this seems straightforward, we know from experience that it rarely is. People associate different meanings with the word “God”. Even more complex is the property of existing. Is it necessary, for something to exist, that it be visible and tangible? Does a thing not exist if it is not observable, like a preference that is held but never revealed? Differences in underlying commitments make it even harder to reach consensus on basic definitions.
Now consider a different approach. Instead of attempting to settle on definitions, we begin with specific, self-contained claims. We then reduce any assertion about a phenomenon to a family of claims, each of which can be evaluated independently as true, false, indeterminable, or meaningless. One claim concerning the existence of God could be “when humans distance themselves from traditional religion, they invite destruction through worse substitutes.” This claim is either true or false. A separate claim could be “God dwells in heaven, which exists beyond this realm,” which is indeterminable.
Debates on free will notoriously fail to move beyond disagreements over definitions. The topic is therefore an ideal candidate for this approach. With the rise of AI, the question has acquired a new practical context: Can an AI be held accountable? Should it have rights? These questions presuppose answers to whether AI systems can possess what we call “free will”. Rather than attempting to agree on a definition of free will, let’s consider specific, self-contained claims about the phenomenon, from general ones to those specific to AI, and evaluate their truth.
1. All events are inevitable.
This claim is trivial as stated, in the sense that we cannot imagine any event in the world that could falsify it. It is true in the same way as the claim “all triangles have three sides” is true. No matter which part of the world or time period we consider, we will never find a triangle without three sides. Likewise, no matter what occurs in the actual world – whether I stand up or remain seated, whether a dog barks or stays silent, whether an AI withholds some information or reveals it – no occurrence past or future could disprove the claim that all events are inevitable.
Traditional claims about determinism give the impression of being substantive. But when we ask what the person who makes that claim knows about the actual world that the person who denies it does not, we struggle to find anything. In fact, he need not possess any particular knowledge of the world to make the inevitability claim successfully. In effect, he is not saying anything more than “events that occur, occur, and events that do not occur, do not occur.”
In many discussions of free will, the issue is framed by asking whether a person “could have done otherwise” in exactly the same situation, bringing in talk of alternative possibilities and other ways the world might have gone. But it is not clear what status we should assign to things that do not occur, beyond the fact that they are not real. The metaphysics of modality deserves its own treatment, which I will not attempt here. For now, I will only note that whatever argument is given to prove the inevitability of all events, in no case can the person show that he knows anything more about the actual world than his opponent.
2. No one can be held responsible for their actions.
When we talk about free will, we tend to assume the existence of an independent, inherent self who can have it or not have it, and with which we associate some deeper sense of responsibility beyond relationships between events. We think that the thing which has free will is an “I” which exists above and beyond the fabric of our skin and muscles, the bodily processes, the thoughts that form, and so forth. This view is challenged by Buddhism, empiricist philosophy, and modern science.
As Bertrand Russell put it, when we look at Mr. Smith, “we see a pattern of colors; when we listen to him talking, we hear a series of sounds. We believe that, like us, he has thoughts and feelings. But what is Mr. Smith apart from all these occurrences? A mere imaginary hook, from which the occurrences are supposed to hang. They have in fact no need of a hook, any more than the earth needs an elephant to rest upon…it is a collective name for a number of occurrences. If we take it as anything more, it denotes something completely unknowable, and therefore not needed for the expression of what we know.”
The self is thus not a substance that performs actions. It is a “convenient way of collecting events into bundles.”
Once we recognize this, we can see that responsibility must be located not in some unchanging entity but in the relationships between states of affairs themselves. In the substance view of the self, we say that the thing responsible for an event, like a light bulb turning on, is an entity: me. But in fact it is not I who turned the bulb on, but my pressing the switch. In other words, it is not an entity but another event that can be held responsible for the event that occurred.
Further, as David Hume argued, there is no observable “force” of causation. Therefore, responsibility can only refer to certain relationships of precedence and succession between events. When we say one event is responsible for another, we mean that they stand in a particular relation to each other.
In cases of manipulation, we would say, for instance, that my lying to her was responsible for her forming a wrong judgment, or that my injecting corrupt data was responsible for the AI’s making a mistaken judgment.
Seen this way, we can establish responsibility for every event that occurs. Factually, only another event can be held responsible for an event. It is merely for our convenience that we describe the event held responsible in terms of an entity, human or otherwise.
3. AI systems cannot be held accountable in the same way humans are, as they lack a unified or embodied self that can be rewarded and punished.
Since it is for our convenience that we think in terms of entities rather than events, it is worth considering why we do so. The obvious advantage is instrumental: there are certain benefits that can be derived by thinking and acting in terms of persons and holding entities responsible rather than events.
An easy way to understand this is to consider the usefulness of treating a corporation as if it were a person: a thing that is subject to creation and death, pays taxes and receives benefits, can be healthy or sick, can sue and be sued, can be controlled or get out of control, can be rewarded and punished, and so forth. If we tried to locate this corporate entity, we wouldn’t find it anywhere, just as we never find a “me” in any part of our body. The corporation is a useful fiction, a way of bundling together events and relationships that serves specific practical purposes.
To the extent that similar benefits are derived from holding AI systems responsible at one level rather than another, it serves the same purpose as holding a person or corporation responsible at a particular level. The norms of accountability that we have historically applied to humans, animals, corporations, and even deities can be understood in terms of these practical benefits.
Seen in this light, the debates about AI alignment are essentially a recent version of patterns we have seen before. In the late eighteenth and nineteenth centuries, for instance, the market came to be understood as a self-regulating device, an institutional mechanism that organized people’s lives through prices and incentives, with a logic of its own that no individual controlled. Later, the modern corporation emerged as an economic organism whose internal goals and powers could rival those of the state, again not reducible to the psychology of any one manager or shareholder. The Progressive Era struggles over monopolies and corporate regulation were, in effect, attempts at alignment: how to steer these large, semi-autonomous systems so that the patterns of events they produced did not simply follow the narrow logic of profit and competition, but remained compatible with the broader interests of citizens. Advanced AI systems can be seen as the latest instance of this familiar problem.
4. AI systems cannot truly deliberate like humans; their responses are automatic like those of animals.
Viewed up close, the neural activity of a human engaged in rational deliberation would appear as electrochemical impulses, no different in kind from automatic processes. Viewed from a distance, the instinctive responses of an animal reveal themselves to be part of something more complex and adaptive. So in what material way do deliberation and instinct differ?
One bias we have is to associate effort with intention and struggle with agency. We tend to value things more highly when greater effort goes into making them, irrespective of whether they are actually better. Suppose one person learns piano easily, while another struggles, painstakingly working out each association and movement. Does the latter have any special claim to the quality of the music? Clearly not. The same applies to someone who arrives at a judgment through lengthy deliberation compared to another who arrives at it quickly. The former does not have a privileged claim to being right. It is possible that the second person is simply more gifted or experienced.
The value of deliberation lies in the fact that it often leads us to better judgments in important matters, where “better” is judged by how appropriate the judgment proves to be, not by the process that produced it. Haste usually implies error in such matters, but this is a statistical correlation, not a metaphysical distinction. When a person deliberates, it does not inherently indicate better judgment, even though it often does. Our norms reflect this practical reality, allowing us to treat deliberation as relevant to accountability. But objectively, there cannot be anything inherently unique or superior about what we call deliberation. If an AI arrives at appropriate judgments without apparent struggle or delay, and we find ourselves questioning its capacity to deliberate, this reveals something about our biases, not about the AI’s lack of agency.
5. AI systems lack consciousness.
The truth of this claim, like all others, depends on how certain concepts are understood. In other words, based on how we define consciousness, we will be dealing with different, independent claims.
By consciousness, I do not mean merely the state of being active, like industrial equipment, or merely the capacity to experience emotions, as animals do. John Locke defined consciousness as “the perception of what passes in a man’s own mind.” Consciousness is a form of perception, which implies knowledge. When you perceive a tree, you form a representation of something external to your mind. When you are conscious, you form a representation of something internal: your own mental states and processes. This distinction suggests consciousness is not metaphysically unique; it is simply knowledge of a particular kind.
Consider what happens when you catch yourself getting angry. You notice the rising tension. This noticing is not the anger itself; it’s an awareness of the anger. You have formed knowledge about your own mental state. Now consider someone who becomes angry but lacks this awareness; they simply act on the anger without recognizing it. We say the first person has awareness of their anger in a way the second person does not, and this is because the first person possesses certain knowledge that the second lacks.
If consciousness is a kind of knowledge, how is such knowledge acquired? The same way all knowledge is acquired: by synthesizing data, by forming compressed representations that manifest as successful predictions.
Thus, to say that an entity is conscious is to say that it possesses certain knowledge that an entity lacking consciousness does not. Consciousness is, in principle, accessible to any entity capable of acquiring knowledge about its own states and processes, regardless of whether that entity is biological, capable of emotions, or human. This additional knowledge may yield practical advantages, though it need not. There is nothing in principle that prevents an AI from being conscious.
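For readers who think in code, the structure of this claim can be caricatured in a small sketch. It is purely illustrative – the names are my own inventions, and I am not suggesting that such a loop amounts to consciousness – but it shows the shape of the idea: a representation whose object is the system’s own states, validated by how well it predicts them.

```python
class SelfMonitor:
    """A toy system that maintains a compressed summary of its own
    internal state and tests that summary against what actually happens.
    'Self-knowledge' here is just a representation turned inward,
    measured by the success of its predictions."""

    def __init__(self):
        self.state = 0.0    # the internal process being observed
        self.summary = 0.0  # the system's representation of that process

    def step(self, stimulus):
        prediction = self.summary                 # what it expects of itself
        self.state = 0.9 * self.state + stimulus  # the process unfolds
        error = self.state - prediction           # how wrong the self-model was
        self.summary += 0.3 * error               # revise the self-model
        return abs(error)                         # quality of its self-knowledge

monitor = SelfMonitor()
for _ in range(50):
    monitor.step(stimulus=1.0)
print(round(monitor.step(stimulus=1.0), 3))  # prediction error shrinks over time
```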
6. AI systems lack control over their actions.
By control, I mean the ability to regulate something either according to a teleological pattern or according to one’s conscious intention. These are two distinct senses in which control is usually understood.
In the first sense, teleological control, we say, for example, that the sphincters in the bladder and urethra control the outflow of urine. They fulfill a purpose. When they malfunction, they fail to fulfill this purpose and thereby fail to exert control. This is control understood as functional regulation toward an end. Many machines possess control in this sense. The question is not whether such systems have purposes, since they clearly do, but whether these purposes are “real” in some deeper sense.
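The first sense can be made concrete with a deliberately simple sketch (in Python; the function names and numbers are illustrative, not drawn from any real control library). It regulates a quantity toward a setpoint the way a thermostat does, and the “purpose” it fulfills is nothing over and above this feedback pattern.

```python
def regulate(read_state, act, setpoint, gain=0.5, steps=100):
    """Teleological control as feedback: repeatedly measure the current
    state, compare it with the goal, and act to reduce the gap.
    Malfunction = failure to close the gap = failure to exert control."""
    for _ in range(steps):
        error = setpoint - read_state()  # distance from the end it serves
        act(gain * error)                # correction proportional to the error

# Toy usage: a "tank" whose level the controller holds near 10.0.
level = [0.0]
regulate(read_state=lambda: level[0],
         act=lambda delta: level.__setitem__(0, level[0] + delta),
         setpoint=10.0)
print(round(level[0], 3))  # converges on the setpoint
```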
This leads us to consider whether purposes must be self-generated to count as genuine. But on reflection, human purposes are no more self-generated than AI purposes. We do not generate our own desires from nothing; they arise from biological imperatives, cultural conditioning, prior experiences, genetic dispositions. The fact that we cannot trace our desires back to some uncaused origin – a notion that is itself conceptually incoherent or unknowable – does not make them less real or less effective in organizing our behavior.
In the second sense, conscious control, we say that a person has control over his bladder because his conscious intentions align with his ability to regulate it. To the extent that he lacks awareness of his intentions or is unable to regulate the release according to those intentions, he lacks control. This form of control depends on consciousness, and as we noted in the previous claim, consciousness is a form of knowledge.
In the real world, we find that people have varying degrees of control over different things at different times. A person driving a vehicle may have control over its speed and direction at one moment but lose it later due to a malfunction in the vehicle or a muscle spasm in his body. Control is thus not an all-or-nothing property but a matter of degree and context.
If an AI possesses the relevant self-knowledge and if its internal states successfully regulate its behavior toward ends, then it possesses control in both senses. Whether any current AI systems meet these criteria is an empirical question. But nothing in principle prevents AI systems from having control, any more than anything in principle prevents them from having consciousness or engaging in deliberation.
Conclusion
The list of claims I have considered here is far from exhaustive, and conceptions of terms like “consciousness”, “responsibility”, or “control” different from those I have used would generate their own independent claims. But the method suggested here is, I believe, the best way to explore the possibility of free will in AI, as indeed it is for humans.
