I'm new here, and this is my first post.
The central topic is the notion of sentience in machines: "Can machines develop sentience? What does sentience mean in the context of machines? And how do you reliably test for it?"
Last year there was sizeable buzz online about this topic when Blake Lemoine, an engineer at Google, claimed that LaMDA, a language model being developed at the company, had developed a capacity for emotion and self-awareness akin to that of a human being.
This sparked debates and discussions about the idea of "sentient machines". Some parties (including Google in an official report) felt the engineer's claims were unfounded and delusional.
Earlier, however, a Google Vice-President wrote in an individual statement that LaMDA had developed an extremely impressive capacity for social modelling, and suggested that AI systems were gradually approaching consciousness (in the sense of modelling the self).
A distinction between the Google VP's position and Lemoine's is that the VP describes LaMDA as being able to understand entities that feel, while Lemoine says LaMDA itself is able to feel.
If machines are truly capable of sentience, then ethical discussions about machine rights suddenly become important and urgent. It would be unethical for humans to keep making unilateral decisions about the use (and disposal) of technology which has somehow developed its own capacity for emotion.
In spite of the disagreement that exists in discussions about Artificial Sentience, something everyone generally agrees on is that AI systems are becoming increasingly capable. There is no reason to think Sentience is beyond the capacity of machines, either now or in the future.
The issue is that we're not sure how to definitively test for Artificial Sentience if/when it happens. Tests based on human judgement are subjective and unpersuasive. Lemoine, for example, was completely convinced of LaMDA's sentience. A good number of other people, though, were highly skeptical of his judgement.
To get some guidance on this issue of subjectivity, we can draw on Neuroscience:
In a subfield of Neuroscience called Affective Neuroscience, there are procedures to infer the emotional state of an individual from an fMRI (functional Magnetic Resonance Imaging) scan of their brain. From patterns in an fMRI scan, researchers can predict, at rates well above chance, whether an individual is sad, happy, angry, etc.
Procedures like this are of immense value in Medicine. For example, brain scans are used to detect consciousness in comatose patients. Usually you can tell if a human being is conscious by observing their behaviour, or their response to stimuli. You can't do this for comatose/unresponsive patients, so brain scans offer a valuable alternative.
Such procedures point towards an objective test of consciousness in humans. They are independent of individual human judgement; instead, they rely on a tested method of analysing brain scans to make inferences about consciousness.
How can we apply an analogous approach to machines?
What is an objective way to test for sentience in a machine/language model, independent of what it says and independent of people's personal opinions?
To do this, we can draw on structural parallels that exist between artificial neural networks (the technology underpinning today's AI systems) and the human brain.
A number of AI experts are quick to emphasise that artificial neural networks are only loosely inspired by the human brain, and so we should not attempt to draw serious analogies between them.
Some neuroscientists, however, are convinced enough of their similarities to employ artificial neural networks as a conceptual framework for understanding the human brain better. I side with these neuroscientists.
Studies in Artificial Intelligence [1, 2, 3, 4] provide techniques to "open up" artificial neural networks, and understand them at the level of individual neurons. Using these techniques, we can carry out "brain scans" on machines/language models to detect the "emotional state" they're experiencing at a given point in time.
This will provide an objective way to detect emotion (and consequently sentience) in a given machine/language model. We wouldn't have to debate over subjective tests and personal judgements, but could instead employ an objective procedure to "scan the brains" of machines, and obtain information about what they are (or are not) feeling.
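To make the "brain scan" idea a little more concrete, here is a minimal sketch in Python of one simple way such a readout could work: a linear probe that tries to predict a label from a network's internal activations. Everything in it is an assumption made for illustration. The activations are synthetic random vectors rather than recordings from any real language model, the two labels stand in for two hypothetical "emotional states", and names like SIGNAL and fake_activation are made up. It is not the method from the manuscript or from the cited studies, just the general shape of the idea.

```python
import random

random.seed(0)

DIM = 8  # hypothetical number of hidden units being "scanned"

# Assumption: the two states differ along one hidden direction.
# In a real experiment this would be unknown; here it generates toy data.
SIGNAL = [1.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]


def fake_activation(label):
    """Synthetic stand-in for one recorded hidden-layer activation vector."""
    sign = 1.0 if label == 1 else -1.0
    return [sign * s + random.gauss(0.0, 1.0) for s in SIGNAL]


def make_dataset(n):
    """Return n (activation, label) pairs with alternating labels 0 and 1."""
    return [(fake_activation(i % 2), i % 2) for i in range(n)]


def mean_vector(vectors):
    return [sum(v[j] for v in vectors) / len(vectors) for j in range(DIM)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_probe(train):
    """Difference-of-means probe: a direction plus a decision threshold."""
    mu1 = mean_vector([x for x, y in train if y == 1])
    mu0 = mean_vector([x for x, y in train if y == 0])
    direction = [a - b for a, b in zip(mu1, mu0)]
    midpoint = [(a + b) / 2 for a, b in zip(mu1, mu0)]
    return direction, dot(direction, midpoint)


def predict(probe, x):
    direction, threshold = probe
    return 1 if dot(direction, x) > threshold else 0


def accuracy(probe, data):
    return sum(predict(probe, x) == y for x, y in data) / len(data)


train = make_dataset(400)
test = make_dataset(200)  # held-out activations the probe never saw
probe = fit_probe(train)
print(f"held-out accuracy: {accuracy(probe, test):.2f}")
```

The probe is evaluated on held-out activations so that a high score reflects structure in the activations themselves rather than memorised examples. With a real model, the synthetic vectors would be replaced by activations recorded while the model processes labelled inputs, and how to obtain and label those is exactly the hard part.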
I wrote an initial research manuscript giving some more technical detail on this approach to detecting sentience in machines. Interested people can find it here.