This is a post for debate week. Feel especially free to disagree and/or ask for clarifications.
I’m starting off debate week by weakly disagreeing with the debate statement. I have too many strong uncertainties to judge whether work on AI welfare would be positive or negative. I'm not very confident in any of the points below, and I hope to change my mind for good reasons this week! Where possible, I identify cruxes so that you can tell me why I'm wrong in a way that will change my mind.
Below are some reasons for my current position, in no particular order:
I’m unsure whether we can in principle ascertain whether a digital mind is conscious.
I’m ready to accept the chance of consciousness in animals because of extensive analogies to conscious human behaviour (like anhedonia, avoiding negative stimuli, responding differently to anaesthetics, etc.), plus a shared evolutionary history. Digital minds (or any AI systems that we would have some reason to suspect are conscious) would be developed very differently from human minds, with very different incentives (such as acting in ways that humans prefer). This could lead to behaviour analogous to conscious human behaviour, but via a very different mechanism or purpose, one that does not actually produce qualia. I do not know how we could tell the difference, even in theory.
A crux here is that philosophy of mind doesn't really make much progress, and additionally, that we are unlikely to find a convincing science of consciousness.
AI welfare success could mean existential failure
Putting money into AI welfare research and/or activism increases the chance of a future where we treat (at least some) AI systems as having moral value comparable to humans'. If we are wrong about this, and they are not in fact conscious, this could be a disaster:
- In the shorter term, because treating the AI systems nicely might cost resources which could otherwise be used to accelerate technological progress, helping conscious humans and animals.
- In the longer term, because a world full of professedly happy digital minds which are in fact non-conscious is a world devoid of value.
The worlds where EA involvement in this issue is useful may be very few
The world where EA research and advocacy for AI welfare is most crucial is one where the reasons to think that AI systems are conscious are non-obvious, such that we require research to discover them and advocacy to convince the broader public of them. But I think the world where this is true, and where the advocacy succeeds, is a pretty unlikely one.
If we are in a world where advocacy for AI welfare succeeds, then I think it is very likely that the AI systems used by the majority of the population are incentivised to act as if they were conscious, and to form close relationships with their users. In this world, the features of AI systems that advocates for their rights/welfare would point to would be surface-level and very visible. That is, we would not require research, or openness to weird ideas, in order to convince people to consider AI rights/welfare.
Alternatively, if we are in a world where the signs of true AI consciousness are not visible without research (i.e. they are not isomorphic to visible features of the AI, such as the text it outputs), then 1) research is unlikely to change people's minds if they already find AI consciousness very implausible, and 2) it is also unlikely to change their minds if they find it very plausible and the research argues that the AI is not in fact conscious. So whether the public ends up convinced or unconvinced about AI consciousness and welfare, research won't be the deciding factor.
A crux that I have here is that research which takes a while to explain is not going to inspire a popular movement. This links to another crux: that AI welfare would have to be popular in order to be enforced.
Okay, what comes to mind for me here is quantum mechanics and how we've come up with some pretty good analogies to explain parts of it.
Do we really need to communicate the full intricacies of AI sentience to say that an AI is conscious? I guess that this isn't the case.
I think this is creating a potential false dichotomy?
Here's what I believe might happen in AI Sentience without any intervention as an example:
1. Consciousness is IIT (Integrated Information Theory)- or GWT (Global Workspace Theory)-based in some way or another. In other words, there is some underlying field of sentience, like the electromagnetic field, and when parts of that field interact in specific ways, "consciousness" appears as a point load in the field.
2. Consciousness is then only verifiable if this field has consequences for the other fields of reality; otherwise, it is non-Popperian, like multiverse theory.
3. Number 2 is really hard to prove and so we're left with very correlational evidence. It is also tightly connected to what we think of as metaphysics, meaning that we're going to be quite confused about it.
4. Therefore, legislators and researchers in general leave this up to chance and do not compute any complete metrics, as it is too difficult a problem. They hope that AIs don't have sentience.
In this world, adding some AI sentience research from the EA direction could have the consequences of:
1. Making AI labs have consciousness researchers on board so that they don't torture billions of iterations of the same AI.
2. Making governments create consciousness legislation and think tanks for the rights of AI.
3. Creating technical benchmarks and theories about what is deemed to be conscious (see this initial, really good report, for example).
You don't have to convince the general public; you have to convince the major stakeholders to adopt tests that check for AI consciousness. It honestly seems quite similar to what we have done for the safety of AI models, but for their consciousness instead?
I'm quite excited for this week, as it's a topic I'm very interested in, but also one I feel I can't really talk about much or take seriously because it's a bit fringe. So thank you for having it!
I think this is a good description of the kind of scepticism I'm attracted to, perhaps to an irrational degree. Thanks for describing it!
I like your point about AI Safety. It seems at least a bit true.
I'll update my vote on the banner to be a bit less sceptical: I think my scepticism of the potential for us to k...