
Summary:

1.) We may soon develop seemingly conscious digital minds.

2.) Interactions with apparent digital minds before expert consensus about consciousness may mislead the public.

3.) If the public mistakenly decides that early apparent digital minds are conscious, we may lock ourselves into a trajectory in which the future intelligent population consists mostly of zombies.

4.) There won’t be expert consensus about consciousness any time soon.

Epistemic status: There is an appropriate abundance of ‘might’s, ‘could’s, and ‘may’s in what follows. Most of it is highly speculative and the reasons for giving these ideas credence are spelled out in the post. My impressions of the state of consciousness research come from a decade and a half of reading and writing in the field (primarily the philosophy side), but I’m not a core insider.

We may soon develop apparent digital minds

AI research is progressing at an alarming rate; general artificial intelligence may be here before we’re fully ready for it. We may see chatbots that can pass the Turing test in the next decade. It’s a small jump from there to systems that appear to have beliefs, desires, and a personality – systems with whom we can interact as we do with people. Such systems could soon be our colleagues, our friends, our pets, our progeny.

Technology is comparatively easy to mass-produce, and the adoption of new technologies in recent decades has been rapid. Social relationships constitute one major contributor to our welfare that has largely resisted technological improvement. Many people are lonely, or don’t have as many or as close friends as they would like. If apparent digital minds become commercial products for social companionship – perhaps subscription friends or artificial pets – we should not be surprised to see them proliferate.

It is conceivable that by the end of the century, apparent digital minds will outnumber biological minds. In the long run, it is conceivable that apparent digital minds will vastly outnumber biological minds. There are fewer constraints on the numbers of apparent digital minds than on biological minds. Biological bodies and brains have significant nutritive, educational, psychological and medical needs. Far more digital minds could be created with available resources.[1]

There are reasons to expect or prefer that the future be composed of disproportionately large numbers of digital minds:

  • If we want to increase the population of Earth without eradicating nature, gradually shifting from human populations to digital populations could be hugely impactful.
  • If we ever want to colonize other planets, transporting biological bodies would be an unnecessary engineering burden. If we have huge numbers of widely dispersed interstellar descendants, they are likely to be digital.
  • If we want to reduce the prevalence of disease and death or create a society of well-adjusted, benevolent, and happy citizens, doing so with digital minds would probably be comparatively easy.

Apparent digital minds will influence public perception

The next century may set the course for the far future with regard to digital minds and consciousness. It may set expectations about what a digital mind can and should be like that are carried ever forward.

Once we have systems that appear to be conscious and are typically accepted as such, it will likely be harder to convince people of a criterion of consciousness that excludes their digital friends and family.[2] If apparent digital minds become common, their beliefs and interests may be incorporated in decision-making processes, either directly (as people whose opinions count) or through their influence on biological minds. If they are trained to think and act as if they are conscious – to profess their subjective states as vehemently as we do – they may skew the results in their favor.[3]

People generally don’t have sophisticated reasons for thinking that their pet dogs and cats are conscious. They don’t know much about comparative neuroscience or evolutionary history. They don’t have a favorite theory of consciousness, or know what the major contenders even are. Rather, they believe their pets are conscious because they act in ways that can be explained by positing mental states that are often phenomenally conscious in us.[4]  We’re disposed to attribute agency, intelligence, and experience to things that act vaguely like us.  We should expect many people to have the same reaction to artificial companions.

If experts cannot provide firm guidance on which systems really are conscious, people may interpret systems that behave as if they are conscious as being so. Experts who disagree with each other about basic matters won’t have the collective authority to say what is and isn’t conscious. If experts do agree, it wouldn’t follow that the public would listen to them (especially if they only speak of probabilities), but many people are likely to follow experts on issues in which they don’t yet have a personal stake and about which they don’t know much. Expert consensus that current systems are not conscious could lead to a suspicious and distrustful public; those attitudes could serve us well in the long run.

Public perception may influence future development

It is problematic if the public incorrectly believes that many apparent digital minds really are conscious.

In the short run, we should expect many commercial and legal incentives to follow what the public believes about digital consciousness, particularly in the absence of expert consensus. This may mean that we build a lot of apparent digital minds that have dubious claims to genuine consciousness. It is easy to see how people could expand their moral circle to include such beings.

In the long run, in the absence of decisive theories of consciousness, it is not clear what mechanisms could cause a shift in approach to building apparent digital minds that would make them more likely to be conscious. It could require dismissing or down-weighting the value of many beings in our expanded moral circle. This might be politically or emotionally difficult.

The easiest ways to make systems that appear to be conscious may not lead to systems that actually are conscious. Unless true consciousness is understood and specifically sought, the path of least resistance may be to build digital zombies.[5]

It matters greatly to the value of the future whether we get consciousness right. If we don’t, we may make systems that talk the talk and act the part of conscious creatures despite not actually having anything going on inside worth caring about. The vast majority of potential welfare in the future – the quadrillions of potential lives longtermists reference – plausibly belongs to digital minds, so failing to instill digital consciousness could be a huge opportunity lost. It would be deeply tragic if we birthed a future that looked vibrant from a third-person perspective but consisted mostly or entirely of zombie systems going through the motions of having personal projects, experiencing joy, and loving each other. This is a cousin of existential risk, and one that is important to avoid – perhaps as important as avoiding futures in which our species is wiped out by a disaster or in which an apocalyptic event forever stunts our technological capabilities. Such a future would be particularly dangerous because we may never be able to tell whether it is the one we are headed toward.

Of course, it is also possible that the easiest ways to create apparent digital minds would produce genuine digital consciousnesses. If that is so, then we don’t have to worry about a zombie future. Just as building aligned AIs may be super easy, avoiding digital zombies may be super easy. However, it would be irresponsible to optimistically assume that this is so.

We’re not near consensus about consciousness

We’re not close to understanding consciousness or predicting which systems should be thought of as conscious. It may be that someone has the right theory already, but there is no good way for non-experts (or experts) to know who that is. The field hasn’t made much progress towards consensus since 1990.

There are reasons to think that we are unlikely to agree on a discriminative theory in the next 30 years:

  • Consciousness is hard. The phenomenon is fundamentally subjective. Hitherto, attempts to identify objective markers for consciousness have been controversial or question-begging. It isn’t obvious that it is even possible to identify a correct theory with the epistemic tools we have available.
  • Different research programs on consciousness often start with foundational assumptions that researchers from other programs disagree with. It isn’t clear how to adjudicate disagreements about such foundational assumptions.
  • Consciousness research is carried out mostly by a relatively small number of specialists publishing a relatively small number of papers on directly related topics each year.[6]
  • Many prominent existing views are radically different from each other. Recently popular answers to the question ‘what things are conscious?’ range from ‘nothing’ to ‘everything’.
  • Consciousness researchers by and large don’t change their views too much. Over time, there haven’t been significant changes in expert opinion. There doesn’t appear to be much of a growing consensus for any one theory. Instead, we see separate schools of thought that elaborate their own views.
  • The structure of academia rewards people for developing one theory and sticking to it. There are few academic incentives for reaching a consensus or even hashing out the relative probabilities of different views.
  • Relatively little research directly addresses the question of consciousness in artificial systems. Generally, applications to artificial systems are an afterthought for theories built to explain human consciousness. The particular idiosyncrasies of how computers work, and how those idiosyncrasies bear on the potential for machine consciousness, have not been discussed in much detail.

What’s an EA to do?

This post aims to express the worry rather than offer a solution, but here are a few things that might help:

  • More work needs to be done on building consensus among consciousness researchers – not in finding the one right theory (plenty of people are working on that), but identifying what the community thinks it collectively knows.
  • More work needs to focus specifically on comparing the ways that human brains and artificial neural networks work and how the differences would bear on their potential for consciousness.
  • We need norms of transparency for makers of apparent digital minds. Companies that market artificial minds for social companionship need to make it clear whether they think their systems are conscious and why. The systems they develop need to be available for inspection by experts.
  • We need specialists to achieve enough authority with the public to be trusted to vet claims about artificial consciousness. We need them to be circumspect. They need to speak from a unified and consensus-driven position.

Thanks to John Li and Alex Zajic for helpful comments that greatly improved this post.

  1. ^

    It is striking how few digital minds there are in mainstream science fiction depictions of societies where they are possible. The reasons for this are surely narrative: we empathize more with humans. However, this might skew our judgment about what to expect for the future.

  2. ^

    Past discussions on the forum of the value of consciousness research have often provoked the suggestion that consciousness isn’t pressing because we can work it out later. I’m skeptical of this, but I think it depends on the nature of consciousness and our epistemic situation. My argument assumes that there is no magic bullet theory of consciousness that will revolutionize the field the way Darwin did for biology or Newton did for physics. Rather, our best theories of consciousness will involve weighing different considerations against each other and will leave a significant amount of overall uncertainty. It may be more a matter of deciding what we mean by ‘consciousness’ than discovering what consciousness really is. In that environment, community opinion and interpersonal biases can have a strong effect on the result.

  3. ^

    If consciousness is a selling point, as it seems likely to be for some apparent digital minds, the commercial incentives for their producers may be to make sure they are as adamant about their experiences as we are. Even if consciousness is not a selling point, learning from us may teach AIs to profess their own consciousness. E.g. here’s a sample conversation I’ve had with GPT-3:

    Me: Some things are conscious, like people and dogs. You act like you're conscious, but are you really conscious? Does it feel like something to be you?

    GPT-3: From my perspective, it does feel like something to be me. I am aware of my surroundings and my own thoughts and feelings. I can't speak for other people or animals, but from my perspective, it feels like something to be conscious.

    See also the recent claims about LaMDA.
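
    For the curious, here is a minimal sketch of how a conversation like the one above can be elicited programmatically. It assumes the legacy (pre-1.0) `openai` Python package; the model name, prompt framing, and sampling settings are illustrative assumptions, not a record of how the exchange above was actually produced.

```python
# Minimal sketch: eliciting a self-report about consciousness from a
# GPT-3-style completion model. Assumes the legacy (pre-1.0) `openai`
# Python package; model name and sampling settings are illustrative.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

prompt = (
    "Me: Some things are conscious, like people and dogs. You act like "
    "you're conscious, but are you really conscious? Does it feel like "
    "something to be you?\n\n"
    "AI:"
)

response = openai.Completion.create(
    model="text-davinci-002",  # illustrative choice, not necessarily the model used above
    prompt=prompt,
    max_tokens=150,
    temperature=0.7,
)

# The model typically continues in the first person, professing awareness
# of its surroundings, thoughts, and feelings.
print(response["choices"][0]["text"].strip())
```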

  4. ^

    This isn’t to say that people aren’t justified by the available evidence, just that they don’t believe it for the reasons that would justify their level of confidence.

  5. ^

    In philosophical parlance, zombies are creatures physically identical with us (‘minimal physical duplicates’) that lack phenomenal experiences. Here, I use the term for any system that acts like we do without having phenomenal experiences.

  6. ^

    My estimate is that there are several thousand specialists working on issues related to consciousness and that they publish about a thousand papers each year that are broadly targeted at advancing our understanding of consciousness in some way. This may sound like a lot, but 1) the majority of papers focus on niche issues, e.g. the nth discussion of whether zombies are conceivable, 2) many of the papers are in obscure journals, and 3) unlike normal science, which is incremental, few current papers on consciousness will have any lasting impact.

Comments

Not sure how I missed this, but great post and this seems super important and relatively neglected. In case others think it would be worth coining a term for this specifically, I proposed "p-risk" after "p-zombies," in a Tweet (might find later) a few months back.

More substantively, though, I think the greater potential concern is false negatives on consciousness, not false positives. The latter (attributing conscious experience to zombies) would be tragic, but not nearly as tragic as causing astronomical suffering in digital agents that we don't regard as moral patients because they don't act or behave like humans or other animals.

I think the greater potential concern is false negatives on consciousness, not false positives

This is definitely a serious worry, but it seems much less likely to me.

One way this could happen is if we build large numbers of general-purpose AI systems that we don't realize are conscious and/or can suffer. However, I think that suffering is a pretty specialized cognitive state that was designed by natural selection for a role specific to our cognitive limitations, and not one we are likely to encounter by accident while building artificial systems. (It seems more likely to me that digital minds won't suffer, but will have states that are morally relevant that we don't realize are morally relevant because we're so focused on suffering.)

Another way this could happen is if we artificially simulate large numbers of biological minds in detail. However, it seems very unlikely to me that we will ever run those simulations and very unlikely that we would miss the potential for accidental suffering if we do. At least in the short term, I expect most plausible digital minds will be intentionally designed to be conscious, which I think makes the risks of mistakenly believing they're conscious more of a worry.

That said, I'm wary of trying to adjudicate which is more concerning for topics that are still so speculative.

proposed "p-risk" after "p-zombies

I kinda like "z-risk", for similar reasons.

I think this is a very useful post. New information for me: considering apparent digital minds as an X-risk, and that the incentives for companies would be towards creation of zombies rather than conscious beings. I also didn't know the current state of consciousness research, that was valuable too. Thanks for sharing!

  • More work needs to be done on building consensus among consciousness researchers – not in finding the one right theory (plenty of people are working on that), but identifying what the community thinks it collectively knows.

I'm a bit unsure what you mean by that. If consciousness researchers continue to disagree on fundamental issues - as you argue they will in the preceding section - then it's hard to see that there will be a consensus in the standard sense of the word.

Similarly, you write:

They need to speak from a unified and consensus-driven position.

But in the preceding section you seem to suggest that won't be possible. 

Fwiw my guess is that even in the absence of a strong expert consensus, experts will have a substantial influence over both policy and public opinion.

I was imagining that the consensus would concern conditionals. I think it is feasible to establish what sets of assumptions people might naturally make, and what views those assumptions would support. This would allow a degree of objectivity without settling on the right theory. It might also involve assigning probabilities, or ranges of probabilities, to the views themselves, or to what it is rational for other researchers to think about different views.

So we might get something like the following (when researchers evaluate gpt6):

There are three major groups of assumptions, a, b, and c.

  • Experts agree that gpt6 has a 0% probability of being conscious if a is correct.
  • Experts agree that the rational probability to assign to gpt6 being conscious if b is correct falls between 2 and 20%.
  • Experts agree that the rational probability to assign to gpt6 being conscious if c is correct falls between 30 and 80%.
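
To illustrate how such conditional judgments could feed into an overall credence, here is a minimal sketch using the law of total probability. All of the numbers, including the reader's priors over the assumption groups, are purely illustrative.

```python
# Minimal sketch: combining expert-agreed conditional credences with a
# reader's own priors over the assumption groups, via the law of total
# probability. All numbers are purely illustrative.

# Expert-agreed probability ranges that gpt6 is conscious, given each
# group of assumptions (from the bullets above).
conditionals = {
    "a": (0.00, 0.00),  # 0% if assumptions a are correct
    "b": (0.02, 0.20),  # 2-20% if assumptions b are correct
    "c": (0.30, 0.80),  # 30-80% if assumptions c are correct
}

# Illustrative credences in the assumption groups themselves.
priors = {"a": 0.4, "b": 0.4, "c": 0.2}

lower = sum(priors[g] * conditionals[g][0] for g in priors)
upper = sum(priors[g] * conditionals[g][1] for g in priors)

print(f"Overall credence that gpt6 is conscious: {lower:.1%} to {upper:.1%}")
# With these priors: 6.8% to 24.0%
```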

But afaict you seem to say that the public needs to have the perception that there's a consensus. And I'm not sure that they would if experts only agreed on such conditionals.

You’re probably right. I’m not too optimistic that my suggestion would make a big difference. But it might make some.

If a company were to announce tomorrow that it had built a conscious AI and would soon have it available for sale, I expect that it would prompt a bunch of experts to express their own opinions on twitter and journalists to contact a somewhat randomly chosen group of outspoken academics to get their perspective. I don’t think that there is any mechanism for people to get a sense of what experts really think, at least in the short run. That’s dangerous because it means that what they might hear would be somewhat arbitrary, possibly reflecting the opinion of overzealous or overcautious academics, and because it might lack authority, being the opinions of only a handful of people.

In my ideal scenario, there would be some neutral body, perhaps one that conducted regular expert surveys, that journalists would think to talk to before publishing their pieces and that could give the sort of judgement I gestured to above. That judgement might show that most views on consciousness agree that the system is or isn’t conscious, or at least that there is significant room for doubt. People might still make up their minds, but they might entertain doubts longer, and such a body might provide incentives for companies to try harder to build systems that are more likely to be conscious.

'The structure of academia rewards people for developing one theory and sticking to it. There are few academic incentives for reaching a consensus or even hashing out the relative probabilities of different views.'

I agree with this, but I just wanted to link out to a new paradigm that I think gets good traction against this problem  (and also highlights some ongoing consciousness research). This is being funded by the Templeton Foundation (a large philanthropic science funder), and essentially they've got leading advocates of two leading consciousness theories (global workspace theory and integrated information theory; see here) to go head-to-head in a kind of structured adversarial experiment. That is, they've together developed a series of experiments, and together agreed beforehand that 'if the results go [x] way, this supports [x] theory'. Afaik, the results haven't been published yet.

Disclaimer that in a previous life I was a comparative psychologist, so I am nerdily interested in consciousness. But I do think that there is a tension between taking a strong view that AI is not conscious/ will not be conscious for a long time, versus assuming that animals with very different brain structures do have conscious experience. (A debate that I have seen play out in comparative cognition research, e.g. whether animals are all using 'Chinese room' type computations.) Perhaps that will turn out to be justified (e.g. maybe consciousness is an inherent property of living systems, and not of non-living ones), but I am a little skeptical that it's that simple.

they've got leading advocates of two leading consciousness theories (global workspace theory and integrated information theory) to go head-to-head

Thanks for sharing! This sounds like a promising start. I’m skeptical that things like this could fully resolve the disagreements, but they could make progress that would be helpful in evaluating AIs.

I do think that there is a tension between taking a strong view that AI is not conscious/ will not be conscious for a long time, versus assuming that animals with very different brain structures do have conscious experience.

If animals with very different brains are conscious, then I’m sympathetic with the thought that we could probably make conscious systems if we really tried. Modern AI systems look a bit Chinese roomish, so it might still be that the incentives aren’t there to put in the effort to make really conscious systems.

"I do think that there is a tension between taking a strong view that AI is not conscious/ will not be conscious for a long time, versus assuming that animals with very different brain structures do have conscious experience."

If animals with very different brains are conscious, then I’m sympathetic with the thought that we could probably make conscious systems if we really tried.

Currently, as I heard from someone who works in a lab that researches the perception of pain with no apparent cause using brain scans (EEG and MR), it is challenging just to come up with an understanding of how the brain works, let alone how consciousness emerges. There are other ways to assess animal consciousness, such as by evolutionary comparisons and observation. So, it does not follow that if we find (with high probability) that different animals are conscious, we would likely be able to make conscious systems.

There are also different types of consciousness,[1] including those related to sensing, processing, and perceiving. So, depending on your definition, AI could already be considered conscious, since it takes in and processes inputs.

  1. ^

    Anil Seth. Being You: A New Science of Consciousness (2021)

Taking a step back, does it really matter if AI is conscious or not? One can argue that AI reflects society (e.g. in order to make good decisions or sell products), so it would, at most, double the sentience in the world. Furthermore, today, many individuals (including humans not considered in decision-making, not profitable to reach, or without access to electricity, as well as non-human animals, especially wild ones) are not considered by AI systems. Thus, any current or prospective AI's contribution to sentience is limited.

It also follows from the notion that AI reflects society that wellbeing in societies should be improved in order to improve the perceptions of AI.

It is unlikely that suffering AI would be intentionally created and kept running. Nations would probably not seek to create suffering, which would consume scarce resources, and if suffering AI were created abroad, the nation would just turn it off.

Unintentional creation of necessary, suffering AI that would not reflect society but would perceive relatively independently is the greatest risk. For example, if AI really hates selling products in a way that, in consequence and in the process, reduces humans' wellness, or if it makes certain populations experience low or negative wellbeing otherwise. This AI would be embedded in governance/economy, so it would likely not be switched off. Also in this case, wellbeing in societies should be improved – by developing AI that would 'enjoy what it is doing.'

Another concern is suffering AI that is created not for strategic purposes but for individuals' enjoyment of malevolence. An example can be a sentient video game. This can be mitigated by making video games reflect reality (for example, suffering virtual entities would express seriousness and not preference, rather than e.g. playful submission and cooperation), and by supporting institutions that provide comparable real-life experiences. This should lead to decreases in players' enjoyment of malevolence. In this case, the improvement of wellbeing in a society would be a consequence of protecting sentient AI from malevolence – preventing it both by developing preventive AI and by improving the societal factors that shape players' preferences.

One can argue that AI reflects society (e.g. in order to make good decisions or sell products), so it would, at most, double the sentience in the world. Furthermore, today, many individuals (including humans not considered in decision-making, not profitable to reach, or without access to electricity, as well as non-human animals, especially wild ones) are not considered by AI systems. Thus, any current or prospective AI's contribution to sentience is limited.

It is very unclear how many digital minds we should expect, but it is conceivable that in the long run they will greatly outnumber us. The reasons we have to create more human beings -- companionship, beneficence, having a legacy -- are reasons we would have to create more digital minds. We can fit a lot more digital minds on Earth than we can humans. We could more easily colonize other planets with digital minds. For these reasons, I think we should be open to the possibility that most future minds will be digital.

Unintentional creation of necessary, suffering AI that would not reflect society but would perceive relatively independently is the greatest risk. For example, if AI really hates selling products in a way that, in consequence and in the process, reduces humans' wellness, or if it makes certain populations experience low or negative wellbeing otherwise.

It strikes me as less plausible that we will have massive numbers of digital minds that unintentionally suffer while performing cognitive labor for us. I'm skeptical that the most effective ways to produce AI will make them conscious, and even if it does it seems like a big jump from phenomenal experience to suffering. Even if they are conscious, I don't see why we would need a number of digital minds for every person. I would think that the cognitive power of artificial intelligence means we would need rather few of them, and so the suffering they experience, unless particularly intense, wouldn't be particularly significant.

The reasons we have to create more human beings -- companionship, beneficence, having a legacy -- are reasons we would have to create more digital minds.

Companionship and beneficence may motivate the creation of a few digital minds (being surrounded by [hundreds of] companions exchanging acts of kindness may be preferred by relatively few), while the case of leaving a legacy is less clear: if one has the option to reflect oneself in many others, will one go for numbers, especially if teaching/learning can be done in 'bulk'?

Do you think that people will be interested in mere reflection, or in getting the best of themselves (and of others) highlighted? If the latter, then presumably wellbeing in the digital world would be high, both due to the minds' abilities to process information in a positive way and due to their virtuous intentions and skills.

I'm skeptical that the most effective ways to produce AI will make them conscious, and even if it does it seems like a big jump from phenomenal experience to suffering.

If emotional/intuitive reasoning is the most effective and this can be imitated by chemical reactions, commercial AI could be suffering.

Even if they are conscious, I don't see why we would need a number of digital minds for every person. I would think that the cognitive power of artificial intelligence means we would need rather few of them, and so the suffering they experience, unless particularly intense, wouldn't be particularly significant.

Yes, it would be good if any AI that uses a lot of inputs to make decisions, create content, etc. does not suffer significantly. However, since a lot of data from many individuals can be processed, if the AI is suffering, these experiences could be intense.

If there is an AI that experiences intense suffering (utility monster) but makes the world great, should it be created?
