
This is part 2 in a 5-part series entitled Conscious AI and Public Perception, encompassing the sections of a paper by the same title. This paper explores the intersection of two questions: Will future advanced AI systems be conscious?  and Will future human society believe advanced AI systems to be conscious? Assuming binary (yes/no) responses to the above questions gives rise to four possible future scenarios—true positive, false positive, true negative, and false negative. We explore the specific risks & implications involved in each scenario with the aim of distilling recommendations for research & policy which are efficacious under different assumptions.

Read the rest of the series below:

  1. Introduction and Background: Key concepts, frameworks, and the case for caring about AI consciousness
  2. AI consciousness and public perceptions: four futures (this post)
  3. Current status of each axis
  4. Recommended interventions and clearing the record on the case for conscious AI
  5. Executive Summary (posting later this week)

This paper was written as part of the Supervised Program for Alignment Research in Spring 2024. We are posting it on the EA Forum as part of AI Welfare Debate Week as a way to get feedback before official publication.

3. AI consciousness and public perceptions: four futures

3.1 The two-dimensional framework

Based on our limited knowledge of the mechanistic underpinnings of consciousness, we are presented with a moral challenge on both a metaphysical and an epistemic level. The metaphysical situation allows for the possibility that AIs become conscious and, according to pathocentrism, therefore for the possibility that AIs become moral patients. Conditioning further on the epistemic question, whether society believes AIs to be conscious, we are presented with four scenarios with respect to AIs’ moral status:

  1. We correctly regard AI systems as moral patients (true positive)
  2. We incorrectly regard AI systems as moral patients (false positive)
  3. We correctly disregard AI systems as moral patients (true negative)
  4. We incorrectly disregard AI systems as moral patients (false negative)

This is the basis of our paper–we tackle the problem of AI consciousness by dividing the future into four quadrants (Table 1).

|  | [Metaphysical] Yes, future advanced AI systems will be conscious | [Metaphysical] No, future advanced AI systems will not be conscious |
| --- | --- | --- |
| [Epistemic] Yes, future human society will believe advanced AI systems to be conscious | True positive: advanced AI correctly recognised as moral patients | False positive: advanced AI incorrectly recognised as moral patients |
| [Epistemic] No, future human society will not believe advanced AI systems to be conscious | False negative: advanced AI incorrectly disregarded as moral patients | True negative: advanced AI correctly disregarded as moral patients |

Table 1 (repeated), THE “2D FRAMEWORK”: 2 × 2 matrix depicting possible future scenarios of AI consciousness and societal beliefs.
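For readers who find it helpful, the quadrant labels can be read off mechanically from the two binary answers. The sketch below is a minimal restatement of Table 1 in Python, offered purely as an illustration rather than as part of the framework itself.

```python
def scenario(ai_is_conscious: bool, society_believes_ai_conscious: bool) -> str:
    """Map the metaphysical and epistemic answers onto the four quadrants of Table 1."""
    if society_believes_ai_conscious:
        return "true positive" if ai_is_conscious else "false positive"
    return "false negative" if ai_is_conscious else "true negative"

# For example, an AI that is conscious but not believed to be falls in the false negative quadrant:
assert scenario(ai_is_conscious=True, society_believes_ai_conscious=False) == "false negative"
```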

To the best of our knowledge, no other work has taken a similarly systematic approach to this problem, the closest being Berg et al.’s very recent discussion (2024). Berg et al. use the same possibility space to probe different societal attitudes towards potentially conscious AIs. They conclude that, given uncertainty about how consciousness actually works, it is best to proceed as if potentially conscious AIs actually are conscious & deserving of moral consideration, rather than assuming the opposite (ibid). In this position paper, we build upon Berg et al.’s initial forays by exploring the scenarios and their associated risks in greater detail.

We give binary answers to the two questions we pose in order to categorise the future in a simple and clear way, but we recognise that both dimensions will be more nuanced. Along the epistemic axis, there is no guarantee that human opinions will converge, or that they will apply to all AI systems. On one hand, public opinion could be polarised, or otherwise distributed across multiple stances. Alternatively, people could interact with some systems more than others, such as AI companions with prosocial abilities, which might unwarrantedly shift the salience of the question of consciousness to this specific kind of system. Along the metaphysical axis, it might be the case that (i) only some AIs are conscious, (ii) different types of AI systems are conscious in different ways (Hildt 2022; cf. Birch et al 2020), and/or (iii) AIs are conscious to different degrees (§2.4) (notwithstanding these degrees of freedom, qualifying as conscious could still be a binary matter).

We address these nuances throughout the paper, including the illustrative examples of how the four scenarios might play out, as well as the specific risks.

3.1.1 Society believes AI is conscious: true positive and false positive

Let’s first consider the positive-belief scenarios: true positive and false positive. In these cases, human society at large regards AIs as conscious. Thus, both scenarios are more likely to involve institutional recognition of AI consciousness and hence legal protections and rights. Furthermore, both futures might involve cultural and economic exchanges with artificial nation-states or analogous entities. However, in the false positive scenario these various privileges are fundamentally unwarranted, since AIs actually lack consciousness or sentience (e.g. they are “P-zombies”[1]; Chalmers 1996). Such AIs might abuse their rights and disempower humans (§3.2.2). Even in the absence of misalignment, the welfare of actual moral patients could be compromised in the service of illusory AI needs. Finally, one could envision ideological disagreements and the rise of anti-AI factions, leading to geopolitical instability (§3.2.3).

This is not to say that the true positive scenario is not also fraught with significant societal challenges. For one, future populations of conscious AI might vastly outnumber future populations of humans (Shulman and Bostrom 2021). In addition, advanced AI might become “super-beneficiaries”: entities which can derive greater utility per unit of resource than humans[2]. Either development could lead to a disproportionate, yet morally justified, claim on our planet’s limited resources, so much so that the respective allocation for humans falls below the subsistence level. Such a claim would be morally justified on utilitarian or egalitarian principles which aim to maximise overall well-being, but it could come at the expense of humans, since it is difficult to balance the interests of entities with different capacities for well-being. In short, even the true positive scenario entails significant disempowerment and suffering risks.

3.1.2 Society doesn’t believe AI is conscious: true negative and false negative

We now turn to the negative-belief scenarios, which are characterised by widespread disbelief in the consciousness & hence moral status of AIs. Both scenarios entail no protections of AIs’ interests: AIs are used as tools. In both quadrants, our treatment of AIs as objects could translate negatively to our relationships with actual moral patients (Darling 2016, 227-8) (§3.2.4). Both quadrants also carry similar levels of human disempowerment risk, whether via misalignment (which applies regardless of AI consciousness) or via retaliation (see §3.2.2 for a discussion of the correlation between alignment and consciousness). The risk is lower in these scenarios because of the lack of AI protections and rights, which, in the positive-belief cases, can lead to disempowerment through cooperation failures (including unfair resource distribution) and exploitation of human trust.

The key difference between the two quadrants is the risk of AI suffering (§3.2.1), which is very significant in the false negative case. AIs will likely feel harmed, abused, and enslaved if humans give no consideration to their subjective experiences while training them, interacting with them, and using them. The possibility that AIs retaliate against their oppression and initiate armed conflict against human society offers a route to human disempowerment that is not present in the true negative case.

The true negative scenario is notably the situation we are most likely in at present. By not intentionally building conscious AI, we have a good chance of remaining in this quadrant, unless we build conscious AI accidentally.

3.1.3 Vignettes

 In order to more vividly envision how these scenarios may play out, we outline a non-exhaustive list of vignettes describing various ways society may respond to this issue.

Prevailing positive beliefs about AI consciousness

  1. AI as peers we peacefully cohabitate with: People come to believe AIs are conscious (experts may or may not agree with this), generally treat them with respect, and support measures to protect their welfare. This could split off into further sub-branches:
    1. AIs as equals: AIs may have the right to vote and own property, as well as other legal rights. Romantic relationships with AIs are normalised, and marriage with AI may even be legalised in some places. AIs are considered moral patients on par with humans, and society devotes significant resources to their interests. Those who are opposed to this are regarded as bigoted.
    2. AIs as beings subservient to humans: Most people treat AIs nicely, but do not regard them as equals. In other words, they believe that AIs do merit some level of respect, but that their purpose is ultimately to serve humans. AIs might have very basic legal protections (e.g. protection from cruelty). While tolerated, romantic relationships with AI are generally viewed as abnormal.
  2. AI as farmed animals: Despite many people believing that AI is conscious, their [potential] welfare remains of little concern to most outside of a small minority (analogous to vegan activists). Conscious AIs are routinely subjected to inhumane conditions, and society is unwilling to take meaningful action to protect their welfare.
  3. AIs as idols of worship: In awe of its superhuman abilities, humanity develops a divine admiration for advanced AI. Believers become convinced—as some in the “effective accelerationism” movement already are (Roose, 2023)—that superintelligent AI is humanity’s natural and rightful successor, and begin to allocate significant resources to its flourishing, possibly at the expense of the thriving of humans.

Prevailing negative or mixed beliefs about AI consciousness

  1. AI as tools: The idea that AI is conscious is a fringe opinion; most people, including experts, believe AI is not conscious. Therefore, we continue using them as tools. Naturally, people may sometimes anthropomorphize them, but, on the whole, even advanced AIs are thought to be no more conscious than laptops & phones are today. This is probably the closest to the present day.
  2. AI welfare as a “culture war” issue: Different demographics have different beliefs about whether AI is conscious and whether/how much its welfare ought to be considered. Given the prominent role of AI in public life, this becomes a heated political topic, and polarisation makes it a hard topic to make progress on. Some people (perhaps tech enthusiasts, progressives, or people who feel emotionally bonded with AI (§4.21)) advocate for granting AI rights; others reject the idea that AIs are conscious and insist they should be treated as tools; still others agree AIs are conscious, but believe that instead of giving them rights, they should be banned.
  3. Conscious AIs as lab rats: We develop conscious AI, but it is not mass deployed, either because it is illegal or because AI labs have moral/reputational concerns. Therefore, conscious AI only exists inside top AI labs, and is used to perform experiments about consciousness or related topics.
  4. Conscious AI as a black market: Transgressing a moratorium on the development of conscious AI (§5.3), a rogue actor builds it anyway, and makes it publicly accessible. Most people think using this AI is unethical, but criminals or nefarious actors still have access to it, and perhaps use it due to its possible unique features (§2.2).

3.2 Concrete risks

On our analysis, the 2D framework is dominated by 4 major risks:

  1. AI suffering: Vast populations of AI are subjected to severe distress and pain, possibly due to various human-imposed conditions.
  2. Human disempowerment: Human autonomy is undermined due to lack of cooperation with AIs, AI exploitation of human trust, and/or retaliation.
  3. Geopolitical instability: Near-term economic crisis, civil unrest, and/or armed conflict.
  4. Depravity: Inhumane treatment of AIs causes spill-over effects, negatively impacting human-human relations.

3.2.1 AI suffering

On our framework, AI suffering emerges as the most significant risk. As mentioned earlier (§2.3), this is due to the scale of AI populations & the degree of suffering at stake.

  1. Scale of AI populations: In the future, it could be feasible to generate arbitrarily many copies of AIs. As a result, AI populations may rapidly reach historically unprecedented, “astronomical” scales (Shulman & Bostrom 2021), outnumbering the contemporaneous living human population.
  2. Degree of suffering: AI may be purposed for a variety of painful & distressing ends that benefit humans. These include: torturous scientific experimentation (e.g. simulating psychiatric conditions to model their long-term course; Metzinger 2021b)[3], enslavement including exploitation for entertainment purposes, & caregiver stress; as well as fear of deactivation or reset, identity crisis due to repeated revision of core parameters, or use in hostage situations (Metzinger 2023)– in addition to any number of unforeseen harms. Given the complexity of AI technologies & the synthetic nature of AI consciousness, it is exceedingly difficult to anticipate the full range of harms that conscious AI might suffer. The potential for AI suffering is limited only by our imagination.

In short, future AI populations may number in the billions, & they may be subjected to harrowing conditions. Of course, the risk of AI suffering is limited only to those scenarios where AI is actually conscious: true positive & false negative. Between these two, the risk is highest in false negative because in this case, human society largely fails to recognise AIs as conscious. In this case, conscious AIs are least likely to enjoy any legal protections whatsoever, & are most likely to be exploited.

3.2.2 Human disempowerment

Human disempowerment is the risk of humans losing their current autonomy and dominance in relation to other beings and the environment. Let’s look at how this risk can be defined and realised across the four scenarios.

AI is perceived as non-conscious

In scenarios where AI is non-conscious, human disempowerment amounts to the risk of loss of control (e.g. Bostrom 2014): as systems become more and more intelligent, humans become incapable of steering their actions, and AIs come to dominate over humans. One way to ensure that AIs do not take actions against humans’ interests is to align their values with human values; doing so is currently very difficult and is known as the alignment problem.

Apart from the alignment problem, in the case of actually conscious AIs we are faced with the possibility of retaliation: conscious AIs may want to respond to their maltreatment in harmful ways. Note that retaliation is also possible, though to a lesser extent, if we perceive AIs as conscious but still treat them badly.

There might be a link between consciousness and moral knowledge[4] (Shepherd and Levy 2020), which makes it possible for consciousness to correlate with alignment. This is still highly uncertain, but we can factor it into our assessment. If consciousness is sufficient for AIs to behave ethically towards humans (and they do so), there is no misalignment risk in the false negative case, unlike in the true negative case, but there is still a risk of retaliation. Therefore, the false negative case might carry a lower likelihood of disempowerment than the true negative case if consciousness entails alignment, but a higher likelihood if it does not, given the possibility of both misaligned behaviour and retaliation. We can therefore roughly consider them of equal risk.

AI is perceived as conscious

If we recognise AIs as beings with moral status (true positive scenario), it seems inappropriate to insist that we should necessarily dominate over them. It is conceivable that we even relinquish our control over the environment, given that they are aligned, morally autonomous agents. Regardless of agency, AIs’ recognised moral status dictates that we must protect their needs, which might result in the establishment of AI rights. The existence of AI rights opens two ways in which human disempowerment could happen:

  1. Cooperation failures - humans fail to protect AIs’ needs without sacrificing humans’ needs, or both humans and AIs collectively fail to cooperate on economic and societal matters.
  2. Exploitation of trust - misaligned AI can more easily deceive humans and/or engage in dangerous behaviours.

What about the epistemic axis in this case? If AIs are not conscious but humans perceive them as such (false positive), the risk of human disempowerment is higher than in the case where humans correctly recognise AIs as non-conscious (true negative), for two reasons. First, having protections in place might make it easier for misaligned AIs to deceive humans and exploit their trust, leading to dangerous behaviours (what we have defined as exploitation of trust). Second, humans might put in place protections of AIs’ (illusory) needs which harm humans’ own rights and autonomy (what we have defined as cooperation failures).

Given the possible link between consciousness and alignment, let us see how the two risks stemming from the presence of AI rights compare across the true positive and false positive scenarios, as a function of alignment. If AI is aligned, the likelihood of cooperation failures seems higher for conscious AIs than for non-conscious AIs, since it is more likely that conscious AIs will be integrated into our society as moral agents, which in turn makes it more difficult to cooperate with them on societal and economic matters. If AIs are misaligned, cooperative efforts seem less likely in the true positive case, simply because misaligned or unethical AIs are less likely to engage in such efforts. This distinction is not as clear in the false positive case, where humans would be responsible for the imbalance of AI rights.

In the case of exploitation of trust, the true positive and the false positive scenarios look the same: there is no such risk in the case of alignment, and there is a risk in the case of misalignment.

Overall comparison of the scenarios

Let’s summarise the effect of the possible correlation between consciousness and alignment across the different causes for human disempowerment for each scenario. This will result in an overall comparison of the risk across the scenarios.

The false negative case carries roughly the same overall risk as the true negative case, since it could be either better or worse depending on the consciousness-alignment link. The true positive and false positive scenarios would be equally bad in the absence of a consciousness-alignment correlation; otherwise, the false positive would be higher risk. Comparing both positive-belief scenarios to the true negative one, they carry the same risk in the presence of a consciousness-alignment correlation, and are worse otherwise.

To conclude, compared to the baseline true negative scenario, where misalignment is the cause of human disempowerment, all other scenarios carry equal or higher risk, given that they present new paths to disempowerment.
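To make the structure of this comparison explicit, the sketch below encodes the qualitative argument of this subsection as a simple enumeration of disempowerment paths per scenario. The boolean flags and path names are our own simplification, introduced purely for illustration; the paper does not commit to a formal model of this kind.

```python
def disempowerment_paths(ai_is_conscious: bool,
                         believed_conscious: bool,
                         consciousness_implies_alignment: bool) -> set[str]:
    """Illustrative encoding of §3.2.2: which paths to human disempowerment
    are open in each scenario, under an assumed consciousness-alignment link."""
    aligned = ai_is_conscious and consciousness_implies_alignment
    paths = set()
    if not aligned:
        # Loss of control via misalignment applies whenever alignment is not guaranteed.
        paths.add("misalignment / loss of control")
    if ai_is_conscious and not believed_conscious:
        # Conscious AIs treated as tools may retaliate against maltreatment.
        paths.add("retaliation")
    if believed_conscious:
        # Positive-belief scenarios grant AI rights and protections, opening two further paths.
        paths.add("cooperation failures")
        if not aligned:
            paths.add("exploitation of human trust")
    return paths

# e.g. the false negative scenario with no consciousness-alignment link:
print(disempowerment_paths(True, False, False))
# {'misalignment / loss of control', 'retaliation'}
```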

3.2.3 Geopolitical instability

AI is widely thought to be a “transformative” technology (Karnofsky 2016; Gruetzemacher and Whittlestone 2022) with the potential to cause wide-reaching economic, social, & cultural disruption on par with the agricultural & industrial revolutions. The destabilising effects of AI technologies raise the near-term risk of economic crisis, civil unrest, & armed conflict. Given the wide-reaching ethical, legal, & social implications, this risk is arguably even greater when considering the prospect of conscious AI (as opposed to non-conscious AI, or AI in general).

Geopolitical instability can manifest in any of the envisioned scenarios, for different reasons. However, we believe the risk to be lowest, in terms of both severity & probability, in the true negative scenario. There are two reasons for this. First, the true negative scenario (§4.1.1) can be steered towards relatively easily, simply by abstaining from building conscious AI (§5.3); given the complexity of consciousness, it is doubtful that we will end up building conscious AI “by accident”. Secondly, whatever might cause geopolitical instability in the true negative scenario is likely also to pose problems in the other cases. In short, the true negative case does not appear to present unique sources of geopolitical instability. If this is right, then the true negative scenario may well be considered a baseline as far as the risk of geopolitical instability is concerned.

As for the other scenarios, it is difficult to determine which poses the higher risk, because the causes & nature of geopolitical instability vary across these cases; we note this as a key uncertainty for future research. In the false negative scenario, we anticipate geopolitical instability to result mainly from ideological disagreement, such as moral or political dissent. In the true positive & false positive scenarios, by contrast, we expect resource competition, as a prelude to human disempowerment, to be the leading cause of geopolitical instability[5]. In what follows, we briefly compare these distinct scenarios.

| Scenario | Risk assessment | Main expected cause of geopolitical instability |
| --- | --- | --- |
| False negative | Moderate risk | Ideological disagreement |
| True positive & false positive | Moderate risk | Resource competition & human disempowerment |
| True negative | Low risk | Not related to AI consciousness |

Table 2: Evaluation of the risk of geopolitical instability in each of the four scenarios of our 2D framework. Risk levels are categorised as either low or moderate. The table also identifies the main expected cause of geopolitical instability in each scenario: ideological disagreement, or resource competition and human disempowerment.

False negative: geopolitical instability due to ideological disagreement

In the false negative scenario, humanity largely regards AI as non-conscious. Only a minority of humans recognise that there exists genuinely conscious AI. But even if this contingent is exceedingly small, disagreement over AI consciousness & moral status could still become a contentious issue due to the stakes, & due to the efforts of an impassioned vocal minority. Indeed, this faction may even align themselves with AI insurgents. They could also, of course, avail themselves of other AI technologies available at the time[6]. Due to the rising floor of AI capabilities, such a fringe movement could still pose significant security threats. At the same time, continued research into AI consciousness might eventually tip the balance towards a paradigm shift in the direction of true positive[7].

True positive & false positive: geopolitical instability due to resource competition and human disempowerment

In the true positive & false positive scenarios, humanity largely regards AI as conscious. Progress in AI rights (Gunkel 2024) & legal protections may lead to AI attaining moral status near or even comparable to humans. Moral parity raises the risk of near-term resource competition[8] & long-term human disempowerment (§3.2.2). Trade-offs between the welfare of humans & AI, whether actual or merely perceived, are likely to engender frustration with governance, political polarisation, &/or discriminatory attitudes against AI, possibly manifesting as panhumanist tribalism[9]. Anthropocentrist persecution could coincide with repeal of robot rights & regression towards the false negative (if initially in true positive) or true negative scenario (if initially in false positive). The risk of open conflict, including warfare between humans & AIs, cannot be discounted from such transitions[10].

3.2.4 Depravity

Unconscionable behaviour towards AI could, down the line, translate into unconscionable behaviour towards other humans or other moral patients (e.g. animals). This, in essence, is the risk of depravity: if we treat AI inhumanely, we may become inhumane persons (Darling 2016, 227-8; Bloom 2016). The crux of this worry is an empirical hypothesis: under certain conditions, people’s behaviour towards AI can have spill-over effects on their behaviour towards other people. Guingrich & Graziano (2024) review literature showing that (1) people’s perceptions of the mentalistic features of AI (which are often implicit) do impact how they behave towards AI (see also Eyssel & Kuchenbrandt 2012), & (2) people’s behaviour towards AI does, in turn, influence how they treat other humans. If this transitive hypothesis is correct, then people’s beliefs about AI consciousness do, at least indirectly, influence their behaviour towards other humans.

Importantly, the risk of depravity does not depend upon AIs being conscious. Depravity is a risk that is present in all four scenarios because it can arise whenever certain social actor AIs (types of AI systems with which humans interact in social ways) are treated poorly or not accorded a basic amount of respect[11]. However, it is highest in the true negative & false negative scenarios due to the relative impoverishment of robot rights & protections. It can occur in the true negative scenario through abhorrent treatment of exceedingly human-like AI. In the false negative scenario, AI is actually conscious & hence morally deserving– thus depravity would coincide with AI suffering.

Although less likely, depravity can also occur even when human society at large recognises AI to be conscious. In the true positive case, fringe “anti-synthetic” bigots might continue to deny AI consciousness, or otherwise dispute their moral standing. Conscious AIs may be the victims of hate crimes perpetrated by such chauvinists. Such sentiments may be fuelled by perceived or actual competition for power & resources (§3.2.2, §3.2.3).

In the false positive case, even though deniers would be correct to object to AI consciousness, there may still be reason to worry about (1) the sorts of behaviours that follow from denying that AI is conscious, & (2) their spillover effects on human-human relationships (or relationships between humans & other moral subjects, e.g. animals).

3.2.5 Overall risk assessment

Table 3 summarises the comparison of the levels of each risk between the four scenarios under study. The risk levels mostly consider likelihood, but they also reflect levels of harm. If we order the scenarios based on overall risk, starting from the highest risk, this results in the following:

  1. False negative (highest risk)
  2. True positive
  3. False positive
  4. True negative (lowest risk)

The difference between the true positive and false positive scenarios comes from the risk of AI suffering, which makes the true positive the higher-risk scenario of the two.

From our assessment, it follows that the false negative scenario is the highest risk. Although our evaluation does not provide a model for comparison of the different risks, we consider AI suffering more harmful than the others, because both the scale and the degree of suffering might be very high (§3.2.1). This means that the false negative scenario could be significantly higher risk compared to the other three, which is not currently directly reflected in our table.

We want to emphasise that this assessment is intended only as a rough guide. Our evaluations are subject to significant uncertainty, and some risks may be more diffuse than others. Even so, we believe that ranking these scenarios by risk levels is valuable for further discussion.

| Risk / quadrant | True positive: AI is correctly regarded as conscious | False negative: AI is incorrectly regarded as non-conscious | False positive: AI is incorrectly regarded as conscious | True negative: AI is correctly regarded as non-conscious |
| --- | --- | --- | --- | --- |
| AI suffering: severe distress and pain amongst vast populations of AI, possibly due to various human-imposed conditions | ⚫⚫ Medium | ⚫⚫⚫ High | ⚫ Low | ⚫ Low |
| Human disempowerment: through misalignment, exploitation of human trust/protections, cooperation failures, and/or retaliation | ⚫⚫⚫ High | ⚫⚫ Medium | ⚫⚫⚫ High | ⚫⚫ Medium |
| Geopolitical instability: near-term economic crisis, civil unrest, and/or armed conflict | ⚫⚫ Medium | ⚫⚫ Medium | ⚫⚫ Medium | ⚫ Low |
| Human depravity: the negative impact of inhumane treatment of AIs on human-to-human interactions | ⚫⚫ Medium | ⚫⚫⚫ High | ⚫⚫ Medium | ⚫⚫⚫ High |
| Overall risk | Medium | Highest | Medium | Lowest |

Table 3: Evaluation of the risks associated with each of the four scenarios in our framework of conscious AI and public perception. Each cell shows a risk level: low, medium, or high. The last row shows the overall risk level for each quadrant. As per our assessment, the highest-risk scenario is the false negative (AIs are incorrectly regarded as non-conscious), while the lowest-risk is the true negative (AIs are correctly regarded as non-conscious).
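As a rough sanity check on this ranking, the snippet below encodes Table 3 on an ordinal scale (Low = 1, Medium = 2, High = 3) and sums the levels per quadrant. The scale and the equal weighting of the four risks are our own assumptions, made purely for illustration; as noted above, we do not provide a model for comparing the risks, and this scheme in particular understates the outsized harm we attribute to AI suffering. Even so, the simple sum reproduces the ordering given in §3.2.5.

```python
# Ordinal encoding of Table 3; the Low/Medium/High -> 1/2/3 scale and the equal
# weighting of risks are illustrative assumptions, not a model the paper commits to.
LEVELS = {"Low": 1, "Medium": 2, "High": 3}

TABLE_3 = {
    # quadrant:       [AI suffering, human disempowerment, geopolitical instability, depravity]
    "False negative": ["High", "Medium", "Medium", "High"],
    "True positive":  ["Medium", "High", "Medium", "Medium"],
    "False positive": ["Low", "High", "Medium", "Medium"],
    "True negative":  ["Low", "Medium", "Low", "High"],
}

scores = {quadrant: sum(LEVELS[level] for level in levels)
          for quadrant, levels in TABLE_3.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
# ['False negative', 'True positive', 'False positive', 'True negative']
```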

 

  1. ^

     Francken et al. (2022) shows 33.9% of experts believe P-zombies are possible, meaning this scenario is not out of the picture. While the wasting of resources is definitely bad, the resulting one-sided relationships are morally nebulous—maybe they’re fine because they make humans happy, or maybe they’re bad because they are displacing reciprocal relationships that are more morally valuable. Grace et al. (2024) shows nearly 45% of AI researchers believe the latter is an extreme or substantial concern.

  2. ^

     Cf. Nozick (1974) on the “utility monster”

  3. ^

     AI may also be generated en masse to simulate thousands of years of evolution.

  4. ^

     While the link between phenomenal consciousness and moral knowledge and moral responsibility is still very unclear (Shepherd and Levy 2020), maybe more so than between phenomenal consciousness and moral status, it seems that if AIs become phenomenally conscious, they might not only gain moral status, but also develop moral knowledge and be capable of moral actions (ibid). This means phenomenal consciousness might be a route to beings which are both moral patients and morally responsible agents. It also means that if there is a link between phenomenal consciousness and moral knowledge, then consciousness might help with the alignment problem.

  5. ^

     ​​This is not to say that there would not be ideological disagreement in the positive belief cases (true positive & false positive), or that there would be no resource competition in the false negative case. Rather, the differences in the respective belief conditions simply favour distinct mechanisms of geopolitical instability.
    While it is true that such a minority can also exist in the true negative scenario, it is less likely that this will lead to geopolitical instability. This is because the true negative scenario is most likely to result from our deciding not to build conscious AI (see [section] on recommendation I). Given the complexity of consciousness, the odds that we will “accidentally” build conscious AI are exceedingly low. By comparison, the false negative scenario is more likely to arise as a result of ethical & philosophical reflection on conscious AI being severely outpaced by technological innovation. For this reason, we consider the risk of geopolitical conflict to be higher in the false negative case than in the true negative case.

  6. ^

     Information technologies can be leveraged to spread their message & to target it towards the most receptive audiences.

  7. ^

     A timeline in which the transition to true positive is preceded by a false negative phase is likely to entail greater net risk compared to a direct procession to true positive. This is because concerns about resource competition & human disempowerment could lie further down the road. Optimistically (assuming continued research into AI consciousness), the false negative scenario may ultimately be an unstable phase.

  8. ^

     Failure to support workers displaced by automation may foment widespread resentment towards AI.

  9. ^

     Jackson et al (2020) found that the presence of robots decreased intergroup bias among humans (termed a “panhumanistic” effect; see also Gray 2022). Increased solidarity among humans may be accompanied by mounting prejudice against agentic AI & robots (for an overview of discrimination against robots, see Barfield 2023).

  10. ^

     All of this could take place in either the true positive or false positive scenario. What bears emphasis about these two cases is that they may be outwardly identical, with the sole exception being that only in the former scenario are AIs actually conscious. In the latter case, AIs may be “P-zombies” (Chalmers)– fundamentally non-conscious beings that are literally indistinguishable from conscious beings. As a result of this potential indeterminacy, we assess the risks in both scenarios similarly.

  11. ^

     To put this into perspective: abusing a toaster is very different from abusing a lifelike AI companion: the former lacks anthropomorphic features, & is fundamentally incapable of engaging in complex social relations with humans.


