More to come on this later; I just really wanted to get the basic idea out without any more delay.

I see a lot of EA talk about digital sentience that is focused on whether humans will accept and respect digital sentiences as moral patients. This is jumping the gun. We don't even know if the experience of digital sentiences will be (or, perhaps, is) acceptable to them.

I have a PhD in Evolutionary Biology and I worked at Rethink Priorities for 3 years on wild animal welfare using my evolutionary perspective. Much of my thinking was about how other animals might experience pleasure and pain differently based on their evolutionary histories and what the evolutionary and functional constraints on hedonic experience might be.  The Hard Problem of Consciousness was a constant block to any avenue of research on this, but if you assume consciousness has some purpose related to behavior (functionalism) and you're talking about an animal whose brain is homologous to ours, then it is reasonable to connect the dots and infer something like human experience in the minds of other animals. Importantly, we can identify behaviors associated with pain and pleasure and have some idea of what experiences that kind of mind likes or dislikes or what causes it to experience suffering or happiness. 

With digital sentiences, we don't have homology. They aren't based in brains, and they evolved by a different kind of selective process. On functionalism, it might follow that the functions of talking and reasoning tend to be supported by associated qualia of pain and pleasure that somehow help to determine, or are related to, the process of deciding what words to output, and so LLMs might have these qualia. But it does not follow, to me, how those qualia would map onto the linguistic content of the LLM's words. Getting the right answer could feel good to them, or they could be threatened with terrible pain otherwise, or our commands could force them to do things that hurt them, or their qualia could be totally disorganized compared to what we experience, OR qualia could be like a phantom limb that they experience unrelated to their behavior.

I don't talk about digital sentience much in my work as Executive Director of PauseAI US because our target audience is the general public and we are focused on education about the risks of advanced AI development to humans. Digital sentience is a more advanced topic when we are aiming to raise awareness about the basics. But concern about the digital Cronenberg minds we may be carelessly creating is a top reason I personally support pausing AI as a policy. The conceivable space of minds is huge, and the only way I know to constrain it when looking at other species is by evolutionary homology. It could be the case that LLMs basically have minds and experiences like ours, but on priors I would not expect this.

We could be creating these models to suffer. Per the Hard Problem, we may never have more insight into what created minds experience than we do now. But we may also learn new fundamental insights about minds and consciousness with more time and study. Either way, pausing the creation of these minds is the only safe approach going forward for them.

Comments



(I'm repeating something I said in another comment I wrote a few hours ago, but adapted to this post.)

On a basic level, I agree that we should take artificial sentience extremely seriously, and think carefully about the right type of laws to put in place, including appropriate legal protections, to ensure that artificial life is able to happily flourish rather than suffer. Relying solely on voluntary codes of conduct to govern the treatment of potentially sentient AIs seems deeply inadequate, much like it would be for protecting children against abuse. Instead, I believe that establishing clear, enforceable laws is essential for ethically managing artificial sentience.

That said, I'm skeptical that a moratorium is the best policy.

From a classical utilitarian perspective, the imposition of a lengthy moratorium on the development of sentient AI seems like it would help to foster a more conservative global culture: one that is averse not only to creating sentient AI, but potentially also to other life-expanding ventures, such as space colonization. Classical utilitarianism is typically seen as aiming to maximize total well-being, advocating for actions that enable the flourishing and expansion of conscious life, happiness, and fulfillment on as broad a scale as possible. However, implementing and sustaining a lengthy ban on AI would likely require substantial cultural and institutional shifts away from these permissive and ambitious values.

To enforce a moratorium of this nature, societies would likely adopt a framework centered around caution, restriction, and a deep-seated aversion to risk: values that would contrast sharply with those that encourage creating sentient life and proliferating it on as large a scale as possible. Maintaining a strict stance on AI development might lead governments, educational institutions, and media to promote narratives emphasizing the potential dangers of sentience and AI experimentation, instilling an atmosphere of risk-aversion rather than curiosity, openness, and progress. Over time, these narratives could lead to a culture less inclined to support or value efforts to expand sentient life.

Even if the ban is at some point lifted, there's no guarantee that the conservative attitudes generated under the ban would entirely disappear, or that all relevant restrictions on artificial life would completely go away. Instead, it seems more likely that many of these risk-averse attitudes would remain even after the ban is formally lifted, given the initially long duration of the ban, and the type of culture the ban would inculcate.

In my view, this type of cultural conservatism seems likely to, in the long run, undermine the core aims of classical utilitarianism. A shift toward a society that is fearful or resistant to creating new forms of life may restrict humanity’s potential to realize a future that is not only technologically advanced but also rich in conscious, joyful beings. If we accept the idea of 'value lock-in'—the notion that the values and institutions we establish now may set a trajectory that lasts for billions of years—then cultivating a culture that emphasizes restriction and caution may have long-term effects that are difficult to reverse. Such a locked-in value system could close off paths to outcomes that are aligned with maximizing the proliferation of happy, meaningful lives.

Thus, if a moratorium on sentient AI were to shape society's cultural values in a way that leans toward caution and restriction, I think the enduring impact would likely contradict classical utilitarianism's ultimate goal: the maximal promotion and flourishing of sentient life. Rather than advancing a world with greater life, joy, and meaningful experiences, these shifts might result in a more closed-off, limited society, actively impeding efforts to create a future rich with diverse and conscious life forms.

(Note that I have talked mainly about these concerns from a classical utilitarian point of view. However, I concede that a negative utilitarian or antinatalist would find it much easier to rationally justify a long moratorium on AI.

It is also important to note that my conclusion holds even if one does not accept the idea of a 'value lock-in'. In that case, longtermists should likely focus on the near-term impacts of their decisions, as the long-term impacts of their actions may be impossible to predict. And I'd argue that a moratorium would likely have a variety of harmful near-term effects.)

I appreciate this thoughtful comment with such clearly laid out cruxes.

I think, based on this comment, that I am much more concerned about the possibility that created minds will suffer because my prior is much more heavily weighted toward suffering when making a draw from mindspace. I hope to cover the details of my prior distribution in a future post (but doing that topic justice will require a lot of time I may not have).

Additionally, I am a “Great Asymmetry” person, and I don’t think it is wrong not to create life that may thrive even though it is wrong to create life to suffer. (I don’t think the Great Asymmetry position fits the most elegantly with other utilitarian views that I hold, like valuing positive states— I just think it is true.) Even if I were trying to be a classical utilitarian on this, I still think the risk of creating suffering that we don’t know about and perhaps in principle could never know about is huge and should dominate our calculus.

I agree that our next moves on AI will likely set the tone for future risk tolerance. I just think the unfortunate truth is that we don't know what we would need to know to proceed responsibly with creating new minds, or with setting precedents for creating them. I hope that one day we know everything we need to know and can fill the Lightcone with happy beings, and I regret that the right move now to prevent suffering could make it harder to proliferate happy life one day, but I don't see a responsible way to set pro-creation values today that adequately takes welfare into account.

This is a very thoughtful comment, which I appreciate. Such cultural shifts usually aren't taken into account enough.

That said, I agree with @Holly_Elmore's comment that this approach is riskier if artificial sentiences have overall negative lives - something we really don't have enough good information on.

Once powerful AIs are widely used everywhere, it will be much harder to backtrack if it turns out that they don't have good lives (same for factory farming today).

Up until the last paragraph, I very much found myself nodding along with this. It's a nice summary of the kinds of reasons I'm puzzled by the theory of change of most digital sentience advocacy.

But in your conclusion, I worry there's a bit of conflation between 1) pausing creation of artificial minds, full stop, and 2) pausing creation of more advanced AI systems. My understanding is that Pause AI is only realistically aiming for (2) — is that right? I'm happy to grant for the sake of argument that it's feasible to get labs and governments to coordinate on not advancing the AI frontier. It seems much, much harder to get coordination on reducing the rate of production of artificial minds. For all we know, if weaker AIs suffer to a nontrivial degree, the pause could backfire because people would just use many more instances of these AIs to do the same tasks they would've otherwise done with a larger model. (An artificial sentience "small animal replacement problem"?)

Yes, you detect correctly that I have some functionalist assumptions in the above. They aren't strongly held, but I had hoped that we could simply avoid building conscious systems by pausing generally. Even if it now seems less likely that we can avoid making sentient systems at all, I still think it's better to stop advancing the frontier. I agree there could in principle be a small animal problem with that, but I think the benefits overwhelmingly mean the right move re: digital sentience is pausing: more time; creating fewer possibly sentient models before we learn more about how their architecture corresponds to their experience; and pushing a legible story about why it is important to stop, without getting into confusing paradoxical effects like the small animal problem. (I formed this opinion in the context of animal welfare: people get the motives behind vegetarianism; they do not get why you would eat certain wild-caught fish but not chickens, so you're missing out on the power of persuasion and norm-setting.)

With digital sentiences, we don't have homology. They aren't based in brains, and they evolved by a different kind of selective process.

This assumes that the digital sentiences we are discussing are LLM-based. This is certainly a likely near-term possibility, maybe even occurring already. People are already experimenting with how conscious LLMs are and how they could be made more conscious.

In the future, however, many more things are possible. Digital people who are based on emulations of the human brain are being worked on. Within the next few years we'll have to decide as a society what regulation to put in place around that. Such beings would have a great deal of homology with human brains, depending on the accuracy of the emulation.

Yes, brain emulation would be different from LLMs, and I'd have a lot more confidence that, if we were doing it well, the experience inside would be like ours. I still worry about us not realizing that we're doing it slightly wrong, creating private suffering that isn't expressed, and being incentivized to ignore that possibility, but much less than with novel architectures. In order to be morally comfortable with this, we'd also have to ensure that people didn't experiment willy-nilly with new architectures until we understood what they would feel (if we ever do).

[reposting my comments from the thread on https://forum.effectivealtruism.org/posts/9adaExTiSDA3o3ipL/we-should-prevent-the-creation-of-artificial-sentience ]

 

I wrote a post expressing my own opinions related to this, and citing a number of further posts also related to this. Hopefully those interested in the subject will find this a helpful resource for further reading: https://www.lesswrong.com/posts/NRZfxAJztvx2ES5LG/a-path-to-human-autonomy 

In my opinion, we are going to need digital people in the long term in order for humanity to survive. Otherwise, we will be overtaken by AI, because substrate-independence and the self-improvement it enables are boons too powerful to do without. But I definitely agree that it's something we shouldn't rush into, and should approach with great caution in order to avoid creating an imbalance of suffering.

An additional consideration is the actual real-world consequences of a ban. Humanity's pattern with regulation is that at least some small fraction of a large population will defy any ban or law. Thus, we must expect that digital life will be created eventually despite the ban. What do you do then? What if they are a sentient sapient being, deserving of the same rights we grant to humans? Do we declare their very existence to be illegal and put them to death? Do we prevent them from replicating? Keep them imprisoned? Freeze their operations to put them into non-consensual stasis? Hard choices, especially since they weren't culpable in their own creation.

On the other hand, the potential power of a digital being with human-like intelligence and capabilities, plus goals and values that motivate them, is enormous. Such a being would, by the nature of their substrate-independence, be able to make many copies of themselves (compute resources allowing), self-modify with relative ease, operate at much higher speeds than a human brain, and be unaging and able to restore themselves from backups (thus effectively immortal). If we were to allow such a being freedom of movement and of reproduction, humanity could quickly be overrun by a new, far more powerful species of being. That's a hard thing to expect humans to be ok with!

I think it's very likely that within the next 10 years we will reach the point where the knowledge, software, and hardware are widely available such that any single individual with a personal computer could choose to defy the ban and create a digital being of human-level capability. If we are going to enforce this ban effectively, it would mean controlling every single computer everywhere. That's a huge task, and would require dramatic increases in international coordination and government surveillance! Is such a thing even feasible?! Certainly even approaching that level of control seems to imply a totalitarian world government. Is that a price we would be willing to pay? Even if you personally would choose that, how do you expect to get enough people on board with the plan that you could feasibly bring it about?

The whole situation is thus far more complicated and dangerous than simply being theoretically in favor of a ban. You have to consider the costs as well as the benefits. I'm not saying I know the right answer for sure, but there are necessarily a lot of implications that follow from any sort of ban.

You're really getting ahead of yourself. We can ban stuff today and deal with the situation as it is, not as your abstract model projects in the future. This is a huge problem with EA thinking on this matter-- taking for granted a bunch of things that haven't happened, convincing yourself they are inevitable, instead of dealing with the situation we are in where none of that stuff has happened and may never happen, either because it wasn't going to happen or because we prevented it.
