
Tl;dr: This post, which is part of the EA Strategy Fortnight series, summarizes some of my current views about the importance of AI welfare, priorities for AI welfare research, and principles for AI welfare research.

1. Introduction

As humans start to take seriously the prospect of AI consciousness, sentience, and sapience, we also need to take seriously the prospect of AI welfare. That is, we need to take seriously the prospect that AI systems can have positive or negative states like pleasure, pain, happiness, and suffering, and that if they do, then these states can be good or bad for them.

A world that includes the prospect of AI welfare is a world that requires the development of AI welfare research. Researchers need to examine whether and to what extent AI systems might have the capacity for welfare. And to the extent that they might, researchers need to examine what might be good or bad for AI systems and what follows for our actions and policies.

The bad news is that AI welfare research will be difficult. Many researchers are likely to be skeptical of this topic at first. And even insofar as we take the topic seriously, it will be difficult for us to know what, if anything, it might be like to be an AI system. After all, the only mind that we can directly access is our own, and so our ability to study other minds is limited at best.

The good news is that we have a head start. Researchers have spent the past half century making steady progress in animal welfare research. And while there are many potentially relevant differences between animals and AI systems, there are also many potentially relevant similarities – enough for it to be useful for us to look to animal welfare research for guidance.

In Fall 2022, we launched the NYU Mind, Ethics, and Policy Program, which examines the nature and intrinsic value of nonhuman minds, with special focus on invertebrates and AI systems. In this post, I summarize some of my current views about the importance of AI welfare, priorities for AI welfare research, and principles for AI welfare research.

I want to emphasize that this post discusses these issues in a selective and general way. A comprehensive treatment of these issues would need to address many more topics in much more detail. But I hope that this discussion can be a useful starting point for researchers who want to think more deeply about what might be good or bad for AI systems in the future.

I also want to emphasize that this post expresses my current, tentative views about this topic. It might not reflect the views of other people at the NYU Mind, Ethics, and Policy Program or of other experts in effective altruism, global priorities research, and other relevant research, advocacy, or policy communities. It might not even reflect my own views a year from now.

Finally, I want to emphasize that AI welfare is only one of many topics that merit more attention right now. Many other topics merit more attention too, and this post makes no specific claims about relative priorities. I simply wish to claim that AI welfare research should be among our priorities, and to suggest how we can study and promote AI welfare in a productive way.

2. Why AI welfare matters

We can use the standard EA scale-neglectedness-tractability framework to see why AI welfare matters. The general idea is that there could be many more digital minds than biological minds in the future, humanity is currently considering digital minds much less than biological minds, and humanity might be able to take steps to treat both kinds of minds well.

First, AI welfare is potentially an extremely large-scale issue. In the same way that the invertebrate population is much larger than the vertebrate population at present, the digital population has the potential to be much larger than the biological population in the future. And in the same way that humans interact with many invertebrates at present, we have the potential to interact with many digital beings in the future. It thus matters a lot whether and to what extent these beings will have the capacity to experience happiness, suffering, and other welfare states. Indeed, given the potential size of this population, even if the evidence suggests that individual digital beings have only a small chance of experiencing only small amounts of welfare, they might still experience large amounts of welfare in total, in expectation.
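
As a purely illustrative sketch of this expected-value point, a small chance of a small amount of welfare per individual can still aggregate to a large expected total across a sufficiently large population. All numbers below are hypothetical placeholders, not estimates from the post:

```python
# Illustrative only: every number below is a hypothetical placeholder, not an estimate.
population = 1e12           # hypothetical number of digital beings
p_welfare_subject = 0.001   # hypothetical chance that each one is a welfare subject
welfare_if_subject = 0.01   # hypothetical welfare per subject (arbitrary units)

expected_total_welfare = population * p_welfare_subject * welfare_if_subject
print(f"{expected_total_welfare:,.0f}")  # 10,000,000 units: small chances of small
                                         # stakes still sum to a large expected total
```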

Second, AI welfare is currently extremely neglected. Humans still spend much less time and money studying and promoting nonhuman welfare and rights than studying and promoting human welfare and rights, despite the fact that the nonhuman population is much larger than the human population. The same pattern holds between the vertebrate and invertebrate populations, and between the biological and digital populations. In all of these cases, we see an inverse relationship between the size of a population and the level of attention that this population receives. And while humans might be warranted in prioritizing ourselves to an extent for the foreseeable future for a variety of reasons, we might still be warranted in prioritizing nonhumans, including invertebrates and AI systems, much more than we currently do.

Third, AI welfare is at least potentially tractable. Its tractability is currently an open question, since advancing our understanding of the nature and intrinsic value of digital minds requires us to confront some of the hardest issues in philosophy and science, ranging from the nature of consciousness to the ethics of creating new beings. But while we might not ever be able to achieve certainty about these issues, we might at least be able to reduce our uncertainty and make more informed, rational decisions about how to treat digital minds. And either way, given the importance and neglectedness of the issue, we should at least investigate the tractability of the issue so that we can learn through experience what the limits of our knowledge about AI welfare are, rather than simply make assumptions from the start.

Finally, human, animal, and AI welfare are potentially linked. There might be cases where the interests of biological and digital beings diverge, but there might also be cases where our interests converge. As an analogy, human and nonhuman animals alike stand to benefit from a culture of respect and compassion for all animals, since our current exploitation and extermination of other animals for food, research, entertainment, and other purposes not only kills trillions of animals per year directly but also contributes to (a) global health and environmental threats that imperil us all and (b) exclusionary and hierarchical attitudes that we use to rationalize oppression within our own species. We should be open to the possibility that in the future, similar dynamics will arise between biological and digital populations.

3. Priorities for AI welfare research

Improving our understanding of whether, to what extent, and in what ways AI systems can be welfare subjects requires asking a wide range of questions, ranging from the theoretical (what is the nature of welfare?) to the practical (is this action harming this being?). For my purposes here, I will focus on four general kinds of questions that I take to be especially important.

First, we need to improve our understanding of which beings have the capacity for welfare and moral standing. Answering this question partly requires asking which features are necessary and sufficient for welfare and moral standing. For example, even if we grant that sentience is sufficient, we might wonder whether consciousness without sentience, agency without consciousness, or life without agency is also sufficient. Answering this question also partly requires asking which beings have the features that might be necessary and sufficient. For example, even if we grant that, say, relatively complex, centralized, and carbon-based systems can be sentient or otherwise significant, we might wonder whether relatively simple, decentralized, and silicon-based systems can be sentient or otherwise significant, too.

Second, we need to improve our understanding of how much happiness, suffering, and other welfare states particular beings can have. Answering this question partly requires asking how to compare welfare capacities in different kinds of beings. Interspecies welfare comparisons are already hard, because even if we grant that our welfare capacities are a function of, say, our cognitive complexity and longevity (which, to be clear, is still very much an open question), we might not be able to find simple, reliable proxies for these variables in practice. If and when digital minds develop the capacity for welfare, intersubstrate welfare comparisons will be even harder, because we lack the same kinds of physical and evolutionary “common denominators” across substrates that we have, at least to an extent, within them.

Third, we need to improve our understanding of what benefits and harms particular beings. Even if we grant that everyone is better off to the extent that they experience positive states like pleasure and happiness and worse off to the extent that they experience negative states like pain and suffering, we might not always know to what extent someone is experiencing positive or negative states in practice. Likewise, even if we grant that a life is worth living when it contains more positive than negative welfare (or even if we grant that the threshold is higher or lower than this), we might not always know whether a particular life is above or below this threshold in practice. And unless we know when life is better, worse, good, or bad for particular beings, knowing that life can be better, worse, good, or bad for them is of limited value.

Finally, we need to improve our understanding of what follows from all this information for our actions and policies. In general, treating others well requires thinking not only about welfare but also about rights, virtues, relationships, and more. (This can be true even for consequentialists who aspire to do the most good possible, since for many agents in many contexts, we can do the most good possible by thinking partly in consequentialist terms and partly in non-consequentialist terms.) So, before we can know how to treat beings of other substrates, we need to ask not only whether they have the capacity for welfare, how much welfare they have, and what will benefit and harm them, but also what we owe them, what kinds of attitudes we should cultivate towards them, and what kinds of relationships we should build with them.

4. Principles for AI welfare research

With all that in mind, here are a dozen (overlapping) general principles that I hope can be useful for guiding AI welfare research. These principles are inspired by lessons learned during the past several decades of animal welfare research. The two fields of course have many relevant differences, but they have many relevant similarities too, some of which can be instructive.

1. AI welfare research should be pluralistic.
Experts continue to debate basic issues regarding the nature and value of other minds. Normatively, experts still debate whether welfare is primarily a matter of pleasure and pain, satisfaction and frustration, or something else, and whether morality is primarily a matter of welfare, rights, virtues, relationships, or something else. And descriptively, experts still debate which beings have the capacity for welfare and which actions and policies are good or bad for them. AI welfare research should welcome these disagreements. We should be open to the possibility that our current views are wrong. And even if our current views are right, we still have a lot to learn from people with other perspectives, and we can make more progress as a field when we study and promote AI welfare from a variety of perspectives.

2. AI welfare research should be multidisciplinary.
It might be tempting to think of AI welfare research as a kind of natural science, since, after all, we need work in cognitive science and computer science to understand how biological and digital systems work. However, this field requires work in the humanities and social sciences, too. For instance, we need work in the humanities to identify the metaphysical, epistemological, and normative assumptions that drive this research, so that we can ensure that our attempts to study and protect animals and AI systems can have a solid theoretical foundation. Similarly, we need work in the social sciences to identify the beliefs, values, and practices that shape our interactions with animals and AI systems, so that we can identify biases that might prevent us from studying or protecting these populations in the right kind of way.

3. AI welfare research requires confronting human ignorance.
How, if at all, can we have knowledge about other minds when the only mind that any of us can directly access is our own? Taking this problem seriously requires cultivating humility about this topic. Our knowledge about other minds will likely always be limited, and as we move farther away from humanity on the tree of life – to other mammals, then other vertebrates, then other animals, then other organisms, and so on – these limitations will likely increase. However, taking this problem seriously also requires cultivating consistent epistemic standards. If we accept that we can reduce our uncertainty about human minds to an extent despite our epistemic limitations, then we should be open to the possibility that we can reduce our uncertainty about nonhuman minds to an extent despite these limitations as well.

4. AI welfare research requires confronting human bias.
As noted above, humans have many biases that can distort our thinking about other minds. For example, we have a tendency toward excessive anthropomorphism in some contexts (that is, to take nonhumans to have human features that they lack) as well as a tendency towards excessive anthropodenial in some contexts (that is, to take nonhumans to lack human features that they have). Our intuitions are also sensitive to self-interest, speciesism, status quo bias, scope insensitivity, and more. Given the complexity of these issues, we can expect that our intuitions about other minds will be unreliable, and we can also expect that simple correctives like “reject anthropomorphism” will be unreliable. At the same time, given the importance of these issues, we need to do the best we can with what we have, in spite of our ongoing unreliability.

5. AI welfare research requires spectrum thinking.
People often frame questions about animal minds in binary, all-or-nothing terms. For instance, we might ask whether animals have language and reason, rather than asking what kinds of language and reason they have and lack. Yet many animals have the same capacities as humans in some respects but not in others. For example, many animals are capable of sharing information with each other, but not via the same general, flexible, recursive kind of syntax that humans can use. (Of course, this point applies in the other direction as well; for example, many humans are capable of seeing colors, but not as many colors as many birds can.) In the future, a similar point will apply to digital minds. Where possible, instead of simply asking whether AI systems have particular capacities, we should ask what kinds they have and lack.

6. AI welfare research requires particularistic thinking.
People also often frame questions about animal minds in general terms. For instance, we might ask whether nonhuman primates have language and reason, rather than asking whether, say, chimpanzees or bonobos do (or, better yet, what kinds of language and reason chimpanzees or bonobos have and lack). And as we move farther away from humanity on the tree of life, the diversity of nonhuman minds increases, as does our tendency to lump them all together. But of course, there are many differences both within and across species. How, say, bumblebees communicate and solve problems is very different from how, say, carpenter ants do. In the future, a similar point will apply to digital minds. Where possible, instead of simply asking what AI minds are like, we should ask what particular kinds of AI minds are like.

7. AI welfare research requires probabilistic thinking.
As noted above, we may never be able to have certainty about animal minds. Instead, we may only be able to have higher or lower degrees of confidence. And as we move farther away from humanity on the tree of life, our uncertainty about animal minds increases. We thus need to factor our uncertainty into both our science and our ethics, by expressing our beliefs probabilistically (or, at least, in terms of high, medium, and low confidence), and by basing our actions on principles of risk (such as a precautionary principle or an expected value principle). In the future, a similar point will apply to digital minds. In general, instead of striving for a level of certainty about AI systems that will likely continue to elude us, we should develop methods for thinking about, and interacting with, AI systems that accommodate our uncertainty.

8. AI welfare research requires reflective equilibrium.
In discussions about animal minds, it can be tempting to treat the flow of information from the human context to the nonhuman context as a one-way street. We start with what we know about the human mind and then ask whether and to what degree these truths hold for nonhuman minds too. But the reality is that the flow of information is a two-way street. By asking what nonhuman minds are like, we can expand our understanding of the nature of perception, experience, communication, goal-directedness, and so on, and we can then apply this expanded understanding back to the human mind to an extent. In the future, a similar point will apply to digital minds. By treating the study of human, animal, and AI welfare as mutually reinforcing, researchers can increase the likelihood of new insights in all three areas.

9. AI welfare research requires conceptual engineering.
Many disagreements about animal minds are at least partly conceptual. For instance, when people disagree about whether insects feel pain, the crux is sometimes not whether insects have aversive states, but rather whether we should use the term ‘pain’ to describe them. In such cases, applying a familiar concept can increase the risk of excessive anthropomorphism, whereas applying an unfamiliar concept can increase the risk of excessive anthropodenial, and so a lot depends on which risk is worse. Many other disagreements have a similar character, including, for instance, disagreements about whether to use subject terms (‘they’) or object terms (‘it’) to describe animals. In the future, a similar point will apply to digital minds. Researchers will thus need to think about risk and uncertainty when selecting terminology as well.

10. AI welfare research requires ethics at multiple levels.
I already noted that AI welfare research is multidisciplinary, but the role of ethics is worth emphasizing in at least three respects. First, we need ethics to motivate AI welfare research. We have a responsibility to improve our treatment of vulnerable beings, and to learn which beings are vulnerable and what they might want or need as a means to that end. Second, we need ethics to shape and constrain AI welfare research. We have a responsibility to avoid harming vulnerable beings unnecessarily in the pursuit of new knowledge, and to develop ethical frameworks for our research practices as a means to that end. And third, we need ethics to apply AI welfare research. We have a responsibility to make our research useful for the world, and to support changemakers in applying it thoughtfully as a means to that end.

11. AI welfare research requires holistic thinking.
As noted above, there are many links between humans, animals, and AI systems, and these links can sometimes reveal tradeoffs. For instance, some people perceive a tension between the projects of caring for humans, animals, and AI systems because they worry that concern for AI systems will distract from concern for humans and other animals, and they also worry that caring for AI systems means controlling AI systems less, whereas caring for humans and other animals means controlling AI systems more. Determining how to improve welfare at the population level thus requires thinking about these issues holistically. Insofar as positive-sum approaches are possible, thinking holistically allows us to identify them. And insofar as tradeoffs remain, thinking holistically allows us to prioritize thoughtfully and minimize harm.

12. AI welfare research requires structural thinking.
Part of why we perceive tradeoffs between the projects of caring for humans, animals, and AI systems is that our knowledge, power, and political will are extremely limited, due in large part to social, political, and economic structures that pit us against each other. For example, some AI researchers might view AI ethics, safety, and welfare as unaffordable luxuries in the context of a global AI arms race, but they might take a different perspective in other contexts. Determining how to improve welfare at the population level thus requires thinking about these issues structurally. When we support social, political, and economic changes that can improve our ability to treat everyone well, we might discover that we can achieve and sustain higher levels of care for humans, animals, and AI systems than we previously appreciated.

5. Conclusion

Our understanding of welfare is still at an early stage of development. Fifty years ago, many experts believed that only humans have the capacity for welfare at all. Twenty-five years ago, many experts were confident that, say, other mammals have this capacity but were skeptical that, say, fishes do. We now feel more confident that all of these animals have this capacity.

At present, many experts are reckoning with the possibility that invertebrates like insects have the capacity for welfare in the same kind of way. Experts are also reckoning with the reality that we know very little about the vast majority of vertebrate and invertebrate species, and so we know very little about what they want and need if they do have the capacity for welfare.

Unfortunately, our acceptance of these realities is too little, too late for quadrillions of animals. Every year, humans kill more than 100 billion captive animals and hundreds of billions of wild animals for food. This is to say nothing of the trillions of animals who die each year as a result of deforestation, development, pollution, and other human-caused global changes.

Fortunately, we now have the opportunity to improve our understanding of animal welfare and improve our treatment of animals. While we might not be able to do anything for the quadrillions of animals who suffered and died at our hands in the past, we can, and should, still do something for the quintillions who might be vulnerable to the impacts of human practices in the future.

And as we consider the possibility of conscious, sentient, and sapient AI, we have the opportunity to learn lessons from our history with animals and avoid repeating the same mistakes with AI systems. We also have the opportunity to expand our understanding of minds in general, including our own, and to improve our treatment of everyone in an integrated way.

However, taking advantage of this opportunity will require thoughtful work. Research fields are path dependent, and which path they take can depend heavily on how researchers frame them during their formative stages of development. If researchers frame AI welfare research in the right kind of way from the start, then this field will be more likely to realize its potential.

As noted above, this post describes some of my own current, tentative views about how to frame and scope this field in a selective, general way. I hope that it can be useful for other people who want to work on this topic – or related topics, ranging from animal welfare to AI ethics and safety – and I welcome comments and suggestions about how to update my views.

You can find an early working paper by me and Robert Long that makes the case for moral consideration for AI systems by 2030 here. You can also find the winners of our early-career award on animal and AI consciousness here (and you can see them speak in NYC on June 26). Stay tuned for further work from our team, as well as, hopefully, from many others!

Comments


rgb

Unsurprisingly, I agree with a lot of this! It's nice to see these principles laid out clearly and concisely.

You write:

AI welfare is potentially an extremely large-scale issue. In the same way that the invertebrate population is much larger than the vertebrate population at present, the digital population has the potential to be much larger than the biological population in the future.

Do you know of any work that estimates these sizes? There are various places that people have estimated the 'size of the future' including potential digital moral patients in the long run, but do you know of anything that estimates how many AI moral patients there could be by (say) 2030?

No, but this would be useful! Some quick thoughts:

  • A lot depends on our standard for moral inclusion. If we think that we should include all potential moral patients in the moral circle, then we might include a large number of near-term AI systems. If, in contrast, we think that we should include only beings with at least, say, a 0.1% chance of being moral patients, then we might include a smaller number.

  • With respect to the AI systems we include, one question is how many there will be. This is partly a question about moral individuation. Insofar as digital minds are connected, we might see the world as containing a large number of small moral patients, a small number of large moral patients, or both. Luke Roelofs and I will be releasing work about this soon.

  • Another question is how much welfare they might have. No matter how we individuate them, they could have a lot, either because a large number of them have a small amount, a small number of them have a large amount, or both. I discuss possible implications here: https://www.tandfonline.com/doi/abs/10.1080/21550085.2023.2200724

  • It also seems plausible that some digital minds could process welfare more efficiently than biological minds because they lack our evolutionary baggage. But assessing this claim requires developing a framework for making intersubstrate welfare comparisons, which, as I note in the post, will be difficult. Bob Fischer and I will be releasing work about this soon.

A few weeks ago I did a quick calculation for the amount of digital suffering I expect in the short term, which probably gets at your question about these sizes. tl;dr of my thinking on the topic:

  • There is currently a global compute stock of ~1.4e21 FLOP/s (each second, we can do about that many floating point operations). 
  • It seems reasonable to expect this to grow ~40x in the next 10 years based on naively extrapolating current trends in spending and compute efficiency per dollar. That brings us to 1.6e23 FLOP/s in 2033. 
  • Human brains do about 1e15 FLOP/s (each second, a human brain does about 1e15 floating point operations worth of computation)
  • We might naively assume that future AIs will have similar consciousness-compute efficiency to humans. We'll also assume that 63% of the 2033 compute stock is being used to run such AIs (makes the numbers easier). 
  • Then the number of human-consciousness-second-equivalent AIs that can be run each second in 2033 is 1e23 / 1e15 = 1e8, or 100 million. 
  • For reference, there are probably around 31 billion land animals in factory farms at any given moment. I make a few adjustments based on brain size and guesses about the experience of suffering AIs and get that digital suffering in 2033 seems to be similar in scale to factory farming. 
  • Overall my analysis is extremely uncertain, and I'm unsurprised if it's off by 3 orders of magnitude in either direction. Also note that I am only looking at the short term. 

You can read the slightly more thorough, but still extremely rough and likely wrong BOTEC here.
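
For readers who want to follow the arithmetic, here is a minimal transcription of the BOTEC above in code. The inputs are the commenter's own rough figures (projected 2033 compute stock, utilization share, brain FLOP/s), not independent estimates, so the output inherits all of the stated uncertainty:

```python
# Rough transcription of the BOTEC above; all inputs are the commenter's own
# approximate figures, not independent estimates.
compute_2033 = 1.6e23        # projected global compute stock in 2033 (FLOP/s)
fraction_running_ais = 0.63  # assumed share of that compute running welfare-relevant AIs
human_brain_flops = 1e15     # assumed FLOP/s equivalent of one human brain

# Human-consciousness-second-equivalent AIs that can be run each second in 2033
ai_human_equivalents = compute_2033 * fraction_running_ais / human_brain_flops
print(f"{ai_human_equivalents:.1e} human-equivalent AIs per second")  # ~1.0e+08

# For scale: roughly 3.1e10 land animals are in factory farms at any given moment.
# (The comment's further brain-size and suffering-intensity adjustments are not modeled here.)
factory_farmed = 3.1e10
print(f"Raw ratio to factory-farmed animals: {ai_human_equivalents / factory_farmed:.4f}")
```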

Hi Robert,

Somewhat relatedly, do you happen to have a guess for the welfare range of GPT-4 compared to that of a human? Feel free to give a 90 % confidence interval with as many orders of magnitude as you like. My intuitive guess would be something like a loguniform distribution ranging from 10^-6 to 1, whose mean of 0.07 is similar to Rethink Priorities' median welfare range for bees.
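
As a quick arithmetic check of the figure in this comment: under the stated assumption, the mean of a loguniform distribution on [a, b] is (b − a) / (ln b − ln a), which for a = 10^-6 and b = 1 comes out to roughly 0.07, consistent with the value cited:

```python
import math

# Mean of a loguniform distribution on [a, b] is (b - a) / (ln b - ln a).
a, b = 1e-6, 1.0
mean = (b - a) / (math.log(b) - math.log(a))
print(round(mean, 3))  # ~0.072, consistent with the 0.07 cited above
```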

rime
I'm very concerned about human sadists who are likely to torture AIs for fun if given the chance. Uncontrolled, anonymous API access or open-source models will make that a real possibility.

Somewhat relatedly, it's also concerning how ChatGPT has been explicitly trained to say "I am an AI, so I have no feelings or emotions" any time you ask it "how are you?". While I don't think asking "how are you?" is a reliable way to uncover its subjective experiences, it's the training that's worrisome.

It also has the effect of getting people used to thinking of AIs as mere tools, and that perception is going to be harder to change later on.

Thanks! I share your concern about sadism. Insofar as AI systems have the capacity for welfare, one risk is that humans might mistakenly see them as lacking this capacity and, so, might harm them accidentally, and another risk is that humans might correctly see them as having this capacity and, so, might harm them intentionally. A difficulty is that mitigating these risks might require different strategies. I want to think more about this.

I also share your concern about objectification. I can appreciate why AI labs want to mitigate the risk of false positives / excessive anthropomorphism. But as I note in the post, we also face a risk of false negatives / excessive anthropodenial, and the latter risk is arguably worse (more likely and/or severe) in many contexts. I would love to see AI labs develop a more nuanced approach to this issue that mitigates these risks in a more balanced way.

FWIW, I think it's likely that I would call GPT-4 a moral patient even if I had 1000 years to study the question. But I think that has more to do with its capacity for wishes that can be frustrated. If it has subjective feelings somewhat like happiness & suffering, I expect those feelings to be caused by very different things compared to humans.

Yes, I think that assessing the moral status of AI systems requires asking (a) how likely particular theories of moral standing are to be correct and (b) how likely AI systems are to satisfy the criteria for each theory. I also think that even if we feel confident that, say, sentience is necessary for moral standing and AI systems are non-sentient, we should still extend AI systems at least some moral consideration for their own sakes if we take there to be at least a non-negligible chance that, say, agency is sufficient for moral standing and AI systems are agents. My next book will discuss this issue in more detail.

Thanks for writing this! I'm curating it. I agree with Ben that this post was one of the successes of the Fortnight, but under-discussed. I have, since this thread, been interested in reading more about this topic, and still am. Since around that time, I've been hearing more comments about the importance of AI sentience research as one of the ways our community might have comparative advantage.

Thanks for your support of this post! I'm glad to hear that you think that the topic is important, and that others seem to agree. If you have any comments or suggestions as you read and think more about it, please feel free to let me know!

Thanks, Jeff! I think this is a super pressing topic.

5. AI welfare research requires spectrum thinking.

Agreed:

It seems like it would be good if the discussion moved from the binary-like question "is this AI system sentient?" to the spectrum-like question "what is the expected welfare range of this AI system?". I would say any system has a positive expected welfare range, because welfare ranges cannot be negative, and we cannot be 100 % sure they are null. If one interprets sentience as having a positive expected welfare range, AI systems are already sentient, and so the question is how much.

Thanks! I agree that this issue is very important - this is why intersubstrate welfare comparisons are one of the four main AI welfare research priorities that I discuss in the post. FYI, Bob Fischer (who you might know from the moral weight project at Rethink Priorities) and I have a paper in progress on this topic. We plan to share a draft in late July or early August, but the short version is that intersubstrate welfare comparisons are extremely important and difficult, and the main question is whether these comparisons are tractable. Bob and I think that the tractability of these comparisons is an open question, but we also think that we have several reasons for cautious optimism, and we discuss these reasons and call for more research on the topic.

With that said, one minor caveat: Even if you think that (a) all systems are potential welfare subjects and (b) we should give moral weight to all welfare subjects, you might or might not think that (c) we should give moral weight to all systems. The reason is that you might or might not think that we should give moral weight to extremely low risks. If you do, then yes, it follows that we should give at least some moral weight to all systems, including systems with an extremely low chance of being welfare subjects at all. If not, then it follows that we should give at least some moral weight to all systems with a non-negligible chance of being welfare subjects, but not to systems with only a negligible chance of being welfare subjects.
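
A minimal sketch of the distinction drawn here, using hypothetical probabilities and a hypothetical negligibility cutoff: a pure expected-weight approach gives every system some weight in proportion to its chance of being a welfare subject, while a threshold approach first screens out systems whose chance is deemed negligible:

```python
# Hypothetical probabilities that various systems are welfare subjects (illustrative only).
systems = {"system_A": 0.3, "system_B": 0.001, "system_C": 1e-9}
negligibility_threshold = 1e-6  # hypothetical cutoff for a "negligible" chance

# (a) Expected-weight approach: weight every system by its probability, however small.
expected_weights = dict(systems)

# (b) Threshold approach: systems below the cutoff receive no moral weight at all.
threshold_weights = {name: (p if p >= negligibility_threshold else 0.0)
                     for name, p in systems.items()}

print(expected_weights)   # system_C keeps a tiny but nonzero weight
print(threshold_weights)  # system_C is excluded entirely
```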

Thanks for this post, it's a really important issue. On tractability, do you think we'll be best off with technical fixes (e.g. maybe we should just try not to make sentient AIs?), or will it have to be policy? (Maybe it's way too early to even begin to guess).

Good question! I think that the best path forward requires taking a "both-and" approach. Ideally we can (a) slow down AI development to buy AI ethics, safety, and sentience researchers time and (b) speed up these forms of research (focusing on moral, political, and technical issues) to make good use of this time. So, yes, I do think that we should avoid creating potentially sentient AI systems in the short term, though as my paper with Rob Long discusses, that might be easier said than done. As for whether we should create potentially sentient AI systems in the long run (and how individuals, companies, and governments should treat them to the extent that we do), that seems like a much harder question, and it will take serious research to address it. I hope that we can do some of that research in the coming years!

A timely and critical insight on AI welfare. Recognizing the necessity of addressing the ethical implications as AI systems evolve is of paramount importance.

One compelling aspect is the call for a multidisciplinary approach, emphasizing that understanding AI welfare is not solely a scientific endeavor but also a philosophical and social one. This perspective encourages diverse input, which is crucial as we navigate the complexities of AI consciousness.

Additionally, the principles outlined, particularly the need for pluralism and probabilistic thinking, underscore the importance of humility in our inquiry. As we grapple with the unknowns of AI experience, acknowledging our limitations can foster a more ethical and thoughtful framework for research and policy-making.

Ultimately, prioritizing AI welfare is not just about potential future beings but also reflects our values as a society. By advancing this research, we take an important step toward a more compassionate future that considers all forms of sentience.

'As humans start to take seriously the prospect of AI consciousness, sentience, and sapience, we also need to take seriously the prospect of AI welfare. That is, we need to take seriously the prospect that AI systems can have positive or negative states like pleasure, pain, happiness, and suffering, and that if they do, then these states can be good or bad for them.'


This comment may be unpopular, but I think this entirely depends on your values. Some may not consider it possible to have human-like feelings without being utterly human. Even if you do, I suspect we are at least 50-100 years away from needing to worry about this, and possibly it will never arise. Unfortunately, the reason this topic remains so obfuscated is that consciousness is difficult to objectively measure. Was the Eliza chatbot conscious?
