Hide table of contents

This post was written by Simon Eckerström Liedholm, Strategy Researcher at Wild Animal Initiative, and it is a contribution to the AGI & Animals Debate Week. It partially builds on an earlier exploration of transformative AI and wild animals written by Mal Graham. Disclaimer: I (Simon) would describe myself primarily as a biologist. While I have a long-standing interest in AI safety, I'm by no means an expert on the subject.

 

Executive Summary

This post asks whether, if AGI goes well for humans, it will also go well for animals, with a particular focus on wild animals. I estimate a ~30% probability that it would.

My prediction rests on several considerations. First, I think the most likely path to AGI “going well” for humans probably involves aligning the AGI with something roughly similar to the values of an average present-day human, values that currently permit a lot of suffering among farmed animals. Second, if current values are mostly locked in, which seems like a plausible scenario, I don't think it would lead to meaningful improvements of animal welfare.

I also conclude with some tentative suggestions for animal welfare advocates, and highlight potential backfire risks.

 

Definitions and assumptions

What question is this post meant to answer?

This post is intended to help determine whether the following statement is true (in other words, it is basically just a verbose prediction):

If AGI goes well for humans, it’ll probably go well for animals” (see here).

We can reformulate this as a question: “If AGI goes well for humans, will it go well for animals?”. (Per the debate week announcement post, “probably” refers to a 70% probability). I will answer this question by providing my own subjective probability for how likely it is that AGI goes well for animals, given that it goes well for humans. Below, I clarify how I’ve chosen to define the words in this question.

 

“Animals”

For the purposes of this post, I will assume that “animals” refers to all sentient biological animals (except humans), as well as all sentient simulated animals. However, I will mostly discuss wild animals (see e.g., “What do you count as a wild animal?” here).

 

“AGI”

I will refer to artificial general intelligence (“AGI”; an AI that “matches or surpasses human capabilities across virtually all cognitive tasks”), to adhere to the debate week question. But the scenarios I discuss will often assume that progress doesn't stop at human-level AI. Many of the considerations that matter for wild animals, for instance — large-scale ecological disruption or active intervention, space colonization, value lock-in — may become relevant sometime after the advent of AGI and before “artificial superintelligence” (ASI).

 

“Go well”

I will assume that “[...] goes well for humans” refers to a world where humans are either equally well off or better off than they currently are. This includes mediocre worlds that are essentially as good for humans as the current world is. This unfortunately doesn’t do a great job of tracking what I would intuitively define as “going well” for humans, but I’ve decided to define it this way for practical reasons, it makes defining the question more manageable. Note that my definition refers to the net outcome, so the world may change drastically while still being roughly as desirable for humans as the pre-AGI world. Also note that evaluating whether things “went well” for a subgroup (e.g., “humans”) can get tricky, particularly if the group in question determines whether things “went well” for them by comparing the overall outcome to their values. But for the purposes of answering the question at hand, we can postulate that any frustration among humans related to persistent or increased animal suffering is here ignored when determining if things “went well” for humans.

For animals, I will assume a broadly consequentialist assessment of effects on animal welfare (see e.g., John & Sebo 2020). I would argue that, on the whole, animals are not currently well off. So I will assume that “going well” for animals means something like “the ratio of positive experiences to negative experiences among animals is both meaningfully higher than it is today, and is above 1 (i.e. net positive)”.

 

Considerations

Below, I list some important considerations that I think will likely affect the probability of AGI going well for animals if it goes well for humans. I generally draw on two different kinds of evidence; observations about current alignment techniques, and more speculative reasoning about things like value lock-in and methods used for aligning a future AGI. It is certainly possible that tendencies apparent in current alignment techniques will bear little resemblance to whatever approach is eventually used for AGI, but I think there will likely be meaningful resemblance.

 

AGI timelines

AGI timelines probably matter for animal welfare outcomes

Even if the development of AGI goes well for humans, the speed with which it is developed will likely affect whether it goes well for animals. My guess is that the sooner AGI is developed, the worse it will be for animals, despite it going well for humans. This is mostly because I expect that if AGI is developed very soon (e.g., in the next ~5 years) and it goes well for humans, it will likely do so only marginally, and there wouldn’t be much time for humans to deliberate on how to make sure it goes well for animals, or for values around (wild) animal welfare to evolve in a positive direction.

 

When should do I expect AGI to be developed?

Many AI researchers and forecasting experts’ median predictions are that AGI is somewhere between a couple years and a couple decades away. As of March 2026, a relevant Metaculus prediction indicates that there’s a 50% probability that a “general AI system” will be deployed by 2032. A survey of ~1700 AI researchers conducted in 2023 (Grace et al. 2025) showed a median prediction of “High-Level Machine Intelligence” (HLMI) by 2047. Another survey of 421 participants (the “2025 AI Forecasting Survey”; see also Ho 2026) recruited from the AI forecasting community showed a median estimate of HLMI by 2030. For other approaches, and attempts to extract consensus estimates, see e.g., Wynroe et al. (2023), Todd (2025a), Todd (2025b), and Owen (2025). Note that the exact questions posed to each group were not identical, and the groups surveyed were meaningfully different from each other, so these results should be interpreted with care. One general trend, though, is that the predicted timelines are becoming noticeably shorter (see e.g., the Metaculus prediction and Grace et al. 2025), which might partially invalidate the results of predictions that are even a few years old.

So what does all of this mean for effects on animal welfare? For me, it is this: If AGI is developed somewhere around ~5 to ~25 years from now, that is not much time to reliably ensure that the future AGI meaningfully cares about animal welfare, even if things go well for humans.

 

Working on safeguarding the welfare of future wild animals seems particularly important

As was argued in a post by Vaintrob & West (2025), it seems plausible that a future AGI would eventually render animal farming obsolete (though see Fai Tse 2022 for a different perspective). Raising whole animals in order to use only some parts of them, for instance, is fundamentally inefficient, so cheap cultivated meat may be developed and could remove economic incentives to continue the practice of animal farming. However, it is much less clear whether a future AGI would try to help animals suffering in the wild, who do not exist because we’ve bred them. And in my opinion, failing to alleviate the suffering of the vast number of wild animals who exist today due to indifference would be a tragedy (Johannsen 2020; Faria 2023; O’Brien 2025).

 

Future values

Predicting future values based on current values

Values of present-day humans

While surveys indicate that humans do have some level of moral concern for both farmed and wild animals in theory (Weathers et al. 2020; Jaeger & Wilks 2023), it is unclear how well that concern translates to action. Revealed preferences indicate that many humans do not value the suffering of factory farmed animals much at all. The picture is even bleaker for wild animals — especially those considered “pests,” like rats, and animals who may be hard to relate to, like flies (Jaeger & Wilks 2023). Due to what I would describe as fallacious reasoning (Animal Ethics 2025), most people tend to ignore suffering in the wild. While there seems to be some support for intervening to improve the welfare of wild animals in certain narrow ways (Sleegers et al. 2025), those attitudes might not directly generalize to large-scale interventions to improve the lives of wild animals for their own sakes.

 

Apparent values of present-day AIs (LLMs)

To get a sense of the values a future AGI might have in relation to animal welfare, it may be useful to look at current Large Language Models (LLMs). Does concern for animal welfare seem to be high on the agenda for companies like OpenAI, Anthropic, or Google DeepMind? My impression is that it is not. In a review of ~70 ethical guidelines and policy documents regarding AI (Singer & Fai Tse 2022; Fai Tse et al. 2025), only two explicitly mention animal welfare (though note that since the review was conducted, Anthropic’s updated version of Claude’s Constitution now includes a reference to animal welfare). Singer & Fai Tse (2022) also reviewed the course materials for ~70 AI ethics/computer science ethics courses, and found that none discussed AI's impact on animals (though one of them addressed wildlife preservation).

It is not surprising to me that current LLMs, which are trained on human-generated text during pre-training, then fine-tuned in the post-training stage using e.g., Reinforcement Learning from Human Feedback (RLHF), will express fairly speciesist values that might not be far from the values of an average human in a Western country. Hagendorff et al. (2023) found that different AI systems (not just LLMs) inherited cultural biases against certain animals, and similar tendencies have been found in recent benchmarking efforts (Ghose et al. 2024). See also Compassion in Machine Learning's (CaML) CompassionBench, which compares models in terms of concern for the welfare of sentient beings.

 

Values of future decision makers

There are many ways that a future AGI could be considered “aligned” (see Gabriel 2020 for a useful high-level taxonomy of AI alignment targets), but I think the most likely future in which AGI goes well for humans, is a future where the AGI is aligned with something that is at least fairly close to the average present-day human’s values. And if the AGI absorbs such values, i.e. values that currently permit enormous animal suffering on factory farms for instance, then I do not expect animals to have a good future (see here for a similar argument). If the AI system is mostly or completely indifferent to natural causes of suffering among wild animals, for instance, animal suffering may continue or even accelerate, potentially on a cosmic scale.

Relatedly, if value lock-in is compatible with things going well for humans in a post-AGI world — and thus by definition moral circle expansion would halt — then such futures would likely mean worse outcomes for animals. On the other hand, if values are able to continue to evolve and influence decision making, it is possible that future humans with greater material abundance and more time for leisure activities etc, would expand their moral circle to incorporate animal welfare (Russel 2019, p. 240; Reese Anthis & Paez 2021).

 

An animal-welfare-indifferent AGI may be especially bad from a suffering-focused perspective

If one weighs positive and negative experience of the same intensity equally, and is fundamentally uncertain about whether, on the whole, suffering outweighs enjoyment for sentient animals (Ng 1995; Groff & Ng 2019; Horta 2015; Browning & Veit 2023), then differences in the total amount of animals’ welfare-relevant experiences (by e.g., a change in the number of sentient animals alive at any moment) would not make a difference to expected net welfare, all else being equal. So if an AGI that is indifferent to animal welfare would mostly affect the total amount of animals’ welfare-relevant experiences, there would be no way of determining whether the change in the amount of welfare relevant experiences were good or bad from the net-welfare-agnostic perspective. Although figuring out whether such a change is good or bad becomes extremely valuable and strongly favors avoiding locking in a decision prematurely (Soryl & Sandberg 2025).

However, developing an AGI that cares for humans but is essentially indifferent to the suffering of wild animals might be asymmetrically bad from the perspective of a suffering-focused view. The number of animals on Earth is bounded by zero (i.e., we cannot have less than no suffering, or less than zero animals on Earth), while the number of planets that could be seeded with life is very high (Dello-Iacovo 2016; O'Brien 2021; Soryl & Sandberg 2025), as is the number of simulations of sentient animals that could be run (Tomasik 2019).

 

“Hard-coding” animal welfare concerns (and associated risks)

In terms of scenarios that seem compatible with positive outcomes for animal welfare, it is worth talking about possible attempts to get the AI to value the experiences of sentient animals roughly equal to human experiences; see for instance the idea of a Sentientist Coherent Extrapolated Volition (SCEV, Moret 2023), a proposed variant of Coherent Extrapolated Volition (CEV). I want to flag here that this general approach of incorporating animal welfare might come with an unacceptable level of existential risk (Dearnaley 2023), but given that this leads us into a discussion about scenarios where AGI does not go well for humans, I will return to this question in the conclusions section instead. Lastly I should note that CEV is not a strategy that can currently be implemented, and I’m not sure to what extent it is even a viable strategy once a highly advanced AGI has been created.

 

How likely is it that, if AGI goes well for humans, it will go well for animals?

Arguably, almost any possible future will seem weird in some way given that the status quo is continuous change (Karnofsky 2021), and futures where an AGI is developed and it goes well for humans will seem particularly alien to us. So predicting outcomes for animal welfare in such “weird” futures is not easy. Moreover, we don’t even really know which of all the currently existing animals are sentient. If one were to be completely ignorant, a flat prior for a binary outcome of either 1 or 0 (“will AGI go well for animals if it goes well for humans, yes or no?”), the expected outcome is 0.5, or 50% probability. I’m overall very uncertain, but I mostly see reasons to be pessimistic (see the Considerations section above), so I think my best guess is around a ~30% probability of AGI going well for animals, given it going well for humans. I therefore largely disagree with the statement in the debate week announcement post.

 

Conclusions

General thoughts

The question I’m answering here is ultimately getting at the extent to which outcomes for humans and outcomes for animals are associated — and the specifics of such an association. In general, I think they are probably not correlated on the current trajectory. My overarching worry, and the reason I estimate a 30% likelihood of AGI going well for animals if it goes well for humans, is that being somewhat indifferent to animal welfare is in some sense a feature rather than a bug of current AI-alignment efforts. If alignment is considered successful when it captures an average or consensus of current human values, then a successfully aligned AI would largely disregard the welfare of animals. I want to be clear, though, that I’m not arguing for the opposite extreme either, i.e., I’m not suggesting we try to encode a strong utilitarian animal welfare commitment into an AGI's terminal goals, given that doing so could carry a significant existential risk (Dearnaley 2023). The challenge is finding a middle ground: ensuring that animal welfare is not ignored entirely, while avoiding the opposite failure mode, where the AGI essentially ignores the welfare of humans instead.

I think there are two plausible and broad categories of scenarios, where AGI goes well for humans and for animals. The first is that the values of present-day humans are not locked in. If we can avoid having our current values locked in, that would at least allow for a continued moral circle expansion post AGI. The second is that the future decision makers are mostly indifferent to animal welfare, but that animal welfare ends up being improved essentially by accident, for instance by things like economic incentives to switch to cultivated meat, or perhaps the wild animals that end up being favored in the future have better lives than the ones that dominate currently. Note that accidental improvements are probably not going to end up near an optimum in terms of animal welfare, but might still end up making things meaningfully better.

 

What would I recommend animal welfare advocates do?

Risks of accidental harm

I think it is worth flagging here that there may be ways of doing accidental harm. The main worry I have, as mentioned earlier, is that there might be a strong trade-off between focusing on alignment methods that “work” in some very basic sense (e.g., avoiding human extinction), and focusing on alignment methods that are unlikely to work but would create the ideal AGI if they worked. My guess is that trying to align an AGI to some idealized human values will probably be much harder (and less likely to end well for humans) than to create an AGI that is simply obedient to its creators. If so, people motivated by a concern for animal welfare and pushing for an “ideal” AI, could end up causing more harm than good. Attempts at AI alignment that tries to be sensitive to the scale of (wild) animal welfare, may accidentally create a “tyranny of the tiny” (Dearnaley 2023). If taken to its logical extreme under a consequentialist framework, humans would essentially be seen as a rounding error in the moral calculus. “All humans dying” could be seen by the AGI as minor collateral damage in the path to optimizing the world for e.g., marine copepod welfare, and the welfare of other small and extremely numerous animals.

As another related example that is more immediately relevant: Amanda Askell (working on fine-tuning and AI alignment at Anthropic) was recently interviewed on a podcast (Hard Fork 2026), and she made a compelling argument for not trying to hard-code rules into Claude’s Constitution, but rather giving Claude guidelines for what processes to follow when acting in the world. This is to avoid Claude, in essence, “learning the wrong lesson”, by strictly following a rule even in edgecases where it knows it would lead to bad outcomes. If providing general guidelines is indeed a better tool to help align (current) AI models, then advocating for some specific constraint that would require LLMs to e.g., “never assist in planning a non-vegan meal,” could be a bad idea (it would likely also be disliked by many users, making such an approach inviable). More open-ended approaches, like the reference to animal welfare in Claude’s Constitution, would then be preferable.

Furthermore, while the representation of animal welfare concern in training data for AI models is presumably very low, I think it is better to work collaboratively with AI companies and policymakers than to e.g., try to solve the problem using “brute force” and synthetic data. Just as Google dislikes manipulations of their search result rankings, AI companies will likely not appreciate deliberate attempts to manipulate their training data. There might be a middle ground that consists of working with AI companies to address the data gap, although I’m not sure how efficient such an approach would be.

 

Tentative suggestions

With all of those caveats out of the way, below are a list of ideas that seem like productive areas to explore:

  • Identifying ways for animal welfare concerns to be included in alignment approaches such as documents intended to govern AI behavior. This would involve collaborating with people at the leading AI companies.
  • Exploring whether there are productive ways to influence AI policy and governance.
    • See, for example, the policy brief about AI and animal welfare, written by a member of New Zealand’s Animal Justice Party (Singleton 2025).
    • See also Taylor (2023) for a discussion of whether there are ways to ensure representation of animals’ interests in decision making around AI.
  • Making the case for animal welfare concern, and especially concern for wild animal welfare, which feed into discussions around what successful AI alignment would mean in practice.
  • Evaluations of the apparent values and “psychology” of current AI models, to inform action.
  • Community building.

 

What would make me change my mind?

I’ve only spent a few days writing this post, and I’m quite uncertain about both my probability estimate and my suggestions for how to make sure that probability goes up. Below are some of my main uncertainties where new information is likely to change my mind meaningfully on the probability estimate in particular.

  • If I got the AI-timeline prediction seriously wrong.
  • If I encountered information that indicated to me that the general population and current AI models have a lot more concern for (wild) animal welfare than I currently think they do.
  • If there was clear evidence that in order for AGI to eventually go well for humans, something like specifying ‘indirect normativity' would be required, such that trying to create an AI that is obedient (Potham & Harms 2025) to current human preferences was not sufficient.
  • If I encountered evidence that moral circle expansion would be very likely to continue in a future where AGI goes well for humans.
  • If I encountered compelling arguments for whether simulated animals would generally have good or bad lives.

Note that different ways of defining terms (like “going well”) may also lead to apparent disagreement (i.e., if we were just answering the same question, we would not disagree).

 

Acknowledgments

Thanks to Shannon Ray for reviewing and helping improve this post.

5

0
0

Reactions

0
0

More posts like this

Comments
No comments on this post yet.
Be the first to respond.
Curated and popular this week
Relevant opportunities