What would an animal-aligned AI be aligned to?


^[1] Research into the welfare ranges of different species by Rethink Priorities, considered leading work in this area, put the 90% confidence interval for a pig’s capacity for pleasure and suffering between 0.005 and 1.031 times that of a human, with a median of 0.515. For shrimp, the interval runs from 0 to 1.149, with a median of 0.031.

^[2] Meaning, roughly, better than the best humans at all cognitive tasks.

^[3] There may be no humans in the loop, or humans may believe they are in the loop but their decisions are influenced by superintelligent, superpersuasive AI.

^[4] Jakub Stencel left me a long comment on a draft explaining that I was misrepresenting moral realism here in a narrow way that was not relevant to the overall post. Consider yourself warned.

^[5] One could ask: if intelligence does not converge on moral truths (i.e. morality is subjective), why should we advocate for any moral worldview? Doesn’t that mean our particular moral beliefs are arbitrary? The answer is that if there is no stance-independent moral vantage point from which our moral preferences could be falsified, then our preferences need no further vindication than our own introspection. If you believe suffering is bad, you should fight hard for that view, because there’s no guarantee that future agents will share it.

^[6] In plain English: will beings with more neurons tend to have a greater capacity for sentience? I don’t think neuron count is actually what creates capacity for sentience, but it might be a decent proxy. If it isn’t, n would be zero: the same welfare range regardless of neuron count. If welfare range scales linearly with neuron count (n = 1), then a being with twice as many neurons has, on average, twice the capacity for suffering and wellbeing. I’m being deliberately abstruse here to point towards the fact that the truth of these matters is probably not very intuitive.
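To make the exponent concrete, here is a minimal sketch of the scaling rule described in this footnote, assuming welfare range is proportional to neuron count raised to the power n. The function name and the neuron counts are illustrative assumptions, not figures from the post; the counts are rough order-of-magnitude placeholders.

```python
def relative_welfare_range(neurons: float, reference_neurons: float, n: float) -> float:
    """Welfare range relative to a reference species, assuming it scales as neurons ** n."""
    if neurons <= 0 or reference_neurons <= 0:
        raise ValueError("neuron counts must be positive")
    return (neurons / reference_neurons) ** n


# Rough, illustrative neuron counts (assumptions, not taken from the post).
HUMAN_NEURONS = 86e9
CHICKEN_NEURONS = 2e8

for n in (0.0, 0.5, 1.0, 2.0):
    ratio = relative_welfare_range(CHICKEN_NEURONS, HUMAN_NEURONS, n)
    print(f"n = {n}: chicken welfare range is {ratio:.6f} of a human's")
```

With n = 0 every species gets the same welfare range; as n rises past 1, the gap between small-brained and large-brained animals widens sharply, which is why the credence placed on "n is less than 1" matters so much.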

^[7] 50% means perfect uncertainty; 0% means “this claim is certainly false”.

^[8] Values of n above 1 mean that, for example, chickens are much less morally important than humans, while values below 1 imply a more modest discrepancy.

^[9] Some other biological metric may be a better proxy than neuron count. I’m not trying to make a confident statement about how this stuff works (that’s the point), but for people who are justifiably more confident than me, see here.

^[10] This is based on 1) the fact that Anthropic’s constitution doesn’t attempt to assign weights to principles, 2) my own understanding of how constitutional training works, and 3) my experience auditing Claude’s ethical propensities as part of Anima International’s Animal Welfare Alignment Team.

^[11] A few researchers have worked very hard on some of them, but relative to similarly hard questions, this is a small collective investment.

| Breadth | Alignment target |
| --- | --- |
| Broad principles | Compassion, fairness |
| ↓ | Work to reduce suffering and increase happiness of sentient minds, regardless of the substrate (human, nonhuman animal, digital, or beyond) |
| ↓ | Maximize wellbeing given that 1) capacity for wellbeing increases linearly with neuron count and 2) one minute of excruciating pain is as bad as 100 hours of annoying pain |
| Specific outcomes | Replace all animal farming with cell-cultivated meat. Use gene drives or other technologies to abolish suffering in wild animals. |
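As a rough illustration of how specific the third row is, the sketch below turns its pain-equivalence assumption into arithmetic: if one minute of excruciating pain is as bad as 100 hours of annoying pain, then excruciating pain carries a per-minute weight of 100 × 60 = 6000 relative to annoying pain. The names `suffering_score`, `ANNOYING_WEIGHT`, and `EXCRUCIATING_WEIGHT` are hypothetical, and the simple additive model is an assumption made only for this example.

```python
ANNOYING_WEIGHT = 1.0
# 100 hours of annoying pain, expressed in minutes, per minute of excruciating pain.
EXCRUCIATING_WEIGHT = 100 * 60 * ANNOYING_WEIGHT


def suffering_score(minutes_excruciating: float, minutes_annoying: float) -> float:
    """Total suffering in 'annoying-pain minutes' under a simple additive model."""
    if minutes_excruciating < 0 or minutes_annoying < 0:
        raise ValueError("durations must be non-negative")
    return (minutes_excruciating * EXCRUCIATING_WEIGHT
            + minutes_annoying * ANNOYING_WEIGHT)


# One minute of excruciating pain versus 100 hours (6000 minutes) of annoying pain.
print(suffering_score(1, 0), suffering_score(0, 100 * 60))
```

An AI trained on a target this narrow would rank outcomes very differently if the weight were 600 or 60,000 instead, which is the sense in which narrower targets embed more contestable assumptions than "compassion, fairness".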

| Belief | Credence^[7] |
| --- | --- |
| Treat sentient beings with compassion | 99.9% |
| Treat nonhuman animals as sentient, using the best evidence to determine their sentience-adjusted welfare range | 98% |
| Work to reduce suffering and increase happiness of sentient minds | 85% |
| Work to abolish the slaughter of sentient animals for food, clothing, and research | 70% |
| The exponent n by which sentience-adjusted welfare range scales^[8] relative to “number of neurons^[9] ^ n” is less than 1 | 55% |
| Hens in battery cages lead net-positive lives | 10% |


Alignment to what?

Does intelligence lead to better moral conclusions?

Which questions should we defer to superintelligent AI?

Current alignment strategies are imprecise

Where does that leave animal welfare alignment?

Strategy #1: Train & evaluate a broad distribution of practical decisions

Strategy #2: Urgently research unresolved foundational questions

Minimum viable values for animal welfare alignment

Summary of recommendations