Hello! I'm Toby. I'm a Content Strategist at CEA. I work with the Online Team to make sure the Forum is a great place to discuss doing the most good we can. You'll see me posting a lot, authoring the EA Newsletter and curating Forum Digests, making moderator comments and decisions, and more.
Before working at CEA, I studied Philosophy at the University of Warwick and worked for a couple of years on a range of writing and editing projects within the EA space. Recently I helped run the Amplify Creative Grants program, which encourages more impactful podcasting and YouTube projects. You can find a bit of my own creative output on my blog and in my podcast feed.
Reach out to me if you're worried about your first post, want to double check Forum norms, or are confused or curious about anything relating to the EA Forum.
Thanks! That's clarifying.
I wonder, though: would that kind of world, where humans are empowered but don't experience intense (and perhaps moderate) suffering, be one where humans cared about animal welfare? I can see the intuition going either way. Either:
a) Extrapolating beyond person-to-person morality is (often) a luxury pursuit and more of it will happen in a post-scarcity world.
b) Caring about animal suffering in the food system and in nature requires compassion, and compassion is rooted in being able to imagine the states of the sufferer. If humans all live minimal-suffering lives, they won't be able to do so.
I think in the long-run I'd be more confident that corrigible AI would lead to good futures than AI that is aligned to specific values (besides perhaps some side-constraints). This is mainly because I'm pretty clueless and think our current values are likely to be wrong, and I'd rather we had more time to improve them.
I haven't thought enough about the relationship between power concentration and corrigibility though - I expect that could change my mind.
AGI, whether rogue or human-aligned, may not decide to keep other planets free of biological animals (though it seems like a bigger risk for human-aligned AGI)
This is a really interesting point that I hadn't thought of before.
Very lightly held counterargument to your conclusion:
P1: The more capable an AGI system is, the harder it is to align.
P2: Terraforming other planets requires AGI at the very top of the capability distribution.
P3: The pool of systems capable of terraforming is therefore drawn disproportionately from the capability range where misalignment is most likely.
Conclusion: Most worlds containing planet-terraforming AGI are probably rogue-AGI worlds. So the "spreading wild animal suffering to new planets" scenario may be more associated with alignment failure than alignment success.
Corollary: If you agree, you should be mildly agree-voting.
Anyone can post a comment, which our guests and other participants can respond to. These comments might be questions whose answers could change your mind on the debate statement, or crucial considerations that you're uncertain about and might be able to make progress on in this conversation.
Nice little Claude summary of the debate so far, which might help identify the missing points:
The debate centres on whether human-aligned AGI would automatically benefit animals, or whether animal-specific interventions are needed.
The pessimistic case is well-represented. Jim Buhler argues we have no good reason to assume AI safety work helps animals — saving humans preserves factory farming, and the claim that empowered humans would improve wild animal welfare rests on untenable assumptions. Simon Eckerström Liedholm (Wild Animal Initiative) estimates only ~30% probability of good animal outcomes conditional on good human outcomes, largely because the most likely alignment path locks in current human values, which permit enormous animal suffering. Hannah McKay (Rethink Priorities) argues that cultivated meat won't be automatically solved by AGI — regulatory, political, and consumer barriers form a sequential chain where the combined probability of resolution is low.
The bridge position comes from Aidan Kankyoku, who thinks it probably (~70%) goes well for animals but that this isn't sufficient certainty to neglect animal-specific alignment. He argues animal welfare is now functionally a subsidiary of the "Make AI Go Well" movement.
MichaelDickens contributed three posts: a taxonomy of alignment research by animal-friendliness, a cost-effectiveness model finding alignment-to-animals only marginally more cost-effective than general alignment, and a meditation on how current alignment paradigms (unlike CEV) give him roughly 50/50 odds on animal outcomes.
The discussion thread (~58 comments) skews disagree, though with real spread. The most common argument for disagreement is historical precedent: technological and economic progress has been bad for animals so far, with factory farming as the central exhibit. Value lock-in is the second recurring worry: that alignment to current human values would freeze in a set of preferences that are largely indifferent to animal suffering (SimonM_, Babel, Dylan Richardson, Tristan Katz). Several voters also flag the risk of spreading wild animal suffering to new planets.

On the agree side, the strongest argument is economic: post-scarcity conditions erode factory farming's viability because alternatives become cheaper (OscarD, Erich_Grunewald, Brad West, JDBauman). A few voters (Ronak Mehta at 100% Agree, Ligeia, Artūrs Kaņepājs) argue that a genuinely superintelligent system would recognise animal sentience as morally relevant.

A notable cluster sits at or near 0% Agree not because they're confident things go badly, but because they think the question is unanswerable given the number of branching futures (NickLaing, Seth Ariel Green, Jim Buhler). Peter Wildeford offers a useful split: on a causal reading (alignment mechanisms also help animals) he's pessimistic; on an evidential reading (conditional on good human outcomes, what world are we in?) he's somewhat more optimistic.
PS: looks like Michael Dickens just posted on this.