Hello! I'm Toby. I'm a Content Strategist at CEA. I work with the Online Team to make sure the Forum is a great place to discuss doing the most good we can. You'll see me posting a lot, authoring the EA Newsletter and curating Forum Digests, making moderator comments and decisions, and more.
Before working at CEA, I studied Philosophy at the University of Warwick and worked for a couple of years on a range of writing and editing projects within the EA space. Recently I helped run the Amplify Creative Grants program, which encourages more impactful podcasting and YouTube projects. You can find a bit of my own creative output on my blog and on my podcast feed.
Reach out to me if you're worried about your first post, want to double check Forum norms, or are confused or curious about anything relating to the EA Forum.
If you had to allocate a marginal $500,000, would you put it towards animal-specific alignment work (like the ideas in this list) or general alignment work?
Thanks! That's clarifying.
I wonder, though: would that kind of world, where humans are empowered but don't experience intense (and perhaps even moderate) suffering, be one where humans care about animal welfare? I can see the intuition going either way. Either:
a) Extrapolating beyond person-to-person morality is (often) a luxury pursuit, so more of it will happen in a post-scarcity world.
b) Caring about animal suffering in the food system and in nature requires compassion, and compassion is rooted in being able to imagine the states of the sufferer. If all humans live minimal-suffering lives, they won't be able to do so.
I think that, in the long run, I'd be more confident in corrigible AI leading to good futures than in AI that is aligned to specific values (besides perhaps some side constraints). This is mainly because I'm pretty clueless and think our current values are likely to be wrong, and I'd rather we had more time to improve them.
I haven't thought enough about the relationship between power concentration and corrigibility, though; I expect that could change my mind.
AGI, whether rogue or human-aligned, may not decide to keep other planets free of biological animals (though it seems like a bigger risk for human-aligned AGI).
This is a really interesting point that I hadn't thought of before.
Very lightly held counterargument to your conclusion:
P1: The more capable an AGI system is, the harder it is to align.
P2: Terraforming other planets requires AGI at the very top of the capability distribution.
P3: The pool of systems capable of terraforming is therefore drawn disproportionately from the capability range where misalignment is most likely.
Conclusion: Most worlds containing planet-terraforming AGI are probably rogue-AGI worlds. So the "spreading wild animal suffering to new planets" scenario may be more associated with alignment failure than alignment success.
Corollary: If you agree, you should be mildly agree-voting.
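A rough way to make this precise (just a sketch, using notation I'm introducing here: capability $c$, misalignment event $M$, and an assumed threshold $c^*$ such that terraforming requires $c \ge c^*$; I'm also simplifying by treating misalignment as depending only on capability): P1 says $P(M \mid c)$ increases with $c$, and P2 restricts terraforming-capable systems to $c \ge c^*$. Conditioning on $c \ge c^*$ shifts probability mass toward higher capability, so

$$P(M \mid c \ge c^*) \;=\; \mathbb{E}\big[\,P(M \mid c)\,\big|\, c \ge c^*\,\big] \;\ge\; \mathbb{E}\big[\,P(M \mid c)\,\big] \;=\; P(M),$$

i.e. the pool of terraforming-capable systems is at least as likely to be misaligned as AGI in general, which is what P3 and the conclusion claim.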
Thanks for your contributions to the discussion, @Hannah McKay🔸, @Jo_🔸, @Lee Wall, and @Alistair Stewart!
I have to head off at 7, but you are welcome to keep commenting, as is anyone else who sees this comment.