Defusing AGI Danger

For the interested, this is a good example of backchaining applied to AI safety. ↩︎
Technically, we want to expand the parts of the argument where we think additional labor can most shift it from being “true” to “false”. Just expanding things that might be false seems like a good proxy. ↩︎
See The Rocket Alignment Problem for an example of such an argument. ↩︎
Rohin Shah puts about 30% on “the first thing we try just works and we don’t even need to solve any sort of alignment problem” in AI Alignment Podcast: An Overview of Technical AI Alignment in 2018 and 2019 with Buck Shlegeris and Rohin Shah. ↩︎

tl;dr

Introduction

Applied to AGI Safety

(2) AGIs will be autonomous agents...

(3) AGI goals will be misaligned with what we want...

Applied to Agendas

Pitfalls

Vague danger scenarios

False narratives

Conclusion