An Exercise to Build Intuitions on AGI Risk

Lauro Langosco

9 min read · Jun 8, 2023

Of course, AGI safety researchers do build research experience in adjacent fields like deep learning and maths, but there are intuitions and ways of thinking specific to AGI safety that one doesn’t typically inherit from other fields.

I adopt the terms “builder / breaker” from the ELK report, though I may not be using the terms in exactly the same way.

If helpful, you can choose a more concrete disaster scenario, such as “an autonomous human-level AGI breaks containment”.

I'm somewhat dissatisfied with this example because the flaw is obvious enough that there's no need to go into much concrete detail. Usually you'd do more of that, e.g. if the plan is to use the oracle or 'tool-AI' to prevent a dangerous AGI from being built, how exactly might that work?

Introduction

The exercise

Builder phase

Breaker phase

Iterate

Details

Resources

Writing on AGI safety

Other writing on how to learn about / work in AGI safety