Hide table of contents

Imagine GPT-8 arrives in 2026, creates a fast takeoff and AGI kills us all. 

Now imagine the AI Safety work we did in the meantime (technical and non-technical) progressed in a linear fashion (i.e. in resources committed), but it obviously wasn't enough to prevent us all being killed.

What were the biggest, most obvious mistakes we made as (1) individuals and (2) as a community? 

For example, I spend some of my time working on AI Safety but it is ~ 10%. In this world I probably should have committed my life to it. Maybe I should have considered more public displays of protest?




New Answer
New Comment

3 Answers sorted by

Not going all out on pushing for a global moratorium on AGI now. I'm starting to feel that all my other work is basically just rearranging deckchairs on the Titanic.

I am starting to feel the same mate. Reach out if I can help.

Have DM'd you

Personal answer (I am now trying to correct this): Not pivoting fast enough into interpretability research.

Great question! If AI kills us in the next few years, it seems likely it would be by using a pathway to power that is currently accessible by humans; and just acts as a helper/accelerator to human actions.

The top two existential risks that meet that criteria are engineered bioweapon and nuclear exchange.

Currently, there is a great deal of research into how LLMs can assist research in a broad set of fields, with good results performing similar to a human specialist and in creativity tasks to identify new possibilities. Nothing that human researches can’t do, but the speed and low cost of these models is already surprising and likely to accelerate many fields.

For bioweapon risk, the risk I see would be direct development where the AI is an assistant to the human-led efforts. The specific bioengineering skills to create a an AI designed bug are scarce, but the equipment isn’t.

How could an AI accelerate nuclear risk? One path I could see is again AI-assisted, human led, except for controlling social media content and attitude to increase global tensions. This seems less likely than the bioweapon option.

What others are there?

I like this take: if AI is dangerous enough to kill us in three years, no feasible amount of additional interpretability research would save us.

Our efforts should instead go to limiting the amount of damage that initial AIs could do. That might involve work securing dangerous human-controlled technologies. It might involve creating clever honey pots to catch unsophisticated-but-dangerous AIs before they can fully get their act together. It might involve lobbying for processes or infrastructure to quickly shut down Azure or AWS.

Curated and popular this week
Relevant opportunities