
I remember the time when I wasn't yet fully convinced that risks from misaligned AI pose serious threats to humanity. I also remember the argument that finally convinced me this must be the case. The argument is simple: superintelligent agents can be deceitful, and interpretability is difficult. As a result, you might assume you know what to expect from your superintelligent model when, well, you don't.

So, I'm asking everyone: what is the argument that convinced you of the importance of the alignment problem? Gathering these answers will help me reflect on the best ways to communicate AI safety to people who don't take alignment seriously or have no background in the area. Thanks in advance!

1 Answer

Short: Why sandboxing doesn't work.
tl;dr: this and this, but especially this and this.

I'll even give you a detailed account of the different steps I went through; I'm all too happy to answer.

Unexpected sidenote: Anticipating as many counter-arguments as you can seems to me a very good strategy during a presentation. In my local EA student group, I made sure to cover 9-10 common counter-arguments, and the results were quite impressive (one person pivoted from unconvinced to quite convinced).

So here is how it evolved for me:
1-Superintelligence risk exists as a topic, plus the inventor of JPEG saying the singularity is possible during a conference at uni.
Me: OK, but it sounds just crazy, and I feel bad hearing about it. They all must be crazy. Or I must be exaggerating the problem in my head. Let's forget about it.

2-Defining instrumental convergence, and reframing AGI as an algorithm rather than a robot apocalypse.
Me: Ah, I see why it'd be a problem. But it'll figure out ethics on its own, right?

3-Orthogonality thesis
Me: Oh, I see the problem now. Well, let's just sandbox it?

4-Containment problem
This one took me a while. You really need to enumerate the different scenarios and models as exhaustively and clearly as possible. The first time it was explained to me, some of it was left implicit, and I didn't understand. I almost felt like I was "harassing" the presenter to get it finally spelled out, and that caused a big update in my head.
Me: Oh, OK, damn. But it seems very theoretical, or, I don't know, socio-historically situated. It sounds like a sci-fi thing.

5-Seeing the expert surveys, and learning that people like Stuart Russell, not only people like Elon Musk, are worried about this.
Me: OK, but that still seems very, very far away.

6-Discovering AIs that are scary as shit and yet not powerful.

I would definitely rank 4 (containment) as the most crucial one. Were it not for my "harassment", I think I would have passed over the entire cause area.
A fellow member of our group noted 5 (expert surveys) as the most convincing.
The only argument I dared to share publicly on social media, however, is 6 (existing AIs): less directly related, but pretty robust.

PS: This is my first post on the forum; sorry if I'm not respecting all the norms '^^
