This is an article I wrote last year, for a general audience, about my mental model of the risks of AGI (with various tweaks since). At the time, it had nearly zero readers. Recently I was a bit miffed about accelerationist David Shapiro deleting my entire conversation with him on Substack (in which he mostly ignored me anyway), so I thought "why not try a friendlier audience?" Let me know if you think I did a good job (or not).

Synopsis: for reasons laid out in the article, I think most of the risk comes from the possibility of a low compute-to-intelligence ratio, and I doubt that the first AGI will be the one we should worry about most. Instead, the problem is that the first one leads to others, and to competition that drives the creation of more and varied designs. I don't imagine my perspective is novel; the point is just to explain it in a way that is engaging, accessible, and logical.

It's a long post, and it starts by talking about consciousness. 

Does it contain any response to the classic case for AI Risk, e.g. Bostrom's Superintelligence or Yudkowsky's List of Lethalities? 

I only mentioned human consciousness to help set up an analogy; I hope it wasn't taken as a claim about machine consciousness.

I haven't read Superintelligence, but I expect it contains the standard material―outer and inner alignment, instrumental convergence, and so on. For the sake of easy reading, I lean into instrumental convergence without naming it, and leave the alignment problem implicit as a problem of machines that are "too much" like humans, because

  • I think AGI builders have enough common sense not to build paperclip maximizers
  • Misaligned AGIs―ones that seem superficially humanlike but end up acting in drastically pathological ways when scaled up to ASI―are harder to describe, so instead I describe (by analogy) something similar: humans outside the usual distribution. I argue that psychopathy is the absence of empathy, and once AGIs surpass human ability it becomes far too easy to build a machine like that. (Indeed, I could have added that even normal humans can easily switch off their empathy, with monstrous results; see the Nazis or Mao's CCP.)

I don't incorporate Yudkowsky's ideas because I found the List of Lethalities annoyingly incomplete and unconvincing, and I'm not aware of anything clearer or more complete that he's written. Let me know if you can point me to something.

Okay, not a friendly audience after all! You guys can't say why you dislike it?

Story of my life... silent haters everywhere.

Sometimes I wonder: if Facebook groups had downvotes, would it be as bad, or worse? I mean, can EAs and rationalists muster half as much kindness as normal people toward someone saying the kinds of things their ingroup normally says? It's not as if I came in here insisting that alignment is actually easy.
