After reading this linkpost, I’ve updated toward thinking that there’s actually more agreement between Yudkowsky and Christiano than I thought there was. In summary, they seem to agree on:

- AI systems are being developed that could deliberately and irreversibly disempower humanity. These systems could exist soon, and there won’t necessarily be a "fire alarm."

- Many of the projects intended to help with AI alignment aren’t making progress on the key difficulties, and don’t really address the “real” problem of reducing the risk of catastrophic outcomes.

- AI risk seems especially hard to communicate, because people want to hear either that everything is fine or that the world is ending (see need for closure / ambiguity aversion). The truth is much more confusing, and human minds have a tough time dealing with this.

These points of agreement might seem trivial, but at least Yudkowsky and Christiano are, in my opinion, speaking the same language, i.e., not talking past each other the way they appeared to be in the 2008 Hanson-Yudkowsky AI-Foom debate.