
I believe it was Paul Christiano who said in an 80,000 Hours interview that there is a surprisingly high chance that AI alignment will turn out not to be difficult.

I’m curious whether anyone has researched or tried to forecast how likely it is that AI alignment turns out to be a difficult problem versus an easy one, relative to progress in creating advanced AI.

Specifically, if current trends in AI capabilities and AI safety research continue, how likely is it that AI alignment will come about naturally by the time we reach transformative AI, so that we can robustly and sustainably prevent x-risk from AI on our current trajectory?





If you haven't already come across it, you might find the points given under the "how big of a risk is misalignment" section of Ajeya Cotra's post on Cold Takes interesting. I would be pretty interested in a more comprehensive list of the ways alignment optimists and pessimists tend to disagree about the difficulty of the problem, and in what ML experts (outside of AI safety) think about each specific point, or whether they have other cruxes.
