I’ve written a draft report evaluating a version of the overall case for existential risk from misaligned AI, and taking an initial stab at quantifying the risk from this version of the threat. I’ve made the draft viewable as a public Google Doc here (Edit: arXiv version here, video presentation here, human-narrated audio version here). Feedback would be welcome.
This work is part of Open Philanthropy’s “Worldview Investigations” project. However, the draft reflects my personal (rough, unstable) views, not the “institutional views” of Open Philanthropy.
Could you clarify what you mean by this? I'm not sure I understand what the word "true", italicized, is supposed to mean here. Are you just reporting the impression (i.e., a belief not adjusted to account for other people's beliefs) that you are ~100% certain an existential catastrophe from misaligned, power-seeking AI will (by default) occur by 2070? Or are you saying that this is what prima facie seems to you to be the case when you extrapolate naively from current trends? The former seems very overconfident (even conditional on an existential catastrophe occurring by that date, it is far from certain that it would be caused by misaligned AI), whereas the latter looks pretty uninformative, given that it leaves open the possibility that the estimate will be substantially revised downward once additional considerations are incorporated (and you do note that you think "there's a decent chance of exciting surprises"). Or perhaps you meant neither of these things?
I guess the most helpful thing (at least to someone like me who's trying to make sense of this apparent disagreement between you and Joe) would be for you to state explicitly what probability assignment you think the totality of the evidence warrants (excluding evidence derived from the fact that other reasonable people have beliefs about this), so that one can then judge whether the discrepancy between your estimate and Joe's is so significant that it suggests "some mistake in methodology" on your part or his, rather than a more mundane mistake.