Peter Salib

Law Professor @ University of Houston

Comments

Agree, and this relates to my point about distinguishing the likelihood of retaining alignment knowledge from the likelihood of rediscovering it. 

A thought on unipolarity: One worry here is that the pursuit of post-AGI unipolarity could be the exact kind of thing that triggers a catastrophic setback. If one nation or coalition looks poised to create an aligned-to-them AGI and lock up control of the future, this gives other nations strong incentives to launch preemptive strikes before that happens. This can be true even if everyone understands the impending AGI to be quite benevolent. There are plenty of nuclear-armed nations whose leaders might strongly prefer to retain private power, even to the detriment of their populations' wellbeing, rather than accept benign foreign hegemony.

I think it's worth separating out two questions about alignment in the re-run: (1) how likely are we to retain alignment knowledge, and (2) how likely are we to rediscover alignment knowledge? The second seems important because, in some possible futures, rediscovery of alignment techniques seems likely to correlate strongly with rediscovery of the technology necessary to make ASI. Then, conditional on being able to make capable AI, we might be quite likely to be able to align it.

The more that alignment is easy, and the relevant techniques look a lot like the techniques you need to make capable AI in the first place, the more likely alignment rediscovery conditional on AI rediscovery seems. While it's currently uncertain whether today's alignment techniques will scale to ASI, many (e.g. RL-based techniques) do seem quite closely related to the techniques you need to make the AI capable in the first place.