I was listening to https://80000hours.org/podcast/episodes/paul-christiano-ai-alignment-solutions recently, and found it very helpful. However, I still have a question at the end of it.
What is the plan that organizations like OpenAI have to prevent bad outcomes from AGI? From how Paul Christiano frames it, it seems like it's "create AGI, and make sure it's aligned."
But I don't understand how this plan accounts for competition. To use a stupid analogy, if I were concerned that cars in America weren't safe, I might start my own car company to manufacture and sell safer cars. Maybe I'd spend a lot of time engineering a much safer car. But my efforts would be for naught if my cars weren't very popular (and hence my company wasn't very successful), even if they were groundbreakingly safe.
It seems like this latter part (getting the safer product widely adopted) is most of the trick, at least in the domain of cars.
I'd like to understand in more detail how this analogy breaks down. I can imagine several ways, but would love to hear it straight from the horse's mouth.
Good point. Maybe another thing here is that under Paul's view, working on AGI / AI alignment now increases the probability that the whole AI development ecosystem heads in a good direction. (Prestigious + safe AI work increases the incentives for others to do safe AI work, so that they appear responsible.)
Speculative: perhaps the motivation for a lot of OpenAI's AI development work is to increase its clout in the field, so that other research groups take the AI alignment stuff seriously. It may also be a way of drawing in talented researchers, increasing the overall proportion of AI researchers who work in a group that takes safety seriously.