I was listening to https://80000hours.org/podcast/episodes/paul-christiano-ai-alignment-solutions recently, and found it very helpful. However, I still have a question at the end of it.
What is the plan that organizations like OpenAI have to prevent bad outcomes from AGI? From how Paul Christiano frames it, it seems like it's "create AGI, and make sure it's aligned."
But I don't understand how this plan accounts for competition. To use a stupid analogy, if I were concerned that cars in America weren't safe, I might start my own car company to manufacture and sell safer cars. Maybe I spend a lot of time engineering a much safer car. But my efforts would be for naught if my cars weren't very popular (and hence my company wasn't very successful), even if they were groundbreakingly safe.
It seems like this latter part (getting the safer cars widely adopted) is most of the trick, at least in the domain of cars.
I'd like to understand in more detail how this analogy breaks down. I can imagine several ways it might, but would love to hear it straight from the horse's mouth.
I think that's basically right. I believe something like that was Eliezer's plan too, way back in the day, but then he updated toward believing that we don't have the basic ethical, decision-theoretic, and philosophical groundwork figured out that's a prerequisite to actually building a safe AGI. More on that in his Rocket Alignment Dialogue.