I was listening to https://80000hours.org/podcast/episodes/paul-christiano-ai-alignment-solutions recently, and found it very helpful. However, I still have a question at the end of it.
What is the plan that organizations like OpenAI have to prevent bad outcomes from AGI? From how Paul Christiano frames it, it seems like it's "create AGI, and make sure it's aligned."
But I don't understand how this plan accounts for competition. To use a stupid analogy: if I were concerned that cars in America weren't safe, I might start my own car company to manufacture and sell safer cars. Maybe I'd spend a lot of time engineering a much safer car. But my efforts would be for naught if my cars weren't very popular (and hence my company wasn't very successful), even if they were groundbreakingly safe.
It seems like this latter part is most of the trick, at least in the domain of cars.
I'd like to understand in more detail how this analogy breaks down. I can imagine several ways, but would love to hear it straight from the horse's mouth.
I think the important disanalogy is that once you've created a safe AGI of sufficient power, you win. (Because it's an AGI, it can go around doing powerful AGI stuff – other projects could be controlled or purchased, etc.)
It's not guaranteed that the first project past the post will end up being the winner, but being first is probably a big advantage. Bostrom discusses this in the multipolar / singleton sections of Superintelligence, if I recall correctly.
Drexler's Comprehensive AI Services is an alternative framing for what we mean by AGI. Probably relevant here, though I haven't engaged closely with it yet.
Controlling or purchasing other projects would probably incur a lot of bad PR, though.