I've recently been thinking about AI alignment as analogous to raising a child, and I'd like feedback on this idea.
Specifically, aligning AI to human values feels like what happens when (for example, no offense intended) a Mormon family raises their children to be Mormon. I think of stories like Educated, in which parents try really hard to make their kids follow their religious values, and sometimes fail.
I think this view differs from other stories of alignment because it views an AGI as having goals of its own, beyond the ones we give it (maybe this assumes inner misalignment by default?). This story feels very plausible to me because I think that in producing AGI we will create agents that are morally valuable and have their own goals. The example of a Mormon family feels relevant here because it seems morally dicey or wrong for parents to insist on their child's goals/values; it may likewise be dicey to write the utility function of another sentient being and center that function on ourselves. The analogy may also be useful because it shows just how hard alignment is. It falls short because children do not become superintelligent.
Beyond feedback on this framing, I would also appreciate links to relevant things other people have written, thanks!
You're in good company. Tom Griffiths offers this analogy in Brian Christian's 'The Alignment Problem':
“Griffiths views parenthood as a kind of proof of concept for the alignment problem. The story of human civilisation, he notes, has always been about how to instill values in strange, alien, human-level intelligences who will inevitably inherit the reins of society from us - namely, our kids.”