Excellent explanation. It seems to me that this problem might be mitigated if we reworked AI's structure/growth so that it mimicked a human brain as closely as possible.
I think a superintelligent AI will be able to find solutions with no moral uncertainty. For example, I can't imagine what philosopher would object to bioengineering a cure to a disease.
Sorry, I'm still a little confused. If we establish an AI's terminal goal from the get-go, why wouldn't we have total control over it?
Ok, maybe don't include every philosopher. But I think it would be good to include people with a diverse range of views: utilitarians, deontologists, animal rights activists, human rights activists, etc. I'm uncomfortable with the thought of AI unilaterally imposing a contentious moral philosophy (like extreme utilitarianism) on the world.
Even with my constraints, I think AI would be free to solve many huge problems, e.g. climate change, pandemics, natural disasters, and extreme poverty.
I've tried to engage with the intro materials, but I still have several questions:
a. Why doesn't my proposed prompt solve outer alignment?
b. Why would AI ever pursue a proxy goal at the expense of its assigned goal? The human evolution analogy doesn't quite make sense to me because evolution isn't an algorithm with an assigned goal. Besides, even when friendship doesn't increase the odds of reproduction, it doesn't decrease the odds either; so this doesn't seem like an example where the proxy goal is being pursued at the expense of the assigned goal.
c. I've read that it's very difficult to get AI to point at any specific thing in the environment. But wouldn't that problem be resolved if AI deeply understood the distribution of ways that humans use language to refer to things?
"How do we choose the correct philosophers?" Choose nearly all of them; don't be selective. Because the AI must get approval from every philosopher, this will be a severe constraint, but it ensures that the AI's actions will be unambiguously good. Even if the AI has to make contentious extrapolations about some of the philosophers, I don't think it would be free to do anything awful.
Thanks for the thoughtful response, Ryan.
Why do you think the increase in racing between nations would outweigh the decrease in racing between companies? I have the opposite intuition, especially if the government strikes a cosmopolitan tone: "This isn't an arms race; this is a global project to better humanity. We want every talented person from every nation to come work on this with us. We publicly commit to using AGI to do ___ and not ___."
I have trouble understanding why a nation would initiate violent conflict with the U.S. over this. What might that scenario look like?
Finally, if the government hired AGI companies' ex-employees, people concerned about x-risk would be heavily represented. (Besides, I think government is generally more inclined than companies to care about negative externalities and x-risk; the current problem is ignorance, not indifference.)