When I introduce people to AI safety I usually get one of three responses:
a) “that makes a lot of sense. What can we do about it?”,
b) “I get it rationally, but intuitively I don’t feel it”,
c) “I just don’t buy it - I don’t think machines can be smarter than humans”, “I still think that we can just program them the way we want” or something along these lines.
I get the last response even after giving the standard arguments for why a stop button won’t work, why superhuman intelligence is plausible or why intelligence doesn’t imply morality. So my hypothesis is that they find the thought of unaligned superhuman AI so unintuitive that they are unwilling to actually consider the arguments.
Thus, my question is: What are the best intuition pumps for AI safety?
I’m personally looking for Carl Shulman-style common sense arguments similar to those in his 80K podcast appearance. He argues that buying insurance for a gain-of-function lab would probably cost billions of dollars, which gives us a better intuition about the risk involved.
I have recently started making the following argument. If you think that AI won’t be smarter than humans but agree that we cannot perfectly control AI in the same way that we cannot perfectly control humans, then you should be willing to pay as much money towards aligning AI as society spends on aligning humans, e.g. terror defense, prisons, police, and the justice system.
According to Investopedia, the US alone spends $175 billion on counterterrorism and $118 billion on police per year.
This paper from 2004 estimates that 70 rich nations spent more than $360 billion combined on the justice system in 1997.
Thus, if we adjust for inflation and missing countries, we will likely get a lower bound of at least $1 trillion spent per year on aligning humans. What we currently spend on AI safety is many orders of magnitude away from this.
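To make the back-of-the-envelope sum easier to check, here is a rough sketch of the arithmetic. The three constants are the figures quoted above. Everything else is an assumption for illustration: the function names are hypothetical, the inflation factors are placeholder values rather than measured inflation, and the sum naively treats the three figures as non-overlapping.

```python
# Figures quoted in the post, in billions of US dollars.
US_COUNTERTERRORISM = 175        # per year, as cited from Investopedia
US_POLICE = 118                  # per year, as cited from Investopedia
JUSTICE_70_NATIONS_1997 = 360    # 1997 dollars, as cited from the 2004 paper


def human_alignment_spend(inflation_factor, missing_countries=0.0):
    """Naive yearly total in billions of USD.

    inflation_factor: ASSUMED multiplier from 1997 dollars to today's dollars.
    missing_countries: ASSUMED spending by countries not covered above.
    The sum treats the three quoted figures as non-overlapping, which is
    itself an assumption.
    """
    justice_today = JUSTICE_70_NATIONS_1997 * inflation_factor
    return US_COUNTERTERRORISM + US_POLICE + justice_today + missing_countries


def required_missing_countries(inflation_factor, target=1000.0):
    """Spending by uncovered countries needed to reach `target` billions."""
    return max(0.0, target - human_alignment_spend(inflation_factor))


if __name__ == "__main__":
    for factor in (1.0, 1.5, 2.0):  # placeholder values, not measured inflation
        total = human_alignment_spend(factor)
        gap = required_missing_countries(factor)
        print(f"factor={factor}: total={total:.0f}B, still needed={gap:.0f}B")
```

Plugging in your own inflation factor shows how much of the 1 trillion has to come from the countries and categories that the quoted figures leave out.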
Do you think this argument makes sense? Feedback and further suggestions are welcome. Your argument can also address different concerns that people typically have about AI safety.
One thing I could imagine happening in these situations is that people close themselves off to object level arguments to a degree, and maybe for (somewhat) good reason.
I remember once when I was younger talking to a Christian fanatic of sorts, who kept coming up with new arguments for why the Bible must obviously be true due to the many correct predictions it has apparently made, plus some argument about irreducible complexity. In the moment, I couldn't really tell if/where/why his arguments failed. I found them somewhat hard to follow and just knew the conclusion would be something that is both weird and highly unlikely (for reasons other than his concrete arguments). So my impression then was "there surely is something wrong about his claims, but in this very moment I'm lacking the means to identify the weaknesses".
I sometimes find myself in similar situations when some person tries to get me to sign something or to buy some product they're offering. They tend to make very convincing arguments about why I should definitely do it. I often have no good arguments against that. Still, I tend to resist in many of these situations because I haven't yet heard or had a chance to find the best counterarguments.
When somebody who has thought a lot about AI safety and is very convinced of its importance talks to people to whom this whole area is new and strange, I can imagine similar defenses being present. If this is true, more/better/different arguments may not necessarily be helpful to begin with. Some things that could help: