My conception of an "AI x-risk warning shot" is an event that signals the potential for impending AI x-risks in a widely broadcast manner, but is not itself an unfolding x-risk.

  • If such a warning shot occurs, is it appropriate to infer the responses of governments from their responses to other potential x-risk warning shots, such as COVID-19 for weaponized pandemics and Hiroshima for nuclear winter?
  • To the extent that x-risks from pandemics have lessened since COVID-19 (if at all), what does this suggest about the risk mitigation funding we should expect following AI x-risk warning shots?
  • Do x-risk warning shots like Hiroshima trigger strategic deterrence programs and empower small actors with disproportionate destructive capabilities by default?

As with many other questions around AI, I think a lot of the answer here depends on other world-states/variables around AI and geopolitics, including expected/actual takeoff speeds, takeoff heights (i.e., how intelligent an AGI can get before hitting limits in the short term), the tractability of treaty verification (e.g., does developing a seed AI/AGI require hard-to-hide amounts of compute), the tractability of alignment, the existence of other systems/technologies (e.g., narrow AI in cybersecurity), and more. But caching that thought for now…

A related, crucial issue is what the "AI warning shot" looks like: suppose it's some powerful narrow AI that is very effective at hacking, escapes, and wreaks havoc on certain types of global systems until it is somehow contained a few months later. I suspect a threat like this would be taken very seriously. More generally, any situation where someone could do a postmortem of a near-catastrophe and say "Oh sh!t, if we had done this slightly differently/not had some backup failsafe, we probably would all have been killed" would likely be enough to trigger serious concern and action. However, at that point it might be too late, depending on the world-states/variables mentioned above (e.g., whether small actors could replicate the actions to purposely destroy everything).

Another issue is that I think there have been relatively few "x-risk warning shots" thus far, if any. I'm unclear on why Hiroshima would qualify (something like the Cuban Missile Crisis seems more plausible as a "warning shot," given that it came at a time when so many nuclear weapons existed). Similarly, I don't think COVID rises to that level, given its relatively low lethality, which I suspect played a major role in the apathetic response/containment compliance (i.e., a more lethal pathogen probably would have been met with much stronger measures).

So in summary: 1) I think it depends a lot on other variables we are (or at least I am) uncertain about, and 2) I don't think we have many good analogues of "warning shots" in the past, COVID and Hiroshima included (though plausibly the Cuban Missile Crisis?).