Racing through a minefield: the AI deployment problem

Comments 1

I wonder whether leading groups could hold equity stakes in each other as a way of aligning incentives. An imperfect analogy is the auto industry, where Toyota, Subaru, and others hold equity in each other and share best practices in safety and hybrid tech. https://www.reuters.com/article/us-toyota-subaru/toyota-strengthens-japan-partnerships-with-bigger-subaru-stake-idUSKBN1WC04E

Generally, or at least, this is what I’d like it to refer to. ↩
Thanks to beta reader Ted Sanders for suggesting this analogy in place of the older one, “removing mines from the minefield.” ↩
One genre of testing that might be interesting: manipulating an AI system’s “digital brain” in order to simulate circumstances in which it has an opportunity to take over the world, and seeing whether it does so. This could be a way of dealing with the King Lear problem. More here. ↩
Modern AI systems tend to be trained with lots of trial-and-error. The actual code that is used to train them might be fairly simple and not very valuable on its own; but an expensive training process then generates a set of “weights” which are ~all one needs to make a fully functioning, relatively cheap copy of the AI system. ↩
I mean, this is part of the challenge. In theory, you should deploy an AI system if the risks of not doing so are greater than the risks of doing so. That’s going to depend on hard-to-assess information about how safe your system is and how dangerous and imminent others’ are, and it’s going to be easy to be biased in favor of “My systems are safer than others’; I should go for it.” Seems hard. ↩

The basic premises of “racing through a minefield”