Introduction
When a system is made safer, its users may be willing to offset at least some of the safety improvement by using it more dangerously. A seminal example is that, according to Peltzman (1975), drivers largely compensated for improvements in car safety at the time by driving more dangerously. The phenomenon in general is therefore sometimes known as the “Peltzman Effect”, though it is more often known as “risk compensation”.[1] One domain in which risk compensation has been studied relatively carefully is NASCAR (Sobel and Nesbit, 2007; Pope and Tollison, 2010), where, apparently, the evidence for a large compensation effect is especially strong.[2]
In principle, more dangerous usage can partially, fully, or more than fully offset the extent to which the system has been made safer holding usage fixed. Making a system safer thus has an ambiguous effect on the probability of an accident, after its users change their behavior.
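To make the three cases concrete, here is a minimal sketch in Python. It assumes a toy model that is not taken from any of the cited papers: the accident probability is simply usage riskiness divided by system safety, capped at 1. The function names and the functional form are hypothetical and chosen only for illustration.

```python
import math


def accident_probability(riskiness: float, safety: float) -> float:
    """Hypothetical toy model (an assumption, not from the post): accident
    risk rises with usage riskiness and falls with safety, capped at 1."""
    if riskiness < 0 or safety <= 0:
        raise ValueError("riskiness must be >= 0 and safety must be > 0")
    return min(1.0, riskiness / safety)


def classify_offset(p_before: float, p_fixed_usage: float, p_after: float) -> str:
    """Compare the accident probability after users change their behavior
    with the original level and with the level under unchanged usage."""
    if p_after < p_fixed_usage or math.isclose(p_after, p_fixed_usage):
        return "no offset"
    if math.isclose(p_after, p_before):
        return "full offset"
    if p_after < p_before:
        return "partial offset"
    return "more than full offset"


def offset_after_safety_gain(
    riskiness_before: float,
    riskiness_after: float,
    safety_before: float,
    safety_after: float,
) -> str:
    """Classify how far riskier usage offsets a safety improvement."""
    p_before = accident_probability(riskiness_before, safety_before)
    # Probability if the system were made safer but usage stayed the same.
    p_fixed_usage = accident_probability(riskiness_before, safety_after)
    # Probability once users have adjusted how riskily they use the system.
    p_after = accident_probability(riskiness_after, safety_after)
    return classify_offset(p_before, p_fixed_usage, p_after)
```

Under these assumptions, if safety doubles from 10 to 20 and usage riskiness starts at 1.0, then riskiness of 1.5 afterward is a partial offset, 2.0 is a full offset, and 3.0 is a more than full offset. The sign of the net effect depends entirely on how strongly users respond, which is the ambiguity described above.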
There’s no reason why risk compensation shouldn’t apply in the existential risk domain, and we arguably have examples in which it has. For example, reinforcement learning from human feedback (RLHF) makes AI more reliable, all else equal; so it may be making some AI labs comfortable releasing more capable, and so maybe more dangerous, models than they would release otherwise.[3]
Yet risk compensation per se appears to have gotten relatively little formal, public attention in the existential risk community so far. There has been informal discussion of the issue: e.g. risk compensation in the AI risk domain is discussed by Guest et al. (2023), who call it “the dangerous valley problem”. There is also a cluster of papers and works in progress by Robert Trager, Allan Dafoe, Nick Emery-Xu, Mckay Jensen, and others, including these two and some not yet public but largely summarized here, exploring the issue formally in models with multiple competing firms. In a sense what they do goes well beyond this post, but as far as I’m aware none of t
I liked the talk. I also loved the boots! Great job.
In terms of feedback/reaction: I work on AI alignment, game theory, and cooperative AI, so Moloch is basically my key concern. And from that position, I highly approve of the overall talk, and of all of the content in particular --- except for one point, where I felt a bit so-so. And that is the part about what the company leaders can do to help the situation.
The key thing is 9:58-10:09 ("We need leaders who are willing to flip the Moloch's playbook. ..."), but I think this part then changes how people interpret 10:59-10:11 ("Perhaps companies can start competing over who ..."). I don't mean to say that I strongly disagree here --- rather, I mean that this part seems objectively speculative, which was in contrast with everything else in the talk (which seemed super solid).
More specifically, the talk's formulation suggested to me that the key thing is whether the leaders would be willing to not play the Moloch game. In contrast, it seems quite possible that this by itself wouldn't help at all, for example because they would just get fired if they tried. My personal guess is that "the key thing" is the affordance the leaders have for not playing the Moloch game / the costs they incur for refusing to play it. Or perhaps the combination of this and the willingness to not play the Moloch game. And this is also how I would frame the 10:59-10:11 part --- that we should try to make it such that the companies can compete on those other things that turn this into a race to the top. (As opposed to "the companies should compete on those other things".)
Maybe a link is missing or the embed function isn't working on my phone? I'm not seeing anything.
(Also, do you have a transcript you could post?)
YouTube link here: https://www.youtube.com/watch?v=WX_vN1QYgmE (it's embedded in the post, as JohnSnow points out — not sure if something is breaking for you?)
Transcript here: https://www.ted.com/talks/liv_boeree_the_dark_side_of_competition_in_ai/transcript
Thanks!
Executive summary: Competition can drive innovation but also create traps that lead to lose-lose outcomes. This dynamic is happening in AI and needs wise leadership to avoid catastrophe.
This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.
The TED talk is embedded on PC