Introduction
When a system is made safer, its users may be willing to offset at least some of the safety improvement by using it more dangerously. A seminal example is that, according to Peltzman (1975), drivers largely compensated for improvements in car safety at the time by driving more dangerously. The phenomenon in general is therefore sometimes known as the “Peltzman Effect”, though it is more often known as “risk compensation”.[1] One domain in which risk compensation has been studied relatively carefully is NASCAR (Sobel and Nesbit, 2007; Pope and Tollison, 2010), where, apparently, the evidence for a large compensation effect is especially strong.[2]
In principle, more dangerous usage can partially, fully, or more than fully offset the extent to which the system has been made safer, holding usage fixed. Making a system safer thus has an ambiguous effect on the probability of an accident once its users have adjusted their behavior.
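To make the offsetting logic concrete, here is a minimal sketch (the notation is mine and purely illustrative, not the model developed later). Suppose the probability of an accident is the product of the system's inherent hazard $h$ (its danger holding behavior fixed) and the riskiness $b(h)$ of the behavior users choose, with $b'(h) < 0$ because users behave more dangerously as the system gets safer:

$$
p = h \, b(h), \qquad \frac{dp}{dh} = b(h)\,(1 + \varepsilon), \qquad \varepsilon \equiv \frac{h \, b'(h)}{b(h)}.
$$

A safety improvement (a fall in $h$) then lowers $p$ when the behavioral response is inelastic ($\varepsilon > -1$, partial offset), leaves $p$ unchanged when $\varepsilon = -1$ (full offset), and raises $p$ when the response is elastic ($\varepsilon < -1$, more than full offset).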
There’s no reason why risk compensation shouldn’t apply in the existential risk domain, and arguably we already have examples in which it has. For example, reinforcement learning from human feedback (RLHF) makes AI more reliable, all else equal; so it may be making some AI labs comfortable releasing more capable, and thus perhaps more dangerous, models than they would otherwise release.[3]
Yet risk compensation per se appears to have gotten relatively little formal, public attention in the existential risk community so far. There has been informal discussion of the issue: e.g. risk compensation in the AI risk domain is discussed by Guest et al. (2023), who call it “the dangerous valley problem”. There is also a cluster of papers and works in progress by Robert Trager, Allan Dafoe, Nick Emery-Xu, Mckay Jensen, and others, including these two and some not yet public but largely summarized here, exploring the issue formally in models with multiple competing firms. In a sense what they do goes well beyond this post, but as far as I’m aware none of t