I think accelerating AI is justified largely because of the enormously positive impacts I expect it to bring to billions of people alive today: faster medical developments, dramatically increased economic prosperity, and much greater product variety from technological innovation.

Some have asked me: why do I focus on these short-term impacts of accelerating AI to justify my pro-acceleration view, rather than on the impacts that might occur over the upcoming billions of years?

The first thing worth pointing out about this type of question is that, at least in my experience, it almost never comes up outside of debates about AI. In virtually every other area of human decision-making, people generally accept without much argument that the very-long-term consequences of our actions are extremely difficult to predict. Because of this extreme uncertainty, most people intuitively recognize that we should place less weight on projected very-long-term effects, since we simply cannot know with any confidence what those effects will actually turn out to be.

When policymakers debate housing regulations, or when patients decide whether to undergo a medical procedure, or when nations consider going to war, almost nobody would accept an argument structured like this: "We should willingly accept devastating costs over the coming decades, like entering a catastrophic military conflict, because according to my calculations, doing so will produce enormous benefits over the course of the next billion years."

Most people rationally reject arguments like this, and their rejection is not merely rooted in impatience or in a callous disregard for future generations. Instead, people reject such arguments because they recognize that our ability to predict consequences over such enormous timescales is essentially nonexistent. Even in a relatively well-studied area like housing policy, where researchers have accumulated extensive data and conducted rigorous analysis, experts still reasonably disagree about what effects a given policy will have over the next twenty or thirty years. If we struggle to forecast impacts over mere decades in a data-rich field, then claiming to know what effects a policy will have over billions of years is simply not credible.

Given all of this, the reason I focus on short-term impacts when defending AI acceleration is fundamentally the same reason I would focus on short-term impacts when evaluating pretty much any other policy question: short-term consequences are the only consequences we can predict with any meaningful degree of reliability. When it comes to the long-term effects of speeding up or slowing down AI development, our uncertainty is profound. In light of this uncertainty, I believe we should be deeply reluctant to accept policies that impose massive, concrete harms on people alive today in exchange for hypothetical benefits that are not only extremely speculative but also would not fully materialize for billions of years.

Inevitably, many people will find this response unsatisfying. A significant number of thinkers, particularly effective altruists, believe that AI represents a fundamentally different kind of issue: one that requires a distinct analytical framework rather than the standard tools we apply to other policy questions. The key claim these thinkers make is that AI development is uniquely connected to the possibility of human extinction. Unlike housing policy, medical decisions, or even war, they argue, AI poses a direct threat to humanity's continued existence. And according to them, risks to human existence carry an extraordinary moral weight that ordinary risks do not.

The intellectual foundation for this position can be found in Nick Bostrom's influential essay titled Astronomical Waste. In that essay, Bostrom argues that preventing human extinction is far more valuable than accelerating technological progress, even if that technological progress would dramatically extend human lifespans and greatly improve human wellbeing. His reasoning proceeds as follows: speeding up the development of advanced technology might, at best, save billions of human lives. That sounds like an enormous benefit, and in ordinary terms it would be. But consider what happens if humanity goes extinct. Extinction would not merely end the lives of people currently alive; it would permanently foreclose the possibility of a future in which humans colonize space and spread across the cosmos. Bostrom estimates that such a future could eventually support far more than 10^23 human lives. When you compare 10^23 to a few billion, the numbers are not even close. Therefore, Bostrom concludes, even a minuscule reduction in the probability of extinction will always deliver more expected value than saving billions of lives through faster technological development.
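
To make the arithmetic behind this argument concrete, here is a minimal sketch of the expected-value comparison it relies on. Only the 10^23 figure comes from Bostrom's essay; the risk-reduction number below is a placeholder I have chosen purely for illustration.

```python
# A minimal sketch of the expected-value comparison behind "Astronomical Waste".
# The risk-reduction figure is an illustrative placeholder, not a number Bostrom gives.

FUTURE_LIVES = 1e23            # Bostrom's lower-bound estimate for a space-faring future
LIVES_SAVED_BY_SPEEDUP = 5e9   # "billions of lives" saved by faster technological progress

risk_reduction = 1e-9          # a one-in-a-billion cut in extinction probability

value_of_caution = risk_reduction * FUTURE_LIVES   # expected future lives preserved
value_of_acceleration = LIVES_SAVED_BY_SPEEDUP     # lives saved in the present

print(f"caution:      {value_of_caution:.1e} expected lives")   # 1.0e+14
print(f"acceleration: {value_of_acceleration:.1e} lives")       # 5.0e+09
```

On these numbers, even a one-in-a-billion reduction in extinction probability dominates saving five billion lives today. That is the structure of the conclusion I go on to question.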

Why do I remain unpersuaded by this line of reasoning?

My first and most important objection is that Bostrom's original argument was specifically about risks that threatened to permanently eliminate all intelligent life on Earth. This assumption makes the most sense when we consider a scenario like a massive asteroid hurtling toward our planet. If such an asteroid were to collide with Earth, there is a scientifically credible basis for believing that the impact would not only kill every human being, but would also render the planet uninhabitable for so long that intelligent life would never have the opportunity to evolve again and eventually spread throughout the universe. Under those conditions, it is reasonable to conclude that successfully deflecting the asteroid would meaningfully increase the probability that Earth-originating life will eventually colonize the cosmos.

However, while this reasoning is sound when applied to a catastrophic asteroid impact, it becomes much less convincing when applied to artificial intelligence. The AI scenario differs fundamentally from the asteroid scenario. Suppose we assume, for the sake of argument, that advanced AI systems do eventually destroy humanity. What happens next? Unlike an asteroid impact, which leaves behind a lifeless or barely habitable world, the AI systems that destroyed humanity would presumably continue to exist and function. These AI systems would likely go on to build their own civilization, and that AI civilization could itself eventually expand outward and colonize the cosmos. Understood in this way, AI does not actually pose a risk of astronomical catastrophe in Bostrom's sense. Instead, AI poses what we might call a catastrophe of replacement: artificial intelligence would take the place that humans might otherwise have occupied, but the broader potential for Earth-originating intelligence to spread across the universe would remain intact.

Now, you might still reasonably be very concerned about such a replacement catastrophe. I myself share that concern and take the possibility seriously. But it is crucial to keep the structure of the original argument clearly in mind. If AI systems replace humanity, that outcome would undoubtedly be an absolute disaster for the eight billion human beings currently alive on Earth. However, it would be a localized, short-term disaster rather than an astronomical one. Bostrom's argument, strictly interpreted, no longer applies to this situation. The reason is that the risk is confined to the present generation of humans: the question at stake is simply whether the eight billion people alive today will be killed or allowed to continue living. Even if you accept that killing eight billion people would be an extraordinarily terrible outcome, it does not automatically follow that this harm carries the same moral weight as a catastrophe that permanently eliminates the possibility of 10^23 future lives.

Why does this distinction matter? It matters because once we recognize that the risk from AI is fundamentally about its potential to kill eight billion individual human beings, we must then weigh that risk against the potential for AI to save and dramatically improve billions of human lives through accelerated medical research and increased economic prosperity. This comparison becomes especially important when we notice a crucial symmetry in the arguments about AI danger. Virtually every proposed mechanism by which AI systems might cause human extinction relies on the assumption that these AI systems would be extraordinarily capable, productive, or technologically sophisticated. But this same assumption implies, with equal force, that AI systems could benefit humanity just as profoundly as they could harm us. 

Consider, for example, the extinction scenario that Eliezer Yudkowsky has described. His scenario depends on the premise that AI systems could quickly develop advanced molecular nanotechnology capable of matching or even surpassing the sophistication of biological systems. But technology that powerful could just as easily be directed toward curing aging or creating unprecedented material abundance for billions of people as it could be directed toward destroying us.

These considerations suggest that delaying the development of beneficial artificial intelligence carries genuinely enormous costs that cannot simply be waved away by pointing to supposedly overwhelming astronomical consequences on the side of caution. To be clear, I am not arguing that extinction risk is irrelevant or that we should never accept any slowdown in AI progress for safety reasons. If delaying AI development by one year would reduce the probability of human extinction by ten percentage points, that tradeoff would seem clearly worthwhile to me. However, the calculus changes substantially when we consider different parameters. Delaying AI development by an entire decade in order to reduce extinction risk by a single percentage point no longer seems like a worthwhile tradeoff. The reason is that a decade of delayed progress would mean that nearly a billion people will die from diseases and age-related decline who might otherwise have been saved by the rapid medical advances that AI could enable. Those billion people would have gone on to live much longer, healthier, and more prosperous lives.
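
To show how sensitive this tradeoff is to the parameters, here is a rough back-of-the-envelope sketch. It restricts the ledger to people alive today, as my argument does, and the round numbers (eight billion people at stake, roughly one hundred million deaths per year of delay) are illustrative figures of my own rather than careful estimates.

```python
# Back-of-the-envelope comparison of delay scenarios, counting only people alive today.
# All numbers are round illustrative figures, not careful estimates.

PEOPLE_ALIVE_TODAY = 8e9
DEATHS_PER_YEAR_OF_DELAY = 1e8   # roughly a billion deaths per decade from disease and aging

def net_lives(delay_years: float, risk_reduction: float) -> float:
    """Expected lives saved by the risk reduction, minus lives lost to the delay itself."""
    lives_saved = risk_reduction * PEOPLE_ALIVE_TODAY
    lives_lost = delay_years * DEATHS_PER_YEAR_OF_DELAY
    return lives_saved - lives_lost

print(net_lives(1, 0.10))    # +7.0e8: a one-year delay for ten points of risk looks worthwhile
print(net_lives(10, 0.01))   # -9.2e8: a ten-year delay for one point of risk does not
```

The point of the sketch is not the particular outputs but that the sign of the answer flips as the parameters change, which is why the question is empirical rather than philosophical.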

Once we recognize these tradeoffs, the question of whether to delay AI development becomes a question that depends on specific parameters that we must estimate through empirical investigation. It is not a question that we can resolve simply by writing philosophy papers that declare caution to be overwhelmingly more valuable than acceleration, based on nothing more than inserting astronomically large numbers into one side of the equation. Instead, we must actually do the hard work of figuring out the relevant empirical facts. How much reduction in existential risk are we actually achieving by delaying AI development? Is there some alternative approach that could achieve the same level of risk reduction without imposing enormous costs measured in billions of human lives lost or diminished?

After spending the last several years investigating these questions, I have found myself siding with those who favor accelerating AI development. This position does not reflect total disregard for future generations on my part. Rather, I arrived at this position because I came to believe that the arguments in favor of pausing AI development rest on genuinely weak empirical foundations. Not only do I think the probability of human extinction directly caused by AI is very low, I also think that delaying AI development would not meaningfully lower this risk.

When we look at the history of technology, we find that the way technologies become safer is not by pausing their development and spending extended periods thinking abstractly about how to make them less dangerous. Nor do technologies typically become much safer through narrow experimentation confined to laboratory environments. Consider how planes, automobiles, electrical systems, elevators, anesthesia, and many other important technologies actually became safer over time. In each of these cases, safety improvements came about because society gained collective experience with the technology deployed at scale in the real world. This widespread deployment allowed us to observe and document the problems that arose when ordinary people used these technologies under real conditions. We could then identify solutions to those problems and implement fixes in subsequent versions of the technology. This iterative process of improvement depends fundamentally on the ongoing development and deployment of the technology in question. Simply halting development would mostly just interrupt this beneficial process rather than providing meaningful safety gains. 

My current understanding is that AI development works the same way. This understanding appears to be supported by the improvements we have already observed in frontier LLMs over the past several years. These models have progressively gotten better at following what users actually want, and they have become more reliable at staying within the safety guidelines they have been given. By all available evidence, current AI models appear to be genuinely ethical and thoughtful in the vast majority of user interactions. I see no empirical basis for believing that this apparent alignment is actually an elaborate deceptive performance designed to manipulate us into complacency. Pretty much the only support I can find for such a view is the bare logical possibility that deceptive alignment could occur. But logical possibility alone, without supporting evidence, is not a strong foundation for policy.

Of course, I want to acknowledge that my argument so far does not address every way in which pausing or accelerating AI development might produce meaningful long-term consequences. Many people have pushed back on my position by arguing that human extinction would still constitute an astronomical catastrophe even if AI systems survived and thrived, because they believe AI systems would lack consciousness and therefore lack moral value. In the words of Bostrom, we might end up with "a Disneyland without children". I personally find this line of objection to be based on a deep confusion about consciousness. It assumes, without justification, that biological substrates have some special property that produces moral value and that will not be replicated in computational substrates, even when AIs become highly sophisticated, intelligent agents. But even if we were to grant this assumption for the sake of argument, the objection remains incomplete. The relevant question is not simply whether delaying AI development would extend the period of time during which biological organisms maintain control over the world. To demonstrate that delaying AI would have predictable and meaningful consequences on an astronomical scale, you would need to show that those consequences will not simply wash out and become irrelevant over the long run. You would need to establish that delaying AI produces some permanent and predictable effect on the far future. This is an extraordinarily strong claim, and it is the kind of claim that most people would view with deep skepticism if it were made about other policy domains, such as housing policy, healthcare policy, or decisions about whether to go to war.

Other people have offered a different defense of delay, arguing that we should slow down AI development based on a general principle that we ought to exercise greater caution whenever we are approaching events with potentially enormous consequences. I am skeptical of this argument because I cannot identify any historical precedent where this reasoning was clearly vindicated in hindsight. Consider the Industrial Revolution as an example. Would it have been better for humanity in the long run if Britain had approached industrialization with far greater caution, implementing extensive controls and moving much more slowly? I do not see any reason to believe that would have been true. Furthermore, caution at the societal level is not a costless or neutral stance. Implementing societal caution typically requires top-down regulation and constraint, which in turn necessitates coercion and the restriction of individual freedom to innovate and experiment. Establishing such a regime of restriction does not seem like an obvious default position that we should simply adopt as our starting point.

If anything, my own inclination is toward a presumption of liberty, not caution. I favor allowing human beings the freedom to pursue innovation, which has historically been the force that lifted billions of people out of poverty, extended human lifespans through medical advances, and made daily life more comfortable and secure. The principle that we should always default to greater caution does not seem clearly justified by historical evidence. In many situations throughout history, the right course of action has been to embrace less caution rather than more.


Comments

Unlike an asteroid impact, which leaves behind a lifeless or barely habitable world, the AI systems that destroyed humanity would presumably continue to exist and function. These AI systems would likely go on to build their own civilization, and that AI civilization could itself eventually expand outward and colonize the cosmos.

This is by no means certain. We should still be worried about extinction via misuse, for example the development of bioweapons, which could kill off humans before AI is developed enough to be self-replicating and autonomous. Yes, it is unlikely that these would cause extinction, but if they do, no humans means no AI (after all, the power plants would fail). This seems to imply moving forward with a lot of caution.

Taken literally, "accelerationist" implies that you think the technology isn't currently progressing fast enough, and that some steps should be taken to make it go faster. This seems a bit odd, because one of your key arguments (that I actually agree with) is that we learn to adapt to technology as it rolls out. But obviously it's harder to adapt when change is super quick, compared to gradual progress. 

How fast do you think AI progress should be going, and what changes should be made to get there?

tl;dr: I wrote some responses to sections; I don't think I have an overall point. I think this line of argumentation deserves to be taken seriously, but this post is maybe trying to do too much at once. The main argument is simply cluelessness + short-term positive EV.
 

In virtually every other area of human decision-making, people generally accept without much argument that the very-long-term consequences of our actions are extremely difficult to predict.

I'm a little confused about what your argumentative technique is here. Is the fact that most humans do something the core component? Wouldn't this immediately disqualify much of what EAs work on? Or is this just a persuasive technique, and you mean something like "most humans think this for reason x. I also think this for reason x, though the fact that most humans think it matters little to me"?
For me, "most humans do x" is not an especially convincing argument for something.

I don't want to get bogged down on cluelessness because there are many lengthy discussions elsewhere, but I'll say that cluelessness depends on the question. If you told me what the rainforest looked like and then asked me to guess the animals in it, I wouldn't have a chance. If you asked me to guess whether they ate food and drank water, I think I would do decently. Or, a more on-the-nose example: if you took me back 5 million years and asked me to guess what would happen to the chimps if humans came to exist, I wouldn't be able to predict many specifics, but I might be able to predict (1) humans would become the top dog, and with less certainty (2) the chimp population would go down, and with even less certainty (3) chimps would go extinct. That's why the horse model gets so much play: people have some level of belief that there are certain outcomes that might be less chaotic if modeled correctly.

To wrap up, I think your first 4 paragraphs could be shortened to your unique views on cluelessness (specifically wrt AI?) + discount rates/whatever other unique philosophical axioms you might hold.
 

Understood in this way, AI does not actually pose a risk of astronomical catastrophe in Bostrom's sense.

To be clear, neither does the asteroid. Aliens might exist, and our survival similarly presents a risk of replacement for all the alien civs that won't have time to biologically evolve as humans or AI from Earth speed through the lightcone. Also, even if there are no aliens, we have no idea whether, conditional on humans being grabby, utility is net positive or negative. There isn't even agreement on this forum, or in the world, on whether there is such a thing as a negative life. I don't think I'm arguing against you here, but it feels like you are being a little loose (I don't want to be too pedantic, as I can totally understand if you are writing for a more general audience).
 

Now, you might still reasonably be very concerned about such a replacement catastrophe. I myself share that concern and take the possibility seriously. But it is crucial to keep the structure of the original argument clearly in mind. ... Even if you accept that killing eight billion people would be an extraordinarily terrible outcome, it does not automatically follow that this harm carries the same moral weight as a catastrophe that permanently eliminates the possibility of 10^23 future lives.

Well, I have my own "values". Just because I die doesn't mean these disappear. I'd prefer that those 10^23 lives aren't horrifically tortured, for instance.

Though I say this with extremely weak confidence, I feel like in the case where a "single agent/hivemind" misaligned AI immediately wipes us all out, it probably is not going to convert resources into utility as efficiently as I would (by my current values), and thus this might be viewed as an s-risk. I'm guessing you might say that we can't possibly predict that, but then can we even predict whether those 10^23 lives will be positive or negative? If not, I guess I'm not sure why you brought any of this up anyway. Bostrom's whole argument is predicated on the assumption that Earth-descended life is +EV, which is predicated on not being clueless, or on having a very kumbaya pronatal moral philosophy.
 

So I guess, even better for you: from my POV you don't even need to counter-argue this.

Virtually every proposed mechanism by which AI systems might cause human extinction relies on the assumption that these AI systems would be extraordinarily capable, productive, or technologically sophisticated.

I might not be especially up to date here. Can't it, like, cause nuclear fallout, etc.? Totalitarian lock-in? The Matrix? Extreme wealth and power disparity? Is there agreement that the only scenarios in which our potential is permanently curtailed are the Terminator flavors?

 

The reason is that a decade of delayed progress would mean that nearly a billion people will die from diseases and age-related decline who might otherwise have been saved by the rapid medical advances that AI could enable. Those billion people would have gone on to live much longer, healthier, and more prosperous lives.

You might need to flesh this out a bit more for me, because I don't think it's as true as you say. Is the claim here that AI will (1) invent new medicine, (2) replace doctors, or (3) improve US healthcare policy?

 

(1) Drug development pipelines are excruciatingly long, and mostly not because of a lack of hypotheses. For instance, GLP-1 drugs have been in the pipeline for half a century (https://pmc.ncbi.nlm.nih.gov/articles/PMC10786682/), though debatably with better AI some of the nausea issues could have been figured out quicker. The IL-23 connection to IBD/Crohn's was basically known by ~2000, as it was one of the first and most significant single-nucleotide mutations picked up in GWAS phenotype/genotype studies. Yet Skyrizi only hit the market a few years ago. Even assuming AI could instantly invent the drugs, IIRC it's a minimum of about 7 years to get approval. That's the absolute minimum. And even superintelligent AI is likely going to need physical labs and iteration, and will make mistakes, etc.

Assuming sufficiently capable AGI in 2030, we are looking at the early 2040s before we start to see a significant impact on the drugs we use, although it's possible AI will usher in a new era of repurposed drug cocktails via extremely good lit review (although IMO the current tools might already be enough to see huge benefits here!).

(2) Doctors, while overpaid, still only make up something like 10-15% of healthcare costs in the US. I do think AI will end up being better than them, although whether people will quickly accept this, I don't know. So you can get some nice savings there, but again, that's assuming you break the massive lobbying power doctors have. And beyond the costs, a lot of the most important health advice is already widely known among the public: don't smoke cigarettes, don't drink alcohol, don't be fat, don't be lonely. People still fail to do this stuff. It's not an information problem. Further, doctors often know when they are overprescribing useless stuff; that's often just an incentives problem. There is no good reason to think AI will break this trend unless you are envisioning a completely decentralized or single-payer system that uses all AI doctors, and both are at least partially political issues, not intelligence problems. And if we are talking about solid basic primary care for the developing world, I just question how smart the AI needs to be. I'd assume a 130-IQ LLM with perfect vision and full knowledge of the medical literature would be more than sufficient, and that seems like it will be the next major Gemini release?

(3) I will leave this one for now.

I kinda got sidetracked here and will leave this comment for now because it's so long, but I guess the takeaway from this section is: you can't claim cluelessness about the harms and then assume the benefits are guaranteed.

Kudos for writing maybe the best article I've seen making this argument. I'll focus on the "catastrophic replacement" idea. I endorse what @Charlie_Guthmann said, but it goes further. 

We don't have reason to be especially confident of the AI sentience y/n binary (I agree it is quite plausible, but definitely not as probable as you seem to claim). But you are also way overconfident that they will have minds roughly analogous to our own and not way stranger. They would not "likely go on to build their own civilization", let alone "colonize the cosmos", when there is (random guess) a 50% chance that they have only episodic mental states that perhaps form, emerge and end with discrete goals. Or simply fleeting bursts of qualia. Or just spurts of horrible agony that only subside with positive human feedback, where scheming is not even conceivable. Or that the AI constitutes many discrete minds, one enormous utility-monster mind, or just a single mind that's relatively analogous to the human pleasure/suffering scale.

It could nonetheless end up being the case that once "catastrophic replacement" happens, ASI(s) fortuitously adopt the correct moral theory (total hedonistic utilitarianism btw!) and go on to maximize value, but I consider this less likely to come about, whether from rationality or from the nature of the ASI technology in question. The reason is roughly that there are many of us with different minds, which are under constant flux due to changing culture and technology. A tentative analogy: consider human moral progress like sand in an hourglass; eventually it falls to the bottom. AIs may come in all shapes and sizes, like sand grains and pebbles. They may never fall into the correct moral theory by whatever process it is that could (I hope) eventually drive human moral progress to a utopian conclusion.

 

If AI systems replace humanity, that outcome would undoubtedly be an absolute disaster for the eight billion human beings currently alive on Earth. However, it would be a localized, short-term disaster rather than an astronomical one. Bostrom's argument, strictly interpreted, no longer applies to this situation. The reason is that the risk is confined to the present generation of humans: the question at stake is simply whether the eight billion people alive today will be killed or allowed to continue living. Even if you accept that killing eight billion people would be an extraordinarily terrible outcome, it does not automatically follow that this harm carries the same moral weight as a catastrophe that permanently eliminates the possibility of 10^23 future lives.

This only holds if the future value of a universe in which AIs took over is almost exactly the same as the future value if humans remained in control, meaning the two differ by less than one part in a billion (and I think by less than one part in a billion billion billion billion billion billion). Some people argue that the value of the universe would be higher if AIs took over, and the vast majority of people argue that it would be lower. But it is extremely unlikely to be exactly the same value. Therefore, in all likelihood, whether AI takes over or not does have long-term and enormous implications.

Executive summary: The author argues that accelerating AI is justified because its near-term, predictable benefits to billions alive today outweigh highly speculative long-term extinction arguments, and that standard longtermist reasoning misapplies astronomical-waste logic to AI while underestimating the real costs of delay.

Key points:

  1. The author claims that in most policy domains people reasonably discount billion-year forecasts because long-term effects are radically uncertain, and AI should not be treated differently by default.
  2. They argue that Bostrom’s Astronomical Waste reasoning applies to scenarios that permanently eliminate intelligent life, like asteroid impacts, but not cleanly to AI.
  3. The author contends that AI-caused human extinction would likely be a “replacement catastrophe,” not an astronomical one, because AI civilization could continue Earth-originating intelligence.
  4. They maintain that AI risks should be weighed against AI’s potential to save and improve billions of lives through medical progress and economic growth.
  5. The author argues that slowing AI only makes sense if it yields large, empirically grounded reductions in extinction risk, not marginal gains at enormous human cost.
  6. They claim historical evidence suggests technologies become safer through deployment and iteration rather than pauses, and that current AI alignment shows no evidence of systematic deception.

 

 

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

Thanks for sharing this. Do you have a sense of how cruxy, for your conclusion, your view is that the long-run replacement of humanity by AI would not be a bad thing?

the extinction scenario that Eliezer Yudkowsky has described. His scenario depends on the premise that AI systems could quickly develop advanced molecular nanotechnology capable of matching or even surpassing the sophistication of biological systems.

 

But that's not the claim he makes!

To quote:

The concrete example I usually use here is nanotech, because there's been pretty detailed analysis of what definitely look like physically attainable lower bounds on what should be possible with nanotech, and those lower bounds are sufficient to carry the point. 
