Among effective altruists, it is sometimes claimed that delaying AI development for safety reasons is ethically justified based on straightforward utilitarian logic—particularly the idea that reducing existential risk has overwhelming moral value. However, I believe this claim is mistaken. While the primary argument for delaying AI may appear utilitarian on the surface, I think it actually depends on deeper ethical assumptions that are not strictly utilitarian in nature.
To be clear, I am not arguing that one cannot, in theory, construct a logically consistent utilitarian argument for delaying AI. One could, for instance, argue that the AIs we create won't be conscious in a way that has moral value, or that misaligned AI will lead to immense suffering—making it worthwhile to delay AI if doing so would genuinely mitigate these specific outcomes. My claim is not that such an argument would be logically incoherent. Rather, my claim is that the standard case for delaying AI—the argument most commonly made in effective altruist discussions—does not actually seem to rely on these premises. Instead, it appears to rest on an implicit speciesist assumption that prioritizes the survival of the human species itself, rather than purely impartial utilitarian concerns for maximizing well-being or preventing suffering.
In this post, I try to demonstrate this claim. First, I outline what I see as the "standard case" for delaying AI from an EA longtermist perspective. I then argue that, despite the common perception that this case follows straightforward utilitarian reasoning, it actually seems primarily driven by a preference for preserving the human species as a category—even when this conflicts with traditional utilitarian objectives like maximizing the well-being of all sentient beings, including future AI entities.
The "standard case" for delaying AI
While there is considerable debate on this topic, here I will outline what I perceive as the "standard case" for delaying AI development for safety reasons. This is based on numerous discussions I have had with EAs about this topic over the last few years, as well as my personal review of many articles and social media posts advocating for pausing or delaying AI.
However, I want to emphasize that this is not the only reasoning used to justify delaying AI—there is indeed significant variation in how different EAs approach this issue. In other words, I am not claiming that this argument fully captures the views of all, or even most, EAs who have thought about this subject. Nonetheless, I believe the following argument is still broadly representative of a common line of reasoning:
Step 1: The Astronomical Waste Argument
This step in the argument claims that reducing existential risk—even by a tiny amount—is overwhelmingly more valuable than accelerating technological progress. The reasoning is that an existential catastrophe would eliminate nearly all future value, whereas hastening technological advancement (e.g., space colonization, AGI, etc.) would only move technological maturity forward by a short period of time. Given this, even a minor reduction in existential risk is argued to be vastly more important than accelerating progress toward a utopian future.
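To make the structure of this step explicit, it can be sketched as a toy expected-value comparison. The symbols and numbers below are my own illustrative assumptions, not figures taken from the astronomical waste argument itself:

```latex
% Toy model of Step 1 (illustrative assumptions only).
%   V      : total value of the long-term future, if it is realized
%   \delta : reduction in the probability of existential catastrophe from a delay
%   c      : value forgone by reaching technological maturity slightly later
% The claim is that the expected gain from the delay dominates its cost:
\[
  \underbrace{\delta \, V}_{\text{expected value of the risk reduction}}
  \;\gg\;
  \underbrace{c}_{\text{cost of slightly later progress}}
\]
% Because V scales with the entire cosmic future while c reflects only a few
% years of forgone progress, even a tiny \delta suffices. For instance, with
% \delta = 10^{-6}, V = 10^{30}, and c = 10^{20} (arbitrary units):
\[
  \delta V = 10^{-6} \times 10^{30} = 10^{24} \;\gg\; 10^{20} = c .
\]
```

Note that this comparison goes through only if an existential catastrophe really would destroy nearly all of V, a premise I return to below.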
Step 2: AI is an existential risk that can be mitigated by delaying AGI
This step in the argument claims that slowing down AI development gives us more time to conduct safety research, which in turn reduces the risk that future AGIs will have values misaligned with human interests. By delaying AGI, we increase our ability to develop adequate regulatory safeguards and technical alignment techniques, thereby lowering the probability of an AI-driven existential catastrophe, whereby the human species either goes extinct or is radically disempowered.
Conclusion: The moral case for delaying AGI
Based on these reasoning steps, the conclusion is that delaying AGI is morally justified because it meaningfully reduces existential risk, and the value of this risk reduction vastly outweighs any negative consequences for currently existing humans. While delaying AGI may postpone medical breakthroughs and other technological advancements—thereby shortening the lifespans of people alive today, and forcing them to endure avoidable suffering for longer—this cost is seen as negligible in comparison to the overwhelming moral importance of preventing an AI-induced existential catastrophe that could wipe out all future generations of humans.
Why the standard case for delaying AI seems to rest on non-utilitarian assumptions
It may be tempting to believe that the argument I have just outlined closely parallels the argument for prioritizing other existential risks—such as the risk of a giant asteroid impact. However, these arguments are actually quite distinct.
To illustrate, consider the hypothetical scenario of a massive asteroid on a direct collision course with Earth. If this asteroid were to strike, it would not only wipe out all currently existing human life but also eliminate the possibility of any future civilization emerging. This means that all potential future generations—who could have gone on to colonize the universe and create an astronomically large amount of moral value—would never come into existence. According to the astronomical waste argument, preventing this catastrophe would be of overwhelming moral importance because the value of ensuring that future civilizations do emerge vastly outweighs the relatively minor concern of whether civilization emerges slightly earlier or later.
At first glance, proponents of delaying AI might want to claim that their argument follows the same logic. However, this would be misleading. The key difference is that in most existential risk scenarios, civilization itself would be completely destroyed, whereas in the case of AI risk, civilization would continue to exist—just under AI control rather than human control.
In other words, even if AIs were to drive humans to extinction or permanently disempower humanity, this would not necessarily mean that all future moral value is lost. AIs could still go on to build an intergalactic civilization, ensuring that complex life continues to exist, expand, and potentially flourish across the universe—just without humans. This means that the long-term future would still be realized, but in a form that does not involve human beings playing the central role.
This distinction is crucial because it directly undermines the application of the astronomical waste argument to AI existential risk. The astronomical waste argument applies most straightforwardly to scenarios where all future potential value is permanently destroyed—such as if a catastrophic asteroid impact wiped out all complex life on Earth, preventing any future civilization from emerging. But if AIs take over and continue building an advanced civilization, then the universe would still be filled with intelligent beings capable of creating vast amounts of moral value. The primary difference is that these beings would be AIs rather than biological humans.
This matters because, from a longtermist utilitarian perspective, the fundamental goal is to maximize total utility over the long term, without privileging any specific group based purely on arbitrary characteristics like species or physical substrate. A consistent longtermist utilitarian should therefore, in principle, give moral weight to all sentient beings, whether they are human or artificial. Anyone who truly adheres to this impartial framework would have no inherent preference for a future dominated by biological humans over one dominated by highly intelligent AIs.
Of course, one can still think—as I do—that human extinction would be a terrible outcome for the people who are alive when it occurs. Even if the AIs that replace us are just as morally valuable as we are from an impartial moral perspective, it would still be a moral disaster for all currently existing humans to die. However, if we accept this perspective, then we must also acknowledge that, from the standpoint of people living today, there appear to be compelling reasons to accelerate AI development rather than delay it for safety reasons.
The reasoning is straightforward: if AI becomes advanced enough to pose an existential threat to humanity, then it would almost certainly also be powerful enough to enable massive technological progress—potentially revolutionizing medicine, biotechnology, and other fields in ways that could drastically improve and extend human lives. For example, advanced AI could help develop cures for aging, eliminate extreme suffering, and significantly enhance human health through medical and biological interventions. These advancements could allow many people who are alive today to live much longer, healthier, and more fulfilling lives.
As economist Chad Jones has pointed out, delaying AI development means that the current generation of humans risks missing out on these transformative benefits. If AI is delayed for years or decades, a large fraction of people alive today—including those advocating for AI safety—would not live long enough to experience these life-extending technologies. This leads to a strong argument for accelerating AI, at least from the perspective of present-day individuals, unless one is either unusually risk-averse or highly confident (say, above 50%) that AI will lead to human extinction.
To be clear, if someone genuinely believes there is a high probability that AI will wipe out humanity, then I agree that delaying AI would seem rational, since the high risk of personal death would outweigh the small possibility of a dramatically improved life. But for those who see AI extinction risk as relatively low (such as below 15%), accelerating AI development appears to be the more pragmatic personal choice.
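To see why a threshold like this emerges, here is a toy personal expected-value calculation. All of the numbers and simplifications are my own illustrative assumptions (for instance, it treats a long delay as one the individual does not survive, and it ignores risk aversion); they are not estimates drawn from Jones or anyone else:

```latex
% Toy personal tradeoff (illustrative assumptions only).
%   p : probability that accelerated AI leads to the individual's death
%   R : remaining life-years under the status quo (assume R = 40)
%   G : additional life-years gained if transformative AI arrives in time
%       (assume G = 200, e.g. via anti-aging medicine)
% If a long delay means the individual never sees these benefits, then
% accelerating is favorable in expectation when
\[
  (1 - p)(R + G) \;>\; R
  \quad\Longleftrightarrow\quad
  p \;<\; \frac{G}{R + G}.
\]
% With R = 40 and G = 200 the threshold is 200/240, roughly 0.83, so a 15%
% risk clears it comfortably. Smaller assumed gains, baseline mortality, or
% risk aversion (ignored here) would pull the threshold down substantially,
% so the exact cutoff depends heavily on these assumptions.
\]
```

The point of the sketch is not the particular numbers but the structure of the tradeoff: for a present-day individual, the case for delay turns on how probable personal death from AI is relative to the personal benefits forgone.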
Thus, while human extinction would undoubtedly be a disastrous event, the idea that even a small risk of extinction from AI justifies delaying its development—even if that delay results in large numbers of currently existing humans dying from preventable causes—is not supported by straightforward utilitarian reasoning. The key question here is what extinction actually entails. If human extinction means the total disappearance of all complex life and the permanent loss of all future value, then mitigating even a small risk of such an event might seem overwhelmingly important. However, if the outcome of human extinction is simply that AIs replace humans—while still continuing civilization and potentially generating vast amounts of moral value—then the reasoning behind delaying AI development changes fundamentally.
In this case, the clearest and most direct tradeoff is not about preventing "astronomical waste" in the classic sense (i.e., preserving the potential for future civilizations) but rather about whether the risk of AI takeover is acceptable to the current generation of humans. In other words, is it justifiable to impose costs on presently living people—including delaying potentially life-saving medical advancements—just to reduce a relatively small probability that humanity might be forcibly replaced by AI? This question is distinct from the broader existential risk arguments that typically focus on preserving all future potential value, and it suggests that delaying AI is not obviously justified by utilitarian logic alone.
From a historical perspective, existential transitions—where one form of life is replaced by another—are not uncommon. Mass extinctions have occurred repeatedly throughout Earth's history, yet they have not resulted in the total elimination of all complex life or of all utilitarian moral value. If AIs were to replace humans, it would be a transition of a similar nature, not necessarily a total moral catastrophe in the way that the true extinction of all complex life would be.
Another natural process that mirrors the pattern of one form of life being replaced by another is ordinary generational replacement. By this, I am referring to the fact that, as time passes, each generation of humans inevitably ages and dies, and a new generation is born to take its place. While this cycle preserves the human species as a whole, it still follows the fundamental pattern of one group of individuals—who once fully controlled the world—entirely disappearing and being replaced by another group that did not previously exist.
Once we recognize these parallels, it becomes clearer that AI existential risk is functionally more similar to a generational transition between different forms of intelligent life than it is to the total extinction of all complex life. The key difference is that, instead of new biological humans replacing old biological humans, future AI entities would replace humans altogether. But functionally, both processes involve one intelligent group dying out and another taking over, continuing civilization in a new form.
This realization highlights a fundamental assumption underlying the "standard case" for delaying AI: it is not primarily based on a concern for the survival of individual human beings, or the continuity of civilization, but rather on a speciesist preference for the survival of the human species as a category.
The assumption is that the death of humanity is uniquely catastrophic not because intelligent life or civilization would end, but because the human species itself would no longer exist. Here, humanity is not being valued merely as a collection of currently living individuals but as an abstract genetic category—one that is preserved across generations through biological reproduction. The implicit belief appears to be that even though both humans and AIs would be capable of complex thought and moral reasoning, only humans belong to the privileged genetic category of "humanity", which is assumed to have special moral significance.
This speciesist assumption suggests that the true moral concern driving the argument for delaying AI is not the loss of future moral value in general, but rather the loss of specifically human control over that value. If AI were to replace humans, civilization would not disappear—only the genetic lineage of Homo sapiens would. The claim that this constitutes an "existential catastrophe" is therefore not based on the objective loss of complex life that could create moral value, but on the belief that only human life (or biological life), as opposed to artificial life, is truly valuable.
As a result, the standard argument for delaying AI fundamentally relies on prioritizing the survival of the human species as a category, rather than simply the survival of sentient beings capable of experiencing value, or improving the lives of people who currently exist. This assumption is rarely made explicit, but once recognized, it undermines the idea that AI-driven human extinction is straightforwardly comparable to an asteroid wiping out all life. Instead, it becomes clear that the argument is rooted in a preference for human biological continuity—one that is far more species-centric than purely utilitarian in nature.
Can a utilitarian case be made for delaying AI?
So far I have written about the "standard case" for delaying AI development, as I see it. To be clear, I am not denying that one could construct a purely utilitarian argument for why AIs might generate less moral value than humans, and thus why delaying AI could be justified. My point, rather, is that such an argument is rarely made explicit or supported with evidence in discussions on this topic.
For instance, one common claim is that the key difference between humans and AIs is consciousness—that is, humans are known to be conscious, while AIs may not be. Because moral value is often linked to consciousness, this argument suggests that ensuring the survival of humans (rather than being replaced by AIs) is crucial for preserving moral value.
While I acknowledge that this is a major argument people often invoke in personal discussions, it does not appear to be strongly supported within effective altruist literature. In fact, I have come across very few articles on the EA Forum or in EA literature that explicitly argue that AIs will not be conscious and then connect this point to the urgency of delaying AI, or reducing AI existential risk. Indeed, I suspect there are many more articles from EAs that argue what is functionally the opposite claim—namely, that AIs will probably be conscious. This is likely due to the popularity of functionalist theories of consciousness among many effective altruists, which suggest that consciousness is determined by computational properties rather than biological substrate. If one accepts this view, then there are few inherent reasons to assume that future AIs would lack consciousness or moral worth.
Another potential argument is that humans are more likely than AIs to pursue goals aligned with utilitarian values, which would make preserving human civilization morally preferable. While this argument is logically coherent, it does not seem to have strong, explicit support in the EA literature—at least, to my knowledge. I have encountered few, if any, rigorous EA analyses that explicitly argue future AIs will likely be less aligned with utilitarian values than humans. Without such an argument, this claim remains little more than an assertion. And if one can simply assert this claim without strong evidence, then one could just as easily assert the opposite—that AIs, on average, might actually be more aligned with utilitarian values than humans—leading to the opposite conclusion.
Either way, such an argument would depend on empirical evidence about the likely distribution of AI goals in the future. In other words, to claim that AIs are less likely than humans to adopt utilitarian values, one would need to provide concrete evidence about what kinds of objectives advanced AIs are actually expected to develop. However, discussions on this topic rarely present detailed empirical analyses of what this distribution of AI goals is likely to look like, making this claim very speculative, and so far, largely unjustified.
Thus, my argument is not that it is logically impossible to construct a utilitarian case for delaying AI in the name of safety. I fully acknowledge that such an argument could be made. However, based on the literature that currently exists supporting the idea of delaying AI development, I suspect that the most common real-world justification that people rely on for this position is not a carefully constructed utilitarian argument. Instead, it appears to rest largely on an implicit speciesist preference for preserving the human species—an assumption that is disconnected from traditional utilitarian principles, which prioritize maximizing well-being for actual individuals, rather than preserving a particular species for its own sake.
One notable counterpoint comes from Nate Soares, who has argued that an AI takeover would most likely lead to an "unconscious meh" scenario, where "The outcome is worse than the 'Pretty Good' scenario, but isn't worse than an empty universe-shard" and "there's little or no conscious experience in our universe-shard's future. E.g., our universe-shard is tiled with tiny molecular squiggles (a.k.a. 'molecular paperclips')." On his view, humanity boosted by ASI would probably lead to a better outcome, and this was also the most common view in the polls in the comments on that post.