Huh? This argument only goes through if you have a sufficiently low probability of existential risk or an extremely low change in your probability of existential risk, conditioned on things moving slower.
This claim seems false, though its truth hinges on what exactly you mean by a "sufficiently low probability of existential risk" and "an extremely low change in your probability of existential risk".
To illustrate why I think your claim is false, I'll perform a quick calculation. I don't know your p(doom), but in a post from three years ago, you stated,
If you believe the key claims of "there is a >=1% chance of AI causing x-risk and >=0.1% chance of bio causing x-risk in my lifetime" this is enough to justify the core action relevant points of EA.
Let's assume that there's a 2% chance of AI causing existential risk, and that, optimistically, pausing for a decade would cut this risk in half (rather than barely decreasing it, or even increasing it). This would imply that the total risk would diminish from 2% to 1%.
According to OWID, approximately 63 million people die every year, although this rate is expected to increase, rising to around 74 million in 2035. If we assume that around 68 million people will die per year during the relevant time period, and that they could have been saved by AI-enabled medical progress, then pausing AI for a decade would kill around 680 million people.
This figure is around 8.3% of the current global population, and would constitute a death toll higher than the combined deaths from World War 1, World War 2, the Mongol conquests, the Taiping Rebellion, the transition from Ming to Qing, and the Three Kingdoms war.
(Note that, although we are counting deaths from old age in this case, these deaths are comparable to deaths in war from a years-of-life-lost perspective, if you assume that AI-accelerated medical breakthroughs will likely greatly increase human lifespan.)
From the perspective of an individual human life, a 1% chance of death from AI is significantly lower than an 8.3% chance of death from aging—though obviously in the former case this risk would apply independently of age, whereas in the latter case the risk would be concentrated heavily among people who are currently elderly.
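To make this arithmetic explicit, here is a minimal back-of-envelope sketch in Python, using only the illustrative figures above (a pause assumed to cut AI risk from 2% to 1%, roughly 68 million deaths per year, and a global population of about 8.2 billion, the last figure being my own rounding). It also computes the analogous numbers for a shorter, two-year pause:

```python
# Minimal sketch of the back-of-envelope comparison above.
# All inputs are illustrative assumptions from the text, not empirical estimates.

WORLD_POPULATION = 8.2e9       # assumed current global population (rounded)
ANNUAL_DEATHS = 68e6           # assumed average deaths per year during the pause
RISK_REDUCTION = 0.02 - 0.01   # pause assumed to cut AI existential risk from 2% to 1%

def pause_tradeoff(years):
    """Return (deaths attributable to a pause of `years`, share of current population),
    assuming AI-enabled medical progress would otherwise have prevented those deaths."""
    deaths_from_delay = ANNUAL_DEATHS * years
    return deaths_from_delay, deaths_from_delay / WORLD_POPULATION

for years in (10, 2):
    deaths, share = pause_tradeoff(years)
    print(f"{years}-year pause: ~{deaths / 1e6:.0f} million deaths "
          f"(~{share:.1%} of the population) vs. a {RISK_REDUCTION:.0%} "
          f"absolute reduction in AI existential risk")
```

On these assumptions, even the two-year pause imposes an individual mortality cost of roughly 1.7% of the current population, which exceeds the assumed 1 percentage point reduction in extinction risk.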
Even a briefer pause lasting just two years, while still cutting the risk in half, would not survive this basic cost-benefit test. Of course, it's true that it's difficult to directly compare the individual costs of AI existential risk to those of the diseases of old age. For example, death from AI existential risk has the potential to be briefer and less agonizing, which, all else being equal, should make it somewhat less bad than death from aging. On the other hand, most people might consider death from old age to be preferable, since it's more natural and allows the human species to continue.
Nonetheless, despite these nuances, I think the basic picture that I'm presenting holds up: under typical assumptions (such as the ones you gave three years ago), a purely individualistic framing of the costs and benefits of an AI pause does not clearly favor pausing, from the perspective of people who currently exist. This fact was noted in Nick Bostrom's original essay on Astronomical Waste, and more recently by Chad Jones in his paper on the tradeoffs involved in stopping AI development.
For your broader point of impartiality, I feel like you are continuing to assume some bizarre form of moral realism and I don't understand the case. Otherwise, why do you not consider rocks to be morally meaningful? Why is a plant not valuable?
Traditionally, utilitarianism regards these things (rocks and plants) as lacking moral value because they do not have well-being or preferences. This principle does not clearly apply to AI, though it's possible that you are making the assumption that future AIs will lack sentience or meaningful preferences. It would be helpful if you clarified how you perceive me to be assuming a form of moral realism (a meta-ethical theory), as I simply view myself as applying a standard utilitarian framework (a normative theory).
I do not understand the philosophical position you are taking here - it feels like you're saying that the standard position is speciesist and arbitrary and then drawing an arbitrary distinction slightly further out?
Standard utilitarianism recognizes both morally relevant and morally irrelevant distinctions in value. According to a long tradition, following Jeremy Bentham and Peter Singer, among others, the species category is considered morally irrelevant, whereas sentience and/or preferences are considered morally relevant. I do not think this philosophy rests on the premise of moral realism: rather, it's a conceptual framework for understanding morality, whether from a moral realist or anti-realist point of view.
To be clear, I agree that utilitarianism is itself arbitrary, from a sufficiently neutral point of view. But it's also a fairly standard ethical framework, not just in EA but in academic philosophy too. I don't think I'm making very unusual assumptions here.
In your comment, you raise a broad but important question about whether, even if we reject the idea that human survival must take absolute priority over other concerns, we might still want to pause AI development in order to “set up” future AIs more thoughtfully. You list a range of traits—things like pro-social instincts, better coordination infrastructures, or other design features that might improve cooperation—that, in principle, we could try to incorporate if we took more time. I understand and agree with the motivation behind this: you are asking whether there is a prudential reason, from a more inclusive moral standpoint, to pause in order to ensure that whichever civilization emerges—whether dominated by humans, AIs, or both at once—turns out as well as possible in ways that matter impartially, rather than focusing narrowly on preserving human dominance.
Having summarized your perspective, I want to clarify exactly where I differ from your view, and why.
First, let me restate the perspective I defended in my previous post on delaying AI. In that post, I was critiquing what I see as the “standard case” for pausing AI, as I perceive it being made in many EA circles. This standard case for pausing AI often treats preventing human extinction as so paramount that any delay of AI progress, no matter how costly to currently living people, becomes justified if it incrementally lowers the probability of humans losing control.
Under this argument, the reason we want to pause is that time spent on “alignment research” can be used to ensure that future AIs share human goals, or at least do not threaten the human species. My critique had two components: first, I argued that pausing AI is very costly to people who currently exist, since it delays medical and technological breakthroughs that advanced AIs could deliver, thereby causing many people to die who could otherwise have been saved. Second, and more fundamentally, I argued that this “standard case” seems to rest on an assumption of strictly prioritizing human continuity, independent of whether future AIs might actually generate utilitarian moral value in a way that matches or exceeds humanity's.
I certainly acknowledge that one could propose a different rationale for pausing AI, one which does not rest on the premise that preserving the human species is intrinsically worth more than other moral priorities. This is, it seems, the position you are taking.
Nonetheless, I don't find your considerations compelling for a variety of reasons.
To begin with, it might seem that granting ourselves "more time" robustly ensures that AIs come out morally better—pro-social, cooperative, and so on. Yet the connection between “getting more time” and “achieving positive outcomes” does not seem straightforward. Merely taking more time does not ensure that this time will be used to increase, rather than decrease, the relevant quality of AI systems according to an impartial moral view. Alignment with human interests, for example, could just as easily push systems in directions that entrench specific biases, maintain existing social structures, or limit moral diversity—none of which strongly aligns with the “pro-social” ideals you described. In my view, there is no inherent link between a slower timeline and ensuring that AIs end up embodying genuinely virtuous or impartial ethical principles. Indeed, if what we call “human control” is mainly about enforcing the status quo or entrenching the dominance of the human species, it may be no better—and could even be worse—than a scenario in which AI development proceeds at the default pace, potentially allowing for more diversity and freedom in how systems are shaped.
Furthermore, in my own moral framework—which is heavily influenced by preference utilitarianism—I take seriously the well-being of everyone who currently exists. As I mentioned previously, one major cost of pausing AI is that it would likely postpone many technological benefits. These might include breakthroughs in medicine—potential cures for aging, radical extensions of healthy lifespans, or other dramatic improvements to human welfare that advanced AI could accelerate. We should not simply dismiss the scale of that cost. The usual EA argument for downplaying these costs rests on the Astronomical Waste argument. However, I find this argument flawed, and I spelled out exactly why in the post I just wrote.
If a pause sets back major medical discoveries by even a decade, that delay could contribute to the premature deaths of around a billion people alive today. It seems to me that an argument in favor of pausing should grapple with this tradeoff, instead of dismissing it as clearly unimportant compared to the potential human lives that could maybe exist in the far future. Such a dismissal would seem both divorced from common sense concern for existing people, and divorced from broader impartial utilitarian values, as it would prioritize the continuity of the human species above and beyond species-neutral concerns for individual well-being.
Finally, I take very seriously the possibility that pausing AI would cause immense and enduring harm by requiring the creation of vast regulatory controls over society. Realistically, the political mechanisms by which we “pause” advanced AI development would likely involve a lot of coercion, surveillance, and social control, particularly as AI becomes an integral part of our economy. These efforts are likely to expand state regulatory powers, hamper open competition, and open the door to massive state interference in economic and social activity. I believe these controls would likely be far more burdensome and costly than, for example, our controls over nuclear weapons. If our top long-term priority is building a more free, prosperous, inclusive, joyous, and open society for everyone, rather than merely controlling and stopping AI, then it seems highly questionable that creating the policing powers required to pause AI is the best way to achieve this objective.
As I see it, the core difference between the view you outlined and mine is not that I am ignoring the possibility that we might “do better” by carefully shaping the environment in which AIs arise. I concede that if we had a guaranteed mechanism to spend a known, short period of time intentionally optimizing how AIs are built, without imposing any other costs in the meantime, that might bring some benefits. However, my skepticism flows from the actual methods by which such a pause would come about, its unintended consequences for liberty, the immediate harms it imposes on present-day people by delaying technological progress, and the fact that it might simply entrench a narrower or more species-centric approach that I explicitly reject. It is not enough to claim that “pausing gives us more time,” suggesting that “more time” is robustly a good thing. One must argue why that time will be spent well, in a way that outweighs the enormous and varied costs that I believe are incurred by pausing AI.
To be clear, I am not opposed to all forms of regulation. But I tend to prefer more liberal approaches, in the sense of classical liberalism. I prefer strategies that try to invite AIs into a cooperative framework, giving them legal rights and a path to peaceful integration—coupled, of course, with constraints on any actor (human or AI) who threatens to commit violence. This, in my view, simply seems like a far stronger foundation for AI policy than a stricter top-down approach in which we halt all frontier AI progress, and establish the associated sweeping regulatory powers required to enforce such a moratorium.
Let's define "shumanity" as the set of all humans who are currently alive. Under this definition, every living person today is a "shuman," but our future children may not be, since they do not yet exist. Now, let's define "humanity" as the set of all humans who could ever exist, including future generations. Under this broader definition, both we and our future children are part of humanity.
If all currently living humans (shumanity) were to die, this would be a catastrophic loss from the perspective of shuman values—the values held by the people who are alive today. However, it would not necessarily be a catastrophic loss from the perspective of human values—the values of humanity as a whole, across time. This distinction is crucial. In the normal course of events, every generation eventually grows old, dies, and is replaced by the next. When this happens, shumanity, as defined, ceases to exist, and as such, shuman values are lost. However, humanity continues, carried forward by the new generation. Thus, human values are preserved, but not shuman values.
Now, consider this in the context of AI. Would the extinction of shumanity by AIs be much worse than the natural generational cycle of human replacement? In my view, it is not obvious that being replaced by AIs would be much worse than being replaced by future generations of humans. Both scenarios involve the complete loss of the individual values held by currently living people, which is undeniably a major loss. To be very clear, I am not saying that it would be fine if everyone died. But in both cases, something new takes our place, continuing some form of value, mitigating part of the loss. This is the same perspective I apply to AI: its rise might not necessarily be far worse than the inevitable generational turnover of humans, which equally involves everyone dying (which I see as a bad thing!). Maybe "human values" would die in this scenario, but this would not necessarily entail the end of the broader concept of impartial utilitarian value. This is precisely my point.
I don’t subscribe to moral realism. My own ethical outlook is a blend of personal attachments—my own life, my family, my friends, and other living humans—as well as a broader utilitarian concern for overall well-being. In this post, I focused on impartial utilitarianism because that’s the framework most often used by effective altruists.
However, to the extent that I also have non-utilitarian concerns (like caring about specific people I know), those concerns incline me away from supporting a pause on AI. If AI can accelerate technologies that save and improve the lives of people who exist right now, then slowing it down would cost lives in the near term. A more complete, and more rigorous version of this argument was outlined in the post.
What I find confusing about other EAs' views, including yours, is why we would assign such great importance to “human values” as something specifically tied to the human species as an abstract concept, rather than merely being partial to actual individuals who exist. This perspective is neither utilitarian, nor is it individualistic. It seems to value the concept of the human species over and above the actual individuals that comprise the species, much like how an ideological nationalist might view the survival of their nation as more important than the welfare of all the individuals who actually reside within the nation.
I realize my position can be confusing, so let me clarify it as plainly as I can: I do not regard the extinction of humanity as anything close to “fine.” In fact, I think it would be a devastating tragedy if every human being died. I have repeatedly emphasized that a major upside of advanced AI lies in its potential to accelerate medical breakthroughs—breakthroughs that might save countless human lives, including potentially my own. Clearly, I value human lives, as otherwise I would not have made this particular point so frequently.
What seems to cause confusion is that I also argue the following more subtle point: while human extinction would be unbelievably bad, it would likely not be astronomically bad in the strict sense used by the "astronomical waste" argument. The standard “astronomical waste” argument says that if humanity disappears, then all possibility for a valuable, advanced civilization vanishes forever. But in a scenario where humans die out because of AI, civilization would continue—just not with humans. That means a valuable intergalactic civilization could still arise, populated by AI rather than by humans. From a purely utilitarian perspective that counts the existence of a future civilization as extremely valuable—whether human or AI—this difference lowers the cataclysm from “astronomically, supremely, world-endingly awful” to “still incredibly awful, but not on a cosmic scale.”
In other words, my position remains that human extinction is very bad indeed—it entails the loss of eight billion individual human lives, which would be horrifying. I don't want to be forcibly replaced by an AI. Nor do I want you, or anyone else, to be forcibly replaced by an AI. I am simply pushing back on the idea that such an event would constitute the absolute destruction of all future value in the universe. There is a meaningful distinction between “an unimaginable tragedy we should try very hard to avoid” and “a total collapse of all potential for a flourishing future civilization of any kind.” My stance falls firmly in the former category.
This distinction is essential to my argument because it fundamentally shapes how we evaluate trade-offs, particularly when considering policies that aim to slow or restrict AI research. If we assume that human extinction due to AI would erase all future value, then virtually any present-day sacrifice—no matter how extreme—might seem justified to reduce that risk. However, if advanced AI could continue to sustain its own value-generating civilization, even in the absence of humans, then extinction would not represent the absolute end of valuable life. While this scenario would be catastrophic for humanity, attempting to avoid it might not outweigh certain immediate benefits of AI, such as its potential to save lives through advanced technology.
In other words, there could easily be situations where accelerating AI development—rather than pausing it—ends up being the better choice for saving human lives, even if doing so slightly increases the risk of human extinction. This does not mean we should be indifferent to extinction; rather, it means we should stop treating extinction as a near-infinitely overriding concern, where even the smallest reduction in its probability is always worth immense near-term costs to actual people living today.
For a moment, I’d like to reverse the criticism you leveled at me. From where I stand, it is often those who strongly advocate pausing AI development, not myself, who can appear to undervalue the lives of humans. I know they don’t see themselves this way, and they would certainly never phrase it in those terms. Nevertheless, this is my reading of the deeper implications of their position.
A common proposition that many AI pause advocates have affirmed to me is that it could well be worth pausing AI even if this led to billions of humans dying prematurely because they missed out on accelerated medical progress that could otherwise have saved their lives. Therefore, while these advocates care deeply about human extinction (something I do not deny), their concern does not seem rooted in the intrinsic worth of the people who are alive today. Instead, their primary focus often seems to be on the loss of potential future human lives that could maybe exist in the far future—lives that do not yet exist and, in my view, are unlikely to exist in the far future in basically any scenario, since humanity is unlikely to be preserved as a fixed, static concept over the long run.
In my view, this philosophy neither prioritizes the well-being of actual individuals nor is it grounded in the utilitarian value that humanity actively generates. If this philosophy were purely about impartial utilitarian value, then I ask: why are they not more open to my perspective? Since my philosophy takes an impartial utilitarian approach—one that considers not just human-generated value, but also the potential value that AI itself could create—it would seem to appeal to those who simply took a strict utilitarian approach, without discriminating against artificial life arbitrarily. Yet, my philosophy largely does not appeal to those who express this view, suggesting the presence of alternative, non-utilitarian concerns.
I think your response largely assumes a human-species-centered viewpoint, rather than engaging with my critique, which is aimed precisely at re-evaluating that point of view.
You say, “AIs will probably not care about the same things, so the universe will be worse by our lights if controlled by AI.” But what are "our lights" and "our values" in this context? Are you referring to the values of me as an individual, the current generation of humans, or humanity as a broad, ongoing species-category? These are distinct—and often conflicting—sets of values, preferences, and priorities. It’s possible, indeed probable, that I, personally, have preferences that differ fundamentally from the majority of humans. "My values" are not the same as "our values".
When you talk about whether an AI civilization is “better” or “worse,” it’s crucial to clarify what perspective we’re measuring that from. If, from the outset, we assume that human values, or the survival of humanity-as-a-species, is the critical factor that determines whether an AI civilization is better or worse than our own, that effectively begs the question. It merely assumes what I aim to challenge. From a more impartial standpoint, the mere fact that AI might not care about the exact same things humans do doesn’t necessarily entail a decrease in total impartial moral value—unless we’ve already decided in advance that human values are inherently more important.
(To make this point clearer, perhaps replace all mentions of "human values" with "North American values" in the standard arguments about these issues, and see if it makes these arguments sound like they privilege an arbitrary category of beings.)
While it’s valid to personally value the continuation of the human species, or the preservation of human values, as a moral preference above other priorities, my point is simply that that’s precisely the species-centric assumption I’m highlighting, rather than a distinct argument that undermines my observations or analysis. Such a perspective is not substrate or species-neutral. Nor is it obviously mandated by a strictly utilitarian framework; it’s an extra premise that privileges the category "humankind" for its own sake. You may believe that such a preference is natural or good from your own perspective, but that is not equivalent to saying that it is the preference of an impartial utilitarian, who would, in theory, make no inherent distinction based purely on species, or substrate.
A reflection on the posts I have written in the last few months, elaborating on my views
In a series of recent posts, I have sought to challenge the conventional view among longtermists that prioritizes the empowerment or preservation of the human species as the chief goal of AI policy. It is my opinion that this view is likely rooted in a bias that automatically favors human beings over artificial entities, a bias that sidelines the idea that future AIs might create equal or greater moral value than humans and treats this alternative perspective with unwarranted skepticism.
I recognize that my position is controversial and likely to remain unpopular among effective altruists for a long time. Nevertheless, I believe it is worth articulating my view at length, as I see it as a straightforward application of standard, common-sense utilitarian principles that merely lead to an unpopular conclusion. I intend to continue elaborating on my arguments in the coming months.
My view follows from a few basic premises. First, that future AI systems are quite likely to be moral patients; second, that we shouldn’t discriminate against them based on arbitrary distinctions, such as their being instantiated on silicon rather than carbon, or having been created through deep learning rather than natural selection. If we insist on treating AIs fundamentally differently from a human child or adult—for example, by regarding them merely as property to be controlled or denying them the freedom to pursue their own goals—then we should identify a specific ethical reason for our approach that goes beyond highlighting their non-human nature.
Many people have argued that consciousness is the key quality separating humans from AIs, thus rendering any AI-based civilization morally insignificant compared to ours. They maintain that consciousness has relatively narrow boundaries, perhaps largely confined to biological organisms, and would only arise in artificial systems under highly specific conditions—for instance, if one were to emulate a human mind in digital form. While I acknowledge that this perspective is logically coherent, I find it deeply unconvincing. The AIs I am referring to when I write about this topic would almost certainly not be simplistic, robotic automatons; rather, they would be profoundly complex, sophisticated entities whose cognitive abilities rival or exceed those of the human brain. For anyone who adopts a functionalist view of consciousness, it seems difficult to be confident that such AIs would lack a rich inner experience.
Because functionalism and preference utilitarianism—both of which could grant moral worth to AI preferences even if they do not precisely replicate biological states—have at least some support within the EA community, I remain hopeful that, if I articulate my position clearly, EAs who share these philosophical assumptions will see its merits.
That said, I am aware that explaining this perspective is an uphill battle. The unpopularity of my views often makes it difficult to communicate without instant misunderstandings; critics seem to frequently conflate my arguments with other, simpler positions that can be more easily dismissed. At times, this has caused me to feel as though the EA community is open to only a narrow range of acceptable ideas. This reaction, while occasionally frustrating, does not surprise me, as I have encountered similar resistance when presenting other unpopular views—such as challenging the ethics of purchasing meat in social contexts where such concerns are quickly deemed absurd.
However, the unpopularity of these ideas also has a benefit: it creates room for rapid intellectual progress by opening the door to new and interesting philosophical questions about AI ethics. If we free ourselves from the seemingly unquestionable premise that preserving the human species should be the top priority when governing AI development, we can begin to ask entirely new and neglected questions about the role of artificial minds in society.
These questions include: what social and legal frameworks should we pursue if AIs are seen not as dangerous tools to be contained but as individuals on a similar moral footing with humans? How do we integrate AI freedom and autonomy into our vision of the future, creating the foundation for a system of ethical and pragmatic AI rights?
Under this alternative philosophical approach, policy would not focus solely on minimizing risks to humanity. Instead, it would emphasize cooperation and inclusion, seeing advanced AI as a partner rather than an ethical menace to be tightly restricted or controlled. This undoubtedly requires a significant shift in our longtermist thinking, demanding a re-examination of deeply rooted assumptions. Such a project cannot be completed overnight, but given the moral stakes and the rapid progress in AI, I view this philosophical endeavor as both urgent and exciting. I invite anyone open to rethinking these foundational premises to join me in exploring how we might foster a future in which AIs and humans coexist as moral peers, cooperating for mutual benefit rather than viewing each other as intrinsic competitors locked in an inevitable zero-sum fight.
We can assess the strength of people's preferences for future generations by analyzing their economic behavior. The key idea is that if people genuinely cared deeply about future generations, they would prioritize saving a huge portion of their income for the benefit of those future individuals rather than spending it on themselves in the present. This would indicate a strong intertemporal preference for improving the lives of future people over the well-being of currently existing individuals.
For instance, if people truly valued humanity as a whole far more than their own personal well-being, we would expect parents to allocate the vast majority of their income to their descendants (or humanity collectively) rather than using it for their own immediate needs and desires. However, empirical studies generally do not support the claim that people place far greater importance on the long-term preservation of humanity than on the well-being of currently existing individuals. In reality, most people tend to prioritize themselves and their children, while allocating only a relatively small portion of their income to charitable causes or savings intended to benefit future generations beyond their immediate children. If people were intrinsically and strongly committed to the abstract concept of humanity itself, rather than primarily concerned with the welfare of present individuals (including their immediate family and friends), we would expect to see much higher levels of long-term financial sacrifice for future generations than we actually observe.
To be clear, I'm not claiming that people don’t value their descendants, or the concept of humanity at all. Rather, my point is that this preference does not appear to be strong enough to override the considerations outlined in my previous argument. While I agree that people do have an independent preference for preserving humanity—beyond just their personal desire to avoid death—this preference is typically not way stronger than their own desire for self-preservation. As a result, my previous conclusion still holds: from the perspective of present-day individuals, accelerating AI development can still be easily justified if one does not believe in a high probability of human extinction from AI.
I'm not talking about "arbitrary AI entities" in this context, but instead, the AI entities who will actually exist in the future, who will presumably be shaped by our training data, as well as our training methods. From this perspective, it's not clear to me that your claim is true. But even if your claim is true, I was actually making a different point. My point was instead that it isn't clear that future generations of AIs would be much worse than future generations of humans from an impartial utilitarian point of view.
(That said, it sounds like the real crux between us might instead be about whether pausing AI would be very costly to people who currently exist. If indeed you disagree with me about this point, I'd prefer you reply to my other comment rather than replying to this one, as I perceive that discussion as likely to be more productive.)