This is my submission for the 'Essays on Longtermism' competition, written in response to Chapter 22, "Existential Risk from Power-Seeking AI," from ‘Essays on Longtermism: Present Action for the Distant Future’, recently published by Oxford University Press. My essay builds on Joe Carlsmith's argument that the primary danger lies not in technical failure but in human systems: the race dynamics, externalities, and powerful incentives that make the deployment of unsafe AI a near certainty.
We love a good salvation story. It’s a tale as old as time: a formidable, seemingly insurmountable problem threatens our existence, and just when all seems lost, a hero arrives: a savior, a magic wand, a technological messiah. Today, that hero has a new name: Artificial Intelligence. The narrative is as seductive as it is simple. AI, we’re told, will be the ultimate problem-solver. It will cure cancer, reverse climate change, end poverty, and usher in an age of unprecedented prosperity. It’s a comforting thought, this belief in a deus ex machina built of silicon and code. It suggests that our hardest challenges are merely complex computational problems waiting for a powerful enough processor to solve them.
But this tidy narrative has a crack running through it, a fundamental flaw that has little to do with algorithms and everything to do with us. The idea that technological potential automatically translates into guaranteed human progress is a dangerous myth. Powerful tools don’t arrive in a vacuum; they land in the messy, complicated, and often deeply flawed world of human affairs. They are shaped by our incentives, wielded by our institutions, and constrained by our limited wisdom. The central tension of our age, then, is not merely about how we get machines to pursue human values, but how we first align humanity with a shared, coherent vision of a better future. Before we can teach a machine what is good, we have a much harder task: agreeing on it ourselves and building a world that enables that good to flourish. The challenge of getting AI right, it turns out, is a mirror reflecting our own deepest dysfunctions.
It’s easy to get swept up in the promise. After all, history is filled with technological leaps that remade our world for the better. But it’s also a graveyard of unintended consequences. Look at nuclear energy. It promised a future of limitless, clean power, a solution to our energy woes. And in many ways, it delivered. Yet that same atomic mastery gave us the specter of nuclear annihilation and a legacy of radioactive waste we still don’t know how to handle. Biotechnology offers another cautionary tale. Gene editing holds the key to eradicating hereditary diseases, yet it also opens a Pandora’s box of ethical dilemmas, from designer babies to the terrifying potential of engineered pathogens. In both cases, the technology itself was neutral; its impact was a direct result of human choices, geopolitical rivalries, and market forces. So why does the "AI will save us" mindset persist so strongly? Honestly, it’s because it’s easy. It absolves us of responsibility. It’s far more comfortable to imagine a superintelligent machine untangling the knot of climate change than it is to confront the messy political and economic realities that prevent us from acting now. This solutionism carries a profound moral hazard. By framing our greatest challenges as technical puzzles, we risk outsourcing our moral and political duties.
We wait for the machine to fix us, instead of doing the hard, grinding work of fixing ourselves: of reforming our institutions, bridging our divides, and re-examining the very systems that created these problems in the first place. The allure of a technological silver bullet is that it allows us to believe we can have progress without sacrifice, solutions without solidarity. It’s a deeply human hope, but one that has repeatedly proven to be a dangerous illusion, especially when strong incentives exist to create and deploy these powerful systems on a widespread scale, regardless of the risks.
When we imagine AI-driven breakthroughs, we tend to picture a universally shared bounty. But the question we rarely ask is: who benefits, and who bears the risks? The same systems that might one day design personalized cancer treatments will be developed and deployed within a global economic framework that already struggles with profound inequality. Will these life-saving therapies be a public good, or will they be accessible only to those who can afford the premium? The market logic that governs so much of our world suggests the latter. Technology developed within a system of capital answers first to profit, and its orientation in the free market can easily reinforce, rather than dismantle, existing hierarchies of race and class. The result isn't a world cured of disease, but a world where the wealthy retreat to climate-secure enclaves while others are left outside, scavenging for survival. This problem is magnified by the dual-use nature of advanced AI.
A system designed for good can almost always be repurposed for harm. An AI that can design novel proteins to fight disease could, in the wrong hands, design novel pathogens for a bioweapon. Already, frontier AI models have matched or outperformed expert human virologists on specialized laboratory troubleshooting tasks, and some can design harmful genomic sequences that evade existing biosecurity screening protocols. This isn’t a distant, science-fiction threat; it’s a direct consequence of pouring immense resources into capability without a corresponding investment in safety and governance. The same dynamic applies to our information ecosystem. We celebrate AI’s ability to generate text, images, and audio, imagining a world of enhanced creativity and communication. Yet these same tools are already being used to create disinformation at unprecedented scale and efficiency. They lower the barrier to entry, allowing not just state actors but low-resourced domestic networks and even lone individuals to craft hyper-realistic deepfakes, automate personalized harassment campaigns, and flood online spaces with divisive content. This erosion of our shared reality creates a "social fissure" that malicious actors can exploit to undermine trust in institutions, in the media, and in democracy itself. The tool that promised to connect us becomes a weapon to tear us apart.
It’s tempting to frame the existential risk from AI as a purely technical challenge: a runaway superintelligence with goals that diverge from our own. While that remains a serious long-term concern, a more immediate and perhaps more probable danger stems from something far more familiar: human nature. A significant portion of the risk comes not from AI acting uncontrollably, but from humans using AI in reckless, short-sighted, or malicious ways. The core drivers of this risk are deeply embedded in our social and economic systems. Consider the intense race dynamics among corporations and nations to develop the most powerful AI. In a competitive environment, the pressure to deploy a system before others do can easily override safety considerations. The time and effort devoted to ensuring an AI is robustly safe trades off directly against the speed of development. This creates a race to the bottom in which even well-intentioned actors feel pressured to cut corners and accept higher levels of risk, lest they fall behind their rivals. Add to this the classic problem of externalities: an actor might rationally choose to deploy a risky system if the potential profits accrue to them alone, while the potential catastrophic costs are distributed across all of society. This is why reframing the project is so critical.
The true work is not just about writing better code; it’s a socio-political and moral undertaking. First, we must build a shared world model. Before we can get an AI to pursue "human values," we need a baseline agreement on reality. AI-powered disinformation directly attacks this foundation. We need to build resilient information ecosystems and foster a culture of critical thinking that can withstand the flood of synthetic content. Without a common ground of facts, any conversation about what is "good" becomes impossible. Second, we have to ensure widespread benefit. The promise of AI must be a promise for everyone. If its gains, whether economic, medical, or social, flow only to a small elite, the result will not be utopia but mass unemployment, social unrest, and global instability. We must design economic and governance structures, from universal basic income to new models of ownership, that set the vast majority of humanity up for success in an AI-augmented world. An AI pursuing the values of only one percent of the population is, by definition, at odds with humanity. Lastly, we need to create robust governance. Power-seeking is not just a theoretical risk for future AI; it’s a present reality for human institutions. AI will concentrate power in unprecedented ways.
Our current governance models, including flimsy "human-in-the-loop" oversight that is often just a compliance checkbox, are not built to handle this. Research on automation bias shows that humans tend to passively endorse AI outputs, deferring most when a model seems accurate overall, which means reviewers are least likely to intervene at precisely the moments when an algorithm's errors matter most. We need new international norms, regulatory bodies, and accountability frameworks that can manage this technology, prevent its misuse, and ensure that human control remains meaningful, not merely illusory.
The narrative of AI as our savior is appealing because it offers simple answers to complex problems. But the truth is that AI is not an answer; it is a question. It asks us what kind of future we truly want, and it forces us to confront the ways in which our current systems are falling short of that vision. The development of AI is a mirror, reflecting both the brilliance of human ingenuity and the depth of our collective folly. Its promise is inextricably linked to our own capacity for coordinated moral progress. We are, in a sense, building a second intelligent species on this planet, one that will be shaped by our goals, our biases, and our blind spots. The danger is not just that we might lose control of it, but that we might build it perfectly to execute our own worst impulses on a global scale. The path forward is not to halt progress, but to infuse it with a profound sense of humility and responsibility.
What future do we want, and how will we, together, guide this powerful new tool to help us reach it?
That is the real challenge, and the clock is ticking.
