This post provides an overview of this report.

Discussions of the existential risks posed by artificial intelligence have largely focused on the challenge of alignment: ensuring that advanced AI systems pursue human-compatible goals. However, even if we solve alignment, humanity could still face catastrophic outcomes from how humans choose to use transformative AI technologies.

A new analysis examines these "misuse risks": scenarios where human decisions about AI deployment, rather than AI systems acting against human interests, lead to existential catastrophe. This includes both intentional harmful uses (like developing AI-enabled weapons) and reckless deployment without adequate safeguards. The analysis maps out how such human-directed applications of AI, even when technically aligned, could lead to permanent loss of human potential.

The report identifies three broad categories of existential risk from AI misuse:

  • War Scenarios: AI could make wars both more likely and more destructive by enabling rapid weapons development, automating command and control systems, and creating new categories of WMDs. Near-term risks include AI-enhanced bioweapons and autonomous cyberweapons, while longer-term concerns involve technologies like self-replicating nanoweapons or automated weapons factories. The speed of AI-driven warfare could compress decision timelines and increase chances of unintended escalation.
  • Totalitarian Scenarios: AI could fundamentally reshape power dynamics through three reinforcing mechanisms: advanced surveillance capabilities, unprecedented persuasion and social manipulation tools, and increasing returns to centralization in an automated economy. This might occur through an existing state leveraging these advantages, a corporate entity seizing power, or gradual expansion of control in response to other AI risks. The combination of these factors could enable more stable and comprehensive authoritarian control than was historically possible.
  • Decay Scenarios: AI could introduce technologies or capabilities that fundamentally destabilize ordered society. This could occur through the proliferation of cheap asymmetric weapons like AI-enhanced bioweapons or autonomous drones, leaving society permanently vulnerable to disruption by small groups; through increased brittleness from over-dependence on automated systems, making cascading infrastructure failures more likely and harder to recover from; or through erosion of shared truth and social cohesion as AI systems enable unprecedented levels of misinformation and social manipulation. The common thread is that these technologies, once deployed, could irreversibly increase civilization's vulnerability to catastrophic collapse.

Among these, war scenarios emerge as perhaps the most concerning. Two factors drive this assessment: First, wars have historically been a common route for new technologies to prove destructive, providing clear precedent and understood pathways to catastrophe. Second, several AI-enabled weapons technologies appear technically feasible in the near term, particularly bioweapons and autonomous cyberweapons. Unlike nuclear weapons, these technologies may be relatively cheap to develop and hard to regulate effectively.

The analysis provides a systematic framework for evaluating different AI technologies based on factors like technical feasibility, development barriers, and potential for catastrophic outcomes. For example, while autonomous drones might seem worrying, their development faces significant hardware constraints. In contrast, software-based capabilities like AI-assisted bioweapon design or cyber operations face fewer practical barriers to development and may therefore pose more urgent risks.

Another, perhaps counterintuitive, finding is that automating military command and control systems might actually reduce catastrophic risk in some scenarios by making decisions more precise and considered, while simultaneously creating new risks through faster escalation dynamics and vulnerability to sophisticated attacks. The analysis also suggests that many of the most dangerous capabilities might be developed before full transformative AI, highlighting the importance of near-term governance.

The report also highlights how misuse risks interact with other challenges in AI development. Racing dynamics between nations or companies could incentivize rapid deployment of dangerous capabilities. Attempts to prevent misaligned AI could inadvertently create tools for surveillance and control. Understanding these dynamics is crucial for developing effective governance strategies that avoid backfire risks.

For a detailed examination of these risks and their implications for AI development and policy, see the full report. The analysis provides concrete recommendations for AI labs, policymakers, and others working to ensure safe development of transformative AI systems.

Ultimately, even perfectly aligned AI systems could enable catastrophic outcomes if deployed without adequate safeguards and coordination. As we race to solve technical alignment challenges, we must also develop frameworks to govern the use of increasingly powerful AI capabilities.
