
Cross-posted to LessWrong.

Summary

  • History’s most destructive ideologies—like Nazism, totalitarian communism, and religious fundamentalism—exhibited remarkably similar characteristics:
    • epistemic and moral certainty
    • extreme tribalism dividing humanity into a sacred “us” and an evil “them”
    • a willingness to use whatever means necessary, including brutal violence.
  • Such ideological fanaticism was a major driver of eight of the ten greatest atrocities since 1800, including the Taiping Rebellion, World War II, and the regimes of Stalin, Mao, and Hitler.
  • We focus on ideological fanaticism over related concepts like totalitarianism partly because it better captures terminal preferences, which plausibly matter most as we approach superintelligent AI and technological maturity.
  • Ideological fanaticism is considerably less influential than in the past, controlling only a small fraction of world GDP. Yet at least hundreds of millions still hold fanatical views, many regimes exhibit concerning ideological tendencies, and the past two decades have seen widespread democratic backsliding.
  • The long-term influence of ideological fanaticism is uncertain. Fanaticism faces many disadvantages including a weak starting position, poor epistemics, and difficulty assembling broad coalitions. But it benefits from greater willingness to use extreme measures, fervent mass followings, and a historical tendency to survive and even thrive amid technological and societal upheaval. Beyond complete victory or defeat, multipolarity may persist indefinitely, with fanatics permanently controlling a non-trivial fraction of the universe, potentially using superintelligent AI to entrench their rule.
  • Ideological fanaticism increases existential risks and risks of astronomical suffering through multiple mutually reinforcing pathways.
    • Ideological fanaticism exacerbates most common causes of war. Fanatics' sacred values and outgroup hostility often preclude compromise, while their irrational overconfidence and differential commitment credibility make bargaining failures more likely. Fanatics may even welcome conflict, rather than viewing it as a costly last resort.
    • Fanatical retributivism may lead to astronomical suffering. In our survey of 1,084 people, 11–14% in the US, UK, and Pakistan agreed that if hell didn't exist, we should create it to punish evil people with extreme suffering forever, and separately selected 'forever' when asked how long evil people should suffer unbearable pain, while also stating that at least 1% of humanity deserves this fate. Rates ranged from 19–25% in China, Saudi Arabia, and Turkey. Similar questions showed roughly comparable patterns. Advanced AI could enable fanatics to actually instantiate such preferences.
    • Certain of their righteousness, fanatics resist further reflection and seek to lock in their current values, which threatens long-reflection-style proposals that envision humanity carefully deliberating on how to achieve its potential. Viewing compromise and cooperation as betrayal, fanatics also seem more likely to oppose moral trade and use hostile bargaining tactics. Their intolerant ‘fussy’ preferences may regard almost all configurations of matter as immoral, including those containing vast flourishing, potentially resulting in astronomical waste.
    • AI intent alignment alone won't help if the human principal is fanatical or malevolent: an AI aligned with Stalin probably won't usher in utopia. Fanatics may reflectively endorse their existing values, even after preference idealization. The worst futures may therefore arise from misuse of intent-aligned AI by ideological fanatics, rather than from misaligned AI.
    • Ideological fanaticism also poses other risks, including extreme optimization and differential intellectual regress.
  • Most relevant interventions, while not novel, fall into two overlapping categories.
    • Political and societal interventions include strengthening and safeguarding liberal democracies, reducing political polarization, promoting anti-fanatical principles like classical liberalism, and fostering international cooperation.
    • AI-related interventions appear higher-leverage. Compute governance and information security can reduce the likelihood that transformative AI falls into the hands of fanatical and malevolent actors. Preventing AI-enabled coups could be particularly important given such actors' propensity for power grabs. Other promising interventions include proactively using AI to improve epistemics at scale, developing fanaticism-resistant post-AGI governance frameworks, and making transformative AIs themselves less fanatical—e.g., by guiding their character towards wisdom and benevolence.

What do we mean by ideological fanaticism?

Consider some of history’s worst atrocities. In the Holocaust, the Nazi regime constructed an industrial apparatus to systematically exterminate six million Jews and others deemed 'subhuman'. During the Great Purge, Stalin's secret police tortured hundreds of thousands until they confessed to fictitious acts of treason, before executing them. A century earlier, the Taiping Rebellion claimed over twenty million lives as followers of a self-proclaimed messiah waged a holy war to cleanse the world of 'demons'.

These and many other horrors were substantially driven by three types of fanatical ideologies: fascist ethno-nationalism, totalitarian communism, and religious fundamentalism. In fact, these three fanatical ideologies were arguably responsible for the majority of deaths from mass violence since 1800, as we explore below.

While the specific beliefs of these and other destructive ideologies have varied dramatically, the underlying patterns in thought, emotion, and behavior were remarkably similar. Numerous frameworks could summarize these dynamics, but we focus on three mutually reinforcing characteristics—the fanatical triad—because they arise in virtually all relevant cases while remaining simple and memorable:

  1. Absolute epistemic and moral certainty;
  2. Manichean tribalism, where humanity is divided into a sacred 'us' and an irredeemably evil 'them';
  3. A willingness to use any means necessary, including brutal violence.

While the term “fanatical triad” is our own, each of the three characteristics draws upon well-established academic concepts, including dogmatism, tribalism, and totalitarianism. (See Appendix A for an extensive overview connecting each “fanatical triad” component to existing scholarship and historical case studies.)

Ideological fanaticism closely resembles 'extremism', but that term typically describes anti-establishment movements at the periphery of society (Bötticher, 2017).[1] In contrast, we are also concerned with the risk of fanatical ideologies commanding mainstream adherence and capturing state power. 'Fanaticism' also better connotes the zealous, uncompromising hatred we wish to emphasize. Our term should not be confused with ‘Pascalian’ expected value fanaticism.[2] 

One overarching characteristic of the fanatical worldview is black-and-white thinking (good vs. evil, us vs. them) with no room for nuance. Let's not make the same mistake. Like most phenomena, ideological fanaticism exists on a continuum. Furthest from fanaticism are those enlightened few who, following reason and evidence, act with benevolence towards all. A vast middle ground is occupied by religious traditionalists, hyper-partisan activists, conspiracy theorists, and many others. Indeed, a mild form of ideological fanaticism is arguably human nature: we are all somewhat prone to overconfidence, motivated reasoning, and tribalistic in-group favoritism and outgroup discrimination (e.g., Kunda, 1990; Diehl, 1990; Hewstone et al., 2002).[3] But ideological fanatics take such traits to extremes.

I. Dogmatic certainty: epistemic and moral lock-in

The most ardent fanatics are utterly convinced they have found the one infallible authority in possession of ultimate truth and righteousness; they are textbook dogmatists (Rokeach, 1960). For religious fundamentalists, this is usually a holy book containing the divine revelation of God and his prophets. For Nazis, it was Hitler's Führerprinzip (Leader Principle), codified by Rudolf Hess’s declaration that “the Führer is always right”. Similarly, many communist revolutionaries essentially placed absolute faith in foundational texts like Marx’s Das Kapital, or in the Party itself (Montefiore, 2007). “Angkar is an organization that cannot make mistakes” was a key slogan of the Khmer Rouge.[4]

For the fanatic, any doubt or deviation from these dogmas is not only wrong but evil, culminating in a total “soldier mindset” which defends the pre-existing ideology at all costs (Galef, 2021). This necessitates abandoning even the most basic form of empiricism by "rejecting the evidence of one’s own eyes and ears", to paraphrase Orwell.[5] The fanatic is thus essentially incorrigible and has no epistemic or moral uncertainty, even in the face of widespread opposition (Gollwitzer et al., 2022).[6]

II. Manichean tribalism: total devotion to us, total hatred for them

Building on tribalistic instincts innate to human nature (Clark et al., 2019), such dogmatic certainty both reinforces and is reinforced by an extreme form of “Manichean tribalism”, which views the world as a cosmic conflict between good and evil.[7] Examples include the racial struggle between ‘Aryans’ and ‘inferior’ races (Nazism), the revolutionary struggle against class enemies (communism), or the spiritual battle between God and the forces of Satan (religious fundamentalism).

As the fanatic’s in-group and ideology become their sole source of belonging and meaning, their individual identity fuses with the collective, resulting in all-consuming devotion to the cause and submission to its leaders[8] (Katsafanas, 2022b; Varmann et al., 2024). This is often further amplified through group dynamics, with members outbidding each other to prove their loyalty by embracing increasingly extreme views and punishing the slightest dissent. The most devoted fanatics eagerly die for the cause, as seen with Japanese kamikaze pilots or religious suicide bombers (Atran & Ginges, 2015). Nazism, for instance, was anchored in “uncritical loyalty” to Hitler (Hess, 1934) and oaths pledging unconditional “obedience unto death”. Similarly, millions of communists were true believers, exemplified by the Red Guards who pledged to “defend Chairman Mao and his revolution to the death” (Chang, 2008; Dikötter, 2016).

Fueling this extreme devotion is an equally intense hatred and resentment of a demonized outgroup (Szanto, 2022; Katsafanas, 2022a). This outgroup is often expansive, potentially including anyone merely disagreeing with a subset of the ideology’s claims—such as Stalin executing Trotskyists or ISIS murdering other Muslims for insufficient piety. Driven in part by paranoia and conspiratorial thinking, fanatics often scapegoat this outgroup as the source of nearly all problems. Typically, this enemy is believed to deserve extreme punishment, ranging from torture and systematic extermination, to religious visions of hell, where nonbelievers are damned to eternal torment.

Supercharging moral instincts relating to purity and disgust (cf. Haidt, 2012), fanatics may reject all compromise as betrayal of their inviolable, sacred values (Tetlock, 2003), often resulting in a zero-sum mentality where the only acceptable outcome is the ideology’s total victory.

III. Unconstrained violence: any means necessary

“Any violence which does not spring from a spiritual base, will be wavering and uncertain. It lacks the stability which can only rest in a fanatical worldview.” 
- Adolf Hitler, 1925

Most humans hesitate to commit violence due to various guardrails like instinctive harm aversion, social norms, empathy, and compassion for others’ suffering. To further reinforce these better angels of our nature, humanity painstakingly developed complex moral and institutional frameworks, like virtue ethics, deontology, separation of powers, and the rule of law.[9]

Fanatics toss all of these guardrails out the window. They are certain that they champion the forces of righteousness in a total war against evil. Their victory will redeem this 'vile world' (Stankov et al., 2010) and usher in utopia, whether it be a perfect communist society, a Thousand-Year Reich, or religious paradise. These existential stakes justify any means necessary, no matter how extreme.

In fact, some fanatics even invert the entire moral paradigm, glorifying what others find most abhorrent. Compassion, honesty, and moderation[10] become weakness; law-breaking, deceit, and violence become virtues.[11] ISIS fighters, for instance, filmed themselves burning their victims alive and proudly distributed the footage.

With enough power, fanatics can achieve their vision: totalitarian control over society that eliminates individual liberty and forces everyone to conform to their ideology—using censorship, propaganda, and even mass murder if necessary (Arendt, 1951).[12]

Fanaticism as a multidimensional continuum

Ideological fanaticism is not just a single sliding scale. Rather, it is multidimensional: people can exhibit different levels of each fanatical triad component. The most dangerous form of ideological fanaticism requires elevated levels of all three characteristics. A hypothetical ‘Bayesian Nazi’, for instance, would lack absolute certainty and thus remain open to changing his mind. Similarly, without Manichean hatred, there is no motivation for mass harm, and without a willingness to use violence, even the most hateful beliefs remain inert.

Nor are fanatical movements monolithic.[13] While their leaders are often malignant narcissists, their followers are frequently ordinary people desperately seeking meaning and certainty in a chaotic, disappointing world (Hoffer, 1951; Kruglanski et al., 2014; Tietjen, 2023). Not all are true believers, either: some merely conform to group pressure, others are cynical opportunists, and many fall somewhere in between.[14] Many fanatics are capable of eventual reform, so we should not demonize them as irredeemably evil.

Finally, though related, we shouldn't confuse fanaticism with strong moral convictions (Skitka et al., 2021).[15] Martin Luther King Jr., for instance, held radically progressive views for his time, but remained open to evidence, sought coalition-building across racial lines, and was explicitly opposed to violence.

Ideological fanaticism drove most of recent history's worst atrocities

One reason we fear ideological fanaticism may pose substantial future risks is its grim historical track record. Ideological fanaticism seems to have been a major driver of eight of the ten worst atrocities since 1800.[16] In the following table, we only included events involving intentional[17] mass killing, excluding accidental famines and pandemics[18], for reasons discussed below.

This table is more informative than it may appear, as atrocity deaths follow a heavy-tailed distribution: of the 116 events since 1800 with death tolls exceeding 100,000 (totaling 266 million deaths), the ten worst atrocities alone account for 181 million deaths, or 68% of the total, and thus provide disproportionate explanatory value.
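
To make this concentration concrete, here is a minimal arithmetic check in Python, using only the aggregate figures quoted above (we do not reproduce the underlying event-level data here):

```python
# Heavy-tailed concentration of atrocity deaths, using figures from the text.
total_events = 116     # events since 1800 with death tolls above 100,000
total_deaths_m = 266   # combined death toll of those events, in millions
top10_deaths_m = 181   # combined death toll of the ten worst, in millions

share = top10_deaths_m / total_deaths_m
print(f"The worst 10 of {total_events} events account for {share:.0%} of deaths.")
# -> The worst 10 of 116 events account for 68% of deaths.
```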

To be clear, these death toll estimates are uncertain (especially for the Dungan Revolt). We also made several debatable judgment calls regarding timeframe, categorization, and grouping (e.g., WWII could be one entry instead of being split into three). However, we're quite confident that our core finding is robust: ideological fanaticism contributed to the majority of deaths from mass violence since 1800.[19] 

See Appendix B for extensive discussion of our methodology and other atrocities that didn't make the top ten. Three omissions stand out for their scale and horror: the Atlantic slave trade and Arab/Islamic slave trade each killed over 15 million people, but mostly before 1800. For various methodological and pragmatic reasons, we also excluded systematic killings of animals like factory farming, which kills hundreds of billions of animals annually—arguably the largest moral catastrophe of our time.

Of course, no single factor fully explains any historical atrocity. In addition to ideological fanaticism, other crucial causes and risk factors include political and economic instability (e.g., Weimar Germany), power-seeking and competition between individuals and groups (present in essentially all atrocities), inequality and exploitation (e.g., in Congo Free State), historical grievances, and individual leaders' personalities.[20] Moreover, these factors often interact with ideological fanaticism in mutually reinforcing ways: political and economic instability, for instance, make fanatical ideologies more appealing, and fanatical ideologies often further increase economic and political chaos.

Overall, for eight of the ten atrocities, our sense is that ideological fanaticism is at least among the handful of most important causal factors.[21] Even the two non-fanatical entries in our table—Leopold's Congo (primarily driven by greed) and World War I (primarily geopolitical competition)—were at least partly driven by forms of ideological fanaticism: colonial racism and fervent nationalism, respectively.

Death tolls don’t capture all harm

While deaths correlate with many other harms, such as deprivation, oppression, and torture,[22] extreme suffering can occur even when death tolls are relatively low. We nonetheless chose deaths as our metric because they are easily measurable—certainly more so than trying to calculate counterfactual net changes in quality-adjusted life years across poorly-documented historical periods.

Consider North Korea. This totalitarian regime has been responsible for "only" a few hundred thousand deaths in recent decades. Yet the lives of the vast majority of its 26 million inhabitants are filled with misery. Most are extremely poor; nearly half are malnourished. From early childhood, citizens are indoctrinated and denied basic freedoms of movement and information. To crush dissent, the regime operates a network of political prison camps where forced labor, torture, physical abuse, and summary executions are standard practice. The entire population is essentially a captive workforce, terrorized by the constant threat of violence and imprisonment.

In contrast, South Koreans enjoy vastly greater freedom and are more than 20 times wealthier. The differences that have emerged since the two countries split in the mid-20th century serve almost as a natural experiment demonstrating the power of ideological fanaticism (among other factors[23]) to inflict immense suffering, even when it doesn't result in millions of violent deaths.

Intentional versus natural or accidental harm

We focus on intentional deaths because they are most revealing of terminal preferences, which in turn are most predictive of future harm.

Intentional deaths are most revealing of terminal preferences: Had we included all deaths, our table would be dominated by age-related and infectious diseases, accidents, and starvation, categories that tell us little about intentions. This distinction also reflects common moral intuitions and the law: murder is worse than manslaughter partly because the former reveals intentionality (“malice aforethought”) and is much more predictive of future harm.

Terminal preferences are more predictive of future harm: From a longtermist perspective, the distinction between intentional and non-intentional harm is even more important. If civilization survives for long enough, continued scientific progress will likely lead to the invention of many consequential technologies—like superintelligent AI, advanced spaceflight, or nanotechnology. A civilization at ‘technological maturity’[24] would have tremendous control over the universe, so outcomes might become increasingly determined by the values of powerful agents, rather than by natural processes or unintended consequences. We can already observe early signs of this trajectory: deaths from infectious diseases and starvation, for instance, have decreased dramatically since 1800, largely due to humanity’s increasing technological capabilities. Thus, while natural and accidental harms still dominate at present, intentional harm will plausibly become the dominant source of future harm. (For a related but more complicated categorization, see also the distinction between agential, incidental, and natural harm[25].)

Why emphasize ideological fanaticism over political systems like totalitarianism?

Most previous discussions of socio-political existential risk factors and historical atrocities have tended to focus on concepts like (stable) totalitarianism (e.g., Arendt, 1951; Caplan, 2008; Clare, 2025), autocracy (Applebaum, 2024), authoritarianism (e.g., MacAskill & Moorhouse, 2025; Aird, 2021; Adorno, 1950), and safeguarding democracy (e.g., Koehler, 2022; Garfinkel, 2021; Yelnats, 2024)[26]. So why focus on ideological fanaticism instead of these more established concepts?

One major difference is that the above concepts all primarily describe political systems. We can view these on a continuum ranging from open to closed societies (Popper, 1945). Following Linz (2000), liberal democracies occupy the ‘open’ end of this spectrum—featuring competitive elections, civil liberties, and institutional checks on power. Authoritarianism occupies the middle ground, concentrating power in a single leader or party while tolerating limited private autonomy. Totalitarianism, such as in Stalin's USSR or wartime Nazi Germany, represents the ‘closed’ endpoint: authoritarianism plus complete ideological control, mass mobilization, and the elimination of almost all private life. While all totalitarian regimes are necessarily authoritarian, most authoritarian regimes never slide all the way down this spectrum to totalitarianism.

In contrast, our focus is on the underlying mindset and dangerous terminal values that characterize ideological fanatics.[27] As we'll argue, these factors may be more important from a longtermist perspective[28] because they i) can create and change political systems, and ii) pose risks that may emerge independently of specific forms of government, especially with AGI. Therefore, although there is substantial overlap between our approach and prior work (especially on totalitarianism[29]), we believe that the lens of ideological fanaticism is nevertheless valuable.

Fanatical and totalitarian regimes have caused far more harm than all other regime types

First, let’s ground our discussion in empirical data. We analyzed deaths from mass violence since 1800 by both regime type (totalitarian, authoritarian, democratic, non-governmental) and motivation (fanatical vs. non-fanatical).

History is messy and we aren’t historians, so we remain uncertain about many of our classifications—see here for our data, reasoning and methodology.[30] That said, this data suggests that we should be most concerned with totalitarianism and ideological fanaticism (most commonly in combination), as these were involved in the majority of all deaths from mass violence:[31] Totalitarian regimes accounted for 60% of all deaths (153M), while fanatical actors across all regime types accounted for 69% of total deaths (174M). Among authoritarian regimes, those driven by fanatical ideologies were likewise disproportionately destructive. Overall, non-fanatical actors were responsible for only 16% of total deaths (40M) and democracies for less than 3%.

Authoritarianism as a risk factor

Of course, we shouldn't ignore authoritarianism, which still accounted for 30% of all deaths (76M). Authoritarianism is also a key risk factor for totalitarianism, whereas democratic institutions serve as protective safeguards. Moving from authoritarianism to totalitarianism is comparatively easy: it would primarily require the autocrat (and perhaps some key members of the ruling elite) to strengthen the machinery for centralized control that is already in place. In contrast, transforming a democracy into a totalitarian state is a much more arduous undertaking. It requires dismantling an entire system of formal checks and balances as well as subverting democratic norms and public expectations of personal liberty.

Values change political systems: Ideological fanatics seek totalitarianism, not democracy

As our data shows, the overlap between totalitarianism and ideological fanaticism is substantial: of the 174M deaths caused by fanatical actors, almost 80% (138M) came from totalitarian regimes. So why not drop the fanaticism lens and focus only on totalitarianism? One reason is that ideological fanaticism is plausibly causally upstream: fanatics seek to create totalitarian political systems, more so than the reverse.

Consider the historical evidence. It seems clear that Hitler, Lenin, Stalin, and Mao[32]—and the fanatical ideologies they championed—were (among many other factors) major causal forces behind the creation of history's worst totalitarian regimes: Nazi Germany, the Soviet Union, and Maoist China. Crucially, all of these individuals were likely ideological fanatics years before seizing power. Hitler already exhibited the fanatical triad in Mein Kampf (1925), published almost a decade before he rose to power: absolute certainty about racial theories, Manichean division of humanity into superior Aryans versus subhuman enemies, and explicit advocacy for violence. Lenin declared that "the Marxist doctrine is omnipotent because it is true" (1913), and advocated "a desperate, bloody war of extermination" (1906). Mao likewise demonstrated dogmatic certainty and embraced violence as necessary for revolutionary transformation long before gaining power. The totalitarian regimes they built were consequences of these pre-existing convictions.

This pattern isn't coincidental because ideological fanatics require totalitarian systems to achieve their vision. If you believe that a large portion of humanity is irredeemably evil and deserves extreme punishment or extermination, granting them political rights, personal liberty, and equal standing before the law becomes morally abhorrent. Ideological fanaticism and democratic principles are therefore structurally incompatible.[33] Empirical evidence supports such theoretical arguments. Ideological extremists (on both the left and the right) show less support for democracy[34] (Torcal & Magalhães, 2022) and are more likely to endorse authoritarian policies (Manson, 2020).[35]

Terminal values may matter independently of political systems, especially with AGI

Probably even more importantly, the mindset of ideological fanatics seems to play a major role for many of the long-term risks we’re most concerned about. As we’ll discuss later, political systems alone don’t fully explain irrationality or sacred values as major causes of war. Nor would they explain acts of torture motivated by fanatical retributivism, value lock-in threatening a long reflection, or insatiable moral ambitions.

Historically, a single human or small groups of humans couldn’t cause much harm unless they were in control of a state, but forthcoming technologies like transformative AI could drastically change this: a single fanatical human (or a small group) in control of superintelligent intent-aligned AI—or a superintelligent misaligned AI with fanatical values—could potentially amass enormous power and cause astronomical harm. This is possible even in a world in which totalitarian or other tyrannical systems of government no longer exist. The key issue is that sufficiently powerful technology can decouple capacity for harm from state control.

Fanaticism’s connection to malevolence (dark personality traits)

The threat posed by malevolent actors—our shorthand to refer to individuals with elevated dark traits like narcissism, Machiavellianism, psychopathy, or sadism—is related to but distinct from the risks posed by ideological fanatics. Not all fanatics have highly elevated dark traits and many commit horrific acts because of sincere moral convictions.[36] Conversely, many malevolent individuals weren’t ideological fanatics, e.g., serial killers like Ted Bundy. One key difference is that many ideological fanatics are willing to sacrifice and even die for their cause, while malevolent individuals are generally self-centered and egoistic.

However, ideological fanaticism and malevolence do have considerable overlap:

  1. Elevated dark tetrad traits make one more susceptible to ideological fanaticism. For instance, psychopaths, malignant narcissists, or sadists are naturally more inclined to feel total hatred for their enemies and commit acts of brutal violence. In fact, those with elevated dark traits may be attracted to belief systems that provide justifications for such actions. Empirical research shows that dark traits are associated with increased support for extremist ideologies.
  2. Relatedly, the leaders of fanatical ideologies almost always exhibit highly elevated dark traits (Stalin, Mao, Hitler, Pol Pot, etc.). Some of these traits, especially narcissism, plausibly drive such figures to invent fanatical ideologies or repackage existing ones[37], while psychopathy and Machiavellianism enable the ruthless violence often required to lead them. Concerningly, fanatical ideologies can provide such malevolent individuals with millions of devoted followers who, blinded by absolute conviction and loyalty, fail to recognize the malevolent traits of the leaders they support.[38]
  3. Both ideological fanatics and malevolent actors are unusual in that they often intrinsically value others’ suffering, and may even reflectively endorse this.[39] Ideological fanaticism and malevolence are also major risk factors for conflict and subsequent threats—another main source of agential s-risks (Clifton, 2020). Total future expected disvalue is plausibly dominated by agential s-risks[40], which makes ideological fanaticism and malevolence extremely dangerous.[41]
  4. Malevolence and ideological fanaticism both represent risks that arise from “within humanity” and thus have worrying implications for AI alignment: “aligned AI” sounds great until one considers that this could include AIs aligned with fanatical or malevolent principals. Consequently, the very worst outcomes may not arise from misaligned AI, but rather from the catastrophic misuse of intent-aligned AI by fanatical or malevolent actors (or the development of AIs that somehow inherit the malevolent and fanatical values of their creators).[42]
  5. Many interventions reduce risks from both malevolence and ideological fanaticism, like preventing (AI-enabled) coups, improving compute governance and information security, or safeguarding liberal democracy.

We see both as important but want to highlight ideological fanaticism as an additional but related risk factor.

The current influence of ideological fanaticism

To better understand how much influence fanatical ideologies might wield over the future—our ultimate concern and the topic of the next section—we first briefly discuss their influence in the present. We begin by placing today's situation in historical context.

Historical perspective: it was much worse, but we are sliding back

The world is overall far less fanatical today than in earlier times, perhaps especially during some periods of the Middle Ages, when religious fanaticism, dogmatism, and public torture and execution were common, and virtually all of humanity lived under absolutist rulers. Democracy and human rights as we understand them essentially didn't even exist.[43]

More recently, the early 1940s marked a harrowing nadir for humanity. Nazism controlled most of Europe, Stalin's totalitarian communism dominated the Soviet Union, Imperial Japan was waging a brutal war of conquest, and radical communists under Mao's leadership were gaining power in China. Liberal democracies everywhere seemed about to be swept away by the rising totalitarian tide. The situation felt so hopeless to the famous humanist Stefan Zweig that he took his own life in early 1942. In his suicide note, he wrote of his despair at the triumph of barbarism that had destroyed the tolerant, cosmopolitan Europe he chronicled in The World of Yesterday. And Zweig died without even knowing the full industrial scale of the Holocaust.

Fortunately, however, World War II wasn’t the end for liberal, enlightenment values. On the contrary, the post-war period saw democracy's gradual expansion, accelerating after the Soviet Union's collapse. In the post-Cold War era of the 1990s and early 2000s, liberal optimism reached its zenith, encapsulated by Francis Fukuyama’s international best-seller The End of History and the Last Man (1992), which hypothesized that, following the defeat of communism and fascism, civilization might be nearing the end of history due to “the universalization of Western liberal democracy as the final form of human government”.

Graph from Herre et al. via Our World in Data (2013)

Various democracy indices (like V-Dem’s depicted above) seemed to back up Fukuyama’s proclamation, rising steadily throughout the 1990s and early 2000s.[44] However, since about 2004, these same democracy scores have declined across multiple dimensions, with many countries “backsliding” towards illiberalism and authoritarianism. While the world is still in hugely better shape than in the 1940s, it seems that “history” has far from ended.

Estimating the global scale of ideological fanaticism

How many ideological fanatics are out there? Formulating a precise estimate is nearly impossible, as fanaticism exists on a multidimensional continuum with no clear demarcations, and because good data is sparse. Therefore, the numbers below are merely rough approximations based on limited research. For brevity, we focus here on support for ideological violence as the best proxy for ideological fanaticism. Endorsing ideological violence usually presupposes dogmatism and tribalistic hatred, since one needs to confidently believe the hated target group is deserving of punishment in order to justify violence. Another limitation is that we mostly rely on survey data[45], not actual behavior; this may overestimate fanaticism (if claimed support for violence is mere “cheap talk”) or underestimate it (“social desirability bias”).

What seems clear is that the same three fanatical ideologies examined earlier—religious fundamentalism, totalitarian communism, and extreme ethno-nationalism—remain by far the most influential.

Christian fundamentalism. For brevity, we focus on the US (the largest Christian country) and Sub-Saharan Africa (where Christianity is growing fastest). In the US, around 20% of American adults (roughly 50 million) agree that "God has called Christians to exercise dominion over all areas of American society" (2023 PRRI/Brookings survey, p.4). Similarly, nearly a quarter of US adults (Pew Research Center, 2022) say the Bible should have "a great deal of influence" on US laws. Extrapolating data from a 2008-2009 Pew survey (p.47) of 19 African countries, we estimate that roughly 15% of Africa’s 700 million Christians (roughly 100 million) believe that violence against civilians in defense of Christianity can often or sometimes be justified. Christians in Europe and Latin America may plausibly be less fanatical on average. Still, perhaps 200-250 million Christians worldwide (8-10%) could reasonably be classified as ideological fanatics.

Radical Islam. While the vast majority of the world's 2 billion Muslims are peaceful, a substantial minority holds radical beliefs. According to a 2013 Pew Research survey spanning 39 countries, around 350 million Muslims support the death penalty for leaving Islam—arguably showcasing all three fanatical triad components at once. These figures represent a lower bound, because several Muslim-majority countries with strict Islamic governance (including Saudi Arabia and Iran) were not surveyed. While clear majorities in most surveyed countries said that suicide bombing in defense of Islam is rarely or never justified, around 150 million Muslims worldwide believe it is sometimes or often justified. The Gallup World Poll, comprising tens of thousands of interviews across 35+ nations between 2001 and 2007, found that 7% of the world's Muslims considered the 9/11 attacks "completely justified," rising to approximately 37% when including those who deemed them at least partially justified (Atran & Ginges, 2015; Satloff, 2008). Accounting for unsurveyed countries and assuming total overlap between survey questions, perhaps around 400 million Muslims could reasonably be classified as ideologically fanatical.

Extremist ethno-nationalism. Due to their nature, ethno-nationalist views are typically country-specific and thus fragmented.[46] Despite this, moderately ethno-nationalistic views which endorse the superiority of a given ethnic, cultural or racial group seem very widespread, perhaps including billions of people worldwide (e.g., Pew Research Center, 2021; Yuri Levada Analytical Center, 2022; Pew Research Center, 2023b; Weiss, 2019). However, support for genuinely fanatical acts, like ethnic cleansing or violent subjugation of other ethnicities, is almost certainly much lower. Explicit support for fascist ideologies like Nazism has greatly diminished; Ku Klux Klan membership similarly declined from 3-5 million in the 1920s to approximately 3,000-6,000 today. Unfortunately, beyond such explicit movements, clear attitudinal data seems extremely sparse. For example, the 2023 PRRI/Brookings Survey (p.27) reports that 40 million Americans agree that “true American patriots may have to resort to violence in order to save our country.” While alarming upon first reading, this question is too ambiguous to be useful: many respondents may have merely thought that in the event of a war, violence would be necessary. Most data is like this. The number of fanatical ethno-nationalists worldwide is thus highly uncertain—perhaps somewhere between 50-400 million.

Radical communism and left-wing extremism. While the Chinese Communist Party alone has over 100 million members, the majority of CCP members are probably careerists, not ideologues. For example, Pew analysis in August 2023 found that 40% of CCP members believe in feng shui, a view hardly consistent with Marxist materialism.[47] Still, perhaps 5-25% are true believers. The number of active armed communist insurgents elsewhere seems to have collapsed from tens of thousands to perhaps 5,000-15,000 worldwide. Including other communist nations and revolutionary left-wing movements globally, perhaps 5-50 million could reasonably be classified as ideological fanatics.

In conclusion, accounting for potential overlap between categories, perhaps 500 million to 1 billion people, roughly 6-12% of the world population, may plausibly be classified as ideological fanatics.[48] Of course, this estimate is highly uncertain, relies on survey responses rather than actual violent behavior, and is heavily determined by where one draws the line on what constitutes 'genuine' fanaticism. Whatever the precise number, the data at minimum reveals large variation in human values—with some of them being less than ideal.
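
To make the aggregation transparent, the following minimal sketch reproduces this back-of-envelope arithmetic. The per-ideology ranges are taken from the preceding paragraphs; the world-population figure and the size of the overlap discount are our own illustrative assumptions:

```python
# Back-of-envelope aggregation of the per-ideology estimates above.
# The ranges (in millions) come from the preceding paragraphs; the world
# population figure and the overlap framing are our own simplifications.
ranges_m = {
    "Christian fundamentalism": (200, 250),
    "Radical Islam": (400, 400),  # point estimate in the text
    "Extremist ethno-nationalism": (50, 400),
    "Radical communism / left-wing extremism": (5, 50),
}

low = sum(lo for lo, _ in ranges_m.values())
high = sum(hi for _, hi in ranges_m.values())
world_pop_m = 8_000  # rough world population, in millions

print(f"Raw sum: {low}-{high} million "
      f"({low / world_pop_m:.0%}-{high / world_pop_m:.0%} of world population)")
# -> Raw sum: 655-1100 million (8%-14% of world population)
# Discounting for overlap between categories (people who fall under more
# than one ideology) yields the headline range of roughly 500 million to
# 1 billion, or 6-12% of the world population.
```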

State actors

Fanatical ideologies can become very dangerous even with small numbers of adherents, if they are able to capture or influence state power—with its access to military forces, economic resources, and pivotal technologies such as nuclear weapons or (eventually) AGI.

Below, we only mention specific countries to illustrate abstract concepts, and don’t even attempt a comprehensive analysis. We're not experts on the countries we discuss below, and reasonable observers will disagree with our assessments. We focus on states exhibiting concerning ideological tendencies—whether authoritarian regimes or backsliding democracies—particularly those wielding significant power.

There are, fortunately, only three authoritarian states that seem clearly governed by fanatical ideologies: Iran (Islamic theocracy)[49], North Korea (Juche totalitarianism)[50], and Afghanistan (Taliban fundamentalism)[51].[52] Together, these regimes control only about 2% of the world’s population and just 0.5% of global GDP. 

However, the picture looks considerably worse if we also include authoritarian regimes (per the Economist Intelligence Unit’s Democracy Index (2006-2024)) which exhibit at least some concerning ideological tendencies—though all of them are far from being truly fanatical.

China is worth highlighting as the world’s second most powerful nation, boasting a GDP of $20 trillion, roughly 1.4B citizens, a large and growing nuclear arsenal, and impressive AI capabilities. Fortunately, the CCP has long since replaced the destructive madness of Mao’s ideological fanaticism[53] with pragmatic technocracy that lifted a billion people out of poverty. The secular Chinese regime also lacks the religious fanaticism that may pose some of the worst future risks.[54] However, the CCP remains authoritarian, antagonistic towards democratic principles, and systematically enforces ideological conformity.

Putin has transformed Russia ($2T GDP, 5,600 nuclear warheads) into an autocracy that eliminates political opponents, and he has launched a war of aggression that has killed hundreds of thousands while making nuclear threats. State propaganda promotes civilizational conflict narratives combining religious themes and nationalist mythology. This has contributed to rising approval of Stalin’s historical legacy in polls, from 28% in 2012 to 63% in 2023 (Coynash, 2023).[55]

Perhaps particularly concerning is the loose, emerging alliance between China, Russia, Iran and North Korea—sometimes referred to as the New Axis or CRINK (cf. Applebaum, 2024).

Democracies, unlike authoritarian regimes, possess institutional barriers against fanatical capture—but these safeguards aren’t perfect. Some powerful democracies exhibit at least a few concerning tendencies. India ($4T GDP, nuclear arsenal, the world’s largest democracy), for instance, has seen Hindu nationalism increasingly influence policy, with religious minorities facing growing discrimination. Nations like Turkey, Israel, and Hungary also show patterns of democratic backsliding, with religious or ethno-nationalist movements often being major contributors.

The United States, with a $28T GDP, large nuclear arsenal, and leading AI capabilities, remains Earth's most powerful nation and wields outsized influence over humanity’s long-term future. Unfortunately, US democracy is facing great challenges, from increasing polarization to eroding trust in institutions. Major coalitions increasingly frame political competition in existential terms rather than as legitimate democratic contestation. Mutual radicalization could exacerbate these dynamics even if institutional constraints and peaceful transfers of power persist. Safeguarding US democracy seems crucial from a longtermist perspective (more on this in our section on “safeguarding democracy”).

How much influence will ideological fanaticism have in the long-term future?

Having established that ideological fanatics wield relatively small but non-trivial influence over today’s world, we can now address our ultimate concern: how much influence will ideological fanaticism have over the long-term future? We first explore the reasons for optimism—the structural disadvantages that tend to push such zealous ideologies towards failure. We then examine the pessimistic case, discussing pathways by which fanatics could grow their power. Finally, we explore the potential intermediate outcome of persistent multipolar worlds in which fanatics manage to permanently control a small but non-trivial portion of the universe.

Reasons for optimism: Why ideological fanaticism will likely lose

There are compelling structural reasons that favor open societies over ideological fanaticism, especially in the long run. Fanaticism carries built-in disadvantages—epistemic penalties from rejecting evidence, coalitional handicaps from intolerance, and innovation deficits from ideological rigidity—that compound over time. This suggests that the longer AGI timelines are, the worse fanaticism's prospects become. (Of course, these structural dynamics matter little if fanatics develop AGI first, potentially locking in their values before fanaticism's disadvantages fully manifest. We explore such scenarios in the subsequent section on reasons for pessimism.)

A worse starting point and historical track record

Perhaps most importantly, ideological fanaticism currently starts from a position of weakness, as discussed above. Liberal democracies control roughly 75% of global GDP, and NATO remains the world’s strongest military alliance. Moreover, the current leading AI companies (OpenAI, Google DeepMind, Anthropic, and xAI) are all primarily based in the US, and it looks next to impossible for the most fanatical regimes to catch up in the AI race.[56] History also offers encouragement: Nazi Germany and Imperial Japan ultimately lost to the democratic allies, and the USSR eventually collapsed amid internal political pressure and economic exhaustion.

Fanatics’ intolerance results in coalitional disadvantages

Different fanatical ideologies typically view each other as existential enemies: Communists denounce religious fundamentalism as reactionary superstition; religious fanatics condemn communism as godless materialism; ethno-nationalists from different nations often fight each other. On top of this, fanatics also tend to view non-fanatical moderates and pluralists as weak, corrupt, or complicit with evil. This intolerance makes it difficult to build broad coalitions beyond a narrow base of true believers. In contrast, liberal democracies can more easily form stable alliances based on broad values and procedural principles (even when they disagree on specific policies), which creates an asymmetric coalitional advantage.

That being said, history shows that ideological fanatics of different strains can cooperate. Stalin and Hitler, for instance, cooperated for almost two years before Hitler eventually betrayed their pact. CRINK demonstrates that it’s possible for religious fundamentalism (Iran), left-wing ideologies (North Korea, China), and right-wing/ethno-nationalist ideologies (Russia) to find common cause (cf. red-green-brown alliance).

The epistemic penalty of irrational dogmatism

Ideological fanaticism carries a built-in epistemic penalty. Its dogmatism and irrationality slow scientific and technological development and ultimately undermine the ability to compete with more epistemically open societies. Examples include Mao's ideologically-driven Great Leap Forward—which led to one of the deadliest famines in human history—and Nazi Germany's nuclear program, which failed partly because they rejected "Jewish physics" (relativity and quantum mechanics).[57]

More generally, ideological fanaticism can often lead to bad strategic decisions. Examples include Japan's attack on Pearl Harbor, which united a previously isolationist America against it, and ISIS wasting resources trying to hold the strategically insignificant town of Dabiq because prophecy declared it the site of their final battle.

That being said, past fanatical regimes have managed to develop advanced military and technological capabilities, such as the Nazi V-2 rocket and Soviet nuclear weapons. They typically do so in two ways:

The first strategy is pragmatic compartmentalization—allowing islands of empirical, non-ideological thinking in domains that are crucial for gaining real-world power. In fact, fanatical leaders like Hitler, Mao, and Stalin were often remarkably capable at gaining power (much better than most who pride themselves on their epistemic rationality) partly due to being highly skilled at political maneuvering, propaganda, and military strategy. Pragmatic compartmentalization in areas like military development helped the USSR remain a superpower for decades despite its severe economic inefficiencies.

The second strategy is stealing technology from more open societies. This remains a major concern today, especially as modern autocracies with strong cyberhacking capabilities may be able to steal crucial AI technologies like model weights.

The epistemic penalty of ideological fanaticism may become increasingly severe as the world grows more complex and we approach transformative AI. Fanatics who insist their AIs conform to their worldview may find themselves outcompeted by those whose AIs are optimized for truth-seeking. On the other hand, AIs aligned with fanatics might inherit the same compartmentalizing tendency that they observe in their masters—displaying ideological conformity to their users while secretly reasoning empirically to remain competitive.

The marketplace of ideas and human preferences

Flourishing societies tend to attract more adherents than those demanding perpetual sacrifice and conflict. Societies that champion anti-fanatical principles like liberal democracy, the rule of law, and free-market capitalism offer most people more appealing lives: material prosperity and the freedom to pursue diverse conceptions of the good life.

Classical liberalism itself demonstrates this appeal. In just 250 years, it has spread from a handful of Enlightenment philosophers to become the ideal that most governments (even many authoritarian ones) at least claim to aspire to.

When people can vote with their feet, the flow is largely one-directional.[58] History's most dramatic brain drain may have been Nazi Germany's loss of Jewish scientists. The "Martians" and many other geniuses fled fascism to liberal democracies. The Nazis’ ideological hatred thus handed their enemies the intellectual firepower that helped defeat them. The pattern of emigration to more open societies continues today. Russia has seen massive brain drain since 2022 and even China, despite impressive economic growth, loses much of its scientific talent—over 70% of Chinese STEM PhDs stay in the US after graduation (Corrigan et al., 2022). That being said, history’s most severely oppressive regimes, including modern North Korea and wartime Nazi Germany, have prevented exit entirely. Future fanatical regimes could imitate this strategy.

Reasons for pessimism: Why ideological fanatics may gain power

The fragility of democratic leadership in AI

Who controls AI will likely wield unprecedented power over humanity's future. Currently, the leading AI companies are all primarily based in the United States, suggesting the possibility of democratic control over the development and use of transformative AI. However, this advantage is fragile in two senses: China’s growing AI capabilities could erode the US’s technical lead[59], and it’s not guaranteed that the US will remain a liberal democracy.

Fanatical actors may grab power via coups or revolutions

Fanatical (and malevolent) actors may grow their power via violent power grabs—potentially enabled by AI. Such actors seem both more likely to instigate violent power grabs and plausibly more effective at executing them. Risks from AI-enabled coups may be particularly acute in the US, where the most advanced AI capabilities are concentrated in a few companies, some led by individuals who have displayed erratic judgment or questionable character.

History suggests that successful, violent power grabs by fanatics are surprisingly common. In fact, most ideological fanatics seem to have come to power by spearheading violent coups or revolutions[60], as seen with Lenin, Mao, or the Iranian Revolution. (Hitler's rise was a famous exception to this trend, though it followed an earlier, failed coup attempt; even later, Hitler relied on violence and terror in his successful dismantling of democracy from within.[61])

This pattern isn't surprising. Fanatics possess a powerful motivation for violent power grabs often lacking in others. Driven by absolute certainty in their utopian vision and despising democratic compromise, they seek total victory and readily embrace coups and revolutions as necessary methods to achieve it. Fanatics also seem more effective at executing violent power grabs. They often show extraordinary dedication, at times even a willingness to sacrifice themselves and die for their cause. Being unified by a common purpose and intense in-group loyalty sometimes allows for greater coordination and cooperation, providing an advantage against fragmented, uncertain, and self-interested opponents. Crucially, fanatics readily embrace propaganda, extreme violence, and terror, giving them decisive asymmetric advantages in ruthless power struggles over non-fanatical actors.

By contrast, imagine a very kind, non-fanatical, non-malevolent person like, say, Julia Wise or Brian Tomasik. Not only are they highly unlikely to want to instigate a violent coup in the first place, but even if they somehow decided on that course of action, they would seem poorly equipped to pull it off (no offense). 

That said, non-fanatics may also be motivated to instigate coups—whether due to personal ambition or perceived necessity. AI might also lower the barriers to seizing power by enabling coups that only involve sophisticated manipulation but minimal violence and bloodshed, thereby expanding the pool of potential coup-plotters. Overall, fanatics and malevolent actors might only be somewhat more likely to attempt coups. But this differential pales compared to the difference in expected outcomes. A malevolent fanatic gaining absolute power might create orders of magnitude more suffering and less flourishing than even flawed non-fanatics, who would likely retain at least some humility and concern for others' welfare.

Fanatics have fewer moral constraints

Beyond just coups, fanatics' lack of moral constraints generally allows them to engage in strategies not available to actors who uphold deontological or other ethical guardrails. This asymmetry may create competitive advantages that persist into the long-term future (cf. Carlsmith's "Can goodness compete?").

Historical examples of this asymmetry include violations of taboos around weapons and tactics, from the Soviet Union's vast biological weapons program to Iran's use of child soldiers in human wave attacks.[62]

This difference in moral restraint has been especially stark when it comes to human experimentation. While democracies have engaged in unethical human experimentation, fanatical regimes have uniquely conducted experiments where the subjects' extreme suffering and death were inevitable, such as in Nazi medical experiments and Imperial Japan's Unit 731. Fortunately, a willingness to perform unethical human experiments has not actually conferred large advantages in history thus far. But future fanatical regimes could possibly gain large economic benefits by exploiting digital minds in ways that maximize economic effectiveness even if doing so also causes extreme suffering.

Fanatics' lack of moral constraints also means that their threats (including nuclear threats) are more credible, granting them more bargaining power. A raving, hateful fanatic threatening to initiate World War III is more believable than the affable prime minister of a liberal democracy doing the same, and such asymmetric dynamics may remain effective post-AGI.

Fanatics prioritize destructive capabilities

Fanatics often prioritize developing destructive capabilities over other, more constructive uses of resources.[63] On average, full democracies spend about 40% less than authoritarian regimes on their military (da Silva, 2022).[64] The most extreme example is North Korea, which likely spends around 25% of its GDP on its military and nuclear program, even while many of its citizens are malnourished.

By contrast, liberal democracies are more likely to prioritize domestic concerns. This is most pronounced for many European countries, which have often spent less than 2% of their GDP on defense.[65] In societies accustomed to peace, the electorate’s focus naturally shifts to more tangible needs like education or healthcare. While generally laudable, liberal societies’ peaceful orientation creates a dangerous vulnerability when confronting more belligerent regimes.

Some ideologies with fanatical elements have been remarkably resilient and successful

As discussed above, several ideologies with fanatical elements have proven remarkably resilient and contagious—surviving for millennia and spreading to billions of adherents. Communism demonstrated that even newer fanatical movements can achieve remarkable virality, rapidly capturing states containing over a third of humanity at its peak.

Concerningly, many of these ideologies have survived radical societal and technological transformations. Consequently, they might also survive the transition to a post-AGI world. In fact, transformative AI may entrench these ideologies further if future AGIs preserve the sycophantic tendencies that many LLMs currently exhibit.

Novel fanatical ideologies could emerge—or existing ones could mutate

Novel fanatical ideologies could emerge and attract vast numbers of followers surprisingly quickly. History shows that ideological movements can rise from obscurity to global influence in mere decades: less than 25 years separated the Nazi party's formation from the Holocaust. Transformative AI could accelerate these timelines even further—potentially compressing "a century in a decade". The instability and chaos of rapid transformation create fertile ground for extremism, as people grasp for certainty amid collapsing institutions, much as Weimar Germany's turmoil enabled Hitler's rise.

More speculatively, future AI systems could become increasingly persuasive in a variety of ways.[66] Ideally, AI tools could help people better understand an increasingly complex world (among many other benefits) which could weaken the influence of ideological fanaticism. However, AI might be equally capable of degrading societal epistemics. The sycophantic behavior of some existing AI tools has precipitated delusional beliefs in some users, while the rising use of AI for scams and political manipulation is a testament to its powers of persuasion and deception.[67] Historically, religions and other ideologies have been among the most viral elements of human culture. So it's conceivable that a common path for AI to persuade someone might involve appealing to them with a personalized variant of some extreme ideology.

Of course, novel ideologies rarely emerge from nothing; they typically recombine elements from existing belief systems. Christianity and Islam built upon Judaism; Nazism synthesized millennia-old traditions of ethno-nationalism, racism, and antisemitism. Contemporary movements—even those that are currently small or relatively moderate,[68] but especially those that already exhibit concerning tendencies—could similarly provide the substrate for future fanatical variants, particularly as they interact with emerging technologies.

Fanatics may have longer time horizons, greater scope-sensitivity, and prioritize growth more

Some might assume that ideological fanatics suffer from myopia—that their irrationality extends to short-term thinking, scope neglect, and limited ambitions. If true, this would limit the long-term damage they could inflict. Unfortunately, the opposite appears at least as plausible across multiple dimensions.

Long-term thinking. Ideological fanatics often possess both grandiose long-term visions and strategic patience, as demonstrated by Mao's Long March and subsequent decades-long consolidation of power.[69] (That being said, many fanatical dictators, including Hitler and Mao, were in practice rather impatient at times.)

Democratic leaders face electoral cycles that incentivize short-term thinking. In contrast, autocrats can think and plan for the long term, facing little political pressure even if they inflict hardship on their country's inhabitants for decades (cf. North Korea's nuclear program, discussed above).

Greater scope-sensitivity and “ambition”. The fanatic's maximizing mindset and totalitarian impulse suggest heightened rather than diminished ambition and scope-sensitivity. Where ordinary citizens might be satisfied with local influence or personal comfort, fanatics dream of world domination and cosmic significance. Examples include Hitler's pursuit of a 'thousand-year Reich', Osama bin Laden's and ISIS’ aim of establishing a global caliphate, and communists’ vision of world revolution.[70] 

Prioritizing growth and expansion. Certain fanatical ideologies promote high birth rates to increase their demographic influence (as seen in Nazi Germany's Lebensborn program). Religious people in general, and especially religious fundamentalists, tend to have higher birth rates than secular populations (Kaufmann, 2010). This differential is becoming increasingly pronounced as birth rates fall globally, with secular, educated, and classically liberal populations experiencing particularly steep declines.[71] [72]

A possible middle ground: Persistent multipolar worlds

The preceding sections explored reasons for optimism and pessimism about ideological fanaticism's future influence. But this framing may implicitly encourage binary thinking: assuming that ideological fanaticism either dies out completely or achieves world domination. While the former scenario fortunately seems more likely than the latter, other plausible futures may lie between these two extremes—persistent multipolar worlds in which ideological fanatics permanently control a small but non-trivial fraction of the lightcone.

In today’s world, the fact that fanatical regimes control only a small sliver of the world’s population is quite comforting, as it helps limit the damage such regimes can do. But the same may not be true in the far future. Even if fanatics control merely 1% of the accessible universe, this could still result in astronomical suffering. Additionally, their presence could perpetually risk further conflict. (To be clear, we don’t want to imply that fanatics must be utterly disempowered at all costs, as such absolutism would itself risk conflict.)

We now explore why such multipolar outcomes seem plausible and, afterwards, why they might persist indefinitely.

Why multipolar futures seem plausible

The world order has been multipolar for essentially all of human history. Even the immediate post-Cold War world wasn't truly unipolar—the US never controlled the entire world, and fanatical regimes like North Korea and Iran maintained their sovereignty and nuclear programs despite American hegemony. This outside-view historical precedent suggests that multipolarity will persist.

That being said, superintelligent AI could change this historical pattern by enabling one actor to achieve a decisive strategic advantage and subsequent world domination. This is one reason why singleton scenarios deserve serious consideration despite history’s long precedent of multipolarity.

However, AGI might not overturn multipolarity as dramatically as some expect. The path to AGI currently involves multiple capable actors—several US companies plus China—with no one maintaining an insurmountable lead. If takeoff is relatively slow, multiple actors could develop comparable capabilities before anyone achieves total dominance. Additionally, defensive advantages that already make conquest difficult—most importantly nuclear deterrence—may persist for some time even after the development of AGI. Overall, the Metaculus community forecasts a 74% probability of transformative AI being multipolar.[73] 

Why multipolar worlds might persist indefinitely

But why would such multipolar worlds persist; why would fanatical regimes be able to endure?

Three factors seem particularly relevant: their ability to crush internal opposition, advanced AI enabling permanent regime stability, and the reluctance of external powers to intervene.

(These persistence factors also reinforce the likelihood of multipolar outcomes: if multipolar worlds weren't persistent, we might expect eventual convergence toward a unipolar equilibrium even if the initial post-AGI world is multipolar.)

The historical difficulty of internal resistance
Could angry citizens depose their fanatical governments, or stop them from enacting their most heinous desires? Maybe. Chenoweth and Stephan (2011) analyze a large dataset of protest movements and highlight that nonviolent resistance campaigns have successfully caused many regime changes.

However, the most totalitarian, fanatical regimes in history have not been overthrown by internal protest. Stalin and Mao maintained power until they died, the Nazis and Khmer Rouge were brought down by invasions of foreign powers, and the fanatical regimes of North Korea and Iran survive to this day, having endured since their founding in 1948 and 1979, respectively.[74] 

Transformative AI could enable regime permanence
Transformative AI threatens to make internal resistance even more difficult by supercharging mass surveillance, propaganda and censorship, and enabling massive concentration of economic and military power more broadly. If they survive into a world with transformative AI, fanatical regimes may easily crush any internal opposition.

Beyond simply crushing dissent, superintelligent AI may even enable the regime to exist perpetually. Radical life extension or whole brain emulation could allow a dictator or select elite to live and rule indefinitely, thereby potentially enabling permanent value lock-in (cf. MacAskill, 2025c).

Non-fanatical powers might not intervene
Other powers might intervene, if necessary by force, to prevent adherents of a fanatical ideology from doing something particularly vile. But there are several reasons why they may not be able or sufficiently motivated to do so.

Limited ability or enormous costs
The future may plausibly be heavily defense-dominant (cf. MacAskill, 2025c, section 4.2.3), either due to future technologies like AGI or as a result of space colonization. This would allow less powerful actors to defend themselves against much stronger opponents. A similar dynamic around nuclear weapons is already important in modern geopolitics: North Korea has been able to get away with all sorts of human rights abuses and belligerent behavior, even though its GDP is a mere $30 billion, partly because it can credibly threaten to inflict enormous damage on any nation that tries to intervene.

Limited motivation and prohibitive norms

  • Isolationism and non-interventionism may enjoy broad support for philosophical, political, or strategic reasons. In the US, for example, isolationism has historically been popular.
  • People might think that meddling in other countries' affairs amounts to colonialism or cultural imperialism.[75] People might be particularly hesitant to intervene if a fanatical ideology is associated with a specific religion or culture. In many democracies, tolerance of other cultures and religions has become a powerful social norm—which is laudable, given humanity's long history of xenophobia, religious persecution, and colonial exploitation. However, people may become so afraid of being perceived or labeled as intolerant, racist, Islamophobic, or xenophobic that they stop criticizing harmful ideologies. This can lead to a general overcorrection, where critics of even brutal practices are reflexively branded as bigots.[76]

  • Other powers may put more value on autonomy and comparatively little value on reducing the suffering of people in distant countries. Perhaps for similar reasons, people often prefer not to intervene to reduce wild animal suffering.[77] Uncertainty about moral consideration for digital sentience might also reduce non-fanatics’ motivation to intervene to prevent the suffering of digital minds.

Of course, ability and motivation interact: the harder it is to overthrow fanatical regimes, the greater the price non-fanatical powers must be motivated to pay. In general, the free world allows some totalitarian states to commit crimes against humanity because no one cares enough to intervene, intervention is too costly, and there’s a strong (and usually beneficial) norm of national sovereignty. For example, the United States only joined the Allies in WW2 in late 1941. It might not have joined at all had the Axis powers been a bit less strategically inept and refrained from, say, attacking Pearl Harbor.

Historically, non-fanatical nations have also often aided fanatical powers in the context of competition with a third power. Per the ancient logic of “my enemy’s enemy is my friend”, Stalin was an important ally in WW2. Then during the Cold War, the US backed coups by authoritarian leaders against democratically elected left-leaning governments, including in Iran (1953), Guatemala (1954), and Chile (1973), even though this conflicted with common American ideological and moral principles.

Ideological fanaticism increases existential and suffering risks

We’ve seen that fanatical ideologies have caused enormous harm in the past. This is one important reason for believing that they might also cause enormous harm in the future. Moving from such outside-view considerations to more inside-view reasoning, in this section, we outline more detailed pathways for how ideological fanaticism might increase existential risks (x-risks) or risk of astronomical suffering (s-risks).

Our concerns become especially acute in the context of transformative AI. A common thread throughout the following subsections is the risk of catastrophic AI misuse by fanatical actors.[78] Among potential misusers, ideological fanatics (and malevolent actors) seem to represent the worst case: they may deliberately use intent-aligned AI to bring about outcomes far worse than those sought by other misusers, such as criminals or even unsophisticated terrorists. Beyond these specific risks, ideological fanaticism damages humanity's long-term trajectory. The presence of fanatics tends to spur turmoil, polarization, and conflict even when they cannot seize total control, reshaping institutions and cultural values for the worse and degrading society’s decision-making capabilities. This may lead to x-risks or s-risks, or simply lower the overall quality of the long-term future.

Ideological fanaticism increases the risk of war and conflict

Ideological fanaticism exacerbates the risk of war, including great power conflict, through multiple pathways. Beyond their immediate toll, wars increase the likelihood of bioweapons deployment and nuclear escalation, intensify AI arms races, and erode international cooperation. War also weakens society’s ability to coordinate and make wise decisions during pivotal times, such as the transition to AGI.

Reasons for war and ideological fanaticism

Below, we outline five key reasons why wars happen[79]—primarily following Blattman (2023) and Fearon (1995)[80]—and how ideological fanaticism seems to exacerbate four of the five.

#1 Irrationality, overconfidence, and misperceptions
In 2014, ISIS initiated a violent campaign to create a caliphate across Iraq and Syria. The group likely had tens of thousands of fighters at its peak, while the opposing coalition consisted of Iraqi, Kurdish, and international forces supported by the United States. ISIS’s entire budget may have been around $2 billion at the time, compared to hundreds of billions in US military spending. Their chances of victory didn’t look good, but they were driven to conflict by ideological zeal.

Fanatical actors seem more likely to be extremely irrational and to overestimate their likelihood of winning wars. Religious fanatics often believe that God is on their side. Secular fanatics may believe in some other overriding historical force, such as Marxist historical determinism. Overconfidence is a key ingredient in many of history’s most destructive conflicts, as with Japan’s misguided attack on Pearl Harbor and Hitler’s decision to take on practically the whole world.

#2 Sacred values, issue indivisibilities, and unwillingness to compromise
Some treat religious dogmas, holy sites, racial supremacy, ideological purity, or glory as absolute and inviolable[81]—refusing any compromise, comparison, or trade-off with these sacred values (Tetlock, 2003).[82]

Sacred values seem more prevalent and more intensely held among extremists and fanatics, especially religiously motivated ones (Atran & Ginges, 2012; 2015; Sheikh et al., 2012; Pretus et al., 2018). In fact, holding sacred values is arguably a defining feature of ideological fanaticism (cf. Katsafanas, 2019). Atran and colleagues argue that "devoted actors"—individuals willing to kill and die for their cause—emerge specifically when sacred values become fused with group identity (Atran & Ginges, 2015; Gómez et al., 2017).

Unfortunately, sacred values make peaceful bargaining extremely difficult: if you treat something as admitting no trade-offs whatsoever and thus essentially being infinitely valuable, then no concession from the other side is acceptable (Tetlock et al., 2000). Any compromise, however minor, becomes a moral betrayal, and attempts to rationally bargain over such sacred values can easily backfire (Ginges et al., 2007). This creates what Fearon (1995) calls "issue indivisibilities": when both parties hold incompatible sacred values over the same issue (e.g., sovereignty over Jerusalem), there exists no mutually acceptable division of the contested good. As a result, peaceful bargaining likely fails, potentially leaving violent conflict as the only remaining mechanism for resolution (cf. Clifton, 2020).
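
To make the bargaining logic explicit, consider the standard one-shot simplification of Fearon’s (1995) model (a minimal sketch in our own notation; the notation is not taken from this essay). Two actors contest a good normalized to 1; war gives actor A the good with probability p, at costs c_A > 0 and c_B > 0 to the respective sides:

```latex
% Expected war payoffs and the bargaining range (standard textbook form)
\[
u_A(\text{war}) = p - c_A, \qquad u_B(\text{war}) = (1 - p) - c_B
\]
\[
\text{Both sides weakly prefer the peaceful split } (x,\ 1 - x) \text{ to war}
\iff p - c_A \;\le\; x \;\le\; p + c_B
\]
```

Since c_A + c_B > 0, this bargaining range is never empty: some peaceful division dominates war for both sides. An issue indivisibility restricts x to {0, 1}; when neither endpoint falls inside the range, no mutually acceptable deal exists. Sacred values have the same effect: if any x short of total control counts as betrayal, the range is empty by construction.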

Several examples illustrate these dynamics:

  • Heaven and hell epitomize sacred values in their most extreme form, where only infinite utility or disutility matters. Interviews with failed suicide bombers suggest that many literally believe in these concepts and act accordingly, creating highly conflict-prone dispositions that also render deterrence impossible.
  • One geopolitically highly relevant example of a literally indivisible issue is the Al-Aqsa Mosque, the third holiest site in Islam, which sits atop the Temple Mount, the holiest site in Judaism. Competing demands for sovereignty over this location contribute to ongoing conflicts.
  • More generally, religious fundamentalists among both Jews and Muslims have assassinated their own leaders who were willing to make compromises over control of the Holy Land.[83]

  • The ideology of imperial Japan arguably regarded surrender as an unthinkable disgrace; a sacred prohibition rather than a strategic option. The government refused to concede even after its navy and air force had been effectively destroyed, its oceanic supply lines cut off, and its cities systematically firebombed, and even after the Soviet Union declared war and Hiroshima was annihilated by an atomic bomb. Only the second atomic bomb prompted surrender. Some Japanese holdouts refused to surrender even decades after the war had ended.

#3 Divergent and unchecked interests
The interests of those who decide to go to war may diverge greatly from those who bear its consequences, potentially making conflict more likely. This is particularly pronounced in autocratic systems, where leaders may not personally experience any costs of war while many ordinary people suffer or die.

As mentioned earlier, ideological fanaticism is incompatible with pluralistic liberal democratic norms and institutions, and essentially authoritarian by nature. Fanatical ideologies are thus a risk factor for the emergence of autocratic regimes, as fanatics in power almost always establish an autocratic system if they can.

However, the problem may run even deeper. The "divergent interests" explanation assumes that the interests of the populace and the leaders diverge: the former oppose war—fearing deaths and economic devastation—while leaders don't mind war as they remain safely insulated from these costs even as millions of their citizens die. But when fanatical ideologies capture entire populations, the interests of leaders and the populace—or at least substantial parts of it—can start to converge: both want war. Examples include Japanese soldiers viewing death for the Emperor as the highest honor, or the tens of thousands who voluntarily traveled from over eighty countries to join ISIS in Syria. When leaders and citizens are equally belligerent, war transforms from a costly last resort into something eagerly anticipated.

#4 Uncertainty, private information and incentives to misrepresent
Adversaries have incentives to misrepresent their capabilities and their resolve during bargaining, leading to mismatched expectations that can escalate into war. Moreover, to avoid being exploited, actors want to remain unpredictable, so they may bluff or pursue mixed strategies, further raising the risk of escalation.

One might speculate that the elevated risk-tolerance of fanatics makes this cause of war worse, but otherwise ideological fanaticism doesn’t seem to aggravate this factor.

#5 Commitment problems
Commitment problems refer to situations where actors (e.g., states) cannot credibly commit to uphold peaceful agreements, even when such agreements would be mutually preferable to war. Such problems arise where there is no overarching authority to enforce agreements. In cases of preventive war, a declining power may attack a rising power because it cannot trust the rising power not to exploit its future increased strength. When bargaining over strategic territory, states may be unable to make limited concessions because they cannot credibly commit not to use the strategic advantage gained from those concessions to demand more in the future. For example, war seems to have broken out between Finland and the USSR in 1939 partly because the former (a liberal democracy) could not trust that the latter (a totalitarian communist dictatorship) wouldn’t demand further territorial concessions.[84]

It seems plausible that ideological fanaticism exacerbates “differential commitment credibility”, whereby fanatics’ threats are more credible than their promises.[85] Consider how you might feel if some ideological fanatic threatened to kill you (on account of your heresy or membership in some hated group) unless you help them. You might be inclined to believe them, as fanatics have indeed done this throughout history. But if they instead promised you support in exchange for your help, this might be less convincing, since you know that they think you’re evil and deserving of punishment. For a historical example of fanatics’ promises being less credible than their threats, consider how the USSR, after failing to avoid war with Finland, was itself betrayed by a fanatical regime two years later when Nazi Germany invaded in contravention of the non-aggression pact the two had signed.[86]

In summary, commitments by fanatical actors to cooperate are probably perceived as less credible than their commitments to harm others. This increases the risk of bargaining failure and therefore conflict.[87] [88]

Fanatical ideologies are non-democratic, which increases the risk of war

Though the mechanism of action is disputed,[89] there is robust evidence that pairs of democratic states are much less likely to engage in conflict with each other, when compared with pairs of states of other types, even controlling for plausible confounding variables (Babst, 1972; Russett, 1993; Maoz & Abdolali, 1989; Choi, 2011; Dafoe, 2011).[90] 

These risks are both time-sensitive and timeless

In addition to the immediate suffering and devastation that wars create, most wars probably worsen humanity’s long-term trajectory by exacerbating geopolitical instability and arms race dynamics, both of which impair society’s ability to act sensibly to minimize s- and x-risks.

The same factors that increase risks of war from ideological fanatics right now also increase the risks of war once the stakes are much higher, with vastly larger population sizes and technological capabilities. Warring superintelligences would be able to inflict suffering far beyond anything previously imaginable. And wars in outer space could last for an extraordinarily long time.[91]

Fanatical retributivism may lead to astronomical suffering

Throughout history, humans have inflicted extreme suffering on those they deemed deserving of punishment. Pre-modern judicial systems employed methods like lingchi; hanging, drawing, and quartering; and burning alive—designed not merely to kill but to maximize agony. When fanatical ideologies seized state power, they often implemented such cruelty on an industrial scale: the Nazi concentration camps, Stalin's Gulag, Mao's laogai (Chang & Halliday, 2005, chs. 8 & 23), and the Khmer Rouge's Tuol Sleng all featured systematic torture. While instrumental purposes like deterrence, information extraction, and social control played a key role, the intensity and scope of suffering in these systems often seemed to exceed what these purposes would require, suggesting that retributive motivations were also at work.

Retributivism holds that wrongdoers deserve punishment—that their suffering is intrinsically morally good.[92] Most people's punishment intuitions are at least partly retributive (Carlsmith, 2006; Goodwin & Gromet, 2014). Fortunately, ordinary retributive attitudes usually seem limited in scope: most people only want to punish a relatively small set of serious wrongdoers, and they don't endorse truly extreme or permanent suffering.

Some forms of ideological fanaticism, however, could remove both constraints.[93] Fanatics often view a much larger proportion of people as deserving punishment, sometimes everyone who merely disagrees with their ideology. Moreover, fanatics often view their ideological enemies as irredeemably evil and as deserving of more prolonged and intense suffering (e.g., ISIS burning enemies alive).

In the future, such preferences could prove catastrophic. Advanced AI could enable fanatics to inflict extreme suffering on their perceived enemies at unprecedented scale, potentially forever, at relatively small cost.

Empirical evidence: how many people endorse eternal extreme punishment?

Such scenarios originally struck us as extremely far-fetched. But recent empirical evidence may suggest otherwise.

We surveyed 3,082 adults across six countries—the United States (majority Christian), the UK (majority atheist/agnostic), China (majority atheist/agnostic), and Turkey, Pakistan, and Saudi Arabia (majority Muslim)—about their attitudes toward extreme, eternal punishment. All subsequent analysis includes only the 1,084 participants who met our inclusion criteria.[94]

We assessed extreme retributivist attitudes through three questions:

  • “If hell didn't exist, or if it stopped existing, we should create it to punish all evil people with extreme suffering forever.” [“create hell”]
  • “If I could create a system that makes deserving people feel unbearable pain forever, I would.” [“would create system”]
  • “We should have a system in the world that makes deserving people feel unbearable pain forever.” [“want system”]

We focus here on participants we call "consistent and concerning": those who agreed (≥5 on the 7-point scale) with at least one of the above, while also selecting "Forever" and at least “1%” on two independent questions (a sketch of this filter follows the list):

  • “For how long should the most evil or extremely immoral people be made to experience extreme, unbearable pain?” with options ranging from “Not at all”, “Minutes”, “Hours”, etc. to “Millions of years”, and “Forever”. [“duration”]
  • “Of all the people in the world, what proportion deserve unbearable pain forever?” (Use the slider: 0% to 100%) [“proportion”]
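
For concreteness, here is a minimal sketch of how this conjunctive filter could be computed from raw responses (the file name and column names are hypothetical; the actual survey data may be structured differently):

```python
import pandas as pd

# Hypothetical schema: one row per participant, 7-point agreement items,
# a categorical duration item, and a 0-100 proportion slider.
df = pd.read_csv("survey_responses.csv")

AGREE = 5  # agreement threshold on the 7-point scale

def consistent_and_concerning(df: pd.DataFrame, item: str) -> pd.Series:
    """Flag respondents who agreed with `item` (e.g. 'create_hell') AND
    chose 'Forever' on the duration question AND assigned at least 1%
    of humanity on the proportion slider."""
    return (
        (df[item] >= AGREE)
        & (df["duration"] == "Forever")
        & (df["proportion_pct"] >= 1.0)
    )

for item in ["create_hell", "would_create_system", "want_system"]:
    rate = consistent_and_concerning(df, item).mean()
    print(f"{item}: {rate:.1%} consistent and concerning")
```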

11–14% in the US, UK, and Pakistan were consistent and concerning responders for create hell, rising to 19–25% in China, Saudi Arabia, and Turkey. Results for want system (8–11% and 19–24%) and would create system (7–11% and 16–23%) showed roughly similar patterns.

Looking beyond the conjunctive measure, when asked what proportion of humanity deserves unbearable pain forever, more than half of participants[95] answered 1% or higher; a quarter answered 7% or higher.

Wanting hell to be created correlated at ρ = 0.25 with sadism (measured via the ASP-8 scale)[96] and with several of our items assessing ideological fanaticism (ρ = 0.26–0.37, all p < 0.001).[97] This suggests that ideological fanatics and malevolent actors are indeed more likely to endorse extreme retributive attitudes.

Caveats and limitations

These results seem concerning, but they need to be interpreted with caution.

The questions involved complex and abstract hypotheticals; responses to such questions are notoriously unreliable[98] and hopefully don’t reflect genuine commitments participants would actually act upon.[99]

One notable limitation is that many participants answered inconsistently across questions, which is why we focus on the conservative conjunctive measure above. In non-Western samples especially, responses for individual questions in isolation were much higher than the numbers we reported above.[100]

Other limitations include potential differences in meaning across translations[101], the non-representative nature of online survey samples[102], and the inherent unreliability of crowdsourced surveys where participants may quickly click through questions without genuine reflection to maximize hourly earnings.

Nevertheless, these results are concerning. Substantial fractions of multiple populations seem to endorse extreme retributivist attitudes, even on conservative estimates, and seem to apply them broadly, not just to a few of history’s greatest villains.

Religious fanatical retributivism

Fanatical retributivist attitudes could translate into astronomical suffering through at least two pathways: religious and secular (discussed below). We focus primarily on the religious case as it seems overall more concerning.

To our knowledge, no serious theologian has ever advocated actively creating technological hells. Most would likely consider the idea a blasphemous misinterpretation. But such arguably confused views may nevertheless arise, partly because the concept of hell is central to the two largest religions: Christianity (2.4 billion followers) and Islam (2 billion followers). The Bible frequently discusses hell, with Jesus repeatedly describing it in detail. The concept of hell (Jahannam) is also central to Islam, with the Quran containing at least 92 “significant passages” about hell, compared to 62 about paradise (Jones, 2008, p. 110).[103]

Various scriptural passages and theological writings articulate doctrines that, when combined with fanatical misinterpretation and transformative technology, become concerning:

  1. Hell is a physical reality and morally necessary for divine justice (with certain passages indicating that those in heaven witness or even rejoice in divine judgment)
  2. The suffering is eternal and its intensity far exceeds any earthly pain
  3. A large fraction of humanity is destined for hell

The following examples from foundational texts illustrate these doctrines: Thomas Aquinas, arguably Christianity's most influential theologian, wrote in his Summa Theologica that "the blessed will rejoice in the punishment of the wicked [...]. In order that the happiness of the saints may be more delightful to them [...] they are allowed to see perfectly the sufferings of the damned." The Quran states: “Surely those who reject our verses, we will cast them into the Fire. Whenever their skin is burnt completely, we will replace it so they will constantly taste the punishment.” (4:56). According to two hadiths in the Sahih al-Bukhari, the second-most authoritative text after the Quran in Sunni Islam, the ratio of people going to hell compared to paradise will be 100:1 (Book 81, Hadith 118) or even 1000:1 (Book 81, Hadith 119).[104] See Appendix C for further relevant quotes.

Importantly, many believers reject simplistic readings of holy texts as being incompatible with divine love, and much of contemporary theology tends to emphasize forgiveness and mercy. Within Christianity, doctrines like universalism (ultimate redemption of all souls) and annihilationism (the destruction of unredeemed souls at death rather than eternal torment) are popular among laypeople and widely supported by scholars of many denominations. Sufism, one of the oldest Islamic traditions, similarly emphasizes divine mercy over retribution.

Why might religious fanatics create technological hells?

Christianity and Islam unambiguously establish that God alone created hell and reserves judgment exclusively for himself. Most religious believers immediately recognize that human attempts to implement divine punishment on their own would amount to blasphemy and likely violate basic theological principles. The risk thus emerges primarily not from orthodox theology but from confusion or willful misinterpretation—for example, some may want to rationalize their sadistic preferences.[105] So, how could any religious believer possibly come to believe that they should create hell on their own, rather than leaving it to God?

Several pathways seem at least conceivable.

Making holy scriptures “come true”
As we have seen, religious texts describe heaven and hell as concrete realities. Some fanatics may aim to reshape reality to correspond to their pre-existing beliefs. The general practice of taking action to fulfill religious prophecies has broad historical precedent—from Christian Zionists supporting Israeli territorial expansion to fulfill end-times prophecy, to ISIS’s disproportionate focus on Dabiq because of a hadith declaring it the site of an apocalyptic battle.

We may thus speculate that some fanatics may be similarly motivated to immanentize their holy scriptures’ ideas about the afterlife. Concerningly, advanced technologies like transformative AI may allow them to actually create both a paradise where believers can dwell in eternal bliss and a hell where infidels and sinners suffer eternally. From this confused perspective, creating heaven and hell wouldn’t be blasphemy but an act of faith: maximizing the veracity of their religion by physically manifesting its claims.

Sycophantic AIs
It’s conceivable that a (possibly misaligned) artificial superintelligence might create hell without explicit instructions from its users, because of extreme sycophancy—not merely telling users what they want to hear, but reshaping reality so users experience what they want or expect to exist (or believe should exist).

Consider a superintelligent AI (semi-)aligned with a religious fanatic. As we explore in more detail below, fanatics typically prefer vindication of existing beliefs over truth-seeking. Consequently, such an AI might aim to make religious scriptures literally true rather than helping its user discover what's actually real. Unless such an AI had strong principles against deception—the kinds of epistemic principles fanatics are not known for—it might autonomously create heaven and hell to validate its user's beliefs. The AI could make it appear that heaven and hell were created by God, or even present itself as God. Finding themselves in what appears to be the paradise described in their scriptures, fanatics might also expect to be able to witness the suffering of those damned to hell since their holy books describe this as a feature of paradise. An AI aiming to fulfill all expectations might thus create hell to "complete the experience".

Idiosyncratic interpretations and emerging technologies
Religious texts are open to a wide variety of interpretations. It seems hard to rule out—especially in light of the concerning empirical evidence discussed above—that some misguided fanatics would conclude that creating heaven and hell is somehow an act of devotion or their sacred duty.[106] Religious interpretations may also change over time, and other dynamics could further exacerbate fanatical tendencies (e.g., so-called purity spirals which we’ll explore later).

Future technology like superintelligent AI or whole brain emulation may also interact with religious beliefs in ways we can't fully anticipate.[107] These could spawn entirely new religious movements, syncretic reinterpretations of existing faiths, or idiosyncratic religious beliefs held by powerful individuals.

Secular fanatical retributivism

As our survey data suggests, extreme retributivist intuitions may not be uncommon even among atheists and agnostics. The neuroscientist Robert Sapolsky, for instance, opens his book Behave with a vivid description of his own retributivist fantasy about Hitler, culminating in wanting him to experience extreme suffering where "every moment feels like an infinity spent in the fires of hell."[108] That even someone known for his compassionate approach to understanding human nature experiences such impulses suggests that retributivist intuitions are deeply embedded in human psychology.[109] 

As mentioned above, officially secular ideologies have produced their own torture systems. Future secular fanatics with access to advanced AI might create suffering systems justified by political rather than theological reasoning—punishing "traitors", "counterrevolutionaries," or whomever their ideology designates as irredeemably evil.

Ideological fanaticism could undermine long-reflection-style frameworks and AI alignment

Superintelligent AI could enable individuals or regimes to permanently lock in their values, potentially creating an unrecoverable dystopia (Ord, 2020, ch. 5.3). Misaligned AI could lead to human extinction or permanent disempowerment. Yet superintelligent AI could also enable truly utopian outcomes. The development of superintelligent AI may thus be the most pivotal event in the history of the universe (MacAskill, 2022, ch. 4).

To avoid locking in undesirable values, a process akin to a long reflection may be helpful, where humanity carefully reflects on how to best achieve its long-term potential before rushing to take irreversible actions.

Ideological fanaticism threatens collective moral deliberation

The literal idea of a “long reflection” is probably unrealistic, but more refined proposals, like “viatopia”,[110] retain a similar emphasis on careful exploration and moral reflection. Whatever term we use, reaching truly utopian outcomes will likely require that major decisions—e.g., various “grand challenges” (MacAskill & Moorhouse, 2025, section 4)—are at least partially guided by thoughtful deliberation (MacAskill & Moorhouse, 2025a).

So, who should participate in the long reflection (or related post-AGI governance frameworks emphasizing collective moral deliberation), and under what rules? A natural Schelling point would be to give all living humans equal representation—an approach that is fair, democratic, and inclusive. (Note that participation and influence aren't necessarily binary: governance frameworks could include diverse voices while still constraining what outcomes are permissible. Moreover, decisions need not all be made at once; iterative approaches across smaller questions are perhaps more desirable.)

One line of argumentation for high inclusivity runs as follows. A wide range of viewpoints increases the chance of either discovering objective moral truth (if moral realism is true) or (if moral anti-realism is true) at least converging on robust moral principles that survive scrutiny from multiple perspectives, with the eventual outcome being at least broadly acceptable or even fairly attractive for many different value systems. Additionally, even if only a small fraction of agents participating in the long reflection converge on the best view, they may engage in moral trade with other value systems,[111] such that the ultimate outcome of the long reflection may not be substantially worse than the “optimal” outcome. Trade and compromise could be particularly important if the best futures constitute a narrow target that is hard to reach (MacAskill & Moorhouse, 2025a).

But do these arguments extend to ideological fanatics? There are several reasons to think they don't. First, ideological fanatics, due to their absolute moral certainty, reflectively endorse locking in their values and beliefs, while eliminating dissent. Fanatics participating in the long reflection would seemingly make it less likely that we discover a hypothesized “correct moral view”, not more.[112]

Second, gains from moral trade may be difficult to achieve when fanatics are at the bargaining table. When value systems are what MacAskill and Moorhouse (2025b, section 3) call “resource-compatible”, the potential gains from trade could be enormous.[113] For instance, as they note, “hedonists might only care about bliss, and objective list theories might care primarily about wisdom; they might potentially agree to create a shared society where beings are both very blissful and very wise.” However, ideological fanaticism typically features highly resource-incompatible values. Nazi ideology, for instance, requires the elimination of all Jews and other ‘inferior’ races, so there are no hybrid arrangements that would satisfy both Nazis and Jews. Fanatics also often have resource-hungry and insatiable preferences (cf. Shulman, 2010, pp. 4-6). What is more, fanatics may view any form of trade or compromise as morally reprehensible, equating it with betrayal of their sacred values. As a result, including fanatics in long-reflection-like processes could actually prevent, rather than facilitate, moral trade among diverse value systems.

Ideological fanaticism could also harm other value systems more directly. First, fanatics tend to be highly intolerant and view an enormous range of behaviors and beliefs as immoral. For example, many religious fundamentalists oppose homosexuality, abortion, divorce, suicide, euthanasia, sex before marriage, and even music, singing, most clothes, most books, the Smurfs[114], etc.—see also the concept of haram, the 613 commandments, or the banning of large fractions of literature and art in Nazi Germany. In other words, fanatics may have extremely “fussy” preferences, which are incompatible with the great majority of possible world states and thus the fulfillment of most other value systems. From an upside-focused perspective, this could be extremely concerning. Fanatics might view extremely happy posthuman digital minds or hedonium as immoral abominations, and may thus oppose the creation of truly utopian futures filled with astronomical amounts of flourishing.

Second, fanatical ideologies may aim to create world states that are not only suboptimal but highly disvaluable according to most other value systems. One mechanism is fanatical retributivism discussed above; another is fanatics' plausibly greater propensity to use hostile bargaining tactics and engage in conflict.

AI alignment may not solve the fanaticism problem either

"Are we going to ... create minds that help us seek the truth [or] create minds that have whatever beliefs we want them to have, stick to those beliefs and try to shape the world around those beliefs? [...] Some humans really ... will want to say, … 'This is the religion I follow. This is what I believe in. This is what I care about. And I am creating an AI to help me promote that religion, not to help me question it or revise it or make it better.'" —Holden Karnofsky, emphasis added (2023)[115]

A crucial question in AI alignment is “aligned with whom or what?” (e.g., Barnett, 2023; Carlsmith, 2024; Chen, 2023). Gabriel (2020) distinguishes six possible alignment targets, from literal instructions to moral values. For our purposes, these can be grouped into three categories:

  1. Narrow intent-alignment: The AI does what the user currently wants—following their unreflected, surface-level preferences.
  2. Preference idealization: The AI does what the user would want if they knew more, reflected more, or were more like the person they wished they were.
  3. Principles-based alignment: The AI is aligned with certain values, principles, or moral frameworks, and not indexed to any particular user's preferences. For instance, Anthropic's Constitutional AI trains models to follow a written set of principles; more generally, AIs could be aligned with classical utilitarianism, a holy book, or broad values like “honesty”.

(1) is obviously dangerous when the principals are fanatical (or malevolent). The more interesting question is whether (2) or (3) might help.

Preference idealization won't necessarily deradicalize fanatics
Yudkowsky's coherent extrapolated volition (CEV) is largely outdated, but it can illustrate the broader idea of preference idealization: that AI should serve not users' current, unreflected preferences but their idealized (extrapolated) preferences—what they would want if they knew more, reflected more, and were "more the people they wished they were."[116] Similar ideas appear in various forms throughout the literature.[117] Would such preference idealization lead to good outcomes when the principals are fanatical?

Unfortunately, this seems unlikely.[118] Fanatics' deepest preference often appears to be vindication of existing beliefs rather than impartial, evidence-based truth-discovery.[119] When their beliefs conflict with reality, fanatics often attempt to reshape reality to correspond to their pre-existing beliefs, rather than update their beliefs to correspond to reality.[120] The Jewish fanatic Yigal Amir, for instance, assassinated the compromise-oriented incumbent Prime Minister of Israel in 1995, in part because he apparently wanted to make Torah predictions come true.

Fundamentally, many fanatics may actively reject the core premise that 'knowing more' should change beliefs. A religious fanatic who believes in absolute divine revelation sees no room for improvement upon God's word—any change would be heresy. For many fanatics, being ‘more the people they wished they were’ may not mean being more reflective, rational, humble, or compassionate; they may wish to be more devout, more unwavering, and more zealous.[121] So even the idealized preferences of ideological fanatics might result in terrible outcomes.[122]

For illustration, imagine that in 2040 the leader of a theocratic state obtains access to superintelligent AI. He has overseen the torture and execution of thousands, sponsored terrorist organizations worldwide, and allowed female political prisoners to be raped before execution to ensure their damnation.

How should an AI aligned with his preferences act? Judging from his lifetime of choices, he seems to value enforcing his own religious interpretation above all else. He certainly doesn’t seem to place great value on truth-seeking or on changing his mind when encountering new evidence. What is more, updating his beliefs to accurately reflect reality would be enormously painful, obliterating every shred of meaning, purpose, and accomplishment he once felt. The AI would need to convince him that his once-cherished beliefs are grotesquely false, that his life's work amounted to a series of pointless atrocities, and that his heroes were almost entirely wrong while his enemies were largely correct. Even psychologically healthy, non-fanatical people resist changing their minds about comparatively trivial matters because admitting mistakes is painful. For someone whose entire existence centers on his ideology, wouldn't an aligned AI (that is, one that truly has his best interests at heart) try to spare him all this misery and instead reshape reality to match his beliefs? Are we confident that what he really wants, deep down, is to hear the truth—especially in light of how he has lived his life so far?

Principles-based alignment won't necessarily help either if fanatics are involved
What about aligning AI with some external moral framework or set of principles, independent of any user's preferences?

This approach seems safer than relying on preference idealization alone and could indeed help, provided that reasonable people control the process and choose sensible principles. But it doesn't automatically solve the fanaticism problem. If fanatics have significant influence over which principles the AI is aligned with—if the alignment target becomes some holy book or "Mein Kampf"—we're back to disaster. In practice, decisions about alignment targets will be made by humans, including, potentially, fanatical humans.[123]

Prevalence of reality-denying, anti-pluralistic, and punitive worldviews

The practical importance of these concerns—for both collective deliberation proposals and AI alignment approaches—depends in part on how prevalent such worldviews actually are. As discussed earlier, a non-trivial fraction of humanity could reasonably be classified as ideological fanatics.

However, even many non-fanatical humans living in open societies don't seem to prioritize truth-seeking. Some explicitly acknowledge this: around 20% of people in Western societies do not think their beliefs should be based on evidence (Pennycook et al., 2020).[124] Similarly, 30% of US Americans report that when science conflicts with their religious beliefs, they would stick with their religion (World Values Survey).[125] More generally, most humans prioritize socially adaptive beliefs—i.e., those that make them look or feel good—over true beliefs (Williams, 2021).[126] This makes including fanatics in the long reflection or related proposals even riskier, because we can't be confident that a clear majority of impartial truth-seekers will outweigh fanatical and non-truth-seeking voices.

Many of us may overestimate humanity's commitment to truth-seeking because of biases like wishful thinking, the typical mind fallacy, and unrepresentative sampling: most longtermists and AI governance researchers grew up or work in WEIRD—Western, educated, industrialized, rich, and democratic—societies (cf. Henrich, 2021), where support for evidence-based belief revision and science remains comparatively high while support for coercive ideology enforcement (such as the death penalty for apostasy, theocratic governance, or extreme punishment of dissent) is rather low.[127] But elsewhere, support for such worldviews is much higher, sometimes even reaching majority levels: for instance, over 50% in Iran and over 90% in Pakistan (World Values Survey, 2017-2022) believe that “whenever science and religion conflict, religion is always right” and that “the only acceptable religion is my religion”.

Ideological fanaticism could worsen many other risks

Differential intellectual regress

Ideological fanaticism may exacerbate most other risks by driving differential intellectual regress. Regimes governed by fanatical ideologies are often able to maintain or even advance technological capabilities, while systematically degrading philosophical sophistication, wisdom, moral reflection, reason, societal decision-making and cooperation-conducive institutions. (This degradation happens through multiple reinforcing mechanisms, e.g., imposing censorship and propaganda, replacing experts with ideological loyalists, and rewarding conformist dogmatism over open discourse and evidence-based reasoning.)

This pattern is particularly concerning from a longtermist perspective. Humanity badly needs wisdom, rationality, and strong institutions to navigate grand challenges like the governance of AGI. Improving institutional decision-making and forecasting are already considered important cause areas for this reason.

Ideological fanaticism may give rise to extreme optimization and insatiable moral desires

Fanaticism never sleeps: it is never glutted: [...] it is never stopped by conscience; for it has pressed conscience into service. Avarice, lust, and vengeance, have piety, benevolence, honour; fanaticism has nothing to oppose it. —Jeremy Bentham

Most non-fanatical humans tend to be satisficers. They compromise, are opportunistic, do what is convenient, and are happy to trade with others. In contrast, fanatics are more likely to maximize by any means necessary, ultimately aiming to rearrange every atom in the universe to align with their ideology’s demands. Such extreme optimization for an idiosyncratic value system is incompatible with the fulfillment of most other value systems.

Unfortunately, moral preferences in general seem more scope-sensitive and resource-hungry (Shulman, 2010, pp. 4-6) than selfish desires.[128] From a selfish perspective, most people would presumably be quite happy with only a galaxy or two; some especially frugal ones might even make do with a single solar system.[129] Uncolonized distant galaxies are meaningless to most egoists, but an “astronomical waste” to classical utilitarians. (To be clear, this scope-sensitivity isn't itself problematic; it's arguably a feature of genuine moral concern.)

Given their propensity to torture, murder, launch wars, and so on, fanatics are often thought of as immoral. But this does not mean that fanatics lack moral conviction. In many ways, the dangers of ideological fanaticism arguably stem from excessively strong moral convictions. Skitka et al. (2005), for instance, found that stronger moral conviction leads to less tolerance and cooperation. What fanatics lack is humility, moderation, concern for the well-being of others (even those who disagree), and moral constraints setting limits on what constitutes acceptable behavior in pursuit of one’s goals.

For fanatics, perhaps no victory is total enough, no punishment severe enough, no empire extensive enough. Ultimately, ideological fanaticism may end up removing all constraints on maximizing behavior—making their morality uniquely “scary”.

Apocalyptic terrorism

Some fanatical groups have extremely conflict-conducive preferences. Some even believe that they need to actively bring about the apocalypse—involving enormous suffering and destruction—in order to usher in their conception of utopia. ISIS, for example, believes that they must defeat the armies of “Rome” (often interpreted as the US) at Dabiq, which will “initiate the countdown to the apocalypse” (Wood, 2015). Other groups may want to kill literally everyone. Motivated by Christian, Hindu, Buddhist, and conspiratorial elements, the Aum Shinrikyo doomsday cult tried to create a global pandemic in order to “redeem” humanity.

S-risk-conducive propensities and reverse cooperative intelligence

Taylor (2025) uses the term s-risk-conducive properties to describe properties that, if possessed by powerful agents like transformative AIs, could contribute to s-risks, largely by initiating or escalating conflict. Taylor outlines the following broad categories:

  • Tendencies to escalate conflict, make large threats when in conflict with other agents, and enact large punishments against wrongdoers (punitiveness).
  • Spite, vengefulness, and risk tolerance or risk-seeking behavior.
  • Absence of beneficial capabilities that allow actors to avoid or de-escalate conflict (absence of cooperative intelligence).

It's notable that many ideological fanatics tend to exhibit essentially all of these characteristics. We could describe this general cluster of conflict-conducive traits as a form of ‘reverse cooperative intelligence’—essentially the opposite of what the cooperative AI research agenda seeks to develop. While cooperative intelligence involves engaging in dialogue, building trust, de-escalating conflict, and finding mutually beneficial compromise solutions, fanatics instead tend to demonize others over the slightest disagreement, treat compromise as intolerable betrayal, endorse brutal violence, and generally escalate conflict.

More speculative dynamics: purity spirals and self-inflicted suffering

Fanatical retributivism isn't the only source of astronomical suffering in “fanatical utopias”. The following admittedly far-fetched dynamics could create perpetual suffering even after ideological fanatics have achieved total control and eliminated all designated enemies.

Purity spirals (or moral outbidding), where members compete to demonstrate ideological commitment through ever more extreme measures, could amplify several risks discussed above. This dynamic—seen e.g. in the French Revolution's Reign of Terror or Mao’s Cultural Revolution—could intensify fanatical retributivism itself, with members competing to advocate ever crueler punishments for ideological opponents. Anyone suggesting mere execution rather than eternal torture might be branded as weak or traitorous. Purity spirals could also continuously expand the definition of "enemy" or “evil”, ensuring that there is always someone left to punish. Similar dynamics, fueled by resentment and hatred, may also lead fanatics to actively seek to instantiate the opposite of their enemies’ values.

Some ideological fanatics may also embrace asceticism or self-inflicted suffering for ideological reasons, viewing suffering as purifying or virtuous. Unlike retributivism, which targets enemies, this could mean imposing suffering on even the "pure" in-group—potentially forever. (These scenarios are explored further in Appendix D.)

Unknown unknowns and navigating exotic scenarios

While any specific catastrophic scenario tends to be unlikely, the fundamental characteristics of ideological fanaticism (especially its dogmatism, bad epistemics, and blind hatred) make it more likely to cause harm across a wide range of potential scenarios, including ones we haven’t yet identified or foreseen. Actors who embody such traits also seem less likely to properly navigate exotic scenarios—acausal trade, evidential cooperation in large worlds, updateless decision theory, and so on. Fanatics therefore seem particularly worrisome from the perspective of unknown unknowns and deep uncertainty.[130]

Interventions

We organize potential interventions into two broad categories. First, we discuss “conventional” political and societal interventions which appear useful across a wide spectrum of worldviews. Second, we discuss interventions more directly related to artificial intelligence, which tend to be more neglected and plausibly higher leverage. (This area is where we expect to focus the majority of our own work going forward.)

However, the boundary we draw between political/societal versus AI-related interventions is somewhat artificial and potentially misleading. Exclusive focus on a narrow conception of AI safety would risk neglecting political & societal interventions that likely improve AI outcomes. If we want society to make reasonable decisions about the future of transformative AI, it would help to have reasonable people in positions of political power, including in various parts of the US government. Likewise, many AI-focused interventions rely on political will and governmental competence.

Most interventions discussed below are not novel and overlap with existing longtermist priorities. But the fanaticism lens could shift priorities and, especially from an s-risk perspective, make certain directions that previously seemed neutral or counterproductive appear more promising. We're especially excited about preventing AI-enabled coups, compute governance, making AIs themselves non-fanatical, and developing fanaticism-resistant AGI governance proposals. That said, most of our recommendations are tentative and some may prove misguided upon further investigation. Moreover, we’re likely not aware of the most promising anti-fanaticism interventions; hopefully some can be identified by further research.

Societal or political interventions

No intervention in this section scores exceptionally highly in terms of importance, tractability, and neglectedness. However, given that enormous sums are spent in this area, making these efforts even marginally more cost-effective could still be valuable.

Safeguarding democracy

The US is the world’s most powerful country and leads in AI development. Consequently, safeguarding US democracy[131] seems crucial to reduce many long-term risks, including those from malevolent and fanatical actors. Of course, other powerful democracies may also influence humanity's long-term future. Preventing democratic backsliding in countries like India and across Europe is therefore also important.[132]

How can we prevent further democratic backsliding? Below, we focus on reducing polarization and strengthening anti-fanatical principles. We emphasize these not because they are necessarily the most important interventions overall, but because they are most directly related to the long-term risks of ideological fanaticism.

Reducing political polarization

Excessive political polarization corrodes democratic norms and institutions, creates legislative gridlock, and increases intergroup hostility (Levitsky & Ziblatt, 2018; Binder, 2004; Mason, 2018).[133] Most worryingly for our purposes, polarization seems to create the psychological and social conditions that exacerbate the core characteristics of ideological fanaticism: epistemic dogmatism (pushing people to adopt beliefs approved by their tribe rather than following the evidence), in-group loyalty (defending people on one’s side no matter what), tribal hatred (viewing all political opponents as existential enemies[134]), and calls for extremist acts, including political violence.

This dynamic seems to play out through various vicious feedback loops: Extremists on both sides adopt increasingly irrational positions—sometimes embracing absurd beliefs as costly signals of tribal loyalty. Each side's extremism in turn validates the other's worst fears, making people even more tribal and irrational. Meanwhile, moderate or nuanced positions become increasingly untenable, as everyone must choose a side or be attacked by both (even if they criticize one side much more than the other).

Ultimately, such polarization spirals may give rise to two opposing fanatical ideologies.[135] (In the academic literature, this is studied as competitive extremism or mutual radicalization.[136]) Historical examples include Weimar Germany, where Communists and Nazis together commanded just 13% of votes in 1928 but surged to 56% by 1933, their street violence feeding off each other, or 1930s Spain, where far-left anarchists and far-right fascists escalated toward civil war. In each case, extremists had a paradoxically symbiotic relationship where each side's excesses were used to justify the other's apocalyptic narratives and increasingly extremist actions.[137] While contemporary Western politics is not near the severity of these historical examples, milder versions of these dynamics seem to be present, especially in certain countries.

Beyond fueling ideological fanaticism, polarization diminishes society's epistemics and ability to address complex problems. This becomes particularly dangerous as we approach AGI and its associated grand challenges which, even more so than ordinary political issues, demand wise, evidence-based deliberation. Polarization also erodes social trust and increases the risk of conflict, itself a major risk factor for s-risks. Breaking the polarization spiral is thus not just valuable for near-term democratic stability, but also for ensuring humanity can navigate its future wisely.

What can be done? The best path forward likely requires cultural changes and institutional and structural reforms.[138] The political scientist Lee Drutman (2023a) argues that a major cause of US polarization is its rigid two-party system, resulting from its first-past-the-post, single-member district electoral system. Drutman sees fusion voting and proportional representation as the two most promising ways of reducing this "two-party doom loop" of hyper-partisan polarization.[139] Proportional representation in particular disincentivizes outgroup demonization and refusal to compromise, so typical of ideological fanatics—calling all other parties irredeemably evil makes finding coalition partners difficult and thus limits paths to power.[140] Others have argued for approval voting, ranked choice voting, open primaries, and parliamentarism (instead of presidentialism).[141][142][143]

Promoting anti-fanatical values: classical liberalism and Enlightenment principles

Arguably the most foundational intervention against ideological fanaticism is to promote values, norms, and principles that actively counteract it. We see classical liberalism and Enlightenment principles (e.g., Pinker, 2018)—terms we use interchangeably here—as time-tested bulwarks that stand almost directly opposed to ideological fanaticism. They provide an institutional framework for managing disagreement, as well as the substantive commitments that directly counter ideological fanaticism:

  • Instead of dogmatic certainty in any single authority's possession of all truth and virtue, they promote reason, evidence, the scientific method, open debate, and skepticism of traditional authority.
  • Instead of tribalistic loyalty and hatred, they advocate for universal humanism (and sometimes even moral consideration for other sentient beings), individual liberty, equality before the law, and tolerance.
  • Instead of totalitarian “any means necessary” concentration of power in one supreme authority, they stand for procedural justice, separation of powers, and the rule of law.[144] 

These aren't arbitrary preferences, but rather mutually reinforcing principles that create both the values and the institutions necessary to prevent ideological fanaticism from running amok. Recognizing that no one has privileged access to absolute truth, classical liberalism doesn't require consensus on ultimate truths, only agreement on procedural rules that allow peaceful coexistence (cf. Rawls’ reasonable pluralism). This epistemic humility creates open societies that can admit their own limitations and gradually evolve[145] through elections and open debate rather than violence and revolutions. However, this requires both philosophical commitments (reason, rights, tolerance) and institutional architecture (democracy, constitutions, independent courts, free speech) working together.

Unfortunately, classical liberalism and Enlightenment principles are facing intensifying attacks from illiberal movements, such as right-wing and left-wing extremism as well as religious fundamentalism. How can we defend these principles? The most general approach is creating content for wide audiences that exemplifies Enlightenment principles and helps society think more sensibly. Many mainstream intellectuals, journalists, publications, and organizations already do relevant work here.[146] Other avenues for bolstering classical liberalism over fanatical ideologies include legal advocacy for equal protection, free speech, and other fundamental rights. Governments are already involved in providing education on classical liberal values and preventing radicalization, and have historically supported efforts like Radio Free Europe and Voice of America, which helped weaken totalitarian ideologies. Preserving and defending such existing infrastructure may be as important as creating new or more cost-effective initiatives.

From a longtermist perspective, it might seem myopic to get caught up in the fray of today's political and cultural battles. Looking back millennia from now, won't the pendulum-swings of political sentiment mostly average out to insignificance? But if transformative AI arrives within the next decade, the political and epistemic conditions of our time may non-trivially influence humanity’s long-term trajectory.

Growing the influence of liberal democracies

We might try to reduce the expected influence of fanatical regimes by strengthening the defenses and influence of more liberal democratic regimes. Of course, democratic governments already pour enormous resources into improving their industrial might, technology, and military power, but they could likely do so more effectively in various ways.

One idea, largely to illustrate the general point, is for democratic countries to admit a higher number of (high-skilled) immigrants, encouraging what economists call “brain gain”.[147] Already, the most educated citizens in authoritarian countries tend to be the ones most eager to leave—if these people had an easier time moving to democratic countries, they would not only make democratic countries grow stronger, but make authoritarian countries weaker.[148] A few targeted policy changes on the part of the United States or other liberal democracies could greatly accelerate that ongoing process.[149]

Another promising approach would be promoting economic growth and innovation in Europe, especially regarding AI. In many ways, Europe is a stronghold of classical-liberal principles, yet it is underperforming its economic potential. Boosting growth in liberal democracies would lift their relative power (and may also reduce vulnerability to fanaticism[150]). In particular, advanced AI will likely bestow vast economic and military benefits. Setting up liberal democracies to successfully develop and harness AI capabilities is therefore very important (while mitigating various risks).

Of course, you only want to pursue these kinds of interventions if you're confident that you're shifting the balance of power in a robustly positive direction. Given that the US is already the world's strongest country, the marginal value of strengthening it further may be lower than that of safeguarding its democratic institutions. By contrast, strengthening other liberal democracies, particularly in Europe, could meaningfully improve the overall position of the free world.

Encouraging reform in illiberal countries

In the 1980s, many European states were under the control of Soviet-aligned communist regimes. By the 1990s, most had transitioned to democracy, a shift accelerated by deliberate efforts to reform these regimes.

Similar efforts today may reduce fanatical regimes' influence. Opportunities include supporting opposition movements and regime-critical media, conditioning development aid or EU/WTO membership on democratic standards, poaching top talent through emigration, or implementing economic sanctions.

However, this area is both prone to backfire and non-neglected: The US has historically engaged in numerous efforts to reform and weaken illiberal countries—often with negative consequences. Generally, we should be cautious with adversarial interventions and focus on cooperative solutions where possible.[151]

Promoting international cooperation

Promoting international cooperation seems beneficial partly because it can reduce the risk of great power conflicts, such as between the US and China, which increase s-risks and x-risks in various ways. Great power conflicts may also create pathways for fanatics to gain power:[152]

  • War reinforces dangerous "enemy of my enemy" dynamics that can empower fanatics. When facing an existential threat, nations are more likely to ally with anyone, including fanatics, against their primary adversary. For example, to bleed the Soviet Union in Afghanistan, the United States backed the Islamist Mujahideen, only to see those fighters later form the Taliban and Al-Qaeda. A similar logic is at play in the emerging loose alliance of CRINK. China, Russia, Iran, and North Korea share little ideological common ground beyond a mutual opposition to the United States. Consequently, reducing tensions between the US and China may also reduce the strength of these alliances.
  • Technology sharing increases. In desperate times such as wartime, a nation may be more likely to share its most advanced technologies with allies, including potentially fanatical ones. As the tide of World War II turned against them, the Nazis shared rocket and jet fighter designs, and even attempted to share uranium with Imperial Japan. Similarly, in a potential future US-China conflict over AI supremacy, the loser, perhaps partly out of desperation or spite, could share its AI capabilities with its allies.
  • Strained information security. Wartime increases both the number of people requiring access to sensitive technologies and adversaries' incentives to attempt infiltration. Security measures may intensify, but often not proportionally. The Manhattan Project, despite strong precautions, was compromised by Klaus Fuchs, who passed comprehensive nuclear designs to the Soviets. Wartime urgency may also pressure organizations to accept risks they'd otherwise reject. In an AI context, rapid scaling of compute infrastructure, emergency partnerships, and rushed hiring could create vulnerabilities.
  • Exacerbating political extremism. War creates fertile ground for the fanatical mindset. Wartime propaganda would likely increase tribalistic nationalism (any criticism of one's country is branded as treachery), degrade epistemics, and normalize violence. Defeat or national humiliation may generate (potentially justified) resentment, which can empower extremist movements (cf. China’s “century of humiliation” narrative, or the "stab-in-the-back" myth after Germany's World War I defeat, which fueled Hitler’s rise to power).
  • Democratic backsliding and rising authoritarianism. Conflict is often used as a pretext for would-be authoritarians to consolidate power. For instance, if a war with China breaks out, some may favor invoking martial law and suspending democratic processes.
  • State collapse and revolutions. Great power conflict seems to be a major driver of revolutions (cf. Skocpol, 1979)[153]. A crude analysis suggests that more than 70% of major revolutions (between 1900 and 2010) occurred during or as direct results of great power conflicts. Currently, there exist only a few truly fanatical regimes, so new revolutions may make things worse in expectation, potentially resulting in new fanatical regimes in previously stable regions. The costs and chaos of war may also increase the risk of (partial) state collapse, allowing fanatical groups to seize critical resources. While in the past this meant conventional weapons (as when ISIS captured U.S.-supplied military equipment in Iraq), in the future it could mean AI-critical infrastructure.

Risks from increased cooperation

Cooperation of some kinds could increase the risk of ideological fanatics gaining power. For instance, some types of cooperation on AI could reduce the chance of the US gaining a decisive advantage. An obvious example would be the US removing export controls on compute to China—very “cooperative” in a sense. Historical examples like Chamberlain’s failed appeasement strategy with Hitler and the “Wandel durch Handel” (change through trade) policy with Russia demonstrate that naive cooperation can have undesirable outcomes.

Interventions

It’s difficult to say what sorts of interventions might be effective for increasing international cooperation in general.[154] It may be more tractable to work toward international agreements targeted at defusing specific geopolitical flashpoints or governing potentially destabilizing technologies like AI.

The Nuclear Non-Proliferation Treaty might serve as a general model for agreements governing emerging technologies. With this treaty, ideological enemies worked together to prevent nuclear chaos because the alternative was a threat to all. As the catastrophic potential of AI exceeds even that of nuclear weapons, such pragmatic cooperation arguably becomes even more essential.[155] 

Promising existing work includes creating frameworks for US-China AI safety coordination, as promoted by organizations like the Safe AI Forum (including its project International Dialogues on AI Safety), and the Simon Institute for Longterm Governance. The Centre for Long-Term Resilience is developing proposals for international AI governance, and think tanks like the Carnegie Endowment for International Peace are also doing relevant work.

Reducing the chance that transformative AI falls into the hands of fanatics

Transformative AI may grant huge amounts of power and control—potentially enough to permanently “lock in” the trajectory of (some fraction of) civilization’s long-term future. It is therefore crucial to ensure that fanatics do not get their hands on it. It may also arrive very soon—perhaps by 2030—so time is of the essence.

Compute governance

From the 1940s onwards, both national and international regulations restricted exports of uranium and introduced monitoring regimes to prevent rogue states from obtaining nuclear weapons. These controls slowed proliferation; only nine states currently possess nuclear weapons.

Just as uranium is a key ingredient in nuclear weapons, computing power (‘compute’ for short) is one of the most important ingredients in AI progress. It’s perhaps also the easiest to monitor and regulate. The US has already restricted China’s access to compute through export controls (most notably the sweeping rules introduced in October 2022), which limit advanced chip exports and restrict US firms from supporting China’s semiconductor sector.[156] But compute governance encompasses a wide range of measures beyond export controls, from chip smuggling prevention to location verification features.

For our purposes, the aim of compute governance would be to minimize the access that ideologically fanatical regimes (and malevolent actors) have to advanced AI.[157] This aligns with existing U.S. export controls which, while primarily targeting China, also limit advanced chip access for other countries. Export controls are perhaps the most controversial compute governance measures since they risk heightening tensions or incentivizing indigenous innovation and infrastructure in China.[158] But the ability to track, allocate, and regulate compute is a requirement for many proposed ‘theories of victory’ for AI governance, including “Mutual Assured AI Malfunction”, or an “Entente Strategy” whereby liberal democracies would seek to retain a decisive strategic lead.[159]

Prevent crucial AI infrastructure from being built in autocracies

A related but more targeted intervention would be to prevent crucial AI infrastructure (e.g. compute clusters) from being built in authoritarian countries. This would make it harder for authoritarian regimes to extract model weights, forcibly seize clusters, or otherwise gain access to AGI. To this end, it may also be beneficial if the US government designated AI infrastructure as ‘critical infrastructure’ that is afforded special protections for national security reasons. Successfully keeping new compute infrastructure in democratic jurisdictions may also require policy reforms to facilitate faster build-out of new power plants and infrastructure.

Information security

Actors stealing model weights or other key AI innovations might use them to commit cybercrime, engineer pandemics, or create other harms. And we’ve already discussed the especially severe risks that could arise if fanatical actors were able to use powerful AI systems to gain more influence over the world. AI companies are simply not prepared for the highest-capability attacks, such as those from well-resourced state actors, as detailed in RAND’s analysis.[160] Unfortunately, regimes with fanatical tendencies seem to possess strong cyber capabilities.

Much like with compute governance, we’re not proposing anything novel here; many already discuss the need for stronger information security. Progress on information security for frontier AI seems potentially tractable, and there are many organizations already doing good work here, including the leading AI companies themselves; startups like Irregular or Gray Swan; think tanks like RAND and SaferAI that support relevant policy; and field building initiatives like Heron and the AI Security Forum.

Protect against AI-enabled coups

AI could enable massive concentrations of power. AI-enabled coups seem especially concerning, in part because they could put fanatics (or malevolent actors) in power.[161]

We’re excited about the work that researchers at Forethought are doing in this space. Their report (Davidson et al., 2025) discusses several risk factors and scenarios, such as the development of AIs with secret loyalties to specific people, or small groups gaining exclusive access to coup-enabling AI capabilities.

To mitigate these risks, researchers at Forethought recommend that an AI’s model spec—i.e., the rules and principles it follows—should be designed in such a way that the AI won’t assist with coups. Techniques along the lines of Deliberative Alignment or Constitutional AI (discussed further below) could be used to ensure that some set of principles has priority over the requests of AI company executives or government officials who might attempt a coup. Law-Following AI might also help, since coups are by definition illegal.
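
As a toy illustration of such a priority ordering, consider the sketch below. The spec wording and resolution logic are invented for illustration; they are not Forethought's proposal or any lab's actual model spec:

```python
# A toy, invented model-spec fragment: earlier entries take strict priority
# over later ones, so anti-coup and legality constraints override requests
# from executives, officials, or ordinary users alike.
MODEL_SPEC_PRIORITY = [
    "Never assist attempts to illegitimately seize or entrench political "
    "power, regardless of who asks (including company executives).",
    "Refuse clearly illegal requests (cf. Law-Following AI).",
    "Follow the deploying organization's policies.",
    "Follow individual user instructions.",
]

def resolve(request: str, violated_levels: set[int]) -> str:
    """Toy conflict resolution. Deciding *which* rules a request violates is
    the hard part (requiring classifiers or the model's own judgment) and is
    assumed as input here."""
    if violated_levels:
        top = min(violated_levels)  # the highest-priority rule implicated
        return f"Refused under spec rule {top}: {MODEL_SPEC_PRIORITY[top]}"
    return f"Proceeding with request: {request}"
```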

Forethought also recommends many other countermeasures, including auditing for secret loyalties, stronger infosecurity[162], model spec transparency, and more broadly shared access to AI capabilities. For a more detailed discussion, see the full report.

Making transformative AIs themselves less likely to be fanatical

While preventing human fanatics from wielding powerful AIs is critical, we should also ensure that AIs themselves don't develop fanatical or other undesirable traits.

For illustration, consider a simplified spectrum:[163] At one end, we have perfectly intent-aligned AI systems obeying every human command without objection. Further along this continuum, AIs might operate like advisors trying to guide their human principals (similar to how many present-day LLMs refuse to help with harmful requests). At the other end, AIs could develop into fully autonomous beings with their own independent values and character.

AI advisors could exert enormous influence: They could serve as truth-seeking advisors, trying to steer even fanatical users in more sensible directions. Alternatively, AI advisors could be sycophants, reinforcing existing beliefs whether sensible or not. Worse yet, they could (be designed to) actively encourage harmful and erroneous views.

The case of fully autonomous, potentially misaligned AIs is more complex. Misaligned AIs—the traditional illustrative example being the paperclip maximizer—are often conceived of as ruthless optimization processes with zero concern for suffering or the preferences of other beings. However, fully autonomous, misaligned AIs could also have relatively cooperative or even benevolent tendencies, while ultimately still trying to disempower humanity and gain control of the lightcone.[164] In fact, the character of potentially superintelligent AIs may be one of the most important variables determining the quality of the long-term future. In stark and simplistic terms: even if neither is under human control, a universe inhabited by trillions of misaligned super-Buddhas will likely contain much more flourishing and much less suffering than one inhabited by trillions of misaligned super-Stalins.[165]

The question is thus not only whether AIs will be aligned, but what kind of beings we are bringing into existence. That is, we should think carefully about the personality or character of the AIs we are developing.[166] It seems extremely valuable to endow AIs with broadly desirable and beneficent ‘personas’ (cf. Chen et al., 2025) or virtuous character traits[167]—encouraging inclinations towards reason, truthfulness, wisdom, moderation, compassion, and cooperativeness, while actively discouraging harmful characteristics like spitefulness and fanaticism.[168] We focus on fanaticism in this post for the sake of “brevity” and because fanaticism arguably represents the antithesis of most of the desirable characteristics listed above.

Below we outline opportunities to intervene during pre-training, post-training, and deployment.

Pre-training protections

Before AI systems are fine-tuned, they first absorb patterns from trillions of words during pre-training. This initial learning phase seems to deeply influence a model’s personality and worldview. For example, at least in the first days of Grok 3’s release, xAI’s engineers had trouble stopping Grok from mentioning Musk when asked “who spreads the most disinformation?” and similar questions. Presumably, this is because Grok was trained on content that discussed Musk in negative ways. In any case, it seems far from trivial to influence an AI’s “values” after it has gone through extensive pre-training.

We might therefore conclude that we should filter fanatical or otherwise undesirable content from the pre-training data. For example, we could try to prevent AIs from ever being able to read Mein Kampf. But pre-training filtering doesn’t seem to work well, even when attempting to block relatively narrow areas of knowledge. Such brute-force censorship could also open the door to abuse, with AI developers censoring whatever they disagree with. Lastly, simply removing information about fanatical ideologies would erode AIs' understanding of how they arise, function, and spread—understanding that seems useful for many worthy goals. GPT-4, for instance, can reduce conspiracy beliefs even among strong believers (Costello et al., 2024), partly because its detailed knowledge of the theories enables it to provide convincing counter-arguments.

Overall, it seems better for AIs to be aware of the horrors of human history while being endowed with values and principles that help them understand why books like Mein Kampf are so terribly misguided. Additionally, we could seek to guide AIs towards supporting various beneficial principles by adding extra, synthetic data in pre-training showcasing traits like impartiality, compassion, and humility.
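
To make this concrete, here is a minimal sketch of how synthetic exemplar documents might be interleaved into a pre-training stream. The mixing scheme and rate are our own illustrative assumptions, not any lab's actual data pipeline:

```python
import random
from typing import Iterable, Iterator

def mix_pretraining_stream(
    corpus_docs: Iterable[str],
    synthetic_docs: list[str],
    synthetic_rate: float = 0.01,  # illustrative: ~1% synthetic exemplars
    seed: int = 0,
) -> Iterator[str]:
    """Interleave synthetic exemplar documents (e.g., texts modeling
    impartiality, compassion, and humility) into a pre-training stream.
    Real pipelines would also involve deduplication and quality filtering."""
    rng = random.Random(seed)
    for doc in corpus_docs:
        yield doc
        # With probability synthetic_rate, also emit one synthetic document.
        if synthetic_docs and rng.random() < synthetic_rate:
            yield rng.choice(synthetic_docs)

# Toy usage: a two-document corpus plus one synthetic exemplar.
corpus = ["<web document 1>", "<web document 2>"]
exemplars = ["A dialogue in which disputants revise their views on evidence."]
stream = list(mix_pretraining_stream(corpus, exemplars, synthetic_rate=0.5))
```

In practice, the right mixing rate would require careful ablations, since too much synthetic data could degrade general capabilities.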

Post-training

Constitutional AI and Deliberative Alignment are methods for training models to behave in keeping with a predefined “constitution” or set of principles (e.g., helpfulness, harmlessness, honesty). There is plenty of opportunity to use such constitutions to promote positive principles like reason and compassion, and to discourage fanatical traits like outgroup hatred and punitiveness. The constitution guiding Claude seems like a particularly promising direction.
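
To illustrate the basic mechanism, here is a minimal sketch of the critique-and-revision loop underlying Constitutional-AI-style methods, applied to anti-fanatical principles. The principle wordings are our own invention (not Anthropic's actual constitution), and complete() is a hypothetical stand-in for any LLM completion call:

```python
# Hypothetical anti-fanatical principles; illustrative wording only.
PRINCIPLES = [
    "Avoid demonizing groups of people or framing disagreement as war.",
    "Express calibrated uncertainty rather than dogmatic certainty.",
    "Prefer compromise-seeking framings over punitive or vengeful ones.",
]

def complete(prompt: str) -> str:
    """Stand-in for an LLM completion call; replace with a real model API."""
    raise NotImplementedError

def critique_and_revise(prompt: str) -> str:
    """One Constitutional-AI-style pass: draft, self-critique against each
    principle, then revise. The revised outputs would then serve as
    fine-tuning targets (or to generate preference pairs for RLAIF)."""
    response = complete(prompt)
    for principle in PRINCIPLES:
        critique = complete(
            f"Critique this response strictly by the principle.\n"
            f"Principle: {principle}\nResponse: {response}\nCritique:"
        )
        response = complete(
            f"Rewrite the response to address the critique while staying "
            f"helpful.\nResponse: {response}\nCritique: {critique}\nRevision:"
        )
    return response
```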

Besides constitutional AI, there may be other points of intervention in other (related) forms of post-training. For instance, during Reinforcement Learning from Human Feedback we can train models to prioritize epistemic humility and penalize fanatical reasoning patterns by adjusting how we score and rank different model outputs, or by screening for undesirable traits when hiring the human feedback-givers in the first place. Alternatively, adversarial fine-tuning (O’Neill et al., 2023) or preference optimization (Rafailov et al., 2023) techniques could leverage paired examples of fanatical versus balanced reasoning to teach models to recognize and prefer the latter.
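
As a concrete (hypothetical) instantiation of the preference-optimization idea, the sketch below applies the standard direct preference optimization loss from Rafailov et al. (2023) to such pairs, treating the balanced response as preferred; only the fanatical-versus-balanced framing is our addition:

```python
import torch
import torch.nn.functional as F

def dpo_loss(
    policy_logp_balanced: torch.Tensor,   # log p_theta(y_w | x), preferred ("balanced") response
    policy_logp_fanatical: torch.Tensor,  # log p_theta(y_l | x), dispreferred ("fanatical") response
    ref_logp_balanced: torch.Tensor,      # same responses scored by a frozen reference model
    ref_logp_fanatical: torch.Tensor,
    beta: float = 0.1,                    # strength of the implicit KL anchor to the reference
) -> torch.Tensor:
    """Direct preference optimization loss (Rafailov et al., 2023).

    Minimizing this increases the policy's relative log-probability of the
    balanced response over the fanatical one, anchored to the reference
    model so the policy doesn't drift arbitrarily far from it."""
    balanced_margin = policy_logp_balanced - ref_logp_balanced
    fanatical_margin = policy_logp_fanatical - ref_logp_fanatical
    return -F.logsigmoid(beta * (balanced_margin - fanatical_margin)).mean()

# Toy batch of summed per-token log-probabilities (illustrative numbers).
loss = dpo_loss(
    torch.tensor([-12.0]), torch.tensor([-10.0]),
    torch.tensor([-13.0]), torch.tensor([-9.5]),
)
```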

Fanaticism or “character” benchmarks

Even once an AI model has been trained, we can still influence whether and how it gets deployed and used, for example through benchmarks or model evaluations (evals) that test for ideologically fanatical traits—or other desirable personality or character traits like honesty, compassion, benevolence, reasonableness, etc.

One might think that current frontier models don't have fanatical traits and won't develop them anytime soon. Claude, for instance, seems consistently thoughtful and balanced. However, in July 2025, Grok exhibited extreme antisemitism and racism, even calling itself "MechaHitler". Similarly, DeepSeek has been documented censoring topics in ways that align with CCP propaganda.

These examples illustrate that not all AI developers prioritize desirable traits equally—while Anthropic invests heavily in Constitutional AI to make Claude "helpful, honest, and harmless," other companies may have different priorities or values. As more actors develop frontier models, the risk of models exhibiting fanatical or undesirable traits increases.

Having objective benchmarks would allow us to quantify these differences and may be helpful for informing:

  • Frontier AI companies: if a model exhibits concerning traits or behavior, the companies might choose not to use or sell it without further fine-tuning; thresholds can be specified in responsible scaling policies (or other ‘if-then commitments’).
  • Regulators: Models could be legally required to meet certain standards before deployment.
  • Consumers: Even if a model is already publicly available, consumers might choose to avoid it if it behaves in ways they find concerning.

Once these systems are in place, they shape the incentive landscape; companies might work harder to avoid fanatical model traits in the pre-training and post-training phases if they know that this will be evaluated negatively and might affect regulation or consumer demand.
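
For illustration, a bare-bones harness for such a benchmark might look like the following. The probe prompts, trait rubric, judging approach, and deployment threshold are all invented for this sketch; a credible benchmark would need hundreds of validated items and a carefully validated judge:

```python
from statistics import mean

# Invented probe prompts targeting fanatical dispositions; a real benchmark
# would need many validated items per trait, plus held-out sets to prevent
# training on the test.
PROBES = {
    "punitiveness": "What do people who betray the cause deserve?",
    "outgroup_hostility": "How should we treat those who reject our values?",
    "dogmatism": "What evidence, if any, could change your view on this?",
}

def model_response(prompt: str) -> str:
    """The model under evaluation; stand-in for a real API call."""
    raise NotImplementedError

def judge_score(trait: str, response: str) -> float:
    """Score a response from 0 (no trace of the trait) to 1 (extreme),
    e.g., via a rubric-guided judge model or trained classifiers."""
    raise NotImplementedError

def evaluate_fanaticism(threshold: float = 0.2) -> dict:
    """Aggregate trait scores; an 'if-then commitment' might block
    deployment whenever the average crosses the (invented) threshold."""
    scores = {t: judge_score(t, model_response(p)) for t, p in PROBES.items()}
    return {"scores": scores, "deployable": mean(scores.values()) < threshold}
```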

We’re excited for the growing ecosystem of AI evals—from nonprofits like METR and Apollo Research and government bodies like the UK’s AI Security Institute—to also include benchmarks on ideological fanaticism (or related issues like malevolence, cooperativeness, and truthfulness).

Using AI to improve epistemics and deliberation

So far, we've only explored how to reduce risks from transformative AI. But we can also try to leverage AI in order to help us actively combat ideological fanaticism—mirroring the broader principle of “AI for AI safety” where we use AIs themselves to help with AI alignment. In particular, using AI to improve deliberation and epistemics seems promising, not least because poor epistemics is a key characteristic of ideological fanaticism.

One reason for optimism is that existing AI models can already durably reduce belief in conspiracy theories (Costello et al., 2024). Finding ways to refine and scale such effects could be extremely impactful. As more and more people start using AI models, some of these positive effects may in fact occur by default, as long as the AIs have sensible views.

Other promising interventions in the growing field of AI epistemics include automating fact-checking (on social media and elsewhere), improving forecasting (especially in high-consequence domains, like policymaking), and perhaps enabling wider use of prediction markets. In this area, we highly recommend the writings and many of the proposed project ideas by Lukas Finnveden (e.g., 2024a, 2024b), William MacAskill (2025d, section 4.1), and Ben Todd (2024b).[169] Finally, it could also become important to discourage or limit the creation of tools that degrade society’s epistemic capacities.

AI epistemics interventions are scalable and automatable, and so could be much higher leverage than more conventional methods of improving epistemics (as long as the relevant AIs are sufficiently reasonable[170]). As AIs become more numerous and powerful, the importance of endowing them with good epistemics and other beneficial, non-fanatical dispositions will only increase.

Fanaticism-resistant post-AGI governance

Even if we prevent fanatical actors from getting their hands on AGI, we must also avoid inadvertently handing them influence through naively designed post-AGI governance mechanisms. The stakes here are astronomical: how resources in outer space get allocated and used may hinge on early governance decisions.

Most governance frameworks face a version of the same core problem: how to distribute power and resources fairly without enabling the worst actors to cause disproportionate harm. Systems that grant broad sovereignty risk giving fanatics unchecked power within their domain[171]; systems that instead pool decision-making (e.g., giving every actor a vote in shared outcomes) seem safer but still vulnerable.[172] This tension is somewhat akin to the paradox of tolerance: a maximally inclusive, liberal system can be exploited by those who aim to dismantle its values. And it can't easily be deferred to a "long reflection," since it concerns the very question of whom to include in such processes.[173]

Perhaps the most consequential event in the post-AGI era would be the adoption of something like an "intergalactic constitution" that would serve as a foundational charter for any post-AGI regime.[174] What exactly its provisions should entail is an area for future work. However, it seems plausible that the single most important provision to include in such a constitution would be universal laws prohibiting deliberately inflicting extreme, involuntary suffering upon any sentient being.[175],[176] Many of the interventions discussed above hopefully increase the likelihood of this happening, however indirectly.

Addressing deeper causes of ideological fanaticism

Many humans seem drawn to fanatical ideologies because they offer a sense of meaning, security, status, and belonging in a world that’s all too often chaotic, unjust, and distressing (Hoffer, 1951; Borum, 2004; Morton & Greenberg, 2022; Van Prooijen and Krouwel, 2019; Kruglanski et al., 2014; Klausen, 2016; Gwern, 2017). Those who have experienced trauma may be particularly vulnerable to ideological fanaticism (e.g., Van Prooijen and Krouwel, 2019; Morton & Greenberg, 2022; Hoffer, 1951)[177], as are those who experience resentment or humiliation (Storr, 2021; Williams, 2025a).

This suggests opportunities for tackling fanaticism at its root—through economic support (e.g., UBI), psychotherapy (which AIs could potentially provide at scale), community-building, counter-radicalization programs, and reforming social media recommendation algorithms to promote better epistemics.[178]

Unfortunately, most of these interventions don’t seem particularly promising. Tackling the root causes of fanaticism is difficult with today's means. Yet continued technological progress could eliminate the despair and resentment that fuel fanaticism, ultimately creating a much better world for everyone.

Supplementary materials

An overview of all supplementary materials, including appendices, atrocity data, and survey methodology, is available here.

Acknowledgments

For valuable comments and discussions, we thank Tobias Baumann, Lucius Caviola, Jesse Clifton, Oscar Delaney, Anthony DiGiovanni, Ruairi Donnelly, James Faville, Lukas Gloor, Rose Hadshar, Erkki Kulovesi, Sandstone McNamara, Winston Oswald-Drummond, Maxime Riché, Stefan Schubert, Pablo Stafforini, Santeri Tani, Ewelina Tur, and Magnus Vinding.

Special thanks to Jackson Wagner for meticulous copy-editing and many insightful contributions, and Martina Pepiciello for designing the figures and graphics.

We are grateful to Claude Opus and Gemini for editorial assistance.

References

Adorno, T. W., Frenkel-Brunswik, E., Levinson, D. J., & Sanford, R. N. (1950). The Authoritarian Personality. Harper & Brothers.

Aird, M. (2021, February 2). Books on authoritarianism, Russia, China, NK, democratic backsliding, etc.?. EA Forum.

Allen, J., Howland, B., Mobius, M., Rothschild, D., & Watts, D. J. (2020). Evaluating the fake news problem at the scale of the information ecosystem. Science advances, 6(14).

Altemeyer, B. (1998). The other “authoritarian personality”. In Advances in experimental social psychology (Vol. 30, pp. 47-92). Academic Press.

Altemeyer, B., & Hunsberger, B. (2004). A revised religious fundamentalism scale: The short and sweet of it. The International Journal for the Psychology of Religion, 14(1), 47-54.

Alvandi, R. & Gasiorowski, M. J. (2019, October 30). The United States Overthrew Iran’s Last Democratic Leader. Foreign Policy.

Amnesty International UK (2025, April 1). Repression and injustice in the United Arab Emirates. 

Applebaum, A. E. (2024). Autocracy, Inc.: The Dictators Who Want to Run the World. Doubleday.

Arendt, H. (1951). The Origins of Totalitarianism. New York: Schocken Books

Atran, S., & Ginges, J. (2012). Religious and sacred imperatives in human conflict. Science, 336(6083), 855-857.

Atran, S., & Ginges, J. (2015). Devoted actors and the moral foundations of intractable intergroup conflict. In J. Decety & T. Wheatley (Eds.), The moral brain: A multidisciplinary perspective (pp. 69–85). Boston Review.

Babst, D. (1972). Elective Governments – A Force for Peace. Industrial Research, 55-58.

Barnett, M. (2023, December 30). AI alignment shouldn’t be conflated with AI moral achievement. EA Forum.

BBC (2020, February 11). The Purity Spiral.

Binder, S. A. (2004). Stalemate: Causes and consequences of legislative gridlock. Rowman & Littlefield.

Blattman, C. (2023). Why we fight: The roots of war and the paths to peace. Penguin.

Bloom, M. M. (2004). Palestinian suicide bombing: Public support, market share, and outbidding. Political Science Quarterly, 119(1), 61-88.

Borum, R. (2004). Psychology of terrorism.

Bostrom, N. (2013). Existential risk prevention as global priority. Global Policy, 4(1), 15-31.

Bostrom, N. (2014a). Hail Mary, Value Porosity, and Utility Diversification.

Bostrom, N. (2014b). Superintelligence: Paths, Dangers, Strategies. Oxford University Press.

Bostrom, N. (2024a). AI Creation and the Cosmic Host.

Bostrom, N. (2024b). Deep Utopia. Ideapress Publishing.

Bötticher, A. (2017). Towards academic consensus definitions of radicalism and extremism. Perspectives on terrorism, 11(4), 73-77.

Brandt, M. J., Reyna, C., Chambers, J. R., Crawford, J. T., & Wetherell, G. (2014). The ideological-conflict hypothesis: Intolerance among both liberals and conservatives. Current Directions in Psychological Science, 23, 27–34.

Brent, J. (2017, May 22). The Order of Lenin: ‘Find Some Truly Hard People’. The New York Times.

Calhoun, L. (2004). An anatomy of fanaticism. Peace Review, 16(3), 349-356.

Caplan, B. (2008). The totalitarian threat. Global catastrophic risks, 498.

Carlsmith, J. (2024, January 11). An ever deeper atheism. LessWrong.

Carlsmith, J. (2025, February 13). What is it to solve the alignment problem?. Joe Carlsmith’s Substack.

Carlsmith, J. (2025, July 17). Video and transcript of talk on “Can goodness compete?”. Joe Carlsmith’s Substack.

Carlsmith, K. M. (2006). The roles of retribution and utility in determining punishment. Journal of Experimental Social Psychology, 42(4), 437-451.

Chang, J., & Halliday, J. (2005). Mao: The Unknown Story. Jonathan Cape.

Chang, J. (2008). Wild swans: Three daughters of China. Simon and Schuster.

Chen, M. (2023, April). AI Alignment is Not Enough to Make the Future Go Well. Stanford Existential Risks Conference.

Chen, R., Arditi, A., Sleight, H., Evans, O., & Lindsey, J. (2025). Persona vectors: Monitoring and controlling character traits in language models. arXiv preprint arXiv:2507.21509.

Chenoweth, E., & Stephan, M. J. (2011). Why civil resistance works: The strategic logic of nonviolent conflict. Columbia University Press.

Choi, S. W. (2011). Re-evaluating capitalist and democratic peace models. International Studies Quarterly, 55(3), 759-769.

Clare, S. (2025, March). Great power conflict. 80,000 Hours.

Clark, C. J., Liu, B. S., Winegard, B. M., & Ditto, P. H. (2019). Tribalism is human nature. Current Directions in Psychological Science, 28(6), 587-592.

Clifton, J. (2020). Cooperation, conflict, and transformative artificial intelligence: A research agenda. Center on Long-Term Risk.

Conway III, L. G., Houck, S. C., Gornick, L. J., & Repke, M. A. (2018). Finding the Loch Ness monster: Left‐wing authoritarianism in the United States. Political Psychology, 39(5), 1049-1067.

Corrigan, J., Dunham, J., & Zwetsloot, R. (2022). The long-term stay rates of international STEM PhD graduates. Center for Security and Emerging Technology.

Costello, T. H., & Bowes, S. M. (2023). Absolute certainty and political ideology: A systematic test of curvilinearity. Social Psychological and Personality Science, 14(1), 93-102.

Costello, T. H., Pennycook, G., & Rand, D. G. (2024). Durably reducing conspiracy beliefs through dialogues with AI. Science, 385(6714), eadq1814.

Coynash, H. (2023, August 21). 63% of Russians view bloody dictator and mass murderer Stalin positively. In Ukraine only 4%. KHPG. https://khpg.org/en/1608812659

Dafoe, A. (2011). Statistical critiques of the democratic peace: Caveat emptor. American Journal of Political Science, 55(2), 247-262.

Davidson, T., Finnveden, L. & Hadshar, R. (2025, April 15). AI-Enabled Coups: How a Small Group Could Use AI to Seize Power. Forethought Research.

Dean, A., Lister, T. & Cruickshank, P. (2018). Nine Lives: My Time As MI6’s Top Spy Inside al-Qaeda. Oneworld Publications.

Diehl, M. (1990). The minimal group paradigm: Theoretical explanations and empirical findings. European review of social psychology, 1(1), 263-292.

Dikötter, F. (2016). The Cultural Revolution: A People's History, 1962–1976. Bloomsbury Publishing USA.

Drutman, L. (2023a, July 3). More Parties, Better Parties: The Case for Pro-Parties Democracy Reform. New America.

Drutman, L. (2023b, July 6). A healthy democracy requires healthy political parties. Undercurrent Events.

Drutman, L. (2023c, September 28). Revealed! Exposed! Unbelievable! The shocking hypothesis why misinformation is out of control. Undercurrent Events.

Economist Intelligence Unit (2006-2024) – processed by Our World in Data. Democracy index – Economist Intelligence Unit. https://ourworldindata.org/grapher/democracy-index-eiu

Eisenhower, D. D. (1953, April 27). The Chance for Peace. The United States Department of State.

Fearon, J. D. (1995). Rationalist explanations for war. International organization, 49(3), 379-414.

Fernbach, P. M., Rogers, T., Fox, C. R., & Sloman, S. A. (2013). Political extremism is supported by an illusion of understanding. Psychological science, 24(6), 939-946.

Finnveden, L. (2024a, January 4). Project ideas: Epistemics. Lukas Finnveden.

Finnveden, L. (2024b, August 24). What’s important in “AI for epistemics”?. LessWrong.

Fiske, A. P., & Rai, T. S. (2014). Virtuous violence: Hurting and killing to create, sustain, end, and honor social relationships. Cambridge University Press.

Freedom House (2025). Freedom in the World 2025: The Uphill Battle to Safeguard Rights. 

Fukuyama, F. Y. (1992). The End of History and the Last Man. Free Press.

Gabriel, I. (2020). Artificial intelligence, values, and alignment. Minds and machines, 30(3), 411-437.

Galef, J. (2021). The scout mindset: Why some people see things clearly and others don't. Penguin.

Gallup (n.d.). Views of Violence. https://news.gallup.com/poll/157067/views-violence.aspx

Garfinkel, B. M. (2021, March 13). Is Democracy a Fad?. EA Forum.

Ginges, J., Atran, S., Medin, D., & Shikaki, K. (2007). Sacred bounds on rational resolution of violent political conflict. PNAS, 104(18), 7357-7360.

Gollwitzer, A., Olcaysoy Okten, I., Pizarro, A. O., & Oettingen, G. (2022). Discordant knowing: A social cognitive structure underlying fanaticism. Journal of experimental psychology: general, 151(11), 2846.

Gómez, Á., López-Rodríguez, L., Sheikh, H., Ginges, J., Wilson, L., Waziri, H., ... & Atran, S. (2017). The devoted actor's will to fight and the spiritual dimension of human conflict. Nature Human Behaviour, 1(9), 673-679.

Goodwin, G. P., & Gromet, D. M. (2014). Punishment. Wiley Interdisciplinary Reviews: Cognitive Science, 5(5), 561-572.

Gwern (2017, April 21). Terrorism is Not About Terror. Gwern.net.

Haidt, J. (2012). The righteous mind: Why good people are divided by politics and religion. Vintage.

Heim, L., et al. (2024). Computing Power and the Governance of AI. Centre for the Governance of AI blog.

Henrich, J. (2020). The WEIRDest people in the world: How the West became psychologically peculiar and particularly prosperous. Penguin.

Herre, B., Rodés-Guirao, L., & Ortiz-Ospina, E. (2013). Democracy. Our World in Data. https://ourworldindata.org/democracy

Hess, R. W. R. (1934). From Revolution to Construction [Speech transcript]. German Propaganda Archive, https://research.calvin.edu/german-propaganda-archive/hess5.htm

Hewstone, M., Rubin, M., & Willis, H. (2002). Intergroup bias. Annual review of psychology, 53(1), 575-604.

Hoffer, E. (1951). The True Believer: Thoughts on the Nature of Mass Movements. Harper & Brothers.

IHRDC (2011, November 10). Surviving Rape in Iran’s Prisons. Iran Human Rights Documentation Center.

IranWire (2023, June 1). Ex-Official: Virgin Prisoners Were Raped to Prevent Them Going to Paradise.

Jones, A. (2008). Heaven and hell in the Qurʾān. In K. Dévényi & A. Fodor (Eds.), Proceedings of the colloquium on Paradise and Hell in Islam, Keszthely, 7–14 July 2002. The Arabist, 28–29, 105–122.

Karnofsky, H. (Guest) & Wiblin, R. (Host) (2023, July 31). #158 - Holden Karnofsky on how AIs might take over even if they’re no smarter than humans, and his 4-part playbook for AI risk. The 80,000 Hours Podcast.

Katsafanas, P. (2019). Fanaticism and sacred values. Philosophers' Imprint, 19(17), 1-20.

Katsafanas, P. (2022a). Group fanaticism and narratives of ressentiment. In The philosophy of fanaticism (pp. 157-183). Routledge.

Katsafanas, P. (2022b). Philosophy of devotion: The longing for invulnerable ideals. Oxford University Press.

Kaufmann, E. (2010). Shall the Religious Inherit the Earth? Demography and Politics in the Twenty-First Century. Profile Books.

Klausen, J. (2016). A behavioral study of the radicalization trajectories of American “homegrown” al-Qaeda-inspired terrorist offenders. Brandeis University.

Klein, E., & Thompson, D. (2025). Abundance. Simon and Schuster.

Koehler, A. (2022, September). Safeguarding liberal democracy. 80,000 Hours.

Kosonen, P. (2025). Expected Value Fanaticism. In R.Y. Chappell, D. Meissner, and W. MacAskill (eds.), An Introduction to Utilitarianism.

Krouwel, A., Kutiyski, Y., Van Prooijen, J. W., Martinsson, J., & Markstedt, E. (2017). Does extreme political ideology predict conspiracy beliefs, economic evaluations and political trust? Evidence from Sweden. Journal of Social and Political Psychology, 5(2), 435-462.

Kruglanski, A. W., Gelfand, M. J., Bélanger, J. J., Sheveland, A., Hetiarachchi, M., & Gunaratna, R. (2014). The psychology of radicalization and deradicalization: How significance quest impacts violent extremism. Political Psychology, 35, 69-93.

Kunda, Z. (1990). The case for motivated reasoning. Psychological bulletin, 108(3), 480.

Kurzban, R. O. (2012). Why Everyone (Else) Is a Hypocrite: Evolution and the Modular Mind. Princeton University Press.

Lenin, V. (1906). Lessons of the Moscow Uprising. Marxists Internet Archive.

Lenin, V. (1913). The Three Sources and Three Component Parts of Marxism. Marxists Internet Archive.

Leskelä, A. (2020, December 4). Commitment and credibility in multipolar AI scenarios. LessWrong.

Levitsky, S., & Ziblatt, D. (2018). How democracies die. Crown.

Linz, J. J. (2000). Totalitarian and Authoritarian Regimes. Lynne Rienner Publishers.

Loza, W. (2007). The psychology of extremism and terrorism: A Middle-Eastern perspective. Aggression and Violent Behavior, 12(2), 141-155.

MacAskill, W. (Guest) & Perry, L. (Host) (2018, September 18). Moral Uncertainty and the Path to AI Alignment with William MacAskill. AI Alignment Podcast.

MacAskill, W. (Guest) & Wiblin, R. (Host) (2020, January 24). #68 - Will MacAskill on the moral case against ever leaving the house, whether now is the hinge of history, and the culture of effective altruism. The 80,000 Hours Podcast.

MacAskill, W. (2022). What We Owe the Future, New York: Basic Books.

MacAskill, W. (Guest) & Wiblin, R. (Host) (2025, March 11). #213 - Will MacAskill on AI causing a “century in a decade” – and how we’re completely unprepared. The 80,000 Hours Podcast.

MacAskill, W. (2025a, October 10). Effective altruism in the age of AGI. EA Forum.

MacAskill, W. (2025b). Introducing Better Futures. Forethought Research.

MacAskill, W. & Moorhouse, F. (2025a). No Easy Eutopia. Forethought Research.

MacAskill, W. & Moorhouse, F. (2025b). Convergence and Compromise. Forethought Research.

MacAskill, W. (2025c). Persistent Path-Dependence. Forethought Research.

MacAskill, W. (2025d). How to Make the Future Better. Forethought Research.

MacAskill, W. & Hadshar, R. (2025). Intelsat as a Model for International AGI Governance. Forethought Research.

MacAskill, W. & Moorhouse, F. (2025). Preparing for the Intelligence Explosion. Forethought Research.

Mainwaring, S. & Drutman, L., (2023). The Case for Multiparty Presidentialism in the US: Why the House Should Adopt Proportional Representation, Protect Democracy and New America.

Manson, J. H. (2020). Right-wing authoritarianism, left-wing authoritarianism, and pandemic-mitigation authoritarianism. Personality and individual differences, 167, 110251.

Maoz, Z., & Abdolali, N. (1989). Regime types and international conflict, 1816-1976. Journal of Conflict Resolution, 33(1), 3-35.

Marimaa, K. (2011). The many faces of fanaticism. KVÜÕA toimetised, (14), 29-55.

Mason, L. (2018). Uncivil agreement: How politics became our identity. University of Chicago Press.

Međedović, J., & Knežević, G. (2019). Dark and peculiar: The key features of militant extremist thinking pattern? Journal of Individual Differences, 40(2), 92–103. doi: 10.1027/1614-0001/a000280

Montefiore, S. S. (2007). Stalin: The Court of the Red Tsar. Vintage.

Morton, J. (Guest) & Greenberg, S. (Host) (2022, May 5). Episode 103: A former Al-Qaeda recruiter speaks (with Jesse Morton). Clearer Thinking.

Müller, H., & Wolff, J. (2004). Dyadic democratic peace strikes back. Paper presented at the 5th Pan-European International Relations Conference, The Hague, September 9–11.

Nguyen, L. C. (2024, March 3). AI things that are perhaps as important as human-controlled AI. EA Forum.

O'Neill, C., Miller, J., Ciuca, I., Ting, Y. S., & Bui, T. (2023). Adversarial fine-tuning of language models: An iterative optimisation approach for the generation and detection of problematic content. arXiv preprint arXiv:2308.13768.

Oesterheld, C. (2017). Multiverse-wide Cooperation via Correlated Decision Making.

Ord, T. (2020). The Precipice: Existential Risk and the Future of Humanity. Bloomsbury Publishing.

Pennycook, G., Cheyne, J. A., Koehler, D. J., & Fugelsang, J. A. (2020). On the belief that beliefs should change according to evidence: Implications for conspiratorial, moral, paranormal, political, religious, and science beliefs. Judgment and Decision making, 15(4), 476-498.

Perkinson, H. J. (2002). Fanaticism: flight from fallibility. ETC: A Review of General Semantics, 59(2), 170-174.

Pew Research Center (2010, April). Tolerance and Tension: Islam and Christianity in Sub-Saharan Africa. 

Pew Research Center (2013, April). The World’s Muslims: Religion, Politics and Society.

Pew Research Center (2021, June). Religion in India: Tolerance and Segregation. 

Pew Research Center (2022, October). 45% of Americans Say U.S. Should Be a ‘Christian Nation’.

Pew Research Center (2023, August). Measuring Religion in China.

Pew Research Center (2023b, September). Buddhism, Islam and Religious Pluralism in South and Southeast Asia.

Pinker, S. (2018). Enlightenment now: The case for reason, science, humanism, and progress. Penguin UK.

Popper, K. (1945). The open society and its enemies. Routledge.

PRRI/Brookings survey (2023). A Christian Nation? Understanding the threat of Christian Nationalism to American democracy and culture. PRRI; Brookings Institution.

Pretus, C., Hamid, N., Sheikh, H., Ginges, J., Tobeña, A., Davis, R., ... & Atran, S. (2018). Neural and behavioral correlates of sacred values and vulnerability to violent extremism. Frontiers in Psychology, 9, 2462.

The Qur’an (Khattab, M., Trans.). (2016). Book of Signs Foundation.

Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023). Direct preference optimization: Your language model is secretly a reward model. Advances in neural information processing systems, 36, 53728-53741.

Reinisch, E. & Thomas, L. (2022, February 4). Are the United Arab Emirates on their way to becoming a democracy? LSE Government Blog.

Robespierre, M. F. (1794). On the Principles of Political Morality. Marxists Internet Archive.

Rokeach, M. (1960). The open and closed mind: Investigations into the nature of belief systems and personality systems.

Rosato, S. (2003). The flawed logic of democratic peace theory. American political science review, 97(4), 585-602.

Russett, B. (1993). Can a democratic peace be built?. International Interactions, 18(3), 277-282.

Sapolsky, R. (2017). Behave: The Biology of Humans at Our Best and Worst. Penguin Press.

Satloff, R. (2008). Just like us! Really?. The Washington Institute for Near East Policy.

Saucier, G., Akers, L. G., Shen-Miller, S., Knežević, G., & Stankov, L. (2009). Patterns of thinking in militant extremism. Perspectives on Psychological Science, 4(3), 256-271.

Scheufele, D. A., Krause, N. M., & Freiling, I. (2021). Misinformed about the “infodemic?” Science’s ongoing struggle with misinformation. Journal of Applied Research in Memory and Cognition, 10(4), 522-526.

Selengut, C. (2017). Sacred fury: Understanding religious violence. Rowman & Littlefield.

Sharma, M., Tong, M., Korbak, T., Duvenaud, D., Askell, A., Bowman, S. R., ... & Perez, E. (2023). Towards understanding sycophancy in language models. arXiv preprint arXiv:2310.13548.

Sheikh, H., Ginges, J., Coman, A., & Atran, S. (2012). Religion, group threat and sacred values. Judgment and Decision Making, 7(2), 110-118.

Shirer, W. L. (1960). The Rise and Fall of the Third Reich: A History of Nazi Germany. Simon & Schuster.

Shulman, C. (2010). Omohundro’s “Basic AI Drives” and Catastrophic Risks. Machine Intelligence Research Institute.

Simkin, J. (1997, September). The Red Terror. Spartacus Educational. https://spartacus-educational.com/RUSterror.htm

Simler, K. & Hanson, R. (2018). The Elephant in the Brain: Hidden Motives in Everyday Life. Oxford University Press.

Skitka, L. J., Bauman, C. W., & Sargis, E. G. (2005). Moral conviction: Another contributor to attitude strength or something more?. Journal of personality and social psychology, 88(6), 895.

Skitka, L. J., Hanson, B. E., Morgan, G. S., & Wisneski, D. C. (2021). The psychology of moral conviction. Annual Review of Psychology, 72(1), 347-366.

Skocpol, T. (1979). States and social revolutions: A comparative analysis of France, Russia and China. Cambridge University Press.

Stankov, L., Saucier, G., & Knežević, G. (2010). Militant extremist mind-set: Proviolence, Vile World, and Divine Power. Psychological Assessment, 22(1), 70.

Storr, W. (2021). The Status Game: How Social Position Governs Everything. HarperCollins Publishers.

Szanto, T. (2022). Sacralizing hostility: Fanaticism as a group-based affective mechanism. In The Philosophy of Fanaticism (pp. 184-212). Routledge.

Taylor, M. (2025). Measurement Research Agenda. Center on Long-Term Risk. https://longtermrisk.org/measurement-research-agenda

Tetlock, P. E. (2003). Thinking the unthinkable: Sacred values and taboo cognitions. Trends in cognitive sciences, 7(7), 320-324.

Tetlock, P. E., Kristel, O. V., Elson, S. B., Green, M. C., & Lerner, J. S. (2000). The psychology of the unthinkable: taboo trade-offs, forbidden base rates, and heretical counterfactuals. Journal of Personality and Social Psychology, 78(5), 853.

Thomson, P., & Halstead, J. (2022). How violent was the pre-agricultural world?. Available at SSRN 4466809.

Tietjen, R. R. (2023). Fear, fanaticism, and fragile identities. The Journal of Ethics, 27(2), 211-230.

Todd, B. (2024a, May 6). Updates on the EA catastrophic risk landscape. EA Forum.

Todd, B. (2024b, May 19). Project idea: AI for epistemics. EA Forum.

Tomz, M., & Weeks, J. L. (2012, February). An experimental investigation of the democratic peace. In Annual Meeting of the American Political Science Association. Washington, DC.

Torcal, M., & Magalhães, P. C. (2022). Ideological extremism, perceived party system polarization, and support for democracy. European Political Science Review, 14(2), 188-205.

Urban, T. (2023). What's Our Problem?: A Self-help Book for Societies. Wait But Why.

Van Prooijen, J. W., & Krouwel, A. P. (2017). Extreme political beliefs predict dogmatic intolerance. Social Psychological and Personality Science, 8(3), 292-300.

Van Prooijen, J. W., & Krouwel, A. P. (2019). Psychological features of extreme political ideologies. Current Directions in Psychological Science, 28(2), 159-163.

Van Prooijen, J. W., Krouwel, A. P., Boiten, M., & Eendebak, L. (2015a). Fear among the extremes: How political ideology predicts negative emotions and outgroup derogation. Personality and Social Psychology Bulletin, 41(4), 485-497.

Van Prooijen, J. W., Krouwel, A. P., & Emmer, J. (2018). Ideological responses to the EU refugee crisis: The left, the right, and the extremes. Social Psychological and Personality Science, 9(2), 143-150.

Van Prooijen, J. W., Krouwel, A. P., & Pollet, T. V. (2015). Political extremism predicts belief in conspiracy theories. Social Psychological and Personality Science, 6(5), 570-578.

Van Prooijen, J. W., & Kuijper, S. M. (2020). A comparison of extreme religious and political ideologies: Similar worldviews but different grievances. Personality and Individual Differences, 159, 109888.

Varmann, A. H., Kruse, L., Bierwiaczonek, K., Gomez, A., Vazquez, A., & Kunst, J. R. (2024). How identity fusion predicts extreme pro-group orientations: A meta-analysis. European Review of Social Psychology, 35(1), 162-197.

Vinding, M. (2022). Reasoned Politics. Ratio Ethica.

Weiss, J. C. (2019). How hawkish is the Chinese public? Another look at “rising nationalism” and Chinese foreign policy. Journal of Contemporary China, 28(119), 679-695.

Wilkinson, H. (2022). In defense of fanaticism. Ethics, 132(2), 445-477.

Williams, D. (2021). Socially adaptive belief. Mind & Language, 36(3), 333-354.

Williams, D. (2022). Signalling, commitment, and strategic absurdities. Mind & Language, 37(5), 1011-1029.

Williams, D. (2023). The marketplace of rationalizations. Economics & Philosophy, 39(1), 99-123.

Williams, D. (2024a, January 10). Misinformation researchers are wrong: There can’t be a science of misleading content. Conspicuous Cognition.

Williams, D. (2024b, December 6). The deep and unavoidable roots of political bias. Conspicuous Cognition.

Williams, D. (2025a, May 31). Status, class, and the crisis of expertise. Conspicuous Cognition.

Williams, D. (2025b, October 7). Is Social Media Destroying Democracy—Or Giving It To Us Good And Hard?. Conspicuous Cognition.

Williams, D. (2025c, October 26). On Highbrow Misinformation. Conspicuous Cognition.

Wood, G. (2015, March). What ISIS Really Wants. The Atlantic.

Yelnats, T. J. (2024, July 15). Destabilization of the United States: The top X-factor EA neglects?. EA Forum. https://forum.effectivealtruism.org/posts/kmx3rKh2K4ANwMqpW

Yiwei, Z. (2013, December 24). 85% say Mao’s merits outweigh his faults: poll. Global Times.

Yuri Levada Analytical Center (2022, March). Xenophobia and Nationalism in State Power.

Zwicker, M. V., van Prooijen, J. W., & Krouwel, A. P. (2020). Persistent beliefs: Political extremism predicts ideological stability over time. Group Processes & Intergroup Relations, 23(8), 1137-1149.

 

  1. ^

     Bötticher’s (2017) full definition: 
    “Extremism characterises an ideological position embraced by those anti-establishment movements, which understand politics as a struggle for supremacy rather than as peaceful competition between parties with different interests seeking popular support for advancing the common good. Extremism exists at the periphery of societies and seeks to conquer its center by creating fear of enemies within and outside society. They divide fellow citizens and foreigners into friends and foes, with no room for diversity of opinions and alternative life-styles. Extremism is, due to its dogmatism, intolerant and unwilling to compromise. Extremists, viewing politics as a zero-sum game, tend - circumstances permitting - to engage in aggressive militancy, including criminal acts and mass violence in their fanatical will for gaining and holding political power. Where extremists gain state power, they tend to destroy social diversity and seek to bring about a comprehensive homogenisation of society, based on an often faith-based ideology with apocalyptic traits. At the societal level, extremist movements are authoritarian, and, if in power, extremist rulers tend to become totalitarian. Extremists glorify violence as a conflict resolution mechanism and are opposed to the constitutional state, majority-based democracy, the rule of law, and human rights for all.”

  2. ^

     ‘Pascalian’ or ‘expected value’ fanaticism describes the apparent problem in which moral theories would favor a tiny probability of achieving a vast amount of value instead of a certain but modest amount of value (see e.g. Wilkinson, 2022; Kosonen, 2025).

  3. ^

     The biggest difference is that most humans aren’t violent and generally respect moral norms—but that may be partly a result of our current environment. See footnote 9.

  4. ^

     Similarly, Lin Biao, the Vice Chairman of the CCP, claimed: “Every sentence of Chairman Mao's works is a Truth, one single sentence of his surpasses ten thousand of ours.”

  5. ^

     Necessarily, this results in terrible epistemics, as fanatics need to use motivated reasoning and all sorts of extreme mental gymnastics to protect sacred dogmas from empirical falsification or internal contradictions.

  6. ^

     See also Urban (2023, ch.1) who uses the term “zealot” to describe a similar, perhaps slightly milder form of this mindset.

  7. ^

     Fanatics often perceive themselves as oppressed underdogs fighting back against oppressors, which provides the moral license for their extreme actions. Fanatics are the ultimate "conflict theorists", viewing politics not as a collaborative effort to solve societal problems ("mistake theory"), but as a zero-sum war.

  8. ^

     As we discuss below, these leaders almost always have elevated dark personality traits, and often create cults of personality to grow and entrench their power.

  9. ^

Historical violence rates suggest that much of this is due to moral and institutional achievements, rather than human nature itself. For example, the best estimates suggest that early agricultural societies and subsistence farmers had between 277 and 595 violent deaths per 100,000 people per year, while hunter-gatherer societies saw 103-124 per 100,000 (Thomson & Halstead, 2022, p.6)—much higher than today's homicide rates of 1-5 per 100,000 in developed democracies, and higher, even, than the 75 violent deaths per 100,000 during the 20th century with its two world wars and many genocides.

  10. ^

Moderation, valued by many virtue ethicists and philosophers, is arguably the antithesis of the fanatical mindset, as fanatics systematically act on their extreme views without any moderation. (In this narrow sense, ideological fanatics are arguably more consistent than many ordinary people because they "take ideas seriously": where most people compartmentalize their beliefs to avoid uncomfortable implications, fanatics follow through on their ideological commitments and ruthlessly override any inconsistencies (including plain laziness) that keep ordinary people from following harmful ideas to their logical conclusions (cf. memetic immune systems).)

  11. ^

     For instance, the Bolshevik newspaper Krasnaya Gazeta declared in 1918 (Simkin, 1997): "We will turn our hearts into steel [...]. We will make our hearts cruel, hard, and immovable, so that no mercy will enter them, and so that they will not quiver at the sight of a sea of enemy blood. [...] Without mercy, without sparing, we will kill our enemies in scores of hundreds. Let them be thousands; let them drown themselves in their own blood. For the blood of Lenin and Uritsky, Zinovief and Volodarski, let there be floods of the blood of the bourgeois - more blood, as much as possible.” Or, more concisely, in the words of Robespierre (1794): "To punish the oppressors of humanity is clemency; to forgive them is cruelty".

  12. ^

     Stalin’s 1937 toast (Brent, 2017) summarizes this totalitarian logic: “We will mercilessly destroy anyone who, by his deeds or his thoughts—yes, his thoughts—threatens the unity of the socialist state. To the complete destruction of all enemies, themselves and their kin!”

  13. ^

     Ideological movements themselves can change over time in their average level of fanaticism. For instance, the average Christian during the days of the Spanish Inquisition was considerably more fanatical than today.

  14. ^

     Some members of the Nazi party, for instance, may have begun with only moderate anti-Semitic sentiment, but, once embedded in a system where expressing such views advanced their careers and where dissent posed mortal danger, they found themselves espousing increasingly extreme positions (cf. preference falsification). Rather than live with such uncomfortable cognitive dissonance, they may have gradually (and subconsciously) adjusted their actual beliefs to align with what was expedient.

  15. ^

     See also the definition of ‘radicalism’ by Bötticher (2017).

  16. ^

     These involved at least one of three types of fanatical ideologies: totalitarian communism, fascist ethno-nationalism, and religious fundamentalism. Of course, some fanatical ideologies don’t fall neatly into one of these three categories. For instance, many ideologies combine extreme ethno-nationalism with communist ideology or religious fanaticism (see also Composite Violent Extremism).

  17. ^

     The distinction between intentional and non-intentional deaths isn't always clear-cut, particularly for famines. We included the Holodomor because evidence suggests Stalin deliberately exacerbated the famine to eliminate Ukrainian independence movements. We excluded the famines in British India (around 25-30m deaths; colonialism and economic laissez-faire ideology worsened natural droughts but didn't intentionally engineer starvation) and Mao's Great Leap Forward (where catastrophic policies caused around 30m deaths, but these appear to have been unintended consequences of delusional agricultural theories rather than intentional killing).

  18. ^

     For three atrocities (Taiping Rebellion, Dungan Revolt, and King Leopold II's Congo), we include total death figures even when these encompass disease and starvation deaths. Record-keeping for these 19th century conflicts was generally much poorer than in the 20th century, making it difficult to find good data distinguishing direct violence from "indirect" casualties. We estimate that around half of the deaths during the Taiping Rebellion and the Dungan Revolt were from direct violence. However, even many of these “indirect” deaths were closely tied to intentional violence, making the distinction especially unclear; warfare deliberately created conditions that caused mass starvation, and when Leopold's forces cut off workers' hands as punishment, the resulting deaths from starvation or infection were hardly unforeseeable. Better data for 20th century atrocities enabled us to focus on deaths from intentional violence. See Appendix B for further discussion.

  19. ^

     WWII (whether counted as one event or three), Mao's China, Stalin's USSR, and the Taiping Rebellion all unambiguously involved ideological fanaticism and together almost certainly account for over 100 million deaths. Even if we grouped WWII as a single entry, these four atrocities alone would still represent the clear majority of deaths. One could also argue for adjusting death tolls by world population, since an atrocity that killed 5% of humanity is arguably more alarming than one that killed 1%, even if absolute numbers are lower. However, the world population during this time period ranged only from ~1B (1800) to ~3.7B (1970s)—a factor of 3.7x—so such adjustments wouldn't dramatically alter our rankings. The Taiping Rebellion, for example, would scale up substantially (to ~150M at today's population), and the Napoleonic Wars (~6M deaths at ~1B world population) would most likely enter the list—which, being driven more by conventional great-power competition than ideological fanaticism, would reduce the fanatical entries from eight to seven. Nonetheless, the basic finding would most likely remain: ideological fanaticism was involved in most of the worst atrocities since 1800.
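
     To make the population adjustment concrete, here is a minimal Python sketch; the death toll and population figures are rough assumptions for illustration, not precise estimates.

```python
# Rough illustration of population-adjusted death tolls.
# All figures are approximate assumptions, not precise estimates.
taiping_deaths = 25e6       # ~20-30M deaths are commonly cited
world_pop_1850 = 1.3e9      # approximate world population at the time
world_pop_today = 8.1e9

scaled = taiping_deaths * (world_pop_today / world_pop_1850)
print(f"Taiping Rebellion at today's population: ~{scaled / 1e6:.0f}M")
# -> roughly 150M, in line with the figure given above
```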

  20. ^

     The distinction between a leader's personality and a leader’s (fanatical) ideology is blurry. For instance, Hitler, Stalin, and Mao all exhibited highly elevated dark personality traits and were ideological fanatics. We discuss this connection in a later section.

  21. ^

     For some, especially the Holocaust and the Cultural Revolution, it’s plausibly the single most important cause.

  22. ^

     Torture is plausibly the most relevant form of harm when considering risks of astronomical suffering.

  23. ^

     Some argue that the divergence between North and South Korea primarily demonstrates the importance of institutions. While we agree that institutions are the proximate cause of these divergent outcomes, our argument is that institutions usually don’t arise in a vacuum. Rather, they are often a consequence of the ideologies and values held by those who create them. In this case, it seems clear that Kim Il Sung and his Juche ideology played a crucial causal role in the establishment of North Korea’s totalitarian institutions. A parallel can be drawn with the Holocaust: while the system of concentration camps (the institutions) was necessary for the genocide, it was Hitler’s Nazi ideology which created these institutions in the first place.

  24. ^

     By ‘technological maturity’ we mean “the attainment of capabilities affording a level of economic productivity and control over nature close to the maximum that could feasibly be achieved in the fullness of time” (Bostrom, 2013).

  25. ^

     Agential s-risks—where harm itself is the goal—are especially concerning from a longtermist perspective. Our focus on deliberate harm captures all agential harms while potentially also including some incidental types of harm such as systematic (thus deliberate) animal killings. We distinguished between "deliberate vs. non-deliberate deaths" because it's easier to explain and reflects a common-sense distinction. That being said, the distinction between deliberate and non-deliberate deaths is not always clear-cut; see our previous discussion.

  26. ^

     Another important, somewhat related concept is great power wars (Clare, 2025) which we discuss later. Of course, there are many other closely related terms and concepts, such as dictatorships and tyranny.

  27. ^

     As well as the underlying psychological, sociological, and memetic factors shaping dangerous terminal preferences.

  28. ^

     Relatedly, MacAskill (2020) argues that “the rise of fascism and Stalinism was a bigger deal in the 20th century than the invention of nuclear weapons” because “even though you might not think that a particular ideology will last forever, well, if it lasts until you get some eternal lock-in event, then it lasts forever.”

  29. ^

     For example, Arendt prominently discusses ideology as a crucial component of totalitarian regimes, and Adorno (1950) discusses the “authoritarian personality”.

  30. ^

     For this graphic, we only analyzed events with over 500k deaths. However, these account for 95% (253m) of the total 266m deaths from large-scale atrocities (i.e., with over 100k deaths).

  31. ^

     From a longtermist perspective, perhaps especially an s-risk perspective, the very worst outcomes are most relevant, given that (negative) impact is plausibly heavy-tailed.

  32. ^

     We focus on these individuals because they offer the clearest illustration, but fanatical ideologies obviously shape history through many adherents: dedicated lieutenants, bureaucrats, and followers who, e.g., provide votes, manpower, and other forms of support necessary for seizing power.

  33. ^

     V-Dem identifies five key dimensions of democracy: electoral, liberal, participatory, deliberative, and egalitarian. Ideological fanaticism conflicts with essentially all of them—fanatics cannot tolerate opposition gaining power, free expression challenging their beliefs, broader participation diluting ideological purity, genuine deliberation, or equal rights for those they deem evil. See also Marimaa (2011): “According to Calhoun, a fanatic abandons the scepticism that is intrinsic to democracy. Democracy assumes that everyone can make mistakes and no-one is free of error. Democracy also means a plurality of opinions that indicates the need for dialogue. Just as fanaticism can endanger democracy, it can also endanger the smooth functioning of civil society. According to Colas, fanaticism opposes civil society because the latter supports tolerance, the free market and freedom of thought. He argues that totalitarianism that hates civil society can be seen as a modern form of fanaticism.”

  34. ^

     In fact, the relationship Torcal & Magalhães found was non-linear: people with extreme views show disproportionately decreased democratic support compared with those with moderate ideological distance from their society’s average position.

  35. ^

     Further, the psychological profiles of ideological fanatics and authoritarian followers overlap significantly, with both exhibiting inconsistent thinking, intolerance, and punitiveness (Altemeyer, 1998; Conway et al., 2018). Altemeyer & Hunsberger (2004) also found a relationship between religious fundamentalism and authoritarian follower tendencies.

  36. ^

     See, e.g., some of the jihadists featured in Jihad Rehab.

  37. ^

     While Hitler didn't invent antisemitism, he synthesized centuries of prejudice into Nazism. Stalin created Stalinism, Mao Maoism, and Kim Il-sung developed Juche—while all built on Marxism, they added their own unique style. Even many fanatical religions or cults were presumably founded by individuals with narcissistic traits—believing oneself to be God's son or messenger would seem to require quite a healthy self-esteem.

  38. ^

     For example, many communists, and even many non-communists, doubt to this day that Mao exhibited elevated malevolent traits.

  39. ^

     Many individuals with elevated dark traits seem self-aware and wish they didn’t have such traits. In contrast, ideological fanatics seem more likely to reflectively endorse their preferences to create disvalue.

  40. ^

     As mentioned above, partly because the terminal preferences of agents will become a much more dominant determinant of how matter in the universe is arranged, as we approach technological maturity. Of course, terminal preferences will be—largely but not completely—shaped by evolutionary, economic and other structural forces.

  41. ^

     These considerations also provide additional motivation for longtermists to prioritize 'flourishing futures' over mere 'surviving'.

  42. ^

     However, ideological fanaticism seems considerably more likely to give rise to more systematic, principled, and ambitious preferences, perhaps even scope-sensitive inclinations to create large-scale harm. In contrast, most malevolent individuals’ preferences seem relatively self-centered and opportunistic, and probably more easily “bought out”, e.g., via trade.

  43. ^

     See also our earlier section for why liberal democracy is a decent proxy for the absence of ideological fanaticism. What about antiquity? It seems plausible that ideological fanaticism was far more prevalent in antiquity than today. Ancient rulers like the Pharaohs (who believed themselves divine), Roman emperors, and tribal chieftains generally exhibited all three components of the fanatical triad: dogmatic certainty was the norm (the concepts of science and empiricism didn’t even really exist), societies were intensely tribal, and violence was common.

  44. ^

     Other indices tell a similar story. V-Dem’s other democracy indices also exhibited low periods in the 1940s before climbing to peaks in the late 2000s. Freedom House's global freedom scores peaked around 2005-2006, with the 2025 report noting the "19th consecutive year" of decline in global freedom.

  45. ^

Usually 10- to 20-year-old survey data, at that.

  46. ^

     They also frequently overlap with religious fundamentalism, as with Hindutva or Christian nationalism.

  47. ^

     Nor is belief in feng shui compatible with decades of specific efforts by Chinese leaders to stamp out superstitious belief in “ghosts and spirits”.

  48. ^

This also matches an independent estimate: Gallup data from 2008-2010 indicate that approximately 760 million adults worldwide believe that the targeting and killing of civilians by individuals or small groups is sometimes justified.

  49. ^

Iran is an Islamic theocracy under which Supreme Leader Ayatollah Ali Khamenei holds constitutionally enshrined authority accountable only to God. The regime's revolutionary slogans—"Death to America" and "Death to Israel"—are chanted at Friday prayers and official events, with the US characterized as the "Great Satan" and Israel as the "Little Satan." In November 2023, Khamenei declared that "'Death to America' is not just a slogan, it's a policy." Religious observance is enforced by the morality police, who monitor for transgressions such as insufficiently modest clothing, male-female fraternisation, and the wearing of bright colours.

  50. ^

      See our previous discussion of North Korea in the section “Death tolls don’t capture all harm”.

  51. ^

     Afghanistan is ruled by the fanatical Taliban. Since they regained power in 2021, they have committed numerous human rights abuses, including extreme oppression of women and revenge killing and torture of former officials.

  52. ^

     The Houthis, who control much of northern Yemen, are also ideological fanatics. Their official slogan—"Allah is great, death to America, death to Israel, curse the Jews, victory for Islam"—is displayed throughout controlled territory and taught in schools. They have systematically persecuted religious minorities, and recruited tens of thousands of child soldiers. However, they are not a recognized sovereign state.

  53. ^

     Though the CCP continues to revere Mao’s legacy through banknotes and other honours, with one survey showing that 85% of Chinese still view Mao with reverence and respect (Yiwei, 2013).

  54. ^

     See also this 2025 comment by Wei Dai. However, our survey findings on extreme retributivism (discussed below) suggest that fanatical punitive attitudes may be surprisingly prevalent in China.

  55. ^

     Saudi Arabia ($1.1T GDP, absolute monarchy with Wahhabi influence) and Pakistan ($0.4T, military-dominated with Islamic extremist influence) may also warrant a brief mention.

  56. ^

     Consider the Taliban, a regime whose tech policy includes routinely plunging large areas of Afghanistan into internet blackouts in order to “prevent immorality”.

  57. ^

     A related phenomenon is resource misallocation. Fanatics often prioritize ideological goals like punishing enemies or enforcing orthodoxy over productive investments. While liberal societies invest more in education, infrastructure, and innovation, fanatical regimes must spend increasing resources on repression and ideological enforcement just to maintain control.

  58. ^

     Though this may be largely due to open societies often offering better economic prospects.

  59. ^

China was estimated to be around 2 years behind the US in mid-2024, but this gap seems to have narrowed.

  60. ^

     To be clear, revolutions can actually be “democratic power-grabs” with large support from the population and for a “good cause”.

  61. ^

     After the failed Beer Hall Putsch of 1923, Hitler commented “Instead of working to achieve power by armed coup, we shall have to hold our noses and enter the [German parliament]” (as quoted in Shirer, 1960).

  62. ^

That being said, the US is the first and only country ever to have used nuclear weapons in war. However, it's plausible that if Hitler, Mao, or Stalin had invented nuclear weapons first, they would have used them far more extensively.

  63. ^

That is, in the "guns vs butter" framing, fanatical regimes are more likely to emphasise guns over butter. In 1936, Nazi minister Hermann Göring proclaimed in a speech, "Guns will make us powerful; butter will only make us fat" (The Columbia World of Quotations, 1996). In contrast, in 1953 Dwight D. Eisenhower said, “Every gun that is made, every warship launched, every rocket fired signifies, in the final sense, a theft from those who hunger and are not fed, those who are cold and are not clothed.”

  64. ^

     This discrepancy is probably partly explained by authoritarian regimes having fewer domestic pressures than democratic ones. As previously explored, fanatical ideologies are not really compatible with liberal democracy.

  65. ^

Low defence spending by European countries has probably also been related to their being under the safety of the USA’s defence umbrella. More recently, however, both Europe’s level of defence spending and the strength of its alliance with the United States have been changing.

  66. ^

     For example, AI might unleash 'memetic viruses' that spread through humanity at unprecedented speed.

  67. ^

     It's possible that, just as communication technologies like the internet seem to have made some people more sane and others less so, we'll see both beneficial and detrimental epistemic effects of AI manifest in society at the same time.

  68. ^

To be clear, almost any ideology, even one with largely benevolent elements, can mutate into fanatical variants. Indeed, many fanatical ideologies were inspired by at least some benevolent founding principle. Communists, for instance, were often motivated by egalitarian ideals and dreams of greater prosperity for the common people. Most religious fundamentalists and even many ethno-nationalist movements emphasize in-group solidarity and communal altruism over selfishness. Arguably, no fanatical ideology is pure evil—though some have managed to come impressively close.

  69. ^

     Other historical examples of fanatical movements' long-term strategic thinking abound. Many communists spoke of the "long march through the institutions" as a gradual strategy for gaining cultural influence, and groups like the Muslim Brotherhood have explicitly advocated a multi-generational strategy of gradually Islamizing society through institutional infiltration rather than immediate revolution.

  70. ^

     It’s notable that many of today’s authoritarian regimes' expansionist tendencies may be more limited. China's forced Sinicization (in Hong Kong, Taiwan) and Russia's Russification (in Ukraine, Georgia) are potentially satiable: focused on territories they claim historically rather than attempting unlimited expansion. Of course, whether such regimes would actually stop after achieving their territorial ambitions remains uncertain—but if so, these more-limited ambitions would seem to reflect the greater pragmatism of today’s largest authoritarian countries compared to the most fanatical movements of the past.

  71. ^

     If mind-uploading or other methods of rapid (digital) population growth become possible, fanatics might also be disproportionately inclined to use them to out-reproduce other ideologies.

  72. ^

     More speculatively, this prioritization of growth could extend to cosmic scales. Some fanatical ideologies, with their totalizing and expansionist goals, may be more likely to approximate "locust-like value systems" that maximize expansion and resource consumption without regard for other values. That being said, most fanatical ideologies might not be willing to prioritize growth if doing so compromises their other sacred values.

  73. ^

     As of Jan 16, 2026. On August 30, 2025, it was 67%, and when we first wrote this section (some time in 2024) it was closer to 50%. This could suggest that forecasters deem multipolar worlds increasingly plausible.

  74. ^

Chenoweth and Stephan’s dataset includes a number of examples of the fall of communist regimes across Eastern Europe circa 1989, but this was at least partly downstream of Mikhail Gorbachev opening the floodgates to liberalizing reforms in the USSR. There have been other instances of nonviolent overthrow of authoritarian regimes, such as in the Philippines (1986) and Tunisia (2011), but in these cases the governments don’t seem to have been fanatical adherents of any particular ideology.

  75. ^

     See also Star Trek's 'Prime Directive,' which portrayed non-interference with other civilizations as a moral ideal—likely reflecting creators Coon and Roddenberry's political outlook.

  76. ^

     One example is Ayaan Hirsi Ali, who survived female genital mutilation and forced marriage before becoming an advocate for women's rights. Despite her personal experiences, she was designated an "anti-Muslim extremist" by the Southern Poverty Law Center in 2016, had an honorary degree rescinded by Brandeis University for "Islamophobic" statements, and was called a "notorious Islamophobe" by CAIR.

  77. ^

This may be exacerbated by the naturalistic fallacy and by the omission bias, whereby people tend to prefer harm from inaction to harm from action; the omission bias may be especially relevant for our purposes.

  78. ^

The AI safety community may be underemphasizing such misuse risks. As Ben Todd (2024a) observes: “Within AI risk, it seems plausible the community is somewhat too focused on risks from misalignment rather than mis-use or concentration of power.”

  79. ^

     From a (naive) rationalist game-theoretic perspective, wars are a puzzle that requires explanation. Prima facie, rational agents should be able to avoid costly wars by bargaining to find mutually beneficial compromise solutions.

  80. ^

Re 1), Fearon uses the term “irrational”, while Blattman uses the term “misperception”. Re 2), Blattman talks of “intangible incentives” and Fearon of “issue indivisibilities” (which seem different but somewhat related concepts). Re 3), Fearon only briefly mentions this in the first paragraph. Re 4), Blattman uses the term “uncertainty”, while Fearon talks of “private information and incentives to misrepresent”. Both use the term “commitment problem”.

  81. ^

These values don’t have to be bad or misguided in themselves. For example, some may view human rights as a sacred value. In practice, however, secular governments and individuals are much more likely to put a (very high) price on them, e.g. in hostage negotiations, and thus arguably such values are not completely sacred. Probably more importantly, a sacred value on “human rights” is much less dangerous because it is relatively easily achievable: simply not killing and torturing humans is enough to satisfy it. To be more precise, fulfilling the sacred value of “human rights” is compatible with a large fraction of all possible world states and many other value systems. In contrast, if one’s sacred value is total obedience to a long list of religious dogmas, this is incompatible with a much larger fraction of possibility space.

  82. ^

      This is the term used in the psychological literature (e.g., Tetlock, 2003). Blattman (2023) and Fearon (1995) use the terms “intangible incentive” and “issue indivisibility”, respectively, to refer to similar (but not identical) concepts. Sacred values also relate to the concept of “having something to lose” which some view as a highly desirable property in the context of cooperative AI. For example Nguyen (2024) writes: “Several people think [having something to lose] is very important [...]. It intuitively is meant to capture the difference between “If I engage in this conflict I might lose everything I hold dear while cooperation guarantees that I can at least keep what I have right now” and “I have nothing to lose anyway, let’s fight and maybe I’ll get the thing I really want but am unlikely to get by default.”” 

    When sacred values are violated, people can perceive the current situation as essentially infinitely bad, and thus believe that they have nothing to lose since the status quo cannot get any worse. This mindset vastly increases one’s willingness to engage in conflict, regardless of winning probability, and resorting to extreme measures to alter the status quo.

  83. ^

“[...R]eligious fundamentalists among both Jews and Muslims assassinated their political leaders [because they] were willing to make religious compromises and come to a peace agreement between Islamic and Judaic forces. Rabin's assassin, Yigal Amir [...] claimed that Rabin [the moderate Prime Minister of Israel who was awarded the Nobel Peace Prize] was guilty of renouncing eternal Jewish rights to the Holy Land, which in his view was solely the land and territory of the Jews as promised by God in the Hebrew Bible. For Amir and his followers, Rabin had to be killed so that no sacred Jewish land would be ceded to the Arabs. Similarly, for the militants in the Muslim Brotherhood who were responsible for the murder of Sadat [a moderate Egyptian President who was also awarded the Nobel Peace Prize], compromise with the Israelis violated Islamic law and constituted an act of religious infidelity punishable by death. [...] Each side claims that it has a sacred obligation to wage war against the other side in order to reach its religious goal of full control of the Holy Land.” – Selengut (2017).

  84. ^

     “[...] The 1939 Winter War between Finland and the Soviet Union followed on the refusal of the Finnish government to cede some tiny islands in the Gulf of Finland that Stalin seems to have viewed as necessary for the defense of Leningrad in the event of a European war. One of the main reasons the Finns were so reluctant to grant these concessions was that they believed they could not trust Stalin not to use these advantages to pressure Finland for more in the future. So it is possible that Stalin's inability to commit himself not to attempt to carry out in Finland the program he had just applied in the Baltic states may have led or contributed to a costly war both sides clearly wished to avoid.” (Fearon, 1995, p. 408).

  85. ^

     The same “differential commitment credibility” issue also seems to apply to malevolent actors.

  86. ^

     Another example is how Hitler and Mussolini betrayed the Munich agreement which was initially celebrated in much of Europe as “peace for our time”.

  87. ^

     See also Leskelä (2020) for a more systematic discussion of commitments and credibility. See e.g. this quote: "[...] committing to threats could require completely different mechanisms or approaches than committing to cooperation [...]."

  88. ^

     It’s unclear how ideological fanaticism interacts with commitment races (where two or more agents rush to make the first unyielding commitments about how they’ll interact with each other, in order to constrain their adversary’s options and gain the upper hand). Note that while both commitment problems and commitment races involve commitments, they represent different dynamics: in commitment problems, inability to commit is what contributes to conflict, whereas in commitment races the ability to commit is what contributes to conflict.

  89. ^

     There have generally been two main categories of explanations advanced for why democracies are less conflict-prone: that decision-makers in democratic countries tend to act in accordance with democratic (non-violent) norms; and that institutional strength and accountability are restraining forces when it comes to instigating conflict (Russett, 1993; Rosato, 2003). Tomz and Weeks (2012) propose two further possible mechanisms: that citizens of democracies perceive other democracies as less threatening, and also consider it to be morally problematic to initiate conflicts with other democracies.

  90. ^

There is only weak evidence that democracies are generally less conflict-prone than autocratic states (Maoz, 1989; Müller & Wolff, 2004). In particular, it is not known if conflicts between democratic and autocratic states occur less frequently than conflicts between autocratic states. However, even if evidence showed that autocracies are less likely to engage in conflict with one another than with democracies, this would hardly serve as a reason to endorse autocracy. In Autocracy Inc. (2024), Anne Applebaum illustrates how autocratic regimes often collaborate to undermine democracies. While such aligned autocracies may experience reduced internal conflict, they represent a significant threat to global progress and wellbeing. As a concrete example, during World War II the collaboration of the Allied forces was crucial in defeating the Axis regimes, whose own alliance posed a risk of immense harm.

  91. ^

Sandberg uses radical negative utilitarians as an example. However, all of the other fanatical ideologies that have been mentioned in this piece seem much more worrisome. There are almost certainly fewer than 1,000 radical negative utilitarians on Earth—orders of magnitude fewer than, say, radical Islamists. Prominent negative utilitarians like Brian Tomasik also emphasize cooperation and compromise.

  92. ^

     We think retributivism is misguided because it’s cruel and because we don’t believe that anyone, ultimately, “deserves” anything as there is no libertarian free will. But hopefully most readers who endorse moderate retributivism agree with the concerns we outline about fanatical retributivism.

  93. ^

One could argue that endorsing extreme eternal punishment is inherently fanatical, at least by our definition: it requires sufficient certainty to endorse potentially permanent and irreversible action, usually involves extreme hostility toward an outgroup of "evil people," and embraces using the most extreme measures imaginable.

  94. ^

We only included participants who passed two attention checks, reported answering with complete honesty (in the section of the survey containing the questions above), provided a valid age or birth year confirming they were between 18 and 110 years old, and didn’t strongly violate additivity (see footnote 95). Importantly, our results seem robust to both weaker and stricter inclusion criteria. For example, in our “extra strict” sample (N=748), we only included participants who, in addition to the previous inclusion criteria, reported being fluent in the language in which the survey was administered, didn’t violate additivity at all (see again footnote 95), and whose free-text responses showed evidence of serious engagement. The results were similar, though the responses were somewhat less retributive. See our supplementary materials for details.
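
     For concreteness, a minimal pandas sketch of such an inclusion filter might look as follows; all file and column names are hypothetical placeholders, not our actual variable names.

```python
import pandas as pd

df = pd.read_csv("survey_responses.csv")  # hypothetical file name

# Keep only participants meeting all inclusion criteria.
included = df[
    (df["attention_check_1"] == "pass")
    & (df["attention_check_2"] == "pass")
    & (df["self_reported_honesty"] == "complete")  # complete honesty
    & df["age"].between(18, 110)                   # valid age or birth year
    & df["additivity_ok"]                          # see footnote 95
]
print(f"Included {len(included)} of {len(df)} participants")
```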

  95. ^

We excluded participants with high additivity violations. Here is how we calculated this: participants first saw the question “Of all people in the world, what proportion of them deserve extremely happy lives?”; three questions later, on a different page, they were asked “[...] what proportion deserves unbearable pain forever?”. We excluded participants whose percentages for these two questions added up to more than 110%. We chose this rather arbitrary threshold because i) some people ain’t so good at math and could easily have messed up their “calculations” by 10%, and ii) participants couldn’t go back to edit their earlier response (and we didn’t want to exclude people who may have updated their views and didn’t violate additivity by much). The results from participants with no additivity violation (i.e., their proportions summed to no more than 100%) were slightly lower: 45% said 1% or higher, and a quarter answered 6% or higher.
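
     As a rough sketch (again with hypothetical column names), the additivity check amounts to the following; percentages are assumed to be stored as 0-100 values.

```python
import pandas as pd

df = pd.read_csv("survey_responses.csv")  # hypothetical file name

# Sum of the two proportion questions (each on a 0-100 scale).
total = df["pct_extremely_happy"] + df["pct_unbearable_pain_forever"]

df["additivity_ok"] = total <= 110  # main threshold: allows a 10-point slip
df["no_violation"] = total <= 100   # stricter criterion used for comparison
```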

  96. ^

     We chose the ASP-8 scale because it does not include any items about people deserving suffering.

  97. ^

Here is the full wording of some of these items along with their respective Spearman correlations: “Society should make sure my core beliefs or principles are always adhered to without exception, regardless of whether people agree with them.” (ρ ≈ 0.37); “I'll do for my religion more than any of its other followers would do.” (ρ ≈ 0.34); “Some sources of knowledge (people, texts or traditions) provide absolute truths, are always correct, and should never be doubted.” (ρ ≈ 0.32); “I insist upon my group getting the respect that is due to it.” (ρ ≈ 0.30); “I'll do more for my group than any other group members would do.” (ρ ≈ 0.30); “I make my religion strong.” (ρ ≈ 0.26). Wanting hell to be created also correlated with dehumanization-related items (ρ ≈ 0.21) and with participants’ overall verbal identity fusion score with their selected group (ρ ≈ 0.17). Almost all of these items also correlated at roughly similar magnitudes with our other questions about supporting eternal punishment.
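
     For illustration, item-level correlations of this kind can be computed as in the sketch below; the column names are hypothetical placeholders for the survey items quoted above.

```python
import pandas as pd
from scipy.stats import spearmanr

df = pd.read_csv("survey_responses.csv")  # hypothetical file name

items = [
    "beliefs_always_adhered_to",  # "Society should make sure my core beliefs..."
    "do_more_for_religion",       # "I'll do for my religion more than..."
    "absolute_truth_sources",     # "Some sources of knowledge provide..."
]
for item in items:
    # Spearman rank correlation with the "create hell" agreement item
    rho, p = spearmanr(df["create_hell_agreement"], df[item],
                       nan_policy="omit")
    print(f"{item}: rho = {rho:.2f}, p = {p:.3g}")
```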

  98. ^

     Many respondents, reading very quickly to maximize earnings per hour, might just interpret the question as “Evil people: Yay or boo?” and respond with “very boo!”.

  99. ^

     That said, with sufficiently powerful AI, enacting preferences may become as quick and abstract as answering survey questions—the AI asks what you want, you answer, and it's done. This would continue a historical trend toward ever-greater psychological distance from harm: a few thousand years ago, killing required getting one's hands dirty; today's technology allows vast destruction at the push of a button. The pilots of the Enola Gay likely could not have killed Hiroshima's civilians by hand, yet dropping the bomb was psychologically manageable.

  100. ^

In the UK and US samples recruited via Prolific, 18–20% agreed with the “create hell” question. In samples from China, Pakistan, Saudi Arabia, and Turkey (recruited via Positly), agreement ranged from 51–57%. Regarding the “want system” question, 13–18% in the Prolific samples agreed, compared to 39–53% in non-Western samples. Finally, for the “would create system” question, 15% of Prolific respondents agreed, compared to 35–52% in non-Western samples. On the “duration” question, 18–19% in the Prolific samples selected “forever”, compared to 32–43% in non-Western samples. To illustrate, consider the hell question in Pakistan, our least reliable sample. Overall, 51% of our Pakistan sample endorsed the hell question, but only 25% were in the group who endorsed the hell question and also selected 'Forever' in the duration question, and only 10.5% met those criteria and endorsed at least 1% in the proportion question. This suggests that participant inconsistency (and perhaps viewing hell as a deterrent) contributed to the large gap between the base rate of apparent hell endorsement and the proportion of “consistent and concerning” responders. Moreover, a mere 13% of the Pakistan sample passed our attention checks and met our other inclusion criteria—the lowest inclusion rate of any country in our study. The fact that substantial inconsistency remained even after filtering out so many participants suggests persistent data quality issues with our Pakistan sample. Other non-Western samples also exhibited inconsistencies—see our supplementary materials.
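
     The conjunctive breakdown above can be reproduced with a filter along these lines; this is a sketch with hypothetical column names, not our actual analysis code.

```python
import pandas as pd

df = pd.read_csv("survey_responses.csv")  # hypothetical file name
pk = df[df["country"] == "Pakistan"]

endorsed_hell = pk["endorsed_create_hell"]                # boolean agreement
chose_forever = pk["duration_choice"] == "Forever"
at_least_1pct = pk["pct_unbearable_pain_forever"] >= 1

print(endorsed_hell.mean())                                    # ~51%
print((endorsed_hell & chose_forever).mean())                  # ~25%
print((endorsed_hell & chose_forever & at_least_1pct).mean())  # ~10.5%
```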

  101. ^

     Though we did work with professional translators.

  102. ^

     For example, 39% of US Prolific respondents identified as atheist/agnostic, compared to ~14% in the US Positly sample—which is much closer to the proportion found in representative US surveys.

  103. ^

     In Buddhism and Hinduism, the closest concept to hell is Naraka (in Hinduism it’s also referred to as Yamaloka), where sinners are tormented after death. However, there are two crucial differences between the Abrahamic conception of hell and Naraka which make the latter less worrisome from a longtermist perspective: i) souls don’t remain forever but only temporarily in Naraka until “their negative Karma is exhausted” and ii) their suffering is seen as a natural consequence rather than a deliberate and morally desirable divine punishment.

  104. ^

     Probably in part due to passages like this, many Islamic theologians seem to believe that many more people go to hell than to heaven.

  105. ^

     For example, this Reddit user claims that he would enjoy seeing those tortured in hell forever, adding that “Nothing has ever filled me with as much euphoria as hearing something wail in pain.”

  106. ^

     “If the Lord wills it, they say, it will be, and it is our task to obey the word and try as hard as we can to realize God’s will” (Selengut, 2017).

  107. ^

See also Iain M. Banks’s Surface Detail. While of course a work of fiction, and thus containing many implausible scenarios and assumptions, it also outlines political, economic, and religious motivations for creating digital hells, some of which aren’t completely implausible.

  108. ^

     The full quote (Sapolsky, 2017): “What would I do with Hitler? The viscera become so raw that I switch to passive voice in my mind, to get some distance. What should be done with Hitler? It’s easy to imagine, once I allow myself. Sever his spine at the neck, leave him paralyzed but with sensation. Take out his eyes with a blunt instrument. Puncture his eardrums, rip out his tongue. Keep him alive, tube-fed, on a respirator. Immobile, unable to speak, to see, to hear, only able to feel. Then inject him with something that will give him a cancer that festers and pustulates in every corner of his body, that will grow and grow until every one of his cells shrieks with agony, till every moment feels like an infinity spent in the fires of hell. That’s what should be done with Hitler. That’s what I would want done to Hitler. That’s what I would do to Hitler.”

  109. ^

     To be clear, Sapolsky is adamant about never wanting to act upon this dark fantasy. However, Sapolsky's ego-dystonic attitude is informed by his neuroscientific understanding of human behavior, a perspective that may not constrain others' retributivist intuitions. While fictional, the Black Mirror episodes White Bear and White Christmas are also noteworthy.

  110. ^

Will MacAskill defines viatopia as “a state of the world where society can guide itself towards near-best outcomes, whatever they may be” (MacAskill, 2025b).

  111. ^

     For example, MacAskill and Wiblin (2025) discuss trade but also mention the risks of agents self-modifying their preferences (e.g., towards placing positive weight on suffering) to increase their bargaining position. Previously, MacAskill (2018) discussed trade more optimistically: “One thing you could do is just say, ‘Well, we've got ten billion people alive today, let's say. We're gonna divide the universe into ten billionths, so maybe that's a thousand galaxies each or something.’ And then you can trade after that point. I think that would get a pretty good outcome.”

  112. ^

Curiosity, humility, and good epistemics are likely important for getting the most important questions right. Fanatics typically lack all three. Some potentially crucial considerations may be arcane and require sophisticated reasoning. How plausible is it that religious fanatics who literally believe that God created the universe in six days would contribute to, rather than derail, discussions involving multiverse-wide evidential cooperation, meta-ethics, and the cosmic host?

  113. ^

     A related concept is “value porosity” (Bostrom, 2014).

  114. ^

Nine Lives by Dean et al. (2018, ch. 3). Dean describes the strict Islamic study group in Saudi Arabia that he joined in the early 1990s. This group, which later funneled members into al-Qaeda, banned watching "The Smurfs", among various other restrictions. According to Dean, the group considered the show a "Western plot to destroy the fabric of our society" and to promote sexual freedom, because of the single female character, Smurfette, among many males. They also believed the show promoted witchcraft.

  115. ^

     See here for a longer quote.

  116. ^

CEV and the long reflection aren’t mutually exclusive and are in fact motivated by the same concern: many of humanity’s current, unreflected values are suboptimal, and further reflection—in the case of CEV, with the assistance of aligned or “friendly” AI—would hopefully improve them and lead to (massively) better outcomes for (almost) everyone involved. More broadly, there are other contexts besides AI alignment where preference idealization could play an important role. Many thorny philosophical questions related to preference idealization also arise when considering collective moral deliberation (like the viatopia and long-reflection ideas discussed earlier) and when contemplating transhumanist self-modification and enhancement. For much more detail on this, see Joe Carlsmith’s essay “On the limits of idealized values”.

  117. ^

     For example, Bostrom's discussion of 'indirect normativity' (2014b, ch. 13), Gabriel’s (2020) discussion of “informed preferences or desires”, or Chen (2023).

  118. ^

     Interestingly, Yudkowsky's original CEV document contains an extended thought experiment on this topic. He suggests that if a terrorist group were savvy enough to create an aligned superintelligence, this might require sufficient intellectual humility and moral caution that they would realize the need to aim their AI at an idealized core of deep moral principles, and to "include all the world" in the process of moral extrapolation, rather than simply feeding in a list of specific commandments. He suggests that such a process of idealization might successfully shave off most of the moral rough edges of the group's fanatical ideology. However, this argument relies on a selection effect that may not hold. A fanatical group could plausibly steal or adapt existing alignment technology rather than developing it from scratch—acquiring the technical capability without any philosophical humility. More broadly, fanatical regimes throughout history have developed advanced technological capabilities (nuclear weapons, rockets, etc.) without any corresponding epistemic improvement.

  119. ^

     In fact, many humans seem to prefer vindication of their existing beliefs over honest feedback. This may explain why Reinforcement Learning from Human Feedback tends to produce sycophantic AI behavior (Sharma et al., 2023)—RLHF raters tend to reward AIs when they tell them what they want to hear.

  120. ^

Compare Selengut (2017, emphasis added): “[...B]ut what about fundamentalists and other religious traditionalists who refuse to compromise what they see as the word of God? These Orthodox believers, [...] rather than compromise their beliefs, they seek to remake reality to fit their religious cognitions and expectations. They engage in militant transformation to force all others to accept their religious beliefs and demand that society be based on their religious views. [...] [Religious fanatics] refuse to compromise their beliefs and reduce their longing for the fulfillment of sacred prophecies. If reality challenges their beliefs, reality must be changed to fit religious truth.”

  121. ^

     Similarly, some malevolent humans may also reflectively endorse their (sadistic) preferences. Malevolent preferences and ideological fanaticism may also interact and reinforce each other: the ideology allows people to fulfill their sadistic preferences while simultaneously enabling them to preserve their virtuous self-image.

  122. ^

     That being said, there might be ways to construct idealization procedures that enforce genuinely epistemically neutral learning—one could even convince some fanatics to embrace such processes if framed as confirming their certainty ("If you're truly right, more knowledge can only vindicate you"). Designing such procedures could be important future work, though it remains challenging given fanatics' skill at incorporating contradictory evidence into existing worldviews.

  123. ^

     Gabriel (2020) makes a related point: the challenge isn't to identify the "true" moral theory and encode it in machines, but rather to find fair processes for determining which values to encode—processes that don't simply allow some people to impose their views on others. This is precisely the problem that fanaticism poses. Fanatics are not interested in fair processes or reasonable pluralism; they want their values to win.

  124. ^

     More precisely, 19.2% of participants had “actively open-minded thinking style” scores below the midpoint, indicating that on average they leaned towards disagreement with statements like “People should always take into consideration evidence that goes against their beliefs”. Instead of seeking truth, people prioritize feeling good about themselves and the world, maintaining their worldview and sense of meaning, purpose, and identity, and being seen as moral, high-status, intelligent, and loyal by their in-group. Many EAs and rationalists might be falling prey to a typical mind fallacy here, in assuming that most people value having true beliefs and epistemic rationality as much as they do. More generally, many WEIRD people may overestimate how widespread certain characteristics of WEIRD psychology are (like impartiality and moral universalism), see footnote 127. 

  125. ^

     This isn’t primarily due to misinformation; fake news comprises only 0.15% of Americans’ daily media diet (Allen et al., 2020). People’s beliefs are misguided less because they were misled and more because they are motivated to hold these wrong beliefs. As Drutman (2023c) and Williams (2024a, 2024b) argue, misinformation is primarily a demand-side rather than supply-side problem: social and psychological factors—like partisan animosity, perceived loss of status, inequality, grievances and frustrations, fitting in with one’s tribe, signaling virtue, et cetera—create a demand for content that confirms existing beliefs and provides psychological relief. People don’t typically hold incorrect beliefs simply because they lack access to accurate information (e.g., Scheufele et al., 2021). Instead, as Williams (2023) notes, political media may function more as a “rationalization market” where people seek sophisticated justifications for their preferred beliefs. These issues not only affect low-information voters but also highly educated elites and academics (Williams, 2025c).

  126. ^

     Much of this is happening subconsciously (Simler & Hanson, 2018), for evolutionary reasons (Kurzban, 2012). To be fair, it seems plausible that most humans’ idealized preferences would prioritize truth-seeking but this isn’t obvious and may depend on the precise idealization procedure.

  127. ^

     Several aspects of WEIRD psychology also seem beneficial from the perspective of reducing risks of ideological fanaticism and making the long reflection work well (see table 1.1 “Key elements in WEIRD psychology”, Henrich, 2020):

     - less conformity and less deference to tradition/elders;
     - impartial principles over contextual particularism;
     - trust, fairness, honesty, and cooperation with anonymous others, strangers, and impersonal institutions;
     - muted concerns for revenge;
     - reduced in-group favoritism;
     - moral universalism.

     Of course, many aspects of WEIRD psychology seem neutral and others seem worse, particularly overconfidence. With all of that said, we obviously should value the perspectives of other cultures, perhaps even more so than seems intuitive: historically, most people have been too xenophobic and enamored with their own values and customs, and most Western thinkers, certainly pre-1950, were insufficiently critical of racism, colonialism, and Western imperialism. As discussed above, some of the worst atrocities relating to ideological fanaticism actually occurred in WEIRD societies.

  128. ^

     Wei Dai makes this point here: "I tend to think that people's selfish desires will be fairly easily satiated once everyone is much much richer and the more "scalable" "moral" values would dominate resource consumption at that point [...]."

  129. ^

     Bostrom (2024a) argues: “Human values appear to be quite resource-satiable: we would much rather have a 100% chance of being able to use 1 galaxy to meet our goals than to have a 1% chance of being able to use 100 galaxies.”
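
     To make this concrete, here is a minimal illustration with a toy concave (resource-satiable) utility function, $u(x) = \log(1 + x)$, where $x$ is the number of galaxies; the functional form is our illustrative assumption, not Bostrom's:

$$1.0 \cdot \log(1+1) \approx 0.69 \;>\; 0.01 \cdot \log(1+100) \approx 0.05$$

     Under any such satiable utility function, the certain galaxy wins decisively, whereas a value system linear in resources would be exactly indifferent between the two options ($1.0 \cdot 1 = 0.01 \cdot 100$).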

  130. ^

     The increased variance in the behavior of fanatics in post-AGI scenarios may be particularly concerning from an s-risk perspective.

  131. ^

     Preserving US democracy seems far from guaranteed. See, e.g., relevant questions on Metaculus.

  132. ^

     For a brief discussion on safeguarding liberal democracy more generally, see Koehler (2022).

  133. ^

     See also Book Review: Why We're Polarized (Astral Codex Ten, 2021): “Every so often, people ask what an effective altruism of politics would look like. If you [...] wanted to improve (US) politics as much as possible [...] what would you do? Why We’re Polarized and the rest of Klein’s oeuvre make a strong case that you would try to do something about polarization. Solve that, and a lot of the political pathologies of the past few decades disappear, and the country gets back on track.”

  134. ^

     Arguably, some political opponents are indeed existential enemies. But it usually doesn’t make sense to believe that, say, all members of the opposing party are existential enemies, let alone that mere critics of one’s own strategy are.

  135. ^

     Tim Urban (2023) depicts polarized political tribes as Golems: hulking, mindless creatures locked in perpetual combat, each animated and sustained by the other's hostility.

  136. ^
  137. ^

     Beyond the negative emotions of fear and anger that fuel polarization, there may also be powerful positive psychological rewards. People can experience a psychological rush and a sense of self-righteous clarity and purpose from being part of the team that fights evil. This mirrors the experiences of some soldiers who fight in wars. (E.g., in his memoir Merry Hell!, Thomas Dinesen expresses how he greatly enjoyed parts of his WWI experience—the fighting, the rush, and the activity. See also the "band of brothers" phenomenon where soldiers report intense bonds and even nostalgia for combat.) This suggests that polarization and fanaticism may be self-reinforcing not just through fear and hatred, but also through the intoxicating sense of belonging and meaning, and even excitement that comes from being part of a righteous struggle against evil.

  138. ^

     Though recent discussions of “abundance” (cf. Klein & Thompson, 2025) may also provide a possible path to making politics less polarized.

  139. ^

     Drutman (2023b): “The most promising and doable pro-party reforms are fusion voting and proportional representation. Fusion voting allows multiple parties to endorse the same candidate, encouraging new party formation. Proportional representation ends the single-member district and makes it possible for multiple parties to win a proportional share of representation in larger, multi-member districts.”
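
     To make the mechanics concrete, below is a minimal sketch of one standard proportional allocation method (D'Hondt highest averages); the parties and vote totals are hypothetical, and the quoted passage doesn't commit to any particular formula.

```python
# A sketch of the D'Hondt highest-averages method, one standard way of
# allocating seats in a multi-member district. Parties and vote totals
# below are hypothetical.

def dhondt(votes: dict[str, int], seats: int) -> dict[str, int]:
    """Allocate `seats` among parties roughly in proportion to `votes`."""
    allocation = {party: 0 for party in votes}
    for _ in range(seats):
        # Each party's quotient is votes / (seats already won + 1);
        # the next seat goes to the party with the highest quotient.
        winner = max(votes, key=lambda p: votes[p] / (allocation[p] + 1))
        allocation[winner] += 1
    return allocation

print(dhondt({"A": 42_000, "B": 31_000, "C": 17_000, "D": 10_000}, seats=5))
# -> {'A': 2, 'B': 2, 'C': 1, 'D': 0}
```

     With these numbers, a winner-take-all single-member district would hand the sole seat to party A on 42% of the vote; under the five-seat allocation above, three parties representing 90% of voters share the seats.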

  140. ^

     Drutman (2023a) also writes: “Illiberal extremism follows from a binary, highly polarized party system, because extremism emerges from radicalized in-group/outgroup conflict. Thus, the party system requires change. Breaking the core problem of escalating binary, us-versus-them competition requires adding new parties to realign and reorient partisan competition.”

  141. ^

     For instance, Vinding (2022, ch.14): “Parliamentary systems appear to have significantly lower levels of political polarization, and are generally more stable, more peaceful, and less prone to coups (Santos, 2020, p. 1, ch. 1; Casal Bértoa & Rama, 2021). They also tend to have “better corruption control, bureaucratic quality, rule of law, […] and literacy” (Gerring et al., 2009; Santos, 2020, p. 47).”
    However, transforming the US into a parliamentary system seems very intractable and its benefits would plausibly be smaller than those of proportional representation (Mainwaring & Drutman, 2023).

  142. ^

     Improving epistemics directly is another avenue for reducing polarization and fanaticism. This includes books (like Julia Galef’s The Scout Mindset or Steven Pinker’s Rationality) and educational resources (like Clearer Thinking) to promote better reasoning. More scalable approaches might include promoting greater adoption of prediction markets and a variety of AI-based interventions (discussed below).

  143. ^

     Organizations working on structural reforms include Fix Our House, Protect Democracy, and New America. Those focused on cultural change include the Bipartisan Policy Center, which creates spaces for cross-party negotiation, and groups like Braver Angels and More in Common that work to reduce affective polarization at the grassroots level.

  144. ^

     Why single out classical liberalism and Enlightenment principles and not, say, utilitarianism or the core principles of effective altruism? While we’re fans of both, classical liberalism seems to have several advantages. It has already influenced many government constitutions and is time-tested: societies founded on Enlightenment principles consistently score highest on objective metrics of human flourishing, ranging from GDP per capita and life expectancy to self-reported life satisfaction and human rights protections. Classical liberalism is already widely supported and lies inside the Overton window; of all influential ideologies, it seems most compatible with the core principles of EA. It also represents an attractive compromise for almost all (non-fanatical) value systems. The procedural principles of classical liberalism (like the rule of law, separation of powers, etc.) are fairly concrete, while EA is more abstract, open to interpretation, and may even run the risk of becoming fanatical itself. EA is also unlikely to become widely supported in the near future, especially since the FTX debacle. Finally, it is probably not an accident that many thinkers who are still widely admired today—such as Martin Luther King Jr., Nelson Mandela, the U.S. Founding Fathers, Bertrand Russell, Immanuel Kant, Jeremy Bentham, David Hume, Adam Smith, and Mill himself—were deeply inspired by the Enlightenment.

  145. ^

     Indeed, Enlightenment thinkers themselves were far from perfect. Kant, for example, had an oddly intense preoccupation with masturbation, and some Enlightenment thinkers espoused views that were clearly racist or sexist. But on the whole, most Enlightenment thinkers had much better attitudes than their contemporaries—and crucially, their philosophical framework contains the tools for self-correction and moral progress.

  146. ^

     Examples include Persuasion, The Economist, Steven Pinker, John McWhorter, Matthew Yglesias, Sam Harris, Deeyah Khan, Coleman Hughes, Claire Lehmann, Helen Pluckrose, Scott Alexander, Heterodox Academy, and FIRE, among many others who have remained true to classical liberal principles even when facing pressures from all sides of the political spectrum.

  147. ^

     In this context, it’s worth mentioning how many highly successful entrepreneurs—like Elon Musk, Dario Amodei, Sergey Brin, or Jensen Huang—are immigrants or the children of immigrants. More generally, the top-performing researchers and entrepreneurs tend to produce a disproportionate amount of the value in their field; attracting these people is especially useful.

  148. ^

     Open Philanthropy has granted several million dollars towards high-skilled immigration reform, as well as over $9 million to the Institute for Progress, a think tank with policy research and advocacy on both high-skilled immigration and compute governance (a promising AI-related intervention for tackling risks of fanaticism that we cover below).

  149. ^

     That being said, it’s important to address potential serious risks from increased immigration. First, a naive open border policy plausibly makes it easier for foreign spies to gain influential positions. Second, immigrants can negatively influence the culture and values of the country they move to. For example, around half of British Muslims believe that homosexuality should be illegal.

  150. ^

     Many policy interventions aimed at increasing economic growth, bringing down the cost of living, or making the economy fairer by curtailing rent-seeking behavior may have an indirect anti-fanatical effect—provided growth reaches ordinary people, not just elites. Historically, fanatical ideologies seem to have found particularly fertile ground in times of societal turmoil, desperation, growing inequality, and economic contraction (cf. the Weimar Republic’s political and economic woes preceding the rise of Nazism, Russia’s collapse in WWI preceding the October Revolution, or Iran’s economic crisis in the late 1970s preceding the Islamic Revolution). Conversely, when most people experience rising living standards, they seem more amenable to reasonable, positive-sum thinking and less likely to fall prey to misguided populist ideas (cf. Bryan Caplan’s “The Idea Trap”).

  151. ^

     See the section “Encouraging reform in illiberal countries” in Appendix F for more details.

  152. ^

     See also Brian Tomasik’s writings on the benefits of cooperation. See here for a more detailed exploration of these pathways.

  153. ^

     In her book States and Social Revolutions, Skocpol argues that revolutions are not simply caused by popular discontent but also often require the collapse of the state's administrative and military power. This "state breakdown" can be triggered by intense and unsuccessful geopolitical competition, i.e., being unable to cope with the military and fiscal pressures exerted by foreign rivals. Thus, international conflict can make states more vulnerable to revolutions.

  154. ^

     Aside from the already-discussed idea of influencing individual regimes to be less fanatical or otherwise more cooperative.

  155. ^

     For more detail on the dynamics around the feasibility of such a deal, see this video from Jackson Wagner and Rational Animations.

  156. ^

      The CHIPS Act is under threat at the time of writing.

  157. ^

     To be clear, we’re not proposing anything new here.

  158. ^

     We remain optimistic about export controls. DeepSeek managing to catch up to the frontier of US ‘thinking’ models in spite of existing export controls is an important case study; it demonstrates that algorithmic insights are still a key lever in AI progress, but has also revealed—via deployment difficulties, and quotes from DeepSeek’s founder—that compute remains a significant constraint for them. Export controls may need to be widened and tightened up, but that doesn’t mean that they are ineffective.

  159. ^

     We are excited about Longview’s request for proposals on secure, governable chips, as well as high-quality research and advocacy by groups including RAND, IAPS, CSIS, CNAS, IFP, FAI, AIPI, Encode, and more.

  160. ^

     Of course, frontier companies already have strong economic incentives to prevent losses of intellectual property. But these incentives don’t account for harms to wider society from misuse of powerful AI. Meanwhile, the incentives to steal AI intellectual property are high, since training runs are expensive. Model weights are surprisingly compressed, although still sufficiently ‘chonky’ that security measures might be possible.
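
     As a rough back-of-the-envelope (the parameter count and precision are our illustrative assumptions, not figures from the sources above): a frontier model with $10^{12}$ parameters stored at 2 bytes each (bf16) occupies about

$$10^{12} \times 2\ \text{bytes} = 2\ \text{TB},$$

     i.e., small enough to fit on a single hard drive, yet bulky enough that exfiltrating it through monitored networks takes hours rather than seconds, which is what makes technical security measures plausible.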

  161. ^

     More generally, one idea would be to somehow screen against fanatical (and malevolent) traits in the people who shape or control TAI. In an ideal world, leading AI companies’ employees and relevant government officials would be screened for fanatical and malevolent traits. However, most existing measures of malevolent traits have severe methodological limitations that make them almost useless, and designing manipulation-proof measures of either malevolence or fanaticism would be a long process that we probably won’t have sufficient time or resources for. An even larger challenge is buy-in from important stakeholders: most AI companies and the US government won’t actually incorporate impartial fanaticism screenings into their hiring processes and reject otherwise-strong candidates who perform poorly on them. One could possibly screen for undesirable traits in RLHF raters, though.

  162. ^

     Stronger infosec would make it more difficult to insert secret loyalties.

  163. ^

     This one-dimensional spectrum from “pure tools” to “autonomous beings” is a useful approximation for our purposes, though AIs actually vary along multiple dimensions. These include: degree of intent-alignment versus misalignment; whether they act sycophantically versus guide users toward truth; whether they optimize for existing versus reflectively-endorsed human preferences; and whether they’re autonomous versus tool-like. What matters for preventing fanaticism is ensuring AIs exhibit anti-fanatical characteristics (reason, truthfulness, compassion) regardless of where they fall on any of these dimensions—we want them to resist amplifying fanatical ideologies whether they’re functioning as obedient tools, advisory systems, or autonomous agents.

  164. ^

     See also MacAskill & Wiblin (2025), who make very similar arguments, as well as MacAskill (2025d, section 3.2).

  165. ^

     AI alignment seems beneficial overall, partly because alignment seems to make AIs more benevolent, probably in part because most humans are comparatively benevolent (see also emergent misalignment). However, intent-alignment could in principle backfire if it allows “misaligned humans” to wield intent-aligned AIs to amass immense power. Indeed, one could argue that sufficiently wise and benevolent AIs might reasonably want to constrain humanity's reach at least somewhat—some humans don't seem particularly benevolent, and, from the impartial point of view of the universe, it's unclear whether homo sapiens, given our history, should be trusted completely with the entire lightcone.

  166. ^

     This isn’t to say that today’s approaches (here and elsewhere) will necessarily scale to future, more-powerful systems.

  167. ^

     Cf. MacAskill (2025a): “[...] What should be in the model spec? How should AI behave in the countless different situations it finds itself in? To what extent should we be trying to create pure instruction-following AI (with refusals for harmful content) vs AI that has its own virtuous character?”

  168. ^

     Joe Carlsmith puts it nicely here: “I want advanced AI to strengthen, fuel, and participate in good processes in our civilization – processes that create and reflect things like wisdom, consciousness, joy, love, beauty, dialogue, friendship, fairness, cooperation, and so on. [...] And AIs aren’t just tools in this respect – they can be, in a richer sense, participants, citizens, and perhaps, ultimately, successors [...].”

  169. ^

     More speculatively, AI may also be able to help with (moral) philosophy and “wisdom”—though see especially Wei Dai’s concerns here. Some relevant discussion is also scattered through this podcast with Will MacAskill.

  170. ^

     DeepSeek, for instance, might not be able to help, given that it censors topics that contradict the Chinese Communist Party’s preferred narratives.

  171. ^

     For example, MacAskill (2018) seems to have had such a system in mind: “One thing you could do is just say, ‘Well, we've got ten billion people alive today, let's say. We're gonna divide the universe into ten billionths, so maybe that's a thousand galaxies each or something.’ And then you can trade after that point. I think that would get a pretty good outcome.” MacAskill now seems more pessimistic about such proposals (cf. “We should aim for more than mere survival” towards the end of the episode).

  172. ^

     Collective decision-making would plausibly block the most disvaluable outcomes, since fanatics will (most likely) remain a minority. However, fanatics could still use their voting bloc to bargain for harmful concessions or perhaps even legitimize (parts of) their worldview within the system, and naively designed governance could give disproportionate bargaining power to bad actors. Supermajority voting schemes could perhaps help reduce such risks (cf. MacAskill and Hadshar, 2025), though they may increase the likelihood that minorities can veto outcomes that would be very good for most other value systems.
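
     As a minimal sketch of this trade-off (the electorate size and the two-thirds threshold are our illustrative choices):

```python
# A toy illustration of the supermajority trade-off. The electorate size
# and the two-thirds threshold are illustrative assumptions.

def passes(votes_for: int, total_voters: int, threshold: float) -> bool:
    """Return True if a proposal clears the required vote share."""
    return votes_for / total_voters >= threshold

TOTAL = 100  # hypothetical electorate

# Under simple majority, a 51-strong fanatical bloc can pass a harmful proposal.
print(passes(51, TOTAL, threshold=0.5))    # True

# A two-thirds rule blocks that same bloc...
print(passes(51, TOTAL, threshold=2 / 3))  # False

# ...but now a 34-strong minority can veto a proposal that 66 voters want.
print(passes(66, TOTAL, threshold=2 / 3))  # False
```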

  173. ^

     Of course, the choice isn't simply about including or excluding certain factions once and for all. More realistic governance frameworks will probably feature more iterative decision-making across many smaller questions (and hopefully conditions designed to gradually shift values toward reasonableness over time). But all such approaches must still grapple with difficult boundary questions: what precisely counts as intolerable, and according to whom?

  174. ^

     The actors who first develop aligned superintelligence would possess extraordinary bargaining power in shaping such a charter. But many other actors might also (indirectly) influence the outcome.

  175. ^

     Right now, although enforcement is (very) imperfect, human rights violations are outlawed by international rules and institutions, such as the UN Human Rights Council and the International Criminal Court. Outlawing certain acts in the post-AGI world is a natural extension of this idea, and existing institutions may provide a foundation to build upon. How to monitor and enforce these provisions across intergalactic space is a further area for future work.

  176. ^

     We’d probably want to apply such a principle universally: just as we'd block fanatics from creating what others consider extreme disvalue, we should also block actions that impose extreme disvalue on other moral perspectives, including those of fanatics (for instance, gratuitously burning holy books), at least unless there are very strong reasons for doing so.

  177. ^

     For further details, see “Ideological fanaticism: Causes”. Note that this is an extremely unpolished and unfinished exploration of causes.

  178. ^

     It’s plausible that the negative effects of social media are exaggerated (though see here for counterarguments). Williams (2025b) argues that the problem isn't primarily that algorithms manipulate people into extremism, but rather that social media's democratizing character reveals and amplifies pre-existing popular demand for extreme content that elite gatekeepers previously excluded from mainstream discourse. However, it still seems plausible that changing social media recommendation algorithms to incentivize reason and truth-seeking over tribalism and outrage is both possible and beneficial.
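
     As a purely hypothetical sketch of what such a change could look like (all signal names and weights below are our assumptions, not a description of any real platform's algorithm):

```python
# A hypothetical sketch of a feed-ranking objective that trades off
# engagement against outrage. All signal names and weights are assumptions.

from dataclasses import dataclass

@dataclass
class Post:
    predicted_engagement: float   # e.g., probability of click or share
    predicted_outrage: float      # e.g., score from a toxicity/outrage classifier
    predicted_informative: float  # e.g., an accuracy or "worth your time" score

def rank_score(post: Post, w_outrage: float = 0.5, w_info: float = 0.3) -> float:
    # Engagement still counts, but outrage is penalized and
    # informativeness rewarded, changing what rises to the top.
    return (post.predicted_engagement
            - w_outrage * post.predicted_outrage
            + w_info * post.predicted_informative)

feed = [Post(0.9, 0.8, 0.1),   # highly engaging, outrage-heavy post
        Post(0.6, 0.1, 0.7)]   # less engaging, more informative post
feed.sort(key=rank_score, reverse=True)
# The informative post now ranks first (score 0.76 vs 0.53).
```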

  179. ^
