Cross-posted from Substack: https://alexbaxter1.substack.com/p/the-architecture-of-the-threshold?r=7m9mmg
In my previous exploration of the morality of AI creation (see “The Moral Weight of The Great Silence”), I examined the sense of inevitability that pervades the discourse on artificial intelligence and argued that our technological adolescence is approaching a threshold. That piece served as a warning, highlighting the need for a Kantian framing in which we treat people never merely as means but always also as ends; it left the “how” unanswered. If we accept that the future is not inevitable but the accumulation of individual decisions, we must confront the demands of the next step.
The options fall into two broad categories: technical safeguards and societal frameworks. Technical solutions are the most visible efforts, pursued largely by the market actors building frontier models. These approaches attempt to embed safety directly into systems even as they are grown and shaped through training. Through guardrails, alignment protocols, and interpretability tools, these actors aim to ensure that increasingly capable systems remain oriented toward human flourishing.
The appeal of this option is obvious: if safety is a design constraint, every deployment carries that same protection. However, these actors do not operate in a vacuum but exist within a competitive architecture that crosses geopolitical lines. The ideal of being safe by design is constantly weighed against strategic necessity. In a world governed by such pressures, it is difficult to imagine every actor prioritising safety over market share or its own survival. Within the current market narrative, restraint could be mislabelled as weakness.
Political and societal solutions attempt to correct this potentially uneven distribution of responsibility. Rather than relying on the internal logic of a model or the voluntary commitments of a corporation, these frameworks seek binding oversight through multinational convergence. While such mechanisms are often cumbersome, they serve to level the playing field. They signal that certain risks, perhaps especially those with a non-zero probability of catastrophe, are unacceptable regardless of the competitive advantage they might yield.
History offers some reassurance that such coordination is possible. The 1987 Montreal Protocol demonstrated that nations could cooperate to phase out technologies that threatened the global commons, achieving near-universal ratification in protection of the ozone layer. Similarly, international norms restricting reproductive human cloning emerged rapidly after early breakthroughs in biotechnology, using a mix of protocols and declarations to align technical progress with collective human well-being. While less binding than the Montreal Protocol, these norms are a potential forward signpost. Such precedents suggest that coordination can occur when a risk is recognised as universal. But to apply them, a tangible lever for monitoring is required.
In the 1980s, that lever was the measurement of chlorofluorocarbon production. Today’s equivalent in the field of AI is the concentrated supply chain of high-end computational power. Compute serves as the primordial chemistry of artificial intelligence, and unlike the opacity of the models themselves, the hardware required to grow them is physical, finite, and geographically concentrated, as well as expensive and difficult to manufacture. A treaty modelled on the Montreal Protocol would look to move us from strategic necessity towards collective verification by establishing international reporting and inspection standards for large-scale compute clusters, placing a verifiable ceiling on the scale of training runs. This would effectively level the playing field, ensuring that no single actor can accelerate into superintelligence while others remain bound by safety protocols. The central challenge, as with all such frameworks, lies in verification. Compute clusters can be obscured, and states unwilling to participate may simply defect. Any viable treaty must grapple with enforcement directly.
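To make the idea of a verifiable ceiling concrete, here is a minimal back-of-the-envelope sketch in Python. Every figure in it is an illustrative assumption rather than a real treaty term: the chip count, the per-chip throughput, the utilisation rate, and the 1e26 FLOP ceiling (a number in the neighbourhood of thresholds that have appeared in recent regulation) are all hypothetical.

```python
# Back-of-the-envelope check of a training run against a hypothetical
# compute ceiling. All numbers are illustrative assumptions, not
# figures from any actual treaty or regulation.

SECONDS_PER_DAY = 86_400

def training_flop(num_chips: int, flop_per_chip_per_s: float,
                  days: float, utilization: float) -> float:
    """Estimate total floating-point operations for a sustained training run."""
    return num_chips * flop_per_chip_per_s * utilization * days * SECONDS_PER_DAY

# Assumed cluster: 10,000 accelerators at ~1e15 FLOP/s peak each,
# running for 90 days at 40% sustained utilisation.
run = training_flop(num_chips=10_000, flop_per_chip_per_s=1e15,
                    days=90, utilization=0.40)

CEILING = 1e26  # hypothetical treaty ceiling on a single training run

verdict = "exceeds" if run > CEILING else "falls within"
print(f"Estimated run: {run:.2e} FLOP; {verdict} the {CEILING:.0e} FLOP ceiling")
```

The point of the sketch is that every input is externally observable: chip shipments can be tracked from manufacture, and a cluster’s size, power draw, and run duration can be metered on site, so an inspector can bound a training run’s total compute from the outside without ever needing access to the model itself.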
Such a framework would transform the “at any cost” market architecture from a race toward an existential threshold into a managed competition, one in which the burden of proof for safety lies with those who seek to increase capability.
The variable we cannot control here is time.
The development of machine intelligence could stretch across decades, allowing political and societal institutions to adapt and solidify norms. However, if capability increases are measured in weeks or months rather than decades, acceleration may outpace governance, with deployment occurring before shared constraints exist. If we create systems that eclipse human rational agency, we are not merely increasing productivity but altering the parameters under which human agency can operate. Uncertainty regarding timelines does not grant us a licence for haste but imposes a moral weight on those currently shaping the technological trajectory.
Here, though, we must confront the possibility that delay itself may be morally hazardous. If superintelligence can cure disease, eliminate scarcity, and mitigate climate collapse, then slowing development prolongs avoidable human suffering. From this perspective, acceleration under responsible stewardship could be framed as a necessity for human flourishing, or as a way to stabilise the world before adversarial systems dominate. This reframes the debate from one of speed versus safety to a choice between competing risks. Yet the reframing is not without its own hazards, as “responsible stewardship” is precisely the claim every actor in the race already makes for itself. Without an external standard by which stewardship can be assessed and verified, the argument risks becoming a more sophisticated version of the strategic logic it sought to transcend.
Ultimately, the perceived conflict between technical guardrails and societal treaties is a false one. We require both. But more than that, we require the wisdom to distinguish urgency from impatience. Technological progress is not a law of nature but a series of choices made under conditions of uncertainty.
