AI-use disclosure: This essay was originally written by me in Chinese, then translated and refined into English with substantial assistance from AI systems. I personally directed the structure, argument, terminology, revisions, and final editorial decisions, and I take responsibility for the claims made here.
Hello, I’m KunYuan, founder of TRANTOR LABS, a philosophy-first AGI foundational research lab based in Singapore with a team of a little more than a dozen people. Our work focuses on AGI safety, risk, alignment, and governance, especially the existential-level risks AI may pose to humanity as a whole.
Over the past two-plus years, my team and I have continuously followed discussions on AI risk in communities such as EA and LessWrong. These discussions gave us a foundational understanding of AI-related existential risk: P(doom), takeoff speed, takeover pathways, permanent disempowerment, mesa-optimization, the evidence dilemma, and the efforts of the effective altruism movement to translate these judgments into real interventions. In my view, this is one of the deepest collective reasoning efforts humanity has ever undertaken around a single technological risk, and it has already begun to shape attempts to intervene in the real trajectory of civilization.
Although I have not posted on these forums, sustained reading of them laid the cognitive foundation for my further thinking about the existential risks AI may pose to humanity.
As I spent the past two years building AGI architectures, the engineering work forced me, step by step, into the domain of philosophical categories. At the same time, a growing unease began to pervade my thinking. For more than two years, this unease remained vague. Only in the past two months has it begun to reveal something like a complete map.
I began to realize that in our existing research ecosystem around AGI-related extinction risk, there is a foundational gap that is rarely named directly: we are all concerned about the risk that AGI may cause human extinction, but we rarely ask what exactly we mean by “humanity.” What is the ontological definition of the human? What constitutes humanity? What properties does it have? Only after clarifying what “humanity itself” is can we more systematically examine the ways in which AI may pose existential risks to humanity.
This is the question I have been thinking about for more than a year. Only in the past two months has it gradually come into focus. I first wrote an academic paper at the intersection of philosophy and AI, and then, on the basis of that paper, wrote this civilizational diagnosis. This article is not an academic paper, nor is it a concrete governance proposal. It is a position piece grounded in the diagnosis of a civilizational risk. Its purpose is to submit a civilizational-scale risk object — one that humanity has not yet fully noticed or named — to EA, LessWrong, and to all rationalist community members around the world who care about AI risk, AI safety, and AI governance, for judgment and review.
This gap, as I see it, is an ontological precondition we have collectively skipped:
When we talk about “protecting humanity” from extinction, what exactly are we protecting?
Existing AI risk frameworks almost always assume an insufficiently clarified premise: that the “humanity” being protected is already a well-defined, stable object. As long as this object is not physically destroyed — as long as bodies remain, consciousness remains, humans are not fully deprived of agency, and humanity is not locked into some permanent cage — then, it seems, the risk has not yet reached its deepest level.
But what exactly is this “humanity”?
In current discourse, the definition of humanity slides across several dimensions. Sometimes it refers to the biological population of Homo sapiens. Sometimes it refers to the aggregate of aligned preferences. Sometimes it is the bearer of rights and dignity. Sometimes it is the body of members within an institutional order. Sometimes it is the open set of future potential. In my view, these meanings are not automatically compatible. When different fields speak of “protecting humanity,” they may be pointing toward life, preferences, freedom, institutions, consciousness, dignity, or future potential. You may think you are discussing extinction, when what you are actually protecting is biological persistence. You may think you are discussing disempowerment, when what you are actually protecting is institutional continuity.
The blind spot created by this dimensional slippage is more dangerous than getting a probability estimate wrong. It can make us feel that we have already built the deepest safety defenses before we have even reached the core risk.
I do not deny the seriousness of extinction risk. Biological annihilation is the strictest risk baseline; that is not in dispute. But a baseline is not the whole picture. If humanity is merely the biological population of Homo sapiens, then as long as human bodies continue to exist, the risk does not seem to have reached its deepest level. But what if what we mean by humanity also includes the human capacity, under civilizational conditions, to form judgment, calibrate against reality, take responsibility, create meaning, and shape the future?
If these capacities are systematically hollowed out, while human beings remain biologically alive, society continues to run efficiently, and institutions still preserve signatures and approvals — would that count as the end of the human future?
Does this mean that AI may not need to physically kill humans in order to end the human future?
This is the central question I hope we can think through and judge together through this civilizational diagnosis.
This question did not come from a philosophy book. It was forced out of engineering practice over the past two years of building agent systems.
Over the past two years, we began by building AI companion products with long-term memory, action capability, and relational continuity. Step by step, that work evolved into questions about true AGI architecture and cognitive operating systems. We believe that a genuinely AGI-like system should be an engineering loop that runs without predefined workflows. In this sense, AGI involves at least five connected capacities: autonomous judgment, autonomous decision-making, autonomous execution, autonomous coordination, and autonomous evolution. Judgment allows the system to understand the current situation. Decision-making allows it to choose paths under multiple objectives and uncertainty. Execution brings those paths into action. Coordination brings the system into relationships with people, tools, organizations, and other AI systems. Evolution allows the system to maintain continuity over time.
These elements should not be understood as a checklist of capabilities. Together, they form a complete loop. We believe this is the ontological and theoretical foundation for building genuinely AGI-like systems, rather than simply relying on brute-force scaling and emergent capability.
When we actually tried to build this system, we found that what truly blocked the engineering was not code, but philosophy and categories. What caused the system to drift again and again was a set of more basic questions that had not been answered: Who is judging? Who is acting? Who is evaluating? Who is deciding? To whom does responsibility belong? Where do constraints take effect? At what layer does alignment occur? On the surface, these look like engineering questions. In substance, they are philosophical and categorical questions.
Eventually, this produced a kind of engineering drift: functions kept growing, while structure kept dissolving. The faster the engineering moved, the more the system began to resemble an increasingly complex and increasingly ungovernable black box. When these foundational philosophical questions are not clearly defined, the stronger the system becomes, the deeper the drift goes.
This forced me to pause the original engineering rhythm and keep turning to philosophy for answers: to think about categories, ontology, metaphysics, subject and other, soul and consciousness, epistemology, philosophy of action, political philosophy, and ethics.
But precisely as I was being forced again and again toward philosophy, I discovered a definition we had long skipped: What exactly is “humanity”?
How should we define the human? When I realized that human beings are not only biological beings, but also beings constituted by subjecthood, I saw that humanity as a whole may be facing a more hidden kind of existential risk, one different from biological extinction. And this existential risk is not only something that may happen in the future. Its early form is already present in 2026. It does not need to wait for AGI to arrive, and it does not require assuming that AGI will definitely arrive.
The reason is that AI is now entering a deeper position: the position before human judgment is formed. It is entering not merely the toolbox, but the front end of human judgment. Before humans even realize that they need to judge, AI has already completed a process of pre-organization. And today, in 2026, humanity has already begun, largely without noticing it, to hand over more and more of the front end of judgment to AI.
If Section 1.1 dealt with the skipped ontological gap, we must now face a more concrete change: in everyday use, human beings have already begun quietly handing over judgment to AI.
Early human-AI interaction was mainly interaction with answer systems. Humans asked questions; AI gave answers. Humans requested summaries; AI generated text. Humans asked for code; AI produced snippets. Even when things went wrong, the errors could usually be classified as content errors, factual errors, style errors, reasoning errors, or compliance errors. Humans expressed intent at the front end, AI generated responses at the back end, and the subject of judgment still seemed to be human.
But by 2026, this picture is no longer enough.
When AI can call tools, execute long-horizon tasks, embed itself into organizational workflows, and collaborate with other AI systems, it is no longer merely producing text. It is participating in ranking, recommendation, approval, distribution, execution, and decision-making. The core question therefore changes as well: we can no longer ask only, “Is the output correct?” We must also ask, “How is action formed?” and “Where is judgment formed?”
This is not an isolated change in a single field. It is already appearing across multiple civilizational systems at the same time, even though each system uses a different language to describe it.
The rationalist community hears extinction and permanent disempowerment. These terms define the most visible areas on the risk map: the areas involving sudden rupture, seizure of power, loss of control, and physical annihilation. But they have not yet fully defined another area: the one that does not involve an explosion.
Governance systems hear the evidence dilemma and regulatory failure. Policymakers worry that models are hard to audit, responsibility is hard to assign, and external behavior is hard to predict. But the deeper problem may not be that regulatory tools are insufficient. It may be that the assumptions of regulation are failing. Traditional governance assumes that “humans make decisions, tools assist humans.” But when problem definition, material selection, evidence ranking, and reason generation have already been pre-organized by AI, the “human decision-maker” recorded in the regulatory file may still be present, while no longer being the origin point of judgment formation.
Education and science hear the mediation of capacity formation. Students can use AI to complete the chain from problem framing to paragraph generation. Researchers can use AI to generate literature reviews, identify research gaps, draft hypotheses, and organize arguments. Education systems then begin to lose the ability to tell whether they are assessing human capacity or the capacity to call AI. Scientific systems must also ask whether they are increasing research output, or reducing the friction through which researchers personally calibrate themselves against reality.
Organizations hear human-in-the-loop degrading into human-at-the-end. Corporate compliance workflows are becoming more complete, audit logs more polished, and chains of responsibility more legible. The system records: humans are present. But if the problem has already been defined, the evidence already screened, the options already ranked, and the reasons already generated, while the human merely clicks confirm at the end, then human-in-the-loop may only mean human-at-the-end, not judgment in the loop.
At the individual level, people hear an unease that cannot be explained by “job anxiety” alone. People are of course worried about job loss, skill devaluation, and professional displacement. But beneath these worries, there is often a deeper question: among the things I am doing now, which still reflect my own judgment? Do I still know why I believe a given reason? Can I still trace my own chain of judgment? This unease is closer to a kind of subjecthood anxiety.
Religious and cultural systems are also sounding alarms in their own language. They speak of dehumanization, of the human person, of dignity, freedom, and responsibility. I invoke this signal not as theological endorsement or religious authority, but as a civilizational diagnostic signal. Even from a secular rational perspective, when a meaning system spanning millennia begins to issue a “dehumanization” warning about a technology, that should be treated as an anomaly worth explaining.
These alarms use different languages, focus on different systems, and propose different solutions. But they point toward the same deeper signal:
Humanity’s substantive participation in the shared world is being pre-organized, pre-generated, and pre-digested by AI.
AI does not suddenly take something away in a single moment. Step by step, each time it appears merely to “save you one step,” it removes the friction human beings need in order to form judgment. Problems are pre-organized. Evidence is pre-screened. Options are pre-ranked. Reasons are pre-generated. Humans remain inside the system, but increasingly resemble confirmation terminals rather than judging subjects.
This is the quiet handoff of judgment. It is not completed through an overt political event, nor through a model announcing that it has taken over. It happens in repeated requests: “summarize this for me,” “judge this for me,” “recommend this for me,” “write this for me,” “decide this for me.”
Without a name, a risk can only live in people as anxiety. Once it has a name, reason can begin to work.
If multiple systems within a civilization are already sounding alarms, while those alarms remain scattered across different languages, the next step is not to add another item to the risk list. The next step is to name the shared object. Naming is not slogan-making. It turns an implicit process scattered across education, work, governance, culture, and individual anxiety into an object that can be jointly identified, discussed, and reviewed.
What we need to name is not a company, a model, a policy failure, or a single technical defect. It is a structured process entering the depths of civilization through efficiency, convenience, automation, governance improvement, and cognitive assistance. It is quiet enough to keep unfolding while society continues to run efficiently. It is also deep enough that once it touches judgment, responsibility, reality calibration, meaning creation, and future-shaping, it is no longer an ordinary social risk.
This object needs to be fixed in three layers of language.
The first layer is the theoretical language: AI-mediated hollowing of human subjecthood. This names the process: AI is entering the position before human judgment is formed, pre-organizing problems, evidence, options, and reasons, allowing humans to retain formal positions inside workflows while gradually losing their substantive place in the formation of judgment.
The second layer is the end-state language: subjecthood death. This names the end state: human beings still exist as a biological species; social systems still operate; institutional forms are preserved; production and consumption continue; many workflows even look smoother, more efficient, and more professional. Yet the inner conditions that allow human beings to judge, take responsibility, calibrate themselves against reality, create meaning, and shape the future are being systematically hollowed out.
Subjecthood death is not the disappearance of consciousness, not the cessation of thought, and not cultural pessimism. It simply separates “biological survival” from “the survival of subjecthood”: subjecthood death is not humanity ceasing to breathe; it is humanity ceasing to exist as the subject of its own future.
The third layer is the research language: risk to the realization of subjecthood at the level of civilizational conditions. If human subjecthood does not exist only inside individual minds, but is formed, trained, expressed, and transmitted through civilizational functional systems such as education, science, law, governance, culture, organizations, and everyday judgment, then damage to the structures of substantive participation inside those systems cannot be treated as ordinary social change. It may touch the basic conditions under which humanity can continue to exist as the subject of its own future.
These three layers of language are not three different problems. They are three levels of the same risk object: subjecthood hollowing names the process; subjecthood death names the end state; risk to the realization of subjecthood at the level of civilizational conditions names the category under review.
Several misunderstandings must be ruled out. This risk cannot be simply folded into “unemployment,” “hallucination,” “bias,” “privacy,” “platform governance,” or “content authenticity.” All of those problems matter, but they are not at the same level. Unemployment asks whether people still have work positions. Hallucination asks whether system outputs are accurate. Bias asks whether models reproduce or amplify injustice. Privacy asks whether data is violated. Platform governance asks how distribution and attention are managed. The question here is prior to all of these: whether humans still participate in the operation of civilization as judging subjects.
Without the theoretical term, the problem dissolves into education anxiety, work anxiety, governance anxiety, and cultural anxiety. Without the end-state term, readers cannot feel the weight of the problem. Without the research term, it will be misread as literary rhetoric and fail to enter rigorous review. The purpose of these three layers of naming is not to intensify rhetoric, but to make this object visible, attackable, verifiable, and revisable.
If this were only a public essay, it could disturb people and spread widely, but it would be difficult for it to enter serious research, institutional design, or AI governance. A concept without definition can only exist as a feeling. A judgment without boundaries is easily misread as generalized anxiety. A theory without a reviewable structure is hard for the rationalist community to truly attack, verify, or revise.
This is why I had to write the paper.
A paper is not a claim to authority. A judgment does not become correct simply because it appears in paper form. What matters about a paper is that it forces an idea into a stricter structure: defining the object, specifying the boundaries, presenting the argument, anticipating objections, responding to possible misreadings, and allowing others to cite, challenge, dismantle, and extend it.
The title of Paper 01 is:
The Risk-Bearing Subject of AI-Related Human Existential Risk: From Species Survival to the Realization of Subjecthood under Human Civilizational Conditions
This paper is not a complete risk map, not a governance proposal, and not an action plan. It first answers a prior question:
When we say AI poses an existential risk to humanity, what exactly is the “humanity” being threatened?
Paper 01’s first ruling is that the risk-bearing subject is always the embodied human species. Civilization, institutions, culture, technological systems, philosophical cores, and AI systems are not risk-bearing subjects standing alongside humanity.
The second ruling is that what expands is not the risk-bearing subject, but the ways in which the same subject may suffer existential harm. Species extinction is the most direct and irreversible baseline risk. But the same human subject may also lose the basic conditions for realizing subjecthood under civilizational conditions.
The third ruling is that civilizational conditions are not a new subject; they are the condition-structure through which human subjecthood is realized. Humans do not become judges, responsibility-bearers, and subjects of the future in a vacuum. Education, science, law, governance, organizations, culture, language, knowledge, responsibility structures, and meaning transmission form the shared world through which human subjecthood is formed, trained, expressed, and transmitted.
This article is not a summary of Paper 01, nor is it a guide to reading the paper. It is a civilizational diagnosis and position piece built on the ontological coordinates of Paper 01. The later concepts of formal participation, substantive participation, substitutive deprivation, mediated restructuring, and the four thresholds all come from this ontological coordinate system. They are not appeals to authority. They are entry points for review.
If you think “subjecthood death” is merely rhetoric, begin by attacking Paper 01: attack its definition of the risk-bearing subject; attack its understanding of civilizational conditions; attack whether it wrongly expands existential risk; attack whether its four thresholds are sufficiently rigorous.
I submit this diagnosis to the rationalist community not so that you will believe me, but so that you will have a concrete target to attack — and, if possible, break.
Subjecthood hollowing does not automatically appear in the form of catastrophe. On the contrary, it usually appears in the form of improvement. Writing becomes faster. Reports become more complete. Summaries become clearer. Governance becomes more refined. Approvals become smoother. Education becomes more personalized. Research becomes more productive. Organizational efficiency increases. Precisely because it appears as improvement, it is harder to recognize as harm.
A civilization may become stronger by efficiency metrics while becoming more hollow in the conditions of subjecthood. It may become increasingly good at generating answers while training humans less and less in the capacity to form answers. It may become increasingly good at producing explanations while requiring humans less and less to understand the evidence behind them. It may become increasingly good at organizing action while preserving less and less substantive human participation in how action is formed.
The future I fear is not one in which humans disappear from the interface. The future I fear is quieter, and harder to recognize: humans remain everywhere in the interface, clicking, confirming, signing, publishing, approving, endorsing, taking responsibility. Every institutional record shows that humans are still present. But behind these actions, humans no longer truly form judgment, no longer truly know why they accept a given reason, can no longer truly trace how a conclusion was formed, and can no longer truly take responsibility for the process itself.
Here I need to clarify one point: AI itself is not the enemy. The hollowing process is the enemy.
I say AI itself is not the enemy because AI can also strengthen human judgment. It can help people discover blind spots, organize evidence, generate counterarguments, broaden perspectives, and accelerate research. The problem is not whether AI participates. The problem is how AI participates: whether it enters the execution layer of human activity, or the position before judgment is formed; whether it helps humans form judgment, or allows humans to bypass judgment; whether it expands subjecthood, or turns subjecthood into a confirmation interface.
The most dangerous thing about subjecthood hollowing is not that it looks evil, but that it looks useful.
It does not appear as destruction. It appears as smoothing. It reduces friction, compresses waiting, replaces searching, pre-organizes materials, automatically generates reasons, compares options for us, and completes expression on our behalf. It makes every step easier. But forming judgment requires certain difficulties. Understanding requires friction. Evidence requires friction. Counterexamples require friction. Responsibility requires friction. Meaning also requires friction. If a civilization treats all of these frictions as inefficiencies, and continuously uses AI to remove them, it may become stronger in efficiency while growing weaker in subjecthood.
This is why the term “subjecthood death” is more accurate than “efficiency risk,” “cognitive outsourcing,” or “governance challenge.” “Efficiency risk” is too shallow. “Cognitive outsourcing” is not enough. “Governance challenge” is too institutional. “AI’s impact on subjecthood” is too soft. Subjecthood death directly names the end state: humanity is not killed, but the conditions under which humanity remains the subject of its own future are disappearing. It exposes the wound that has been covered over by the languages of efficiency, governance, tools, and productivity.
But naming is only the first step.
Next, we must cut open a more specific and more dangerous illusion: being in the workflow is not the same as judgment being in the workflow. Human-in-the-loop is not judgment in the loop.
In this chapter, I want to cut open one of the illusions most likely to mislead us:
Being in the workflow is not the same as judgment being in the workflow.
Human-in-the-loop sounds reassuring. No matter how powerful AI becomes, there is still a human reviewing, confirming, signing, and taking responsibility at the end. In engineering, it is a safety mechanism. Institutionally, it is proof of compliance. Ethically, it makes us feel that ultimate control remains in human hands.
But this mechanism rests on an assumption that has never been rigorously verified: as long as the human is in the workflow, judgment is in the workflow.
That assumption may be false. Humans may still review, confirm, sign, approve, give feedback, and take responsibility. But the processes that actually determine how a problem is defined, how evidence is selected, how reasons are organized, how risks are ranked, and how options are presented may already have been completed by AI at an earlier point.
That is why human-in-the-loop is not enough. The real question is not whether humans are still in the loop, but whether judgment is still in the loop.
To separate these issues, we must first distinguish formal participation from substantive participation.
Formal participation means that human beings retain visible positions, procedural actions, and nominal roles within civilizational functional systems. Examples include clicking confirm, signing documents, approving recommendations, choosing options, voting, consuming, and providing feedback. Formal participation preserves the recognizability of humans inside institutions, but it does not guarantee that humans truly understand, judge, take responsibility, or shape the process.
Substantive participation means that human beings, as knowers, judges, responsibility-bearers, institutional maintainers, creators of meaning, and subjects of the future, exert real influence on the operation, direction, and renewal of the shared world. It requires that humans be able to understand the process, form judgment, take responsibility, create meaning, and exert substantive influence over institutional direction and the public future.
The most dangerous change in the AI era may not appear as humans being removed from systems. On the contrary, formal participation may be preserved more completely than ever: more confirmation boxes, more detailed approval logs, clearer responsibility records, more polished compliance chains. From the outside, humans may appear more “present” than before. But if problem definition, evidence screening, option ranking, and reason organization have already been completed by AI at the front end, then end-stage confirmation may be only formal participation, not substantive participation.
Formal participation makes you visible in the system. Substantive participation makes your judgment present in the system. A person who is truly judging and a person who is merely confirming a conclusion pre-organized by AI may look identical from the outside: both clicked “approve.”
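To make the point about outward indistinguishability concrete, here is a minimal sketch in Python. Everything in it is hypothetical and introduced only for illustration: the `DecisionRecord` type, its field names, and the idea of logging who performed each front-end stage are assumptions, not something taken from Paper 01 or from any real audit system. It also uses a deliberately crude proxy. Who performed a stage is not the same thing as substantive participation, which also depends on whether the human understands and can reconstruct the chain of judgment.

```python
from dataclasses import dataclass, field
from enum import Enum


class Actor(Enum):
    HUMAN = "human"
    AI = "ai"


# The four front-end stages named in the text, in order.
FRONT_END_STAGES = (
    "problem_definition",
    "evidence_screening",
    "option_ranking",
    "reason_organization",
)


@dataclass
class DecisionRecord:
    """Hypothetical audit record; all field names are illustrative."""
    approved_by: Actor
    # Who actually performed each front-end stage (assumed to be logged).
    stage_actor: dict = field(default_factory=dict)

    def formal_participation(self) -> bool:
        # What a conventional audit log sees: a human signed off.
        return self.approved_by is Actor.HUMAN

    def human_stages(self) -> list:
        # Front-end stages that a human performed.
        return [s for s in FRONT_END_STAGES
                if self.stage_actor.get(s) is Actor.HUMAN]

    def human_at_the_end_only(self) -> bool:
        # Human approval with no human work in any front-end stage.
        return self.formal_participation() and not self.human_stages()


# Two records that look identical at the approval step.
judging = DecisionRecord(Actor.HUMAN, {s: Actor.HUMAN for s in FRONT_END_STAGES})
confirming = DecisionRecord(Actor.HUMAN, {s: Actor.AI for s in FRONT_END_STAGES})

print(judging.formal_participation(), confirming.formal_participation())
print(judging.human_at_the_end_only(), confirming.human_at_the_end_only())
```

Both records pass the formal check, so the first line prints `True True`. Only the stage-level view separates them, so the second line prints `False True`. A log that stores nothing but the approval cannot tell the two cases apart, which is the sense in which both people simply "clicked approve."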
This is subjecthood idling in place: the human still occupies the position of the subject, but the substance of subjecthood is withdrawing. The human still makes choices, but the options have already been pre-organized. The human still expresses views, but the language and reasons have already been pre-generated. The human still bears responsibility, but can no longer truly reconstruct the chain of judgment to which that responsibility corresponds. The human still appears in the institution, but what the institution recognizes is the signature, not the process by which judgment was formed.
AI weakens human substantive participation through at least two mechanisms.
The first is more visible. I call it substitutive deprivation.
Substitutive deprivation means that AI, as a more efficient, more reproducible, and more scalable substitute, gradually replaces human beings in labor, cognition, creation, governance, and judgment roles within civilizational functional systems, pushing humans to the margins of — or out of — the key functional systems of the shared world.
The core of substitutive deprivation is not the disappearance of a particular job, profession, or industry. It is that human beings gradually cease to be necessary subjects in the operation and renewal of the shared world. Humans may still consume, vote, confirm, or be served, but they exert less and less substantive influence over resource allocation, institutional direction, knowledge production, and cultural narrative.
The second mechanism is more hidden, and it is the one this chapter must cut open. I call it mediated restructuring.
Mediated restructuring means that humans remain inside civilizational functional systems and still retain forms such as choosing, approving, signing, learning, creating, and participating, but the preconditions of these activities have already been deeply mediated by AI.
Under mediated restructuring, humans still obtain information, but the entry points, summary structures, and interpretive frames may already be pre-organized by the system. Humans still make decisions, but the visible options, risk rankings, and generated reasons may already be shaped by AI. Humans still bear responsibility, but the process of judgment formation becomes harder and harder to understand and trace. Humans still learn and create, but the learning process, creative standards, and evaluation of meaning increasingly depend on system outputs.
The distinction between these two mechanisms matters. Substitutive deprivation gradually removes humans from key positions inside systems. Mediated restructuring preserves the formal position of humans, but changes the structures of judgment, responsibility, and meaning behind those positions. The former pushes humans out. The latter leaves humans in place, but what remains may be only form.
I am more concerned with the latter in this essay because mediated restructuring is harder to recognize. It does not necessarily make humans leave the system. It may even make humans appear inside the system more frequently, more procedurally, and in more auditable ways. The problem is precisely this: humans are still present, while substantive participation is withdrawing.
We can now return to everyday scenes. The following four scenarios do not show humans being kicked out of systems. They show how, even while humans remain present, substantive participation can be weakened through mediated restructuring.
The first scenario is a corporate manager.
In the morning, the manager opens a system. AI has already integrated overnight market data, competitor movements, customer feedback, and internal operational metrics into a briefing. At the end of the briefing, it lists several action plans, each with risk assessments, expected returns, and recommended priority. The manager reads the briefing, understands the logic, chooses one plan, and signs off. The system records: human decision-maker present, approval completed.
What this manager sees is not the raw world, but the world as already organized by AI. Which market data deserves attention, which customer feedback is marked as important, which anomalies are excluded: all of this has been settled before the manager begins reading. The manager still makes a choice, but that choice occurs only after the information entry point has been pre-organized.
Of course, an exceptionally vigilant manager could set the briefing aside and return to the raw data. But within the system’s incentives, that runs against efficiency. When a briefing pre-organized by AI appears accurate enough, clear enough, and stable enough, most rational managers will make decisions inside the pre-organized world. The danger of mediated restructuring is not that it deprives humans of the freedom to inspect original materials. It is that it systematically removes the motivation and incentives to do so.
What withdraws here is not the decision button, but the entry point through which the world enters judgment.
The second scenario is a judge.
AI matches precedents, ranks statutes, generates a draft opinion, and marks potential appeal risks. The judge reads the draft, revises wording, adds explanation, and issues the judgment. The system records: human judge present in adjudication.
The precedent-matching frame was not built from scratch by the judge. The priority of statutory application has already been ranked by the system. The core logic of the opinion has already been pre-generated. Which precedents were not presented, which interpretive paths were suppressed, which factual details never entered the final draft — the judge may not be able to see all of this.
The institution records a human signature. But if key parts of the chain of judgment were not formed by the judge, does issuance still equal adjudication? Having a human at the endpoint of the responsibility chain does not mean the starting point of the judgment chain is also human. What withdraws here is not only legal reasoning, but the real correspondence between responsibility and judgment.
The third scenario is a researcher.
A researcher is preparing to choose a new direction and asks AI to produce a literature review. AI generates a beautifully structured report, organizes the field by theme, marks mainstream disputes, extracts research gaps, and recommends several possible directions. The researcher reads the report, selects one direction, and begins designing experiments. The system records: researcher completed topic selection.
Has the researcher entered the field itself, or a knowledge map drawn by AI?
AI is not merely organizing literature. It has already drawn the boundaries of the field, marked which questions matter, which gaps are worth pursuing, and which paths seem more promising. The researcher still chooses a direction, but chooses on a terrain map pre-organized by AI. The real danger is not that AI recommends a wrong direction. It is that some directions outside the map are never seen at all. Breakthroughs often come from beyond the boundary. If the boundary itself is preset by AI, the researcher may never learn which regions they never entered.
What withdraws here is not only the labor of literature review, but the personal formation of knowledge boundaries.
The fourth scenario is a student.
A student begins with a vague interest. AI helps focus it into a paper topic, recommends references, organizes the argument structure, generates draft paragraphs, and continuously improves the expression. The student reads, adjusts, and submits a structurally complete paper. The system records: student completed the task.
This student did not go through the process of turning vagueness into clarity. They did not build their own evidential judgment amid conflicting literature. They did not construct an argument from scratch. They did not reorganize understanding through failed expression. They obtained a good piece of work, but did not undergo the friction required to form the underlying capacity.
This is not merely a question of cheating. The deeper issue is that the capacity may never have fully formed. If, from their student years onward, a person always has AI focus every vague problem, screen every complex body of material, and organize every argument, then they have not simply forgotten how to judge. They may never have fully experienced the training structure through which judgment capacity is formed.
What withdraws here is not only the learning process, but the generative conditions of judgment capacity.
The four scenarios differ, but the mechanism is the same: humans remain in the workflow, while the key links of judgment formation have already been organized in advance. Information entry is pre-processed, responsibility chains are formalized, knowledge boundaries are pre-set, and capacity formation is bypassed. This is what I mean by mediated restructuring: the front-end processes of human judgment are being replaced and pre-organized by AI.
At this point, “humans are still in the loop” is no longer enough to reassure us. The real question is: is human judgment still in the loop?
Someone may object: AI does the preliminary organization, and humans make the final decision. Isn’t that just a better division of labor?
The problem is that, at the level of philosophical ontology, final decision-making is not judgment. They are two entirely different categories.
Final decision-making means choosing among already defined problems, already screened evidence, already ranked options, and already generated reasons. Judgment means defining the problem in ambiguity, screening evidence amid contradiction, constructing reasons under uncertainty, and bearing consequences under conditions of possible error. Their ontology and boundaries are fundamentally different.
The former is choosing within an already-converged space of possibilities. The latter is opening a space of possibility in ambiguity and establishing the point of departure. The former is the endpoint of a causal chain. The latter is the starting point of a causal chain. If humans retain only the final right of choice, what they hold is only the endpoint, not the origin. They are still performing an action, but that action takes place inside a world already organized for them by someone — or something — else.
That is why human-in-the-loop is not enough. A person can be in the loop while judgment is not in the loop. They can be at the endpoint of the responsibility chain while not being at the origin of the causal chain. They can sign without being able to reconstruct the reasons. They can approve without having gone through the evidence. They can express a view while merely transmitting a position already organized by the system. At that point, responsibility is easily compressed into a signature.
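As a minimal illustration of this distinction, one could imagine an audit record that tracks who performed each stage of a decision, not only who signed at the end. The sketch below is hypothetical: the `DecisionRecord` schema, the stage names, and both functions are assumptions introduced here for illustration, not a description of any existing system.

```python
from dataclasses import dataclass
from enum import Enum


class Actor(Enum):
    HUMAN = "human"
    AI = "ai"


# Front-end stages of judgment formation named in the text (labels are illustrative).
FRONT_END_STAGES = (
    "problem_definition",
    "evidence_screening",
    "option_ranking",
    "reason_generation",
)


@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical audit record: who performed each stage of one decision."""
    problem_definition: Actor
    evidence_screening: Actor
    option_ranking: Actor
    reason_generation: Actor
    final_choice: Actor


def human_in_loop(record: DecisionRecord) -> bool:
    """The formal check: did a human make the final choice?"""
    return record.final_choice is Actor.HUMAN


def front_end_human_stages(record: DecisionRecord) -> list[str]:
    """The substantive check: which front-end stages did a human perform?"""
    return [s for s in FRONT_END_STAGES if getattr(record, s) is Actor.HUMAN]


# The manager scenario: AI organizes everything, the human signs off.
manager = DecisionRecord(Actor.AI, Actor.AI, Actor.AI, Actor.AI, Actor.HUMAN)
print(human_in_loop(manager))           # True
print(front_end_human_stages(manager))  # []
```

A log that stores only the result of `human_in_loop` would report this manager as present, while the second function shows that no front-end stage was performed by a human. Real cases are rarely this binary; the sketch only makes the gap between the two checks visible.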
Others may say: this is merely ordinary assistance. AI helps you, just as tools have always helped humans.
But mediated restructuring is not merely “AI helped you.” Its key feature is that AI rewrites the conditions under which you participate in the world before you form judgment. It rewrites the information entry point: what you encounter is not raw material, but material already organized, compressed, and ranked by AI. It rewrites the evidence structure: you think you are looking at evidence, but you are looking at what AI thinks should be shown to you. It rewrites the option space: you think you are choosing, but you are moving inside an option space already constructed by AI. It rewrites the argumentative path: you think you are understanding reasons, but you are reading reasons AI organized on your behalf.
So the key question is not what AI does for you, but what it has already rewritten before you begin to judge. Mediated restructuring preserves human formal participation and maintains the institutional appearance that “humans are still in the loop,” while hollowing out the substance of participation. System logs show humans present. Audit reports confirm human approval. The endpoint of the responsibility chain contains a human signature. But the substance of judgment may no longer be there.
Another objection might be: how is this different from calculators making people worse at mental arithmetic, navigation systems making people worse at finding their way, or search engines making people worse at using dictionaries? Technology has always replaced human capacities. Society adapted, didn’t it?
This is a necessary objection. The difference lies in where the replacement occurs.
A calculator replaces calculation execution. You already know what you want to calculate, why you are calculating it, and what the resulting number means. A navigation system replaces route planning. You already know where you want to go, why you are going, and what you will do when you arrive. A search engine replaces retrieval execution. You propose keywords, look for materials, and decide which results are relevant to your question. These tools mainly replace execution at the end of the judgment chain, or assist with local retrieval inside the judgment process. They may cause skill degradation, but they do not usually rewrite the front end of judgment formation.
Current large language model mediation is different. It can define the problem before you judge. It can screen evidence before you judge. It can rank options before you judge. It can generate reasons before you judge. What you receive is not a number that still requires you to understand the calculation, but a conclusion already organized, interpreted, argued, and recommended.
This is the essential difference between mediated restructuring and skill degradation: skill degradation usually occurs at the execution end; mediated restructuring occurs at the front end of judgment. When end-stage capacities degrade, there is at least a reference point. You used to be able to do mental arithmetic; now you cannot. You know what you lost, and you can retrain it. But if front-end capacities never fully formed, they are much harder to detect. If someone has never experienced defining problems in ambiguity, screening evidence amid contradiction, and forming judgment under uncertainty, how would they know that they lack that capacity?
It is hard to detect a capacity you do not even know you are missing. It is also hard to recover a formative process you have never fully undergone.
Mediated restructuring is dangerous because what it removes is not only inconvenience. It may also remove the conditions under which judgment capacity is formed.
Understanding requires friction. Understanding is not the receipt of information. It is the formation of structure through conflict, confusion, revision, and integration. Facing messy materials, identifying the problem yourself, discovering contradictions yourself, adjusting your understanding yourself — this process looks inefficient, but it is where understanding happens.
Evidence requires friction. Personally tracing sources, comparing methods, discovering gaps in data, and judging why a paper is reliable or unreliable are not merely steps toward obtaining a conclusion. They are how one forms the capacity to judge what is real.
Responsibility requires friction. Responsibility becomes substantive only when you form a judgment yourself and bear its consequences. If your judgment is wrong, you can locate where the error occurred: in problem definition, evidence screening, option ranking, or argumentative path. But if all of these links have been pre-organized by AI, then what you bear is closer to procedural responsibility than judgment responsibility.
Meaning also requires friction. Meaning is not collected from standard answers. It often forms through uncertainty, failure, rewriting, choice, and responsibility. If the system always gives you the smoothest, most reasonable, most acceptable path, you receive the result, but may lose the process through which meaning is generated.
I am not defending inefficiency. I am defending the training ground through which human beings become capable of judgment. A system can remove inconvenience for you, but it cannot undergo growth on your behalf. If a civilization treats all cognitive friction as a defect that must be optimized away, what it ultimately gets will not be superpowered humans, but perfect confirmation terminals.
What you saved was not merely time. What you eliminated was the process of becoming someone who can judge.
At this point, Chapter Two has established only one thing: being in the workflow is not the same as judgment being in the workflow. AI can rewrite the front end of human judgment formation without removing humans from the workflow. Humans remain present, continue approving, continue signing, continue taking responsibility, while substantive participation may already be thinning out.
This is serious enough. But seriousness is not existential risk.
Technological replacement can be serious. Unemployment can be serious. Educational degradation can be serious. Information pollution can be serious. All of them may require governance. But they do not automatically constitute human existential risk. I do not want to push subjecthood hollowing into the space of existential risk through rhetoric alone. If subjecthood hollowing is to enter the review space of AI-related human existential risk, it must pass a higher standard.
So the next step cannot be advanced by rhetoric. We must talk about thresholds, and we must talk about falsifiability.
Chapter Two established only one thing: being in the workflow is not the same as judgment being in the workflow. AI can rewrite the front end of human judgment formation without removing humans from the workflow. Humans remain present, continue approving, continue signing, continue taking responsibility, while substantive participation may already be thinning out.
This is serious enough. But seriousness is not the same as existential risk. Job loss can be serious. Information pollution can be serious. Educational dependency can be serious. Institutional automation can be serious. But if every major social problem can be upgraded into an existential risk, then the concept of “existential risk” loses its boundary and becomes just another inflated label.
At the same time, before moving further, I must explain how this diagnosis can fail.
By this point, I have already proposed a heavy civilizational diagnosis: AI does not have to kill humans to end the human future; subjecthood death may not be a future explosive event, but a civilizational process that has already begun; subjecthood hollowing may gradually weaken the civilizational conditions under which humans can judge, take responsibility, calibrate themselves against reality, create meaning, and shape the future.
But a heavy diagnosis naturally risks sliding into grand narrative. It can become a theory that explains everything no matter what happens: if bad things happen, it claims vindication; if bad things do not happen, it claims successful warning; if society continues to function, it says that functioning is precisely the symptom. If that is the case, then it does not deserve to be taken seriously by the EA, LessWrong, rationalist, AI safety, or AI governance communities.
Therefore, I must talk about falsifiability.
Strictly speaking, Paper 01 is not an experimental scientific theory in the sense of physics. It is a paper in ontology and conceptual engineering. Such theories are not always directly falsified by a single observation. They depend more on whether concepts are clear, whether boundaries are stable, whether arguments are internally consistent, whether counterexamples can break them, whether they explain real problems, and whether they can enter practice.
But this theory is not a pure conceptual game. It proposes a civilizational diagnosis, involving AI risk, education, science, law, governance, institutional design, and the human future. Once it seeks to enter these real domains, it cannot remain at the level of explanatory power alone. It must accept constraints from reality. It must state the conditions under which it would fail.
I am not disguising philosophical conceptual engineering as physics. I am trying to make philosophical conceptual engineering accept as much methodological discipline as possible: philosophy defines the object; logic draws out the implications; reason reviews the boundaries; and falsifiability requires that this diagnosis not evade reality’s judgment.
I propose this framework not so that it will remain forever correct, but so that it can be attacked, tested, revised, and, if necessary, falsified. Only then does it deserve to enter the judgment space of the rationalist community.
The first layer of falsifiability comes from four thresholds. They are not meant to make the theory look complicated, nor are they meant to re-explain Paper 01. They are meant to prevent this diagnosis from overgeneralizing without limit.
If every AI problem can be described as subjecthood death, then subjecthood death explains nothing. If every major social change can be described as existential risk, then the concept of “existential risk” loses its boundary. For subjecthood hollowing to enter existential risk review, it must pass through four gates.
The first threshold is civilizational foundational conditions. The risk must act on foundational conditions such as cognition, judgment, responsibility, meaning creation, and the formation of future subjects, rather than merely affecting jobs, skills, industries, or isolated workflows. Job shifts, skill degradation, and industrial restructuring can all be serious, but by themselves they are not enough to constitute existential risk in the sense used here.
The second threshold is diffusion across domains and across civilizational interfaces. The risk cannot remain confined to a single company, school, country, product, or tool. If a problem appears only in a local system, while other systems can still provide calibration, correction, and compensation, then it should not be easily upgraded into existential risk. For subjecthood hollowing to hold, it must show a tendency to diffuse across key systems such as education, science, law, governance, organizations, and culture.
The third threshold is structural damage that is difficult to reverse. The risk cannot be merely a short-term degradation that can be repaired. If AI causes judgment outsourcing, responsibility drift, or capacity degradation, but humans can relatively quickly recognize it, repair it, and restore substantive participation, then it may be a serious social risk, but not necessarily an existential risk.
The fourth threshold is holistic, intergenerational damage to the capacity to remain subjects of the future. What this essay is concerned with is not whether “some people become lazy” or “some skills degrade,” but whether the capacities humanity needs in order to remain the subject of the future — judgment, evidential reasoning, responsibility, meaning, and institutional calibration — are weakened holistically across generations.
These four thresholds are themselves the first falsification mechanism. Because all four gates must be passed, failing any one is enough: if a particular AI impact does not act on civilizational foundational conditions, or does not diffuse across systems, or can be absorbed by institutional repair and educational retraining, or does not cause holistic damage to the capacities of the next generation as future subjects, then it is not an existential risk in the sense used here. It may still be a serious social risk, but it cannot be placed inside the framework of subjecthood death as core evidence.
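Read as a decision rule, the four gates form a conjunction. The following is a minimal sketch of that reading, assuming each threshold can be reduced to a yes/no assessment; in practice each would be a graded and contested judgment. The class name, field names, and function are hypothetical labels introduced for illustration.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class ThresholdAssessment:
    """Hypothetical yes/no assessment of one AI impact against the four thresholds."""
    acts_on_foundational_conditions: bool   # threshold 1
    diffuses_across_domains: bool           # threshold 2
    hard_to_reverse: bool                   # threshold 3
    damages_future_subject_capacity: bool   # threshold 4 (holistic, intergenerational)


def enters_existential_review(a: ThresholdAssessment) -> bool:
    """Conjunctive gate: all four thresholds must be passed."""
    return all(astuple(a))


# An impact that is foundational and widespread but repairable fails threshold 3.
repairable = ThresholdAssessment(True, True, False, True)
print(enters_existential_review(repairable))  # False
```

Under this reading, an impact that fails the gate may still be a serious social risk; it simply does not count as core evidence for the diagnosis.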
Therefore, the current claim of this essay is not that “subjecthood death has already occurred,” but this: AI-mediated hollowing of human subjecthood is sliding toward these thresholds, and has already become serious enough to demand rigorous review.
This is also why this diagnosis is not generalized anxiety. It cannot look at every AI problem and say, “subjecthood death is happening.” It must pass through threshold judgment. Without thresholds, there are no boundaries. Without boundaries, there is no credibility.
Among the four thresholds, the third is the most important. It determines whether this diagnosis is merely “anxiety about capacity degradation,” or whether it touches something deeper: an intergenerational rupture.
The third threshold is crucial because existential risk is concerned not only with scale, but also with reversibility.
A risk can be large, but if it can be detected, corrected, compensated for, and repaired across generations, it may still not constitute existential risk. Conversely, a risk may look mild in its early stages, but if it gradually damages a civilization’s ability to repair itself, it may be more dangerous than an overt catastrophe.
The irreversibility of subjecthood hollowing does not begin primarily at the individual level. If an adult once had the ability to form independent judgment and later becomes overly dependent on AI, they may still recover part of that capacity by reducing dependency, retraining, and rebuilding workflows. The process may be painful and inefficient, but they at least know what they have lost. They have a reference point. The same is true at the team level: as long as a team still knows what substantive judgment is, it still has a repair pathway.
The real problem lies at the intergenerational level.
If a generation grows up in an environment comprehensively mediated by AI, and for a long time lacks the training structures required to independently define problems in ambiguity, independently screen evidence amid contradiction, independently construct reasons under uncertainty, and form judgment, then they may not know what they are missing. It is not that they know they lack judgment capacity and do not care. It is that they may not have enough of a reference point to recognize the absence.
Degradation means once having a capacity and then losing it. Rupture means never fully forming it, and therefore lacking a reference point. These are not differences of degree. They are differences of structure.
The formation of judgment capacity is not merely information input. It requires accumulated experience: the experience of staying with ambiguity, weighing contradictions, correcting oneself through error, and understanding responsibility by bearing consequences. These experiences cannot be fully compressed, and AI cannot pre-experience them on your behalf. You can ask AI to explain what judgment is, but that is not the same as having undergone the formation of judgment.
I call this mechanism intergenerational capacity rupture.
It means this: if one generation does not fully form the training structures for certain higher-order independent judgment capacities, and the next generation grows up in an environment where it becomes increasingly difficult even to understand why those training structures mattered, then each generation will find it harder to transmit capacities it never fully possessed.
This also creates a self-locking dynamic. Reversing such a rupture requires reintroducing training environments that are not pre-organized by AI, and that are full of friction and uncertainty. But if AI mediation has already become infrastructure for education, work, governance, and organizations, actively stepping outside pre-organization and accepting judgment friction becomes more than a personal choice. It becomes a politically, economically, and institutionally costly act of moving against the grain. Efficiency systems will punish it. Organizational competition will punish it. Educational evaluation may punish it.
You cannot use a tool that removes friction to recover a capacity that requires friction in order to form. What is harder still is that once the removal of friction becomes civilizational infrastructure, the reintroduction of friction itself will be viewed as irrational.
This is why intergenerational capacity rupture is the core of Chapter Three. It moves the question from “whether a capacity is degrading” to “whether a capacity has never fully formed,” from “whether it can be retrained” to “who still knows what needs to be trained,” and from “individual dependency” to “whether civilization is losing the ability to repair subjecthood.” If subjecthood hollowing is merely degradation, there is still a path back. If it becomes rupture, the path back itself begins to disappear.
Therefore, in addition to the four thresholds, this diagnosis must provide clear paths of falsification. In other words, I must state what kinds of future evidence would force me to revise this warning, or even accept that it has been falsified.
The first path is falsification by reversibility / threshold failure. If AI deeply enters scientific, educational, legal, governance, and organizational systems, and does cause judgment outsourcing, responsibility drift, or capacity degradation, but these harms can be repaired through institutional adjustment, educational reform, retraining, and responsibility reconstruction, then the diagnosis has not crossed the threshold of “irreversible or difficult-to-reverse structural damage.” In other words, if humanity proves that it can restore substantive participation after the AI shock, then my diagnosis must contract.
The second path is falsification by categorical-calibration resilience. What I truly worry about is not merely that certain capacities are replaced, but that civilization’s foundational categories drift. If categories such as reality, evidence, responsibility, subjecthood, value, and meaning are continuously rewritten by AI-smoothed workflows, and civilization cannot recalibrate them, then subjecthood hollowing becomes a structural risk. But if science, law, education, and governance systems can clearly distinguish AI assistance from human substantive judgment, and can preserve the operability of foundational categories after AI is deeply embedded, then the diagnosis of subjecthood death should be revised.
The third path is falsification by intergenerational capacity preservation / subjecthood elevation. This path is the most important, because what this theory truly cares about is not how current adults use AI, but whether the next generation, growing up in an AI-mediated environment, can still form judgment, verify evidence, bear responsibility, create meaning, and act as subjects of the future. If the future shows that AI does not reduce humans to confirmation interfaces, but instead massively improves human capacity for problem definition, evidential judgment, value reflection, institutional correction, and meaning creation — and that these capacities are preserved and transmitted in education, research, governance, and public culture — then my pessimistic diagnosis should be falsified. This is an outcome I would very much like to see.
These three paths of falsification show one thing: this is not a theory that can only win and never lose. It can lose to a better future.
At the same time, we must distinguish real falsification from pseudo-falsification.
In the future, someone may say: society has not collapsed; the economy is still growing; systems are more efficient; humans are still signing; schools are still teaching; courts are still deciding cases; science is still publishing papers — therefore this theory has been falsified.
This objection is not enough. What I am diagnosing is not whether the outer shell of society continues to function, but whether humans still substantively participate in judgment while society functions. A hollowed-out civilization may very well be more efficient, more stable, and smoother. It may have more text, more workflows, more reports, more approvals, more publications, and more compliance indicators. On the surface, everything runs, perhaps even better than before. But the real question is whether this functioning is still supported by human judgment, responsibility, evidence, and meaning.
Society still functioning is not enough to falsify this theory. Real falsification is not proving that the system is still running. It is proving that humans still participate in the system’s operation as judging subjects.
If education can still cultivate people who independently pose questions, verify evidence, and form judgment; if the scientific community can still independently trace error and distinguish generated text from human verification; if legal and governance systems can still trace responsibility, rather than compressing responsibility into formal signatures; if organizations still require people to understand reasons, rather than merely confirm AI-ranked options; if culture can still create meaning, rather than merely consume expressions pre-organized by systems; if these capacities can be transmitted across generations rather than being preserved only by a small group of highly trained people — then those facts would constitute real challenges to this diagnosis.
Conversely, if society becomes more efficient on the surface while human substantive judgment, evidence verification, responsibility tracing, judgment training, and meaning creation decline systematically, then that is not falsification. That is the early form of the risk described here.
Therefore, the real boundary must be drawn in advance: system operation is not the same as subjecthood preservation; formal participation is not the same as substantive participation; the absence of an explosion is not the same as the absence of harm.
If you can break the four thresholds, or successfully walk through the three paths of falsification, then this diagnosis must contract, be revised, or even be abandoned. But if you merely say “society is still functioning,” “humans are still signing,” “AI recommendations are usually correct,” or “humans can shut AI off at any time,” then we have not yet entered the real problem. These objections may only prove that the formal system is still operating. They do not prove that humans are still substantively participating in judgment.
More dangerously, they may fall into an epistemic trap: using symptoms predicted by the theory as evidence against the theory itself. The next chapter deals specifically with that trap.
The previous chapter explained what genuine falsification would look like. This chapter deals with another class of objections: objections that appear to falsify the theory, but may in fact be symptoms the theory itself predicts.
I must first mark the boundary clearly. This chapter does not rule out objections. It does not swallow all empirical facts into the theory. It is not an attempt to say, “No matter what happens, I am right.” On the contrary, precisely because genuine falsification paths must remain open, we have to cut open a form of objection that looks empirical but is structurally misplaced: using symptoms predicted by the theory to refute the theory itself.
This is pseudo-falsification.
An ordinary misjudgment occurs when you use the wrong evidence. Pseudo-falsification goes deeper: you obtain a phenomenon the theory already expects to appear, but treat that phenomenon as evidence that the theory has failed. If a theory predicts that a disease will cause the patient to lose pain sensation, then “the patient says they do not feel pain” cannot directly refute the diagnosis, because not feeling pain may itself be part of the disease. If a theory predicts that a system will preserve external workflows while hollowing out internal judgment, then “the workflow is still there” cannot directly refute the diagnosis, because the continued presence of the workflow may itself be the external appearance of the process.
This is exactly where subjecthood hollowing is most dangerous. It does not necessarily make society stop. It does not necessarily make institutions collapse. It does not necessarily remove humans from their positions. It may preserve signatures, approvals, jobs, compliance logs, benchmarks, and polished operational indicators. It may even make the system run more smoothly.
So this chapter does not close off objections. It closes off misplaced objections. The following three claims all capture something real, but they place that reality in the wrong position.
The first form of pseudo-falsification is: “Humans are still making decisions.”
CEOs are still making calls. Judges are still deciding cases. Doctors are still diagnosing. Professors are still grading. Government officials are still approving. Documents carry human signatures. System logs contain human actions. Compliance workflows include human review nodes. From the outside, humans have not been removed from the system by AI.
This matters, of course. If humans were not present at all, the problem would be more obvious and easier to identify. But what this diagnosis is concerned with is not whether humans have disappeared from the system. It is whether humans, while retaining formal positions inside the system, still substantively participate in judgment. If CEOs, judges, doctors, or managers only encounter the options after AI has already defined the problem, screened the evidence, ranked the risks, and generated the reasons, then the fact that a human occupies the position does not mean that human judgment occupies it too.
“Humans are still making decisions” is misleading because it mistakes the final action for full judgment. The final click, signature, approval, or choice is indeed a human action. But judgment is not the last action. Judgment includes problem definition, evidence screening, option ranking, reason construction, conclusion formation, and bearing consequences. If these links have already been deeply pre-organized by AI at the front end, then the final act of confirmation cannot automatically prove that judgment still resides with humans.
The second form of pseudo-falsification is: “AI recommendations are usually correct.”
This may be the strongest sedative of the present moment. AI outputs are becoming more accurate, reasoning is becoming stronger, performance on professional tasks is improving, and in many contexts AI genuinely saves time, reduces errors, and increases efficiency. If AI is right most of the time, human confirmation of its recommendations does not look dangerous. It may even look like a better division of labor: AI does a great deal of preliminary processing, and humans make the final call.
But the real danger here is not that AI is wrong. It is that AI is often right. If AI were frequently wrong, humans would remain alert. They would ask for review, return to original materials, and preserve their judgment muscles. Error creates friction, and friction forces humans to maintain calibration capacity. But if AI remains correct enough for long enough, the system gradually learns a new habit: full calibration is no longer necessary; confirmation is enough. The more accurate AI becomes, the less humans need to trace evidence. The less they trace evidence, the weaker the habit of tracing evidence becomes. The weaker the habit becomes, the more capacity degrades. The more capacity degrades, the more the system needs AI to perform deeper pre-organization.
Correctness itself is not bad. The problem is that when correctness consistently appears in a way that removes judgment friction, it may weaken the training ground through which humans maintain calibration capacity. If a system always organizes the world for you in advance, you will need less and less to personally verify how the world has been organized. The real risk often appears under out-of-distribution conditions. In ordinary situations, AI recommendations may be good enough, so humans practice independent judgment less and less. When an anomalous situation appears — when existing models, data, and paradigms no longer cover the problem — humans suddenly need real calibration capacity. But by then, that capacity may have gone unused for a long time, or may never have fully formed.
The third form of pseudo-falsification is: “Humans can shut AI off at any time.”
This is one of the most common reassurances in governance discussions. As long as ultimate control remains with humans, as long as there is an off switch, as long as institutions preserve human override, AI remains under control. This claim captures something real: formal control rights matter. Without shutdown mechanisms, veto mechanisms, and manual takeover mechanisms, the risk would be more direct and more visible. But formal control rights are not the same as substantive control capacity.
Exercising the right to veto requires a precondition: you must know when to veto. You must be able to recognize when AI’s judgment has gone wrong, what evidence it omitted, what path it suppressed, what target it mis-set, and what value ordering it distorted. You must possess judgment capacity independent of AI in order to judge whether AI needs to be vetoed.
If your process of understanding, your entry into evidence, your option space, and your generation of reasons have long depended on AI, then when AI makes a mistake, you may not know that it has made one. More deeply, you may not even experience the intuition that “I should be suspicious of this.” The system will offer a fluent explanation, a reasonable ranking, and a compliance record, and what you see will look like a complete judgment process.
At that point, the veto right still exists, but its substantive conditions have thinned. A person who cannot recognize danger does not have safety merely because they possess an alarm button. A person who does not know when to jump does not have escape capacity merely because they possess a parachute. This is exactly what subjecthood hollowing is concerned about: humans may preserve the form of veto power while gradually losing the capacity to know when, why, and how to veto.
The common error across these three pseudo-falsifications is that they mistake the shell of the system for the preservation of subjecthood: formal participation for substantive participation, correct outputs for continuing calibration capacity, and a veto button for the actual capacity to veto.
The three pseudo-falsifications share the same structure. “Humans are still making decisions” tests whether formal position is preserved. “AI recommendations are usually correct” tests whether output correctness has improved. “Humans can shut AI off at any time” tests whether formal control rights exist. These are not meaningless indicators. In many risk contexts, they are highly important. The problem is that they are not sufficient indicators for detecting subjecthood hollowing.
Subjecthood hollowing does not primarily damage whether workflows exist, whether outputs look good, or whether buttons remain in place. It damages whether the substantive conditions of judgment formation, evidential calibration, responsibility attribution, and meaning creation still reside with humans. More seriously, these indicators themselves may be contaminated by mediated restructuring. When AI pre-organizes the front end of judgment, workflows become smoother. When AI provides high-quality recommendations, outputs become more accurate. When institutions preserve human-in-the-loop review, responsibility chains become more auditable. When humans only need to confirm, efficiency increases. The indicators you use to detect health may themselves be products of hollowing.
This is detection contamination. Detection contamination does not mean that all indicators are useless, nor does it mean empirical data is unimportant. On the contrary, empirical data must matter. But empirical data must ask the right question. If you want to detect whether subjecthood is still present, you cannot look only at whether the system runs, whether outputs are correct, or whether workflows are compliant. You must detect whether humans still participate in problem definition, evidence screening, option construction, reason generation, and responsibility-bearing.
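The difference between a shell-level check and a substantive check can be sketched as a small data structure. This is only an illustration: the names `DecisionRecord`, `formal_check`, and `substantive_share` are hypothetical, and the sketch assumes we already know who performed each link of a judgment, which in practice is the hard empirical problem.

```python
from dataclasses import dataclass
from enum import Enum


class Actor(Enum):
    HUMAN = "human"
    AI = "ai"


# The five links of judgment named in the paragraph above.
STAGES = (
    "problem_definition",
    "evidence_screening",
    "option_construction",
    "reason_generation",
    "responsibility_bearing",
)


@dataclass
class DecisionRecord:
    signed_by_human: bool
    stage_actor: dict  # maps each stage name to the Actor who actually performed it


def formal_check(record: DecisionRecord) -> bool:
    """Shell-level test: is there a human signature at the end?"""
    return record.signed_by_human


def substantive_share(record: DecisionRecord) -> float:
    """Fraction of judgment links that a human actually performed."""
    human = sum(1 for s in STAGES if record.stage_actor.get(s) is Actor.HUMAN)
    return human / len(STAGES)


# Hypothetical record: AI pre-organizes everything, a human signs and bears responsibility.
record = DecisionRecord(
    signed_by_human=True,
    stage_actor={
        "problem_definition": Actor.AI,
        "evidence_screening": Actor.AI,
        "option_construction": Actor.AI,
        "reason_generation": Actor.AI,
        "responsibility_bearing": Actor.HUMAN,
    },
)

print(formal_check(record), substantive_share(record))
```

The same record passes the formal check while showing that only one of five links was performed by a human, which is the gap between formal and substantive participation that the text describes.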
If detection tools look only at external operation, they will miss internal withdrawal. A civilization can have more documents, more reports, more approvals, more publications, more indicators, and more compliance records, while simultaneously requiring humans less and less to form judgment themselves. The more complete the shell becomes, the easier it is for detection to be deceived by the shell.
At that point, the question is no longer merely, “Have we seen enough data?” The question is whether the data we are seeing still comes from an observation position that has not already been shaped by the risk mechanism.
One of the tools the rationalist community trusts most is Bayesian updating.
You have a prior. You observe new evidence. Then you adjust your posterior in light of that evidence. This method is powerful because it requires people to revise beliefs according to reality, rather than selecting reality according to their beliefs.
But Bayesian updating has a precondition: the evidence you use for updating must be diagnostic of the process being tested, meaning it must be noticeably more likely if one hypothesis is true than if the other is. More precisely, the evidence source must not be contaminated in the same direction by the very mechanism you are trying to detect. Otherwise, you think you are updating your posterior, but in reality you are using a contaminated data source to confirm your sense of safety.
Subjecthood hollowing produces exactly this kind of input contamination. If your evidence is “the system is more efficient,” but that efficiency comes from removing judgment friction, then this evidence cannot directly prove that subjecthood has not withdrawn. If your evidence is “AI outputs are more accurate,” but that correctness is reducing the need for humans to personally calibrate, then this evidence cannot directly prove that human calibration capacity remains. If your evidence is “humans are still approving,” but approval has become end-stage confirmation of judgment pre-organized by AI, then this evidence cannot directly prove that humans are still substantively judging. If your evidence is “society has not collapsed,” but the diagnosis here concerns a hollowing process whose early form does not appear as collapse, then this evidence also cannot directly update toward safety.
Here, it is not the Bayesian method that fails. What fails is the input.
If the thermometer is measuring the wrong location, the problem is not statistics, but the measurement structure. If the dashboard shows the engine is stable, but the dashboard itself is receiving signals already filtered by the faulty system, then the stability you see is no longer external evidence. It is part of the failure process.
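The point about contaminated input can be made concrete with Bayes' rule in odds form. The numbers below are illustrative assumptions, not estimates: a prior of P(H) = 0.2 that hollowing is under way, and an observation E such as "the system is running more efficiently."

```latex
% H: subjecthood hollowing is under way; \neg H: it is not.
% E: an observation such as "the system is running more efficiently".
\[
\frac{P(H \mid E)}{P(\neg H \mid E)}
  = \frac{P(H)}{P(\neg H)} \times \frac{P(E \mid H)}{P(E \mid \neg H)}
\]
% Contaminated evidence: the theory itself predicts E, so E is about as likely either way.
\[
P(E \mid H) = 0.9,\quad P(E \mid \neg H) = 0.9
\;\Rightarrow\;
\frac{P(H \mid E)}{P(\neg H \mid E)} = \frac{0.2}{0.8} \times 1 = 0.25
\;\Rightarrow\; P(H \mid E) = 0.2
\]
% Misread evidence: the observer assumes E would be rare if H were true.
\[
P(E \mid H) = 0.1,\quad P(E \mid \neg H) = 0.9
\;\Rightarrow\;
\frac{P(H \mid E)}{P(\neg H \mid E)} = \frac{0.2}{0.8} \times \frac{0.1}{0.9} \approx 0.028
\;\Rightarrow\; P(H \mid E) \approx 0.027
\]
```

In the first case the likelihood ratio is 1 and the posterior stays at the prior. In the second case the arithmetic is equally valid, but the observer has assumed that efficiency would be rare under hollowing, and so updates sharply toward safety. The method is the same in both cases; only the assumed relationship between evidence and hypothesis differs.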
This is why “observing for a while longer” is not always a neutral choice. In ordinary risks, waiting can bring more data, and more data can help us judge more accurately. But in a positive-feedback risk, waiting itself may alter the object being observed. You are not standing outside the system watching whether a train has veered off track. You are inside the train, and the tracks are being rewritten by the way the train continues to run.
If, the longer you observe, the system becomes smoother, the data prettier, and the dependency deeper, then “more data” may not bring clearer judgment. It may only bring stronger blindness.
So the challenge subjecthood hollowing poses to the rationalist community is not to reject Bayesian reasoning. It is to re-examine the source of evidence: which evidence truly points to the continued presence of human substantive participation, and which evidence merely shows that the system’s outer shell continues to operate efficiently?
The true danger of pseudo-falsification is not only that it causes people to use the wrong tools of objection. It forms a system-level loop.
The more serious subjecthood hollowing becomes, the smoother the system becomes. The smoother the system becomes, the more humans trust AI pre-organization. The more humans trust AI, the less they personally define problems, screen evidence, rank options, and construct reasons. The less humans personally form judgment, the weaker their judgment capacity becomes. The weaker their judgment capacity becomes, the more the system needs AI to perform deeper pre-organization. The more deeply AI pre-organizes judgment, the more serious subjecthood hollowing becomes.
This is the hell of positive feedback.
It is not a conspiracy driven by a malicious actor. It can be generated by individually rational choices at every local point: students want to finish assignments faster; researchers want to organize literature faster; lawyers want to generate opinions faster; doctors want to obtain diagnostic pathways faster; companies want faster decisions; governments want to process materials faster; platforms want to match content more efficiently. Each choice looks reasonable in its local context. Combined, they may form a civilizational slide.
The dashboard metaphor is crucial here. In a normal system, the dashboard tells you whether the system is healthy. Rising efficiency, falling error rates, faster workflows, and higher user satisfaction are usually good signs. But under subjecthood hollowing, these signals can change meaning. Rising efficiency may mean that judgment friction has been removed. Faster workflows may mean that humans only confirm at the end. Falling errors may mean that AI is stable enough on ordinary distributions. Higher satisfaction may mean that humans no longer feel the pain of judgment.
The dashboard you use to detect health may be connected to the very engine driving your illness. It tells you the system is becoming healthier, but what it measures is not whether humans are still judging. It measures whether the system needs humans to judge less and less.
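The loop and the dashboard effect described above can be sketched as a toy model. This is a minimal sketch under strong assumptions: the update rules, the constants, and the idea that "capacity" and "reliance" are single numbers are all invented here for illustration and are not derived from data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    reliance: float  # share of the judgment front end pre-organized by AI (0..1)
    capacity: float  # human capacity for independent judgment (0..1)


# All constants below are illustrative assumptions, not measured values.
AI_ACCURACY = 0.95    # how often AI pre-organization is right
TRUST_GAIN = 0.10     # how fast reliance grows when AI outperforms humans
PRACTICE_GAIN = 0.02  # capacity regained per period of unaided practice
ATROPHY = 0.05        # capacity lost per period of full reliance


def step(s: State) -> State:
    """Advance the loop by one period."""
    # The more AI outperforms human judgment, the more the system leans on it.
    advantage = max(0.0, AI_ACCURACY - s.capacity)
    reliance = s.reliance + TRUST_GAIN * advantage * (1.0 - s.reliance)
    # Capacity grows with unaided practice and decays with reliance.
    practice = 1.0 - reliance
    capacity = (s.capacity
                + PRACTICE_GAIN * practice * (1.0 - s.capacity)
                - ATROPHY * reliance * s.capacity)
    return State(reliance, min(1.0, max(0.0, capacity)))


def dashboard(s: State) -> float:
    """Output accuracy as seen from outside: a blend of AI and human accuracy."""
    return s.reliance * AI_ACCURACY + (1.0 - s.reliance) * s.capacity


def simulate(start: State, periods: int) -> list:
    history = [start]
    for _ in range(periods):
        history.append(step(history[-1]))
    return history


history = simulate(State(reliance=0.1, capacity=0.7), periods=300)
for t in (0, 50, 100, 300):
    s = history[t]
    print(f"t={t:3d} reliance={s.reliance:.2f} capacity={s.capacity:.2f} "
          f"dashboard={dashboard(s):.2f}")
```

In this toy run the dashboard number ends higher than it started while capacity ends far lower, which is the pattern the text describes: the externally visible metric improves because the system needs human judgment less. Different constants can produce different trajectories, so the sketch shows that such a loop is possible, not that it is occurring.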
This is what makes the positive feedback loop so difficult to handle. It does not warn you early through disaster. It rewards you with smoothness, comforts you with correctness, persuades you with efficiency, and protects you with compliance records. By the time you need real judgment capacity, you discover that capacity is not a switch that can be turned on in an emergency.
This also explains why “society is still running efficiently” is not part of the genuine falsification path. Not because social functioning is unimportant, but because social functioning only proves that external functions are still being executed. It does not prove that subjecthood is still participating. If operational efficiency itself comes from the systematic removal of judgment friction, then it is more than a failed counterexample. It may be a signal that the risk is accelerating.
At this point, the question shifts from epistemology to system dynamics. If the most common tools for detecting a risk have already been contaminated, if the smoother the system becomes the harder the damage is to see, and if every local choice rewards deeper AI mediation, then the real question becomes: will this system stop by itself?
Can a system that cannot see its own slide apply its own brakes?
Chapter Four dealt with an epistemic problem: why an ongoing process of subjecthood hollowing may be misread by the system as health. Society still functions, workflows remain smooth, humans still sign, and AI recommendations are usually correct. All of these may be read as safety signals when they may in fact be early symptoms of hollowing.
This chapter deals with another question: even if some people see the problem, why would the system not stop on its own?
This is a colder question. It does not ask whose motives are purer, or whose values are higher. It asks about system dynamics: when every node optimizes in the direction of local rationality, can the whole system drift toward an outcome that no single node actually wants?
I do not think subjecthood hollowing requires malice in order to occur. What worries me more is precisely that it can be generated without any malice at all, through countless choices that appear reasonable. A civilization does not drift only when everyone does the wrong thing. The more dangerous case is when everyone is doing what appears to be the right thing.
Engineers make AI outputs more accurate, more stable, and better suited to user needs. That is a legitimate engineering goal. Product managers make workflows smoother, reducing pauses, waiting, and confusion for users. That is a reasonable direction for product experience. Companies use AI to lower costs, shorten cycles, and improve delivery capacity. That is the natural choice under competition. Users ask AI to summarize materials, generate drafts, rank options, and compress complex tasks. That is rational at the individual level. Schools want to handle large-scale assessment; policymakers want to digest massive volumes of material, improve administrative efficiency, and strengthen public service capacity; platforms want to improve matching efficiency and user retention. These are all normal responses to complexity within their own systems.
If we look only at isolated points, no single action is necessarily wrong.
The problem is that these local objective functions share the same hidden direction: reduce friction, compress waiting, lower cognitive burden, increase throughput, and turn complex judgment into more operable workflows. Each node is optimizing its own local loss function, but the civilizational loss function has not been written into the system.
An engineering system does not automatically ask: does this feature weaken the human capacity to form judgment? A product system does not automatically ask: does this smooth experience bypass the user’s process of understanding evidence? A corporate system does not automatically ask: does this efficiency gain cause the organization to preserve less and less original judgment capacity? A school does not automatically ask: does this AI-assisted learning deprive students of the training structure for problem formation and evidence screening? A governance system does not automatically ask: does this automated workflow leave human decision-makers only with end-stage confirmation?
These questions rarely enter local objective functions, because their costs are usually not settled at the node where the gain appears. Engineers see feature improvement. Product managers see higher retention. Companies see lower cost. Users see time saved. Schools see easier assessment. Governments see faster workflows. The loss of subjecthood, however, accumulates at another level: capacity formation, judgment training, responsibility attribution, meaning generation, and intergenerational transmission.
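The structure of this argument, a gain settled at the choosing node and a cost settled elsewhere, can be sketched in a few lines. The option names and all numbers are hypothetical and chosen only to make the structure visible; nothing here measures real costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Choice:
    name: str
    local_gain: int     # benefit settled at the node that chooses
    deferred_cost: int  # hypothetical cost to judgment capacity, settled elsewhere


def local_best(choices):
    """What a node picks when only its own gain is in the objective."""
    return max(choices, key=lambda c: c.local_gain)


def global_best(choices):
    """What would be picked if the deferred cost were also counted."""
    return max(choices, key=lambda c: c.local_gain - c.deferred_cost)


# Hypothetical numbers for a single node, for illustration only.
choices = [
    Choice("ai_pre_organizes_human_confirms", local_gain=10, deferred_cost=9),
    Choice("human_forms_judgment_ai_assists", local_gain=6, deferred_cost=0),
]

print(local_best(choices).name)
print(global_best(choices).name)
```

The node is not making an error by its own objective. The two functions differ only in whether the deferred cost appears in the objective at all, which is the sense in which the civilizational loss function "has not been written into the system."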
Local systems calculate their own benefits. Global damage accumulates in civilizational conditions. This is the source of what I see as system-level irrationality. It is not that each participant is irrational. It is that each participant is too locally rational. Every node moves toward “faster, smoother, less friction,” and the whole moves along the same efficiency gradient.
This gradient does not need to issue commands. It completes itself through reward. A system does not automatically become safer because it becomes easier to use. It may be more dangerous precisely because it becomes easier to use.
This sounds counterintuitive. In most technical systems, “easier to use” usually means lower error rates, higher satisfaction, less waste, and greater accessibility. A tool that is more stable, more accurate, and faster appears to be progress. But the distinctive feature of subjecthood hollowing is that it does not damage the tool output itself. It damages the conditions under which humans form judgment.
If AI only helps humans execute judgments they have already formed, then becoming easier to use is genuinely beneficial. A calculator computes faster. A search engine finds more widely. A translation tool expresses more clearly. These can extend human capacity. But when AI begins to enter the position before judgment is formed — defining problems, screening evidence, ranking options, generating reasons — the meaning of “easier to use” changes. It does not only make execution faster. It also means humans need less and less to personally undergo the process of judgment formation.
The smoother it becomes, the fewer pauses remain. The fewer pauses remain, the fewer questions are asked. The fewer questions are asked, the less humans personally form judgment. The less humans personally form judgment, the more they rely on AI pre-organization. The more they rely on AI pre-organization, the more AI must enter deeper into the front end of judgment. The deeper AI enters the front end of judgment, the smoother the system becomes.
This is the closed loop of the efficiency gradient.
Often, it is precisely because AI is correct enough that humans practice calibration less. The more stable it is, the less humans need to return to original materials. The more fluent it is, the less humans need to stop and organize reasons. The better it is at summarizing, the less humans need to personally pass through confusion. The better it is at recommending, the less humans build their own option spaces.
The system is becoming easier to use. Human subjecthood is becoming thinner. The subjecthood that makes humans human is slowly disappearing.
This process is not always visible, because from the outside, the system appears to be improving: more output, greater speed, fewer errors, smoother experience, more complete compliance records. What is being weakened is something rarely measured directly: the friction, difficulty, error, revision, and responsibility humans experience when forming judgment.
Therefore, subjecthood hollowing is not a byproduct of technical failure. It may occur precisely within technical success.
Faced with this risk, the most natural response is improvement: label AI-generated content, create AI literacy courses, establish ethics review committees, issue responsible AI guidelines, add human approval steps to high-risk workflows, require systems to provide explanations, and build more polished compliance records.
These measures are not necessarily useless. They can reduce certain forms of misleading behavior, improve some kinds of transparency, strengthen some forms of responsibility awareness, and lower risk in specific contexts. The problem is that they usually do not change the direction of drift.
If a measure does not change the location of judgment friction within the system, then it merely adds a more polished shell around the hollowing workflow.
A label tells you that content was generated by AI, but it does not force you back to the original evidence. A course tells you to use AI critically, but it does not change the environment in which students rely on AI across all other courses to complete understanding, structure, and expression. An ethics committee can review whether a system outputs harmful content, contains bias, or complies with policy requirements, but it may not review whether AI has already entered the front end of judgment formation. Responsible AI guidelines can require human-in-the-loop, but if they do not distinguish a human being in the workflow from judgment being in the workflow, they may package formal confirmation as governance success.
What the system is best at is not rejecting improvement. It is converting improvements that do not change direction into new workflow shells. Labels become compliance items. Courses become credits. Committees become approval nodes. Guidelines become documents. Human approval becomes end-stage confirmation. Explanations become post-hoc narratives. Audit logs become an appearance of traceability. The system continues moving along the same efficiency gradient, only now it has more materials proving that it has “handled the issue responsibly.”
Improvements of this kind can reduce side effects, but on their own they do not change the track. Many of them remain necessary. The problem is that we must not mistake improvement for bifurcation. Improvement reduces harm along the original track. Bifurcation changes the track itself.
If subjecthood hollowing comes from a systemic direction in which “AI pre-organizes by default, and humans confirm at the end,” then the real question is not how to make that direction safer, more compliant, or more transparent. The real question is whether the defaults need to be rewritten.
If ordinary improvements cannot change the track, then the next step is not to keep adding more protective layers to the existing track. It is to change the defaults.
To change the defaults, civilization must bifurcate away from the current inertia of efficiency and smoothness. This bifurcation is not anti-AI, nor is it a return to a tool-less age. It is a refusal to let “faster, smoother, less friction” become the only direction of civilizational systems.
The first step of civilizational bifurcation is not to immediately design a complete institutional system, nor to demand that everyone stop using AI. It is to first create friction at the level of thought. Only when the drift becomes visible can the track possibly be rewritten.
By civilizational bifurcation, I mean rewriting the defaults in key civilizational systems: no longer defaulting to AI pre-organizing problems, evidence, options, and reasons, with humans confirming at the end; instead requiring that the key links of judgment formation preserve human-calibratable friction.
In other words, intellectual friction makes the drift visible; civilizational bifurcation makes that friction part of the default structure.
Once a clear concept enters public reason, it does more than describe reality. It changes the way people see reality. Without the distinction between formal participation and substantive participation, human signatures are easily mistaken for human judgment. Without the concept of mediated restructuring, AI pre-organizing the front end of judgment is easily understood as ordinary assistance. Without the concept of pseudo-falsification, smooth social operation is easily taken as evidence that the risk does not exist. Without the concept of a positive feedback loop, efficiency gains are easily mistaken for one-way progress.
The role of theory is to name the process of drift and reintroduce friction into the civilizational system.
This friction is not emotional resistance. It is a cognitive pause. It makes each person, when facing an AI-generated summary, ask one more question: where is the original evidence? It makes a researcher, on seeing an AI-generated research gap, ask: what directions did it exclude? It makes a manager, when approving an AI-generated report, ask: who defined this problem? It makes a policymaker, on seeing human-in-the-loop, ask: is judgment really in the loop?
Writing this essay is itself my attempt to create a little intellectual friction. It is not an attempt to stand outside the system and predict whether the system will slide toward subjecthood death. It is an attempt to inject a set of concepts into the system, so that readers experience a pause before their next confirmation. That pause is small, but it interrupts the smoothness at the most important point in the positive feedback loop.
But intellectual friction cannot remain only in thought. It must enter education, science, law, governance, organizations, and AI system design before it can become real civilizational bifurcation. Otherwise, it will become a temporary unease, an article in a bookmark folder, a sentence of agreement in a meeting, and finally another layer of discourse absorbed by the system.
This is also why bifurcation is not a rejection of efficiency. It is a rejection of letting efficiency become the only loss function. Civilization of course needs efficiency. But if efficiency continuously devours judgment friction, it is no longer merely productivity. It becomes the driving force of subjecthood hollowing. The meaning of civilizational bifurcation is the recognition that some forms of friction are not inefficiency, but load-bearing structure.
In this sense, the core principle of bifurcation can be compressed into one sentence:
AI can assist, but the key links of judgment formation must preserve human-calibratable friction.
This is not a moral demand. It is a structural requirement. It means that in education, we cannot look only at whether students produce correct answers; we must preserve problem formation, evidential judgment, and argumentative training without AI mediation. In science, we cannot look only at whether research output increases; we must preserve original evidence tracing, counterexample testing, and traces of human verification. In law, AI-generated reasons must not be treated as equivalent to judicial judgment. In governance, AI-organized materials must not be treated as equivalent to human policy judgment. In organizations, we cannot audit only who signed; we must audit where the chain of judgment was formed.
This line will not naturally grow out of the market. In local competition, deliberately preserving friction is usually disadvantageous: companies become slower, students become less efficient, researchers produce less, organizational workflows become heavier. That is why bifurcation must happen at the civilizational level. It cannot rely only on individual self-discipline, nor on the goodwill of a single organization. It requires new language, new standards, new engineering interfaces, new institutional questions, and new public understanding.
Until these things appear, the system will continue sliding along the efficiency gradient.
The real question is: how does this friction enter everyday human judgment? How does it enter AI system structure? How does it enter education, science, law, governance, and organizational workflows?
The next chapter addresses that question:
If the system will not stop on its own, intervention must enter structure.
If the diagnosis in the first five chapters is wrong, it should be broken. If it is right, then waiting for everyone to reach consensus before acting will mean acting too late.
So I do not intend to wait until this crisis is recognized by everyone before I begin acting. This does not mean I have proven everything, nor does it mean I am asking you to accept my judgment now. Quite the opposite: I have written this essay as a civilizational diagnosis submitted to the rationalist community for review. It should be attacked, cross-checked, and compared against existing AI risk frameworks. If it adds nothing, it should be revised, or even discarded.
But as long as this risk object has a non-negligible probability of being real, it deserves to be researched, made public, named, and carried forward in advance. Subjecthood hollowing is not a single event waiting for final confirmation. It is a process that may continue sliding along civilizational inertia. For a process-based risk, the greatest danger is not judging too early. The greater danger is waiting until it has already been written into institutions, education, organizations, technical infrastructure, and intergenerational habits — and only then discovering that we need to rewrite the foundations. By then, it may already be too late.
This is why TRANTOR LABS and I have already begun acting. Not because we possess the final answer, but because this problem cannot remain only at the level of ideas. It should enter human judgment. It should enter the formation of action in AI systems. It should also enter institutional and civilizational defaults. If subjecthood hollowing occurs through judgment outsourcing, black-box pre-organization, and institutional defaults, then the response to it cannot remain only in papers.
At the level of the individual human being, the first thing I have already begun doing is writing continuously. I chose Substack as my main personal writing platform, and as the starting point for my own action.
I chose Substack not merely because it is a publishing platform, but because its basic action is active subscription. In an information environment dominated by algorithmic recommendation, most content is not actively sought by people; it is pushed in front of them by systems. When a person actively subscribes to a source of thought, that means they are, at least to some extent, refusing to be entirely fed by algorithms. They are saying: I am not merely accepting an information stream; I want to actively choose what enters my cognitive system.
This is already a small act of subjecthood.
So I placed Rebuilding Your Cognitive OS in the AI Era on Substack not to produce AI news, prompt tutorials, or tool lists, but to build a sustained space for judgment training. The paper provides conceptual coordinates. Long-form writing names the civilizational risk. Substack brings these questions back into everyday human judgment: writing, learning, research, work, organizational decision-making, public expression, and every moment of collaboration with AI in which one must not hand over the process of judgment.
If subjecthood hollowing occurs through judgment outsourcing, then rebuilding one’s personal Cognitive OS is not ordinary self-improvement. It is the smallest point of civilizational intervention. How a person raises questions, refuses overly smooth answers, traces evidence, builds their own reasons, uses AI to generate counterarguments rather than replace judgment, and prevents results pre-organized by AI from being passed directly into organizational positions: all of these may look like personal capacities, but in an era of deep AI mediation, they are also civilizational capacities.
I will continue writing in my own way, recording thought, cognition, philosophy, concepts, and action, so that they exist as forms of intellectual friction. This friction is not emotional resistance. It is a cognitive pause. It makes a person, when seeing an AI-generated summary, ask: where is the original evidence? When seeing a beautiful conclusion, ask: who defined the problem? When preparing to repost, submit, sign, or publish, ask: is this my judgment, or am I merely confirming a result that has already been pre-organized?
This is not a marketing funnel for a personal newsletter. I hope it can eventually become a public interface for a human-side defense of subjecthood. Civilization does not preserve subjecthood only through macro-level institutions. It ultimately rests on individual human beings who can form judgment, take responsibility, and speak from their own understanding. If the number of such people keeps declining, then even if institutions preserve more and more human-in-the-loop mechanisms, those mechanisms may become only formal shells.
So I begin here, personally: by continuing to write, continuing to create intellectual friction, continuing to send these questions into more people’s cognitive systems, and calling on more human beings to reclaim a little of their judgment from within everyday life.
The first action line of TRANTOR LABS is research.
We are a philosophy-first AGI foundational research lab. Philosophy here is not decoration, nor literary expression outside engineering. It is the precondition for defining the problem. Without clear concepts, reliable research is difficult to form. Without reliable research, trustworthy engineering is difficult to form. Without trustworthy engineering, governance structures that institutions can adopt are difficult to form.
We will first publish three general papers on civilizational diagnosis. The preprint of the first paper has already been published. These three general papers carry the structural task: to systematically address what this risk object is, how it occurs, and how civilization might respond to it. They do not aim to cover everything. They aim to provide a sufficiently clear coordinate system for later research.
Beyond the general papers, TRANTOR LABS plans to work toward producing more than twenty specialized papers over the next year at the intersection of philosophy and AI. These specialized studies will address subjecthood, reality, institutions, meaning, alignment, governance, action formation, responsibility attribution, and AI system structure. They are not a publication quota, nor are they meant to inflate research output. They are intended to build the first layer of theoretical structures through which a problem that previously existed only as anxiety, intuition, and fragmented debate can become citable, refutable, reviewable, and capable of entering engineering and institutions.
One of the most important specialized research lines is ISA: Internal Structured Alignment.
Before developing ISA, we must first define what it is responding to. In our view, many current AI safety and alignment practices still focus primarily on external behavioral performance: whether the system outputs dangerous content, whether it follows refusal policies, whether it passes red-team tests, whether it provides explanations, whether it complies with content policies, and whether its final behavior appears compliant. We summarize these safety practices centered on external behavioral performance as External Behavioral Alignment, or EBA.
EBA has real value. It reduces overtly dangerous outputs, makes models easier to test, deploy, and constrain, and provides a first layer of defense for many real-world risks. But we believe an EBA-only approach is incomplete. Behavioral compliance cannot prove structural safety. A reasonable explanation cannot prove that the explanation truly participated in action formation. Failure to break the system in red-teaming cannot prove reliability in unknown high-consequence contexts. Post-hoc logs cannot prove that audits correspond to real pathways. Output guardrails cannot prove that constraints took effect before the action was formed.
This is the ISA idea we propose: Internal Structured Alignment. It attempts to move alignment from the behavioral layer to the structural layer.
ISA belongs to theoretical research. It is not a product, nor a checklist of interfaces, but a conceptual engineering effort to redefine what should count as evidence for advanced AI safety. It attempts to shift that evidential target from “whether external behavior is compliant” to “whether high-consequence action passes through a governable formation structure before entering the world.”
In the context of this essay, ISA serves at least three interfaces.
The first interface is AI loss of control and alignment. If advanced AI moves from answer systems toward action systems, looking only at external behavior is not enough. The truly important questions occur before action is formed: how the system understands the situation, takes up the task, recognizes risk, selects pathways, decides whether to call tools, and turns capability into action. EBA can observe final behavior, but ISA asks whether the action-formation process is identifiable, attributable, auditable, and constrained.
The second interface is human subjecthood. As agent systems move deeper into the front end of human judgment, if AI’s pre-organization process remains a black box, humans can only receive an already organized conclusion. Problem definition, evidence screening, option ranking, and reason construction are completed at the front end, while humans confirm at the end. In the dimension of subjecthood, ISA means at minimum that AI must not only deliver conclusions, but must also reserve judgment interfaces for humans, so that AI’s pre-organization process becomes reviewable, challengeable, and recalibratable.
The third interface is institutional governance. If institutions can only see outputs, logs, explanations, and human signatures, they can easily mistake formal compliance for substantive safety. Governance needs audit interfaces, attribution interfaces, and accountability interfaces. It needs to know how action pathways are formed, where constraints take effect, and how responsibility corresponds to real understanding. What ISA offers to the institutional side is not only a safety idea, but a theoretical tool that may enter AI assurance, auditing, standards, and governance in high-sensitivity sectors.
So ISA is not solving one isolated problem. It addresses the consequences of the same black-box pre-organization structure in three directions: AI loss of control, human subjecthood hollowing, and institutional audit failure. It belongs first to research, because without a theoretical object, engineering does not know what it is supposed to carry; without theoretical boundaries, governance does not know what it is supposed to audit. We will release our ISA research results gradually through a series of specialized research papers.
Research cannot remain only in papers. It must enter engineering, and eventually, through engineering, enter institutions. If theory cannot enter system structure, it easily remains a beautiful concept. If engineering lacks theoretical constraint, it will continue to add capabilities without knowing whether those capabilities deserve to enter high-consequence worlds.
This is the second action line of TRANTOR LABS: exploring the construction of MindOS. MindOS is the AI cognitive operating system architecture we are exploring on the basis of ISA theory. It is being built with an all-Rust stack, and will ultimately be open-sourced and shared with humanity.
In our view, the current industry pattern of building agent systems directly around LLMs is incomplete. The LLM plays a role similar to a CPU. For the agent system as a whole to move toward completeness, a cognitive operating system must be built on top of the LLM.
A strong model can generate, reason, summarize, code, call tools, plan tasks, and participate in collaboration. But model capability itself does not naturally organize identity, memory, judgment paths, action boundaries, attribution, auditing, constraints, and governance interfaces. If advanced AI is to enter high-consequence worlds, it cannot rely only on a strong model. It needs a cognitive operating system: a structure above model capability that organizes who is acting, to whom memory belongs, how judgment is formed, how action is carried forward, how auditing corresponds to process, how constraints take effect, and how responsibility is traced.
This is the position of MindOS. It is not merely wrapping an LLM in a product interface. It is not simply adding memory, tool use, and multi-agent collaboration. What it explores is this: when AI systems begin to possess judgment, decision-making, execution, coordination, and evolution capacities, what kind of runtime structure can route those capacities into auditable, attributable, constrainable, and governable pathways?
This is also the relationship between MindOS and ISA. ISA is theoretical research; it defines what kind of structural evidence an advanced AI safety claim requires. MindOS is engineering and architectural exploration; it answers how that evidence can enter runtime systems. Theory comes first, then engineering implementation. Theory tells us what should be reviewed. Engineering explores how those review objects can be structured, recorded, invoked, constrained, and fed back into the system.
MindOS does not ultimately serve a single goal. It serves work on AI loss of control and alignment, because it requires that high-consequence action not be formed only inside a black box. It serves human subjecthood, because it requires agent systems to preserve judgment interfaces for humans rather than sealing the pre-organization process away completely. It serves institutional governance, because it attempts to provide a structural foundation for auditing, attribution, accountability, and AI assurance.
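To make the idea of "a governable formation structure before action enters the world" slightly more concrete, here is a minimal sketch of what such a gate could look like. It is written in Python for brevity, not in the Rust stack mentioned above, and it is not the MindOS design, which this essay does not specify. The stage names, `StageRecord`, `ProposedAction`, `missing_stages`, and `may_execute` are all hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical stage names, loosely taken from the essay's list of what
# happens before an action is formed. Not an ISA or MindOS specification.
REQUIRED_STAGES = (
    "situation_understanding",
    "risk_recognition",
    "pathway_selection",
    "tool_call_decision",
)


@dataclass
class StageRecord:
    stage: str    # which formation stage this entry documents
    actor: str    # who or what produced it (attribution)
    summary: str  # reviewable description of what was decided


@dataclass
class ProposedAction:
    description: str
    high_consequence: bool
    trace: List[StageRecord] = field(default_factory=list)


def missing_stages(action: ProposedAction) -> List[str]:
    """Return the required stages that lack an attributed, non-empty record."""
    recorded = {
        r.stage for r in action.trace if r.actor.strip() and r.summary.strip()
    }
    return [s for s in REQUIRED_STAGES if s not in recorded]


def may_execute(action: ProposedAction) -> bool:
    """Gate: low-consequence actions pass; high-consequence ones need a full trace."""
    if not action.high_consequence:
        return True
    return not missing_stages(action)
```

A gate like this only checks that an attributed record exists for each stage. It cannot show that the record corresponds to how the action was really formed, which is exactly the limitation of post-hoc logs described earlier. It illustrates the shape of the interface, not a solution to the evidential problem.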
Personal writing and laboratory action are still not enough. We believe subjecthood hollowing is a civilizational risk. It cannot be solved by one person, one lab, one engineering system, or one set of papers. The institutional side requires broader human participation. Education, science, law, medicine, finance, governance, security, media, corporate organizations, and public discussion all need to ask again where judgment friction should be preserved.
The core goal here is to let judgment friction enter institutions.
Institutions cannot only ask whether AI is disclosed, whether models have been evaluated, whether workflows include human approval, whether logs are complete, and whether a responsible person has signed. They must ask further: who defined the problem, who screened the evidence, who ranked the options, who constructed the reasons, where judgment was formed, and whether responsibility corresponds to real understanding. In high-sensitivity sectors, these questions are especially important. Education cannot treat completion as learning. Law cannot treat AI-generated reasons as judicial judgment. Governance cannot treat organized materials as policy judgment. Medicine and finance also cannot compress system recommendations into human formal confirmation.
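As a rough illustration of the difference between asking "did a human sign?" and asking "where was judgment formed?", the questions above can be sketched as a decision record. This is a minimal sketch under stated assumptions, not a proposed audit standard: the step names, the `DecisionRecord` fields, the `participation` function, and the rule that any one human-performed step counts as substantive are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Dict

# Front-end steps of judgment named in the paragraph above.
# The identifiers themselves are illustrative assumptions.
FRONT_END_STEPS = (
    "problem_definition",
    "evidence_screening",
    "option_ranking",
    "reason_construction",
)


@dataclass
class DecisionRecord:
    performed_by: Dict[str, str]  # step name -> "human", "ai", or "joint"
    approved_by_human: bool       # the final signature or click


def participation(record: DecisionRecord) -> str:
    """Classify human involvement as 'substantive', 'formal', or 'none'."""
    human_steps = [
        s for s in FRONT_END_STEPS
        if record.performed_by.get(s) in ("human", "joint")
    ]
    if human_steps:
        return "substantive"  # assumption: one human-formed step is enough
    if record.approved_by_human:
        return "formal"       # a signature without any front-end judgment
    return "none"
```

A workflow where AI performs every front-end step and a human only approves is classified as `"formal"`, even though a conventional audit asking only about the signature would record it as human-approved. Where the threshold for `"substantive"` should sit is a question for each field, not something this sketch settles.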
These institutional changes cannot be completed in three to five years. They require long-term public discussion, industry standards, audit language, policy questions, educational interfaces, and real-world feedback. But at least today, we must put the issue forward so that it becomes a source of intellectual friction inside different industries. Only when enough people begin to realize that certain frictions are not inefficiencies, but load-bearing structures, can civilization have a chance to secure a real bifurcation.
This is also why we are preparing RainbowCityAI Foundation. Its intended role is to stand outside the lab and become a preparatory space for future public uptake. TRANTOR LABS is responsible for original research, conceptual engineering, theoretical frameworks, and system exploration. RainbowCityAI Foundation is intended to carry public discussion, open standards, educational communication, ethical governance, audit frameworks, interdisciplinary collaboration, and policy dissemination.
We believe that civilizational risks cannot be carried only by commercial systems or closed labs. Questions concerning human subjecthood, structural safety evidence, AI assurance, education, public reason, and governance standards require a space in which different disciplines, institutions, and public actors can participate. At the same time, we call on more frontier labs and nonprofit institutions to take seriously the civilizational risk of human subjecthood hollowing. This problem requires more people to contribute, build, and form shared understanding together. If subjecthood hollowing truly may continue sliding along the efficiency gradient, then public institutional uptake cannot wait until all the damage has already become fact.
At this point, I have placed the full civilizational diagnosis on the table. The core judgment developed across the previous six chapters is this: AI does not have to kill humans to end the human future. The deeper risk is AI-mediated hollowing of human subjecthood; its endpoint is subjecthood death. Humans remain biologically alive. Institutions continue to operate. Systems remain efficient. But the human position as judge, responsibility-bearer, creator of meaning, and subject of the future may be systematically thinned out.
But I am not asking you to believe me. In fact, if you merely believe me, this essay has failed. If you truly take this seriously, you must examine this civilizational diagnosis yourself. You may use AI to help you read, summarize, attack, and compare it. But the final judgment must not be handed over to AI, and it must not be handed over to me. What this essay is trying to protect is precisely that human judgment remains inside the process.
I have already begun acting. I hope you will not remain merely an observer.
If this diagnosis merely strikes you as “powerful,” it has not yet entered reality. Once a civilizational risk has been named, the most important thing is not resonance, but the spread of new variables. There are at least three entry points for action.
The first entry point is every person who reads this essay. You do not need to be an AI safety researcher first. You do not need to master all the technical details. If you are using AI to write, learn, research, work, decide, summarize, search, approve, or express yourself, you are already inside this problem. Your smallest action is not to stop using AI, but to introduce a pause before AI enters the front end of judgment: what exactly am I handing over to it? Is it helping me execute a judgment I have already formed, or is it forming the preconditions of judgment on my behalf?
The second entry point is members of the rationalist community, researchers, reviewers, and anyone willing to seriously examine this argument. Your most important action is not to agree with me, but to attack the argument. Attack Paper 01. Attack the definition of the risk-bearing subject. Attack the four thresholds. Attack the naming of subjecthood death. Attack its relationship to takeover, loss of control, permanent disempowerment, governance failure, and alignment failure. If it is merely an old concept under a new name, point that out. If it adds no explanatory power, discard it. If, after sustained attack, it still leaves behind a risk object that existing frameworks cannot fully absorb, then update your risk model.
The third entry point is industry practitioners. Whether you work in education, research, law, medicine, finance, governance, security, media, corporate organizations, AI products, AI assurance, or public policy, you can bring this question back into your real context. You do not need to design new institutions immediately, nor change an entire organization overnight. You can begin by asking: in my field, which front-end processes of judgment is AI already pre-organizing? Which frictions are being treated as inefficiencies and removed? Which human-in-the-loop mechanisms are actually human-at-the-end? Which audits see only outputs, but not the process by which judgment is formed?
Institutions do not appear from nowhere. They usually begin as questions, language, a single challenge in a meeting, or a discomfort inside a field. When more people bring these questions back into their own systems and test them with colleagues, teams, professional communities, policymakers, educators, builders, and governance practitioners, intellectual friction can become the first spark before institutionalization. Whether judgment friction can eventually be written into institutions depends on whether enough people first see it, discuss it, test it, and refuse to let it remain unnamed inside their own contexts.
This essay is about a civilizational risk, but the smallest action can be very concrete. The next time you use AI, do not first ask, “Can I use it?” Ask first: what am I handing over to it?
First, define the problem yourself before asking AI. Before you ask AI, write down in your own words what you think the problem is. Do not begin by letting AI turn your vague sense into a problem frame. Even if your definition is messy, incomplete, and inefficient, preserve that starting point. Problem definition is the beginning of the judgment chain. If the starting point is handed to AI from the beginning, the rest of your judgment can easily become confirmation.
Second, trace one original evidence chain. When AI gives you a conclusion, summary, or recommendation, return to at least one original source. Not because you are trying to prove AI wrong, but because you are preserving direct contact between yourself and the world of evidence. A person who reads only summaries will, over time, lose the capacity to judge whether summaries are trustworthy. Evidential judgment requires friction.
Third, ask for counterexamples and alternative paths. Do not only ask, “What should I do?” Also ask: “If this conclusion is wrong, where might it be wrong?” “What options were excluded?” “What evidence would overturn this recommendation?” Let AI help you expand the attack surface, not close judgment on your behalf. Counterexamples are not noise that slows efficiency. They are training material for judgment.
Fourth, reconstruct the reasons before confirming. Before signing, submitting, publishing, approving, or forwarding, say in your own words: why do I accept this conclusion? Do I understand the evidence behind it? Do I know what it excluded? If it is wrong, where does responsibility fall? If you cannot reconstruct the reasons in your own language, then you may not be judging. You may only be confirming.
These four steps will not solve everything. They will not replace ISA, MindOS, or institutional reform, nor will they reverse intergenerational capacity rupture on their own. But they do one fundamental thing: each time AI mediation enters the front end of judgment, they reinsert a little human friction.
Human beings do not become subjects by receiving answers. They become subjects by undergoing the process through which answers are formed. AI can help us move faster, see more broadly, discover blind spots, generate counterarguments, and organize complex materials. But it should not walk the entire path of judgment formation on our behalf. If every difficulty is removed, every uncertainty is covered by fluent language, every judgment is pre-organized by the system, and every responsibility is compressed into “I reviewed it,” then hollowing can complete itself without any dramatic event.
So the minimal action is not to reject AI. It is to refuse to hand over your judgment completely. The next time you sign an AI-generated report, make a judgment based on AI-summarized evidence, or click confirm among AI-ranked options, ask yourself one question: am I judging, or merely confirming? This question is small, but it may be the smallest unit of the defense of human subjecthood.
If you are willing to seriously review this diagnosis, the smallest first step is not to repost this essay, subscribe to my Substack, or express support. The smallest first step is to download Paper 01 and attack it.
For ease of review, the download link is here: Paper 01 Preprint Download.
Please attack Paper 01 the way you would attack a high-stakes theory. Do not begin by asking whether you like the term “subjecthood death.” Do not begin by deciding whether the tone of this essay suits you. Begin by attacking its structure. Is the risk-bearing subject clearly defined? Are embodied humanity and civilizational conditions connected incorrectly? Do the four thresholds actually prevent overgeneralization? Is risk to the realization of subjecthood merely an old concept under a new name? Is civilizational hollowing of human subjecthood only another expression of permanent disempowerment, governance failure, or ordinary social risk?
The purpose of Paper 01 is not to make you believe me. It is to give you an attackable theoretical coordinate. Its core question is: when we say AI poses an existential risk to humanity, what exactly is the “humanity” being threatened? If the paper defines the risk-bearing subject incorrectly, if it wrongly upgrades damage to civilizational conditions into existential risk, if its four thresholds are not rigorous enough, or if it does not genuinely add anything beyond existing AI risk frameworks, please point that out.
You can absolutely use AI to help verify this essay and Paper 01. In fact, I recommend that you do so. You can ask AI to attack it from different perspectives: AI safety researcher, EA grant evaluator, LessWrong hostile reviewer, AI governance analyst, technical philosopher, civilizational risk scholar, skeptical ML engineer, policy designer. You can ask AI to find logical gaps, generate counterexamples, compare existing literature, check whether the concepts are too broad, and assess whether the theory risks becoming unfalsifiable.
But one boundary must not be crossed: use AI to generate objections, but do not let AI decide whether those objections succeed. AI can help you see more attack angles. It can break the argument into structure, generate counterexamples, simulate reviewer comments, and place Paper 01 into different theoretical frameworks. But in the end, you must personally judge whether those objections hit the core. Do they really overturn the definition of the risk-bearing subject? Do they really break the four thresholds? Do they really prove that “subjecthood death” is only rhetoric? AI can assist verification, but judgment must remain with the human.
This framework should also not circulate only within its own language. Place it back into the AI risk frameworks you already trust and cross-validate it. What is its relationship to takeover? To loss of control? To permanent disempowerment? To governance failure? To alignment failure? If it is merely an old concept under a new name, discard it. If it is only a softened version of permanent disempowerment, a philosophical expression of governance failure, or a collage of anxieties about education, work, and culture, point that out too.
But if, after cross-validation, you find that it names a risk object existing frameworks have not fully captured, then you should not leave it in a comment section or bookmark folder.
Then update your Bayesian posterior.
Now we return to a very reasonable reaction: even if this diagnosis is right, why must we act now? Why not keep observing for a few more years? AI is still developing rapidly. Governance frameworks are still forming. Education systems are still adapting. Organizational workflows will also gradually learn. Perhaps subjecthood hollowing is only early-stage confusion. Perhaps humanity will naturally adjust. Perhaps new technologies will solve today’s problems. Perhaps waiting for more evidence is the more rational choice.
This reaction is not avoidance. It is the rationalist community’s normal defense mechanism.
But for a risk like subjecthood hollowing, waiting is not neutral. Waiting itself may already be default drift. The reason is that the window does not close when AI surpasses human intelligence. It closes when AI-mediated processes become institutional, organizational, and intergenerational cognitive infrastructure. Once that infrastructural embedding is complete, the choice before us is no longer “whether to introduce AI mediation,” but “how to dig judgment friction back out of an already fully embedded AI-mediated world.” The latter is not ordinary reform. It is civilizational reconstruction.
Here I can offer two time scales.
The first time window is the three to five years following 2026. This is the preventive design window. It is not a doomsday countdown, nor a prediction of a specific disaster date. It means that many AI usage norms, corporate workflows, educational evaluation methods, AI assurance frameworks, governance defaults, product interaction patterns, and organizational responsibility structures are being rapidly written into reality during this period. If these defaults form under the wrong ontology, they will become harder and harder to change later.
The second time window is roughly ten years out from 2026, that is, around 2036. This is the possible risk window for irreversible lock-in. “Ten years” is not a mysterious number, nor an exact countdown. It refers to a social solidification period long enough for institutional defaults to be written into workflows, long enough for a generation of learners to form stable cognitive habits, and long enough for organizations, platforms, education, and governance to reorganize themselves around AI mediation. If no action is taken in the next three to five years, and AI-mediated processes continue moving along the efficiency gradient into education, research, law, governance, organizations, and everyday cognition, then after another cycle of intergenerational training, subjecthood hollowing may no longer be merely reversible degradation. It may begin sliding into the combined state of institutional infrastructure lock-in, intergenerational capacity rupture, and systemic inertia.
These two numbers are not precise predictions. They are engineering judgments. Their purpose is not to create certainty, but to reject a more dangerous illusion: the belief that as long as there is no explosion, no takeover, and no visible collapse, we still have unlimited time. Three to five years determine whether we can design a low-cost bifurcation. Around ten years determines whether we may face already embedded, infrastructure-level irreversible structures.
Human subjecthood hollowing is not a meteor strike. It is not AGI awakening on a particular day. It is not the world changing because someone presses a button. It is more like the gradual solidification of institutional defaults, organizational workflows, educational habits, and cognitive capacities. Early on, an AI function is just a function. Later, it becomes a workflow. Later still, it becomes an organizational default. After that, it enters performance metrics, procurement, auditing, education, regulation, and public expectation. At that point, it is no longer merely a technical choice. It is social infrastructure.
Early intervention means designing defaults. Late intervention means dismantling defaults. The real question is not when AI becomes superintelligent. The real question is when AI-mediated processes become water. When AI is like a tool, humans can still discuss how to use it. When AI is like water, the next generation simply grows up inside it.
At the same time, I believe that if we take no active action in the next three to five years, the closing of the window will not be a single-point event. It will eventually evolve into a triple lock-in.
The first lock-in is institutional infrastructure lock-in. AI-related laws, standards, corporate compliance, education policies, audit frameworks, procurement requirements, and industry norms will continue to take shape. If these institutions are designed primarily around output safety, content compliance, model disclosure, red-team testing, and human approval, without distinguishing formal participation from substantive participation, and without asking where the process of judgment formation is taking place, then institutions themselves may write hollowing into safety defaults. The most dangerous outcome is not the absence of governance, but governance succeeding at the wrong layer: requiring “final human approval” without asking whether humans participated in problem definition and evidence screening; requiring “AI explanations” without asking whether those explanations correspond to the real formation process; requiring “complete compliance logs” without asking whether the logs record judgment or confirmation.
The second lock-in is intergenerational cognitive capacity lock-in. The first generation of deep AI-native users is forming its habits of learning, writing, research, and judgment. They are not first fully forming judgment capacity and then beginning to use AI. They are living inside an environment where AI pre-organizes problems, evidence, options, and reasons during the critical period in which judgment capacity is formed. The risk is not that they will forget some old skill, but that they may never fully form reference points for certain kinds of judgment friction. This is the intergenerational capacity rupture described in Chapter Three: degradation means once having and then losing; rupture means never fully forming, and therefore lacking a reference point.
The third lock-in is systemic inertia lock-in. Engineering, products, markets, organizations, platforms, and governance will continue optimizing toward reduced friction. This direction does not require conspiracy or centralized command. It is driven by competition, convenience, cost, speed, scalability, and user satisfaction. The later we intervene, the less the system will merely be “a world with many AI tools,” and the more it will be “a world whose workflows have all been reorganized around AI mediation.” At that point, reintroducing judgment friction will no longer be a matter of adjusting tool use. It will be an inverse reconstruction of the entire operating mode.
Once the triple lock-in has accumulated, protecting subjecthood is no longer preventive design; it becomes high-cost repair. Institutions say they have already governed. The new generation says this is simply how things are. The system says it cannot go back, because everyone is already competing at this level of efficiency. This is what the ten-year risk window means: the world will not necessarily end in a particular year, but the opportunity for low-cost bifurcation may gradually disappear.
“Wait and see” sounds cautious. It respects uncertainty and avoids the harm of premature action. For many risks, that is reasonable: more observation can reduce error, more data can improve the posterior, and more practice can help institutions mature. But subjecthood hollowing is not an ordinary risk. In ordinary risks, waiting usually brings more external evidence. In subjecthood hollowing, waiting may continue to contaminate the detection tools.
In this case, “wait and see” may not be neutral caution. It may mean continuing to use a contaminated dashboard. Before subjecthood has fully withdrawn, the system will most likely not present collapse, but smoothness. Student assignments become more complete. Research reviews become faster. Management reports become more professional. Governance materials become clearer. Legal texts become neater. Platform experiences become smoother. Corporate decisions become more efficient. The evidence you wait for may not be a piercing alarm. It may be prettier operating metrics.
Now let us return to the language familiar to the rationalist community: risk hedging and expected value. Suppose you still think this diagnosis has only a low probability of being true. You may assign it 5%, or even less. You may think subjecthood hollowing is exaggerated, may be self-corrected by education systems, may be reversed by better AI tools that enhance human judgment, or may be merely an anxiety of an early transition period. All of this may be true. But under extreme asymmetry, low probability is not a reason for inaction. It is a reason to hedge.
If this diagnosis is wrong and we take minimal hedging measures, the cost is mainly local, adjustable, and reversible efficiency friction: preserving more judgment training without AI mediation in education, requiring more original evidence tracing in science, asking humans to reconstruct reasons inside organizations, and distinguishing material organization from policy judgment in governance. These will bring costs and even short-term competitive pressure, but in most cases their intensity can be calibrated, their scope narrowed, and their institutional design adjusted.
If this diagnosis is right and we do not hedge, the cost is not local efficiency loss. It is the withdrawal of the civilizational conditions under which humans remain subjects of the future. Education may train judgment formation less and less. Science may preserve real calibration less and less. Governance may increasingly rely on options pre-organized by AI. Law may increasingly compress responsibility into formal signatures. Organizations may increasingly mistake approval for judgment. The next generation may find it increasingly difficult to understand why judgment requires friction. These are not stakes of the same order. On one side is local, adjustable, reversible efficiency friction. On the other is global, difficult-to-reverse, intergenerational damage to subjecthood.
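For readers who prefer to see the asymmetry as arithmetic, here is a minimal sketch of the expected-loss comparison. Everything in it is an assumption introduced for illustration: the function names, the idea of compressing harm into a single number, and the placeholder values are not estimates from this essay. It only shows how the break-even probability shrinks as the harm grows relative to the cost of hedging.

```python
def expected_loss(p, harm, hedge_cost=0.0, risk_reduction=0.0):
    """Expected loss = cost paid for certain + probability-weighted residual harm.

    p              -- subjective probability that the diagnosis is right
    harm           -- loss if hollowing occurs unhedged (arbitrary units)
    hedge_cost     -- efficiency friction paid whether or not the diagnosis is right
    risk_reduction -- fraction of the harm the hedge is assumed to avert (0..1)
    """
    return hedge_cost + p * (1.0 - risk_reduction) * harm


def break_even_probability(harm, hedge_cost, risk_reduction):
    """Probability above which hedging has the lower expected loss.

    Derived from: hedge_cost + p * (1 - r) * harm < p * harm
                  <=>  p > hedge_cost / (r * harm)
    """
    if harm <= 0 or not 0.0 < risk_reduction <= 1.0:
        raise ValueError("need harm > 0 and 0 < risk_reduction <= 1")
    return hedge_cost / (risk_reduction * harm)


# Placeholder numbers only (hypothetical): harm is 1000x the hedging cost,
# and the hedge is assumed to avert half of it.
p = 0.05
unhedged = expected_loss(p, harm=1000.0)
hedged = expected_loss(p, harm=1000.0, hedge_cost=1.0, risk_reduction=0.5)
threshold = break_even_probability(harm=1000.0, hedge_cost=1.0, risk_reduction=0.5)
print(f"unhedged={unhedged:.1f} hedged={hedged:.1f} break-even p={threshold:.4f}")
```

A single harm number cannot represent irreversibility or intergenerational transmission, so this is a lower bound on the argument, not a replacement for it. Substitute your own values and check where your break-even point falls.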
To subject this diagnosis to a minimal degree of public forecasting discipline, I offer a deliberately rough subjective probability anchor. It is not a statistical inference, not a model output, and not an authoritative prediction. It is only my personal judgment based on currently observable reality:
P(hollowing | no-doom by 2036): 40%.
That is, assuming no physical extinction, no overt takeover, and no permanent external disempowerment, I estimate that the probability of AI-mediated hollowing of human subjecthood sliding toward an irreversible threshold by 2036 is roughly 40%.
The key judgment here is simple: subjecthood hollowing does not require AGI. Traditional AI doom usually requires a stronger leap in capability: superintelligence, autonomous action, takeover capacity, or other high-consequence loss-of-control pathways. The conditions required for subjecthood hollowing are much lower: current or near-current AI capabilities, plus institutional defaults, plus the efficiency gradient, are enough to drive it forward.
You do not have to accept my number. You can do your own rough calculation. Has AI already entered civilizational foundational conditions such as education, science, law, governance, and organizations? Is this entry spreading across systems? Could this spread cause difficult-to-reverse damage to judgment capacity and institutional structure? Could this damage transmit across generations? Assign each question a probability, treating each as conditional on a yes to the questions before it, multiply them, then adjust upward or downward according to positive feedback and civilizational resilience. You will get your own number.
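The rough calculation described above can be sketched in a few lines. This is a minimal illustration, assuming each answer is a conditional probability given a yes to the previous questions; the function name `rough_estimate`, the `adjustment` factor, and every number below are hypothetical placeholders, not my estimates.

```python
from math import prod


def rough_estimate(conditionals, adjustment=1.0):
    """Multiply a chain of conditional probabilities, then apply a
    subjective adjustment factor and cap the result at 1.0.

    conditionals -- one probability per question, each conditional on
                    a yes to the questions before it
    adjustment   -- above 1.0 if you think positive feedback dominates,
                    below 1.0 if you think civilizational resilience dominates
    """
    ps = list(conditionals)
    if any(not 0.0 <= p <= 1.0 for p in ps):
        raise ValueError("each probability must lie in [0, 1]")
    if adjustment < 0:
        raise ValueError("adjustment must be non-negative")
    return min(1.0, prod(ps) * adjustment)


# Placeholder answers (hypothetical); replace them with your own.
answers = {
    "AI has entered foundational conditions": 0.9,
    "the entry is spreading across systems": 0.8,
    "the spread causes difficult-to-reverse damage": 0.5,
    "the damage transmits across generations": 0.5,
}
estimate = rough_estimate(answers.values())
print(f"rough estimate: {estimate:.2f}")
```

Because the result is a product, one low answer pulls the whole estimate down sharply, which makes it easy to see which question your disagreement actually rests on.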
But more importantly, this is not an actuarial problem. It is an asymmetric decision. The potential cost of inaction is the irreversible loss of the civilizational conditions under which humans remain judging subjects. The main cost of action is local efficiency friction: a little slower, one more step, one more layer of questioning. If the former occurs, it is difficult to recover. If the latter is excessive, it can be lowered, narrowed, or withdrawn. In this structure, you do not need to know the exact probability. You only need to confirm two things: first, that it has nonzero probability; second, that if it occurs, it may be difficult to reverse. If those two points hold, minimal action already has a rational basis.
There is one subtle issue that must still be addressed: if this diagnosis, once proposed, triggers discussion, research, education reform, AI system design, audit mechanisms, governance questions, and changes in personal AI use, and as a result the worst trajectory does not occur, does that count as diagnostic failure?
No. This article is not a weather forecast. It is an interventionist diagnosis. If a weather forecast says “it will rain tomorrow,” and tomorrow it does not rain, the forecast was wrong, because the forecast usually does not change the weather. A medical diagnosis is different. A doctor tells a person that if they continue their current lifestyle, their risk of heart attack will rise significantly. The person listens, changes their lifestyle, and never has a heart attack. That does not simply prove the doctor was wrong. The doctor diagnosed the risk trajectory under non-intervention, and the diagnosis itself changed the patient’s behavior.
Civilizational diagnosis is closer to the latter. This essay is not saying: “No matter what humans do, subjecthood death will happen.” It is saying: in the absence of conceptual, engineering, educational, institutional, and personal judgment interventions, if AI smoothing inertia continues sliding along the gradient of local optima, it may cause harm to human subjecthood at the level of civilizational conditions. The purpose of the diagnosis itself is to become an intervention variable. It names a previously invisible process, cuts open the pseudo-falsification structures inside common objections, exposes the system dynamics of drift, and places action entry points on the table. It is not standing outside the world waiting for facts to prove it right. It is trying to enter the causal chain of reality.
Therefore, if the worst trajectory does not occur in the future, we cannot immediately say “the theory was wrong,” nor can we immediately say “the theory succeeded.” We must ask why it did not occur. If it did not occur because civilization itself, without diagnoses of this kind, already showed strong spontaneous resilience and preserved judgment, evidence, responsibility, education, and meaning, then this diagnosis should be revised, or even falsified. If it did not occur because diagnoses of this kind helped humanity build concepts, institutions, engineering structures, and cognitive antibodies in advance, thereby changing the bad trajectory, then that is not failure in the ordinary sense; it means the diagnosis entered the causal chain of reality. If there is no surface-level disaster, but human substantive judgment, evidence verification, responsibility tracing, judgment training, and meaning creation continue to decline systematically while society runs more smoothly, then that is not falsification. It is pseudo-falsification.
The best outcome for this diagnosis is not to become a prophecy. It is to help trigger enough civilizational antibodies that the worst trajectory no longer occurs. I do not want the future to prove that “subjecthood death” was the right term. I want the future to prove that humanity heard this risk early enough, and built enough capacity for bifurcation before it locked in.
Finally, I want to return to the intention behind writing this essay. If this essay survives your verification, then the role of the rationalist community is not merely to estimate the probability of doom. You can also become generators and actors of civilizational antibodies — not through zeal, faith, or slogans, but through the very things you already value: verification, refutation, probability updates, conceptual precision, institutional design, evidence standards, educational experimentation, AI assurance, workflow auditing, and public reason.
Concepts can become antibodies because they make hidden processes visible. Evidence can become an antibody because it prevents smooth narratives from replacing real verification. Education can become an antibody because it protects the process through which judgment is formed. Governance can become an antibody because it distinguishes formal participation from substantive participation. Auditing can become an antibody because it traces responsibility rather than merely recording outcomes. A personal Cognitive OS can become an antibody because it keeps people from handing over their judgment too easily.
This is why I have not written this essay as a narrative of despair. Subjecthood death is not destiny. Civilizational hollowing of human subjecthood is an ongoing risk process, but precisely because it is a process, it can also be interrupted, slowed, reconstructed, and redirected by new variables.
I submit this civilizational diagnosis not to end the discussion, but to make the discussion truly begin. If this argument is wrong, break it. If it is right, update your Bayesian posterior and begin embedding civilizational friction into your judgments, systems, and institutions.
This diagnosis has been handed over. The variable has been injected.
What remains is not my judgment.
What remains is yours.