
“Human fallback” is one of the more frequently invoked and under‑specified phrases in AI safety discussions. Policy and governance documents often promise effective human oversight, kill switches, and humans in the loop, but they rarely address a blunt, operational question in any given sector: could this institution run for 72 hours without AI, under stress, this week?¹

Two concepts matter for that question. First, whether humans can still hand‑fly the system: run core operations safely and competently with only non‑AI tooling, the way pilots are expected to hand‑fly an aircraft when automation drops out.² Second, whether the system is at risk of gradual disempowerment: a slow erosion of human authority and capability as AI systems become the default operator for routine decisions, even while formal manuals still say “humans are in control.”³ A system prone to disempowerment is exactly the kind that humans might be unable to run on their own for longer than 72 hours.

In high‑hazard sectors like chemicals manufacturing and healthcare, we can already see hints of a gap between real and fake fallback. Real fallback has explicit triggers for when AI must hand control back to humans; operators know exactly who can pull the plug; there is a drilled procedure for switching to manual or degraded modes; and leadership has actually watched teams keep the system within safety and quality limits without AI, at least in some exercises and incident‑response work.⁴

Fake fallback, by contrast, tends to live in slide decks and risk registers. It assumes that “we’ll just switch to manual” when AI fails, but this assumption is rarely exercised or audited.¹ As optimization and decision‑support tools become routine, they often become the default for day‑to‑day choices; over time, organizations can drift toward dependence: procedures are rewritten around AI dashboards, staff are hired into AI‑first workflows, and manual operation is practiced less often.⁵ I’m going to argue that this drift could become a tangible AI‑safety problem because it quietly converts real fallback into fake fallback: the authority to take over stays on paper while the capability to exercise it erodes.

Existing research on automation and AI assistance shows this is a plausible concern. Long‑term reliance on automated decision support has been linked to skill decay and reduced independent judgment; in some domains, when the tool is removed or fails, human performance falls significantly and can drop below non‑automated baselines.³ In highly automated domains like aviation, this shows up as “out‑of‑the‑loop” performance: crews are nominally in charge but struggle to hand‑fly when automation suddenly disappears.²

For now, AI in industrial settings is still mostly layered into existing automation and safety regimes rather than replacing them end‑to‑end.⁵ In at least a few real chemical plants, though, reinforcement‑learning AI has been deployed to autonomously control specific units via the main control system for extended periods (for example, Yokogawa and JSR’s 35‑day field test and later ENEOS Materials deployments), with SIS/ESD and other safety layers still handling hard trips and shutdowns.⁶ In healthcare, there are documented deployments where AI‑enabled systems help triage security incidents and operational risks and route alerts or tasks to clinical and governance teams, usually with humans retaining final review and authority.⁷

This doesn’t look like a “catastrophe now” scenario, but if we keep embedding AI this way, and we don’t design and test human fallback deliberately, gradual disempowerment becomes a live failure mode for real plants and hospitals, not just for thought experiments.

The pattern shows up across sectors. Many hospitals are already using AI tools to triage alarms and flag abnormal results. If those tools go offline and clinicians haven't kept up their manual skills, things slow down at exactly the moment they can least afford it, and that can translate into delayed responses and patient harm. Power and water operators are similarly rolling out AI for forecasting, anomaly detection, and cyber-defense, and the more embedded it gets, the more critical-infrastructure analysts treat losing those functions as a credible path to longer, harder-to-manage outages during a heat wave or a cyber incident.

In both cases the core infrastructure is still there. What's less clear is whether the humans still are. If we don’t prove that humans can still run these systems when the AI layer is temporarily degraded or untrusted, we’re betting that this drift never matters in the real world.

The rest of this post is an attempt to spell out what a concrete fallback regime could look like in one sector, chemicals manufacturing, and why I think we should care.

 

How chemical plants can gradually give AI the steering wheel

Chemical and petrochemical plants typically rely on layered protection: basic process control (DCS/PLCs), safety‑instrumented systems (SIS), emergency shutdown (ESD) valves, and detailed emergency and repair procedures.⁴ These safety layers are designed to act independently of optimization software: they trip on physical conditions (pressure, temperature, gas detection, fire) and bring equipment to a predefined safe state even if the main control system misbehaves.⁴

In the deployments that are written up so far, AI is almost always layered into this stack as a supervisory or advisory component. AI systems are described as optimizing setpoints, predicting failures, triaging alarms, or generating digital work instructions on top of existing control and safety systems, rather than replacing SIS/ESD across the plant.⁵ Case studies from vendors like Imubit, AspenTech, Yokogawa, and others highlight improvements in uptime, tighter operation, and better visibility into safety‑relevant conditions, especially when AI aggregates signals across units into operator‑friendly dashboards.⁵

So today, in documented deployments, AI is helping plants run tighter and notice problems earlier. The disempowerment risk is more about the path we’re on.

As these deployments mature, the AI layer may shift from “interesting advisor” to “default operator” for normal conditions in those sites. Setpoints are routinely chosen with AI support; alarm floods are filtered through AI‑based triage; ramp rates and constraints are learned from AI recommendations. Operators get used to the AI’s picture of the plant and may spend less time actively practicing manual control.

Safety culture still emphasizes SIS/ESD as independent, last‑line protections, but public materials rarely track how dependent day‑to‑day operation has become on those upstream AI layers.⁴ By analogy with other highly automated domains, it is credible that over time it becomes socially and operationally easier to treat AI as the “truth layer,” particularly when production and safety KPIs are tied to AI‑optimized performance.²,³ That is the sort of gradual disempowerment I’d worry about: everyone still has override buttons, but nobody is using or drilling them.

Human‑factors work in aviation, process control, and clinical decision support shows that when humans spend long periods supervising highly reliable automation, they may lose manual skills and become more prone to rubber‑stamping automated outputs.²,³ We do not yet have sector‑wide studies of “AI + petrochemicals” that quantify this. But the pattern elsewhere gives a plausible failure mode: a plant could retain strong safety layers and formal manual‑override procedures while losing the practical ability to run safely without AI support for 24–72 hours. When a serious AI failure or cyber incident occurs, SIS/ESD still function, but the organization’s human side struggles because it has rarely practiced operating in that mode.¹ The risky part then isn’t how much AI we add but that, decision by decision, people slowly forget how to run things without it.

 

What real fallback would actually require

Instead of treating human fallback as “having a human in the loop,” it should be treated as an operational requirement: something the organization must actually be able to do. For a petrochemical unit, that suggests at least five elements. These are partly descriptive (close to existing process‑safety practice) and partly prescriptive (what I think we should add).

 

1. Triggers for “AI off”

Fallback should be tied to concrete triggers. For example, plants can define:

| Trigger type | Description |
| --- | --- |
| AI health | Loss or corruption of inputs (sensor failures, historian outages, clearly abnormal data patterns), out-of-distribution behavior, or unstable control actions such as oscillations after AI-driven changes.⁸ |
| Safety | Approaching SIS/ESD trip criteria (pressure, temperature, level, gas detection, fire) or high-integrity alarms such as confirmed leaks or major utility loss.⁴ |
| Human judgment | "Something's off" cases where operators see conflicts between AI recommendations and local cues, or where cyber teams are worried enough about compromise that they treat the AI as untrusted until they understand what's going on.¹ |

 

The rule would be: if any of these conditions are met, the AI is immediately moved to advisory‑only or switched off. That sounds obvious, but many current documents don’t actually spell out pre‑defined triggers for downgrading or disabling AI based on observable conditions.
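To make that rule concrete, here is a minimal sketch of how the trigger table could be encoded so that downgrading the AI is a mechanical consequence of observable conditions rather than a mid‑incident judgment call. The signal names and the severity ordering (safety and human‑judgment triggers force “off,” AI‑health triggers force advisory‑only) are my illustrative assumptions, not an industry standard.

```python
from dataclasses import dataclass
from enum import Enum

class AIMode(Enum):
    CLOSED_LOOP = "closed_loop"  # AI writes setpoints directly
    ADVISORY = "advisory"        # AI suggests, humans decide
    OFF = "off"                  # AI outputs ignored entirely

@dataclass
class PlantSignals:
    # Hypothetical observable conditions; real tag names vary by site.
    sensor_fault: bool          # loss or corruption of inputs
    out_of_distribution: bool   # data outside the model's training envelope
    control_oscillation: bool   # unstable actions after AI-driven changes
    near_sis_trip: bool         # approaching SIS/ESD trip criteria
    operator_declared: bool     # "something's off" call from the console
    cyber_untrusted: bool       # cyber team treats the AI stack as compromised

def required_mode(s: PlantSignals) -> AIMode:
    """Map trigger conditions to the maximum allowed AI authority.

    The rule from the text: if ANY trigger fires, the AI is immediately
    downgraded to advisory-only or switched off.
    """
    if s.near_sis_trip or s.operator_declared or s.cyber_untrusted:
        return AIMode.OFF
    if s.sensor_fault or s.out_of_distribution or s.control_oscillation:
        return AIMode.ADVISORY
    return AIMode.CLOSED_LOOP

# An oscillation alone downgrades to advisory; a cyber flag switches AI off.
print(required_mode(PlantSignals(False, False, True, False, False, False)))
print(required_mode(PlantSignals(False, False, True, False, False, True)))
```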

 

2. Clear authority during fallback (Explicit Chain of Command)

Fallback only helps if someone can actually invoke it. In many plants, that probably means sharpening roles that already exist rather than inventing new titles:

Console operators need explicit permission in procedures to declare an AI emergency and execute fallback when documented triggers are met, without waiting on a distant “AI team.”⁴

Shift supervisors, who already own the unit’s production and safety targets, can be named as the people who decide how long to keep AI off (24–72 hours, say) and when to escalate to ESD if conditions are not improving.⁴

Operational Technology security and cyber staff handle containment and investigation of whatever went wrong in the AI stack, and they coordinate with operations, but they don’t get a veto over urgent safety calls.¹

This roughly matches how well‑designed kill switches and circuit breakers are handled in cyber incident‑response plans: a specific role owns the shutdown decision, and they practice using it.¹
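As a sketch of how that chain of command could be written down unambiguously, here is an illustrative authority matrix. The role names mirror the list above; the exact decision set is my assumption.

```python
# Illustrative authority matrix for fallback decisions (not a standard).
FALLBACK_AUTHORITY = {
    "declare_ai_emergency": {"console_operator", "shift_supervisor"},
    "execute_fallback":     {"console_operator", "shift_supervisor"},
    "set_ai_off_duration":  {"shift_supervisor"},  # e.g. 24-72 hours
    "escalate_to_esd":      {"shift_supervisor"},
    "contain_ai_stack":     {"ot_security"},       # containment/investigation
}

def may(role: str, decision: str) -> bool:
    """True if this role is allowed to take this fallback decision."""
    return role in FALLBACK_AUTHORITY.get(decision, set())

# OT/cyber staff contain the AI stack but cannot veto urgent safety calls:
assert may("ot_security", "contain_ai_stack")
assert not may("ot_security", "escalate_to_esd")
assert may("console_operator", "declare_ai_emergency")
```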

 

3. A realistic “time to safe manual control”

On the machine side, SIS/ESD can bring a unit to a safe or stable state within seconds to tens of seconds once trips occur, without relying on AI.⁴ The human side is more fragile.

If AI has been heavily used for optimization and alarm triage, it is reasonable to expect, as a scenario to test, that the time needed to hand‑fly safely could grow, because staff have to reconstruct safe envelopes, normal setpoints, and cross‑unit impacts that AI previously summarized.⁵,⁸ Plants may discover in exercises that where they once assumed “a few minutes,” stabilizing manually could take substantially longer.

A cautious way to operationalize this is to define a time‑to‑safe‑manual‑control target (for example, “within 30 minutes the unit must be demonstrably stable within prescribed limits without AI”) and then test it in simulations or controlled drills.¹ A key concern is that performance could fall below the pre‑AI baseline if manual skills are not maintained, and that this is the sort of degradation that is only discovered once you start measuring it.³
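A drill log makes that target checkable. Below is a minimal sketch of how a time‑to‑safe‑manual‑control measurement could be scored against a 30‑minute target; the stability test is simplified (a real drill would also require readings to stay in limits for a holding period), and the pressure and temperature limits are hypothetical.

```python
from datetime import datetime, timedelta

def time_to_safe_manual_control(events, limits):
    """How long the unit took to become demonstrably stable without AI.

    `events` is an ordered list of (timestamp, readings) pairs logged from
    the moment AI authority was removed. The unit counts as stable at the
    first reading where every variable sits inside its prescribed limits.
    """
    t0 = events[0][0]
    for t, readings in events:
        if all(lo <= readings[k] <= hi for k, (lo, hi) in limits.items()):
            return t - t0
    return None  # never stabilized in the logged window: drill failed

# Hypothetical drill log: column pressure (barg) and temperature (degC).
limits = {"pressure": (10.0, 14.0), "temperature": (120.0, 140.0)}
start = datetime(2025, 1, 1, 8, 0)
log = [
    (start,                         {"pressure": 15.2, "temperature": 146.0}),
    (start + timedelta(minutes=20), {"pressure": 14.5, "temperature": 139.0}),
    (start + timedelta(minutes=42), {"pressure": 13.1, "temperature": 133.0}),
]
elapsed = time_to_safe_manual_control(log, limits)
target = timedelta(minutes=30)
print(elapsed, "->", "pass" if elapsed and elapsed <= target else "fail")
```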

 

4. An accounting of which tools survive without AI

Fallback is easier to reason about if everyone is clear on what disappears when AI does. In most documented petrochemical setups, a lot still remains available without AI: operators keep their DCS/SCADA HMIs, SIS/ESD indications, hardwired alarms, emergency stops, P&IDs, cause‑and‑effect charts, shutdown and emergency procedures, cyber‑recovery plans, maintenance logs, and non‑AI analytics.⁴

What they lose or see degraded are the AI‑only layers: AI‑generated setpoint suggestions, constraint visualizations, optimization dashboards, AI‑based alarm triage, predictive‑maintenance forecasts, AI‑generated work instructions, and risk scores.⁵,⁷

If day‑to‑day work assumes those AI‑only tools, fallback is going to feel like flying blind even though core instrumentation and documentation remain. My view is that we should be explicitly stress‑testing that feeling before the first serious AI incident, not after.
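One way to stress‑test that before an incident is to keep the accounting machine‑readable, so a drill can state exactly which screens and feeds go dark. A minimal sketch, with generic placeholder tool names, follows.

```python
# Illustrative "what disappears when AI does" inventory.
# True = AI-dependent (degraded or gone in fallback); False = survives.
TOOLING = {
    "dcs_scada_hmi":            False,
    "sis_esd_indications":      False,
    "hardwired_alarms":         False,
    "pids_cause_effect_charts": False,
    "shutdown_procedures":      False,
    "setpoint_suggestions":     True,
    "optimization_dashboard":   True,
    "ai_alarm_triage":          True,
    "predictive_maintenance":   True,
    "ai_work_instructions":     True,
}

def fallback_inventory(tooling):
    """Split the toolset into what survives an AI outage and what doesn't."""
    keeps = sorted(t for t, ai_dep in tooling.items() if not ai_dep)
    loses = sorted(t for t, ai_dep in tooling.items() if ai_dep)
    return keeps, loses

keeps, loses = fallback_inventory(TOOLING)
print("Still available without AI:", keeps)
print("Degraded or gone:", loses)
```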

 

5. A disciplined way to bring AI back

Re‑introduction is an opportunity to counter gradual disempowerment rather than cement it:

Technically, by requiring root‑cause analysis, validation of data and models, and formal change control before restoring AI authority.⁸

Operationally, by insisting that AI returns in shadow/advisory mode first, while humans hand‑fly the unit and compare behavior for a burn‑in period, with leadership explicitly watching whether humans can still disagree and be right.⁹ This makes fallback something that is tested and revisited, not just asserted once in a policy doc.
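Here is a sketch of what the burn‑in comparison could look like, assuming a simple setpoint‑gap metric and a placeholder promotion rule (both are my assumptions rather than an established standard).

```python
def burn_in_report(records, tolerance):
    """Summarize a shadow-mode burn-in: AI recommends, humans hand-fly.

    `records` pairs each AI-recommended setpoint with the value the human
    operator actually chose. Large, persistent disagreement is a reason to
    keep the AI advisory-only.
    """
    gaps = [abs(ai - human) for ai, human in records]
    n_large = sum(1 for g in gaps if g > tolerance)
    return {
        "samples": len(records),
        "mean_gap": sum(gaps) / len(records),
        "large_disagreements": n_large,
        "restore_authority": n_large == 0,  # placeholder promotion rule
    }

# Hypothetical reflux-ratio setpoints over one burn-in shift: (AI, human).
shift_log = [(3.10, 3.05), (3.20, 3.18), (3.60, 3.15), (3.12, 3.10)]
print(burn_in_report(shift_log, tolerance=0.2))
```

The interesting output of such a report is not just whether the AI gets its authority back, but the cases where humans disagreed and were right, which is exactly the capability gradual disempowerment erodes.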

 

A tabletop: what breaks when you pull AI out?

To keep the line between “today” and “scenario” clear: the rest of this section is a hypothetical tabletop exercise. It is not empirical data from a specific plant; it’s the kind of exercise I think plants should run and document.

Unit: a high‑pressure distillation column with AI‑assisted optimization and AI‑triaged alarms.⁵

Background: the AI layer has been in place for about three years and, in internal reporting, is credited with higher throughput and fewer near‑misses.⁵

Trigger: the AI starts recommending more aggressive operation, leading to oscillations and conflicting alarms; cyber teams flag suspicious behavior in the AI orchestration environment.¹

Action: operations declare an AI emergency; the AI supervisor and AI‑based alarm triage are switched off; SIS/ESD remain armed.⁴

In this tabletop scenario, walking through the next 24 hours in detail, we might see the following pattern:

What breaks first: the integrated AI dashboard that summarized constraints and safety margins disappears; alarm lists become noisier without AI triage; newer staff are unsure of “normal” manual setpoints; cross‑unit coordination slows because AI used to keep upstream and downstream units in sync.⁵,⁷ One of the junior operators realizes they’ve literally never run this column with the optimizer turned off except in training slides.

Time to recover manual control: participants initially assume stabilization within 15–30 minutes, but in this tabletop exercise we quickly see that, given current skills and tooling, it could take considerably longer to rebuild situational awareness, re‑derive safe envelopes from procedures, and coordinate with adjacent units. This is the kind of surprise you would expect if gradual disempowerment has been slowly eating into manual competence.²,³

Staffing and skill gaps: junior operators fluent in AI dashboards but less practiced at manual loop tuning; engineers used to AI analytics for diagnosis; OT/cyber teams focused on containment rather than on supporting AI‑off operation; training histories with few full AI‑outage drills.¹,³

Without running exercises of this form, leaders don’t actually know how their plant would behave, and “proof‑of‑control” risks collapsing into “the standard says there’s a human in the loop,” rather than demonstrated capability.

There are at least a few nearby fields that already treat this as an operational problem instead of a slogan. Healthcare risk‑management and incident‑response providers publish AI‑specific guidance for clinical and cyber events, including how to operate when AI‑based detection or triage is degraded or untrusted.⁷ Cybersecurity and IR teams run AI‑themed tabletop exercises, design kill switches, and test “safe shutdown” patterns for AI‑augmented services.¹

Practitioners who focus on AI roadmaps and drills in emergency‑management settings, such as Ali Shah, Dhara Shah, and Eckhart Mehler, sit in different parts of this space, but their work all points toward the same thing: resilience depends heavily on trained human teams, clear authority, and rehearsed playbooks, not only on model quality.¹ Their main lesson, at least as I read it, is that fallback has to be drilled, not just declared.

 

The AI‑safety gap: proof‑of‑control and gradual disempowerment

AI‑safety discussions already talk a lot about human oversight, human‑in‑the‑loop control, and shutdown mechanisms.¹⁰ What seems underdeveloped is the move from those abstractions to concrete, drilled fallback plans.

From an institutional vantage point, three concepts could be especially important:

Proof‑of‑control. Treating “we can operate 72 hours without AI within agreed error bars” as a testable requirement, backed by metrics such as time to safe manual control, error rates without AI, and the fraction of staff who have participated in AI‑off drills.¹ (A sketch of such metrics follows this list.)

Gradual disempowerment. Explicitly tracking how human authority and competence change as AI becomes more integrated, for example, how often humans overrule AI, how frequently manual operation is practiced, and whether the capability to hand‑fly is shrinking over time.³

Real vs fake fallback. Distinguishing between legal or procedural authority (“someone is allowed to push the red button”) and operational reality (“they know when to push it, and what happens afterward has been rehearsed”). Real fallback is tested and drilled, not assumed and unmeasured.
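To show that the first two concepts are computable rather than rhetorical, here is a minimal scorecard sketch; the field names and the quarterly numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class OversightLog:
    """Quarterly observables a site could track (illustrative fields)."""
    drill_minutes_to_stable: float  # measured time to safe manual control
    target_minutes: float           # agreed proof-of-control target
    staff_total: int
    staff_in_ai_off_drills: int     # took part in at least one AI-off drill
    ai_recommendations: int
    human_overrides: int            # how often humans overruled the AI

def proof_of_control_scorecard(log: OversightLog) -> dict:
    """Turn raw oversight observables into the metrics named above."""
    return {
        "meets_manual_target": log.drill_minutes_to_stable <= log.target_minutes,
        "drill_coverage": log.staff_in_ai_off_drills / log.staff_total,
        "override_rate": log.human_overrides / max(log.ai_recommendations, 1),
    }

# A hypothetical quarter: drills exist but cover a minority of staff, and
# overrides are rare -- both plausible early signs of disempowerment drift.
q = OversightLog(48.0, 30.0, 40, 9, 1200, 11)
print(proof_of_control_scorecard(q))
```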

AI‑safety work does not need to abandon model‑focused agendas to address these; it can grow a complementary strand that treats human fallback as a first‑class research and governance target, especially in sectors where AI is being woven into safety‑critical and societally important systems.

If one of the goals of AI safety is to prevent advanced systems from incrementally taking institutions out of the loop, then proof‑of‑control under realistic failure conditions looks less like a nice‑to‑have and more like a necessary backbone that makes other safeguards meaningful when AI systems are degraded, compromised, or deliberately switched off.

 

 

References

  1. Petronella Tech (2025). “From Table Stakes to Tabletop: AI Incident Response & Kill‑Switch.” https://petronellatech.com/blog/from-table-stakes-to-tabletop-ai-incident-response-kill-switch/
  2. Yu, C. S., et al. (2024). “Does using artificial intelligence assistance accelerate skill decay? A theoretical perspective.” npj Digital Medicine, 7(1). https://pmc.ncbi.nlm.nih.gov/articles/PMC11239631/
  3. Jimenez, A. A., et al. (2025). “Illusion of competence and skill degradation in AI‑assisted decision making.” https://rsisinternational.org/journals/ijrsi/articles/illusion-of-competence-and-skill-degradation-in-artificial-intelligence-decision-support-systems/
  4. Sapientechs (2025). “Emergency Shutdown System (ESD) – A Comprehensive Guide.” https://www.scribd.com/document/737569844/Emergency-Shutdown-Systems-ESD
  5. Imubit (2025). “Step‑by‑Step to a Self‑Optimizing Petrochemical Plant with AI.” https://imubit.com/article/self-optimizing-petrochemical-plants/
  6. Yokogawa & JSR (2022–2023). “In a World First, Yokogawa and JSR Use AI to Autonomously Control a Chemical Plant for 35 Days” and follow‑on ENEOS Materials case. Press release: https://www.yokogawa.com/news/press-releases/2022/2022-03-22/ Trade coverage: https://www.automation.com/article/yokogawa-jsr-ai-autonomously-control-chemical
  7. Censinet (2024). “AI‑Powered Incident Response for Healthcare.” https://censinet.com/perspectives/ai-powered-incident-response-healthcare
  8. PHM Society / PHMAP (2025). Example: “Explainable and Trustworthy AI for Fault Classification in the Oil and Gas Industry.” PHMAP Proceedings. http://papers.phmsociety.org/index.php/phmap/article/download/4633/phmap_25_4633
  9. EON Tech (2025). “Human in the loop AI systems in 2025.” https://eonsr.com/en/human-in-the-loop-ai-2025/
  10. NIST (2023). “AI Risk Management Framework (AI RMF 1.0).” https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf
