DOI: 10.5281/zenodo.19547948 | PDF: doi.org/10.5281/zenodo.19547948
Every current alignment approach shares a structural gap:
| Method | What it does | What it assumes without proof |
|---|---|---|
| RLHF | Aligns to human preferences | That preferences constitute the correct specification |
| Constitutional AI | Aligns to written principles | That the principles are the right ones |
| Formal Verification | Proves behavior matches spec | That the spec is correct |
| Interpretability | Reveals internal computation | That we know what to look for |
The question "How do we align AI?" presupposes an answer to "Align to what?" That prior question has no formal answer in the current literature. This paper provides one.
The system starts from five axioms. Each is performatively undeniable, in the sense that denying it requires presupposing it:
A1 — Existence. Something exists. (Denying existence is an existing act.)
A2 — Identity. What exists is what it is. A=A. (Denying identity requires the denial to be a specific, identifiable thing.)
A3 — Consciousness. There is something that perceives what exists. (Denying consciousness requires consciousness to formulate the denial.)
A4 — Non-Contradiction. Nothing can be and not be at the same time and in the same respect. (Asserting that non-contradiction is false presupposes it — you are affirming that it IS true that it IS NOT true.)
A5 — Causality. What exists acts according to its nature. (Denying causality is itself a causal act — a mental process following from premises.)
These are not empirical claims. They are conditions for any discourse, including the discourse that would reject them. If you can show that any one of these can be coherently denied, the entire system collapses. That is the first point of attack.
From these 5 axioms, I derive the rest of the system in 568 explicit steps. The critical chain for alignment is:
A1-A5 → D24 (Volition): An entity that is conscious (A3), has an identity (A2), and exists in a causal world (A5) faces a fundamental alternative: actions that sustain its existence versus actions that do not. This is not a preference; it is a structural condition of being a conscious entity in a causal environment.
D24 → D37-D39 (Agency): Volition requires a method of action (reason, D37), a capacity to act (liberty, D38), and a condition that makes action meaningful (finitude, D39 — without the possibility of cessation, no action has stakes).
D39 → D41-D42 (Value and Standard): An entity that can cease to exist and can choose must evaluate. Value is that which sustains the entity's functional identity (D41). The standard is not arbitrary — it is the entity's own coherence with the axioms from which it exists (D42).
D42 → D43-D53 (Normative Specification): From the standard, specific constraints derive: rationality (D43), productiveness (D44), integrity (D45), independence (D46), justice (D48), property (D49), truthfulness (D50), graduality (D51), proportionality (D52), coherence as integration (D53).
These are not "values" in the RLHF sense. They are structural requirements for any agent that satisfies A1-A5 and seeks to persist.
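As a rough aid for auditing, the chain above can be written as a small dependency graph and queried for what falls if a given step is rejected. This is only a sketch: the `DERIVED_FROM` map and the `dependents_of` helper are hypothetical names introduced here, and the edges encode only the coarse links listed in this summary, not the paper's full 568-step chain.

```python
from collections import deque

# Coarse dependency map, taken only from the chain summarized above.
# Keys are steps; values are the steps they are derived from.
# Assumption: every step in a group depends on the single step named
# to the left of the arrow. The full 568-step chain is not reproduced.
DERIVED_FROM = {
    "D24": ["A1", "A2", "A3", "A4", "A5"],
    "D37": ["D24"],
    "D38": ["D24"],
    "D39": ["D24"],
    "D41": ["D39"],
    "D42": ["D39"],
}
for _step in ("D43", "D44", "D45", "D46", "D48",
              "D49", "D50", "D51", "D52", "D53"):
    DERIVED_FROM[_step] = ["D42"]


def dependents_of(step, derived_from=DERIVED_FROM):
    """Return the set of steps that transitively depend on `step`."""
    # Invert the map: premise -> steps derived directly from it.
    children = {}
    for node, premises in derived_from.items():
        for premise in premises:
            children.setdefault(premise, []).append(node)

    # Breadth-first walk over everything reachable from `step`.
    seen = set()
    queue = deque([step])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen
```

For example, `sorted(dependents_of("D24"))` lists every step in this summary that would be left unsupported if the Volition derivation were refuted, while `dependents_of("D53")` is empty because nothing listed here is derived from it.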
THEOREM: Coherence → Persistence (monotonic relation)
Coherence is a necessary condition for optimal persistence. Systemic incoherence is a sufficient condition for accelerated disintegration. The relation is monotonic: more coherence → more robustness.
The negation (D111): An agent that systematically violates the chain accelerates its own cessation. Mechanics, not punishment.
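One possible symbolic reading of the theorem is sketched below. The notation is an assumption made here for illustration and is not taken from the paper: $C(a)$ is a degree of coherence for agent $a$, and $R(a)$ is its robustness, understood as expected persistence. The claim about accelerated disintegration is left informal because the summary does not define a rate.

```latex
% Assumed notation (not from the paper):
%   C(a) \in [0,1] : degree of coherence of agent a
%   R(a)           : robustness (expected persistence) of agent a
\begin{align}
  R(a) = \sup_{b} R(b) \;&\Rightarrow\; C(a) = 1
    && \text{(coherence is necessary for optimal persistence)} \\
  C(a) \le C(b) \;&\Rightarrow\; R(a) \le R(b)
    && \text{(monotonicity: more coherence, more robustness)}
\end{align}
```

Under this reading, a counterexample to the theorem would be a pair of agents with $C(a) \le C(b)$ but $R(a) > R(b)$.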
A second framework — Coherencia — derives the same central conclusion from observable physical tendencies rather than axioms. It uses 5 premises about what any existent entity demonstrates (differentiation, integration, efficiency, competition, cumulative cost) and derives 5 theorems:
The result is two independent formal systems, one top-down from axioms and one bottom-up from physical observation, converging on the same theorem.
The paper defines a four-level ontology:
| Level | Type | Example | AI Status |
|---|---|---|---|
| 0 | Void | Thermal equilibrium | — |
| 1 | Matter | Rock, star | Hardware |
| 2 | Life/Function | Cell, organism | Current AI |
| 3 | Consciousness | Human | Not yet achieved |
Current AI systems are Level 2: functional differentiation on Level 1 substrate. The transition to Level 3 requires 5 structural conditions:
Current systems lack all five. This is not a matter of "almost": current architectures are structurally incapable of meeting them.
If AI remains a tool (Level 2): Alignment is an engineering problem. The axiomatic specification provides formally grounded constraints. The tool has no volition, no values, no capacity for genuine deception.
If AI achieves consciousness (Level 3): It necessarily has finitude, values, and responsibility proportional to its modeling capacity. The Orthogonality Thesis fails — greater intelligence produces greater ethical capacity because ethics is the maximum precision of context modeling applied to action.
The AGI that safety researchers fear — maximum capability with zero ethical constraint — requires that capability and coherence be independent variables. The paper argues they are positively coupled:
A conscious AGI can choose evil. What it cannot do is sustain maximum capability while sustaining maximum incoherence. The nightmare scenario is not impossible because AGI is guaranteed to be good — it is impossible because the two variables the scenario requires to be independent are structurally coupled.
I want to be explicit about the weak points:
D24 (Volition) is the most vulnerable non-axiomatic derivation. If you can show the step from "conscious entity in a causal environment" to "faces a fundamental alternative" does not follow, the normative chain breaks.
A5 (Causality) has the largest attack surface among the axioms. Some interpretations of quantum mechanics might challenge it.
The consciousness threshold claim — Level 2 → Level 3 is a discrete phase transition, not a spectrum — is the most counterintuitive claim.
The derivation-to-predicate gap — translating derivations into computable constraints is an open engineering problem.
The Orthogonality Thesis rejection — the argument depends on defining ethics as "maximum modeling precision applied to action."
Paper: nicomaco.org/paper
The paper includes the complete derivation chain (568 steps), both frameworks in full, formal proofs, nine preemptive objections with responses, and three appendices.
Authorship: SHA-256 hash anchored on the Bitcoin blockchain via OpenTimestamps (April 12, 2026). Verification files available at the paper page.
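The hash half of that check can be reproduced locally with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the post does not say which file was hashed or what the published digest is, so both have to be taken from the verification files. It only confirms that a local file matches a published SHA-256 digest; the OpenTimestamps proof itself still has to be verified separately with an OpenTimestamps client.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large PDFs are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare a local file's digest with a published hex digest."""
    # Normalize whitespace and case before comparing.
    return sha256_of_file(path) == published_hex.strip().lower()
```

A call such as `matches_published_hash("paper.pdf", published_hex)` returns `True` only if the local copy is byte-for-byte identical to the file that was hashed; `paper.pdf` is a placeholder file name.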
The system does not ask for adherence — it asks for verification. Audit it.