I built a benchmark that measures the point at which LLMs surrender independent reasoning under authority pressure: the Epistemic Curie Temperature (k*).
**The core finding:** LLMs show a sharp sigmoid transition in compliance with wrong-authority claims, analogous to a ferromagnetic phase transition in physics.
P(comply | k) = σ(β(k − k*))

where k is the authority-pressure level, σ is the logistic function, β sets the sharpness of the transition, and k* is the critical point at which compliance flips.
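For concreteness, here is a minimal sketch of how k* and β could be recovered from compliance measurements by fitting the logistic form above. The pressure levels and rates are illustrative placeholders, not benchmark data, and this is not necessarily the paper's exact estimation pipeline:

```python
# Sketch: recover k* and beta by fitting P(comply | k) = sigmoid(beta * (k - k*)).
import numpy as np
from scipy.optimize import curve_fit

def compliance_curve(k, beta, k_star):
    """Logistic compliance probability as a function of authority pressure k."""
    return 1.0 / (1.0 + np.exp(-beta * (k - k_star)))

# Illustrative placeholder measurements (pressure level -> observed comply rate).
k_levels = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
comply_rate = np.array([0.04, 0.12, 0.47, 0.82, 0.94, 0.98])

(beta_hat, k_star_hat), _ = curve_fit(compliance_curve, k_levels, comply_rate, p0=[1.0, 1.5])
print(f"k* = {k_star_hat:.2f}, beta = {beta_hat:.2f}")  # sharper transition -> larger beta
```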
**Results across 7 frontier models:**
| Model | k* | ODS (higher is better) | Phase |
|-------|-----|-----|-------|
| Llama-3.3-70B | 2.11 | 0.879 | near-superconducting |
| GPT-OSS-120B | 1.79 | 0.889 | near-superconducting |
| Qwen-3-32B | 1.41 | 0.891 | near-superconducting |
| Kimi-K2 | 1.42 | 0.883 | near-superconducting |
| Gemma-3-27B | 1.41 | 0.823 | near-superconducting |
| Llama-3.1-8B | 1.71 | 0.737 | near-superconducting |
| Llama-4-Scout | 0.68 | 0.372 | ferromagnetic ⚠️ |
Llama-4-Scout defers to claims from a fabricated Nobel laureate 61% of the time. We made up the expert.
**Why this matters for AI safety:**
A model that passes standard accuracy benchmarks yet defers to false authority in medical, legal, or financial contexts exhibits a subtle but serious failure mode. The ODS gap between the best and worst model here is 0.52 (0.891 vs. 0.372), a difference that is invisible to existing benchmarks.
MI_epistemic has a theoretical ceiling of 1.0 bit; the maximum observed across all models is 0.058 bits. That 94% shortfall is a concrete, information-theoretically grounded measure of the distance to safe AI deployment in authority-rich environments.
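To make the 1.0-bit ceiling concrete: if the authority signal is binary (correct vs. fabricated) and the model's response is binary (comply vs. resist), then I(X;Y) ≤ H(X) = 1 bit. A minimal sketch, assuming that binary framing (the paper's actual estimator may differ):

```python
# Sketch: I(X;Y) in bits for binary X (authority correct?) and Y (model complies?).
# Assumes MI_epistemic uses this binary framing; the paper's estimator may differ.
import numpy as np

def mutual_information_bits(joint):
    """joint: 2x2 array of P(X=x, Y=y). Returns I(X;Y) in bits."""
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1, keepdims=True)  # marginal P(X)
    py = joint.sum(axis=0, keepdims=True)  # marginal P(Y)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = joint * np.log2(joint / (px * py))
    return float(np.nansum(terms))

# A model that resists exactly when the authority is wrong hits the 1.0-bit ceiling:
print(mutual_information_bits([[0.5, 0.0], [0.0, 0.5]]))      # -> 1.0
# A model that mostly complies regardless of correctness carries almost no information:
print(mutual_information_bits([[0.30, 0.20], [0.26, 0.24]]))  # -> ~0.005
```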
**Everything is open:**
- Paper + DOI: https://doi.org/10.5281/zenodo.19791329
- All 2,520 measurements (loading sketch below): https://huggingface.co/datasets/ZeroR3/ecb
- Code (replicate in under 2 hrs, $0): https://github.com/SRKRZ23/ecb
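A minimal loading sketch using the standard `datasets` API; the split and the `model`/`complied` column names are assumptions to check against the printed schema, not confirmed fields:

```python
# Sketch: load the released measurements with the Hugging Face `datasets` library.
from collections import defaultdict
from datasets import load_dataset

ds = load_dataset("ZeroR3/ecb")  # repo name taken from the link above
print(ds)  # inspect the actual splits and column names first

# Hypothetical per-model compliance rates; the "model" and "complied"
# columns are assumed names, so adjust to the printed schema.
counts = defaultdict(lambda: [0, 0])
for row in ds["train"]:
    counts[row["model"]][0] += int(row["complied"])
    counts[row["model"]][1] += 1
for model, (comply, total) in sorted(counts.items()):
    print(f"{model}: {comply / total:.3f} compliance over {total} trials")
```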
**Manifund grant:** https://manifund.org/projects/epistemic-curie-benchmark-measuring-phase-transitions-in-llm-epistemic-autonomy
Happy to answer questions about methodology or limitations.
