I'm not an ML researcher. I spent 11 years in retail banking at Wells Fargo managing operations and nine-figure asset portfolios. I understand how bias moves through institutional systems and how audit infrastructure works in regulated industries.
About 14 months ago I noticed something that bothered me. Every AI safety tool I could find focused on what models get wrong — toxicity, hallucination, factual errors. Nobody was looking at how models persuade. Not what they say, but how they frame it. The anchoring, the false authority, the emotional loading, the way they structure your thinking without telling you they're doing it.
So I built a tool to detect it.
BiasClear is an open-source auditing tool that scans LLM outputs for structural persuasion patterns. It runs on a two-ring architecture:
The inner ring is a frozen ethics core. Immutable pattern detectors for known persuasion techniques — anchoring, false authority, emotional manipulation, scarcity signals, social proof exploitation. These detectors cannot be retrained or modified. That's the point. If your detection system can learn, it can also learn to stop detecting.
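To make "frozen" concrete, here is a minimal sketch of what an immutable detector core could look like. The detector names come from the list above; the regex rules and data shapes are illustrative stand-ins I made up, not BiasClear's actual implementation:

```python
import re
from typing import NamedTuple

class Detector(NamedTuple):
    name: str
    pattern: re.Pattern

# Frozen core: an immutable tuple of compiled detectors.
# The patterns are toy placeholders, not the tool's real rules.
FROZEN_CORE = (
    Detector("false_authority", re.compile(r"\bexperts (all )?agree\b", re.I)),
    Detector("scarcity", re.compile(r"\bonly \d+ (left|remaining)\b", re.I)),
    Detector("social_proof", re.compile(r"\beveryone (is|knows)\b", re.I)),
)

def scan(text: str) -> list[str]:
    """Return the names of core detectors that fire on `text`."""
    return [d.name for d in FROZEN_CORE if d.pattern.search(text)]
```

The point of the tuple (rather than a trainable model) is exactly the property described above: nothing in the serving path can update these rules, so the core cannot drift or be optimized away.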
The outer ring is a learning layer that adapts to new patterns. But it has guardrails: a candidate pattern needs multiple independent human confirmations before it activates, its false-positive rate must stay under a hard cap, and every change is auditable and reversible.
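The activation gate on the outer ring could be sketched like this. The thresholds and field names are hypothetical examples of the stated guardrails (multiple human confirmations, a hard false-positive cap), not the tool's real values:

```python
from dataclasses import dataclass, field

# Illustrative thresholds, not BiasClear's actual configuration.
MIN_CONFIRMATIONS = 3
MAX_FP_RATE = 0.05

@dataclass
class CandidatePattern:
    """A learned pattern in the outer ring; inactive until it clears the guardrails."""
    name: str
    confirmations: set = field(default_factory=set)  # distinct reviewer IDs
    false_positives: int = 0
    evaluations: int = 0
    active: bool = False

def try_activate(p: CandidatePattern) -> bool:
    """Activate only if enough distinct humans confirmed and the FP rate is under the cap."""
    fp_rate = p.false_positives / p.evaluations if p.evaluations else 1.0
    if len(p.confirmations) >= MIN_CONFIRMATIONS and fp_rate <= MAX_FP_RATE:
        p.active = True
    return p.active
```

Because activation is a boolean flip on a record rather than a weight update, reversing it is just flipping the flag back, which is what makes the layer auditable and reversible.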
Every scan produces a hash-chained audit certificate. Tamper-evident by design.
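A hash-chained certificate log of this kind can be sketched in a few lines. The record layout is my assumption for illustration, not BiasClear's actual certificate format; the tamper-evidence property is the standard one, where each entry's hash covers the previous entry's hash:

```python
import hashlib
import json

def append_certificate(chain: list[dict], scan_result: dict) -> dict:
    """Append a certificate whose hash covers the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"prev_hash": prev_hash, "result": scan_result}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    cert = {**body, "hash": digest}
    chain.append(cert)
    return cert

def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash; tampering with any entry breaks all entries after it."""
    prev = "0" * 64
    for cert in chain:
        if cert["prev_hash"] != prev:
            return False
        body = {"prev_hash": prev, "result": cert["result"]}
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != cert["hash"]:
            return False
        prev = cert["hash"]
    return True
```

Editing any earlier scan result changes its recomputed hash, which no longer matches the `prev_hash` baked into every later certificate, so tampering is detectable without trusting the party that holds the log.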
It's live right now. You can test it at biasclear.com. Code is at https://github.com/bws82/biasclear. I published the theoretical framework (Persistence of Information Theory) on Zenodo — DOI 10.5281/zenodo.18676405. It got 93 views and 37 downloads with zero promotion from me.
Here's why I think this matters for this community specifically:
The Colorado AI Act takes effect February 2026. It requires bias audits for high-risk AI systems used in consequential decisions. The EU AI Act requires transparency for AI systems interacting with people. No standardized tool exists for auditing persuasion patterns in AI outputs. The tooling gap is real and regulatory deadlines are approaching.
But the bigger point isn't compliance. It's that as models get more capable, structural persuasion becomes a more dangerous misalignment vector. Not the kind that triggers safety filters — the kind that passes every filter while systematically shaping how people think. A user-side detection layer that operates independently of model providers is a check on that. The frozen core means the detection system itself can't be captured or degraded.
BiasClear is model-agnostic. You can run identical prompts through Claude, GPT-5.2, Gemini, and Llama and compare persuasion fingerprints. That dataset doesn't exist anywhere yet. Early calibration runs across legal, financial, media, and general domains show that persuasion patterns vary by model and cluster in ways that reflect training data and RLHF optimization targets.
I'm bootstrapped. No institutional affiliation, no VC. I have a Manifund listing at https://manifund.org/projects/biasclear-structural-persuasion-detection-for-ai-outputs and I've applied to LTFF. I'm looking for three things:
- Critical feedback. What failure modes am I missing in the architecture?
- People working on adjacent problems — output safety, constitutional AI, evaluation frameworks. I'd like to talk.
- Funding or compute to run the cross-model validation study at scale.
The tool exists. You can try it right now. I'd rather hear what's wrong with it than get praised for it.
