
Why I did this

Some AI safety communication implicitly assumes that "the public" thinks about AI risks in roughly the same terms that alignment researchers do: existential risk, misalignment, control problems. Meanwhile, I kept seeing something messier: job-loss threads next to EU AI Act debates next to GPT-4o trust complaints next to philosophical posts about AI consciousness.

I wanted to know whether these conversations form one coherent discourse or several overlapping ones, and whether the emotional tone varies by topic in any systematic way.

This seemed potentially relevant for anyone thinking about AI governance communication, public outreach, or simply trying to understand where public concern actually lives right now.
I’m posting this partly to sanity-check whether this kind of discourse mapping seems useful.
 

What I did

I collected 7,178 Reddit posts between January 29 and March 1, 2026, using 40 keyword-based search terms (such as "AI safety", "AI alignment", "EU AI Act", "red teaming LLM", "AI replace jobs"). Posts were retrieved across all subreddits and then manually filtered to remove off-topic communities, leaving 6,374 posts for analysis.
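
For concreteness, here is a minimal sketch of one way to run this collection step, assuming the PRAW library; the credentials and per-query limit are placeholders, only a few of the 40 search terms are shown, and PRAW itself is an assumption rather than the method stated in the post:

```python
import praw

# Placeholder credentials; a real script would load these from config/env.
reddit = praw.Reddit(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
    user_agent="ai-safety-discourse-mapping (sketch)",
)

# A few of the 40 keyword-based search terms from the post.
queries = ["AI safety", "AI alignment", "EU AI Act",
           "red teaming LLM", "AI replace jobs"]

posts = {}
for q in queries:
    # Search across all subreddits; off-topic communities are removed
    # later by manual filtering, as described above.
    for s in reddit.subreddit("all").search(q, sort="new", limit=250):
        posts[s.id] = {"title": s.title, "text": s.selftext}  # dedupe by id

print(f"Collected {len(posts)} unique posts")
```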

The analysis pipeline:

  • Sentence embeddings (paraphrase-multilingual-MiniLM-L12-v2) -> 10D UMAP -> HDBSCAN clustering
  • Manual cluster review using structured cluster cards
  • Two interpretive layers added: post-level sentiment (RoBERTa classifier) and discourse framing (human-first labeling with blind LLM comparison and human adjudication)
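
As a condensed sketch of the automated part of this pipeline, assuming the sentence-transformers, umap-learn, hdbscan, and transformers packages: the embedding model name is the one given above, while the clustering parameters and the specific RoBERTa sentiment checkpoint are illustrative assumptions, not necessarily the author's choices.

```python
from sentence_transformers import SentenceTransformer
import umap
import hdbscan
from transformers import pipeline

# `posts` as produced by the collection sketch above.
texts = [p["title"] + "\n" + p["text"] for p in posts.values()]

# 1. Sentence embeddings with the multilingual MiniLM model named above.
encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
embeddings = encoder.encode(texts, show_progress_bar=True)

# 2. Reduce to 10 dimensions with UMAP before density-based clustering.
reduced = umap.UMAP(n_components=10, metric="cosine",
                    random_state=42).fit_transform(embeddings)

# 3. HDBSCAN clustering; the label -1 marks posts left out as noise.
labels = hdbscan.HDBSCAN(min_cluster_size=30).fit_predict(reduced)

# 4. Post-level sentiment. The post names a RoBERTa classifier; this
#    particular checkpoint is an assumed stand-in.
sentiment = pipeline("sentiment-analysis",
                     model="cardiffnlp/twitter-roberta-base-sentiment-latest")
scores = sentiment(texts, truncation=True)
```

The manual steps (cluster cards, framing labels, and the blind LLM comparison) then operate on `labels` and `scores` rather than being part of this automated pass.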

The full report, sample data, and code are available here.

Three findings that may matter for EA communication / governance work

1. Reddit AI safety discourse is fragmented.
The 23 interpretable clusters I found span labour disruption, governance and regulation, lab trust breakdown, authenticity and synthetic content, technical safety, markets and enterprise adoption, and philosophical debates about personhood and alignment. No single topic dominates: the largest cluster accounts for ~10% of posts. "AI safety" on Reddit looks more like a field of related but distinct conversations than one coherent debate.

This suggests that different audiences may need quite different communication approaches - a message designed for people worried about job displacement may land very differently from one aimed at people following lab safety discourse.

2. The most negatively toned clusters are not about abstract x-risk.
The clusters with the strongest negative sentiment are concentrated around lived, immediate disruption: job replacement anxiety, synthetic content spam, broken trust in specific AI labs, AI misuse in schools, creative displacement. By contrast, clusters framed around enterprise adoption, national AI progress, or regulatory frameworks are much closer to neutral or slightly positive.

This is potentially relevant for EA-adjacent communication: the concerns most salient to Reddit users right now are experiential and near-term, not primarily existential.

3. Framing matters as much as topic.
Two clusters can both be "about AI and work" while one is framed as macro labour-replacement anxiety and the other as micro hiring friction and CV pressure; these framings likely call for different policy and communication responses. Similarly, some clusters frame AI as infrastructure, others as a degraded-authenticity problem, others as a trust-in-institutions problem. Topic labels alone don't capture this.

What I'd love feedback on
