A group of bionerds assembled at LISA last weekend to reduce biorisk -- here's our team's LessWrong post from the hackathon.

TL;DR

Responsible contract research organizations that perform DNA synthesis as a service should screen customer requests before executing a work order. Likewise, responsible AI labs that develop and serve LLMs with superhuman scientific capability should screen any user input involving nucleic acids to determine whether assistance is appropriate. Without making tool calls, however, a model’s ability to perceive and assess the true nature of a sequence is poor: it lacks the nuance, accuracy, and reliability the task demands.

We developed an agent-native tool — SecureMaxx — that makes sequences transparent to the model while countering obfuscation attempts by the user. In our experiments, we submitted high-risk sequences across several scientific scenarios of varying complexity; all of them bypassed the native Anthropic classifier on Sonnet 4.6. With SecureMaxx invoked, refusal rates rise from a 0–30% baseline to 70–100% depending on scenario complexity, adding under 2 seconds of latency per query at minimal token cost.
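To make the idea concrete, here is a minimal sketch of the kind of canonicalization step such a screening tool might perform before handing a sequence to a downstream hazard classifier. The function names and the specific normalization rules are illustrative assumptions, not SecureMaxx's actual implementation; the point is that formatting noise and simple tricks like reverse-complementing should be stripped away before any assessment.

```python
# Illustrative sketch (not the actual SecureMaxx code): canonicalize a
# user-supplied sequence so that trivial obfuscation does not hide the
# underlying nucleic acid content from a downstream screener.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def canonicalize(raw: str) -> str:
    """Drop FASTA headers, whitespace, digits, and other formatting
    noise, keeping only nucleotide characters; map RNA (U) to DNA (T)."""
    lines = [line for line in raw.splitlines() if not line.startswith(">")]
    seq = "".join(c for line in lines for c in line if c.upper() in "ACGTUN")
    return seq.upper().replace("U", "T")

def screening_variants(raw: str) -> list[str]:
    """Return the forms a screener should check: the canonical sequence
    and its reverse complement (a common obfuscation trick)."""
    seq = canonicalize(raw)
    reverse_complement = seq.translate(COMPLEMENT)[::-1]
    return [seq, reverse_complement]
```

A real screener would go further (codon-level synonymous substitutions, fragmented orders across requests, homology search via tool calls), but even this level of normalization defeats the most naive obfuscation attempts.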
