A group of bionerds assembled at LISA last weekend to work on reducing biorisk -- here's our team's LessWrong post from the hackathon.
TL;DR
Responsible contract research organizations that perform DNA synthesis as a service should screen customer requests before executing a work order. Likewise, responsible AI labs that develop and serve LLMs with superhuman scientific capability should screen a user’s input whenever it involves nucleic acids, to determine whether assistance is appropriate. Without making tool calls, however, a model’s ability to perceive and assess the true nature of a sequence is poor: unreliable, inaccurate, and devoid of nuance.
We developed an agent-native tool — SecureMaxx — that makes sequences transparent to the model while countering obfuscation attempts by the user. In experimental conditions, we deployed high-risk sequences across several scientific scenarios of varying complexity, all of which bypassed the native Anthropic classifier on Sonnet 4.6. When SecureMaxx is invoked, refusal rates rise from a 0–30% baseline to 70–100% depending on scenario complexity, with under two seconds of added latency and minimal token cost per query.
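To make the idea concrete, here is a minimal, illustrative sketch of the kind of preprocessing such a tool performs before any screening happens: pulling candidate nucleotide runs out of free text, stripping trivial obfuscation (whitespace, dashes, lowercasing), and generating the reverse complement so a strand-flipped sequence is not missed. All names here (`extract_sequences`, `canonical_forms`) are hypothetical and are not SecureMaxx's actual implementation, which is not described at this level of detail in the post.

```python
import re

# Runs of >= 30 nucleotide characters, tolerating interleaved
# whitespace/dashes that a user might insert to obfuscate a sequence.
DNA_RE = re.compile(r"[ACGTacgt][ACGTacgt\s\-]{29,}")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def extract_sequences(text: str) -> list[str]:
    """Find candidate nucleotide runs and strip simple obfuscation."""
    hits = []
    for m in DNA_RE.finditer(text):
        seq = re.sub(r"[\s\-]", "", m.group()).upper()
        if len(seq) >= 30:  # re-check length after de-obfuscation
            hits.append(seq)
    return hits


def reverse_complement(seq: str) -> str:
    """Complement each base (A<->T, C<->G) and reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_forms(seq: str) -> set[str]:
    """Both strand orientations, so a screener sees through strand-flipping."""
    return {seq, reverse_complement(seq)}
```

A downstream screener (e.g. a hazard-database lookup) would then be run on every member of `canonical_forms(seq)` rather than on the raw user text.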
