Epistemic status: low-confidence shower thoughts, written fast, and frontier AI companies/governance are not my area of technical depth. I’m holding all of this lightly and welcome corrections in the comments.
I used Claude Opus 4.8 to fact-check and link sources, and to trim my rambling draft: it removed about 50% of what I originally wrote, with minimal other edits.
I originally had four bullet points on the previous Department of War feud as further concerning context, but moved them to a footnote[1].
- A leading story for how we get safe, long-term beneficial AI is that a safety-focused, ethical lab gets a real lead and uses it to slow down and do safety work as it approaches automated AI R&D.
- Mythos and Fable 5 (the same model; Fable 5 is the public version with aggressive bio and cyber guardrails) marked the point where Anthropic pulled clearly ahead, by a surprising margin. That was very good news, partly because Anthropic is widely seen as the lab that cares most about safety and the long-term future.
- Anthropic had spent months putting unusually aggressive limits on Fable 5 (which customers complained about endlessly), hurting itself financially and inviting government scrutiny by loudly flagging Mythos’s cyber capabilities.
- Whether through ignorance or malice, the government singled out Anthropic, effectively banning Fable 5 and Mythos via export controls (the first time export controls have been applied to an AI model). Perhaps most concerningly, the controls apply even to Anthropic’s own foreign employees, which forced Anthropic to suspend both the public’s access and its own.
- The trigger was a narrow jailbreak that other leading models share: Anthropic claims (I believe correctly[2]) GPT-5.5 can find the same vulnerabilities with the same prompt, so Anthropic was penalized for something its competitors do too. Anthropic knew about the technique and allowed it, because it’s really just asking the model to “fix this code,” which surfaces minor vulnerabilities in order to patch them.
- Mythos meaningfully sped up Anthropic’s own research. Informal internal employee estimates were around 4x; Greenblatt’s was a 1.55x serial speedup. Anthropic could fall back to Opus 4.8, so the gap may be modest, but it was large enough that Anthropic sabotaged Fable 5’s machine-learning capabilities in novel, highly controversial ways (later walked back to less controversial methods) so competitors couldn’t use Fable 5 to accelerate their own research.
- The lead that a safety-focused lab can spend on slowing down responsibly may, optimistically, be only months. Fable has been down for two weeks, and Polymarket gives only a ~30% chance that it will be back for US customers by July 1st, 6 days from now. Even then, foreign Anthropic employees (estimates I’ve seen range from 30% to 70%) and foreign customers may stay locked out. That could cause ongoing research productivity loss, and it could harm Anthropic’s IPO if it can’t sell its top models abroad, or can’t sell them to US customers without friction, depending on how onerous any ongoing identity verification becomes.
- The ban may also chill American AI more broadly, with developers slowing relative to Chinese labs, not for well-thought-out existential-risk reasons but because the government has given no clear guidance on what AI companies are allowed to release (or even use internally) and what will be subject to sudden export controls.
- Net effect: the most safety-focused lab is being differentially slowed; American AI may be slowed relative to China; and there’s growing politicization and misplaced government intervention, in ways that make the world less safe rather than more, and the future less likely to go well.
[1]
- The Department of War opposed a reasonable red line, then went further, using unusually aggressive (in fact, unheard-of) tactics to assert control over AI by designating Anthropic a “supply-chain risk”.
- The exceptions Anthropic wanted were narrow: don’t use its AI for autonomous weapons before it’s ready, and (seemingly the more important one) don’t use it for domestic civilian surveillance, possibly an essential safeguard against totalitarianism and concentration of power as advanced AI arrives.
- AI labs, especially Anthropic, retaining negotiating power with the government in worlds where AI labs are nationalized was one of the main paths I saw to AI going well. This episode makes that look less likely, at least under the current administration.
- Hegseth tied this earlier episode to the recent Fable 5 incident, misleadingly and belligerently calling its ban from the DOW “permanent.”
- The concern here is that this indicates a repeating pattern of unfair adverse action toward Anthropic, in a way that seriously penalizes Anthropic for its efforts on both AI safety[3] and defending against totalitarianism.
[2]
I only had a vague impression of this important fact, so I had Opus 4.8 research it and write this footnote, which I believe is accurate:
I can’t independently verify how Mythos/Fable 5 compares with GPT-5.5 (or OpenAI’s gated GPT-5.5-Cyber) on the specific tasks at issue. But I mostly believe Anthropic’s claim that they’re broadly comparable, and it isn’t only Anthropic saying so: the UK’s AI Security Institute evaluated both and found GPT-5.5 essentially on par with Mythos on expert cyber-vulnerability tasks (71% vs 69%, inside the margin of error), and independent security researchers read it the same way. Mythos may have an edge on deep-codebase synthesis, but it doesn’t look uniquely dangerous in a way GPT-5.5 isn’t.
[3]
As noted in footnote 2, Anthropic is not being punished because its model is less safe; it seems comparable to GPT-5.5 on cyber. In fact, one interpretation is that Anthropic is being punished because it tried to alert the government to the safety concerns around its models, a pro-safety action that raised the public perception of Mythos/Fable as dangerous.
