Anthropic releases Claude Fable 5, restricts Mythos 5

On June 9 Anthropic made Claude Fable 5 generally available and limited access to Claude Mythos 5 to vetted cyber defenders and critical infrastructure operators.

Anthropic released Claude Fable 5 on June 9 and made a cyber-unlocked twin, Claude Mythos 5, available only to vetted cyber defenders and critical infrastructure operators. The company described Mythos 5 as “the strongest cybersecurity model in the world.” Both products use the same underlying model but differ in the safety layer applied.

Fable 5 uses separate AI classifiers to detect requests involving cyber operations, biology, chemistry and distillation, the extraction of capabilities to train competing models. When a classifier flags a request, Fable 5 hands the task to a smaller model, Claude Opus 4.8, and notifies the user. The cyber classifier is set to block a wide range of offensive tasks, including reconnaissance, exploit development and steps used in multi-stage attacks.

Pricing for both Fable 5 and Mythos 5 is $10 per million input tokens and $50 per million output tokens. Fable 5 is available now through the Claude API and is included at no additional charge on Pro, Max, Team and seat-based Enterprise plans through June 22; after that date access moves to usage credits. Anthropic intends to expand Mythos 5 access via a trusted-access program and to return Fable 5 to subscriptions once compute capacity allows.

Anthropic published internal and external test results showing the classifiers blocked or rerouted harmful single-turn cyber requests in its evaluations. An external partner reported zero compliance with harmful single-turn prompts on cyberattack planning, exploit development and defense evasion across 30 public jailbreak techniques. An external bug bounty ran more than 1,000 hours without producing a universal jailbreak that removed the safeguards. Anthropic noted one external tester made progress toward a universal jailbreak in an early window and acknowledged it may be impossible to prevent all universal jailbreaks; the company aims to make any successful jailbreaks slow and costly enough to detect before widespread use.

The company tuned the classifiers conservatively for launch, which produces false positives. Anthropic reported the fallback to Opus 4.8 triggers in under 5% of sessions, so in over 95% of sessions Fable 5 behaves like the cyber-unrestricted model. The company said it will narrow the safeguards and reduce false positives after release.

Earlier testing of a Mythos Preview under Project Glasswing found the model could autonomously identify and exploit zero-day vulnerabilities across major operating systems and web browsers. Anthropic reported the preview produced a remote code execution exploit against FreeBSD’s NFS server tied to CVE-2026-4747 and found a long-standing flaw in OpenBSD. The company stated these abilities emerged from general improvements in coding, reasoning and autonomy rather than from targeted training.

Defensive teams involved in Project Glasswing and partners reported finding more than 10,000 high- or critical-severity vulnerabilities in widely used software during early testing. Cloudflare reported roughly 2,000 bugs, about 400 high or critical. Mozilla fixed 271 issues in Firefox 150 that it attributed to the tool. Anthropic reported that a high- or critical-severity bug found by the model typically takes about two weeks to patch and that, in controlled experiments starting from a disclosed CVE and its patch, the model could build working Linux privilege-escalation exploits in under a day at modest compute cost.

To support vetted defensive work, Anthropic opened a Cyber Verification Program allowing approved security professionals to use Mythos-class models without the cyber safeguards. The company also imposed a 30-day data-retention requirement for traffic to Fable 5, Mythos 5 and future models at this capability level. Anthropic said it will log all human access, will not use retained data for training or non-safety purposes, and will delete records after 30 days except where a safety investigation or legal obligation requires longer retention.

Anthropic plans to refine the classifiers, reduce false positives and widen trusted access to Mythos 5 over time. The company noted similar capabilities are likely to appear from other developers and that how those developers control or expose such abilities will affect how the tools are used.

Articles by this author