The New Mythos Tier
Anthropic has released Claude Fable 5, marking the debut of its Mythos capability tier. This model surpasses the Claude Opus line, achieving state of the art results on complex, multi step reasoning tasks. The company acknowledges that such advanced capabilities are dual use, as the model can discover and exploit software vulnerabilities and perform agentic hacking across a full attack lifecycle.
Built in Safeguards
Rather than outright refusing risky prompts, Fable 5 routes suspicious requests to a less capable model. A classifier layer detects requests related to cybersecurity, biology, chemistry, or model distillation and hands those sessions to Claude Opus 4.8 instead. Users receive a notification when a fallback occurs. The company says the classifiers trigger in under 5% of sessions, meaning over 95% run on Fable’s full capability. Internal evaluations show the classifiers effectively block meaningful progress on offensive cybersecurity tasks.
Defensive Deployment
Anthropic is also offering Claude Mythos 5, the same model but with cybersecurity safeguards removed, to a restricted group of defenders and infrastructure providers. This version is deployed through Project Glasswing in collaboration with the US government. The company reports that external red teaming found no universal jailbreaks across more than 1,000 hours of testing, though the UK AI Safety Institute made early progress within a short testing window.
Source: Cyber Security News
