Automated Exploit Construction
Anthropic’s security-focused AI model, Mythos Preview, has demonstrated the ability not only to identify software vulnerabilities but also to construct working proof-of-concept (PoC) exploits from them. Cloudflare’s security team tested the model against more than fifty internal repositories as part of Project Glasswing, an invitation-only program from Anthropic. The results mark a shift in automated vulnerability research. Previous AI models could identify individual flaws and explain their significance, but they consistently failed to complete the exploit chain.
Mythos Preview addresses this limitation through two key capabilities. The first is exploit chain construction: the model combines multiple low-severity bugs and exploitation primitives, such as a use-after-free error, arbitrary read/write primitives, and return-oriented programming gadgets, into a single working exploit. Bugs that might otherwise have remained in a security backlog become actionable attack paths. The second is proof generation: the model writes code to trigger a suspected bug, compiles and runs it in a sandboxed environment, reads the failure output, adjusts its hypothesis, and iterates until it confirms or rules out exploitability.
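The proof-generation loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Mythos Preview or Cloudflare’s tooling is implemented: the names Attempt, run_isolated, and prove are hypothetical, the propose callback stands in for the model, candidates are Python scripts so there is no compile step, and a child process in a temporary directory stands in for a real sandbox.

```python
import os
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    source: str       # candidate trigger program
    returncode: int   # exit status of the run (-1 on timeout)
    stderr: str       # failure output fed back into the next hypothesis


def run_isolated(source: str, timeout: float = 5.0) -> Attempt:
    """Run a candidate script in a child process inside a throwaway directory.

    Assumption: the child process is only a stand-in for a sandbox.
    It is NOT a security boundary.
    """
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "candidate.py")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(source)
        try:
            proc = subprocess.run(
                [sys.executable, path],
                capture_output=True,
                text=True,
                timeout=timeout,
                cwd=workdir,
            )
        except subprocess.TimeoutExpired:
            return Attempt(source, -1, "timeout")
        return Attempt(source, proc.returncode, proc.stderr)


def prove(
    propose: Callable[[List[Attempt]], Optional[str]],
    is_triggered: Callable[[Attempt], bool],
    run: Callable[[str], Attempt] = run_isolated,
    max_iters: int = 5,
) -> Tuple[str, List[Attempt]]:
    """Iterate hypothesis -> run -> read output until a verdict is reached.

    `propose` (hypothetical stand-in for the model) sees every earlier
    attempt and returns the next candidate, or None to give up.
    """
    history: List[Attempt] = []
    for _ in range(max_iters):
        source = propose(history)
        if source is None:
            return "ruled_out", history      # no hypotheses left to try
        attempt = run(source)
        history.append(attempt)
        if is_triggered(attempt):
            return "confirmed", history      # the bug was reproduced
    return "inconclusive", history           # iteration budget exhausted
```

The loop returns one of three verdicts together with the full attempt history, so a confirmed finding carries the exact candidate that triggered it. Passing the runner as a parameter keeps the loop independent of any particular sandbox.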
Impact and Noise Reduction
Even with these improvements, false positives remain a challenge. Two factors dominate the noise rate: programming language and model bias. C and C++ codebases generated significantly more false positives than codebases in memory-safe languages like Rust. Models also tend to report findings speculatively, flooding triage queues with hedged conclusions. Mythos Preview reduces this problem by producing output with clearer reproduction steps and PoC code, which helps security teams make faster fix-or-dismiss decisions.
Cloudflare found that pointing any AI model directly at a repository produces poor coverage. Effective vulnerability research requires a custom execution harness built around narrow scope, with each agent assigned a specific task rather than the entire codebase. This approach, combined with Mythos Preview’s ability to deliver confirmed findings with attached PoCs, significantly reduces triage time for security teams.
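One way to picture the narrow-scope idea is a planner that turns a repository listing into many small tasks, each naming one file and one bug class. The sketch below is an assumption-laden illustration, not Cloudflare’s harness: ScopedTask, plan_tasks, the suffix-to-bug-class mapping, and the prompt wording are all hypothetical.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Tuple

# Hypothetical mapping from file suffix to the bug classes worth checking.
BUG_CLASSES_BY_SUFFIX: Dict[str, Tuple[str, ...]] = {
    ".c": ("use-after-free", "out-of-bounds access"),
    ".cpp": ("use-after-free", "out-of-bounds access"),
    ".rs": ("unsound unsafe block",),
}


@dataclass(frozen=True)
class ScopedTask:
    path: str        # the single file this agent may examine
    bug_class: str   # the single bug class it should look for

    def prompt(self) -> str:
        # Illustrative wording only; a real harness would tune this.
        return (
            f"Review only {self.path} for {self.bug_class}. "
            "Report a finding only if you can describe how to trigger it."
        )


def plan_tasks(
    paths: Iterable[str],
    bug_classes: Dict[str, Tuple[str, ...]] = BUG_CLASSES_BY_SUFFIX,
) -> List[ScopedTask]:
    """Emit one task per (file, bug class) pair; skip unmapped file types."""
    tasks: List[ScopedTask] = []
    for path in sorted(paths):
        suffix = PurePosixPath(path).suffix.lower()
        for bug_class in bug_classes.get(suffix, ()):
            tasks.append(ScopedTask(path, bug_class))
    return tasks
```

For example, plan_tasks(["src/parser.c", "README.md", "src/lib.rs"]) yields three tasks: one for src/lib.rs and two for src/parser.c, while README.md is skipped because its suffix has no mapped bug class. Each task can then be handed to a separate agent.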
Source: Cyber Security News
