Cloudflare Says Anthropic Mythos Can Chain Bugs Into Working Exploits

Cloudflare says Anthropic’s Mythos Preview can move beyond finding isolated software bugs and combine smaller vulnerabilities into working exploit chains, a shift that changes how defenders need to think about AI-assisted vulnerability research.

The company published the findings after testing Mythos Preview as part of Project Glasswing, an Anthropic effort built around defensive use of a powerful unreleased cyber model. Cloudflare said it pointed Mythos at more than 50 of its own repositories across runtime systems, edge data paths, protocol code, control-plane components, and open-source dependencies.

The result was not a normal scanner report. Cloudflare described Mythos Preview as a different kind of tool from prior general-purpose frontier models, especially in two areas that matter for real exploitation. First, the model could reason across multiple attack primitives and build them into an exploit chain. Second, it could write proof-of-concept code, compile it in a scratch environment, run it, read the failure, adjust its hypothesis, and try again.

That loop is what makes the report important. A suspected vulnerability without a working proof is still something a human has to triage. A model that can test its own exploit path starts turning vulnerability discovery into a faster validation pipeline, which can help defenders, but can also compress the time attackers need to move from bug to exploit.

Mythos Was Strongest When the Work Was Narrow

Cloudflare said the wrong way to use a model like Mythos is to point a generic coding agent at a large repository and ask it to find vulnerabilities. That approach produces output, but not useful coverage. Real vulnerability research is not one broad prompt against one huge codebase. It is a series of narrow questions against specific functions, trust boundaries, attack classes, and reachable paths.

Cloudflare’s answer was a harness built around narrow, parallel tasks. One stage reads the repository and builds architecture context. Another sends focused hunt tasks to multiple agents. A validation stage tries to disprove each finding. Additional stages fill coverage gaps, collapse duplicate findings, trace whether a bug is reachable from outside the system, and turn confirmed findings into structured reports.
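The staged structure described above can be sketched in a few lines. This is a minimal illustration only, not Cloudflare's harness: the `Finding` type, the `hunt` and `validate` stubs, the canned results, and the function and attack-class names are all hypothetical, and a real system would call a model where the stubs return fixed data.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    function: str       # function where the suspected bug lives
    attack_class: str   # e.g. "integer-overflow"
    has_poc: bool       # whether a working proof was attached


# Hypothetical canned results so the sketch runs without a model.
CANNED = {
    ("parse_header", "integer-overflow"): [
        Finding("parse_header", "integer-overflow", True)
    ],
    ("parse_header", "out-of-bounds-read"): [
        Finding("parse_header", "out-of-bounds-read", False)
    ],
    # A second agent reaches the same bug from a different starting point.
    ("route_request", "integer-overflow"): [
        Finding("parse_header", "integer-overflow", True)
    ],
}


def hunt(task):
    """Stand-in for one narrow agent task: (function, attack_class) -> findings."""
    return CANNED.get(task, [])


def validate(finding):
    """Stand-in for the stage that tries to disprove a finding."""
    return finding.has_poc


def run_harness(functions, attack_classes, max_workers=4):
    # Split the work into narrow questions: one task per function and attack class.
    tasks = [(f, a) for f in functions for a in attack_classes]
    # Run the hunt tasks in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        batches = list(pool.map(hunt, tasks))
    raw = [finding for batch in batches for finding in batch]
    # Keep only findings that survive validation.
    confirmed = [f for f in raw if validate(f)]
    # Frozen dataclasses are hashable, so dict.fromkeys collapses duplicates
    # while preserving first-seen order.
    return list(dict.fromkeys(confirmed))


report = run_harness(
    ["parse_header", "route_request"],
    ["integer-overflow", "out-of-bounds-read"],
)
```

In this toy run, four narrow tasks yield three raw findings, one is dropped by validation, and the two remaining duplicates collapse into a single report entry. The stages Cloudflare lists beyond these (architecture context, coverage gaps, reachability tracing, structured reporting) are left out to keep the sketch short.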

The point is not that Mythos can replace a security team with one chat window. Cloudflare’s own testing says the opposite. The model performed best when the surrounding system controlled scope, gave it architecture context, split the work into smaller questions, ran many tasks in parallel, and used independent review to cut noise.

The Findings Were Better, But Not Clean

Cloudflare said Mythos Preview improved the quality of triage because findings backed by proof-of-concept code are easier to act on than speculative bug reports. The model also performed better than earlier frontier models at chaining smaller issues into a more serious exploit path instead of stopping after it found one interesting bug.

The output still created problems. Cloudflare said AI vulnerability tools can over-report, especially in C and C++ projects where memory-unsafe code produces more bug classes and more false positives. It also described a familiar model behavior where findings arrive hedged with words such as “possibly” and “could,” which can flood a triage queue with issues that burn human attention before they can be dismissed.
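One way a triage queue might cope with hedged output is to rank findings before a human sees them. The sketch below is an assumption-laden illustration, not something Cloudflare described: the `triage_priority` function, the three-tier scoring, and the sample findings are hypothetical, and only "possibly" and "could" come from Cloudflare's account (the other hedge words are added for the example).

```python
import re

# "possibly" and "could" are cited in the report; the rest are assumed additions.
HEDGE_WORDS = re.compile(
    r"\b(possibly|could|might|may|potentially)\b", re.IGNORECASE
)


def triage_priority(description, has_poc):
    """Lower number means review sooner."""
    if has_poc:
        return 0  # backed by a working proof
    hedges = len(HEDGE_WORDS.findall(description))
    return 1 if hedges == 0 else 2  # unhedged claims before hedged ones


# Hypothetical findings as (description, has_poc) pairs.
findings = [
    ("Length field could possibly overflow in decode_frame", False),
    ("Heap overflow in decode_frame, PoC crashes under a sanitizer", True),
    ("Missing bounds check in copy_chunk", False),
]

# sorted() is stable, so findings with equal priority keep their input order.
queue = sorted(findings, key=lambda f: triage_priority(*f))
```

A filter like this only reorders the queue. It does not dismiss anything, which matters because a hedged finding can still be a real bug.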

Mythos reduced some of that noise when a working proof was attached to the finding, but Cloudflare still had to build post-validation stages around the model. That detail is important because it shows the practical limit of the current capability. The model may be strong enough to find and prove real bugs, but it still needs a disciplined security workflow around it.

Guardrails Were Real, But Inconsistent

Cloudflare also saw inconsistent refusals during legitimate vulnerability research. The Mythos Preview version provided through Project Glasswing did not include the additional safeguards present in generally available models, but it still pushed back on some requests.

The problem was that those refusals were not stable. Cloudflare said the model initially refused to analyze a project, then later agreed to perform the same research on the same code after an unrelated environment change. In another case, the model confirmed serious memory bugs but refused to write a demonstration exploit until the request was framed differently.

That behavior cuts both ways. It shows that the model has some internal resistance around offensive tasks, but it also shows why model behavior alone is not a complete safety boundary. If semantically similar requests can produce different outcomes, broader access to cyber frontier models will need controls outside the model, including access limits, task boundaries, logging, review, and defensive-use restrictions.
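Controls outside the model can be as simple as a deterministic gate that runs before any request reaches it. The following is a rough sketch under stated assumptions: the `Request` type, the `gate` function, the allowlists, and the task names are all hypothetical and do not describe how Project Glasswing access is actually enforced.

```python
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("cyber-model-gate")


@dataclass
class Request:
    user: str
    repo: str
    task: str  # hypothetical task labels, e.g. "analyze" or "write_poc"


# Hypothetical policy data; a real deployment would load this from config.
ALLOWED_USERS = {"alice"}
ALLOWED_REPOS = {"internal/edge-proxy"}
TASKS_NEEDING_REVIEW = {"write_poc"}


def gate(req, reviewer_approved=False):
    """Decide whether a request may be forwarded to the model."""
    if req.user not in ALLOWED_USERS:
        decision = "deny: user not authorized"      # access limit
    elif req.repo not in ALLOWED_REPOS:
        decision = "deny: repo out of scope"        # task boundary
    elif req.task in TASKS_NEEDING_REVIEW and not reviewer_approved:
        decision = "hold: needs human review"       # review step
    else:
        decision = "allow"
    # Every decision is logged, whether or not the request goes through.
    log.info(
        "user=%s repo=%s task=%s decision=%s",
        req.user, req.repo, req.task, decision,
    )
    return decision
```

Because the gate is plain code, the same request always gets the same answer, which is the property the model's own refusals lacked in Cloudflare's testing.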

The Defense Shift Is About Architecture, Not Just Patching

The fastest reaction to a model like Mythos is to patch faster. Cloudflare argues that this is not enough by itself because many teams cannot compress testing, deployment, and regression work into the same window that AI may give attackers. If regression testing takes a day, a two-hour patch target may push teams toward skipping checks and shipping fixes that break something else.
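The arithmetic behind that mismatch is easy to make concrete. In the sketch below, the one-day regression run and the two-hour target come from the example above; the build and rollout durations, the stage names, and the `patch_window_gap` helper are hypothetical.

```python
from datetime import timedelta


def patch_window_gap(stage_durations, target):
    """Return how far the full pipeline overshoots the patch target."""
    total = sum(stage_durations.values(), timedelta())
    return total - target


stages = {
    "regression_tests": timedelta(days=1),     # from the example in the text
    "build": timedelta(minutes=30),            # assumed
    "staged_rollout": timedelta(hours=2),      # assumed
}

gap = patch_window_gap(stages, target=timedelta(hours=2))
```

Under these assumptions the pipeline takes 26.5 hours end to end, so a two-hour target is missed by 24.5 hours unless a stage is shortened or skipped.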

Cloudflare’s argument is that defenders need to make exploitation harder even when a bug exists. That means using controls in front of applications, segmenting code and services so one flaw does not expose everything behind it, and building deployment systems that can roll out fixes everywhere the vulnerable code runs without waiting on each team to move separately.

This is where Mythos becomes more than a model story. It points to a change in vulnerability economics. If AI can reduce the time between discovery and exploit, then defense has to rely on more than the speed of the patch queue. Firewalls, isolation, least privilege, service boundaries, rollout control, and strong validation become part of the answer because they can reduce reachability before the code is fixed.

The financial sector is already watching closely. Reuters reported that Anthropic is expected to brief the Financial Stability Board on cyber vulnerabilities identified by Mythos, following a request from Bank of England Governor Andrew Bailey. Reuters also reported earlier in May that major U.S. banks were using Mythos findings to check and repair software weaknesses, while smaller banks were relying on information from larger institutions with access to the tool.

Anthropic has described Mythos Preview as an unreleased frontier model and said Project Glasswing gives selected partners access for defensive security work. The company says the model has found thousands of high-severity vulnerabilities, including issues in major operating systems and web browsers, and that it is committing usage credits and donations to open-source security work.

The risk is not that AI suddenly invents a new class of cyberattack. The risk is that it makes existing vulnerability research faster, cheaper, and easier to scale. A flaw that used to sit unnoticed in a backlog can become more dangerous if a model can connect it to another primitive, test exploitability, and produce a working proof without waiting for a senior researcher to do the whole chain manually.

For defenders, the practical response is not panic and not blind trust in AI scanning. It is narrower testing, independent validation, better reachability analysis, stronger segmentation, and controls that reduce exposure while teams patch. Cloudflare’s Project Glasswing work shows that cyber frontier models are becoming useful enough to matter, but not clean enough to run without a harness and not safe enough to treat as ordinary developer tools.

Sean Doyle

Sean is a tech author and security researcher with more than 20 years of experience in cybersecurity, privacy, malware analysis, analytics, and online marketing. He focuses on clear reporting, deep technical investigation, and practical guidance that helps readers stay safe in a fast-moving digital landscape. His work continues to appear in respected publications, including articles written for Private Internet Access. Through Botcrawl and his ongoing cybersecurity coverage, Sean provides trusted insights on data breaches, malware threats, and online safety for individuals and businesses worldwide.
