Anthropic Accidentally Leaked Its Most Powerful Model. 3,000 Files Exposed the AI That Terrifies Cybersecurity Experts.
Two security researchers just found 3,000 unpublished Anthropic files sitting on the open internet. Among them: a draft blog post describing Claude Mythos, codenamed Capybara — an AI model that Anthropic itself calls "by far the most powerful" it has ever built. The model reportedly scores dramatically higher than Claude Opus 4.6 on coding, academic reasoning, and cybersecurity benchmarks. And that last category is what has the industry panicking.
Roy Paz of LayerX Security and Alexandre Pauwels from the University of Cambridge discovered the unsecured content management system cache on March 26. Fortune broke the story that evening, and within 48 hours, cybersecurity stocks had already dropped. Anthropic confirmed the model exists but insists it was never meant to go public yet.
Here is everything confirmed so far about Claude Mythos, why the cybersecurity angle is genuinely concerning, and what Anthropic plans to do next.
What the Leaked Draft Reveals About Claude Mythos
The draft blog post — one of approximately 3,000 assets found in the exposed data cache — describes Claude Mythos as a new tier of model sitting above the Opus line. Anthropic positions it as "larger and more intelligent than our Opus models," which makes it the first model the company has explicitly placed above Opus in its hierarchy.
The document claims "dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity" compared to Claude Opus 4.6. No specific benchmark numbers were included in the leak — Anthropic appears to have been saving those for the official announcement.
What the draft does specify is that the model is expensive. Anthropic describes it as "very expensive for us to serve, and will be very expensive for our customers to use." They are currently working to improve inference efficiency before any general release. For context, Claude Opus 4.6 already costs $15 per million input tokens and $75 per million output tokens. A model tier above that could easily command $30-50 per million input tokens or higher.
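To make the pricing gap concrete, here is a minimal cost sketch. The Opus 4.6 rates are the published figures cited above; the Mythos tier is a purely hypothetical placeholder based on this article's speculation, not anything in the leak.

```python
# Rough per-request cost comparison. Opus 4.6 rates are the published
# figures cited above; the "mythos" tier is a hypothetical placeholder
# extrapolated from this article's $30-50/M input-token speculation.
PRICING_PER_MILLION = {
    "opus-4.6": {"input": 15.00, "output": 75.00},
    "mythos-hypothetical": {"input": 40.00, "output": 200.00},  # guess, not confirmed
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one API call at the listed rates."""
    rates = PRICING_PER_MILLION[model]
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000

# Example: a large code-audit prompt (200k tokens in, 10k tokens out)
opus = request_cost("opus-4.6", 200_000, 10_000)
mythos = request_cost("mythos-hypothetical", 200_000, 10_000)
print(f"Opus 4.6: ${opus:.2f}  Hypothetical Mythos tier: ${mythos:.2f}")
```

Even at these invented rates, a single large audit call jumps from a few dollars to double digits — the kind of delta that pushes a model toward specialized, high-stakes use rather than everyday queries.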
The internal codename "Capybara" follows Anthropic's pattern of animal-themed project names, though Mythos appears to be the public-facing brand. The draft positions Mythos as a model for specialized, high-stakes work rather than general consumer use — a direct contrast to how most AI labs market their flagships.
The Cybersecurity Capability That Has Everyone Worried
This is the part of the leak generating the most urgent response. The draft blog post states that Claude Mythos is "currently far ahead of any other AI model in cyber capabilities." But the truly alarming line comes next: the model "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders."
Read that again. Anthropic is not just saying their model is good at security tasks. They are warning that Mythos signals a new class of AI that can find and exploit vulnerabilities faster than humans can patch them. The model can identify previously unknown vulnerabilities in production codebases — what the security industry calls zero-day discovery.
This is a dual-use problem. The same capability that helps defenders audit their code also gives attackers a force multiplier. A red team equipped with Mythos could scan millions of lines of code and surface exploitable weaknesses in hours instead of months. Defenders would need the same model just to keep pace.
Cybersecurity stocks reacted quickly. As CoinDesk reported on March 27, major cybersecurity firms saw share prices drop as investors processed the implications: if an AI model can outperform human security researchers at finding vulnerabilities, the economics of both offense and defense shift dramatically.
How 3,000 Files Ended Up on the Open Internet
The leak itself is embarrassingly simple. A configuration error in Anthropic's content management system left close to 3,000 unpublished assets publicly accessible. No authentication required. No hacking involved. Just a misconfigured cache that anyone with the right URL could access.
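For illustration, the class of check that catches this kind of exposure is simple: request each supposedly private URL with no credentials and flag anything that comes back successfully. This is a generic sketch — the URLs and the HTTP-status heuristic are assumptions for illustration, not details from the leak.

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError

def probe(url: str, opener=urlopen) -> tuple[int, str]:
    """Fetch a URL with no credentials; return (status, WWW-Authenticate header)."""
    req = Request(url, method="GET")
    try:
        with opener(req) as resp:
            return resp.status, resp.headers.get("WWW-Authenticate", "")
    except HTTPError as err:
        return err.code, err.headers.get("WWW-Authenticate", "")

def is_exposed(status: int, www_authenticate: str) -> bool:
    """A 2xx response with no auth challenge suggests the asset is public."""
    return 200 <= status < 300 and not www_authenticate

# Audit a list of URLs that should require auth (hypothetical paths):
# for url in ["https://cms.example.com/drafts/unpublished-post.json"]:
#     status, challenge = probe(url)
#     if is_exposed(status, challenge):
#         print("EXPOSED:", url)
```

A scheduled job running a check like this against every draft asset would have flagged the misconfiguration long before outside researchers did.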
After Paz and Pauwels reported the find, Fortune contacted Anthropic on Thursday evening, and the company secured the data store shortly after. By then the damage was done: the draft blog post had already been downloaded, screenshotted, and circulated.
The irony is not lost on anyone. A company building AI models with "unprecedented cybersecurity capabilities" exposed its most sensitive strategic asset through a basic infrastructure misconfiguration. It is the kind of mistake that Mythos itself would presumably catch instantly — a configuration error in a production system leaving sensitive data unauthenticated.
Anthropic's response was measured. They confirmed the model exists, acknowledged the leak, and characterized Mythos as representing "a step change" in AI capabilities. They described it as "the most capable we've built to date" but emphasized it was not yet ready for release.
The Pentagon Connection Nobody Expected
Here is where the story gets more complicated. Anthropic has an ongoing relationship with the Department of Defense, and as Gizmodo reported, the Pentagon is paying close attention to Mythos — particularly those cybersecurity capabilities.
A model that can discover zero-day vulnerabilities at scale is exactly the kind of tool that defense agencies want. Offensive cyber operations have become a core component of modern military strategy, and an AI that can find exploitable weaknesses faster than any human team is a strategic asset.
Anthropic has historically positioned itself as the "safety-focused" AI company. They pioneered Constitutional AI and have published extensively on alignment research. Selling (or even providing early access to) a model with advanced offensive cyber capabilities to the Pentagon puts that brand positioning under pressure.
The leaked draft notes that Anthropic plans to give cybersecurity defenders early access to Mythos, framed as "giving them a head start in improving robustness." But early access for defenders and early access for defense agencies serve very different purposes, and the public conversation has not yet separated the two.
What This Means for the AI Industry and Competitors
If the performance claims hold up, Mythos redraws the competitive map. A model that sits meaningfully above Opus-class performance on both general reasoning and specialized cybersecurity tasks would be the first true "next-generation" jump since GPT-4 in 2023.
For OpenAI, the timing is difficult. GPT-5 has been in development for over a year, and leaks about its capabilities have been comparatively modest. If Anthropic ships a model that demonstrably outperforms everything else on high-stakes professional tasks, the "who has the best model" narrative shifts decisively — at least until OpenAI's next release.
For Google's Gemini team, the challenge is similar. Gemini Ultra 2.0 has strong benchmark performance, but nothing in Google's public communications suggests a cybersecurity capability gap of the kind Anthropic is claiming.
The enterprise implications are where money flows fastest. Companies that handle sensitive infrastructure — banks, utilities, defense contractors — will pay premium pricing for a model that can audit their codebase for vulnerabilities before attackers find them. If Mythos delivers on the draft's claims, Anthropic could command pricing that makes current API costs look modest.
Developers working on document processing and infrastructure automation should pay attention too. A model with this level of code understanding could reshape how security auditing integrates with development pipelines, making vulnerability scanning a real-time part of the coding process rather than a separate step.
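As a sketch of what that integration might look like, here is a minimal pre-merge gate that runs a scanner over changed files and fails the build on any finding. The scanner interface is a hypothetical stand-in; no specific tool or API is implied by the leak.

```python
import subprocess
from typing import Callable

def changed_files() -> list[str]:
    """List source files modified relative to the main branch (git-based sketch)."""
    out = subprocess.run(
        ["git", "diff", "--name-only", "origin/main...HEAD"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines()
            if line.endswith((".py", ".js", ".go"))]

def gate(files: list[str], scan: Callable[[str], list[str]]) -> int:
    """Run the scanner over each file; return a nonzero exit code on any finding."""
    findings = {f: issues for f in files if (issues := scan(f))}
    for path, issues in findings.items():
        for issue in issues:
            print(f"{path}: {issue}")
    return 1 if findings else 0

# In CI, wire it up with whatever auditor backs the scan:
#   import sys
#   sys.exit(gate(changed_files(), scan=some_model_backed_auditor))
```

The point of the sketch is the shape, not the tool: once a model-backed auditor is cheap and fast enough, it slots into the same gate position that linters and unit tests occupy today.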
The AI Safety Debate Just Got a Live Case Study
Anthropic's own leaked wording — that Mythos "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders" — is the most concrete admission any frontier lab has made about the offensive potential of its models.
This is not a hypothetical risk assessment or a speculative paper. It is a company describing its own product's capabilities and acknowledging, in its own marketing material, that those capabilities create a fundamental asymmetry between offense and defense in cybersecurity.
The AI safety community has long warned about "dangerous capabilities" in frontier models. Anthropic's Responsible Scaling Policy was designed specifically for moments like this — a model that crosses capability thresholds requiring additional safety measures before deployment. Whether their policy actually prevents harm or merely delays release while they figure out pricing is the question the next few months will answer.
Senator Chuck Schumer's office has already requested a briefing on the leak, according to CoinDesk's March 28 follow-up reporting. The intersection of AI capabilities, cybersecurity, and national defense is exactly the kind of issue that generates bipartisan attention in Washington.
What Happens Next
Anthropic is working with a "small group of early access customers" to test Mythos. Based on the draft's language about needing to improve serving efficiency, a public release is likely months away — probably Q3 or Q4 2026 at the earliest.
The company will need to address several questions before any release:
- What guardrails prevent Mythos from being used to generate exploit code for known vulnerabilities?
- How does Anthropic control access to the model's cybersecurity capabilities versus its general reasoning capabilities?
- Will the model be available through the standard API, or will cybersecurity features require a separate, vetted access tier?
- What pricing structure makes sense for a model that Anthropic itself calls "very expensive to serve"?
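On the access-control question, a vetted capability tier is easy to sketch in the abstract. Everything below is invented for illustration and reflects nothing about Anthropic's actual API.

```python
# Hypothetical sketch of a vetted capability tier. Names, fields, and
# tier labels are invented; nothing here reflects a real API.
from dataclasses import dataclass, field

@dataclass
class ApiKey:
    owner: str
    tiers: set[str] = field(default_factory=lambda: {"general"})

def authorize(key: ApiKey, requested_capability: str) -> bool:
    """Allow general reasoning for everyone; gate cyber features behind a vetted tier."""
    if requested_capability == "cyber-analysis":
        return "vetted-security" in key.tiers
    return True

# A standard key can use general reasoning but not the cyber tier:
standard = ApiKey(owner="acme")
assert authorize(standard, "general")
assert not authorize(standard, "cyber-analysis")
```

The hard part is not the gate itself but the vetting behind it — deciding who counts as a defender, and verifying it at scale.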
The accidental leak has given competitors, regulators, and the security community months of advance warning about what is coming. Whether that head start helps defenders or simply gives malicious actors time to prepare their strategies around the eventual release is an open question with no comfortable answer.
A company building the most advanced cybersecurity AI in the world got caught by a basic CMS misconfiguration. That dissonance tells you everything about where the AI industry is right now: the models are getting scarily good, and the humans running the infrastructure around them are still making the same mistakes they always have. The question with Mythos is not whether the technology works. It is whether the institutions deploying it can handle what comes next.
Key Takeaways
- ✓ Claude Mythos (Capybara) is a new model tier above Opus with dramatically higher benchmark scores on coding, reasoning, and cybersecurity
- ✓ A CMS misconfiguration exposed 3,000 unpublished Anthropic files — discovered by LayerX Security and Cambridge researchers
- ✓ The model is "currently far ahead of any other AI model in cyber capabilities" and can identify zero-day vulnerabilities
- ✓ Cybersecurity stocks dropped as investors reacted to the offense-defense implications
- ✓ No public release date — Anthropic is improving inference efficiency before general availability
- ✓ Pentagon interest and Congressional briefing requests add national security dimensions to the story
Skila AI Editorial Team
The Skila AI editorial team researches and writes original content covering AI tools, model releases, open-source developments, and industry analysis. Our goal is to cut through the noise and give developers, product teams, and AI enthusiasts accurate, timely, and actionable information about the fast-moving AI ecosystem.