On March 27, 2026, Anthropic's most powerful AI model was revealed to the public. Not through a press release, not through a product launch, not through a controlled leak to a favored reporter. Through a misconfigured checkbox in a content management system.

What got exposed

Nearly 3,000 unpublished assets, including draft blog posts, PDFs, images, and details of an invite-only CEO summit in Europe, were sitting in a publicly accessible data store. Fortune was first to report the exposure. Anthropic confirmed it, describing the exposed files as "early drafts of content that were being considered for publication" and attributing the incident to "human error" in its CMS configuration.

The drafts referred to the model by two names, Mythos and Capybara. Two versions of the same blog post appear to differ only in which name they use, suggesting Anthropic had not yet decided between them. The underlying model is the same either way.

According to those drafts, Capybara represents an entirely new tier above Opus, Sonnet, and Haiku. "Capybara is a new name for a new tier of model: larger and more intelligent than our Opus models, which were, until now, our most powerful," the draft stated, as reported by The Decoder. Compared to Claude Opus 4.6, it "gets dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity, among others."

An Anthropic spokesperson confirmed to Fortune that this is real: "We're developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity. We consider this model a step change and the most capable we've built to date." Training is complete. Early-access customers are already testing it.

The cybersecurity dimension

The benchmark numbers matter less than what Anthropic wrote about its own model in those drafts.

The leaked documents describe Claude Mythos as "currently far ahead of any other AI model in cyber capabilities" and warn that it "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders."

Anthropic's own language, in its own draft, frames the model as a threat to the very defenders it's trying to help. That's not marketing hedging. That's a company genuinely uncertain about what it has built.

For context: in February 2026, Claude Opus 4.6 discovered 22 vulnerabilities in Firefox as part of a collaboration with Mozilla. That was Anthropic's most capable model at the time. Mythos, according to the drafts, is dramatically ahead of Opus 4.6 in cybersecurity specifically.

The rollout plan reflects that concern. Anthropic intends to release access gradually, starting with a small group of security-focused organizations. The goal, per the leaked documents, is to give defenders "a head start in improving the robustness of their codebases against the impending wave of AI-driven exploits." The company also confirmed that hacking groups linked to China have already attempted to misuse its existing tools, with one state-backed group using Claude Code to target roughly 30 organizations, including technology and financial companies.

The model is also expensive. The leaked drafts noted it is "very expensive for us to serve, and will be very expensive for our customers to use," meaning initial access will be limited regardless of the security concerns.

The irony

Anthropic's brand is built on exactly the kind of careful, structured approach to AI risk that a situation like this is supposed to prevent. The company publishes safety frameworks. It runs formal risk assessments. It builds approval processes for model deployment. It refused Pentagon requests over two specific safety guardrails and absorbed the political fallout.

And then a misconfigured CMS setting exposed its most sensitive unreleased model to the public internet.

Not a sophisticated intrusion. Not a nation-state attack. Not a disgruntled insider. A content staging environment where draft files weren't marked private. As Igor's Lab noted, the incident revealed "how far along its next model tier apparently already is internally and how cautiously it intends to handle its release" in a single accidental disclosure.
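
To make that failure mode concrete, here is a minimal sketch of how a single visibility default can expose drafts. Everything in it, the endpoint, the field names, the checkbox mapping, is a hypothetical assumption for illustration; nothing here describes Anthropic's actual CMS.

```python
# Hypothetical sketch only: the endpoint, field names, and defaults
# are illustrative assumptions, not Anthropic's actual stack.
import requests

CMS_API = "https://cms.example.com/api/assets"  # illustrative URL

def save_asset(title: str, body: str, private: bool = False) -> dict:
    """Save a draft. Note the dangerous default: unless the 'private'
    checkbox is ticked, the asset is stored as publicly visible."""
    payload = {
        "title": title,
        "body": body,
        "visibility": "private" if private else "public",
    }
    resp = requests.post(CMS_API, json=payload, timeout=10)
    resp.raise_for_status()
    return resp.json()

def list_public_assets() -> list:
    """What an unauthenticated visitor (or crawler) can enumerate."""
    resp = requests.get(CMS_API, params={"visibility": "public"}, timeout=10)
    resp.raise_for_status()
    return resp.json()

# Every draft saved without private=True lands in list_public_assets().
# No intrusion required: the system serves exactly what its one
# misconfigured default tells it to serve.
```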

Cybersecurity stocks reacted immediately. The iShares Expanded Tech-Software Sector ETF (IGV) fell nearly 3% on March 27. The market read the leaked documents the same way the security community did: a model that outpaces defenders is not good news for the companies selling defense.

Why it matters

The pattern here is specific and worth naming. Anthropic withheld Mythos from release because of cybersecurity risk. It designed a careful, phased rollout. It planned to give defenders early access before broader release. Every one of those decisions reflects genuine caution about what the model can do.

All of that preparation was undone by a setting in a publishing tool.

This isn't a story about hypocrisy. Anthropic's safety commitments appear sincere, and the phased rollout plan is more responsible than most labs would attempt. The story is about the gap between institutional intent and operational reality. You can build the most rigorous AI safety framework in the industry and still have your most sensitive model exposed by the kind of mistake that happens when someone forgets to click a toggle.

The question that doesn't have a clean answer: if the company that writes the safety playbook for the entire industry can't secure its own content management system, what does that tell us about the infrastructure holding the rest of this together?

Originally published as an Instagram carousel on @recul.ai.