The Mythos split - and what it means when one company decides who gets to see digital holes

Anthropic launched Claude Fable 5 yesterday and called it its most powerful generally available model. That's the headline. The less visible fact is what it launched alongside: Claude Mythos 5, which is the same model minus the safety rails on cybersecurity.

The difference between the two is that Fable 5 uses classifiers to detect when a user asks about cybersecurity topics and falls back to a weaker model. Mythos 5 doesn't. And Mythos 5 isn't available to the public - only to vetted users who need it.

So what Anthropic has built is not really a "safe" version of its most powerful model. It has built a two-tier system. The public gets the capped model. A private list of partners gets the uncapped one. And Anthropic itself decides who makes that list.

What actually changed

When Anthropic released Mythos Preview in April, it disclosed that the model could identify and exploit zero-day vulnerabilities in real-world software. That raised alarms. Cybersecurity experts warned that AI-driven vulnerability discovery could outpace the patch cycle. The White House started reconsidering pre-release oversight for high-risk AI models.

Now, two months later, Anthropic has drawn a line. You can use Fable 5 to write code, do knowledge work, and analyze images at Mythos-level capability. But if the model's classifiers think you're asking something with security implications, it quietly routes your request to Opus 4.8, a less capable model. On cybersecurity benchmarks, Fable 5's public score drops to essentially the same level as Opus 4.8, because that's exactly where the fallback kicks in.
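The routing described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the placeholder model labels, the 0.5 threshold, and the keyword-based scorer are all hypothetical, and none of it reflects how Anthropic's classifiers are actually built.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RoutedResponse:
    model: str       # which model label handled the request
    text: str        # stand-in for the model output
    fell_back: bool  # True if the request went to the weaker model


def route(
    prompt: str,
    security_score: Callable[[str], float],
    threshold: float = 0.5,           # hypothetical cutoff
    primary: str = "primary-model",   # placeholder for the stronger model
    fallback: str = "fallback-model", # placeholder for the weaker model
) -> RoutedResponse:
    """Send the prompt to the fallback model when the classifier score
    meets the threshold, otherwise to the primary model."""
    use_fallback = security_score(prompt) >= threshold
    model = fallback if use_fallback else primary
    # No real model is called here; the text is only a labeled stub.
    return RoutedResponse(model, f"[{model}] response to: {prompt}", use_fallback)


def keyword_score(prompt: str) -> float:
    """Toy scorer (an assumption): 1.0 if a flagged word appears, else 0.0."""
    flagged = ("exploit", "vulnerability", "zero-day")
    return 1.0 if any(word in prompt.lower() for word in flagged) else 0.0


if __name__ == "__main__":
    for p in ("Write a sorting function", "Find a zero-day in this binary"):
        r = route(p, keyword_score)
        print(r.model, r.fell_back)
```

The point of the sketch is that the user sees a single endpoint while the decision about which model answers happens behind it, which is why a fallback can go unnoticed.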

Meanwhile, Mythos 5 - the same model without the guardrails - goes to partners Anthropic has vetted.

The real question isn't safety. It's gatekeeping.

I'm not going to pretend this is straightforward. Anthropic has a genuine argument: the company says that without safeguards, Fable 5's cybersecurity capabilities could be misused to cause serious damage. The risk is real. A model that can find and exploit unknown vulnerabilities in production software is a tool that needs boundaries.

But the structural question is who draws those boundaries and under what accountability. Anthropic is a private company deciding which institutions - governments, enterprises, "critical partners" as it calls them - get access to what is effectively the most powerful automated vulnerability scanner in existence. It says it will work with the U.S. and allied governments to manage expanded access. That's a coalition model, not a regulatory framework.

Compare that with how traditional vulnerability research works. Security researchers find bugs, they report them to vendors, there's a disclosure timeline, and patches ship. It's messy and slow, but there's at least a public process. AI changes the timing: vulnerabilities can be identified suddenly, without vendor awareness, and without fixes available. That's what made Mythos Preview so disruptive to the industry's assumptions.

But the two-tier architecture also means something subtler. The gap between the model everyone has and the model a few people have creates an information asymmetry that is, by definition, a power asymmetry. Whoever holds the uncapped model can see vulnerabilities that no one else can find yet. The defensive use case - finding your own bugs before attackers do - is the stated purpose. But the same capability is also offensive. The guardrails Anthropic built into Fable 5 don't change the capability; they restrict who can use it.

The classifiers are the system

There's also a practical detail worth sitting with. According to Anthropic's own system card, the cybersecurity classifiers are "super aggressive and sensitive" - so much so that they trigger on benign, non-security coding tasks. People running ordinary development workflows are getting routed to the weaker model without realizing it.

That matters because it means the boundary between the public model and the restricted one is porous and unpredictable. You might think you're using a Mythos-class model for everything, but on certain queries, you're not. The leash is tighter than the marketing suggests.
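To make the "tighter leash" concrete, here is a small sketch of how a more sensitive cutoff sweeps in more benign requests. The scores below are invented for illustration, and the function name and thresholds are hypothetical; nothing here is drawn from Anthropic's system card or real classifier data.

```python
from typing import Sequence


def benign_fallback_rate(scores: Sequence[float], threshold: float) -> float:
    """Fraction of benign prompts whose classifier score meets the
    threshold, i.e. the share that would be routed to the weaker model."""
    if not scores:
        return 0.0
    routed = sum(1 for s in scores if s >= threshold)
    return routed / len(scores)


# Made-up classifier scores for eight benign coding prompts (illustrative only).
BENIGN_SCORES = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80]

if __name__ == "__main__":
    # A lower threshold means a more aggressive classifier.
    for threshold in (0.7, 0.5, 0.3):
        rate = benign_fallback_rate(BENIGN_SCORES, threshold)
        print(f"threshold={threshold}: {rate:.1%} of benign prompts fall back")
```

With these invented numbers, lowering the threshold from 0.7 to 0.3 raises the share of benign prompts sent to the weaker model from one in eight to five in eight. The real rates are unknown from the outside, which is the problem the paragraph above describes.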

For investors and users trying to evaluate whether Fable 5 actually delivers the value Anthropic is promising, that's the hidden constraint. The model is powerful everywhere except the domain where its most distinctive capability lives - and that's by design.

What this looks like from outside the US

If you're in a frontier market where cybersecurity infrastructure is thin, this two-tier model doesn't land as a safety decision. It lands as a question of access. Companies and governments that aren't on Anthropic's partner list will never have Mythos-level scanning capability, while allied governments will. The defensive gap widens the same way the offensive gap does.

I've been tracking how permissionless systems try to solve exactly this kind of asymmetry - the idea that tools for defending your own systems shouldn't require the approval of a Silicon Valley company. Whether that idea can actually work at this level of capability is still an open question. But the structure Anthropic has chosen makes the gap more visible.

Where we go from here

The question now is whether this two-tier model becomes the template. If other companies follow Anthropic's lead - shipping public models with safety-classifier fallbacks while reserving uncapped versions for vetted partners - the industry normalizes private gatekeeping of high-risk capability. If the White House or other regulators move to formalize pre-release review or licensing, the gatekeeping becomes public rather than corporate, which changes the accountability structure even if the capability restriction doesn't.

Neither outcome is obviously better. Private gatekeeping moves faster but answers to fewer people. Public gatekeeping is more legitimate but slower and more politically shaped. The fact that we're still figuring out which one we want is itself a sign of how fast the underlying technology has outpaced the institutional response.

What I'm watching: whether the aggressive classifiers in Fable 5 get tuned more leniently over time (eroding the distinction between the two models), or whether the partner list for Mythos 5 stays small and exclusive, keeping the power asymmetry intact. The answer to that will tell us whether this is a temporary containment strategy or a durable architecture for how the industry manages its most dangerous tools.