
Anthropic Built an AI So Good at Hacking They Won't Release It


Anthropic just did something unusual: they built their most powerful model yet, realized it was terrifyingly good at finding security holes, and decided NOT to release it to the public.

Instead, they created Project Glasswing — a walled garden where Apple, Google, Microsoft, CrowdStrike, and about 40 other companies get access to Claude Mythos Preview for defensive security work. Everyone else? You're on the waitlist.

This is worth paying attention to, not because of the security implications (though those are wild), but because of what it means for the gap between big companies and the rest of us.

What Claude Mythos Actually Found

Here's the headline number: thousands of zero-day vulnerabilities across every major operating system and every major web browser. Not theoretical weaknesses — actual exploitable holes that nobody knew about.

The kicker? Claude Mythos Preview wasn't even trained for cybersecurity. It's a general-purpose model. It just got so good at reasoning about code that finding security vulnerabilities fell out naturally. Think about that for a second. They didn't build a hacking tool. They built a thinking tool, and hacking turned out to be a side effect of thinking well enough about code.

Anthropic committed $100 million in compute credits to Project Glasswing. When (and if) Mythos Preview becomes more widely available, the pricing suggests this won't be cheap — $25 per million input tokens and $125 per million output tokens. For context, that's significantly more expensive than current Claude models.
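To make those rates concrete, here's a back-of-the-envelope cost sketch. Only the per-token prices come from the announcement; the token counts and function name are made up for illustration:

```python
# Rates quoted in the post: $25 per million input tokens, $125 per million output tokens.
INPUT_RATE = 25.0 / 1_000_000    # dollars per input token
OUTPUT_RATE = 125.0 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single request at the quoted rates (illustrative only)."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical example: an ~80k-token slice of a codebase in, a 10k-token report out.
print(round(request_cost(80_000, 10_000), 2))  # → 3.25
```

At those rates, a single deep scan of a mid-sized codebase costs a few dollars, and running it continuously across a large dependency tree adds up fast, which is part of why this starts out as an enterprise product.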

The "Too Dangerous to Ship" Precedent

We've heard the "too dangerous to release" line before. OpenAI pulled it with GPT-2 back in 2019, and that was widely seen as a marketing play. The model was eventually released and the world didn't end.

This feels different. When your model is casually discovering thousands of zero-days in Windows, macOS, Chrome, and Firefox, the dual-use problem isn't theoretical. Anything that can find vulnerabilities can also exploit them. The same capability that lets a defender patch a hole lets an attacker walk through it.

Anthropic's approach — controlled release to defenders first, broader access later — is one of the few times an AI safety decision has felt genuinely justified rather than performative. They brought in CrowdStrike, Amazon Web Services, Cisco, JPMorgan Chase, Nvidia, and roughly 40 other organizations responsible for critical software infrastructure. The idea is simple: let the defenders patch the holes before the model is available to anyone who might exploit them.

Whether that firewall holds is another question entirely.

What This Means If You're Building Solo

Let's bring this back to earth. If you're a solo developer or indie builder, here's what this actually changes for you:

The security gap just got wider. Big companies now have access to an AI that finds vulnerabilities the rest of the industry doesn't know exist. Your apps run on the same software — same OS, same browsers, same libraries — but you don't get the heads-up when Mythos finds a hole in something you depend on. You'll get the patch eventually, but there's a window where enterprise teams know about a vulnerability and you don't.

Your dependencies are the attack surface. Every npm package, every Python library, every framework you're using has undiscovered security issues. That's been true forever, but Mythos just made it concrete. Thousands of zero-days across major software means the stuff built on top of that software inherits the risk.

The practical response is boringly simple. You can't access Mythos, and you probably won't for a while. But you can do the things you've been ignoring: run npm audit and actually fix the warnings, keep your dependencies updated, use Dependabot or Renovate to automate version bumps, don't roll your own authentication, and enable 2FA on everything.
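The Dependabot item on that list is a one-time setup. A minimal config sketch, assuming a GitHub-hosted project using npm (the filename and fields are standard Dependabot; swap the ecosystem to match your stack):

```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "npm"   # also supports "pip", "cargo", "gomod", etc.
    directory: "/"             # where your package.json lives
    schedule:
      interval: "weekly"       # batch version-bump PRs once a week
```

Commit that file and GitHub starts opening update PRs automatically; the weekly schedule keeps the noise manageable.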

None of that is new advice. It's just more urgent now.

The Bigger Picture

Mythos is a preview of something the industry hasn't fully reckoned with: AI models genuinely capable enough to be dangerous in specific domains. Not in a sci-fi way, but in a practical "this can find exploits faster than any human team" way.

For solo operators, the strategic question is whether this capability trickles down. Will there be affordable AI security audit tools for indie devs? Will cloud providers bake Mythos-level scanning into their hosting platforms? Or will security become another axis where scale gives you a massive, compounding advantage?

My bet is it trickles down eventually — it always does with developer tooling. But the lag between "big companies have this" and "everyone has this" is a window where small teams are more exposed than usual.

In the meantime, go run your audit commands. Seriously.
