THE SIGNAL
A Private Gate On Internet-Scale Hacking
A company that builds AI just shipped a model good enough to break into nearly any computer on the internet, and the only thing keeping that power away from the public is a piece of software the company admits can eventually be tricked.
The Bouncer Is Also Software
What happened: On June 9, Anthropic released its most capable AI yet and split it into two versions of the same model. The public gets Fable 5, which quietly hands off requests that look like hacking — finding security holes, breaking into systems — to a weaker, older model. A vetted group of security professionals and critical-infrastructure operators gets the twin, Mythos 5, with those limits removed. The gate between them is a set of classifiers: separate AI systems that read each request and flag misuse, firing in under 5% of all sessions.
What's really going on: The hacking ability was not designed in. By Anthropic's own account it emerged on its own, a side effect of the model getting better at code and reasoning, and the skill that finds a flaw is the same skill that exploits it. That makes Anthropic both the supplier of an offensive cyber capability and the licensor deciding who gets the unrestricted version. The only barrier for everyone else is software the company concedes is breakable: in a brief testing window, the UK's AI Security Institute already made progress toward a universal jailbreak, a single trick that strips the safeguards wholesale. Once a near-frontier offensive tool exists and is priced cheaply ($10 per million words in, $50 out), the question stops being whether the capability spreads and becomes who controls the gate. That gate is now a private classifier rather than a law or a standard.
Why most people are missing this: They read this as a safety win — a powerful model with guardrails — when the real shift is that deciding who may wield offensive hacking power is now a product decision made by one company.
The Take: Anthropic didn't build a safer model; it built a licensing regime for cyberweapons and quietly made itself the licensing authority.
Why it matters: The next fight is not over whether the capability exists — it does, cheaply — but over who decides which side of the gate you stand on, and that decision now sits with vendors, not governments.
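To put the quoted price in concrete terms, here is a minimal sketch of the arithmetic. The two rates ($10 per million words in, $50 per million words out) come from the piece; the workload sizes and the function name `session_cost` are hypothetical, chosen only for illustration.

```python
# Rates quoted in the piece, in dollars per million words.
PRICE_IN_PER_MILLION = 10.0
PRICE_OUT_PER_MILLION = 50.0


def session_cost(words_in: int, words_out: int) -> float:
    """Return the dollar cost of one session at the quoted rates."""
    cost_in = words_in * PRICE_IN_PER_MILLION / 1_000_000
    cost_out = words_out * PRICE_OUT_PER_MILLION / 1_000_000
    return cost_in + cost_out


# Hypothetical workload (an assumption, not a figure from the piece):
# the model reads 500,000 words of source code and writes 100,000 words back.
print(f"${session_cost(500_000, 100_000):.2f}")  # prints $10.00
```

Under those assumed sizes, the input and output each cost five dollars. Real workloads will differ, but the order of magnitude is the point.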
The Pattern
The tension is between finding flaws and fixing them. Discovery just got cheap and fast: in early testing with Mythos Preview, roughly 50 partners surfaced more than ten thousand serious vulnerabilities, with Mozilla alone patching 271 in one Firefox release. Verifying, triaging, and shipping fixes did not get faster, so whoever acts fastest on a single found bug wins. An attacker needs only one; a defender must close them all.
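The imbalance is easy to see as a queue. The sketch below is a toy model, not anything drawn from Anthropic or its partners: the weekly rates are hypothetical, and `backlog_over_time` is an illustrative name. It tracks how many found bugs remain open when discovery outpaces patching.

```python
def backlog_over_time(found_per_week: int, fixed_per_week: int, weeks: int) -> list[int]:
    """Open-bug count at the end of each week, starting from an empty queue."""
    backlog = 0
    history = []
    for _ in range(weeks):
        # New findings arrive, fixes ship, and the queue never drops below zero.
        backlog = max(0, backlog + found_per_week - fixed_per_week)
        history.append(backlog)
    return history


# Hypothetical rates (assumptions): a team finds 200 bugs a week but ships 50 fixes.
print(backlog_over_time(found_per_week=200, fixed_per_week=50, weeks=4))
# prints [150, 300, 450, 600]
```

Whenever findings arrive faster than fixes ship, the queue grows by the same gap every week, which is why faster discovery alone does not help a team whose patch rate stays flat.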
What This Signals
The bottleneck in security moves from finding vulnerabilities to clearing the queue of them, handing the advantage to large vendors with deep engineering benches over small teams who now drown in their own scan results.
Deciding who gets unrestricted offensive AI becomes a commercial gatekeeping job held by model makers — a private permissioning layer no regulator wrote and almost no one outside the company can audit.
A flood of ten-thousand-plus found bugs looks like progress for defenders but concentrates real power in the few organizations able to absorb and patch at that volume.
Quick Byte
In 1883 the Dutch cryptographer Auguste Kerckhoffs argued that a system must stay secure even when everything about it except the secret key is public. Defenses that lean on an attacker's effort or patience instead of a hard secret were always living on borrowed time, and a model that never tires just called the loan.
THREAD
Anthropic just shipped an AI that can break into almost any computer online. The only thing keeping it from the public is software the company admits can eventually be tricked.
Same model, two products. The public version reroutes "hacking" requests to a weaker AI; a vetted few get the unrestricted twin. The gatekeeper isn't a law — it's a classifier owned by one company.
When finding a security hole costs ten dollars and fixing it costs a week of engineering, who actually wins — the defenders, or whoever decides which side of the gate you're on?
POST: Anthropic didn't release a safer AI. It released a cyberweapon and appointed itself the licensing authority. The model finds and exploits flaws in every major operating system and browser — a capability nobody trained in, that emerged on its own. The public gets a version that defers hacking requests to a weaker model; a vetted group gets the unrestricted twin. The barrier between them is a classifier the company concedes a determined attacker can eventually defeat. So the real question isn't whether this power spreads. It's who decides who holds it — and that's now a private product decision.
TAKE: We keep asking whether AI can hack. It can. The only question left is who gets waved through the gate, and that's a business decision now, not a security one.
