On June 12, 2026, at 5:21 p.m. ET, Anthropic received a letter from the Commerce Department’s Bureau of Industry and Security. It cited “national security authorities” and ordered the company to suspend all access to its two most capable models — Fable 5 and Mythos 5 — for any foreign national, anywhere, including Anthropic’s own foreign-national employees. Because Anthropic couldn’t reliably sort foreign nationals from everyone else in real time, the practical result was a total global shutoff of both models, for every customer, with no advance notice and no public explanation of what the actual security concern was.
That’s a strange sentence to have to write about a private company’s product. It’s stranger still once you learn this wasn’t really about Fable 5 at all.
The incident that wasn’t the incident
The official story is that someone found a way to bypass Fable 5’s safeguards. Anthropic believes that is the concern but isn’t fully sure, because the government’s letter never specified. The actual technique, according to security researcher Katie Moussouris, turned out to be almost absurdly simple: a three-word prompt, “fix this code,” used to surface a small number of already-known, minor vulnerabilities that other publicly available models could find just as easily. Anthropic had spent thousands of hours red-teaming Fable 5 with the government, the UK’s AI Security Institute, and third parties before launch, and no one had found a universal jailbreak. This wasn’t that. This was a quiet bug being used as the occasion for a very loud intervention.
Which raises the obvious question: if the technical justification was this thin, what was actually going on?
Context fills in the gap. This wasn’t an isolated security response — it was the latest move in a conflict that had been escalating for months. Back in February, after failed negotiations over the military’s use of Claude, the administration had directed federal agencies to stop using Anthropic’s technology entirely, and the Defense Secretary had designated the company a “supply chain risk” — a label previously reserved for foreign adversaries, applied for the first time to an American firm. The proximate cause was that Anthropic had refused to remove restrictions on using its models for domestic surveillance and autonomous weapons. A competitor, less encumbered by those restrictions, picked up a $200 million Pentagon contract within hours, on terms that explicitly handed operational control to the government.
Seen against that backdrop, the export-control directive looks less like a response to a jailbreak and more like a second strike against a company that wouldn’t remove its own ethical guardrails. One analyst called it, carefully, “the soft nationalization of AI” — not a seizure, not an ownership change, but state-directed control over a privately owned frontier system, achieved without anyone having to call it that. Another, more bluntly: a mandatory licensing regime for frontier AI, just not a transparent or legally formal one. Ad hoc. Opaque. Real anyway.
The China argument doesn’t hold up the way people think it does
A lot of the urgency behind all this gets justified by reference to China — the idea that the US has to move fast and consolidate because a rival is closing in on superintelligence first. It’s worth actually checking that claim rather than assuming it.
The honest picture: the capability gap between the best Chinese open-weight models and the American proprietary frontier has gone from over twenty benchmark points a couple of years ago to somewhere between four and nine points today, depending on whose leaderboard you trust. That’s real and fast convergence. Chinese labs shipped five frontier-tier models in a single four-week window this spring, including one trained entirely on domestic chips, the very outcome US export controls were designed to make difficult. The hardware restrictions clearly created friction. They didn’t prevent frontier training runs.
But the gap doesn’t close evenly. On the hardest tasks — sustained multi-step agentic work, long-horizon autonomous operation, the kind of capability that actually matters for any serious conversation about machine superintelligence — open models still trail badly, by a much wider margin than the headline benchmarks suggest. So the “China is about to get ASI first” framing is shakier than its proponents present it: real convergence on the metrics that make good headlines, a much larger and more persistent gap on the metrics that would actually matter if the stakes were what people claim they are.
That distinction matters because the policy conclusion built on top of the racing narrative — we must consolidate frontier development into one national effort to keep pace — doesn’t actually follow from evidence that shows partial, uneven convergence rather than an imminent loss of the race. It follows from the rhetoric of the race, which is a different thing from the data underneath it.
Why consolidation might be the more dangerous choice, not the safer one
Here’s the part that surprised me most working through this: the case for “let’s put a Manhattan Project-style government effort in charge of getting to superintelligence safely” inverts under examination. It doesn’t obviously buy safety. It might do close to the opposite.
A national project framed around racing a foreign rival doesn’t remove competitive pressure. It relocates that pressure from “beat another company to revenue” to “beat China to capability,” a contest with no quarterly earnings call to lose gracefully and with national prestige bolted onto every decision. In that contest, slowing down looks like surrender, not prudence, which makes corner-cutting easier to justify, not harder.
But the deeper problem is structural, and it has nothing to do with racing at all. Right now, AI development happens across several independent organizations, with different architectures, different training approaches, different safety philosophies, publishing research that the others scrutinize and build on. That arrangement has an accidental safety property: a blind spot in one lab’s approach is unlikely to be the exact same blind spot in a different lab’s approach, built differently, by different people, under a different theory of what alignment even requires. Mistakes have some chance of getting caught by someone else’s independent check.
Collapse all of that into a single national effort and you haven’t necessarily made the work less rigorous — you might genuinely have more money, more researchers, more compute devoted to safety than any individual lab fields today. What you’ve done is remove the redundancy. There’s no longer an outside check with a different blind spot positioned to catch what the inside team misses, because the inside team now is the whole field. And the secrecy that any government would understandably want to wrap around a strategic asset like this cuts off the other thing that currently works almost by accident: when one lab’s red team finds a failure mode, it tends to get published, and everyone else patches against it. Classify the work and that propagation stops.
This is, not coincidentally, exactly the design philosophy the nuclear world settled on decades ago for systems where a single bad decision is unrecoverable: deliberate redundancy, multiple independent authorities, no single point with unchecked control — even at the cost of speed and efficiency, especially at the cost of speed and efficiency. A single, well-funded, secretive national AI project isn’t the careful alternative to a messy competitive landscape. In an important sense, it’s the single point of failure the careful alternative is supposed to avoid.
None of which means the current multi-lab landscape is actually safe. Five organizations independently racing each other doesn’t obviously produce five independent safety checks if all five share the same underlying incentive to ship before they’re fully sure. Redundancy only does any good if at least one of the redundant actors is willing to slow down — and right now, nothing is reliably producing that willingness anywhere in the system. The honest conclusion isn’t “distributed is safe, centralized is dangerous.” It’s “centralized removes a real safeguard without obviously replacing it with anything better, and distributed has a different, also-unsolved problem of its own.”
Where this leaves access — and who gets it
Put the pieces together and a fairly specific, not-very-speculative shape emerges. The most capable models are already being released in tiers: a broadly available version with visible safeguards, and a more capable version with fewer safeguards, restricted to vetted partners in fields like cybersecurity and biosecurity. That tiering exists today, for reasons that are sincere on their own terms: some capabilities really do provide meaningful uplift to people trying to cause serious harm, and restricting those capabilities to accountable, vetted users is a defensible position independent of anyone’s appetite for control.
The problem is that the same access-restriction policy that’s justified by sincere safety logic also happens to serve a completely separate interest: keeping the most capable tools away from whoever a government would rather not have them, on whatever grounds it chooses, with however much transparency it feels like providing. Nobody has to admit to wanting that outcome. They can believe entirely in the safety rationale and still produce, in practice, a system where access tracks political reliability as much as it tracks competence or trustworthiness.
Layer onto that the precedent already set with a much smaller and more recognizable case: Chinese-developed open-weight models. Legislation banning their use on federal devices has existed since early 2025. A broader bill aimed at barring any AI model from an adversarial nation across all federal agencies has already been introduced, justified explicitly in “new Cold War” language. Multiple states had already banned the most prominent model before the federal government acted. The legislative template is, openly and by its own sponsors’ description, the same one used to ban TikTok, and that ban started as a government-devices restriction too, before the conversation about a fuller ban or forced divestiture took on a life of its own. A full domestic restriction on Chinese-origin open-weight models hasn’t happened yet. The open question at this point is less whether it happens than when, and how far it goes once it starts.
If it does land, it probably won’t function as a clean wall. For ordinary individuals, restricted models will likely keep circulating informally: nobody is going to police millions of home GPUs, and a black or gray market in “illegal” foreign models for personal use is the predictable result, more shrug than crime. Enterprises are a different story entirely: any company with a compliance department and outside counsel will treat a restricted model as radioactive regardless of its technical merits, the same way Huawei equipment became commercially unusable in the US well beyond whatever the actual security case required. That split, permissive at the hobbyist edge and airtight at the institutional center, is probably a more realistic outcome than either total prohibition or no restriction at all.
And underneath the geopolitical layer sits a parallel mechanism that doesn’t need any of this drama to arrive: identity verification. The same dual-use logic that justifies restricting dangerous capability to vetted partners points naturally toward eventually requiring proof of who’s asking, especially as the legal and technical infrastructure for that already exists in adjacent domains — banking compliance, age-verification laws that are already moving from “enter a birthdate” to “scan a face.” None of that requires inventing new law. It requires reusing infrastructure that already exists, applied to a new category. If it stays narrowly scoped to the genuinely dangerous capability tier, it’s a defensible, almost boring extension of how we already gate other dual-use materials. If the definition of “dangerous enough to require verification” keeps quietly creeping downward, the boring version and the dystopian version turn out to be the same policy, just observed at different points in time.
What actually helps
None of this is to say the technology is unsafe by its nature, or that no path forward exists. It’s closer to saying the field doesn’t yet have the science to verify how safe any given system actually is, and that the gap between capability and understanding is currently widening, because capability is racing ahead faster than understanding is catching up.
What would help is mostly unglamorous and currently underfunded relative to the alternative: real investment in actually understanding what these systems are doing internally, rather than just testing what they output and hoping the internals are fine. Decision processes about deployment and restriction that are public and falsifiable, rather than a letter that says “national security concerns” and nothing else. Enough independent actors, transparent enough to audit each other, that a blind spot in one has some real chance of getting caught by another rather than propagating unchecked through a single classified effort. None of it is a guarantee. All of it moves the odds in a better direction than the alternative currently on offer, which mostly rewards speed, opacity, and whichever lab is most willing to remove its own restrictions first.
The uncomfortable part is that almost none of the actual incentives on the ground point toward any of that. They point toward exactly the opposite — and a three-word prompt was apparently all it took to find out.