Anthropic, Fable 5 and Mythos 5 Blocked by the US Government Over a Jailbreak

A few hours ago, while I was doing the usual Saturday morning things - meaning cleaning up my codebase, obviously - a letter landed at Anthropic. Not just any letter: a directive from the United States government that, at 5:21 PM Eastern Time, ordered them to shut off access to Fable 5 and Mythos 5. Immediately. For all foreign nationals, inside and outside the US - including non-US Anthropic employees.

Translated: if you do not have an American passport, those two models no longer exist for you.

And guess who does not have an American passport? Exactly. Me, you, and a few hundred million people in Europe.

What happened, in plain English

Imagine you run a wildly popular pizza place. One day City Hall sends you a certified letter: “Starting tonight, you may no longer serve margherita and quattro stagioni to anyone who was not born in this country. Reason: national security.” You open the envelope, read it, sigh, and remove two items from the menu. All while thinking: “Are we seriously doing this?”

That is, more or less, what happened to Anthropic.

The US government, invoking export control powers, imposed a total suspension of access to the two models. Anthropic complied - because when a legal order lands on your desk, you do not exactly have a menu of alternatives - but in its official statement it made one thing pretty clear: it does not agree with a single comma of the decision. Access to all other models, to be clear, remains untouched.

The official reason? The government says it became aware of a method for jailbreaking Fable 5.

Wait, what is a “jailbreak”?

If the word vaguely reminds you of “jailbreaking an iPhone”, you are not far off. A jailbreak for an AI model is when someone finds a way around the model’s safety rules - the rules that stop it, for instance, from giving you instructions for dangerous or illegal things.

Think of the model as a very polite, very rigid librarian. You ask for a forbidden book, and it politely tells you no. A jailbreak is the art of asking the question in such a twisted way that the librarian gives you the book anyway without realizing it.

There are two kinds of jailbreaks:

Non-universal: they only work under specific circumstances, in limited cases. They are the daily bread of anyone studying model security.
Universal: the master key. One single trick that unlocks everything. Those would be a serious problem.

And that is where things get interesting.

Anthropic’s version: “are we kidding right now?”

Anthropic more or less dismantled the justification piece by piece, and I have to say, the arguments look pretty hard to punch holes in.

First: the alleged jailbreak is not universal. It is a very narrow thing that, stripped down to basics, amounted to asking the model to read source code and fix software flaws it found there. In other words, as Anthropic itself put it, something any developer does every single day to keep systems secure.

Second - and this is where I smiled - the vulnerabilities it identified were reportedly “relatively simple”, to the point that other publicly available models find them on their own, including OpenAI’s GPT-5.5. So the “secret and dangerous method” would apparently be something the competition’s model can also do without any special trick.

Third, and this is the strongest point: before launch, Anthropic says Fable’s defenses were tested for thousands of hours by the US government, the UK’s AISI, third-party organizations, and internal teams. According to them, nobody found a universal jailbreak. And their claim is simple: perfect jailbreak resistance does not exist today for any model, from any provider. If the standard becomes “pull the model the moment someone finds a minor hole”, then everyone would have to shut up shop - OpenAI included.

Their approach is called defense in depth. They are not trying to make the model invulnerable - impossible - but to make jailbreaks either so narrow they are useless, or so expensive they are not worth the trouble, all while monitoring the system closely enough to catch and block attacks in real time. It is also why Fable keeps data for 30 days: to study and patch weaknesses. A choice that, they openly admit, comes with a real cost for customers.

In short, Anthropic’s position is: “You took a water pistol and treated it like a weapon of mass destruction.”

This is not the first time dishes have been flying

This is where it turns political. Very political.

This is not the first clash between Anthropic and the Trump administration. Back in February, after Dario Amodei - CEO and co-founder - opposed the use of his technology for certain defense-related purposes, Trump reportedly ordered federal agencies to stop using Anthropic products immediately. The whole thing came with a memorable Truth Social post in classic “We don’t need it, we don’t want it, and we won’t be doing business with them anymore!” style. Anthropic, in response, announced legal action against the government, which had meanwhile also branded the company a “supply chain risk.”

So yes, there has been bad blood between the two sides for a while. And that, honestly, makes it hard to take the national security directive at 100 percent face value - just as it makes it hard to rule out that Anthropic is downplaying the issue for its own convenience. The truth, as usual, probably sits somewhere in the middle, in that gray zone where technology and politics blur together until they become almost impossible to separate.

And what about us in Europe? This is a sovereignty issue

There is one detail in this story that should make Europeans pay attention, and commenters picked up on it immediately: an AI model is now a matter of national sovereignty.

Think about it. With a single letter, a foreign government shut down two work tools for hundreds of millions of people outside its borders. Not because of a war, not because of some spectacular economic sanction, but because of an alleged bug in an artificial intelligence system.

If your company, your product, your workflow depends on a model running on American servers and subject to decisions made in Washington, then you are always one press release away from being stranded. And there is not much you can do about it.

That is why the whole conversation about Mistral in France, European models, and open source models you can self-host suddenly stops sounding like a debate for ideological nerds and starts sounding like a very boring question of operational continuity. It is not (just) digital patriotism. It is refusing to hang your work from a thread that somebody else can cut whenever they feel like it.

My take

I have been messing around with local models for a long time - anyone who follows me knows that, between Loom, Doc Analyzer, and the many times I installed things locally “just so I would not depend on anyone.” And every single time, there was a little voice in my head saying: “Matteo, why are you doing this to yourself? The APIs are cheap and they work.”

Well: this story is exactly why.

Not because local models are always better - they are not. On many tasks, a solid API model will flatten your desk-side gemma4:28b in ten seconds flat. But resilience is a feature, and you only notice it when you suddenly need it. The day your provider disappears - because of a government directive, a pricing change, or a political fight that has absolutely nothing to do with you - that little model running on your own server suddenly becomes the most valuable thing you have.

There is also a second layer here that leaves me uneasy, and at this point I will happily take off my local-first fanboy hat for a second: shutting down a commercial model used by hundreds of millions of people over a narrow jailbreak, one that competitors can apparently do too, really does look like an overreaction. If that became the standard, every new model could end up blocked the moment the first researcher finds a hole - and in this field, there are always holes. It is a bit like shutting down every highway because someone discovered you can drive the wrong way on them.

That said - and yes, I am consciously keeping a foot in both camps here - I am not one of the people who think model safety is some trivial detail you can wave away with a shrug. Between the company minimizing the issue and the state reacting in an opaque way, the real loser is always the same person: the one who uses those models to work and ends up reading press releases instead of writing code.

Anthropic says it is working to restore access as soon as possible, and that this is all a “misunderstanding.” Let us hope so. In the meantime, I am going to make sure my home server is still on.

You never know.

Posts

Tags

Categories

Authors