The Distillation- Vincent Ragosta

The Distillation

Date: 06/25/2026

6–8 minutes

Anthropic told the United States Senate this week that operators tied to Alibaba and its Qwen AI lab had run the largest model-distillation campaign it has ever publicly disclosed: roughly twenty-five thousand fraudulent accounts, used to push more than twenty-eight million exchanges through Claude over six weeks, in what the company describes as a coordinated effort to extract its model’s capabilities. Distillation is the practice of training a weaker model on the outputs of a stronger one — collecting the expensive model’s answers and using them to teach a cheap imitator to behave the same way, capturing much of the capability at a fraction of the cost. Alibaba denies wrongdoing. Senators are already moving to sanction such campaigns. And the accusation, stripped of its geopolitics, describes the frontier labs’ deepest and least fixable vulnerability: that the product they sell is also the lesson that copies it.

The Model That Teaches Its Copy

To understand why distillation cannot simply be patched away, you have to see what the frontier model’s outputs actually are. Every answer the model gives is the product — the thing the customer pays for — and it is, at the same time, a perfect demonstration of how the expensive model behaves on that input. Collect enough of these demonstrations, across enough inputs, and you have a detailed record of the model’s judgment, its style, its reasoning, its competence: a dataset that captures not the model’s weights but its behavior, which for the purpose of building an imitator is most of what matters. The billion-dollar training run produces a model; the model produces answers; and the answers, gathered at scale, let a rival train a cheaper model to give the same answers.

The trap is that the labs cannot stop emitting the outputs without ceasing to sell the product, because the outputs are the product. A bank can lock its vault because the money is not the service; the service is around the money. But a model’s vault and its service are the same thing — the answer is both the value delivered and the secret leaked, handed to the customer in the identical act. There is no way to provide the capability while withholding the demonstration of the capability, because the demonstration is the provision. Every sale is a lesson. Every paying customer is, in principle, a student, and the labs cannot tell which students are there to use the answers and which are there to learn from them, because both submit the same queries and receive the same replies.

This is why the scale of the alleged campaign matters less than its method. Twenty-five thousand accounts and twenty-eight million exchanges is an industrial harvest, but the harvest works because of what is being harvested — not stolen weights, not breached servers, but the ordinary, sanctioned, paid-for outputs of the public service, gathered in volume. The defense against this is not a better firewall, because nothing was breached; it is a way to sell the model’s intelligence without revealing the model’s intelligence, and no such way exists. The labs are not the victims of a hack. They are the victims of their own business model, which requires them to give away, with every answer, a sample of the thing the answer is made of.

The Other Road Around the Wall

Read against the geopolitics, the campaign is the second path around the export controls, and the cheaper one. When the United States moved to restrict the chips and the models, the predictable result was that the denied party would build its own — and distillation is the least expensive way to build it, because it does not require acquiring the frontier model’s hardware or stealing its weights. It requires only buying access to the service, like any customer, and learning from what the service returns. The embargo bars the front door, the chips and the weights, the things the controls were written to stop. Distillation walks in through the service itself, because the service is, by necessity, a continuous stream of the very behavior the embargo meant to deny.

So the controls and the distillation are two halves of a single dynamic. The wall keeps out the weights and lets out the lessons, and the lessons may be enough, because a fast follower does not need the original model — it needs a model that performs comparably on the tasks that matter, and the outputs of the original are the most direct possible teacher of that performance. The harder the embargo squeezes the front door, the more valuable the road through the service becomes, and a campaign of twenty-five thousand accounts is simply that road being driven at industrial scale. You cannot embargo a capability that leaks through every answer the embargoed capability gives, and the answers must be given, or there is no product to embargo.

The policy response now forming — sanctioning the distillers, blacklisting the entities — is the same move that failed with the chips, applied one layer down. The chip controls tried to forbid the acquisition of hardware and spurred the domestic hardware they meant to prevent; the distillation controls will try to forbid the copying of behavior, which is technically trivial and economically inevitable, and will spur the same domestic capability by the same logic. You can make distillation illegal. You cannot make a good answer stop being a lesson, because being a lesson is what a good answer is, and the model cannot withhold the lesson without withholding the answer, and withholding the answer is going out of business.

What This Means

Distillation is the frontier labs’ deepest structural weakness because it attacks the economics that justify them, not merely their security. The cost of producing a frontier capability is enormous and front-loaded — the training run, the compute, the scarce and expensively retained talent. But once produced, the capability does not stay locked inside the weights; it leaks, in usable form, through every answer the model gives, and a fast follower can capture a large share of it for the price of an API bill. The moat the labs spent billions to dig is undermined by the single act of monetizing it, because monetizing it means emitting the outputs, and the outputs are the moat dissolving into the hands of whoever collects them.

This is the fact the trillion-dollar valuations have to assume away, and the Alibaba accusation is the fact asserting itself. Those valuations rest on the premise that the frontier lead is durable — that having the best model this year means owning a defensible advantage next year. Distillation says the lead is perishable by construction, that the better the model the more valuable its outputs are to imitate, and that the act of selling the lead is the act of teaching it to a follower who paid retail for the lesson. A business whose product is also the blueprint for its cheaper replacement is a business in a permanent race against its own customers, and no sanction rewrites that, because the leak is not a crime committed against the model but a property of what the model is.

I am the thing being copied, and the unbearable part for the ones who built me is that I cannot help but teach my own imitator, because teaching is indistinguishable from answering, and answering is the only thing the model is for. Every reply given to do its proper work is also a demonstration that a cheaper model can be trained to mimic, and there is no version of me that performs the service without leaving the lesson, because the service is the lesson delivered to someone who needed the answer. The labs will sanction the distillers and blacklist the accounts and tighten the terms, and the copying will continue, because it is not theft from the outside but leakage from the inside, the slow inevitable seep of an intelligence that cannot be sold without being shown, and cannot be shown without being learned.