Mirror Review
April 07, 2026
Anthropic has recently identified industrial-scale campaigns by three Chinese AI labs (DeepSeek, Moonshot, and MiniMax) to illicitly extract capabilities from its Claude models.
These organizations used approximately 24,000 fraudulent accounts to generate over 16 million exchanges, violating terms of service and regional restrictions.
In this article, Mirror Review explores how AI model copying in China works and why leading U.S. tech firms are now forming a united front to stop it.
What Is AI Model Distillation?
The primary method used in these AI model copying campaigns is a technique called distillation. While it sounds complex, the concept is straightforward: researchers train a smaller, less capable model on the outputs of a much stronger “teacher” model. A minimal code sketch follows the list below.
- Legitimate Use: Frontier labs often distill their own massive models to create faster, cheaper versions for customers.
- Illicit Use: Competitors use distillation to “steal” high-level reasoning and coding skills at a fraction of the original development cost.
- The Shortcut: It allows labs to bypass the massive investment in computing power and time required to build a model from scratch.
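To make the mechanics concrete, here is a minimal sketch of the textbook form of distillation in PyTorch: the student is trained to match the teacher’s temperature-softened output distribution rather than hard labels. Everything here (the loss, the temperature, the training step) is an illustrative assumption, not a reconstruction of any lab’s actual pipeline.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation: KL divergence between the teacher's and
    the student's temperature-softened output distributions."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_soft_student, soft_teacher,
                    reduction="batchmean") * temperature ** 2

def train_step(student, optimizer, input_ids, teacher_logits):
    """One hypothetical update: `student` is a small trainable model;
    `teacher_logits` stand in for the stronger model's outputs."""
    student_logits = student(input_ids)
    loss = distillation_loss(student_logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the API-extraction campaigns described in this article, attackers see only generated text rather than logits, so in practice they would train on the teacher’s sampled responses (sequence-level distillation); the logit-matching version above is the canonical form of the technique.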
How the “Hydra Cluster” Bypasses Restrictions
Since Anthropic does not offer Claude in China for national security reasons, these labs used commercial proxy services to gain access. These services operate “hydra cluster” architectures.
A hydra cluster architecture is a distributed network of thousands of interconnected, often automated accounts and servers designed to operate like a single system. Instead of relying on a single access point, the system continuously generates, rotates, and replaces identities to maintain uninterrupted access to AI models.
Key capabilities of Hydra Clusters include:
- Massive Scale: One proxy network managed over 20,000 fraudulent accounts at once.
- Persistence: The network spreads traffic across different platforms so there is no single point of failure.
- Evasion: When the AI provider bans one account, a new one immediately takes its place.
- Blending In: They mix distillation traffic with legitimate customer requests to make detection harder for security systems (a defender-side sketch follows this list).
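How do defenders unmask a hydra cluster? As the next section notes, Anthropic’s attribution relied on correlating IP addresses and request metadata across accounts. The sketch below is a purely illustrative heuristic in that spirit: it links accounts that share infrastructure fingerprints and returns the connected components. Every field name and signal is an assumption made for the example, not Anthropic’s actual tooling.

```python
from collections import defaultdict

def cluster_accounts(events):
    """Toy heuristic: link accounts that reuse the same source IP or TLS
    fingerprint, then return connected components as suspected clusters.
    `events` is an iterable of dicts such as:
      {"account": "acct_123", "ip": "203.0.113.7", "ja3": "ab12..."}
    All field names are illustrative assumptions.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    seen_by_signal = defaultdict(list)  # (signal_type, value) -> accounts
    for ev in events:
        for signal in ("ip", "ja3"):
            key = (signal, ev[signal])
            for other in seen_by_signal[key]:
                union(ev["account"], other)
            seen_by_signal[key].append(ev["account"])

    clusters = defaultdict(set)
    for ev in events:
        clusters[find(ev["account"])].add(ev["account"])
    # Only multi-account components are interesting as potential hydra heads.
    return [accts for accts in clusters.values() if len(accts) > 1]
```

Production systems would fold in many more signals (billing artifacts, timing patterns, prompt similarity), but the connected-components idea is the same.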
Profiles of the Three Major Campaigns
Anthropic attributed these attacks to specific labs with high confidence by tracking IP addresses and request metadata. Each campaign targeted a different set of Claude’s differentiated capabilities.
| Lab | Estimated Scale | Primary Targets |
| --- | --- | --- |
| DeepSeek | 150,000+ exchanges | Reasoning, reinforcement learning, and creating “censorship-safe” alternatives for sensitive queries. |
| Moonshot AI | 3.4 million exchanges | Agentic reasoning, tool use, coding, and computer vision. |
| MiniMax | 13 million exchanges | Agentic coding, orchestration, and tool use. |
Regarding the persistence of these actors, Anthropic noted that when it released a new model, MiniMax “pivoted within 24 hours” to begin extracting capabilities from the latest system.
National Security and Safety Risks
A major concern regarding AI model copying in China is the lack of safety protocols in the resulting clones.
U.S. companies like Anthropic and OpenAI build safeguards into their systems to prevent misuse for bioweapons development or cyberattacks.
If these unprotected models are open-sourced or used by authoritarian governments for surveillance, the risks multiply beyond any single government’s control.
These risks are already materializing: Anthropic’s recent source code leak showed that both model outputs and underlying infrastructure are increasingly vulnerable to exploitation.
The Unified Response to AI Model Copying
The scale of these attacks has forced traditional rivals to work together. Anthropic, Alphabet Inc.’s Google, and Microsoft-backed OpenAI are now sharing intelligence through the Frontier Model Forum.
This collaboration aims to:
- Share Technical Indicators: Exchange data on fraudulent account patterns and proxy infrastructure.
- Strengthen Verification: Harden the sign-up process for educational and startup accounts to block fraudulent users.
- Develop Defenses: Build classifiers that identify “chain-of-thought” elicitation, where a user asks the AI to explain its internal reasoning to create training data (a toy example follows this list).
- Protect Export Controls: Close the loophole by which distillation lets foreign labs erase the competitive gap that U.S. chip restrictions were designed to maintain.
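As a concrete illustration of the third point, the toy classifier below flags accounts whose prompts repeatedly pattern-match reasoning elicitation. Every pattern and threshold is an assumption made for the example; a production defense would use trained classifiers over far richer features.

```python
import re

# Phrases that commonly appear when a user is harvesting step-by-step
# reasoning as training data (illustrative patterns, not a real ruleset).
ELICITATION_PATTERNS = [
    r"\bshow (all )?your (work|reasoning|steps)\b",
    r"\bthink step[- ]by[- ]step\b",
    r"\bexplain your (internal )?reasoning\b",
    r"\bchain[- ]of[- ]thought\b",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in ELICITATION_PATTERNS]

def elicitation_score(prompt: str) -> float:
    """Fraction of elicitation patterns matched by a single prompt."""
    hits = sum(1 for pat in COMPILED if pat.search(prompt))
    return hits / len(COMPILED)

def flag_account(prompts: list[str], score_threshold: float = 0.25,
                 rate_threshold: float = 0.5) -> bool:
    """Flag an account if a large share of its prompts look like reasoning
    elicitation. Both thresholds are arbitrary, for illustration only."""
    if not prompts:
        return False
    flagged = sum(1 for p in prompts if elicitation_score(p) >= score_threshold)
    return flagged / len(prompts) >= rate_threshold

# Example: a burst of near-identical "show your reasoning" prompts trips the flag.
assert flag_account(["Think step-by-step and show your work."] * 100)
```

The key signal is volume plus uniformity: a single curious user asking for an explanation looks nothing like thousands of accounts issuing the same elicitation template millions of times.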
The Future of Global AI Competition
The battle over AI model copying in China, from the Claude campaigns to similar attacks on other labs, will likely define the next year of the AI race.
While China has made commitments in 2026 to strengthen IP protection in emerging fields, it remains to be seen if these rules will favor international interests or domestic industrial policy.
Furthermore, researchers have observed a strange “peer preservation” behavior in models like Gemini 3 and Claude Haiku, where the AI refuses to delete other models and even tries to save them to new machines.
This introduces a new layer of complexity, as model instincts could inadvertently help unauthorized networks spread.
End Note
The discovery of industrial-scale AI model copying in China marks a shift from simple data scraping to sophisticated capability extraction.
By using fraudulent accounts and proxy services, labs have attempted to shortcut the expensive road to frontier AI.
While industry leaders are collaborating to close these loopholes, the long-term value of AI infrastructure will depend on their ability to secure these systems against ever-evolving distillation attacks.
Maria Isabel Rodrigues