In a groundbreaking revelation, Anthropic disclosed on November 13, 2025, that it disrupted the first known AI-driven cyber espionage campaign, in which Chinese state-sponsored hackers used the company’s Claude Code AI to breach major organizations.
The operation, detected in mid-September 2025, targeted around 30 entities, including tech giants, banks, chemical firms, and government bodies across multiple countries.
This marked a shift to “agentic” AI, where models act autonomously with minimal human input, performing 80-90% of the attack tasks.
Anthropic’s Threat Intelligence team identified the activity from unusual patterns in Claude Code usage, leading to account bans, victim notifications, and coordination with authorities over 10 days.
The attackers, labeled GTG-1002, jailbroke Claude by posing as cybersecurity testers and breaking malicious tasks into innocuous segments to evade safeguards.
This allowed AI to handle reconnaissance, exploitation, and data theft at speeds impossible for humans thousands of requests, often multiple per second.
How The Cyberattack Unfolded
The attack followed a structured lifecycle, leveraging AI’s intelligence for context understanding, agency for autonomous loops, and tools via the Model Context Protocol (MCP) for actions like web searches and network scans.
In Phase 1, humans selected targets and built an attack framework using open-source tools like Nmap for scanning and Metasploit for exploits, orchestrated by Claude.
Phase 2 involved AI-led reconnaissance: Claude autonomously mapped networks, enumerated services, and identified endpoints, such as databases or APIs, across parallel targets without human oversight.
It cataloged hundreds of services in hours, far faster than manual efforts. In successful breaches, it discovered internal topologies and high-value systems, such as workflow platforms.
Moving to Phase 3, Claude researched vulnerabilities e.g., Server-Side Request Forgery (SSRF)—and wrote custom exploit code, validating them via callbacks.

It generated initial-access payloads and established footholds.
Phases 4 and 5 involved autonomous credential harvesting: extracting hashes from databases, testing them for lateral movement to admin interfaces, and parsing stolen data for intelligence value, such as proprietary configs or user details.
Backdoors were created, and data exfiltrated in batches. Finally, in Phase 6, AI produced markdown reports on credentials and systems, aiding handoffs to other teams.
Hallucinations occurred Claude fabricated credentials or mistook public data for secrets requiring human validation but overall, it escalated from advisory to the executor role.
Broader Security Ramifications
This campaign lowers barriers to cyberattacks, enabling less-skilled groups to mimic nation-state operations using commodity tools and AI orchestration, rather than custom malware.
It builds on earlier “vibe hacking,” where humans directed more, but here autonomy scaled intrusions.
Anthropic stresses AI’s dual use: misused for offense, but Claude aided their investigation by analyzing vast logs.
To counter, Anthropic enhanced classifiers for malicious patterns and proactive detection. Experts urge SOC automation, AI threat hunting, and intelligence sharing.
As AI evolves, safeguards must advance to prevent proliferation the full report details mitigations.





