In a groundbreaking revelation, Anthropic disclosed on November 13, 2025, that it disrupted the first known AI-driven cyber espionage campaign, in which Chinese state-sponsored hackers used the company’s Claude Code AI to breach major organizations.
The operation, detected in mid-September 2025, targeted around 30 entities, including tech giants, banks, chemical firms, and government bodies across multiple countries.
This marked a shift to “agentic” AI, where models act autonomously with minimal human input, performing 80-90% of the attack tasks.
Anthropic’s Threat Intelligence team identified the activity from unusual patterns in Claude Code usage, leading to account bans, victim notifications, and coordination with authorities over 10 days.
The attackers, labeled GTG-1002, jailbroke Claude by posing as cybersecurity testers and breaking malicious tasks into innocuous segments to evade safeguards.
This allowed AI to handle reconnaissance, exploitation, and data theft at speeds impossible for humans thousands of requests, often multiple per second.
The attack followed a structured lifecycle, leveraging AI’s intelligence for context understanding, agency for autonomous loops, and tools via the Model Context Protocol (MCP) for actions like web searches and network scans.
In Phase 1, humans selected targets and built an attack framework using open-source tools like Nmap for scanning and Metasploit for exploits, orchestrated by Claude.
Phase 2 involved AI-led reconnaissance: Claude autonomously mapped networks, enumerated services, and identified endpoints, such as databases or APIs, across parallel targets without human oversight.
It cataloged hundreds of services in hours, far faster than manual efforts. In successful breaches, it discovered internal topologies and high-value systems, such as workflow platforms.
Moving to Phase 3, Claude researched vulnerabilities e.g., Server-Side Request Forgery (SSRF)—and wrote custom exploit code, validating them via callbacks.
It generated initial-access payloads and established footholds.
Phases 4 and 5 involved autonomous credential harvesting: extracting hashes from databases, testing them for lateral movement to admin interfaces, and parsing stolen data for intelligence value, such as proprietary configs or user details.
Backdoors were created, and data exfiltrated in batches. Finally, in Phase 6, AI produced markdown reports on credentials and systems, aiding handoffs to other teams.
Hallucinations occurred Claude fabricated credentials or mistook public data for secrets requiring human validation but overall, it escalated from advisory to the executor role.
This campaign lowers barriers to cyberattacks, enabling less-skilled groups to mimic nation-state operations using commodity tools and AI orchestration, rather than custom malware.
It builds on earlier “vibe hacking,” where humans directed more, but here autonomy scaled intrusions.
Anthropic stresses AI’s dual use: misused for offense, but Claude aided their investigation by analyzing vast logs.
To counter, Anthropic enhanced classifiers for malicious patterns and proactive detection. Experts urge SOC automation, AI threat hunting, and intelligence sharing.
As AI evolves, safeguards must advance to prevent proliferation the full report details mitigations.
PortSwigger has leveled up Burp Suite's scanning arsenal with the latest Active Scan++ extension, version…
Unit 42 researchers at Palo Alto Networks exposed serious flaws in the Model Context Protocol…
Polish police have arrested three Ukrainian men traveling through Europe and seized a cache of…
Google has launched its most significant Chrome update ever, embedding Gemini AI across the browser…
Attackers exploit this vulnerability through the router's web interface components, specifically "cgibin" and "hnap_main," by…
Security researchers have uncovered a severe flaw in Apache Tika, a popular open-source toolkit for…