Over 500 Apache Tika Toolkit Instances Exposed To Critical XXE Vulnerability

Security researchers have uncovered a severe flaw in Apache Tika, a popular open-source toolkit for content analysis and extraction. Tracked as CVE-2025-66516, the flaw carries the maximum CVSS score of 10.0, which places it in the critical severity band.

Disclosed on December 4, 2025, by the Apache Software Foundation, the vulnerability leaves 565 internet-facing Tika Server instances open to remote attack.

Attackers could exploit it without authentication, risking data leaks, server crashes, or unauthorized network probes.

Censys scanning revealed these exposed hosts, many of which were running vulnerable versions of tika-core.

Organizations that use Tika to parse documents such as PDFs in web apps or services face a high risk. No public proof-of-concept (PoC) or active exploitation has been reported yet, but the flaw's simplicity demands immediate patching.

Vulnerability Breakdown

Apache Tika processes files to extract text, metadata, and structure from formats like PDF, HTML, and XML.

CVE-2025-66516 stems from an XML External Entity (XXE) injection flaw in tika-core. Attackers craft a malicious PDF embedding an XFA (XML Forms Architecture) file, a feature for interactive PDF forms.

When Tika parses the PDF, it mishandles the XFA’s XML entities. This lets attackers reference external entities, such as local files on the server.

For example, a payload like <!ENTITY xxe SYSTEM "file:///etc/passwd"> could dump sensitive configs or user data.
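
To make the shape of such a payload concrete, the following minimal Python sketch flags external entity declarations in an XML fragment, such as an XFA stream extracted from a PDF. The helper name find_external_entities and its regular expression are illustrative assumptions and are not part of Apache Tika. A pattern match like this is only a triage aid; it does not replace a parser with external entities disabled.

```python
import re

# Hypothetical helper for illustration; not part of Apache Tika.
# Matches declarations such as:
#   <!ENTITY xxe SYSTEM "file:///etc/passwd">
#   <!ENTITY % ext PUBLIC "some-id" "http://host/x.dtd">
_EXTERNAL_ENTITY = re.compile(
    r"<!ENTITY\s+(?:%\s+)?(?P<name>[^\s%]+)\s+"       # entity name (optional % for parameter entities)
    r"(?:SYSTEM|PUBLIC\s+(?:\"[^\"]*\"|'[^']*'))\s+"   # SYSTEM, or PUBLIC plus its public identifier
    r"(?P<quote>[\"'])(?P<uri>.*?)(?P=quote)",         # quoted system identifier (the external URI)
    re.DOTALL,
)


def find_external_entities(xml_text):
    """Return (entity_name, uri) pairs for every external entity declaration found."""
    return [(m.group("name"), m.group("uri"))
            for m in _EXTERNAL_ENTITY.finditer(xml_text)]


sample = '<!DOCTYPE x [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><x>&xxe;</x>'
print(find_external_entities(sample))  # [('xxe', 'file:///etc/passwd')]
```

Internal entities, which carry a literal value instead of a SYSTEM or PUBLIC identifier, are deliberately not reported by this sketch.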

The same entity handling permits billion laughs attacks, which cause denial of service (DoS) by exhausting memory with nested entities that expand exponentially.
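
The scale of a billion laughs expansion follows from simple arithmetic. The sketch below, written in Python, builds the nested entity declarations as plain strings and computes the size of the fully expanded top-level entity without ever parsing it. The function names and the lol0..lolN naming scheme are assumptions chosen for illustration.

```python
def build_entity_declarations(base_text, refs_per_level, levels):
    """Build nested entity declarations: each level references the previous one."""
    lines = [f'<!ENTITY lol0 "{base_text}">']
    for level in range(1, levels + 1):
        refs = f"&lol{level - 1};" * refs_per_level
        lines.append(f'<!ENTITY lol{level} "{refs}">')
    return lines


def expanded_length(base_text, refs_per_level, levels):
    """Characters produced when the top-level entity is fully expanded."""
    return len(base_text) * refs_per_level ** levels


# A tiny instance shows the structure of the declarations.
for line in build_entity_declarations("lol", 2, 2):
    print(line)

# The classic form: a 3-character string, 10 references per level, 9 levels.
print(expanded_length("lol", 10, 9))  # 3000000000
```

A document of well under a kilobyte therefore expands to roughly 3 GB of text, which is why a parser that resolves entities without limits can run out of memory.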

External entities that point at HTTP URLs also enable server-side request forgery (SSRF): the server can be made to probe internal networks or query cloud metadata services, for example http://169.254.169.254/metadata.
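
One way to reason about the SSRF risk is to classify the URLs that an entity declaration points at. The Python sketch below uses only the standard library to decide whether a URL targets a private, loopback, or link-local address. The function name is_internal_network_target is an assumption for illustration, and the check covers literal IP addresses only: hostnames are not resolved, so a name that resolves to an internal address would not be caught.

```python
import ipaddress
from urllib.parse import urlparse


def is_internal_network_target(url):
    """Return True if the URL's host is a literal private, loopback, or link-local IP."""
    host = urlparse(url).hostname
    if host is None:
        return False  # no network host, e.g. a file: URI
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a hostname, not a literal IP; DNS resolution is out of scope here
    return addr.is_private or addr.is_loopback or addr.is_link_local


print(is_internal_network_target("http://169.254.169.254/metadata"))  # True
print(is_internal_network_target("http://8.8.8.8/"))                  # False
```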

Affected versions span tika-core 1.13.0 to 3.2.1. Tika Server instances detectable via banners like “Welcome to the Apache Tika [version] Server” are most visible.
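
As a rough illustration of how a banner maps to the affected range, the Python sketch below extracts the version from a Tika Server welcome banner and compares it against the range stated above. The function names are assumptions for illustration, and the parser handles purely numeric dotted versions only, so pre-release suffixes are not supported.

```python
import re

# Range taken from the advisory summary above: tika-core 1.13.0 through 3.2.1.
FIRST_AFFECTED = (1, 13, 0)
LAST_AFFECTED = (3, 2, 1)

_BANNER = re.compile(r"Welcome to the Apache Tika ([0-9.]+) Server")


def version_from_banner(banner):
    """Return the version string from a Tika Server banner, or None if absent."""
    match = _BANNER.search(banner)
    return match.group(1) if match else None


def parse_version(text):
    """Turn '3.2.1' into (3, 2, 1); short versions such as '1.13' are zero-padded."""
    parts = [int(p) for p in text.split(".") if p]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_affected(version):
    """True if the numeric version falls inside the affected range, inclusive."""
    return FIRST_AFFECTED <= parse_version(version) <= LAST_AFFECTED


version = version_from_banner("Welcome to the Apache Tika 3.2.1 Server")
print(version, is_affected(version))  # 3.2.1 True
print(is_affected("3.2.2"))           # False
```

A banner check only covers Tika Server; applications that embed the affected libraries expose no banner and need a dependency audit instead.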

Note: Embedded dependencies like tika-parsers or tika-pdf-module in apps may also be vulnerable but evade network scans.

Field          Description
CVE-ID         CVE-2025-66516 (CVSS 10.0, Apache Software Foundation)
Description    XXE via crafted XFA in PDF; enables data exfil, DoS, SSRF
Disclosure     December 4, 2025
Affected       tika-core 1.13.0–3.2.1 (Tika Server instances)
PoC/Exploits   None known
Patch          Upgrade to tika-core >=3.2.2; tika-parsers >=1.28.6

Exposure Map and Mitigation Steps

Censys tracks 565 vulnerable hosts globally and provides a breakdown by country. Defenders can track exposure with the following queries:

  • Censys Platform
  • ASM: risks.name="Vulnerable Apache Tika [CVE-2025-66516]"
  • Legacy: services.http.response.html_title=/Welcome to the Apache Tika [0-9\.]+ Server/

Upgrade to version 3.2.2 or later now. If patching is delayed, disable XXE processing. Scan for exposed Tika endpoints and restrict PDF uploads.

Published by
Varshini
