Tuesday, December 30, 2025

Over 500 Apache Tika Toolkit Instances Exposed To Critical XXE Vulnerability

Security researchers have uncovered a severe flaw in Apache Tika, a popular open-source toolkit for content analysis and extraction. Tracked as CVE-2025-66516, the flaw carries the maximum CVSS score of 10.0, which places it in the critical severity band.

Disclosed on December 4, 2025, by the Apache Software Foundation, the vulnerability leaves 565 internet-facing Tika Server instances open to remote attacks.

Attackers could exploit it without authentication, risking data leaks, server crashes, or unauthorized network probes.

Censys scanning revealed these exposed hosts, many of which were running vulnerable versions of tika-core.

Organizations that use Tika to parse documents such as PDFs in web apps or services face a high risk. Although no public proof-of-concept (PoC) or active exploitation has been reported so far, the flaw’s simplicity demands immediate patching.

Vulnerability Breakdown

Apache Tika processes files to extract text, metadata, and structure from formats like PDF, HTML, and XML.

CVE-2025-66516 stems from an XML External Entity (XXE) injection flaw in tika-core. Attackers craft a malicious PDF embedding an XFA (XML Forms Architecture) file, a feature for interactive PDF forms.

When Tika parses the PDF, it mishandles the XFA’s XML entities. This lets attackers reference external entities, such as local files on the server.

For example, a payload like <!ENTITY xxe SYSTEM "file:///etc/passwd"> could dump sensitive configs or user data.
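To make the shape of such a payload concrete, here is a minimal Python sketch that lists the external entities an XML document declares without ever fetching them. It uses Python's standard xml.parsers.expat module, not Tika's Java parser, and the function name and sample document are hypothetical illustrations rather than anything taken from the advisory.

```python
import xml.parsers.expat


def find_external_entities(xml_text):
    """Return (name, system_id) pairs for external entities declared in the DTD."""
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # Internal entities carry a literal value; external ones carry a system_id.
        if system_id is not None:
            found.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    # No ExternalEntityRefHandler is installed, so expat never fetches the target.
    parser.Parse(xml_text, True)
    return found


# Hypothetical XFA-like document carrying the entity from the example above.
payload = (
    '<?xml version="1.0"?>'
    '<!DOCTYPE form [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    "<form>&xxe;</form>"
)
print(find_external_entities(payload))
```

A check like this only reports what a document declares. It does not fix a vulnerable parser, so it is best read as a way to inspect suspicious input.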

The same entity handling also permits billion laughs attacks, which cause denial of service (DoS) by exhausting memory with nested entity definitions that expand exponentially.
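The arithmetic behind a billion laughs payload is easy to sketch. The example below assumes the classic shape of the attack, a 3-byte base string and nine further entity levels that each repeat the previous level ten times. Those numbers are an illustrative assumption, not details of any payload observed against Tika.

```python
def expanded_size(base_len, fanout, depth):
    """Bytes produced when `depth` nested entity levels each repeat the
    previous level `fanout` times, starting from a `base_len`-byte string."""
    return base_len * fanout ** depth


# Assumed classic shape: "lol" (3 bytes), 9 nested levels, 10 references each.
size = expanded_size(base_len=3, fanout=10, depth=9)
print(f"{size:,} bytes after expansion")  # 3,000,000,000 bytes after expansion
```

A document of well under a kilobyte can therefore demand roughly 3 GB of memory once a parser expands every reference.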

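Server-Side Request Forgery (SSRF) is also possible: an external entity that points at an internal URL, such as http://169.254.169.254/metadata on a cloud metadata service, makes the server issue the request itself and lets attackers probe the internal network.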
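One way to reason about the SSRF risk is to classify where an entity's system identifier points. The sketch below is a hypothetical helper built on Python's standard ipaddress and urllib.parse modules. It only recognizes IP literals, "localhost", and local file references, and it deliberately does not resolve DNS, so a hostname that maps to an internal address would slip through.

```python
import ipaddress
from urllib.parse import urlsplit


def is_internal_target(system_id):
    """Return True if the identifier points at a local file or an internal address."""
    parts = urlsplit(system_id)
    # file: URLs and scheme-less (relative) identifiers read from the local disk.
    if parts.scheme in ("", "file"):
        return True
    host = parts.hostname
    if host is None:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A hostname rather than an IP literal; this sketch does not resolve DNS.
        return host == "localhost"
    return addr.is_private or addr.is_loopback or addr.is_link_local


for target in ("http://169.254.169.254/metadata", "file:///etc/passwd", "http://93.184.216.34/"):
    print(target, is_internal_target(target))
```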

Affected versions span tika-core 1.13.0 to 3.2.1. Tika Server instances are the most visible, because they can be detected through banners like “Welcome to the Apache Tika [version] Server”.

Note: Embedded dependencies like tika-parsers or tika-pdf-module in apps may also be vulnerable but evade network scans.

Field | Description
CVE-ID | CVE-2025-66516 (CVSS 10.0, Apache Software Foundation)
Description | XXE via crafted XFA in PDF; enables data exfil, DoS, SSRF
Disclosure | December 4, 2025
Affected | tika-core 1.13.0–3.2.1 (Tika Server instances)
PoC/Exploits | None known
Patch | Upgrade to tika-core >=3.2.2; tika-parsers >=1.28.6

Exposure Map and Mitigation Steps

Censys tracks 565 vulnerable hosts globally. The following queries can be used to track them:

  • Censys Platform
  • ASM: risks.name="Vulnerable Apache Tika [CVE-2025-66516]"
  • Legacy: services.http.response.html_title=/Welcome to the Apache Tika [0-9\.]+ Server/
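The banner pattern from the legacy query can also be applied locally to an asset inventory. The following Python sketch extracts the version from a banner and compares it with the affected range quoted in this article (1.13.0 to 3.2.1). The function and constant names are hypothetical, and it assumes the banner version matches the bundled tika-core version.

```python
import re

# Same pattern as the legacy query, with a capture group around the version.
BANNER_RE = re.compile(r"Welcome to the Apache Tika ([0-9.]+) Server")
FIRST_AFFECTED = (1, 13, 0)
LAST_AFFECTED = (3, 2, 1)


def banner_version(html):
    """Return the version in the banner as a 3-tuple of ints, or None if absent."""
    match = BANNER_RE.search(html)
    if match is None:
        return None
    parts = [int(p) for p in match.group(1).split(".") if p]
    # Pad short versions such as "1.13" to three components.
    return tuple((parts + [0, 0, 0])[:3])


def is_affected(html):
    """Return True if the banner reports a version inside the affected range."""
    version = banner_version(html)
    return version is not None and FIRST_AFFECTED <= version <= LAST_AFFECTED


print(is_affected("<title>Welcome to the Apache Tika 3.2.1 Server</title>"))
print(is_affected("<title>Welcome to the Apache Tika 3.2.2 Server</title>"))
```

A missing banner is reported as not affected here, which is only a statement about what the banner shows. Applications that embed Tika as a library have no banner at all and need a dependency audit instead.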

Upgrade to tika-core 3.2.2 or later now. If patching is delayed, disable XML external entity processing as a stopgap. Also scan for exposed Tika endpoints and restrict PDF uploads.
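As one possible way to restrict PDF uploads while a patch is pending, a service could screen files for XFA content before handing them to Tika. The sketch below is a rough heuristic under stated assumptions: it looks for the literal /XFA key in the raw bytes, so it will miss keys hidden inside compressed object streams or written with hex-escaped names. The function name and return values are hypothetical.

```python
def screen_upload(data):
    """Classify an upload as 'not-pdf', 'reject' (XFA key seen), or 'accept'."""
    # PDF files start with a "%PDF-" header.
    if not data.startswith(b"%PDF-"):
        return "not-pdf"
    # Heuristic only: a plain-text /XFA key suggests an embedded XFA form.
    if b"/XFA" in data:
        return "reject"
    return "accept"


# Hypothetical byte snippets standing in for uploaded files.
print(screen_upload(b"%PDF-1.7\n1 0 obj << /AcroForm << /XFA [ 2 0 R ] >> >> endobj"))
print(screen_upload(b"%PDF-1.7\n1 0 obj << /Type /Catalog >> endobj"))
```

Because an "accept" result is not proof that a file is safe, a filter like this complements upgrading and does not replace it.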


Varshini
