Critical Flaw In Apache Tika Core Enables Exploitation Through Malicious PDF Files

A critical XML External Entity (XXE) vulnerability in Apache Tika, tracked as CVE-2025-66516, exposes users to attacks through specially crafted PDF files containing XFA content.

Disclosed on December 4, 2025, by Apache security team member Tim Allison, this flaw affects core parsing modules across multiple versions.

Attackers can exploit it to read sensitive files, trigger denial-of-service conditions, or carry out server-side request forgery (SSRF) against vulnerable systems.

Apache Tika is a popular open-source toolkit for content analysis and extraction, widely used in document processing pipelines, search engines, and security tools.

The vulnerability stems from improper handling of XML in PDF-embedded XFA forms, allowing external entity expansion.

CVE-2025-66516 expands on the earlier CVE-2025-54988, which focused on the PDF parser module. Users who patched only that module remain at risk if tika-core is still on a vulnerable version.

Vulnerability Details and Affected Components

The XXE issue resides deep in Tika’s XML processing. When Tika parses a malicious PDF containing an XFA (XML Forms Architecture) template, it resolves external entities declared in the embedded XML.

For example, an attacker embeds a payload like <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]> inside the PDF’s XFA stream.

Tika’s core then resolves the entity, potentially leaking local files through file:// URLs or probing internal networks through http:// requests.
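The general mechanism can be illustrated with the JDK's built-in XML parser. The sketch below is not Tika's code or its actual fix; it only shows, under the assumption of a stock JAXP DocumentBuilderFactory, how rejecting DOCTYPE declarations stops a payload like the one above before any entity is resolved. The class name XxeHardeningSketch and its helper methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical illustration using the JDK parser; not Tika's implementation.
final class XxeHardeningSketch {

    /** Builds a JAXP factory that refuses DOCTYPEs and external entities. */
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Reject any document that declares a DOCTYPE at all.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defense in depth: also switch off external entity loading.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    /** Returns true if the hardened parser refuses to parse the given XML. */
    static boolean isRejected(String xml) {
        try {
            DocumentBuilder builder = hardenedFactory().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            // Thrown when the parser hits the forbidden DOCTYPE (or malformed XML).
            return true;
        } catch (Exception e) {
            // IOException or ParserConfigurationException: unexpected in this sketch.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String payload =
            "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]><foo>&xxe;</foo>";
        System.out.println("payload rejected: " + isRejected(payload));
        System.out.println("benign rejected:  " + isRejected("<form><field>ok</field></form>"));
    }
}
```

Because the parser stops at the DOCTYPE declaration, the file:// reference is never opened. The JDK's default error handler may also print a fatal-error line to standard error when the payload is rejected.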

This affects all platforms (Windows, Linux, and macOS) because Tika is Java-based. CVE-2025-66516 broadens CVE-2025-54988 in two key ways:

  1. The root fix is in tika-core, not just the PDF module. Upgrading only tika-parser-pdf-module leaves systems exposed.
  2. In the Tika 1.x series, the PDF parser lived in tika-parsers, which was overlooked in the prior report.

Here’s a breakdown of impacted artifacts:

Component | Affected Versions | Fixed In
tika-core | 1.13 – 3.2.1 | 3.2.2+
tika-parsers | 1.13 – 1.28.5 (pre-2.0.0) | 2.0.0+
tika-parser-pdf-module | 2.0.0 – 3.2.1 | 3.2.2+
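The ranges above can be turned into a simple check. The following is a minimal sketch, assuming purely numeric dotted versions (suffixes such as -BETA are not handled and would raise NumberFormatException); the class TikaVersionCheck and its methods are hypothetical helpers, not part of Tika.

```java
import java.util.Arrays;

// Hypothetical helper encoding the affected ranges listed in the article.
final class TikaVersionCheck {

    /** Parses "3.2.1" into {3, 2, 1}. Numeric components only. */
    static int[] parse(String version) {
        return Arrays.stream(version.split("\\.")).mapToInt(Integer::parseInt).toArray();
    }

    /** Compares two dotted versions; missing components count as 0. */
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? x[i] : 0;
            int yi = i < y.length ? y[i] : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Returns true if the artifact/version pair falls in an affected range. */
    static boolean isAffected(String artifact, String version) {
        switch (artifact) {
            case "tika-core":
                // Affected: 1.13 through 3.2.1; fixed in 3.2.2.
                return compare(version, "1.13") >= 0 && compare(version, "3.2.2") < 0;
            case "tika-parsers":
                // Affected: 1.13 up to, but not including, 2.0.0.
                return compare(version, "1.13") >= 0 && compare(version, "2.0.0") < 0;
            case "tika-parser-pdf-module":
                // Affected: 2.0.0 through 3.2.1; fixed in 3.2.2.
                return compare(version, "2.0.0") >= 0 && compare(version, "3.2.2") < 0;
            default:
                return false; // Artifact not listed in the advisory table.
        }
    }

    public static void main(String[] args) {
        System.out.println("tika-core 3.2.1 affected: " + isAffected("tika-core", "3.2.1"));
        System.out.println("tika-core 3.2.2 affected: " + isAffected("tika-core", "3.2.2"));
    }
}
```

Note that a check like this has to be applied to every Tika artifact on the classpath: a patched tika-parser-pdf-module alongside an old tika-core is still reported as affected.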

Severity is rated critical because exploitation is easy (no authentication is required) and because the flaw could contribute to remote code execution in misconfigured setups.

A CVSS v3.1 score had not been finalized at the time of writing, but network-exploitable XXE flaws that require no authentication can score as high as 9.8 out of 10.

Mitigation and Recommendations

To fix this, upgrade immediately: tika-core and tika-parser-pdf-module to 3.2.2 or later; tika-parsers to 2.0.0+. Verify via Maven coordinates: org.apache.tika:tika-core:3.2.2.

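If upgrades aren’t feasible, disable external entity processing in the Tika configuration, for example with a TikaConfig setting such as feature("xml-external-entities", false). Confirm the exact option name against the documentation for your Tika version before relying on it.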

Organizations scanning uploads or indexing docs should audit Tika deployments. Tools like OWASP Dependency-Check can flag vulnerable versions.
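As a rough starting point for such an audit, the sketch below lists the three artifacts named in this article when they appear as jar files in a directory. It assumes the conventional Maven naming scheme artifact-version.jar with purely numeric versions; the class TikaJarAudit is a hypothetical helper and is no substitute for a real dependency scanner, since it misses shaded or repackaged jars.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: finds Tika jars by file name only.
final class TikaJarAudit {

    // Matches e.g. "tika-core-3.2.1.jar" and captures artifact and version.
    private static final Pattern TIKA_JAR = Pattern.compile(
        "(tika-core|tika-parsers|tika-parser-pdf-module)-(\\d+(?:\\.\\d+)*)\\.jar");

    /** Returns "artifact:version" for every file name that looks like a listed Tika jar. */
    static List<String> findTikaArtifacts(List<String> fileNames) {
        List<String> found = new ArrayList<>();
        for (String name : fileNames) {
            Matcher m = TIKA_JAR.matcher(name);
            if (m.matches()) {
                found.add(m.group(1) + ":" + m.group(2));
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Scans the directory given as the first argument, or the current one.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        try (Stream<Path> files = Files.list(dir)) {
            List<String> names = files
                .map(p -> p.getFileName().toString())
                .collect(Collectors.toList());
            findTikaArtifacts(names).forEach(System.out::println);
        }
    }
}
```

Each reported artifact:version pair can then be compared against the affected ranges in the table above.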

No exploits are known in the wild yet, but the PDF delivery vector makes this flaw ripe for phishing campaigns.

Varshini

