Security researchers have uncovered a severe flaw in Apache Tika, a popular open-source toolkit for content analysis and extraction. Tracked as CVE-2025-66516, it carries the maximum CVSS score of 10.0, which rates it as critical.
Disclosed on December 4, 2025, by the Apache Software Foundation, the vulnerability exposes over 565 internet-facing Tika Server instances to remote attacks.
Attackers could exploit it without authentication, risking data leaks, server crashes, or unauthorized network probes.
Censys scanning revealed these exposed hosts, many of which were running vulnerable versions of tika-core.
Organizations that use Tika to parse documents such as PDFs in web apps or services face a high risk. While no public proof-of-concept (PoC) or active exploitation has been reported yet, the flaw’s simplicity demands immediate patching.
Vulnerability Breakdown
Apache Tika processes files to extract text, metadata, and structure from formats like PDF, HTML, and XML.
CVE-2025-66516 stems from an XML External Entity (XXE) injection flaw in tika-core. Attackers craft a malicious PDF embedding an XFA (XML Forms Architecture) file, a feature for interactive PDF forms.
When Tika parses the PDF, it mishandles the XFA’s XML entities. This lets attackers reference external entities, such as local files on the server.
For example, a payload like <!ENTITY xxe SYSTEM "file:///etc/passwd"> could dump sensitive configs or user data.
A billion laughs payload turns the same flaw into a denial-of-service (DoS) attack: nested entity definitions expand exponentially and exhaust the server's memory.
External entities can also point at URLs, which enables Server-Side Request Forgery (SSRF): the parser can be made to probe the internal network, for example by querying http://169.254.169.254/metadata on cloud metadata services.
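
To put the billion laughs amplification in numbers, here is a rough Python sketch. It assumes the textbook form of the attack (a three-character base entity plus nine further levels, each repeating the previous entity ten times). The function name `expanded_size` is illustrative, and none of these figures come from the Tika advisory.

```python
def expanded_size(base_len, fanout, levels):
    """Characters produced once the outermost entity is fully expanded.

    base_len: length of the innermost entity's text (assumed, e.g. "lol" = 3)
    fanout:   how many times each level references the level below it
    levels:   number of nesting levels stacked on top of the base entity
    """
    return base_len * fanout ** levels


# Textbook case: "lol" repeated 10x per level across 9 levels.
classic = expanded_size(base_len=3, fanout=10, levels=9)
print(f"{classic:,} characters from a payload of well under a kilobyte")
```

The point is the ratio: a document of a few hundred bytes asks the parser to materialize roughly three billion characters, which is why entity expansion limits matter as much as blocking external entities.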
Affected versions span tika-core 1.13.0 to 3.2.1. Tika Server instances detectable via banners like “Welcome to the Apache Tika [version] Server” are most visible.
Note: Embedded dependencies like tika-parsers or tika-pdf-module in apps may also be vulnerable but evade network scans.
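
As a rough triage aid, the banner and version range above can be combined into a quick check. The Python sketch below is illustrative only: the helper names are hypothetical, and it assumes the version printed in a Tika Server banner matches the bundled tika-core version, which may not hold for every deployment.

```python
import re

# Banner text as quoted in the article; the numeric capture group is an
# assumption about how the bracketed version placeholder is filled in.
BANNER_RE = re.compile(r"Welcome to the Apache Tika ([0-9]+(?:\.[0-9]+)*) Server")

# Affected tika-core range as reported: 1.13.0 through 3.2.1 inclusive.
AFFECTED_MIN = (1, 13, 0)
AFFECTED_MAX = (3, 2, 1)


def parse_version(text):
    """Turn '3.2.1' or '1.13' into a 3-tuple of ints, padding with zeros."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def banner_is_affected(banner):
    """Return True/False for a recognized banner, or None if no version is found."""
    match = BANNER_RE.search(banner)
    if match is None:
        return None
    return AFFECTED_MIN <= parse_version(match.group(1)) <= AFFECTED_MAX
```

A `None` result only means the banner was not recognized. It says nothing about applications that embed Tika as a library, which never expose a banner at all.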
| Field | Description |
|---|---|
| CVE-ID | CVE-2025-66516 (CVSS 10.0, Apache Software Foundation) |
| Description | XXE via crafted XFA in PDF; enables data exfil, DoS, SSRF |
| Disclosure | December 4, 2025 |
| Affected | tika-core 1.13.0–3.2.1 (Tika Server instances) |
| PoC/Exploits | None known |
| Patch | Upgrade to tika-core >=3.2.2; tika-parsers >=1.28.6 |
Exposure Map and Mitigation Steps
Censys tracks 565 vulnerable hosts globally. Exposed instances can be tracked with the following queries:
- Censys Platform
- ASM: risks.name="Vulnerable Apache Tika [CVE-2025-66516]"
- Legacy: services.http.response.html_title=/Welcome to the Apache Tika [0-9\.]+ Server/
Upgrade to tika-core 3.2.2 or later now. If patching must be delayed, disable external entity (XXE) processing in the meantime. Scan for exposed Tika endpoints and restrict PDF uploads.
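
For readers unfamiliar with what blocking XXE looks like in practice, one common defense is to refuse any XML that declares a DTD at all, since both external entities and billion laughs payloads need a DOCTYPE. The sketch below shows that idea with Python's standard-library expat bindings. It is not Tika's Java code and not the upstream patch; the helper name `declares_doctype` and the sample payload are made up for illustration.

```python
import xml.parsers.expat


class _DoctypeSeen(Exception):
    """Internal signal: the document opened a DOCTYPE declaration."""


def declares_doctype(xml_bytes):
    """Return True if the XML declares a DTD, False if it parses without one.

    Malformed XML still raises xml.parsers.expat.ExpatError.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Abort at the DOCTYPE, before any entity declaration is processed.
        raise _DoctypeSeen(name)

    parser.StartDoctypeDeclHandler = on_doctype
    try:
        parser.Parse(xml_bytes, True)
    except _DoctypeSeen:
        return True
    return False


# Hypothetical XFA-like payload using the entity shown in the article.
XXE_SAMPLE = (
    b'<?xml version="1.0"?>'
    b'<!DOCTYPE xdp [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    b"<xdp>&xxe;</xdp>"
)

if declares_doctype(XXE_SAMPLE):
    print("rejected: DOCTYPE declarations are not allowed")
```

A filter like this is at best a stopgap in front of a parser that cannot be patched yet. Upgrading remains the actual fix.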