The vulnerability resides in the XML parsing logic of the `tika-core` library, which is used when processing XFA (XML Forms Architecture) data embedded in a PDF file. An attacker can craft a PDF whose XFA component declares an XML external entity (XXE) in its DTD (Document Type Definition). When Apache Tika parses the PDF, the XML parser in `tika-core` processes that DTD without restriction, so entity references that point to file:// or http:// URIs are resolved and expanded. The result is unauthorized reads from the server’s filesystem, or Server-Side Request Forgery (SSRF) when the parser makes outbound HTTP requests to internal systems. The root cause is that the parser was not configured to disable external entity resolution, which lets malicious DTDs be injected through the XFA stream.
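To make the root cause concrete, here is a minimal sketch of a JAXP parser factory with DTDs and external entities switched off. It uses only standard `javax.xml` calls and the feature URIs recognized by the JDK's built-in parser. The class name `HardenedXml` and its helper `rejects` are hypothetical, and this is not necessarily how Tika itself addresses the issue in its patched release.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, for illustration only (not part of Tika).
final class HardenedXml {

    // Builds a factory that refuses DOCTYPE declarations and external entities.
    static DocumentBuilderFactory newHardenedFactory() {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // Reject any document that carries a DTD at all.
            f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            // Defense in depth in case DTDs are ever re-enabled.
            f.setFeature("http://xml.org/sax/features/external-general-entities", false);
            f.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            f.setXIncludeAware(false);
            f.setExpandEntityReferences(false);
            return f;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("XML parser does not support hardening features", e);
        }
    }

    // Returns true when the hardened parser refuses the given XML text.
    static boolean rejects(String xml) {
        try {
            DocumentBuilder b = newHardenedFactory().newDocumentBuilder();
            b.parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            return true; // e.g. "DOCTYPE is disallowed"
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String withDtd = "<?xml version=\"1.0\"?>"
                + "<!DOCTYPE r [<!ENTITY x SYSTEM \"file:///nonexistent\">]><r>&x;</r>";
        System.out.println("DTD document rejected: " + rejects(withDtd));
        System.out.println("Plain document rejected: " + rejects("<r>ok</r>"));
    }
}
```

Rejecting the DOCTYPE outright is the strictest option. Applications that must accept DTDs would need to rely on the two entity features instead.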
Platform: Apache Tika
Version: 1.13-3.2.1
Vulnerability: XXE via PDF
Severity: Critical
Date: Dec 4, 2025
Prediction: Patched (3.2.2)
What Undercode Say:
Analytics
`find / -name "tika-core*.jar" -type f`
`java -cp tika-app.jar org.apache.tika.cli.TikaCLI --list-parsers`
`curl -X PUT http://target:9998/rmeta --data-binary @malicious.pdf`
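Once the `find` command has located the jars, their file names can be checked against the affected range listed above (1.13 through 3.2.1). The following is a rough sketch: the class `TikaVersionCheck` and its methods are hypothetical, it assumes the usual `tika-core-<version>.jar` naming, and it ignores pre-release suffixes such as `-BETA`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class TikaVersionCheck {

    // Captures the numeric version from names like tika-core-2.9.1.jar.
    private static final Pattern JAR =
            Pattern.compile("tika-core-(\\d+(?:\\.\\d+)*).*\\.jar");

    // Returns the numeric version in the jar name, or null if it does not match.
    static String versionFromJarName(String name) {
        Matcher m = JAR.matcher(name);
        return m.matches() ? m.group(1) : null;
    }

    // Parses up to three numeric components; missing ones count as 0.
    private static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < 3; i++) {
            if (x[i] != y[i]) {
                return Integer.compare(x[i], y[i]);
            }
        }
        return 0;
    }

    // True when the version lies in the affected range 1.13 to 3.2.1 inclusive.
    static boolean isAffected(String version) {
        return compare(version, "1.13") >= 0 && compare(version, "3.2.1") <= 0;
    }

    public static void main(String[] args) {
        // Pass jar file names (not full paths) as arguments.
        for (String name : args) {
            String v = versionFromJarName(name);
            String verdict = (v == null) ? "unrecognized name"
                    : isAffected(v) ? "AFFECTED" : "not in affected range";
            System.out.println(name + ": " + verdict);
        }
    }
}
```

A file name is only a hint. Shaded or repackaged jars can hide the real version, so the build manifest remains the more reliable source.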
How to Exploit
Craft malicious XFA payload.
Embed into PDF structure.
Trigger Tika PDFParser.
Protection from this CVE
Upgrade tika-core to 3.2.2 or later.
Disable DTD and external entity processing in XML parsers.
Validate input files.
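As one possible form of input validation, uploads could be pre-screened for XFA content before they reach Tika. The sketch below simply looks for the ASCII name `/XFA` in the raw bytes. The class `XfaPreScreen` is hypothetical and the check is a heuristic only: it will miss names stored in compressed object streams or written with `#` hex escapes, so it complements upgrading rather than replacing it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical pre-screening helper, for illustration only.
final class XfaPreScreen {

    private static final byte[] NEEDLE = "/XFA".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the raw bytes contain the literal PDF name "/XFA".
    static boolean mentionsXfa(byte[] data) {
        outer:
        for (int i = 0; i + NEEDLE.length <= data.length; i++) {
            for (int j = 0; j < NEEDLE.length; j++) {
                if (data[i + j] != NEEDLE[j]) {
                    continue outer;
                }
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java XfaPreScreen.java upload.pdf
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(mentionsXfa(data)
                ? "XFA marker found: quarantine or route to stricter handling"
                : "no plain-text XFA marker found (not a guarantee)");
    }
}
```

A hit is a reason to handle the file more cautiously. A miss proves nothing, because the marker can be hidden inside compressed streams.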
Impact
File disclosure.
SSRF attacks.
Data exfiltration.
Sources:
Reported By: github.com

