Docling Core, Improper Input Validation (CVE-2026-44019) (High) -DC-Jun2026-183

How CVE-2026-44019 Works

Docling Core versions from 2.5.0 up to, but not including, 2.74.1 contain a high-severity flaw in how they handle image reference Uniform Resource Identifiers (URIs). The core of the issue is insufficient validation of two URI schemes: `file://` and `data:`.
The library processes images through `ImageRef` objects that contain a `uri` field. In vulnerable versions, when an attacker provides an image reference with a `file://` URI, Docling Core does not block or sanitize it. This allows a malicious actor to read any file on the server’s filesystem that the application process has permissions to access.
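To make the risk concrete, here is a minimal sketch, using only the Python standard library, of how a `file://` URI maps to a path on the local filesystem. It is not Docling Core's actual handler, and the helper name `file_uri_to_path` is hypothetical.

```python
from typing import Optional
from urllib.parse import unquote, urlparse


def file_uri_to_path(uri: str) -> Optional[str]:
    """Return the local path a file: URI points at, or None for other schemes."""
    parsed = urlparse(uri)  # urlparse lowercases the scheme for us
    if parsed.scheme != "file":
        return None
    # Percent-escapes are decoded, so an encoded path resolves like a plain one.
    return unquote(parsed.path)


print(file_uri_to_path("file:///etc/passwd"))         # /etc/passwd
print(file_uri_to_path("FILE:///etc/%70asswd"))       # /etc/passwd
print(file_uri_to_path("https://example.com/a.png"))  # None
```

Any code that opens the returned path without further checks reads whatever the process is permitted to read, which is why the scheme has to be rejected before the URI is resolved.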
The second attack vector uses the `data:` URI scheme. This scheme is typically used to embed small images directly as Base64-encoded strings. In vulnerable versions, Docling Core does not enforce a decoded-size limit on these `data:` URIs. An attacker can craft a `data:` URI containing a massively inflated payload that, when decoded, consumes excessive amounts of system memory, leading to a denial-of-service (DoS).
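The decoded size of a Base64 `data:` URI can be estimated from the payload length alone, so an oversized payload can be rejected before any memory is allocated for it. The sketch below illustrates the arithmetic; the function name `estimated_decoded_size` is hypothetical and the code assumes the payload contains no embedded whitespace.

```python
import base64


def estimated_decoded_size(data_uri: str) -> int:
    """Estimate the decoded byte count of a data: URI without decoding it."""
    header, sep, payload = data_uri.partition(",")
    if not sep or not header.lower().startswith("data:"):
        raise ValueError("not a data: URI")
    if not header.lower().endswith(";base64"):
        # Percent-encoded text never decodes to more bytes than it has characters.
        return len(payload)
    # Every 4 Base64 characters carry 3 bytes; '=' padding carries none.
    return len(payload.rstrip("=")) * 3 // 4


sample = "data:image/png;base64," + base64.b64encode(b"x" * 1000).decode()
print(estimated_decoded_size(sample))  # 1000
```

Because the check is a length computation, its cost does not grow with the amount of memory the decoded image would have needed.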
The vulnerability exists because the `ImageRef` class relies on Pydantic’s `AnyUrl` type, which, prior to version 2.10.0, was more lenient in its validation. Furthermore, the internal handlers that process images fail to check whether a URI resolves to a local file path, making the application susceptible to local file inclusion (LFI) attacks.
The fix, implemented in version 2.74.1, introduces two main changes. First, it blocks `file://` URIs by default, preventing local file reads. Second, it enforces a maximum size for decoded inline image data from `data:` URIs, protecting against memory exhaustion. The updated handlers validate URIs strictly and reject those that fail these checks.
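The two checks described above can be approximated in application code with an allowlist of schemes plus a size cap. This is a sketch of the idea, not the patch itself: `validate_image_uri`, `UnsafeImageURI`, `ALLOWED_SCHEMES`, and the 10 MiB value of `MAX_INLINE_BYTES` are all assumptions chosen for illustration, and the limit used by Docling Core is not stated here.

```python
from urllib.parse import urlparse

MAX_INLINE_BYTES = 10 * 1024 * 1024  # illustrative cap, not the library's value
ALLOWED_SCHEMES = {"http", "https", "data"}  # assumed policy for this sketch


class UnsafeImageURI(ValueError):
    """Raised when an image URI fails the policy checks below."""


def validate_image_uri(uri: str, max_inline_bytes: int = MAX_INLINE_BYTES) -> str:
    """Return uri unchanged if it passes both checks, else raise UnsafeImageURI."""
    uri = uri.strip()
    scheme = urlparse(uri).scheme  # already lowercased by urlparse
    if scheme not in ALLOWED_SCHEMES:
        # Covers file:, unknown schemes, and bare paths (empty scheme).
        raise UnsafeImageURI(f"scheme {scheme!r} is not allowed")
    if scheme == "data":
        _, _, payload = uri.partition(",")
        # Upper bound on decoded size: 3 bytes per 4 Base64 characters.
        if len(payload.rstrip("=")) * 3 // 4 > max_inline_bytes:
            raise UnsafeImageURI("inline image exceeds the size limit")
    return uri
```

An allowlist is used instead of a blocklist so that schemes nobody thought to list are rejected by default.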

DailyCVE Form:

Platform: `Docling Core`
Version: `>= 2.5.0, < 2.74.1`
Vulnerability: `Insufficient URI Validation`
Severity: `High`
Date: `2026-06-02`

Prediction: `Immediate (Already Patched)`

What Undercode Say:

# Check whether a file:// image reference is accepted (run only against a test system)
echo '{"images": [{"uri": "file:///etc/passwd"}]}' | python -c "
import sys
from docling_core.types.doc import DoclingDocument

# The payload and the 'images' field follow the original example;
# adjust them to the document schema of the installed version.
doc = DoclingDocument.model_validate_json(sys.stdin.read())
for img in doc.images:
    print(f'Attempted access: {img.uri}')
# In vulnerable versions, resolving this reference would read /etc/passwd
"

Exploit:

An attacker can craft a malicious document containing an `ImageRef` object with a `file://` URI pointing to a sensitive local file, such as `file:///etc/passwd`. When the application processes this document, Docling Core will attempt to read the specified local file, exposing its contents to the attacker. Additionally, the attacker can embed a `data:` URI with a large Base64 payload, causing the system to allocate excessive memory and crash.

Protection:

  1. Upgrade Docling Core to version 2.74.1 or later immediately:
    pip install --upgrade "docling-core>=2.74.1"
    
  2. If upgrading is not possible, implement a manual workaround by rejecting any `file:` or `data:` URIs from untrusted input:
    # Example URI sanitization (case-insensitive; also catches forms such as file:/path)
    if uri.strip().lower().startswith(('file:', 'data:')):
        raise ValueError("Blocked potentially dangerous URI scheme")
    
  3. Enforce strict input size limits and memory quotas on any process handling untrusted image references.
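For the memory quota in step 3, one option on Unix-like systems is to run the document-processing step in a child process with a capped address space. The sketch below uses the standard `resource` and `subprocess` modules; the helper name `run_with_memory_cap` and the 1 GiB figure are assumptions, and `RLIMIT_AS` is not enforced on every platform.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(argv, max_bytes):
    """Run argv in a child process whose address space is capped at max_bytes."""

    def apply_limit():
        # Runs in the child just before exec; sets both soft and hard limits.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(argv, preexec_fn=apply_limit, capture_output=True, text=True)


if __name__ == "__main__":
    # The child tries to allocate 2 GiB under a 1 GiB cap and should fail.
    result = run_with_memory_cap(
        [sys.executable, "-c", "buf = bytearray(2 * 1024**3)"],
        max_bytes=1024**3,
    )
    print("child exit code:", result.returncode)
```

With a cap in place, an oversized allocation fails inside the child process and the parent service stays up.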

Impact

  • Local File Disclosure: An attacker can read any file on the server accessible to the application process, leading to the exposure of sensitive data such as configuration files, source code, or credentials.
  • Denial of Service: By sending a large `data:` URI, an attacker can exhaust system memory, causing the application to crash and become unavailable.
  • Supply Chain Risk: Applications that embed Docling Core and process untrusted documents are vulnerable, potentially affecting downstream systems and data integrity.

Sources:

Reported By: github.com