How CVE-2026-44019 Works
Docling Core versions from 2.5.0 up to, but not including, 2.74.1 contain a high-severity flaw in how they handle image reference Uniform Resource Identifiers (URIs). The core of the issue is insufficient validation of two URI schemes: `file://` and `data:`.
The library processes images through `ImageRef` objects that contain a `uri` field. In vulnerable versions, when an attacker provides an image reference with a `file://` URI, Docling Core does not block or sanitize it. This allows a malicious actor to read any file on the server’s filesystem that the application process has permissions to access.
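To make the mechanism concrete, here is a minimal sketch, using only Python's standard library, of how a `file://` URI maps onto a local path. The helper name `file_uri_to_path` is hypothetical and is not part of Docling Core; it only illustrates why an unvalidated `uri` field amounts to a caller-chosen filesystem path.

```python
from typing import Optional
from urllib.parse import unquote, urlsplit


def file_uri_to_path(uri: str) -> Optional[str]:
    """Return the local path a file: URI points at, or None for other schemes."""
    parts = urlsplit(uri)
    if parts.scheme.lower() != "file":
        return None
    # Percent-encoded characters (e.g. %20) decode to the real path on disk.
    return unquote(parts.path)


print(file_uri_to_path("file:///etc/passwd"))
print(file_uri_to_path("https://example.com/logo.png"))
```

Any code that opens the returned path without further checks reads whatever file the process is allowed to read.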
The second attack vector uses the `data:` URI scheme. This scheme is typically used to embed small images directly as Base64-encoded strings. In vulnerable versions, Docling Core does not enforce a decoded-size limit on these `data:` URIs. An attacker can craft a `data:` URI containing a massively inflated payload that, when decoded, consumes excessive amounts of system memory, leading to a denial-of-service (DoS).
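The decoded size of a Base64 payload can be estimated from its length alone, so a limit can be enforced before any large buffer is allocated. The sketch below shows that arithmetic; `estimated_decoded_size` is a hypothetical helper, not a Docling Core function.

```python
import base64


def estimated_decoded_size(data_uri: str) -> int:
    """Estimate the decoded byte size of a data: URI without decoding it."""
    header, sep, payload = data_uri.partition(",")
    if not sep or not header.lower().startswith("data:"):
        raise ValueError("not a data: URI")
    if not header.lower().endswith(";base64"):
        # Non-base64 payloads are percent-encoded text; length is an upper bound.
        return len(payload)
    # Every 4 Base64 characters carry 3 bytes; '=' padding marks unused bytes.
    padding = len(payload) - len(payload.rstrip("="))
    return len(payload) * 3 // 4 - padding


raw = b"\x00" * 1_000_000
uri = "data:image/png;base64," + base64.b64encode(raw).decode("ascii")
print(len(uri), estimated_decoded_size(uri))
```

Because the estimate needs only the string length, a handler can reject an oversized URI before decoding it.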
The vulnerability exists because the `ImageRef` class relies on Pydantic’s `AnyUrl` type, which, prior to version 2.10.0, was more lenient in its validation. In addition, the internal handlers that process images do not check whether a URI resolves to a local file path, leaving the application open to local file inclusion (LFI) attacks.
The fix, implemented in version 2.74.1, introduces two main changes. First, it blocks `file://` URIs by default, preventing local file reads. Second, it sets a maximum size for decoded inline image data from `data:` URIs, protecting against memory exhaustion. The updated handlers validate both the URI scheme and the inline payload size before loading image data.
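The two changes amount to a default-deny policy with an explicit opt-in. The following is a rough sketch of that idea, not the library's actual implementation: the class name `ImageUriPolicy`, its fields, and the 20 MiB default are all assumptions made for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class ImageUriPolicy:
    """Hypothetical policy object; names and defaults are illustrative only."""

    allow_file: bool = False  # file: URIs are rejected unless opted in
    max_inline_bytes: int = 20 * 1024 * 1024  # assumed cap, not the library's value

    def check(self, uri: str) -> None:
        scheme = urlsplit(uri).scheme.lower()
        if scheme == "file" and not self.allow_file:
            raise ValueError("file: URIs are disabled by policy")
        if scheme == "data":
            payload = uri.partition(",")[2]
            # Assumes a Base64 payload: decoded size is at most 3/4 of its length.
            if len(payload) * 3 // 4 > self.max_inline_bytes:
                raise ValueError("inline image exceeds the decoded-size limit")


policy = ImageUriPolicy(max_inline_bytes=1024)
policy.check("https://example.com/logo.png")  # passes silently
for bad in ("file:///etc/passwd", "data:image/png;base64," + "A" * 4096):
    try:
        policy.check(bad)
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Making `allow_file` default to `False` means a caller has to opt in deliberately before any local path is opened.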
DailyCVE Form:
Platform: `Docling Core`
Version: `>= 2.5.0, < 2.74.1`
Vulnerability: `Insufficient URI Validation`
Severity: `High`
Date: `2026-06-02`
Prediction: `Immediate (Already Patched)`
What Undercode Say:
Verify the vulnerability by checking whether a `file://` image reference is accepted:
echo '{"mimetype": "image/png", "dpi": 72, "size": {"width": 1, "height": 1}, "uri": "file:///etc/passwd"}' | python -c "
import sys
from docling_core.types.doc import ImageRef
# Assumption: ImageRef takes mimetype, dpi, size and uri; adjust to your installed version.
ref = ImageRef.model_validate_json(sys.stdin.read())
print(f'Accepted URI: {ref.uri}')
# In vulnerable versions, loading this reference would open /etc/passwd
"
Exploit:
An attacker can craft a malicious document containing an `ImageRef` object with a `file://` URI pointing to a sensitive local file, such as `file:///etc/passwd`. When the application processes this document, Docling Core will attempt to read the specified local file, exposing its contents to the attacker. Additionally, the attacker can embed a `data:` URI with a large Base64 payload, causing the system to allocate excessive memory and crash.
Protection:
- Upgrade Docling Core to version 2.74.1 or later immediately:
pip install --upgrade "docling-core>=2.74.1"
- If upgrading is not possible, implement a manual workaround by rejecting any `file:` or `data:` URIs from untrusted input:
Example URI sanitization:
if uri.strip().lower().startswith(("file:", "data:")):
    raise ValueError("Blocked potentially dangerous URI scheme")
- Enforce strict input size limits and memory quotas on any process handling untrusted image references.
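When the untrusted input is a whole JSON document rather than a single string, the same check can be applied before the document reaches the library. The sketch below walks parsed JSON and reports every `uri` value that uses a blocked scheme. The function name `find_blocked_uris` and the `pictures`/`image` layout of the sample are assumptions for illustration, not a statement of the Docling Core schema.

```python
import json
from typing import Any, Iterator

BLOCKED_PREFIXES = ("file:", "data:")


def find_blocked_uris(node: Any, path: str = "$") -> Iterator[str]:
    """Yield the JSON path of every "uri" string that uses a blocked scheme."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key == "uri" and isinstance(value, str):
                if value.strip().lower().startswith(BLOCKED_PREFIXES):
                    yield child
            else:
                yield from find_blocked_uris(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_blocked_uris(item, f"{path}[{index}]")


# Hypothetical sample input; real documents may nest URIs differently.
untrusted = json.loads(
    '{"pictures": [{"image": {"uri": "file:///etc/passwd"}},'
    ' {"image": {"uri": "https://example.com/ok.png"}}]}'
)
print(list(find_blocked_uris(untrusted)))  # ['$.pictures[0].image.uri']
```

Rejecting `data:` outright also blocks legitimate embedded images, so a deployment that needs them may prefer a size cap over a blanket ban.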
Impact
- Local File Disclosure: An attacker can read any file on the server accessible to the application process, leading to the exposure of sensitive data such as configuration files, source code, or credentials.
- Denial of Service: By sending a large `data:` URI, an attacker can exhaust system memory, causing the application to crash and become unavailable.
- Supply Chain Risk: Applications that embed Docling Core and process untrusted documents are vulnerable, potentially affecting downstream systems and data integrity.
Sources:
Reported By: github.com

