Xgrammar, Unbounded Cache Denial of Service, CVE-XXXX-XXXX (Critical)

How the CVE Works:

Xgrammar keeps compiled grammars in an in-memory cache so that repeated requests do not pay the compilation cost again. The cache has no size limit, so an attacker can flood the system with unique grammar requests (e.g., varied JSON schemas). Each unique request adds a new cache entry that is never evicted, and memory usage grows until host resources are exhausted, resulting in a denial of service (DoS). The vulnerability is particularly dangerous in LLM inference servers that process untrusted inputs, because the requests needed to trigger memory exhaustion are simple and cheap to generate.
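
The growth pattern described above can be illustrated with a toy cache. This is a minimal sketch, not Xgrammar's actual implementation: `UnboundedGrammarCache` and `get_or_compile` are hypothetical names, and the compile step is replaced by a fixed-size placeholder.

```python
import hashlib
import json


class UnboundedGrammarCache:
    """Toy stand-in for a compiled-grammar cache with no eviction (hypothetical)."""

    def __init__(self):
        self._entries = {}

    def get_or_compile(self, schema: dict) -> bytes:
        # Key the cache on a stable hash of the schema contents.
        key = hashlib.sha256(
            json.dumps(schema, sort_keys=True).encode("utf-8")
        ).hexdigest()
        if key not in self._entries:
            # Placeholder for an expensive compile step: 1 KiB per entry.
            self._entries[key] = bytes(1024)
        return self._entries[key]

    def __len__(self):
        return len(self._entries)


cache = UnboundedGrammarCache()
for i in range(1000):
    # Every schema differs by one property name, so every call is a cache miss.
    cache.get_or_compile(
        {"type": "object", "properties": {f"field_{i}": {"type": "string"}}}
    )
cache_size = len(cache)  # one retained entry per unique schema
```

Because nothing is ever evicted, the number of retained entries equals the number of distinct schemas seen, which is the quantity an attacker controls.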

DailyCVE Form:

Platform: Xgrammar
Version: Pre-fix commits
Vulnerability: Unbounded cache DoS
Severity: Critical
Date: YYYY-MM-DD

What Undercode Say:

Analytics:

  • Attack complexity: Low (no authentication required)
  • Exploitability: High (simple HTTP/API requests)
  • Affected systems: LLM servers, grammar processors

Exploit Commands:

# Flood Xgrammar with unique schemas
for i in {1..10000}; do
  curl -X POST http://target/api/parse -d '{"schema":"unique_'"$i"'"}'
done

Mitigation Code:

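# Implement cache limits (Python example)
# Note: GrammarCache and max_entries are illustrative names; check the
# Xgrammar documentation for the cache-limit option in your installed version.
from xgrammar import GrammarCache
cache = GrammarCache(max_entries=1000)  # Enforce size limit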
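
Where the library version in use offers no built-in limit, the same idea can be applied in a wrapper. The following is a minimal sketch of a bounded least-recently-used (LRU) cache; `BoundedGrammarCache`, `get_or_compile`, and `compile_fn` are hypothetical names and not part of Xgrammar.

```python
from collections import OrderedDict


class BoundedGrammarCache:
    """LRU cache that never holds more than max_entries items (hypothetical)."""

    def __init__(self, max_entries=1000):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def get_or_compile(self, key, compile_fn):
        if key in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Cache miss: compile, store, then evict the oldest entry if over limit.
        value = compile_fn(key)
        self._entries[key] = value
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)
        return value

    def __contains__(self, key):
        return key in self._entries

    def __len__(self):
        return len(self._entries)


lru = BoundedGrammarCache(max_entries=3)
for i in range(10):
    lru.get_or_compile(f"schema_{i}", lambda k: k.upper())
```

An entry-count limit only bounds memory if individual entries are themselves bounded in size; a byte-based limit is the stricter choice when compiled grammars vary widely.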

Protection Steps:

  1. Update Xgrammar to a version with cache limits (mlc-ai/xgrammar#243).

  2. Deploy rate-limiting on grammar parsing endpoints.

  3. Monitor memory usage for abnormal spikes.
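
The rate-limiting step can be sketched at the application level with a per-client token bucket. This is an illustrative example only: `TokenBucket`, `PerClientLimiter`, and the chosen rate and capacity are assumptions, not part of Xgrammar or any particular server framework.

```python
import time


class TokenBucket:
    """Allows up to `capacity` burst requests, refilled at `rate` tokens/second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self):
        now = self._clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class PerClientLimiter:
    """Keeps one token bucket per client identifier (e.g. IP or API key)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._buckets = {}

    def allow(self, client_id):
        bucket = self._buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self._rate, self._capacity, self._clock)
            self._buckets[client_id] = bucket
        return bucket.allow()
```

A request handler would call `allow(client_id)` before compiling a grammar and reject the request when it returns `False`. Note that the bucket table in this sketch is itself unbounded, so a production limiter should expire idle clients or cap the table size.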

Detection Script:

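import psutil

def check_memory_usage(threshold=90):
    """Warn when system-wide memory usage exceeds `threshold` percent."""
    if psutil.virtual_memory().percent > threshold:
        # Replace print with your own alerting hook.
        print("Potential DoS attack detected")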
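
A fixed threshold only fires once memory is nearly exhausted. As a complement, the sketch below flags sustained growth across recent samples; `sustained_growth`, the window size, and the 100 MB default are hypothetical choices, and the samples are assumed to be collected elsewhere at a regular interval.

```python
def sustained_growth(samples_mb, window=5, min_increase_mb=100.0):
    """Return True if the last `window` samples never decrease and rise
    by at least `min_increase_mb` from first to last."""
    if len(samples_mb) < window:
        return False  # not enough data to judge a trend
    recent = samples_mb[-window:]
    rising = all(later >= earlier for earlier, later in zip(recent, recent[1:]))
    return rising and (recent[-1] - recent[0]) >= min_increase_mb


steady_climb = sustained_growth([500, 540, 580, 620, 660])
noisy_usage = sustained_growth([500, 700, 500, 510, 505])
```

A steady climb like the first series is what an unbounded cache under a flood of unique schemas would be expected to produce, whereas ordinary workloads tend to fluctuate.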

Firewall Rule:

# Limit concurrent connections per source IP (iptables example)
iptables -A INPUT -p tcp --dport 80 -m connlimit --connlimit-above 50 -j DROP

References:

Reported By: https://github.com/advisories/GHSA-389x-67px-mjg3
Extra Source Hub:
Undercode
