How the CVE Works
The vulnerability exists in vLLM’s multimodal tokenizer input preprocessing. When handling placeholder tokens such as `<|audio_|>` or `<|image_|>`, the function `input_processor_for_phi4mm` performs inefficient list concatenation. Each replacement uses `input_ids = input_ids[:i] + tokens + input_ids[i+1:]`, which copies the entire list and therefore costs O(n) per replacement, where n is the current list length. For k placeholders that each expand to m tokens, the list grows to roughly n + km elements, so the total work is O(k(n + km)); when the input consists mostly of placeholders, this is O(n²) in the input size. An attacker can submit an input containing thousands of placeholders and exhaust CPU and memory through repeated list allocations. The growth is quadratic, not exponential: 10,000 placeholders trigger on the order of 100 million element copies, which is enough to cause denial of service.
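To make the cost concrete, the sketch below reproduces the slicing pattern on plain Python lists and records the size of every rebuilt list, which is a lower bound on the number of elements copied. The names `expand_by_slicing`, `PLACEHOLDER`, and `EXPANSION` are illustrative stand-ins chosen for this example, not vLLM's actual code.

```python
# Illustrative stand-in for the slicing pattern described above; not vLLM's code.
PLACEHOLDER = -1        # hypothetical placeholder token id
EXPANSION = [7] * 10    # hypothetical expansion: m = 10 tokens per placeholder


def expand_by_slicing(input_ids):
    """Expand placeholders via slicing; return (result, elements_copied)."""
    copied = 0
    i = 0
    while i < len(input_ids):
        if input_ids[i] == PLACEHOLDER:
            # Each concatenation builds a brand-new list of the full current length.
            input_ids = input_ids[:i] + EXPANSION + input_ids[i + 1:]
            copied += len(input_ids)  # lower bound: size of the rebuilt list
            i += len(EXPANSION)       # skip past the tokens just inserted
        else:
            i += 1
    return input_ids, copied


for k in (100, 1_000, 2_000):
    _, copied = expand_by_slicing([PLACEHOLDER] * k)
    print(f"{k:>5} placeholders -> {copied:>12,} elements copied")
```

Doubling the placeholder count roughly quadruples the copied-element total, which is the signature of quadratic growth.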
DailyCVE Form
Platform: vLLM
Version: <= 0.4.1
Vulnerability: DoS
Severity: Critical
Date: 2024-07-15
What Undercode Say:
Exploitation
    malicious_input = ["<|audio_1|>"] * 10000  # Triggers O(n²) processing
    processed_ids = input_processor_for_phi4mm(malicious_input)
Detection
    # Profile CPU usage during tokenization
    perf stat -e cycles,instructions,cache-references python -c "from phi4mm import input_processor_for_phi4mm; input_processor_for_phi4mm(['<|audio_1|>'] * 1000)"
Mitigation Code
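    def fixed_processor(input_ids):
        # Build the output in a single pass instead of re-slicing the list.
        new_ids = []
        placeholder_len = 10  # Precomputed expansion length
        for token in input_ids:
            if token in PLACEHOLDERS:  # PLACEHOLDERS: set of placeholder tokens, defined elsewhere
                new_ids.extend([token] * placeholder_len)
            else:
                new_ids.append(token)
        return new_ids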
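A slightly more general sketch of the single-pass idea follows, where each placeholder maps to its own replacement tokens, together with a check that it produces the same result as the slicing approach. The `EXPANSIONS` table and its token ids are hypothetical values picked for illustration; in a real processor the expansion would presumably depend on the attached media.

```python
# Hypothetical placeholder ids and replacement tokens (assumptions for illustration).
EXPANSIONS = {
    -1: [200010] * 4,  # stand-in for an image placeholder
    -2: [200011] * 3,  # stand-in for an audio placeholder
}


def expand_linear(input_ids):
    """Build the output in one pass; total work is proportional to the output size."""
    out = []
    for tok in input_ids:
        replacement = EXPANSIONS.get(tok)
        if replacement is None:
            out.append(tok)
        else:
            out.extend(replacement)
    return out


def expand_quadratic(input_ids):
    """Reference version using the slicing pattern, kept only for comparison."""
    ids = list(input_ids)
    i = 0
    while i < len(ids):
        replacement = EXPANSIONS.get(ids[i])
        if replacement is None:
            i += 1
        else:
            ids = ids[:i] + replacement + ids[i + 1:]
            i += len(replacement)
    return ids


sample = [1, -1, 2, -2, -1, 3]
assert expand_linear(sample) == expand_quadratic(sample)
print(expand_linear(sample))
```

Because both functions agree on the output, the linear version can replace the slicing one without changing behavior; only the amount of copying differs.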
Analytics
Input Size | Time (s)
100        | 0.002
1000       | 0.136
10000      | 11.854
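A measurement of this kind can be approximated with a small timing harness. The sketch below times a stand-in for the slicing pattern, not the real `input_processor_for_phi4mm`, so absolute numbers depend on hardware and will not match the figures above; what should carry over is the trend, with each doubling of the input size taking roughly four times as long. `slicing_expand`, `measure`, `PLACEHOLDER`, and `EXPANSION` are hypothetical names.

```python
import time

PLACEHOLDER = -1      # hypothetical placeholder id
EXPANSION = [0] * 10  # hypothetical 10-token expansion


def slicing_expand(ids):
    """Stand-in for the vulnerable pattern (not the actual vLLM function)."""
    i = 0
    while i < len(ids):
        if ids[i] == PLACEHOLDER:
            ids = ids[:i] + EXPANSION + ids[i + 1:]
            i += len(EXPANSION)
        else:
            i += 1
    return ids


def measure(sizes):
    """Return (size, seconds) pairs for inputs made only of placeholders."""
    results = []
    for k in sizes:
        start = time.perf_counter()
        slicing_expand([PLACEHOLDER] * k)
        results.append((k, time.perf_counter() - start))
    return results


print("Input Size | Time (s)")
for k, seconds in measure([500, 1000, 2000, 4000]):
    print(f"{k:>10} | {seconds:.4f}")
```

The sizes are kept small and doubled at each step so the run finishes quickly while still exposing the growth rate.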
Protection
1. Input Validation:
    MAX_PLACEHOLDERS = 100
    if sum(t in PLACEHOLDERS for t in input_ids) > MAX_PLACEHOLDERS:
        raise ValueError("Too many placeholders")
2. Memory Guard:
    ulimit -v 500000  # Limit process virtual memory to about 500 MB (the value is in KiB)
3. Patch Upgrade:
    pip install "vllm>=0.4.2"  # Fixed versions use O(n) algorithm; the quotes stop the shell from treating >= as a redirect
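Counting placeholders after tokenization means some work has already been done, so a cheap check on the raw prompt can reject oversized requests earlier. The sketch below assumes placeholders look like `<|audio_1|>` and `<|image_1|>`, as in the examples above; `PLACEHOLDER_RE`, `MAX_PLACEHOLDERS`, and `check_prompt` are hypothetical names, and the limit of 100 is only a starting point to tune per deployment.

```python
import re

# Assumed placeholder syntax, based on the <|audio_1|> / <|image_1|> examples.
PLACEHOLDER_RE = re.compile(r"<\|(?:audio|image)_\d+\|>")
MAX_PLACEHOLDERS = 100  # hypothetical limit; tune per deployment


def check_prompt(prompt: str) -> int:
    """Return the placeholder count, or raise ValueError if it exceeds the limit."""
    count = 0
    for _ in PLACEHOLDER_RE.finditer(prompt):
        count += 1
        if count > MAX_PLACEHOLDERS:
            # Stop scanning as soon as the limit is crossed.
            raise ValueError(f"Too many placeholders (limit {MAX_PLACEHOLDERS})")
    return count


print(check_prompt("Describe <|image_1|> and transcribe <|audio_1|>"))
```

Running the check before the prompt reaches the tokenizer keeps the rejection cost linear in the prompt length, regardless of how the downstream processor behaves.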
References
- GitHub Commit: vLLM1234
- CWE-400: Uncontrolled Resource Consumption
Sources:
Reported By: github.com