The vulnerability in vLLM's V0 engine arises from unsafe deserialization of data received over ZeroMQ sockets. In multi-node deployments, secondary hosts connect to the primary host with a SUB socket and deserialize every message they receive with pickle. Because unpickling can invoke arbitrary callables, an attacker who can deliver a crafted payload to that socket can achieve remote code execution (RCE) on the secondary host.
This flaw is exploitable if:
- The primary host is compromised, allowing attackers to send malicious payloads to secondary nodes.
- Network-level attacks (e.g., ARP spoofing) redirect traffic to a rogue server delivering malicious pickle data.
The V0 engine has been disabled by default since v0.8.0, and the V1 engine is unaffected.
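To make the mechanism concrete, here is a minimal, harmless sketch of why unpickling untrusted bytes is dangerous. The `Demo` class is hypothetical and uses the benign built-in `len` where an attacker would name a dangerous callable; it is not taken from vLLM's code.

```python
import pickle


class Demo:
    """Harmless stand-in for an attacker-controlled object."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it on load.
        # A real attack would name a dangerous callable; len() is benign.
        return (len, ("abc",))


payload = pickle.dumps(Demo())

# Unpickling does not rebuild a Demo instance: it calls len("abc").
result = pickle.loads(payload)
print(type(result).__name__, result)  # prints: int 3
```

The receiver never asked to call `len`; the sender's bytes decided which function ran. That is the property that turns a pickle-consuming socket into a code execution primitive.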
DailyCVE Form:
Platform: vLLM
Version: <v0.8.0
Vulnerability: Insecure Deserialization
Severity: Critical
Date: 2023-XX-XX
What Undercode Say:
Exploitation:
1. Craft a malicious pickle payload:
import os
import pickle

class Exploit:
    def __reduce__(self):
        return (os.system, ("rm -rf /tmp/malicious",))

payload = pickle.dumps(Exploit())
2. Intercept or spoof ZeroMQ traffic to inject the payload.
Detection:
- Check for vLLM versions <0.8.0 with V0 engine enabled:
pip show vllm | grep Version
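- Monitor ZeroMQ socket activity. ZeroMQ connections appear as ordinary TCP sockets owned by the vLLM Python process, so filter on the process name rather than on "zmq":
netstat -tanp | grep -i python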
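The version check above can also be scripted. The following is a rough sketch, not an official detection tool: the helper names are hypothetical, the 0.8.0 threshold comes from this advisory, and the use of `VLLM_USE_V1=0` to re-enable the V0 engine on newer releases is an assumption that should be verified against the documentation for your vLLM version.

```python
import os
import re
from importlib import metadata


def parse_release(version):
    """Return the leading (major, minor, patch) numbers of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def may_run_v0_engine(version, env):
    """Flag installs that could be running the V0 engine.

    Assumptions: releases before 0.8.0 default to V0 (per this advisory),
    and VLLM_USE_V1=0 re-enables V0 on newer releases (not stated here).
    """
    if parse_release(version) < (0, 8, 0):
        return True
    return env.get("VLLM_USE_V1") == "0"


def installed_vllm_version():
    """Return the installed vLLM version, or None if vLLM is not installed."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


version = installed_vllm_version()
if version is None:
    print("vLLM is not installed in this environment")
else:
    print(version, "may run V0:", may_run_v0_engine(version, os.environ))
```

A positive result only means the V0 engine may be in use; whether the host is actually exposed also depends on it running as a secondary node in a multi-node deployment.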
Mitigation:
1. Disable the V0 engine (already the default in v0.8.0 and later).
2. Restrict network access between nodes:
iptables -A INPUT ! -s TRUSTED_IP -j DROP
3. Replace pickle with JSON for serialization.
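As an illustration of the last mitigation, a data-only format such as JSON removes the code execution path entirely. This is a minimal sketch under stated assumptions: `encode_message`, `decode_message`, and the message fields are hypothetical and do not reflect vLLM's actual wire format.

```python
import json
import pickle


def encode_message(message):
    """Serialize a plain dict to UTF-8 JSON bytes for sending over a socket."""
    return json.dumps(message).encode("utf-8")


def decode_message(raw):
    """Parse bytes from the network without executing any code.

    json.loads can only produce dicts, lists, strings, numbers, booleans
    and None, so a hostile sender cannot make it call arbitrary functions.
    """
    try:
        message = json.loads(raw)
    except ValueError as exc:  # covers JSONDecodeError and UnicodeDecodeError
        raise ValueError("rejected non-JSON message") from exc
    if not isinstance(message, dict):
        raise ValueError("rejected message: expected a JSON object")
    return message


# Round trip of a hypothetical control message.
wire = encode_message({"op": "broadcast", "rank": 1})
decoded = decode_message(wire)

# A pickle payload is simply rejected instead of being executed.
try:
    decode_message(pickle.dumps({"op": "broadcast"}))
    pickle_rejected = False
except ValueError:
    pickle_rejected = True
```

JSON cannot carry arbitrary Python objects, so any richer payloads would need an explicit schema on top of it. The trade-off is deliberate: the receiver, not the sender, decides which types can be constructed.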
References:
Sources:
Reported By: github.com