Supply Chain Vulnerabilities
LLM supply chain vulnerabilities arise when weaknesses in third-party components, models, datasets, tooling, or deployment platforms compromise the integrity, security, or reliability of large language model (LLM) systems.
Unlike traditional software supply chain risks, which focus primarily on code dependencies, LLM supply chain risks extend to pre-trained models, fine-tuning adapters (e.g., LoRA and other PEFT methods), training datasets, model repositories, conversion and merge services, cloud and edge deployment infrastructure, and licensing and data usage agreements.
Because modern AI development heavily relies on open ecosystems and third-party assets, the supply chain attack surface is significantly expanded.
Key Takeaways
- LLM supply chain risks extend beyond traditional software vulnerabilities to include third-party pre-trained models, fine-tuning adapters, training datasets, and deployment platforms, any of which can be tampered with or poisoned
- Weak model provenance is a critical gap: Model cards provide no cryptographic guarantees of origin, making it possible for attackers to publish compromised models on repositories like Hugging Face while impersonating trusted sources
- Fine-tuning techniques like LoRA introduce a new attack vector: a malicious LoRA adapter can be merged with a legitimate base model, injecting backdoors that activate during inference without affecting benchmark performance
- On-device LLM deployments expand the supply chain attack surface further, as attackers can reverse-engineer mobile apps, replace embedded models with tampered versions, and redistribute them via social engineering
- Mitigation requires maintaining an AI-specific software bill of materials (SBOM), verifying model integrity through file hashes and code signing, conducting AI red teaming on third-party models, and continuously auditing supplier security posture and licensing terms
Why it Matters
LLMs are often distributed as opaque binary artifacts. Unlike traditional open-source code, they cannot be easily inspected for hidden functionality. This increases reliance on trust in upstream suppliers. Compromise anywhere in the AI supply chain can result in the following:
- Biased or manipulated outputs
- Embedded backdoors
- Security breaches
- Data exfiltration
- Malware execution
- Legal and licensing exposure
- System instability or failure
Key Risk Areas to Consider
- Traditional Third-party Component Vulnerabilities: Outdated or deprecated packages used during model development or fine-tuning can be exploited, as described in OWASP Top 10 A06:2021 (Vulnerable and Outdated Components).
- Licensing Risks: AI systems incorporate diverse software and dataset licenses. Mismanagement may lead to legal violations, distribution restrictions, or commercial exposure.
- Outdated or Deprecated Models: Unmaintained models may contain unresolved vulnerabilities.
- Vulnerable Pre-trained Models: Pre-trained models may contain hidden biases, backdoors, or malicious modifications. Techniques such as parameter tampering (e.g., ROME / “lobotomization”) can directly alter model behavior.
- Weak Model Provenance: Model cards provide descriptive information but do not guarantee authenticity. Attackers may impersonate legitimate suppliers or compromise repository accounts.
- Malicious LoRA Adapters: Low-Rank Adaptation (LoRA) and other parameter-efficient fine-tuning (PEFT) techniques allow modular fine-tuning. A malicious adapter can compromise a base model when merged, especially in collaborative or automated deployment environments.
- Exploited Collaborative Model Development: Model merging, conversion services, and shared hosting platforms can be manipulated to inject vulnerabilities or bypass review processes.
- On-device Model Risks: LLMs deployed at the edge increase exposure to firmware vulnerabilities, reverse engineering, and tampered model repackaging.
- Unclear Terms and Privacy Policies: Changes in supplier T&Cs may permit training on application data without clear consent, leading to unintended memorization or exposure of sensitive information.
Representative Attack Scenarios
These scenarios illustrate that supply chain risks affect both development and production environments.
- Compromised Dependency (PyPI Attack): Malicious packages embedded malware in development environments
- Direct Model Tampering (PoisonGPT): Attackers altered model parameters to bypass repository safety checks
- Malicious Fine-tuned Model: A model appears safe on benchmarks but contains targeted triggers
- Fake Model Publication (WizardLM): Attackers publish malware-laced models under popular names
- LoRA Adapter Compromise: A malicious adapter introduces hidden vulnerabilities when merged
- Cloud Infrastructure Exploitation (CloudBorne/CloudJacking): Virtualization or firmware weaknesses compromise hosted models
- GPU Memory Leakage (LeftOvers CVE-2023-4969): Sensitive data recovered from leaked GPU memory
- Reverse-engineered Mobile App: Tampered models embedded in repackaged apps redirect users to scam content
- Dataset Poisoning: Public datasets manipulated to embed subtle backdoors during fine-tuning
- T&C Manipulation: Supplier modifies privacy policies to train on sensitive application data
Prevention and Mitigation Strategies
Supplier and Source Vetting
Use trusted and verifiable model sources. Review supplier security posture and T&Cs regularly, and audit changes in licensing and privacy policies.
Vulnerability Management
Apply OWASP A06:2021 controls, perform dependency scanning and patch management, and maintain secure development environments.
SBOM / AI-BOM Practices
Maintain a software bill of materials (SBOM); track models, datasets, adapters, and licenses; and consider AI/ML-specific BOM standards such as OWASP CycloneDX.
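As a rough illustration of tracking models, datasets, adapters, and licenses in one inventory, the sketch below records each asset with its source, hash, and license, and flags entries with missing fields. The `BomEntry` fields and the `build_bom` and `missing_fields` helpers are hypothetical names chosen for this example. The output is a simplified JSON inventory, not the CycloneDX schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BomEntry:
    """One third-party asset in a simplified, illustrative AI-BOM."""
    name: str
    kind: str      # e.g. "model", "adapter", "dataset", "library"
    version: str
    source: str    # where the asset was obtained
    sha256: str    # hash of the artifact as downloaded
    license: str   # license identifier, e.g. "Apache-2.0"


def build_bom(entries):
    """Serialize entries to JSON, sorted so that diffs between versions stay readable."""
    ordered = sorted(entries, key=lambda e: (e.kind, e.name))
    return json.dumps({"components": [asdict(e) for e in ordered]}, indent=2)


def missing_fields(entry):
    """Return the names of fields left empty, so gaps can block a release."""
    return [key for key, value in asdict(entry).items() if not value]
```

Keeping the inventory in version control alongside the application means a changed hash, source, or license shows up in review like any other change.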
Model Integrity Verification
Use digital signatures and file hash verification, apply code signing for externally supplied code, and validate provenance where possible.
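File hash verification can be sketched in a few lines. The example below assumes the expected SHA-256 digest was obtained from a trusted channel (for instance, a pinned value in a reviewed BOM); the function names `sha256_of_file` and `verify_artifact` are hypothetical. A matching hash only shows that the file is the one that was pinned. It says nothing about whether the pinned model itself is trustworthy.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, expected_sha256):
    """Return True only if the file's SHA-256 matches the pinned digest."""
    actual = sha256_of_file(path)
    # Case-insensitive on the expected value; compare_digest avoids early-exit comparison.
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A deployment step would typically call `verify_artifact` before loading any weights and refuse to continue on a mismatch.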
AI Red Teaming
Conduct adversarial testing on third-party models, evaluate models in their intended operational contexts, and do not rely solely on published safety benchmarks.
Collaborative Environment Controls
Monitor model merge and conversion services, use automated scanners (e.g., HuggingFace SF_Convertbot Scanner), and audit shared development pipelines.
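To give a sense of what an automated scan of a shared model file can look for, here is a minimal sketch that inspects a pickle-serialized artifact for imports of modules commonly used to run commands. It is not the scanner named above, and it is a heuristic only: the `SUSPICIOUS_MODULES` set and the `find_suspicious_imports` name are assumptions for this example, and a determined attacker can evade simple opcode checks. It uses the standard library's `pickletools.genops`, which parses opcodes without executing the pickle.

```python
import pickletools

# Assumption: an example denylist, not an exhaustive one.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins", "sys", "socket", "shutil"}


def find_suspicious_imports(data: bytes):
    """List module.name references in a pickle that point at risky modules."""
    findings = []
    recent_strings = []  # string arguments seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Older protocols carry "module name" directly in the opcode argument.
            module, _, name = arg.partition(" ")
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Newer protocols push module and name as strings first (heuristic).
            module, name = recent_strings[-2], recent_strings[-1]
        else:
            if isinstance(arg, str):
                recent_strings.append(arg)
            continue
        if module.split(".")[0] in SUSPICIOUS_MODULES:
            findings.append(f"{module}.{name}")
    return findings
```

For example, `find_suspicious_imports(b"cos\nsystem\n(S'echo hi'\ntR.")` returns `['os.system']` without running the command, while a pickle of plain numbers and strings returns an empty list.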
Anomaly and Robustness Testing
Perform adversarial robustness checks, integrate detection into MLOps pipelines, and conduct periodic red team exercises.
Edge Deployment Protection
Encrypt models at rest, use hardware integrity checks and vendor attestation APIs, and terminate applications on unrecognized firmware.
Licensing Governance
Maintain license inventories via BOMs, use automated license management tools, and train teams on license obligations.
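An automated license check over a BOM can be as simple as comparing each component's license against an approved list. The sketch below assumes components are plain dictionaries with `name` and `license` keys; the `ALLOWED_LICENSES` policy and the `license_violations` helper are hypothetical and would need to reflect an organization's actual legal guidance.

```python
# Assumption: an example policy only; real allowlists come from legal review.
ALLOWED_LICENSES = {"Apache-2.0", "MIT", "BSD-3-Clause"}


def license_violations(components, allowed=ALLOWED_LICENSES):
    """Return (name, license) pairs for components outside the approved list."""
    violations = []
    for component in components:
        # Treat a missing or empty license as unknown rather than as approved.
        declared = component.get("license") or "UNKNOWN"
        if declared not in allowed:
            violations.append((component["name"], declared))
    return violations
```

Running a check like this in CI lets a newly added model or dataset with an unreviewed license fail the build instead of surfacing after release.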
The Core Security Principle
LLM systems are not standalone artifacts. They are assembled ecosystems composed of code, data, models, adapters, infrastructure, licenses, and cloud services. Every external dependency introduces risk.
Supply chain security for LLMs requires continuous validation, strong provenance controls, secure MLOps practices, red teaming and adversarial testing, legal and licensing governance, and infrastructure hardening.
In AI systems, trust is inherited. If any upstream component is compromised, the downstream application is compromised. Secure the chain, verify the source and continuously monitor.