NVIDIA has issued a security bulletin warning about two high-severity vulnerabilities in its Triton Inference Server software.
These flaws allow remote attackers to cause denial-of-service conditions on Linux systems by sending specially crafted inputs.
Vulnerability Details
CVE-2025-33211 arises because the server improperly validates a specified quantity in its inputs, which attackers can exploit by sending crafted requests.
This issue, classified as CWE-1284, enables denial of service without requiring privileges or user interaction. Its CVSS v3.1 vector, AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H, yields a base score of 7.5.
The second flaw, CVE-2025-33201, stems from inadequate checks for unusual conditions when handling oversized payloads.
Classified under CWE-754, it carries the same CVSS vector and score. Exploitation can crash the server and disrupt AI inference workloads.
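For readers who want to see where the 7.5 comes from, the following is a minimal sketch of the CVSS v3.1 base score calculation applied to the vector both CVEs share. The weights and the round-up rule follow the published CVSS v3.1 specification. The function names are illustrative, and the sketch handles only Scope: Unchanged vectors such as this one.

```python
import math

# CVSS v3.1 metric weights for Scope: Unchanged vectors.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place using the spec's integer-based rule."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    metrics = dict(part.split(":") for part in vector.split("/"))
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}

    # Impact sub-score: combines confidentiality, integrity, availability.
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    # Exploitability sub-score: how easy the attack is to carry out.
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]

    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H"))  # 7.5
```

With only availability rated High, the impact sub-score is about 3.60 and the exploitability sub-score about 3.89. Their sum of roughly 7.48 rounds up to 7.5, which falls in the High severity band (7.0 to 8.9).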
Triton Inference Server is an open-source platform for deploying machine learning models across frameworks and hardware, including GPUs.
Both vulnerabilities require no authentication and can be triggered remotely, making them risky for production AI deployments where uptime matters.
Here’s a summary of the issues:
| CVE ID | Description | CVSS Vector | Base Score | Severity | CWE | Impact |
|---|---|---|---|---|---|---|
| CVE-2025-33211 | Improper input quantity validation | AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H | 7.5 | High | CWE-1284 | Denial of Service |
| CVE-2025-33201 | Poor handling of extra-large payloads | AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H | 7.5 | High | CWE-754 | Denial of Service |
Mitigation Steps
Triton Inference Server on Linux is affected in versions before r25.10. NVIDIA patched both flaws in the r25.10 release, which is available on GitHub.
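As a rough aid for inventorying deployments, the sketch below compares a release tag against the patched r25.10 release. The helper names are hypothetical, and the tag formats it accepts (a release branch style like `r25.10` and a container-tag style like `25.09-py3`) are assumptions about how an operator records versions. Mapping a running server to its release tag is left to the operator.

```python
import re

# r25.10 is the patched release named in the bulletin.
FIXED_RELEASE = (25, 10)


def parse_release(tag):
    """Parse a tag such as 'r25.10' or '25.09-py3' into (major, minor)."""
    match = re.match(r"r?(\d+)\.(\d+)", tag.strip())
    if match is None:
        raise ValueError(f"unrecognized release tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected(tag):
    """Return True when the release is older than the patched r25.10."""
    return parse_release(tag) < FIXED_RELEASE


for tag in ["r25.09", "r25.10", "24.12-py3"]:
    print(tag, "affected" if is_affected(tag) else "patched")
```

Comparing the parsed numbers as tuples avoids the pitfalls of string comparison, where a tag such as `r25.9` would otherwise sort after `r25.10`.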
Users must update immediately from the official releases page and follow the secure deployment guide, which covers API protections and logging safeguards.
Researchers credited include seaw1nd for CVE-2025-33211, and the Trend Micro Zero Day Initiative team and others for CVE-2025-33201.
There is no evidence of active exploitation, but organizations serving AI models should identify exposed Triton instances on their networks and prioritize patching to avoid service disruptions.
NVIDIA urges monitoring its Product Security page for alerts.





