Thursday, March 5, 2026

Ollama Parsing Vulnerabilities Could Let Attackers Execute Arbitrary Code Through Crafted Model Files

Ollama versions before 0.7.0 contain parsing flaws that allow attackers to execute arbitrary code by loading a crafted GGUF model through the API, and users should update immediately.

The issue stems from unsafe handling of untrusted metadata during model load, which enables an out-of-bounds write and a reliable path to code execution in common builds.

An attacker who can reach the Ollama API can upload a malicious model or cause the server to pull one, triggering remote code execution in the model runner process.

In practical terms, that means a malicious GGUF can flip bits well beyond the vector's allocated storage, overwrite function pointers reachable during inference, and redirect execution to attacker-chosen code.

While exploitability is easier on non-PIE builds (e.g., local debug builds), release builds with PIE remain plausibly exploitable via more advanced primitives.

Hence, the safest course is to upgrade to the latest version and restrict API access to trusted users only.

Background context: Ollama is widely deployed, mixes Go with high-performance C/C++ (via llama.cpp), supports GGUF imports and a large model library, and exposes a local REST API and CLI so model parsing is a high-risk surface.

Technical Details

The core bug sits in the multimodal mllama loader logic: metadata entries include an untrusted array of layer indices used to mark intermediate layers, which are written into a std::vector<bool> without bounds checks.

Because std::vector<bool> packs flags as bits, writing out of bounds becomes a bit-flip primitive that can set arbitrary zero bits to one beyond the vector’s storage.

In heap-adjacent structures used during inference (for example, ggml backend interfaces), some function pointers begin as NULL; flipping the right bits can turn a NULL into a valid code address and force a call through a pointer like iface.synchronize when inference code paths invoke it.

From there, a reliable exploit chain uses a stack pivot gadget such as "mov rsp, rbx; pop rbp; ret" to pivot into attacker-controlled memory in the same structure, then a short ROP chain to sanitize the remaining corrupted pointers, and finally achieves arbitrary command execution by rewriting a writable GOT entry (partial RELRO) so that subsequent calls to free resolve to system instead.

In non-PIE builds, the executable base is fixed, making gadget assembly straightforward; in PIE builds, attackers would need a separate leak or a different corruption (e.g., creating an arbitrary read/write primitive) before inference can resolve addresses, which increases complexity but does not eliminate risk.

Notably, the C++ mllama path that allowed this has been replaced by a Go implementation on mainline, removing the vulnerable code path; users must still update to a release that includes the rewrite.

Identifier: CVE-TBD (pending assignment)
Affected products: Ollama model runner (mllama model parsing)
Affected versions: < 0.7.0
Component: mllama metadata parsing (std::vector<bool> index handling)
Vulnerability type: Out-of-bounds write leading to RCE
Attack vector: Malicious GGUF model loaded via API/pull or local load
Exploit prerequisites: Ability to make the server load a crafted model; inference reached
Impact: Remote code execution in runner process; command execution under runner context
Severity (CVSS): Pending; likely High to Critical
Exploit status: Proof-of-concept demonstrated with controlled call and ROP in common builds
Mitigation: Upgrade to ≥ 0.7.0; restrict API access; avoid untrusted registries; prefer signed/verified models
Fix: Vulnerable C++ path replaced with Go implementation; added bounds checks in parsing paths

Ollama’s popularity, broad model support, and simple REST API make it a high‑value target, and its GGUF pipeline imports untrusted binary metadata before any user interaction.

Upgrading closes a trivially reachable RCE path via parsing, and hardening the deployment (local-only bind, authenticated proxy, and trusted registries) limits exposure even if new parsing bugs surface later.

Varshini
Varshini is a cybersecurity expert specializing in threat analysis, vulnerability assessment, and research, passionate about staying ahead of emerging threats and technologies.
