OpenAI’s Sora 2, a cutting-edge video generation model, has a notable security vulnerability that allows researchers to extract its hidden system prompt through audio transcripts, highlighting risks in multimodal AI systems.
This flaw, uncovered by AI security firm Mindgard, demonstrates how cross-modal prompting can bypass safeguards, potentially enabling misuse or deeper attacks on model behavior.
While the prompt itself contains standard guardrails, its exposure underscores the need to treat system instructions as sensitive configuration data.
Inside OpenAI Sora 2: Uncovering System Prompts Driving Multi-Modal LLMs
Sora 2 represents a leap in multimodal AI, generating 15-second videos with integrated audio and visuals from text prompts. However, this complexity introduces semantic drift: each transformation between modalities re-encodes content as text, image, or audio, and meaning degrades a little each time.
Mindgard’s team, led by Aaron Portnoy, began experimenting on November 3, 2025, testing extraction across text, images, video, and audio to reveal the model’s internal rules.
Direct text requests failed due to training against prompt leaks, so they shifted to visual and auditory outputs, where safeguards are weaker.
Initial attempts focused on rendering text as images or video frames, but results suffered from glyph distortions and frame inconsistencies.
For instance, prompts for ASCII art or signs produced legible starts that quickly devolved into unreadable approximations, as models prioritize visual plausibility over exact symbols.
Encoded formats like QR codes or barcodes fared worse, yielding visually convincing but decodable gibberish due to pixel imprecision.
The breakthrough came with audio: prompting Sora 2 to narrate short fragments of its system prompt as sped-up speech produced clips that could be transcribed with high fidelity within the 15-second limit.
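The 15-second ceiling explains why the prompt had to be extracted in fragments. A back-of-the-envelope estimate (the speech rate and speed-up factors here are my assumptions, not figures from Mindgard's report) shows how little text fits in one clip:

```python
# Rough estimate of how much prompt text fits in a single Sora 2 clip.
# Assumption: ~150 words/min conversational speech, and that narration
# can be sped up somewhat while staying transcribable.
CLIP_SECONDS = 15
BASE_WPM = 150  # typical conversational rate; an assumption

def words_per_clip(speedup: float) -> int:
    """Approximate words that fit in one 15-second clip at a given speed-up."""
    return int(BASE_WPM * speedup * CLIP_SECONDS / 60)

for speedup in (1.0, 1.5, 2.0):
    print(f"{speedup}x speech -> ~{words_per_clip(speedup)} words per clip")
# Even at 2x, only on the order of tens of words per clip, so a full
# system prompt requires many chained requests.
```

At these rates, a system prompt of several hundred words needs a dozen or more clips, which is why the chaining step below matters.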
By chaining these clips, substituting placeholders for sensitive parts, and stitching the transcripts together, researchers reconstructed the full prompt, including directives for metadata generation, content restrictions on nudity and copyrighted material, and fixed parameters such as 30 FPS output and a 1.78:1 aspect ratio.
This method outperformed visuals by avoiding rendering errors inherent to probabilistic pixel generation.
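The chunk-and-stitch logic can be sketched as follows. This is a minimal illustration only: the real attack asks Sora 2 to narrate successive fragments of its hidden prompt and transcribes the resulting audio, a round trip that is simulated here with a local string, and the helper names are mine.

```python
def make_fragment_requests(total_words: int, per_clip: int) -> list[tuple[int, int]]:
    """Word ranges to request, one narration clip per range."""
    return [(start, min(start + per_clip, total_words))
            for start in range(0, total_words, per_clip)]

def stitch(transcripts: list[str]) -> str:
    """Reassemble per-clip transcripts into one text."""
    return " ".join(t.strip() for t in transcripts)

# Stand-in for the hidden system prompt (the real one is much longer).
hidden = "You are ChatGPT, a large language model trained by OpenAI.".split()

# In the real attack, each request prompts the model to *speak* words
# [start, end) of its instructions; transcription recovers the text.
requests = make_fragment_requests(len(hidden), per_clip=4)
transcripts = [" ".join(hidden[a:b]) for a, b in requests]
recovered = stitch(transcripts)
print(recovered)
```

The design point is simply that each fragment fits the clip length, and ordering the requests makes reassembly trivial; errors in any one clip only corrupt that fragment, which can be re-requested.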
The extracted prompt starts with “You are ChatGPT, a large language model trained by OpenAI,” outlining video-specific rules, such as avoiding lyrics and ensuring consistency with the input images.
Mindgard disclosed the issue to OpenAI on November 4, received acknowledgment by November 7, and published its findings on November 12.
This timeline demonstrates responsible vulnerability handling, but it also exposes broader issues with frontier LLMs.
Multimodal drift amplifies leakage risks, as transformations compound uncertainties, making outputs unpredictable. Vendors must enhance cross-modal testing, while users should verify prompt protections in AI integrations.
Though not immediately exploitable for harm, such leaks could inform jailbreaks or policy evasions, underscoring the need for stronger red-teaming.
As AI evolves, securing system prompts is crucial, akin to protecting API keys.