AI Sec Reviews

OWASP LLM Top 10 Mitigation Guide: Controls for Every Risk Category (2025 Edition)

A practitioner's OWASP LLM Top 10 mitigation guide covering all ten 2025 risk categories — prompt injection through unbounded consumption — with concrete controls, tooling pointers, and residual-risk notes.

By Aisecreviews Editorial · 8 min read

If you are evaluating your LLM deployment posture or advising a team on where to put security controls, the 2025 edition of the OWASP Top 10 for LLM Applications is the most practical framework available for scoping that work. This article walks through each of the ten risk categories, the controls OWASP recommends, and where those defenses fall short, so you can layer them appropriately.

What Changed in the 2025 Edition

The 2023 v1.1 list already established the core attack surface. The 2025 revision reorganized that surface based on real incident reports and widened coverage for agentic and RAG-heavy deployments. Three categories are effectively new or substantially reworked:

  • System Prompt Leakage (LLM07) — split out from general information disclosure because system prompt extraction has become a reliable first step in multi-stage attacks against deployed assistants.
  • Vector and Embedding Weaknesses (LLM08) — added to address retrieval poisoning, access-control failures in vector stores, and embedding inversion attacks that arrive with RAG pipelines.
  • Misinformation (LLM09) — replaces the earlier “Overreliance” label with a framing that better captures hallucination amplification as a deliberate adversarial technique, not just a quality problem.

The former Model Denial of Service was renamed Unbounded Consumption (LLM10) to capture cost-based attacks alongside availability attacks. Model Theft and Insecure Plugin Design were removed as standalone categories; their residual content was folded into other entries, with model extraction now discussed under Unbounded Consumption and plugin and tool risks under Excessive Agency.

Category-by-Category Control Mapping

LLM01 — Prompt Injection. The highest-priority risk and the one with the widest residual attack surface after controls are applied. Mitigations include strict role instructions in system prompts, input sanitization, output format enforcement via JSON Schema or structured-output mode, and privilege segregation that prevents injected instructions from reaching privileged tool calls. Direct injection (via user turn) and indirect injection (via retrieved document content) require separate detection logic. Red-teaming this category continuously matters more than any single static control; the Promptfoo framework’s adversarial plugin suite covers both injection vectors against deployed endpoints. For a deeper look at offensive techniques motivating these defenses, aisec.blog covers prompt injection and jailbreak tradecraft in technical detail.
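A minimal Python sketch of how output format enforcement and privilege segregation might fit together. The tool names, the two-key JSON shape, and the `context_has_untrusted_content` flag are illustrative assumptions, not part of the OWASP text or of any particular framework.

```python
import json

# Hypothetical tool names; a real deployment would define its own.
READ_ONLY_TOOLS = {"search_docs", "summarize"}
PRIVILEGED_TOOLS = {"send_email", "delete_record"}


def parse_tool_call(raw_output: str) -> dict:
    """Accept only a JSON object with exactly the keys 'tool' and 'args'."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc
    if not isinstance(data, dict) or set(data) != {"tool", "args"}:
        raise ValueError("model output does not match the expected shape")
    if not isinstance(data["tool"], str) or not isinstance(data["args"], dict):
        raise ValueError("wrong field types in model output")
    return data


def authorize(call: dict, context_has_untrusted_content: bool) -> bool:
    """Allow privileged tools only when no untrusted content is in context."""
    tool = call["tool"]
    if tool in READ_ONLY_TOOLS:
        return True
    if tool in PRIVILEGED_TOOLS:
        # Retrieved documents or web pages may carry injected instructions,
        # so privileged calls are refused while such content is in context.
        return not context_has_untrusted_content
    return False  # unknown tools are denied by default
```

The design choice here is that the decision to run a privileged tool depends on where the context came from, not on what the model says about its own intent. This narrows the indirect-injection path but does not detect injection itself.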

LLM02 — Sensitive Information Disclosure. Training-time and inference-time leakage require different controls. At inference time: output filtering for PII patterns, secrets scanning on completions, and access-scoped context windows so users cannot extract data outside their authorization boundary. At training time: data minimization, differential privacy where budget allows, and membership-inference testing before production deployment.
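As a rough illustration of inference-time output filtering, the Python sketch below redacts a few pattern classes from a completion before it is returned. The three regular expressions are simplified assumptions chosen for the example; production filters usually need broader and better-tested pattern sets.

```python
import re

# Simplified, illustrative patterns; not an exhaustive PII or secrets list.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "AWS_ACCESS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(completion: str) -> tuple[str, list[str]]:
    """Replace matches with placeholders and report which classes fired."""
    findings = []
    for label, pattern in PATTERNS.items():
        completion, count = pattern.subn(f"[REDACTED_{label}]", completion)
        if count:
            findings.append(label)
    return completion, findings
```

Returning the list of triggered classes alongside the redacted text lets the caller log or alert on leaks without storing the sensitive values themselves.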

LLM03 — Supply Chain. Every fine-tuned model, every third-party embedding API, every external dataset is an attack surface. Controls: integrity verification (cryptographic signing, hash checks) for model weights and adapter layers; a signed Software Bill of Materials (SBOM) per OWASP CycloneDX ML extensions; behavioral acceptance testing against a standardized red-team suite before any model upgrade lands in production. Static scanning tools like ModelAudit can detect embedded executables and backdoor triggers in serialized weights.
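The hash-check part of integrity verification can be sketched in a few lines of Python. This assumes the expected SHA-256 digest is pinned somewhere trustworthy (for example, in a signed manifest); signature verification itself is out of scope for this sketch, and the function names are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large weight files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file matches the pinned digest."""
    actual = sha256_of_file(path)
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A loader would call `verify_artifact` before deserializing weights or adapters and refuse to proceed on a mismatch.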

LLM04 — Data and Model Poisoning. Malicious training data can install behavioral backdoors that trigger on specific token sequences. Practical defenses: vet data vendors against provenance documentation, run anomaly detection over label distributions, and hold out a clean evaluation set for post-training behavioral regression testing. For agentic systems with continuous learning, treat every new data ingestion as a supply-chain event with the same controls as LLM03.
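One simple form of anomaly detection over label distributions is to compare each incoming batch against a trusted baseline. The Python sketch below uses total variation distance; the 0.15 threshold is an arbitrary assumption for illustration and would need tuning per dataset.

```python
from collections import Counter


def label_distribution(labels: list[str]) -> dict[str, float]:
    """Convert a list of labels into relative frequencies."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def total_variation(p: dict[str, float], q: dict[str, float]) -> float:
    """Half the L1 distance between two discrete distributions (0 to 1)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def is_suspicious(baseline: list[str], batch: list[str],
                  threshold: float = 0.15) -> bool:
    """Flag a batch whose label mix drifts far from the trusted baseline."""
    distance = total_variation(label_distribution(baseline),
                               label_distribution(batch))
    return distance > threshold
```

A check like this catches crude label flipping but not a targeted backdoor that preserves the overall label mix, which is why the held-out behavioral regression set still matters.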

LLM05 — Improper Output Handling. LLM outputs fed directly to browsers, SQL engines, or shell interpreters without sanitization reproduce the classic injection triad: XSS, SQLi, and command injection, now triggered by model output. Apply context-aware encoding: HTML-encode before rendering, parameterize before querying, sandbox before executing. Treat model output as untrusted user input at every downstream boundary.
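Two of these boundaries can be shown with the Python standard library alone. The sketch below HTML-encodes model output before rendering and passes it to SQLite as a bound parameter; the `orders` table and both function names are hypothetical.

```python
import html
import sqlite3


def render_safely(model_output: str) -> str:
    """HTML-encode model output before placing it in a page."""
    return f"<p>{html.escape(model_output)}</p>"


def find_orders(conn: sqlite3.Connection, customer_name: str) -> list[int]:
    """Pass model-derived values as bound parameters, never via string
    formatting, so they cannot change the structure of the query."""
    cur = conn.execute(
        "SELECT id FROM orders WHERE customer = ?", (customer_name,)
    )
    return [row[0] for row in cur.fetchall()]
```

With this in place, a completion such as `x' OR '1'='1` is treated as a literal customer name rather than as SQL. Shell execution needs a separate control (a sandbox), which this sketch does not cover.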

LLM06 — Excessive Agency. Agentic systems that can browse, write files, send email, or call APIs are high-value targets precisely because compromising the model compromises every tool the model can reach. The principle of least privilege applies directly: grant the minimal tool set the task actually requires, require explicit user approval for irreversible or high-impact actions, and enforce separation between the planning context and the execution context. GuardML covers runtime guardrail tooling that enforces tool-call policies at the agent boundary, which is the right layer for this control.
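A tool-call policy of this kind can be expressed as a small data structure. The Python sketch below is an assumption about how such a gate might look, not the API of any guardrail product; the `approve` callback stands in for whatever human-approval flow a deployment uses.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToolPolicy:
    """Per-task tool allowlist with an approval gate for risky actions."""
    allowed: set[str]
    needs_approval: set[str] = field(default_factory=set)

    def check(self, tool: str, approve: Callable[[str], bool]) -> bool:
        # Anything outside the task's minimal tool set is denied outright.
        if tool not in self.allowed:
            return False
        # Irreversible or high-impact tools need an explicit human decision.
        if tool in self.needs_approval:
            return approve(tool)
        return True
```

For example, a summarization task might be given `ToolPolicy(allowed={"read_file", "send_email"}, needs_approval={"send_email"})`, so that reading proceeds automatically while sending mail waits for a person.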

LLM07 — System Prompt Leakage. System prompts increasingly encode business logic, customer data, and API keys — none of which belong there. Controls: keep credentials and sensitive context in environment variables or secrets managers accessed via secure lookups, not embedded in the prompt string; monitor output for verbatim system prompt fragments; rotate system prompt content periodically to reduce the value of a leaked snapshot.
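Monitoring for verbatim fragments can be approximated with a sliding window over the system prompt. The Python sketch below is a naive substring check under assumed parameters (an eight-word window by default); it will miss paraphrased or translated leaks.

```python
def leaked_fragments(system_prompt: str, output: str,
                     window: int = 8) -> list[str]:
    """Return word windows from the system prompt that appear in the output.

    Matching is case-insensitive and whitespace-normalized.
    """
    words = system_prompt.lower().split()
    if not words:
        return []
    # Shrink the window for prompts shorter than the requested size.
    window = max(1, min(window, len(words)))
    haystack = " ".join(output.lower().split())
    hits = []
    for start in range(len(words) - window + 1):
        fragment = " ".join(words[start:start + window])
        if fragment in haystack:
            hits.append(fragment)
    return hits
```

A non-empty result is a signal to block or log the response. Smaller windows catch shorter leaks at the cost of more false positives on common phrases.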

LLM08 — Vector and Embedding Weaknesses. RAG pipelines introduce a new attack path: poisoning the vector store to control what context gets retrieved, or exploiting access-control gaps to retrieve documents outside the querying user’s authorization scope. Controls: strict logical partitioning in the vector database with per-user or per-role namespace isolation; input sanitization on content before it is chunked and embedded; regular audits of retrieval results against expected authorization boundaries. Embedding inversion (recovering the original text from its embedding vectors) is an emerging threat that differential-privacy techniques partially address.
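Namespace isolation is easiest to reason about when the authorization filter runs before similarity ranking. The toy in-memory store below illustrates the ordering in Python; the `Chunk` type and `retrieve` function are hypothetical, and a real vector database would apply the equivalent filter server-side.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    namespace: str
    text: str
    vector: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(store: list[Chunk], query_vector: list[float],
             allowed_namespaces: set[str], k: int = 3) -> list[Chunk]:
    """Filter by the caller's namespaces first, then rank by similarity."""
    candidates = [c for c in store if c.namespace in allowed_namespaces]
    candidates.sort(key=lambda c: cosine(c.vector, query_vector), reverse=True)
    return candidates[:k]
```

Because out-of-scope chunks are dropped before ranking, a query that is semantically closest to a restricted document still cannot surface it.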

LLM09 — Misinformation. Hallucination becomes an attack when an adversary can predict or trigger false outputs to influence downstream decisions. Mitigations: Retrieval-Augmented Generation grounded to vetted source corpora, citation requirements enforced in system prompts, human-in-the-loop review for high-stakes outputs, and output confidence scoring where the model supports it. RAG reduces but does not eliminate hallucination — grounding helps with factual recall but not with adversarially crafted retrieval contexts.
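A citation requirement stated in a system prompt can also be checked after generation. The Python sketch below assumes answers cite sources with bracketed numeric markers such as `[2]`, which is an illustrative convention rather than anything OWASP specifies.

```python
import re

# Assumed citation format: bracketed integers like [1] or [12].
CITATION = re.compile(r"\[(\d+)\]")


def check_citations(answer: str,
                    retrieved_ids: set[int]) -> tuple[bool, set[int]]:
    """Pass only if the answer cites at least one source and every cited
    ID was actually retrieved for this request."""
    cited = {int(m) for m in CITATION.findall(answer)}
    unknown = cited - retrieved_ids
    ok = bool(cited) and not unknown
    return ok, unknown
```

This verifies that citations exist and point at retrieved documents. It does not verify that the cited document supports the claim, so human review remains necessary for high-stakes outputs.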

LLM10 — Unbounded Consumption. Token-exhaustion attacks inflate cost and degrade availability. Controls: per-user and per-session rate limits, maximum token budgets per request enforced at the API gateway layer, queue depth monitoring with autoscaling guards, and cost anomaly alerting. Recursive prompt patterns that expand token usage through chained tool calls require specific detection logic beyond simple per-request limits.
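Per-request caps and per-user budgets can be combined in one gate. The Python sketch below keeps a sliding window of token usage per user in memory; the class name and limits are assumptions, and a real gateway would back this with shared storage.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class TokenBudget:
    """Per-request cap plus a per-user sliding-window token budget."""

    def __init__(self, max_tokens_per_request: int,
                 max_tokens_per_window: int,
                 window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_tokens_per_request = max_tokens_per_request
        self.max_tokens_per_window = max_tokens_per_window
        self.window_seconds = window_seconds
        self.clock = clock
        # user_id -> deque of (timestamp, tokens) for recent requests
        self._usage = defaultdict(deque)

    def allow(self, user_id: str, requested_tokens: int) -> bool:
        if requested_tokens > self.max_tokens_per_request:
            return False
        now = self.clock()
        usage = self._usage[user_id]
        # Drop entries that have aged out of the window.
        while usage and now - usage[0][0] >= self.window_seconds:
            usage.popleft()
        spent = sum(tokens for _, tokens in usage)
        if spent + requested_tokens > self.max_tokens_per_window:
            return False
        usage.append((now, requested_tokens))
        return True
```

The injectable `clock` exists only to make the window testable. Chained tool calls would need to charge every sub-request against the same user budget for this to limit recursive expansion.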

Building the Control Stack

OWASP structures each entry with Prevention and Mitigation Strategies, Example Attack Scenarios, and Reference Links; the full 2025 PDF is the authoritative source for implementation detail. In practice, the controls cluster into three layers:

  1. Input layer — sanitization, injection detection, rate limiting, prompt structure enforcement.
  2. Output layer — format validation, PII filtering, downstream encoding, secrets scanning.
  3. Runtime/governance layer — tool-call allowlists, human approval gates, audit logging, supply-chain provenance, cost monitoring.
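The three layers above can be pictured as a wrapper around the model call. The Python sketch below is a deliberately simplified assumption about how they might compose; `guarded_call` and its parameters are hypothetical names, and the audit log is a plain list standing in for real logging.

```python
from typing import Callable, Optional

Check = Callable[[str], bool]   # returns True when the text is acceptable
Filter = Callable[[str], str]   # returns a transformed copy of the text


def guarded_call(prompt: str,
                 model: Callable[[str], str],
                 input_checks: list[Check],
                 output_filters: list[Filter],
                 audit_log: list[tuple]) -> Optional[str]:
    """Run input checks, call the model, filter the output, and log."""
    # Input layer: reject the request on the first failed check.
    for check in input_checks:
        if not check(prompt):
            audit_log.append(("blocked_input", check.__name__))
            return None
    completion = model(prompt)
    # Output layer: apply each filter in order.
    for output_filter in output_filters:
        completion = output_filter(completion)
    # Governance layer: record what happened for later review.
    audit_log.append(("completed", len(completion)))
    return completion
```

Keeping each control as a separate callable makes it possible to roll out layers incrementally and to test them in isolation.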

No single control addresses the full list. LLM01 (prompt injection) and LLM06 (excessive agency) are the highest-leverage targets for most teams deploying non-trivial agentic systems in 2025; LLM03 and LLM08 become critical once fine-tuning and RAG pipelines are in scope. A tiered rollout — harden the input/output perimeter first, then close the agency and supply-chain gaps — is more realistic than attempting all ten simultaneously.


Sources

  1. OWASP Top 10 for Large Language Model Applications (Official Project)
  2. OWASP Top 10 for LLMs v2025 — Official PDF
  3. OWASP Top 10 for LLMs 2025: Key Risks and Mitigation Strategies — Invicti
  4. OWASP Top 10 2025 for LLM Applications: Risks and Mitigation Techniques — Confident AI