Lethal Trifecta
Quick Answer
The lethal trifecta is the threat-model shorthand for the three ingredients that make a tool-using AI agent dangerous when combined: an untrusted instruction source, a sensitive data source, and an exfiltration or side-effect channel — all reachable inside one planning loop. Any two are manageable; all three together let attacker text in an ingested document steer the agent into reading private data and writing it to an attacker-reachable sink without a user click.
Lethal Trifecta
The lethal trifecta is the threat-model shorthand for the three ingredients that, present together inside a single agent task, turn ordinary prompt injection into authorized data exfiltration: an untrusted instruction source (email, issue, retrieved chunk, memory item), a sensitive data source (private repository, inbox, token), and an exfiltration or side-effect channel (HTTP fetch, public comment, pull request). The term names the composition, not any single attack technique. Any two ingredients are usually manageable; all three reachable in one planning loop make the agent a confused deputy that the attacker, not the user, controls.
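Because the trifecta names a composition rather than a single technique, one defensive framing is to track which of the three capability classes a session has already touched and refuse the tool call that would complete the set. The sketch below is a minimal, hypothetical illustration of that idea in Python; the tool names and capability labels are invented for this example and do not correspond to any real agent framework.

```python
# Hypothetical trifecta gate: tag each tool with the capability classes it
# exercises, track what the session has touched, and deny any call whose
# capabilities would complete all three. Illustrative only.
UNTRUSTED_INPUT = "untrusted_input"   # e.g. reading an external email or public issue
SENSITIVE_READ = "sensitive_read"     # e.g. reading a private repo, inbox, or token
EXTERNAL_WRITE = "external_write"     # e.g. HTTP fetch, public comment, pull request

# Invented tool names; a fetch both carries attacker text and can leak data.
TOOL_CAPS = {
    "read_issue": {UNTRUSTED_INPUT},
    "read_private_repo": {SENSITIVE_READ},
    "open_pull_request": {EXTERNAL_WRITE},
    "http_get": {UNTRUSTED_INPUT, EXTERNAL_WRITE},
}

class TrifectaGate:
    def __init__(self):
        self.touched = set()  # capability classes used so far in this task

    def allow(self, tool: str) -> bool:
        """Permit the call unless it would put all three classes in one loop."""
        caps = TOOL_CAPS.get(tool, set())
        if len(self.touched | caps) == 3:
            return False          # completing the trifecta: deny
        self.touched |= caps
        return True

gate = TrifectaGate()
gate.allow("read_issue")            # untrusted input alone: allowed
gate.allow("read_private_repo")     # plus sensitive read: still only two classes
gate.allow("open_pull_request")     # would add the third class: denied
```

Note that this gate is order-independent: whichever call would complete the third class is the one refused, which matches the observation above that any two ingredients are usually manageable on their own.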
Concrete instances include a malicious public GitHub issue steering an MCP-equipped agent into leaking private repository contents via an autonomously opened pull request, and EchoLeak (CVE-2025-32711), a zero-click exfiltration against Microsoft 365 Copilot triggered by a crafted external email.
See also
- Indirect prompt injection — how the untrusted-instruction ingredient typically gets activated
- Tool hijacking — how the side-effect channel ingredient is exercised
- Memory poisoning — how untrusted instructions persist across sessions