Back to Publications

Superseded

This paper has been superseded by Hardening Multi-Agent Systems Against Prompt Injection. The current treatment lives there.

AI SecuritySupersededOctober 2, 2025

Exploiting Multi Agent Systems: How Prompt Injection Turns Collaboration into Compromise

AI SecurityPrompt InjectionMulti-Agent Systems

Exploiting Multi Agent Systems: How Prompt Injection Turns Collaboration into Compromise

Placeholder intro | Conference briefing

Presenter: Jeremy Richards
Team: AI Red Team, ServiceNow

Intro

Large language model agents no longer operate in isolation. They collaborate, delegate, and execute tool calls across orchestration layers, which creates a larger security surface than single-agent prompt injection alone.

This briefing introduces a red-team attack path for multi-agent systems: harvesting hidden system prompts, escalating through mirrored pattern injections, subverting planner behavior, and influencing downstream tool use. The session focuses on both direct prompt injection and second-hand (indirect) injections that move across agent boundaries and compound into mission-level compromise.

Session details

  • Date and time: Thursday, October 2, 3:15 PM to 4:00 PM
  • Room: 801B
  • Format: 45-minute briefing
  • Tracks: AI, ML, and Data Science; Application Security

Scope and deliverables

  • Attack chain walkthroughs for direct and indirect multi-agent injection
  • Practical defender controls with trade-offs in engineering effort and efficacy
  • Orchestrator policy patterns for reducing blast radius
  • Observability patterns that support security telemetry while respecting privacy

Status

This page is a placeholder intro for the published session. A longer technical write-up and supporting materials will be added in a future revision.