Agentic AI
Beyond Jailbreaks: Contextual Red Teaming for Agentic AI
Eresus Security Research TeamSecurity Researcher
July 15, 2024
1 min read
Overview
As LLMs evolve from standalone text generators to "Agentic AI", they now interact directly with internal APIs, databases, and third-party services. A simple "jailbreak" is no longer the highest threat; instead, indirect prompt injections that manipulate the agent into performing cross-tenant data exfiltration or logic abuse have become critical.
The Contextual Red Team Approach
Testing must account for the environment. If an agent has access to a SQL database via a semantic search tool, researchers must test whether an attacker can load malicious context via a seemingly innocent email (Indirect Prompt Injection) which forces the Agent to dump the database schema in its next API call.