The AI Agent Security Audit You Haven't Done
Enterprise AI agents with tool use (web browsing, email, database access, code execution) are in production across organizations that reviewed them as chat models. This episode dissects why prompt injection is the primary unsolved threat, how tool permission creep happens silently across development cycles, and what it takes to build a structured AI agent security program.
The Deployment Debrief · Host: Elise · AI Insight Lab
Key takeaways
1. An AI agent with tool use is not the same security problem as an AI model without it: the threat surface, incident attribution model, and governance requirements are categorically different.
2. Prompt injection is unsolved at scale: OWASP LLM Top 10 documents that over 90% of major frameworks lack reliable defenses, and your production agents are not exceptions.
3. Tool permission creep happens silently across development iterations: the permissions granted at deployment rarely match the permissions in use six months later.
4. The EU AI Act's 72-hour incident reporting window requires a workflow most enterprises have not built, and AI agent incidents are the most likely trigger for that obligation.
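The permission creep described in takeaway 3 can be surfaced by comparing three sets of scopes for each agent: what was approved at deployment, what is granted today, and what the agent actually used in a recent window. The sketch below is illustrative only. The function name, the scope strings, and the 90-day window are assumptions made for the example, not part of any specific framework discussed in the episode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PermissionDrift:
    added: frozenset[str]           # granted now, absent from the approved baseline
    unused: frozenset[str]          # granted now, never exercised in the audit window
    unapproved_use: frozenset[str]  # exercised, but never part of the baseline


def permission_drift(approved, granted, used) -> PermissionDrift:
    """Compare the approved baseline with current grants and observed usage."""
    approved, granted, used = map(frozenset, (approved, granted, used))
    return PermissionDrift(
        added=granted - approved,
        unused=granted - used,
        unapproved_use=used - approved,
    )


# Hypothetical data for one agent's service account.
approved_at_deployment = {"crm.read", "email.send"}
granted_today = {"crm.read", "crm.write", "email.send", "files.write"}
used_last_90_days = {"crm.read", "crm.write"}

drift = permission_drift(approved_at_deployment, granted_today, used_last_90_days)
print("added since approval:", sorted(drift.added))
print("granted but unused:  ", sorted(drift.unused))
print("used but unapproved: ", sorted(drift.unapproved_use))
```

In practice the three input sets would come from the original security review, the identity provider, and tool-call logs. Any non-empty `added` or `unapproved_use` set is a candidate finding for the review.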
Episode sections
Why enterprise AI security reviews written for chat models don't cover the threat surface that tool-using agents create — and why most organizations don't realize the gap yet.
What changed between 2023 and 2024: from models that generate text to agents that execute SQL, send email, browse the web, and write files — and why that distinction changes the entire security model.
Why over 90% of major LLM frameworks lack reliable prompt injection defenses, how prompt injection works operationally, and what a successful attack looks like in a production enterprise environment.
Why the security review that approved your LLM deployment didn't include tool-use, service account permissions, or incident attribution — and how that gap compounds with each new agent deployment.
Tool permission creep across development cycles, the absence of standard service account review frameworks for AI agents, and why EU AI Act incident reporting requirements have no corresponding workflow at most enterprises.
Three approaches to agent security: reactive (post-incident), compliance-driven (regulation-first), and proactive (threat-model-first). What each looks like operationally, and which one your organization is implicitly running right now.
The specific audit framework: agent inventory, tool permission review, prompt injection testing, incident attribution workflow, and insurance coverage gap analysis — what each step requires and who owns it.
The six questions your CISO cannot currently answer about your enterprise AI agents — and why the inability to answer them is itself a material security finding.
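The five audit steps named in the audit framework section (agent inventory, tool permission review, prompt injection testing, incident attribution workflow, insurance coverage gap analysis) lend themselves to a simple per-agent record that tracks who owns each step and whether it is done. The following is a minimal sketch under stated assumptions: the `AgentRecord` fields, the `audit_gaps` helper, and the example agent are hypothetical and only illustrate the idea of an ownership-tracked inventory.

```python
from dataclasses import dataclass, field

# The five audit steps, in the order the episode lists them.
AUDIT_STEPS = (
    "agent_inventory",
    "tool_permission_review",
    "prompt_injection_testing",
    "incident_attribution_workflow",
    "insurance_coverage_gap_analysis",
)


@dataclass
class AgentRecord:
    name: str
    tools: list[str]
    service_account: str
    # Step name -> owner. A missing step means nobody owns it yet.
    step_owners: dict[str, str] = field(default_factory=dict)
    completed: set[str] = field(default_factory=set)


def audit_gaps(agent: AgentRecord) -> list[str]:
    """Return one line per audit step that is unowned or not yet completed."""
    gaps = []
    for step in AUDIT_STEPS:
        if step not in agent.step_owners:
            gaps.append(f"{step}: no owner")
        elif step not in agent.completed:
            gaps.append(f"{step}: owned by {agent.step_owners[step]}, not done")
    return gaps


# Hypothetical agent used only to exercise the sketch.
agent = AgentRecord(
    name="support-triage-agent",
    tools=["email.send", "crm.read"],
    service_account="svc-support-agent",
    step_owners={
        "agent_inventory": "platform team",
        "tool_permission_review": "security team",
    },
    completed={"agent_inventory"},
)

for gap in audit_gaps(agent):
    print(gap)
```

Running this over every deployed agent gives a first-pass list of open items. A step with no owner is arguably the more serious finding, since nothing will close it by default.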