# AI Agent Systems Unknown Unknowns Checklist

This is the highest-density area of unknown unknowns because AI agent systems are new enough that most developers have never encountered these failure modes.

## Prompt Injection & Security

- Direct prompt injection — users craft inputs that override system prompts
- Indirect prompt injection — malicious content in documents/websites the agent reads
- Tool abuse — agent tricked into calling tools with malicious parameters
- Privilege escalation — agent accessing resources beyond its intended scope
- Data exfiltration via prompt injection — agent tricked into sending data to an attacker
- System prompt extraction — users can often get the agent to reveal its instructions
- Jailbreaking — circumventing safety guidelines through creative prompting
- Multi-step attacks — benign individual steps that combine into harmful actions

## Cost & Resource Control

- Token costs can explode with agent loops (agent retrying or thinking in circles)
- No per-user or per-request cost caps → one user can drain your budget
- Long conversations accumulate context → each message costs more than the last
- Tool calls multiply costs (each tool call is a separate API call)
- Retry loops without backoff or max attempts
- Parallel agent execution without concurrency limits
- Large document processing (PDFs, codebases) consuming massive token counts
- No monitoring of per-user or per-session cost

## Reliability & Failure Modes

- LLMs are non-deterministic — the same input can produce different outputs
- Hallucinated tool calls — agent invents tools or parameters that don't exist
- Hallucinated data — agent confidently presents fabricated information
- Cascading hallucinations — one agent's hallucination feeds into another's input
- Infinite loops — agent gets stuck in a retry or reasoning cycle
- Partial failures — agent completes some steps but fails silently on others
- Context window overflow — conversation exceeds model limits and drops information
- Model degradation — API provider changes model behavior without notice
- Rate limiting from API providers during high usage

## Multi-Agent Specific

- No coordination protocol — agents working at cross purposes
- Deadlocks — agents waiting on each other
- Race conditions — multiple agents modifying the same resource
- Message ordering not guaranteed in async agent communication
- No shared state management — agents have inconsistent views of the world
- Error propagation — one agent's failure cascades through the system
- No supervisor/orchestrator — no way to detect when the system is stuck
- Trust boundaries — should Agent A trust Agent B's output?
- Observability — hard to debug what happened across multiple agents
- Testing — traditional unit tests don't cover non-deterministic behavior

## Data & Privacy

- User data sent to LLM APIs — check the provider's data retention policies
- PII in prompts — sensitive data exposed to third-party AI providers
- Conversation logs containing sensitive information
- Agent memory/context containing data from other users (multi-tenant leaks)
- Training data inclusion — some providers may train on your data
- GDPR right to deletion includes AI conversation data
- Vector databases containing embeddings of sensitive documents

## UX & Trust

- Users don't understand what the AI can and can't do
- No confidence indicators — users can't tell when the AI is guessing
- Users over-trust AI output (especially in medical, legal, and financial domains)
- No human-in-the-loop for high-stakes decisions
- Missing audit trail — can't explain why the AI did something
- Inconsistent behavior confuses users (different answers to the same question)
- Latency expectations — AI operations are slower than traditional CRUD
- Error messages from AI failures are often cryptic to users
- No feedback mechanism — users can't report bad AI behavior

## Evaluation & Quality

- No evaluation framework — no way to measure whether the AI is getting better or worse
- No regression testing for prompt changes
- A/B testing is complex with non-deterministic outputs
- Ground truth is expensive to establish for open-ended tasks
- Prompt drift — the system prompt accumulates changes that interact unpredictably
- Model version changes break carefully tuned prompts
- No monitoring of output quality over time
- Bias in AI outputs not measured or mitigated

## Architecture Patterns

- Putting too much logic in prompts vs. code (prompts are fragile, code is reliable)
- Not separating AI decision-making from action execution
- Missing fallback for when the AI is unavailable or degraded
- No caching of common AI responses (huge cost and latency savings)
- Streaming responses not implemented (users staring at a blank screen)
- No graceful degradation — if the AI fails, the entire feature fails
- Tight coupling to a specific AI provider — no abstraction layer
- Missing structured output validation — AI returns unexpected formats
- No content filtering on AI outputs before showing to users
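Several items under Cost & Resource Control (retry loops without backoff or max attempts, no per-request cost caps) can be addressed with a small wrapper around the model call. This is a minimal sketch, not a production implementation: `call_model` and `estimate_cost_usd` are hypothetical callables you would supply, and the cap value is an arbitrary placeholder.

```python
import time

class BudgetExceeded(Exception):
    """Raised when another attempt would push spend past the cap."""

MAX_ATTEMPTS = 3      # hard ceiling on retries
COST_CAP_USD = 0.50   # hypothetical per-request budget

def call_with_guardrails(call_model, estimate_cost_usd):
    """Retry a model call with exponential backoff, a hard attempt
    limit, and a running per-request cost cap."""
    spent = 0.0
    for attempt in range(MAX_ATTEMPTS):
        cost = estimate_cost_usd()
        if spent + cost > COST_CAP_USD:
            raise BudgetExceeded(f"cap of ${COST_CAP_USD} would be exceeded")
        try:
            result = call_model()
            spent += cost
            return result, spent
        except Exception:
            spent += cost  # failed calls still consume tokens
            if attempt == MAX_ATTEMPTS - 1:
                raise     # out of attempts: surface the error
            time.sleep(2 ** attempt)  # 1s, 2s, ... between attempts
```

The key point is that the loop fails closed: it stops on either the attempt limit or the budget, whichever comes first, rather than retrying until the bill arrives.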
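The "missing structured output validation" and "hallucinated tool calls" items share one mitigation: validate the model's output against an allowlist before acting on it. A minimal stdlib-only sketch follows; the action names and the `{"action": ..., "input": ...}` shape are invented for illustration, not any real agent framework's schema.

```python
import json

# Hypothetical allowlist of tools the agent may invoke.
ALLOWED_ACTIONS = {"search", "summarize", "answer"}

def parse_agent_action(raw: str) -> dict:
    """Validate model output instead of trusting it to be well-formed.
    Rejects non-JSON output, wrong shapes, and hallucinated tools."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model returned non-JSON output: {exc}")
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    action = data.get("action")
    if action not in ALLOWED_ACTIONS:
        # A tool name the model invented never reaches execution.
        raise ValueError(f"unknown action {action!r}")
    if not isinstance(data.get("input"), str):
        raise ValueError("'input' must be a string")
    return data
```

Keeping this check in code rather than in the prompt also illustrates the "logic in prompts vs. code" item: the prompt asks for the format, but code enforces it.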
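For the "PII in prompts" item, one common hook is a redaction pass before any prompt leaves your infrastructure. The sketch below is deliberately naive: two regexes for email addresses and US-style phone numbers. Real PII detection needs much more than this; the point is only where the hook sits in the pipeline.

```python
import re

# Assumed patterns for illustration only; production systems use
# dedicated PII-detection tooling, not two regexes.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

def redact_pii(prompt: str) -> str:
    """Mask obvious PII before the prompt is sent to a third-party API."""
    prompt = EMAIL.sub("[EMAIL]", prompt)
    prompt = PHONE.sub("[PHONE]", prompt)
    return prompt
```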
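The "no caching of common AI responses" item can be sketched as a tiny in-memory cache keyed on model and prompt. This assumes exact-match prompts are common enough to be worth caching; a real deployment would add TTLs and shared storage such as Redis, which this sketch omits.

```python
import hashlib

class ResponseCache:
    """In-memory cache keyed on (model, prompt): identical requests
    pay for exactly one model call."""

    def __init__(self):
        self._store = {}

    def _key(self, model: str, prompt: str) -> str:
        # Hash keeps keys fixed-size even for very long prompts.
        return hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()

    def get_or_call(self, model, prompt, call_model):
        key = self._key(model, prompt)
        if key not in self._store:
            self._store[key] = call_model(prompt)  # only pay once
        return self._store[key]
```

Caching only helps for deterministic, user-independent prompts (e.g. summarizing a fixed document); per-user conversational turns rarely repeat exactly.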