---
name: sentry-setup-ai-monitoring
description: Setup Sentry AI Agent Monitoring in any project. Use when asked to monitor LLM calls, track AI agents, track conversations, or instrument OpenAI/Anthropic/Vercel AI/LangChain/Google GenAI/Pydantic AI. Detects installed AI SDKs and configures appropriate integrations.
license: Apache-2.0
category: feature-setup
parent: sentry-feature-setup
disable-model-invocation: true
---

> [All Skills](../../SKILL_TREE.md) > [Feature Setup](../sentry-feature-setup/SKILL.md) > AI Monitoring

# Setup Sentry AI Agent Monitoring

Configure Sentry to track LLM calls, agent executions, tool usage, and token consumption.

## Invoke This Skill When

- User asks to "monitor AI/LLM calls" or "track OpenAI/Anthropic usage"
- User wants "AI observability" or "agent monitoring"
- User asks about token usage, model latency, or AI costs

**Important:** The SDK versions, API names, and code samples below are examples. Always verify against [docs.sentry.io](https://docs.sentry.io) before implementing, as APIs and minimum versions may have changed.

## Prerequisites

AI monitoring requires **tracing enabled** (`tracesSampleRate > 0`).

## Data Capture Warning

**Prompt and output recording captures user content that is likely PII.** Before enabling send-default-PII (`sendDefaultPii: true` in JavaScript or `send_default_pii=True` in Python) or per-integration prompt/output capture (`recordInputs`/`recordOutputs` in JS, `include_prompts` in Python), confirm:

- The application's privacy policy permits capturing user prompts and model responses
- Captured data complies with applicable regulations (GDPR, CCPA, etc.)
- Sentry data retention settings are appropriate for the sensitivity of the data

**Ask the user** whether they want prompt/output capture enabled. Do not enable prompt/output capture without explicit confirmation.

Use `tracesSampleRate: 1.0` only in development; in production, use a lower value or a `tracesSampler` function.
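
As a rough sketch of the production guidance above, a sampler function can keep AI-heavy routes at 100% while sampling everything else at a lower rate. The example uses the Python SDK's `traces_sampler` option. The route prefixes and the 0.1 baseline are assumptions for illustration, and matching on transaction name is only one possible way to recognize AI traffic.

```python
# Hypothetical route fragments that trigger LLM or agent work; adjust to your app.
AI_ROUTE_PREFIXES = ("/api/chat", "/api/agent")
# Assumed baseline rate for non-AI traffic.
DEFAULT_RATE = 0.1


def traces_sampler(sampling_context):
    """Return a sample rate in [0.0, 1.0] for a transaction that is about to start."""
    transaction = sampling_context.get("transaction_context") or {}
    name = transaction.get("name") or ""

    # Substring match, because names may carry a method prefix ("POST /api/chat").
    if any(prefix in name for prefix in AI_ROUTE_PREFIXES):
        return 1.0

    # Otherwise follow the upstream decision so distributed traces stay whole.
    parent_sampled = sampling_context.get("parent_sampled")
    if parent_sampled is not None:
        return 1.0 if parent_sampled else 0.0

    return DEFAULT_RATE


# Wiring (not executed here):
# sentry_sdk.init(dsn="YOUR_DSN", traces_sampler=traces_sampler)
```

Checking the AI routes before `parent_sampled` is a deliberate choice in this sketch: it keeps agent runs even when an upstream service decided not to sample, at the cost of occasionally producing a trace without its upstream spans.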

## Detection First

**Always detect installed AI SDKs before configuring:**

```bash
# JavaScript (scoped @langchain/* packages need the trailing path to match)
grep -E '"(openai|@anthropic-ai/sdk|ai|@langchain/[^"]+|@google/genai)"' package.json

# Python
grep -E '(openai|anthropic|langchain|huggingface|google-genai|pydantic-ai|litellm|mcp)' requirements.txt pyproject.toml 2>/dev/null
```

## Sampling Check

After detecting AI SDKs, check the current sampling configuration:

```bash
# JavaScript
grep -E 'tracesSampleRate|tracesSampler' sentry.*.config.* instrument.* src/instrument.* app/instrument.* 2>/dev/null

# Python
grep -E 'traces_sample_rate|traces_sampler' *.py **/*.py 2>/dev/null
```

**If `tracesSampleRate` / `traces_sample_rate` is below 1.0 AND no `tracesSampler` / `traces_sampler` is configured:**

Ask the user:

> "Your current sample rate is {rate}. Agent runs are sampled as complete span trees: if the root span is dropped, all child gen_ai spans are lost. For full AI visibility, gen_ai-related transactions should be sampled at 100%. Would you like me to set up a `tracesSampler` that keeps AI traces at 100% while sampling other traffic at your current rate?"

If the user confirms, read `${SKILL_ROOT}/references/sampling.md` for implementation patterns.

## Supported SDKs

### JavaScript

| Package | Integration | Min Sentry SDK | Auto? |
|---------|-------------|----------------|-------|
| `openai` | `openAIIntegration()` | 10.53.0 | Yes |
| `@anthropic-ai/sdk` | `anthropicAIIntegration()` | 10.53.0 | Yes |
| `ai` (Vercel) | `vercelAIIntegration()` | 10.53.0 | Yes* |
| `@langchain/*` | `langChainIntegration()` | 10.53.0 | Yes |
| `@langchain/langgraph` | `langGraphIntegration()` | 10.53.0 | Yes |
| `@google/genai` | `googleGenAIIntegration()` | 10.53.0 | Yes |

*Vercel AI: 10.53.0+ required. Requires `experimental_telemetry` per-call.

### Python

Integrations auto-enable when the AI package is installed; no explicit registration is needed:

| Package | Auto? | Notes |
|---------|-------|-------|
| `openai` | Yes | Includes OpenAI Agents SDK |
| `anthropic` | Yes | |
| `langchain` / `langgraph` | Yes | |
| `huggingface_hub` | Yes | |
| `google-genai` | Yes | |
| `pydantic-ai` | Yes | |
| `litellm` | **No** | Requires explicit integration |
| `mcp` (Model Context Protocol) | Yes | |

## JavaScript Configuration

### Node.js (auto-enabled integrations)

Just ensure tracing is enabled. Integrations auto-enable when the AI package is installed:

```javascript
import * as Sentry from "@sentry/node";

Sentry.init({
  dsn: "YOUR_DSN",
  tracesSampleRate: 1.0, // Lower in production (e.g., 0.1)
  streamGenAiSpans: true, // SDK ≥10.53.0
  // OpenAI, Anthropic, Google GenAI, LangChain integrations auto-enable in Node.js
});
```

To customize (e.g., enable prompt capture after user confirmation; see Data Capture Warning):

```javascript
import * as Sentry from "@sentry/node";

Sentry.init({
  dsn: "YOUR_DSN",
  tracesSampleRate: 1.0,
  streamGenAiSpans: true,
  sendDefaultPii: true, // Only after user confirmation
  integrations: [
    Sentry.openAIIntegration({
      // recordInputs/recordOutputs default to true when sendDefaultPii is true
    }),
  ],
});
```

### Browser / Next.js OpenAI (manual wrapping required)

In browser-side code or Next.js meta-framework apps, auto-instrumentation is not available.
Wrap the client manually:

```javascript
import OpenAI from "openai";
import * as Sentry from "@sentry/nextjs"; // or @sentry/react, @sentry/browser

const openai = Sentry.instrumentOpenAiClient(new OpenAI());
// Use 'openai' client as normal
```

### LangChain / LangGraph (auto-enabled)

```javascript
import * as Sentry from "@sentry/node";

Sentry.init({
  dsn: "YOUR_DSN",
  tracesSampleRate: 1.0,
  streamGenAiSpans: true,
  sendDefaultPii: true, // Only after user confirmation (see Data Capture Warning)
  integrations: [
    Sentry.langChainIntegration(),
    Sentry.langGraphIntegration(),
  ],
});
```

### Vercel AI SDK

Add to `sentry.edge.config.ts` for Edge runtime:

```javascript
import * as Sentry from "@sentry/nextjs";

Sentry.init({
  dsn: "YOUR_DSN",
  tracesSampleRate: 1.0,
  streamGenAiSpans: true,
  sendDefaultPii: true, // Only after user confirmation (see Data Capture Warning)
  integrations: [Sentry.vercelAIIntegration()],
});
```

Enable telemetry per-call:

```javascript
await generateText({
  model: openai("gpt-4o"),
  prompt: "Hello",
  experimental_telemetry: {
    isEnabled: true,
    recordInputs: true,
    recordOutputs: true,
  },
});
```

## Python Configuration

Integrations auto-enable, so just init with tracing. Only add explicit imports to customize options:

```python
import sentry_sdk

sentry_sdk.init(
    dsn="YOUR_DSN",
    traces_sample_rate=1.0,  # Lower in production (e.g., 0.1)
    stream_gen_ai_spans=True,  # SDK ≥2.60.0
    send_default_pii=True,  # Only after user confirmation (see Data Capture Warning)
    # Integrations auto-enable when the AI package is installed.
    # Only specify explicitly to customize (e.g., include_prompts):
    # from sentry_sdk.integrations.openai import OpenAIIntegration
    # integrations=[OpenAIIntegration(include_prompts=True)],
)
```

## Manual Instrumentation

Use when no supported SDK is detected. Follow the canonical [Sentry Conventions for `gen_ai.*` attributes](https://getsentry.github.io/sentry-conventions/attributes/gen_ai/); the [JS docs](https://docs.sentry.io/platforms/javascript/guides/connect/ai-agent-monitoring/#manual-instrumentation) may lag behind. Do not set attributes marked deprecated in the conventions.

### Span Types

| `op` | Span `name` pattern | Purpose |
|------|---------------------|---------|
| `gen_ai.{operation}` (e.g. `gen_ai.chat`, `gen_ai.request`) | `{operation} {model}` (e.g. `chat gpt-4o`) | Individual LLM call |
| `gen_ai.invoke_agent` | `invoke_agent {agent_name}` | Agent execution lifecycle |
| `gen_ai.execute_tool` | `execute_tool {tool_name}` | Tool/function call |
| `gen_ai.handoff` | `handoff from {source} to {target}` | Agent-to-agent transition |

For LLM-call spans, the `op` follows the pattern `gen_ai.{gen_ai.operation.name}`. Use `gen_ai.chat`, `gen_ai.embeddings`, `gen_ai.generate_content`, or `gen_ai.text_completion` where the operation is known.

Span attributes only accept primitives; arrays/objects must be JSON-stringified.

### Example (JavaScript)

```javascript
import * as Sentry from "@sentry/node";

// `llmClient` stands in for whatever unsupported client the project uses.
const inputMessages = [
  { role: "user", parts: [{ type: "text", content: "Tell me a joke" }] },
];

await Sentry.startSpan({
  op: "gen_ai.chat",
  name: "chat gpt-4o",
  attributes: {
    "gen_ai.request.model": "gpt-4o",
    "gen_ai.operation.name": "chat",
    // Message content is PII: include only after user confirmation
    "gen_ai.input.messages": JSON.stringify(inputMessages),
  },
}, async (span) => {
  const result = await llmClient.complete(inputMessages);
  const outputMessages = [
    {
      role: "assistant",
      parts: [{ type: "text", content: result.text }],
      finish_reason: result.finishReason,
    },
  ];
  span.setAttribute("gen_ai.output.messages", JSON.stringify(outputMessages));
  span.setAttribute("gen_ai.usage.input_tokens", result.inputTokens);
  span.setAttribute("gen_ai.usage.output_tokens", result.outputTokens);
  return result;
});
```

### Key Attributes

**Common (all AI spans):**

| Attribute | Required | Description |
|-----------|----------|-------------|
| `gen_ai.request.model` | Yes | Model identifier (e.g., `gpt-4o`, `claude-sonnet-4-6`) |
| `gen_ai.operation.name` | No | Operation label (`chat`, `embeddings`, `invoke_agent`, `execute_tool`, `handoff`, etc.) |
| `gen_ai.agent.name` | No | Agent name (set on agent and tool spans) |

**Request / response content (PII: enable only after confirming; see Data Capture Warning above):**

| Attribute | Description |
|-----------|-------------|
| `gen_ai.input.messages` | JSON-stringified array of input messages. Each item uses `{role, parts}` where `parts` is `[{type, content}]`; `role` is `"user"`, `"assistant"`, `"tool"`, or `"system"` |
| `gen_ai.output.messages` | JSON-stringified array of response messages (text + tool calls), same shape as inputs |
| `gen_ai.system_instructions` | System prompt passed to the model |
| `gen_ai.tool.definitions` | JSON-stringified list of tools available to the model |

**Token usage:**

| Attribute | Description |
|-----------|-------------|
| `gen_ai.usage.input_tokens` | Total input tokens (**includes** cached tokens) |
| `gen_ai.usage.input_tokens.cached` | Subset of input tokens served from cache |
| `gen_ai.usage.input_tokens.cache_write` | Tokens written to cache while processing input |
| `gen_ai.usage.output_tokens` | Total output tokens (**includes** reasoning tokens) |
| `gen_ai.usage.output_tokens.reasoning` | Subset of output tokens used for reasoning |
| `gen_ai.usage.total_tokens` | Sum of input + output tokens |

**Tool spans (`gen_ai.execute_tool`):**

| Attribute | Description |
|-----------|-------------|
| `gen_ai.tool.name` | Tool identifier |
| `gen_ai.tool.description` | Human-readable tool description |
| `gen_ai.tool.call.arguments` | JSON-stringified tool arguments |
| `gen_ai.tool.call.result` | JSON-stringified tool result |

### Token Usage and Cost Calculation

Sentry uses token attributes to [calculate model costs](https://docs.sentry.io/ai/monitoring/agents/costs/). **Cached and reasoning tokens are subsets, not separate counts**: `gen_ai.usage.input_tokens` already includes `gen_ai.usage.input_tokens.cached`, and `gen_ai.usage.output_tokens` already includes `gen_ai.usage.output_tokens.reasoning`.
Sentry subtracts the cached/reasoning counts from the totals to compute the uncached/non-reasoning portion. Reporting a cached or reasoning count greater than its total produces negative costs in the dashboard.

Example: 100 input tokens total, 90 served from cache:

- Correct: `input_tokens = 100`, `input_tokens.cached = 90`
- Wrong: `input_tokens = 10`, `input_tokens.cached = 90` (cached larger than total → negative cost)

The same rule applies to `gen_ai.usage.output_tokens` vs. `gen_ai.usage.output_tokens.reasoning`.

## Verification

After configuring, make an LLM call and check the Sentry Traces dashboard. AI spans appear with `gen_ai.*` operations showing model, token counts, and latency.

## Conversations

Conversations gives a readable, chat-style view of past sessions with your AI agent. It groups spans by `gen_ai.conversation.id`, so whether a user talked across multiple traces or multiple conversations happened inside one trace, you get a timeline of every message, tool call, and response.

Find it at **Explore > Conversations** in Sentry.

### Prerequisites for Conversations

- Tracing enabled with `tracesSampleRate > 0`
- `streamGenAiSpans: true` (JS SDK >=10.53.0) / `stream_gen_ai_spans=True` (Python SDK >=2.60.0). Required so AI spans are sent as standalone items. Without this, spans with large inputs/outputs can hit transaction payload size limits and be dropped.
- **Input and output capture enabled.** Conversations reconstructs the chat from `gen_ai.input.messages` and `gen_ai.output.messages` attributes. Set `sendDefaultPii: true` (JS) / `send_default_pii=True` (Python). Without it, conversations appear empty.

### Setting a Conversation ID

Some integrations (OpenAI Agents SDK for Python, OpenAI SDK for Node) infer the conversation ID automatically. For all others, set it manually.

#### JavaScript

```javascript
import * as Sentry from "@sentry/node"; // or @sentry/nextjs, @sentry/nestjs, etc.

// Set at the start of a conversation
Sentry.setConversationId("conv_abc123");

// All subsequent AI calls carry gen_ai.conversation.id: "conv_abc123"
await openai.chat.completions.create({
  model: "gpt-5.5",
  messages: [{ role: "user", content: "Hello" }],
});
```

#### Python

```python
import sentry_sdk.ai

# Set at the start of a conversation
sentry_sdk.ai.set_conversation_id("conv_abc123")

# All subsequent AI calls carry gen_ai.conversation.id = "conv_abc123"
```

As an example of automatic inference, the Python OpenAI integration picks up the conversation ID when you use the `conversation` parameter:

```python
import openai
import sentry_sdk

sentry_sdk.init(...)

conversation = openai.conversations.create()

response = openai.responses.create(
    model="gpt-5.4",
    input=[{"role": "user", "content": "What are the 5 Ds of dodgeball?"}],
    conversation=conversation.id,  # automatically sets gen_ai.conversation.id
)
```

### Conversations vs Traces

These are independent concepts:

- A single conversation can span **multiple traces** (e.g., user refreshes the page mid-conversation: new trace, same conversation ID)
- A single trace can contain spans from **different conversations** (e.g., user starts a new chat without refreshing)

## Troubleshooting

| Issue | Solution |
|-------|----------|
| AI spans not appearing | Verify `tracesSampleRate > 0`, check SDK version |
| Token counts missing | Some providers don't return tokens for streaming |
| Negative or wrong costs in dashboard | Cached/reasoning tokens are subsets of totals; see Token Usage and Cost Calculation |
| Prompts not captured | Set `sendDefaultPii: true` (JS) or `send_default_pii=True` (Python); use `recordInputs`/`include_prompts` only for explicit overrides |
| Vercel AI not working | Add `experimental_telemetry` to each call |
| Conversations view empty | Ensure `streamGenAiSpans: true` / `stream_gen_ai_spans=True`, `sendDefaultPii: true` / `send_default_pii=True`, and a conversation ID is set |
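
For the "negative or wrong costs" row, one way to catch the problem before it reaches Sentry is to validate token counts when building span attributes in manual instrumentation. The sketch below is a hypothetical helper (`build_usage_attributes` is not part of any Sentry SDK); it only assumes the `gen_ai.usage.*` attribute names listed under Key Attributes and the rule that cached and reasoning counts are subsets of their totals.

```python
def build_usage_attributes(input_tokens, output_tokens, cached_tokens=0, reasoning_tokens=0):
    """Build gen_ai.usage.* attributes, rejecting counts that break the subset rule.

    `input_tokens` and `output_tokens` must be totals that already include
    the cached and reasoning portions respectively.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached tokens exceed total input tokens")
    if reasoning_tokens > output_tokens:
        raise ValueError("reasoning tokens exceed total output tokens")

    return {
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.input_tokens.cached": cached_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
        "gen_ai.usage.output_tokens.reasoning": reasoning_tokens,
        "gen_ai.usage.total_tokens": input_tokens + output_tokens,
    }


# Example: 100 input tokens (90 from cache), 40 output tokens (10 reasoning).
attributes = build_usage_attributes(100, 40, cached_tokens=90, reasoning_tokens=10)
```

If a provider reports cached tokens separately from uncached input, add the two together before calling the helper so that the total passed in includes the cached portion.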