---
name: observing-agentforce
description: "Analyze production Agentforce agent behavior using session traces and Data Cloud. TRIGGER when: user queries STDM session data or Data Cloud trace records; investigates production agent failures, regressions, or performance issues; asks about session traces, conversation logs, or agent metrics; wants to reproduce a reported production issue in preview; runs findSessions or trace analysis queries. DO NOT TRIGGER when: user creates, modifies, or debugs .agent files during development (use developing-agentforce); writes or runs test specs (use testing-agentforce); uses sf agent preview for local development iteration; deploys or publishes agents."
allowed-tools: Bash Read Write Edit Glob Grep
license: Apache-2.0
metadata:
  version: "1.0"
  last_updated: "2026-04-08"
argument-hint: " [--agent-file ] [--session-id ] [--days ]"
compatibility: claude-code
---

# Agentforce Observability

Improve Agentforce agents using session trace data and live preview testing.

**Three-phase workflow:**

- **Observe** -- Query STDM sessions from Data Cloud (if available), OR run test suites + preview with local traces as fallback
- **Reproduce** -- Use `sf agent preview` to simulate problematic conversations live
- **Improve** -- Edit the `.agent` file directly, validate, publish, verify

---

## Platform Notes

- Shell examples below use bash syntax. On Windows, use PowerShell equivalents or Git Bash.
- Replace `python3` with `python` on Windows.
- Replace `/tmp/` with `$env:TEMP\` (PowerShell) or `%TEMP%\` (cmd).
- Replace `jq` with `python -c "import json,sys; ..."` if jq is not installed.

---

## Routing

Gather these inputs before starting:

- **Org alias** (required)
- **Agent API name** (required for preview and deploy; ask if not provided)
- **Agent file path** (optional) -- path to the `.agent` file, typically `force-app/main/default/aiAuthoringBundles//.agent`. Auto-detect if not provided.
- **Session IDs** (optional) -- analyze specific sessions; if absent, query last 7 days
- **Days to look back** (optional, default 7)

Determine intent from user input:

- **No specific action** -> run all three phases: Observe -> surface issues -> ask if user wants to Reproduce and/or Improve
- **"analyze" / "sessions" / "what's wrong"** -> Phase 1 only, then suggest next steps
- **"reproduce" / "test" / "preview"** -> Phase 2 (run Phase 1 first if no issues in hand)
- **"fix" / "improve" / "update"** -> Phase 3 (run Phase 1 first if no issues in hand)

### Resolve agent name

Before any STDM query, resolve the user-provided agent name against the org to get the exact `MasterLabel` and `DeveloperName`:

```bash
sf data query --json \
  --query "SELECT Id, MasterLabel, DeveloperName FROM GenAiPlannerDefinition WHERE MasterLabel LIKE '%%' OR DeveloperName LIKE '%%'" \
  -o
```

- `MasterLabel` = display name used by STDM `findSessions` and Agent Builder UI (e.g. "Order Service")
- `DeveloperName` = API name with version suffix used in metadata (e.g. "OrderService_v9")
- The `--api-name` flag for `sf agent preview/activate/publish` uses `DeveloperName` **without** the `_vN` suffix (e.g. "OrderService")

Store these values:

- `AGENT_MASTER_LABEL` -- for `findSessions()` agent filter
- `AGENT_API_NAME` -- `DeveloperName` without `_vN` suffix, for `sf agent` CLI commands
- `PLANNER_ID` -- the Salesforce record ID for this agent

### Locate the .agent file

**Step 1 -- Search locally:**

```bash
find /force-app/main/default/aiAuthoringBundles -name "*.agent" 2>/dev/null
```

If the user provided an agent file path, use that directly. Otherwise, search for files matching `AGENT_API_NAME`.

**Step 2 -- If not found locally, retrieve from the org:**

```bash
sf project retrieve start --json --metadata "AiAuthoringBundle:" -o
```

> **Known bug:** `sf project retrieve start` creates a double-nested path: `force-app/main/default/main/default/aiAuthoringBundles/...`. Fix it immediately after retrieve:

```bash
if [ -d "force-app/main/default/main/default/aiAuthoringBundles" ]; then
  mkdir -p force-app/main/default/aiAuthoringBundles
  cp -r force-app/main/default/main/default/aiAuthoringBundles/* \
    force-app/main/default/aiAuthoringBundles/
  rm -rf force-app/main/default/main
fi
```

**Step 3 -- Validate the retrieved file:**

Read the `.agent` file and verify it has proper Agent Script structure:

- `system:` block with `instructions:`
- `config:` block with `developer_name:`
- `start_agent` or `subagent` blocks with `reasoning: instructions:`
- Each subagent should have distinct `instructions:` content (not identical across subagents)

Store the resolved path as `AGENT_FILE` for Phase 3.

---

## Phase 0: Discover Data Space

Before running any STDM query, determine the correct Data Cloud Data Space API name.

```bash
sf api request rest "/services/data/v63.0/ssot/data-spaces" -o
```

Note: `sf api request rest` is a beta command -- do not add `--json` (that flag is unsupported and causes an error).

The response shape is:

```json
{
  "dataSpaces": [
    {
      "id": "0vhKh000000g3DjIAI",
      "label": "default",
      "name": "default",
      "status": "Active",
      "description": "Your org's default data space."
    }
  ],
  "totalSize": 1
}
```

The `name` field is the API name to pass to `AgentforceOptimizeService`.

**Decision logic:**

- If the command fails (e.g. 404 or permission error), fall back to `'default'` and note it as an assumption.
- Filter to only `status: "Active"` entries.
- If exactly one active Data Space exists, use it automatically and confirm to the user: "Using Data Space: ``".
- If multiple active Data Spaces exist, show the list (label + name) and ask the user which to use.

Store the selected `name` value as `DATA_SPACE` for all subsequent steps.
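
The decision logic above can be sketched in Python. This is a minimal illustration, not part of the skill: `choose_data_space` is a hypothetical helper name, and it assumes the response text from the REST call has been captured as a string (or is `None` when the command failed). Treating "zero active Data Spaces" like a failed command is also an assumption, since the rules above do not cover that case.

```python
import json


def choose_data_space(raw_response):
    """Apply the Data Space decision logic to the raw REST response text.

    Returns (name, candidates, assumed):
      name       -- selected API name, or None when the user must choose
      candidates -- active Data Spaces to show the user (label + name)
      assumed    -- True when 'default' was used as a fallback assumption
    """
    try:
        spaces = json.loads(raw_response)["dataSpaces"] or []
    except (TypeError, ValueError, KeyError):
        # Command failed (404, permission error) or returned non-JSON output.
        return "default", [], True

    # Keep only entries whose status is "Active".
    active = [s for s in spaces if s.get("status") == "Active"]

    if len(active) == 1:
        return active[0]["name"], active, False
    if len(active) > 1:
        # Caller should list label + name and ask the user which to use.
        return None, active, False
    # ASSUMPTION: no active entries is handled like a failed command.
    return "default", [], True


sample = '{"dataSpaces": [{"label": "default", "name": "default", "status": "Active"}], "totalSize": 1}'
name, candidates, assumed = choose_data_space(sample)
print(name, assumed)
```

When `name` comes back as `None`, present `candidates` to the user; when `assumed` is true, mention that `'default'` is an assumption rather than a confirmed value.
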
### Prerequisite check: STDM DMOs

After deploying the helper class (step 1.0), run a quick probe to verify the STDM Data Model Objects exist in Data Cloud:

```bash
sf apex run -o -f /dev/stdin << 'APEX'
ConnectApi.CdpQueryInput qi = new ConnectApi.CdpQueryInput();
qi.sql = 'SELECT ssot__Id__c FROM "ssot__AiAgentSession__dlm" LIMIT 1';
try {
    ConnectApi.CdpQueryOutputV2 out = ConnectApi.CdpQuery.queryAnsiSqlV2(qi, '');
    System.debug('STDM_CHECK:OK rows=' + (out.data != null ? out.data.size() : 0));
} catch (Exception e) {
    System.debug('STDM_CHECK:FAIL ' + e.getMessage());
}
APEX
```

**If `STDM_CHECK:FAIL`:** STDM is not activated. Inform the user and switch to **Phase 1-ALT**:

> STDM (Session Trace Data Model) is not available in this org. To enable: Setup -> Data Cloud -> Data Streams and verify "Agentforce Activity" is active. **Proceeding with fallback: test suites + local traces.**

**If `STDM_CHECK:OK`**, proceed to Phase 1 (STDM path).

---

## Phase 1-ALT: Observe Without STDM (Fallback Path)

When STDM is not available, use test suites and `sf agent preview --authoring-bundle` with local trace analysis.
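
Local trace analysis mostly means filtering a trace's `plan` array by step type. The sketch below is a Python alternative for machines without jq. It assumes the trace layout implied by the jq filters in 1-ALT.4 (a top-level `plan` array whose entries carry `type` and `data`); the helper names are illustrative and not part of the `sf` CLI, and the sample trace is made up for demonstration.

```python
import json


def load_trace(path):
    """Read one local trace file (traces/{planId}.json) into a dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def steps_of_type(trace, step_type):
    """Return every plan step whose 'type' equals step_type."""
    return [s for s in trace.get("plan", []) if s.get("type") == step_type]


def subagent_chain(trace):
    """Subagent names in routing order, taken from NodeEntryStateStep entries."""
    return [s["data"]["agent_name"] for s in steps_of_type(trace, "NodeEntryStateStep")]


def enabled_tools(trace):
    """Flattened list of tools exposed across all EnabledToolsStep entries."""
    tools = []
    for step in steps_of_type(trace, "EnabledToolsStep"):
        tools.extend(step["data"]["enabled_tools"])
    return tools


# Hypothetical trace content, shaped after the jq filters in 1-ALT.4.
sample_trace = {
    "plan": [
        {"type": "NodeEntryStateStep", "data": {"agent_name": "start_agent"}},
        {"type": "EnabledToolsStep", "data": {"enabled_tools": ["go_to_orders"]}},
        {"type": "NodeEntryStateStep", "data": {"agent_name": "orders"}},
    ]
}
print(subagent_chain(sample_trace))
print(enabled_tools(sample_trace))
```
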

| Data source | When to use | Pros | Cons |
|---|---|---|---|
| STDM (Phase 1) | Historical production analysis | Real user data, volume | Requires Data Cloud, 15-min lag |
| Test suites + local traces (Phase 1-ALT) | Dev iteration, orgs without STDM | Instant, full LLM prompt, variable state | Preview only, no real user data |

### 1-ALT.1 Run existing test suite (if available)

```bash
sf agent test list --json -o
sf agent test run --json --api-name --wait 10 --result-format json -o | tee /tmp/test_run.json
JOB_ID=$(python3 -c "import json; print(json.load(open('/tmp/test_run.json'))['result']['runId'])")
sf agent test results --json --job-id "$JOB_ID" --result-format json -o
```

### 1-ALT.2 Derive test utterances from .agent file (if no test suite)

If no test suite exists, derive utterances: one per non-entry subagent (from `description:` keywords), one per key action, one guardrail test, one multi-turn test.

### 1-ALT.3 Preview with `--authoring-bundle` (local traces)

Run each test utterance through preview to generate local trace files:

```bash
sf agent preview start --json --authoring-bundle -o | tee /tmp/preview_start.json
SESSION_ID=$(python3 -c "import json; print(json.load(open('/tmp/preview_start.json'))['result']['sessionId'])")
sf agent preview send --json --session-id "$SESSION_ID" --authoring-bundle \
  --utterance "$UTT" -o | tee /tmp/preview_response.json
sf agent preview end --json --session-id "$SESSION_ID" --authoring-bundle -o
```

**Trace file location:** `.sfdx/agents/{BundleName}/sessions/{sessionId}/traces/{planId}.json`

### 1-ALT.4 Local trace diagnosis

| Issue type | Trace command |
|---|---|
| Subagent misroute | `jq -r '.plan[] \| select(.type=="NodeEntryStateStep") \| .data.agent_name' "$TRACE"` |
| Action not called | `jq -r '.plan[] \| select(.type=="EnabledToolsStep") \| .data.enabled_tools[]' "$TRACE"` |
| LOW adherence | `jq -r '.plan[] \| select(.type=="ReasoningStep") \| {category, reason}' "$TRACE"` |
| Variable capture fail | `jq -r '.plan[] \| select(.type=="VariableUpdateStep") \| .data.variable_updates[]' "$TRACE"` |
| Vague instructions | `jq -r '.plan[] \| select(.type=="LLMStep") \| .data.messages_sent[0].content' "$TRACE"` |

**DefaultTopic trace quirk:** With `--authoring-bundle`, the root `.topic` field often shows `"DefaultTopic"` even when routing works. Always use `NodeEntryStateStep.data.agent_name` for the real subagent chain.

**Entry answering directly (SMALL_TALK pattern):** If `start_agent` trace shows `SMALL_TALK` grounding and transition tools visible but none invoked, add "You are a router only. Do NOT answer questions directly." to `start_agent` instructions.

### 1-ALT.5 Classify and present

Classify issues using the categories in `references/issue-classification.md`. After presenting findings, automatically proceed to agent config evidence analysis.

---

## Phase 1: Observe -- Query STDM

> Full STDM query details, Apex service deployment, and response parsing: see `references/stdm-queries.md`

### 1.0 Deploy helper class (once per org)

Deploy `AgentforceOptimizeService` Apex class to the org. Check if already deployed first:

```bash
sf data query --json --query "SELECT Id, Name FROM ApexClass WHERE Name = 'AgentforceOptimizeService'" -o
```

If not deployed, copy from skill directory and deploy. See `references/stdm-queries.md` for full steps.

### 1.1 Find sessions

Query recent sessions using `findSessions()`. Parse `DEBUG|STDM_RESULT:` from the Apex debug log. If `findSessions` returns empty, switch to Phase 1-ALT.

### 1.2 Get conversation details

Use `getMultipleConversationDetails()` for up to 5 sessions (most recent first). Returns turn-by-turn data with messages, steps, topics, and action results.

### 1.2b Get LLM prompt/response (optional)

When LOW adherence detected, use `getLlmStepDetails()` to get the actual LLM prompt and response.

### 1.2c Get aggregated metrics (recommended first step)

Use `getAggregatedMetrics()` for high-level health dashboard: session rates, top intents, quality distribution, RAG averages.

### 1.2d Get moment insights (per-session detail)

Use `getMomentInsights()` for intent summaries, quality scores (1-5), and retriever metrics per session.

### 1.2e Run observability queries (RAG deep-dive)

Use `runObservabilityQuery()` for targeted RAG analysis: KnowledgeGap, Hallucination, RetrievalQuality, AnswerRelevancy, Leaderboard.

### 1.3 Reconstruct conversations

Render turn-by-turn timeline from `ConversationData` JSON for each session.

### 1.4 Identify issues

> Full issue pattern table and classification categories: see `references/issue-classification.md`

Check each session for: action errors, subagent misroutes, missing actions, wrong inputs, variable capture failures, no transitions, slow actions, LOW adherence, abandoned sessions, dead subagents, publish drift, dead hub anti-pattern, entry answering directly, and safety issues.

Priority: P1 = action errors, misroutes, LOW adherence; P2 = missing actions, variable bugs, knowledge gaps; P3 = performance, abandoned sessions.

### 1.5 Present findings and agent config evidence

Present sessions analyzed, issues grouped by root cause category, and uplift estimate. Then automatically proceed to analyze the `.agent` file to confirm root causes.

> Full structural analysis checks, cross-reference procedures, and publish drift detection: see `references/issue-classification.md`

Retrieve the `.agent` file from the org, run automated checks (subagent count vs action blocks, dead hub detection, orphan actions, cross-subagent variable dependencies), and cross-reference STDM symptoms against the file structure.

---

## Phase 2: Reproduce -- Live Preview

> Full preview procedures, trace diagnosis commands, and classification criteria: see `references/reproduce-reference.md`

Build one test scenario per confirmed issue from Phase 1.
Run each through `sf agent preview` with `--authoring-bundle` (generates local traces).

Run each scenario **3 times** and classify:

| Verdict | Criteria |
|---|---|
| `[CONFIRMED]` | Same failure in 3/3 runs |
| `[INTERMITTENT]` | Failure in 1-2 of 3 runs |
| `[NOT REPRODUCED]` | Passes in 3/3 runs |

Only `[CONFIRMED]` and `[INTERMITTENT]` issues proceed to Phase 3.

**Key commands:**

```bash
sf agent preview start --json --authoring-bundle -o
sf agent preview send --json --session-id "$SID" --utterance "" --authoring-bundle -o
sf agent preview end --json --session-id "$SID" --authoring-bundle -o
```

**Trace location:** `.sfdx/agents/{Name}/sessions/{sessionId}/traces/{planId}.json`

---

## Phase 3: Improve -- Edit .agent File Directly

> Full procedures for pre-flight checks, fix mapping, instruction principles, regression prevention, deployment chain, verification, safety re-verification, and test case creation: see `references/improve-reference.md`

### 3.0 Pre-flight

Verify all action targets exist and are registered in the org before editing. If targets are missing, present options: deploy stubs, remove actions, register via UI, or proceed with routing-only fixes.

### 3.1-3.3 Map issue, edit, and follow instruction principles

Map each confirmed issue to a fix location in the `.agent` file (description, instructions, actions, bindings, transitions). Use the Edit tool for targeted changes. Follow instruction principles: name actions explicitly, state pre-conditions, scope tightly, keep persona in `system:` only.

### 3.4 Regression prevention

Establish baseline before editing. Make minimal edits. Test immediately after each edit. One fix per publish cycle. Check cross-subagent dependencies. Test adjacent subagents.

### 3.5 Apply fixes

Read the `.agent` file, edit with the Edit tool (tabs for indentation), show the diff.

### 3.6 Validate, deploy, publish, activate

```bash
# Validate (dry run)
sf agent validate authoring-bundle --json --api-name -o

# Publish (compile + deploy + activate)
sf agent publish authoring-bundle --json --api-name -o
```

If publish fails, use deploy + activate fallback (note: incomplete -- does not propagate `reasoning: actions:` to live metadata).

### 3.7 Verify

Run Phase 2 scenarios post-fix. Check trace for correct routing, grounding, tools, and variables. After 24-48 hours, re-run Phase 1 to compare against baseline.

### 3.7b Safety re-verification (required)

Re-run safety review (`Section 15 of /developing-agentforce`) on the modified `.agent` file. Revert any changes that introduce BLOCK findings.

### 3.8 Update Testing Center test cases

Create regression test cases from confirmed issues in Testing Center YAML format. Deploy with `sf agent test create` and verify all previously-broken scenarios pass.

---

## Reference Files

| Reference | Contents |
|---|---|
| `references/stdm-queries.md` | STDM query procedures, Apex service deployment, response parsing |
| `references/issue-classification.md` | Issue pattern table, root cause categories, structural analysis checks |
| `references/reproduce-reference.md` | Phase 2 preview procedures, trace diagnosis, classification criteria |
| `references/improve-reference.md` | Phase 3 editing, deployment chain, verification, safety, test cases |
| `references/stdm-schema.md` | DMO field schemas, data hierarchy, quality notes, agent name resolution |