--- name: sf-ai-agentforce-testing description: > Agentforce agent testing with dual-track workflow and 100-point scoring. TRIGGER when: user tests Agentforce agents, runs sf agent test commands, creates test specs, validates topic routing, or analyzes agent test coverage. DO NOT TRIGGER when: Apex unit tests (use sf-testing), building agents (use sf-ai-agentforce), or Agent Script DSL (use sf-ai-agentscript). license: MIT compatibility: "Requires API v66.0+ (Spring '26) and Agentforce enabled org" metadata: version: "2.1.0" author: "Jag Valaiyapathy" scoring: "100 points across 7 categories" --- # sf-ai-agentforce-testing: Agentforce Test Execution & Coverage Analysis Use this skill when the user needs **formal Agentforce testing**: multi-turn conversation validation, CLI Testing Center specs, topic/action coverage analysis, preview checks, or a structured test-fix loop after publish. ## When This Skill Owns the Task Use `sf-ai-agentforce-testing` when the work involves: - `sf agent test` workflows - multi-turn Agent Runtime API testing - topic routing, action invocation, context preservation, guardrail, or escalation validation - test-spec generation and coverage analysis - post-publish / post-activate test-fix loops Delegate elsewhere when the user is: - building or editing the agent itself → [sf-ai-agentforce](../sf-ai-agentforce/SKILL.md) or [sf-ai-agentscript](../sf-ai-agentscript/SKILL.md) - running Apex unit tests → [sf-testing](../sf-testing/SKILL.md) - creating seed data for actions → [sf-data](../sf-data/SKILL.md) - analyzing session telemetry / STDM traces → [sf-ai-agentforce-observability](../sf-ai-agentforce-observability/SKILL.md) --- ## Core Operating Rules - Testing comes **after** deploy / publish / activate. - Use **multi-turn API testing** as the primary path when conversation continuity matters. - Use **CLI Testing Center** as the secondary path for single-utterance and org-supported test-center workflows. - Interactive and programmatic CLI preview use standard `sf org login web` authentication; **ECA is only required for Agent Runtime API testing**, not for live preview. - Fixes to the agent should be delegated to **[sf-ai-agentscript](../sf-ai-agentscript/SKILL.md)** when Agent Script changes are needed. - Do **not** use raw `curl` for OAuth token validation in the ECA flow; use the provided credential tooling. ### Script path rule Use the existing scripts under: - `~/.claude/skills/sf-ai-agentforce-testing/hooks/scripts/` These scripts are pre-approved. Do not recreate them. --- ## Required Context to Gather First Ask for or infer: - agent API name / developer name - target org alias - testing goal: smoke test, regression, coverage expansion, or bug reproduction - whether the agent is already published and activated - whether the org has **Agent Testing Center** available - whether **ECA credentials** are available for Agent Runtime API testing Preflight checks: 1. discover the agent 2. confirm publish / activation state 3. verify dependencies (Flows, Apex, data) 4. choose testing track --- ## Dual-Track Workflow ### Track A — Multi-turn API testing (primary) Use when you need: - multi-turn conversation testing - topic re-matching validation - context preservation checks - escalation or action-chain analysis across turns Requires: - ECA / auth setup - agent runtime access ### Track B — CLI Testing Center (secondary) Use when you need: - org-native `sf agent test` workflows - test spec YAML execution - quick single-utterance validation - CLI-centered CI/CD usage where Testing Center is available ### Quick manual path For manual validation without full formal testing, use preview workflows first, then escalate to Track A or B as needed. --- ## Recommended Workflow ### 1. Discover and verify - locate the agent in the target org - confirm it is published and activated - confirm required actions / Flows / Apex exist - decide whether Track A or Track B fits the request ### 2. Plan tests Cover at least: - main topics - expected actions - guardrails / off-topic handling - escalation behavior - phrasing variation ### 3. Execute the right track #### Track A - validate ECA credentials with the provided tooling - retrieve metadata needed for scenario generation - run multi-turn scenarios with the provided Python scripts - analyze per-turn failures and coverage #### Track B - generate or refine a flat YAML test spec - run `sf agent test` commands - inspect structured results and verbose action output ### 4. Classify failures Typical failure buckets: - topic not matched - wrong topic matched - action not invoked - wrong action selected - action invocation failed - context preservation failure - guardrail failure - escalation failure ### 5. Run fix loop When failures imply agent-authoring issues: - delegate fixes to [sf-ai-agentscript](../sf-ai-agentscript/SKILL.md) - re-publish / re-activate if needed - re-run focused tests before full regression --- ## Testing Guardrails Never skip these: - test only after publish/activate - include harmful / off-topic / refusal scenarios - use multiple phrasings per important topic - clean up sessions after API tests - keep swarm execution small and controlled Avoid these anti-patterns: - testing unpublished agents - treating one happy-path utterance as coverage - storing ECA secrets in repo files - debugging auth with brittle shell-expanded `curl` commands - changing both tests and agent simultaneously without isolating the cause --- ## Output Format When finishing a run, report in this order: 1. **Test track used** 2. **What was executed** 3. **Pass/fail summary** 4. **Coverage gaps** 5. **Root-cause themes** 6. **Recommended fix loop / next test step** Suggested shape: ```text Agent: Track: Multi-turn API | CLI Testing Center | Preview Executed: Result: Coverage: Issues: Next step: ``` --- ## Cross-Skill Integration | Need | Delegate to | Reason | |---|---|---| | fix Agent Script logic | [sf-ai-agentscript](../sf-ai-agentscript/SKILL.md) | authoring and deterministic fix loops | | create test data | [sf-data](../sf-data/SKILL.md) | action-ready data setup | | fix Flow-backed actions | [sf-flow](../sf-flow/SKILL.md) | Flow repair | | fix Apex-backed actions | [sf-apex](../sf-apex/SKILL.md) | Apex repair | | set up ECA / OAuth for Agent Runtime API | [sf-connected-apps](../sf-connected-apps/SKILL.md) | auth and app configuration | | analyze session telemetry | [sf-ai-agentforce-observability](../sf-ai-agentforce-observability/SKILL.md) | STDM / trace analysis | --- ## Reference Map ### Start here - [references/interview-wizard.md](references/interview-wizard.md) - [references/multi-turn-testing.md](references/multi-turn-testing.md) - [references/cli-commands.md](references/cli-commands.md) - [references/test-spec-reference.md](references/test-spec-reference.md) ### Execution / auth - [references/execution-protocol.md](references/execution-protocol.md) - [references/multi-turn-execution.md](references/multi-turn-execution.md) - [references/eca-setup-guide.md](references/eca-setup-guide.md) - [references/credential-convention.md](references/credential-convention.md) - [references/connected-app-setup.md](references/connected-app-setup.md) ### Coverage / fix loops - [references/coverage-analysis.md](references/coverage-analysis.md) - [references/agentic-fix-loops.md](references/agentic-fix-loops.md) - [references/results-scoring.md](references/results-scoring.md) - [references/known-issues.md](references/known-issues.md) ### Advanced / specialized - [references/agentscript-agents.md](references/agentscript-agents.md) - [references/agentscript-testing-patterns.md](references/agentscript-testing-patterns.md) - [references/cli-testing-details.md](references/cli-testing-details.md) - [references/deep-conversation-history-patterns.md](references/deep-conversation-history-patterns.md) - [references/swarm-execution.md](references/swarm-execution.md) - [references/trace-analysis.md](references/trace-analysis.md) - [references/agent-api-reference.md](references/agent-api-reference.md) ### Templates / assets - [references/test-templates.md](references/test-templates.md) - [references/test-plan-format.md](references/test-plan-format.md) - [assets/](assets/) --- ## Score Guide | Score | Meaning | |---|---| | 90+ | production-ready test confidence | | 80–89 | strong coverage with minor gaps | | 70–79 | acceptable but coverage expansion recommended | | 60–69 | partial validation only | | < 60 | insufficient confidence; block release |