--- name: mcp-audit description: Audit MCP servers for protocol compliance, metadata drift, and compatibility regressions. Use when reviewing tool annotations, tool/result schemas, structured output, lifecycle/init handshake, capabilities, prompts/resources support, transports, auth, security, version drift, or Warden/CI MCP compatibility checks. Trigger phrases include "audit MCP", "check MCP spec compliance", "review tool hints", "validate tools/list", "check initialize handshake", "review prompt or resource capabilities", and "check MCP compatibility in Warden". --- # MCP Audit Audit an MCP server against the current released MCP specification and any repo-specific compatibility constraints. Read `references/spec-baseline.md` and `references/checklist.md` before making changes. Use `references/version-watchpoints.md` when spec drift, draft features, or older protocol targets may matter. `references/common-findings.md` captures recurring failure patterns. `SOURCES.md` is provenance, not the audit checklist. ## Workflow 1. Pin the protocol baseline. - Default to the latest released MCP spec revision unless the repo explicitly targets another version. - Treat draft and SEP content as watchpoints, not release-blocking requirements, unless the user or repo explicitly asks for draft compatibility. - Identify which MCP primitives and utilities the server actually implements: prompts, resources, tools, completions, logging, tasks, or experimental extensions. 2. Audit lifecycle and capability negotiation. - Verify `initialize` and `notifications/initialized` behavior, negotiated protocol version, and claimed capabilities. - Check that the server only advertises capabilities and sub-capabilities it actually supports, such as `listChanged`, `subscribe`, or task-related capability blocks. - For HTTP transports, verify behavior around `MCP-Protocol-Version` after initialization if the repo owns transport handling directly. 3. Audit tools if present. - Verify `tools/list` pagination, `notifications/tools/list_changed` if claimed, and client-visible metadata from the exported server surface. - Check tool definitions: `name`, `title`, `description`, `icons`, `inputSchema`, `outputSchema`, `annotations`, and `execution.taskSupport`. - Check tool result semantics: `content`, `structuredContent`, `isError`, embedded resources, resource links, and the split between protocol errors and tool execution errors. - Review safety hints conservatively: `readOnlyHint`, `destructiveHint`, `idempotentHint`, and `openWorldHint`. - Build the explicit upstream-mutation inventory for write-capable tools. 4. Audit prompts and resources if present. - Prompts: capability declaration, `prompts/list` pagination, `prompts/get`, `notifications/prompts/list_changed`, argument handling, and prompt message content types. - Resources: capability declaration, `resources/list` pagination, `resources/read`, `resources/templates/list`, `resources/subscribe`, `notifications/resources/list_changed`, `notifications/resources/updated`, URI scheme usage, MIME types, and text/blob encoding. - Preserve the spec control hierarchy: prompts are user-controlled, resources are application-controlled, and tools are model-controlled. 5. Audit transports, auth, and security. - `stdio`: newline-delimited JSON-RPC over `stdin` and `stdout`, no non-protocol stdout, stderr-only logging, and environment-based credential handling rather than HTTP OAuth flows. - HTTP/Streamable HTTP: origin validation, localhost-binding guidance for local deployments, session and protocol-version handling, and HTTP-only auth flows when the server actually supports HTTP. - Authorization: protected resource metadata discovery, `WWW-Authenticate` challenges, scope guidance, resource indicators, bearer-token handling, audience validation, and no query-string tokens. - Security: input and URI validation, access controls, output sanitization, rate limits and timeouts, consent or sandbox expectations for local servers, and DNS-rebinding or SSRF risk surfaces. 6. Audit version and compatibility drift. - Separate true spec violations from intentional older-version targeting or host-specific behavior. - Check newer released-spec features that may be missing or mis-modeled, such as icons, tool name guidance, execution or task support, and structured tool output. - Note draft-only or SEP-only expectations separately so the audit does not over-enforce unreleased behavior. - Check repo-specific compatibility constraints such as tool-count limits, generated definitions, inspector or SDK quirks, and Warden rules. 7. Run validation. - Prefer existing integration tests against the exported server surface. - If none exist, add or update the narrowest automated check that proves the claimed protocol behavior. - Refresh generated definitions or catalogs if the repo uses them. - Finish with the repo's normal validation commands when appropriate. 8. Report the result. - State the protocol baseline audited. - List every primitive and capability the server implements. - List every upstream-mutating tool. - Separate confirmed violations, compatibility risks, and watchpoints. - Call out what was validated via source inspection versus real server behavior. - Note any assumptions about older spec targets, client quirks, or host-specific extensions. ## Failure Handling - If the repo targets an older MCP revision, audit against that version first and record the delta to latest separately. - If framework adapters transform schemas or annotations, trust the exported wire surface over local declarations. - If HTTP auth or transport behavior is owned by upstream infrastructure, audit the repo-owned boundary and explicitly mark the remainder as inherited or out of scope. - If a requirement appears only in a draft or SEP, do not fail the server on it unless the user asked for draft compatibility. - If a check passes structurally but the real server response differs, treat the wire behavior as authoritative.