--- name: anth-prod-checklist description: 'Execute production deployment checklist for Claude API integrations. Use when deploying Claude-powered features to production, preparing for launch, or implementing go-live validation. Trigger with phrases like "anthropic production", "deploy claude", "claude go-live", "anthropic launch checklist", "production ready claude". ' allowed-tools: Read, Bash(curl:*), Grep version: 1.0.0 license: MIT author: Jeremy Longshore tags: - saas - ai - anthropic compatibility: Designed for Claude Code --- # Anthropic Production Checklist ## Overview Complete checklist for deploying Claude API integrations to production with reliability, observability, and cost controls. ## Pre-Launch Checklist ### Authentication & Keys - [ ] Production API key from dedicated Workspace - [ ] Key stored in secret manager (not env files on servers) - [ ] Key rotation procedure documented and tested - [ ] Separate keys for each environment (dev/staging/prod) ### Error Handling - [ ] All 5 error types handled: `authentication_error`, `invalid_request_error`, `rate_limit_error`, `api_error`, `overloaded_error` - [ ] SDK `maxRetries` set (recommended: 3-5 for production) - [ ] Custom error logging with `request-id` captured - [ ] Circuit breaker for sustained API failures ### Rate Limits & Cost - [ ] Usage tier verified at [console.anthropic.com](https://console.anthropic.com/settings/limits) - [ ] Application-level rate limiting implemented - [ ] Cost alerts configured (monthly spend caps) - [ ] Model selection optimized (Haiku for simple tasks, Sonnet for complex) - [ ] `max_tokens` set to realistic values (not inflated) - [ ] Prompt caching enabled for repeated system prompts ### Reliability - [ ] Timeout configured (`timeout` parameter, recommended 60-120s) - [ ] Graceful degradation when API is unavailable - [ ] Health check endpoint tests API connectivity ```python async def health_check(): try: # Use token counting as a cheap health probe (no generation cost) count = client.messages.count_tokens( model="claude-haiku-4-20250514", messages=[{"role": "user", "content": "ping"}] ) return {"status": "healthy", "tokens": count.input_tokens} except Exception as e: return {"status": "degraded", "error": str(e)} ``` ### Observability - [ ] Request/response logging (redact content, keep metadata) - [ ] Latency tracking (p50, p95, p99) - [ ] Token usage tracking (input + output per request) - [ ] Cost tracking per feature/customer - [ ] Error rate alerting (429s, 5xx, timeouts) ```python import logging import time logger = logging.getLogger("anthropic") def tracked_create(**kwargs): start = time.monotonic() try: response = client.messages.create(**kwargs) duration = time.monotonic() - start logger.info( "claude_request", extra={ "request_id": response._request_id, "model": response.model, "input_tokens": response.usage.input_tokens, "output_tokens": response.usage.output_tokens, "duration_ms": int(duration * 1000), "stop_reason": response.stop_reason, } ) return response except Exception as e: duration = time.monotonic() - start logger.error("claude_error", extra={"error": str(e), "duration_ms": int(duration * 1000)}) raise ``` ### Content Safety - [ ] System prompts reviewed for injection resistance - [ ] User input validated and length-limited - [ ] Output scanned for sensitive data leakage - [ ] Content moderation for user-facing responses ### Infrastructure - [ ] Deployment uses canary/rolling strategy - [ ] Rollback procedure documented and tested - [ ] Runbook created (see `anth-incident-runbook`) - [ ] On-call escalation path defined ## Alerting Thresholds | Metric | Warning | Critical | |--------|---------|----------| | Error rate (5xx) | > 1% | > 5% | | p99 latency | > 10s | > 30s | | 429 rate | > 5/min | > 20/min | | Daily cost | > 80% budget | > 100% budget | | Auth failures (401/403) | > 0 | > 0 (immediate) | ## Resources - [API Status](https://status.anthropic.com) - [Pricing](https://docs.anthropic.com/en/docs/about-claude/pricing) - [Rate Limits](https://docs.anthropic.com/en/api/rate-limits) ## Next Steps For version upgrades, see `anth-upgrade-migration`.