--- name: testing-best-practices description: Test layering, execution, and CI guidance across unit, integration, and e2e. Use when designing tests, writing test cases, or planning test strategy for a module. --- ## When to activate Engage when: - Working with spec files (`*.spec.md`, `SPEC.md`, `spec/*.md`) - Designing test cases or test strategy for a module - Writing or reviewing unit, integration, or e2e tests - After `/specout` completes - Planning CI test lanes ## Mutation policy - Default: analyze code and produce test strategy, matrix, and implementation plan. - Do not edit spec files unless the user explicitly requests spec maintenance. - When this skill conflicts with system/project rules, follow system/project rules. ## Test layering policy ### Unit tests Purpose: verify individual functions and invariants in isolation. - **Data-driven**: parameterized tables covering happy path, boundary, error, and edge cases. - **Property-based**: fuzz invariants that must hold across all inputs (e.g., idempotency, sort stability, roundtrip serialization). - Derive cases from the module's public API surface: input types/constraints, output shape, error modes, invariants. - Cover categories per function: happy path, boundary values, error cases, edge cases, invariants. ### Integration / contract tests Purpose: verify interactions between components and external services. - **API envelope**: request/response shape, status codes, content types, pagination. - **Error contract**: error codes, error shapes, rate limiting, retries. - **Auth and scoping**: token validation, role-based access, tenant isolation. - **Eventual consistency**: verify convergence within bounded time; poll rather than sleep. - Reuse auth state across tests where possible; avoid redundant login flows. ### E2E tests Purpose: verify real user workflows through the full stack. - No mocks; exercise real services, databases, and APIs. - Happy-path workflows only; save edge cases for lower layers. - Fast: each test should complete within a reasonable timeout. - **State-tolerant**: never assume a clean slate; tolerate and work with prior state. - **Idempotent**: safe to run repeatedly without cleanup between runs. - **Flow-oriented**: validate real data paths end-to-end rather than isolated assertions. ## Hard rules - **Never invent signatures, source locations, or line numbers.** Only reference what you have read from the codebase. - **No fabricated fixtures.** Derive test data from actual schemas, types, or seed data in the repo. - **No test-only hacks in product code.** No `if (process.env.TEST)` branches, no test-specific exports, no test backdoors. - **E2E must not rely on clean slate.** Tests must tolerate pre-existing data, prior test runs, and shared environments. - **Never weaken assertions to make tests pass.** Fix the underlying issue. - **Never hard-code values matching test assertions.** Implement general-purpose logic. ## Execution guidance ### Preflight checks (before e2e) 1. Verify the target environment is reachable (health endpoint, ping). 2. Confirm required services are running (database, API, auth provider). 3. Validate test user / credentials exist and are functional. 4. Check for leftover state that could cause false failures; log it, do not fail on it. ### Deterministic fixtures - Use seeded randomness for generated data (seeded faker, deterministic UUIDs). - Fixtures should be self-contained; avoid cross-test fixture dependencies. - Prefer factory functions over shared mutable fixture objects. ### Async handling - Poll with bounded timeout and backoff; never use fixed `sleep`/`waitForTimeout`. - Set explicit timeout per operation; fail fast with a descriptive message on timeout. - Bound retry attempts (e.g., max 3 retries with exponential backoff). - Use framework-native waiting (Playwright `expect`, async assertions) over manual loops. ### Flake handling - **Single infrastructure retry** per test run; if it fails twice, it is not flake. - On retry failure, collect diagnostics: screenshots, network logs, service health, timestamps. - Classify the failure (flaky / outdated / bug) before attempting a fix. - Never add arbitrary delays or retry loops as a flake "fix." ## API surface discovery Before generating test cases: - Read the module source to enumerate exports/public functions. - Confirm scope from the user request and inspected code context; if ambiguous, state assumptions and proceed conservatively. - For each function: input types/constraints, output shape, error modes, invariants. - Probe for state dependencies and ordering constraints between functions. - Decide granularity from context: unit-level (individual functions) vs integration-level (compositions). ## Output format Keep outputs actionable and concise. Use markdown, not rigid JSON schemas. ### Test strategy Brief summary of what to test and at which layer: ```markdown ## Test Strategy - **Unit**: [functions/modules], data-driven + property-based for [invariants] - **Integration**: [API contracts], auth scoping, error envelopes - **E2E**: [workflows], happy-path flows against real services ``` ### Test matrix Tabular case listing per function or flow: ```markdown ## Test Matrix ### `functionName` | ID | Category | Name | Input | Expected | |----|----------|------|-------|----------| | HP-01 | happy_path | basic uppercase | "hello" | "HELLO" | | BV-01 | boundary | empty string | "" | "" | | ERR-01 | error | null input | null | INVALID_ARGUMENT | | EDGE-01 | edge | unicode combining | "cafe\u0301" | "CAFE\u0301" | ``` Case ID scheme: `{CATEGORY}-{NN}` (HP, BV, ERR, EDGE). Append-only; never renumber. ### Implementation plan Ordered steps to write and run the tests: ```markdown ## Implementation Plan 1. Add factory for [fixture] using seeded data 2. Write parameterized unit tests for [function] (X cases) 3. Write integration test for [API endpoint] auth + error contract 4. Write e2e flow for [workflow] with preflight checks 5. Run suite: `[command]` ``` ## CI guidance ### Fast PR smoke lane - Unit tests + linting + type-check on every PR. - Subset of integration tests covering critical contracts. - Target: under 5 minutes. ### Nightly full lane - Full unit + integration + e2e suite. - Include property-based tests with higher iteration counts. - Idempotency verification: run critical setup paths twice, assert no side effects on second run. - Flake detection: flag tests that pass on retry but failed initially. ## Workflow 1. Spec or code defines the module behavior (types, constraints, API surface). 2. Agent (with this skill) produces test strategy, matrix, and implementation plan. 3. test-writer agent translates the plan to runnable code in the target language's idiom. 4. Developer implements to pass the tests. 5. If implementation reveals missing cases, propose them first; append to spec only when explicitly requested.