---
name: unit-testing
description: >-
  Write effective unit tests with Jest, Vitest, or pytest. Covers mocking strategies
  (stubs, spies, mocks, fakes), coverage configuration and meaningful thresholds,
  snapshot testing, mutation testing with Stryker/mutmut, test doubles taxonomy,
  and the Arrange-Act-Assert pattern. Use when: "unit test," "Jest," "Vitest,"
  "pytest," "mock," "coverage," "test doubles," "mutation testing."
  Related: coverage-analysis, ci-cd-integration, ai-test-generation, shift-left-testing.
license: MIT
metadata:
  author: kindlmann
  version: "1.0"
  category: automation
---
Write effective, maintainable unit tests using Jest, Vitest, or pytest.
---
## Discovery Questions
1. **Framework:** Jest, Vitest, or pytest? Check `package.json` or `pyproject.toml`.
2. **Coverage tooling:** Already configured? Look for `jest.config.*`, `vitest.config.*`, `.nycrc`, `[tool.coverage]`.
3. **Mocking strategy:** Manual mocks, auto-mocking, or dependency injection? Check for `__mocks__/` dirs or DI containers.
4. **Existing conventions:** Check `.agents/qa-project-context.md` first for project-specific guidelines.
---
## Core Principles
**1. Test behavior, not implementation.** Verify *what* code does, not *how*. Refactoring internals should not break tests.
```typescript
// Bad — implementation detail
expect(svc._cache.size).toBe(3);

// Good — observable behavior
expect(svc.getUser("abc")).toEqual({ id: "abc", name: "Alice" });
```
**2. Fast, isolated, deterministic.** No network/disk/DB. No shared mutable state. No uncontrolled `Date.now()` or `Math.random()`.
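One way to keep time-dependent code deterministic without faking globals is to inject the clock as a dependency. A minimal sketch, using a hypothetical `SessionTimer` class (not from this skill's examples):

```typescript
// Determinism by injection: pass time in as a dependency instead of calling
// Date.now() inside the unit. `SessionTimer` is a hypothetical example class.
type Clock = () => number;

class SessionTimer {
  private readonly start: number;
  constructor(private readonly clock: Clock = Date.now) {
    this.start = clock();
  }
  elapsedMs(): number {
    return this.clock() - this.start;
  }
}

// In a test, substitute a controlled clock; no fake-timer setup needed.
let now = 1_000;
const timer = new SessionTimer(() => now);
now = 1_250;
console.log(timer.elapsedMs()); // 250
```

Production code defaults to the real `Date.now`; tests pass a controlled function. Fake timers (covered below) are the alternative when you cannot change the code under test.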
**3. Arrange-Act-Assert.**
```typescript
it("should apply discount for orders over $100", () => {
// Arrange
const order = createOrder({ subtotal: 150 });
const svc = new DiscountService(0.1);
// Act
const result = svc.apply(order);
// Assert
expect(result.total).toBe(135);
});
```
**4. One assertion concept per test** (multiple `expect` calls are fine if they verify the same concept).
**5. Descriptive test names:** `"should [behavior] when [condition]"`, not `"test calculateTotal"`.
---
## Framework-Specific Patterns
### Jest
**describe/it structure with setup/teardown:**
```typescript
describe("UserService", () => {
let service: UserService;
let mockRepo: jest.Mocked<UserRepository>;
beforeEach(() => {
mockRepo = { findById: jest.fn(), save: jest.fn() } as unknown as jest.Mocked<UserRepository>;
service = new UserService(mockRepo);
});
afterEach(() => jest.restoreAllMocks());
it("should return user when found", async () => {
mockRepo.findById.mockResolvedValue({ id: "1", name: "Alice" });
const result = await service.getUser("1");
expect(result).toEqual({ id: "1", name: "Alice" });
});
it("should throw when user not found", async () => {
mockRepo.findById.mockResolvedValue(null);
await expect(service.getUser("999")).rejects.toThrow(NotFoundError);
});
});
```
**Module mocking (`jest.mock`):**
```typescript
jest.mock("./email-client", () => ({
sendEmail: jest.fn().mockResolvedValue({ sent: true }),
}));
// Partial mock — keep original, override one export
jest.mock("./utils", () => ({ ...jest.requireActual("./utils"), generateId: jest.fn(() => "fixed") }));
```
**Spying (`jest.spyOn`):** wraps real method, records calls.
```typescript
const spy = jest.spyOn(console, "warn").mockImplementation();
service.deprecatedMethod();
expect(spy).toHaveBeenCalledWith(expect.stringContaining("deprecated"));
```
**Timer mocking:**
```typescript
beforeEach(() => jest.useFakeTimers());
afterEach(() => jest.useRealTimers());
it("should debounce", () => {
const fn = jest.fn();
const debounced = debounce(fn, 300);
debounced();
expect(fn).not.toHaveBeenCalled();
jest.advanceTimersByTime(300);
expect(fn).toHaveBeenCalledTimes(1);
});
```
**Async:** `await expect(fn()).resolves.toEqual(...)` / `await expect(fn()).rejects.toThrow(...)`.
---
### Vitest
Same API as Jest but Vite-native. Key differences:
```typescript
// vitest.config.ts
import { defineConfig } from "vitest/config";
export default defineConfig({
test: {
globals: true,
environment: "node",
coverage: { provider: "v8", reporter: ["text", "html", "lcov"] },
},
});
```
**Mocking with `vi`:**
```typescript
vi.mock("./email-client", () => ({ sendConfirmation: vi.fn().mockResolvedValue(true) }));
const spy = vi.spyOn(repository, "save");
```
**In-source testing** (useful for utilities):
```typescript
export function clamp(val: number, min: number, max: number) {
return Math.min(Math.max(val, min), max);
}
if (import.meta.vitest) {
const { it, expect } = import.meta.vitest;
it("clamps below", () => expect(clamp(-5, 0, 10)).toBe(0));
it("clamps above", () => expect(clamp(15, 0, 10)).toBe(10));
}
```
Enable: `test: { includeSource: ["src/**/*.ts"] }` and `define: { "import.meta.vitest": "undefined" }`.
**Monorepo workspaces:**
```typescript
// vitest.workspace.ts
export default ["packages/*/vitest.config.ts"];
```
---
### pytest
**Fixtures and conftest.py:**
```python
# conftest.py
import pytest

@pytest.fixture
def db():
database = Database(":memory:")
database.migrate()
yield database
database.close()
@pytest.fixture
def user_service(db):
return UserService(db)
```
```python
class TestUserService:
def test_create_returns_id(self, user_service):
uid = user_service.create({"name": "Alice"})
assert uid is not None
def test_get_nonexistent_raises(self, user_service):
with pytest.raises(UserNotFoundError):
user_service.get("nonexistent")
```
**Parametrize for data-driven tests:**
```python
@pytest.mark.parametrize("input_val,expected", [
("hello world", "Hello World"), ("", ""), ("CAPS", "Caps"),
])
def test_title_case(input_val, expected):
assert title_case(input_val) == expected
```
**Monkeypatch for mocking:**
```python
def test_uses_env(monkeypatch):
monkeypatch.setenv("APP_URL", "https://test.local")
assert fetch_config()["source"] == "https://test.local"
def test_retry(monkeypatch):
calls = {"n": 0}
def fake(url):
calls["n"] += 1
if calls["n"] < 3: raise ConnectionError
return {"ok": True}
monkeypatch.setattr("app.client.http_request", fake)
assert fetch_with_retry("https://api.test") == {"ok": True}
```
**Markers:** `@pytest.mark.slow`, then run `pytest -m "not slow"`. Use `-k "test_create"` for name matching.
---
## Mocking Taxonomy
| Double | What it does | When to use |
|--------|-------------|-------------|
| **Stub** | Returns canned data, no verification | Control dependency return values |
| **Spy** | Wraps real impl, records calls | Verify calls without changing behavior |
| **Mock** | Replaces impl + records calls | Control return AND verify interaction |
| **Fake** | Simplified working impl (in-memory DB) | Complex stateful dependencies |
```typescript
// Stub — just a return value
const pricing = { getPrice: () => 9.99 };
// Spy — real behavior, tracked
const spy = vi.spyOn(logger, "info");
// Mock — replaced + verified
const notifier = { send: vi.fn().mockResolvedValue(true) };
expect(notifier.send).toHaveBeenCalledWith(expect.objectContaining({ type: "done" }));
// Fake — working substitute
class FakeRepo implements UserRepository {
private data = new Map<string, User>();
async findById(id: string) { return this.data.get(id) ?? null; }
async save(u: User) { this.data.set(u.id, { ...u }); }
}
```
**Rule of thumb:** Use the simplest double. Prefer stubs over mocks. Reserve fakes for stateful dependencies. Never call real external APIs in unit tests.
---
## Coverage
### Configuration
**Jest:**
```javascript
// jest.config.js
module.exports = {
coverageProvider: "v8",
collectCoverageFrom: ["src/**/*.ts", "!src/**/*.{d,test,stories}.ts", "!src/**/index.ts"],
coverageThreshold: { global: { branches: 80, functions: 80, lines: 80, statements: 80 } },
};
```
**Vitest:** set `test.coverage` in `vitest.config.ts` with `provider: "v8"`, `thresholds: { branches: 80, ... }`.
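A fuller sketch of that shape (option names per the Vitest coverage config; the numbers are illustrative, not recommendations):

```typescript
// vitest.config.ts — coverage with enforced thresholds (illustrative values)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8",
      reporter: ["text", "html", "lcov"],
      thresholds: { branches: 80, functions: 80, lines: 80, statements: 80 },
    },
  },
});
```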
**pytest:**
```toml
# pyproject.toml
[tool.coverage.run]
source = ["src"]
omit = ["src/**/test_*.py", "src/**/conftest.py"]
[tool.coverage.report]
fail_under = 80
show_missing = true
exclude_lines = ["pragma: no cover", "if TYPE_CHECKING:"]
```
### Coverage Types
| Type | Measures | Blind spots |
|------|----------|-------------|
| **Branch** | Every if/else path taken? | Misses value combinations |
| **Line** | Each line executed? | Misses untested branches in one line |
| **Statement** | Each statement executed? | Similar to line |
| **Function** | Each function called? | Nothing about correctness |
**Priority:** Branch > Line > Statement > Function.
### Meaningful Thresholds
- **80% line coverage** as baseline gate, not a vanity target.
- Branch coverage matters more than line coverage.
- Focus on: business logic, transformations, error paths, edge cases.
- Skip: generated code, type definitions, barrel exports, trivial getters, framework boilerplate.
### CI Gate
```yaml
# Jest/Vitest exit non-zero when thresholds fail. For pytest:
- run: pytest --cov=src --cov-fail-under=80
```
---
## Mutation Testing
Coverage tells you what code *ran*. Mutation testing tells you if tests would *catch a bug*.
It works by making small changes to the source (e.g., `>` to `>=`, `true` to `false`) and running the test suite against each mutant. If the tests still pass, the mutant **survived** -- your tests never verified that logic.
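The survive/kill idea can be sketched by hand with a hypothetical `discount` function (not a real Stryker run):

```typescript
// Original: orders strictly over $100 get $10 off.
const discount = (subtotal: number) => (subtotal > 100 ? subtotal - 10 : subtotal);
// Mutant: the boundary operator is flipped from > to >=.
const mutant = (subtotal: number) => (subtotal >= 100 ? subtotal - 10 : subtotal);

// This test passes against BOTH versions, so the mutant survives:
console.assert(discount(150) === 140 && mutant(150) === 140);

// A boundary-value test kills it; the two versions disagree at exactly 100:
console.assert(discount(100) === 100); // original: no discount at the boundary
console.assert(mutant(100) === 90);    // mutant: discount applied, test would fail
```

A surviving mutant like this usually means the test suite is missing a boundary-value case.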
### Stryker (JS/TS)
```bash
npm i -D @stryker-mutator/core @stryker-mutator/jest-runner # or vitest-runner
```
```javascript
// stryker.config.mjs
export default {
testRunner: "jest",
coverageAnalysis: "perTest",
mutate: ["src/**/*.ts", "!src/**/*.test.ts"],
thresholds: { high: 80, low: 60, break: 50 },
reporters: ["html", "clear-text", "progress"],
};
```
Run: `npx stryker run`
### mutmut (Python)
```bash
pip install mutmut
mutmut run --paths-to-mutate=src/
mutmut results # summary
mutmut show 42 # inspect surviving mutant #42
```
### Interpreting Scores
| Score | Meaning |
|-------|---------|
| 90%+ | Strong -- catching most logic changes |
| 70-89% | Decent -- review survivors in critical paths |
| <70% | Tests execute code but do not verify behavior |
Run mutation testing on critical business logic, not entire codebases. Ignore equivalent mutants (logically identical code).
---
## Snapshot Testing
### When to Use
- UI component render output, serialized data structures, CLI formatting
- Output where exact structure matters and is hard to assert field-by-field
### When NOT to Use
- Frequently changing output (snapshot fatigue, rubber-stamp reviews)
- Large snapshots (hard to review), implementation details (CSS classes, internal IDs)
- As substitute for targeted assertions when specific values matter
### File vs Inline Snapshots
```typescript
// File snapshot — stored in __snapshots__/*.snap
expect(tree).toMatchSnapshot();
// Inline snapshot — written into the test file by the runner on first run, updated with -u
expect(tree).toMatchInlineSnapshot();
```
Prefer inline for small output (<20 lines). Use property matchers for dynamic values:
```typescript
expect(user).toMatchSnapshot({ id: expect.any(String), createdAt: expect.any(Date) });
```
---
## Anti-Patterns (with Fixes)
**Testing private methods** -- Test through the public API instead. If a private method needs its own tests, extract it to its own module.
**Mocking everything** -- Only mock external boundaries (network, filesystem, DB, time). Let fast, deterministic internal collaborators use real implementations.
**Snapshot overuse** -- Use `expect(x).toBe("active")` for specific values. Reserve snapshots for structured output.
**Non-descriptive names** -- Replace `"works"` with `"should return empty array when no items match the filter"`.
**Shared mutable state** -- Initialize in `beforeEach`, not at module scope:
```typescript
// Bad: `items` is shared module-scope state; test B sees test A's mutation
describe("shared mutation", () => {
  const items: string[] = [];
  it("A", () => { items.push("a"); expect(items).toHaveLength(1); });
  it("B", () => { items.push("b"); expect(items).toHaveLength(1); }); // FAILS: length is 2
});

// Good: fresh array per test
describe("fresh per test", () => {
  let items: string[];
  beforeEach(() => { items = []; });
  it("A", () => { items.push("a"); expect(items).toHaveLength(1); });
  it("B", () => { items.push("b"); expect(items).toHaveLength(1); });
});
```
---
## Done When
- Coverage thresholds configured in `jest.config.*`, `vitest.config.*`, or `pyproject.toml` and enforced as a CI gate (non-zero exit on failure)
- Test files follow the project's co-location or `__tests__` directory convention consistently — no test files in ad-hoc locations
- Mocking strategy documented (in `qa-project-context.md` or inline): which boundaries get mocked (HTTP, DB, time) and which internal collaborators use real implementations
- No test reaches outside the process boundary — no real HTTP calls, no real database, no filesystem writes to shared state
- All snapshot tests are intentional and reviewed: no auto-accepted snapshots with `--updateSnapshot` in CI
## Related Skills
- **coverage-analysis** -- Interpreting coverage reports, identifying meaningful gaps, CI integration.
- **ci-cd-integration** -- Test stages in pipelines, parallelization, caching, deployment gating.
- **ai-test-generation** -- AI-assisted test generation, edge case discovery, legacy code bootstrapping.
- **shift-left-testing** -- Pre-commit hooks, IDE integration, developer workflow optimization.