---
description: >-
  You MUST use this skill before writing any test cases or adding test
  coverage. Use when the goal is to verify code behavior through tests — new
  tests, missing coverage, or a reproduction test after a bug fix. Stronger
  signals: "write tests", "add tests", "unit test", "integration test",
  "test coverage", "TDD", "test this function", "add a regression test".
  Also apply automatically after completing sextant:fix-bug, before marking
  the bug as resolved.
---

!../principles/SKILL_BODY.md
!../tool-gitnexus/SKILL_BODY.md

---

# Test Writing Workflow

## Core Principle

The value of tests is not "line coverage," but **accurately reporting regressions when code changes**. Good tests are a safety net for future refactoring — they go red only when behavior is truly broken, not when unrelated changes are made.

---

## Entering from a Bug Fix

If this test session was triggered by a bug fix, apply the following adjustments before entering the standard workflow:

**1. Write the reproduction test first — before anything else.** The root cause and trigger conditions are already known from the bug fix. Use that analysis directly:

```python
def test_<bug_condition>():
    # This test MUST FAIL before the fix is applied.
    # It documents the exact condition that triggered the bug.
    result = <call that triggers the bug>
    assert <expected post-fix behavior>
```

**2. Skip Step 1 (code analysis) for the fixed function.** Jump directly to Step 2, using the impact assessment from the bug fix as input for "what callers depend on."

**3. Prioritize boundary tests adjacent to the bug.** Focus on the boundary conditions the bug exposed — null, zero, extreme values, race conditions (see the sketch after point 4).

**4. Name the reproduction test clearly:**

```python
def test_calculate_discount_zero_rate_does_not_divide_by_zero():
    # fix: rate=0 caused ZeroDivisionError (see bug fix root cause)
    assert calculate_discount(price=100, rate=0) == 100
```
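A minimal sketch of point 3 in practice, reusing `calculate_discount(price, rate)` from the example above. The import path and the assumption that it computes `price * (1 - rate)` are hypothetical, not from the source:

```python
import pytest

from myapp.pricing import calculate_discount  # hypothetical import path


# Boundary tests adjacent to the rate=0 bug: the trigger value itself,
# a value just past it, and the opposite end of the valid range.
@pytest.mark.parametrize(
    ("rate", "expected"),
    [
        (0, 100),    # the exact trigger of the original bug
        (0.01, 99),  # just beyond the trigger
        (1, 0),      # opposite extreme of the valid range
    ],
)
def test_calculate_discount_handles_rate_boundaries(rate, expected):
    # pytest.approx keeps the assertion stable against float rounding
    assert calculate_discount(price=100, rate=rate) == pytest.approx(expected)
```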
Once the reproduction test is in place and passing, skip to Step 2 for any additional coverage.

**If the reproduction test still fails after the fix was applied:** stop the test-writing workflow. Report to the user: "The reproduction test is still failing — the fix may be incomplete. Please re-trigger with `sextant:fix-bug` to re-evaluate the root cause, or describe the remaining failure and I'll investigate with baseline rules." Do not proceed to Step 2.

---

## Before You Start

### Five Characteristics of Good Tests (F.I.R.S.T)

| Characteristic | Description |
|---------------|-------------|
| **Fast** | Single test < 100ms; full test suite < 30s |
| **Independent** | No dependencies between tests; no dependency on execution order |
| **Repeatable** | Results consistent across any environment; no dependency on external services, system time, or random numbers |
| **Self-validating** | Results are pass/fail; no manual log inspection required |
| **Prioritized by Impact** | Write tests for high-impact functions first: core business logic, functions with many callers, code that is hard to debug manually |

### Choosing the Test Level

```
─── Test Pyramid (bottom to top: decreasing quantity, increasing cost) ─

         ╱ ╲          E2E Tests — few, verify key user paths
        ╱   ╲
       ╱─────╲        Integration Tests — moderate, verify inter-module collaboration
      ╱       ╲
     ╱─────────╲      Unit Tests — many, verify behavior of individual functions/classes
─────────────────────────────────────────────────────
```

**Selection criteria:**

- Pure functions, computation logic, data transformation → **Unit tests**
- Inter-module interaction, database reads/writes, API calls → **Integration tests**
- Complete user operation flows → **E2E tests** (cover core paths only)

---

## Complete Execution Workflow

> **Progress tracking:** At the start of each step, output an updated progress block.
>
> ```
> Write Tests Progress
> ✓ Step 1: Analyze Code Under Test — done
> → Step 2: Design Test Boundaries — in progress
> ○ Step 3: Determine Test Structure
> ○ Step 4: Handle Dependencies
> ○ Step 5: Implement Tests
> ○ Step 6: Validate Quality
> ```
>
> Replace `○` with `→` for the current step, and `✓` once complete.

---

### Step 1: Analyze the Code Under Test

Before writing tests, thoroughly understand the behavioral contract of the code under test.

**Questions to answer:**

- What is the **responsibility** of the function/class under test?
- What are its **inputs**? What is the **valid range** and **boundary values** for each input?
- What are its **outputs** (return values + side effects)?
- What **preconditions** does it have?
- Which **exceptions** might it throw? Under what conditions?
- Which **external collaborators** does it depend on? Do these need to be mocked?

🔗 When GitNexus is available, use the `context` MCP tool to get callees (external dependencies) automatically.
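As an illustration, the answers can be written down as a contract before any test exists. The `apply_discount` function here is hypothetical, not taken from your project:

```python
def apply_discount(price: float, rate: float) -> float:
    """Return `price` reduced by `rate`.

    Responsibility:  pure price computation; no side effects.
    Inputs:          price >= 0; rate within [0.0, 1.0].
    Outputs:         the discounted price (return value only).
    Preconditions:   price has already been validated as non-negative.
    Exceptions:      ValueError when rate falls outside [0, 1].
    Collaborators:   none; unit-testable without mocks.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f"rate must be within [0, 1], got {rate}")
    return price * (1.0 - rate)
```

Each line of this contract then maps onto a row of the boundary matrix in Step 2.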
### Step 2: Design Test Boundaries

Divide the behavioral space into discrete test regions.

**Three required test scenarios:**

**① Happy Path** — typical input → expected output; each meaningful parameter combination

**② Boundary Conditions:**

- Null/zero/empty collection/empty string
- Maximum/minimum/just beyond range
- Single-element collection (off-by-one happens most here)
- Type boundaries (integer overflow, float precision)

**③ Error Path:**

- Invalid input → expected exception/error code
- External dependency failure → expected degradation/error handling
- Concurrency/race conditions (if applicable)
- Timeout scenarios (if network/IO is involved)

```
─── Test Boundary Matrix ────────────────────────────
Target under test: <function/class>

Happy path:
  case_1: <typical input → expected output>
Boundary conditions:
  boundary_1: <boundary input → expected behavior>
Error path:
  error_1: <invalid input → expected error/degradation>
─────────────────────────────────────────────────────
```

### Confirmation Gate (between Step 2 and Step 3)

For **Medium and Large tasks** (new test module or full coverage pass), after presenting the Test Boundary Matrix, use the confirmation gate with:

- **question**: the Test Boundary Matrix above, plus: `"Does this boundary coverage match your expectations before I write the tests?"`
- **options**:
  - `"Yes, proceed with implementation"`
  - `"No — adjust the scope or boundaries"`

For **Lightweight tasks** (1–3 tests for a single function) or **bug-fix reproduction tests**: skip — proceed directly to Step 3.

**If the user selects "No":** ask *"Which scenarios should be added, removed, or changed?"*, update the matrix, and show it again before proceeding.

---

### Step 3: Determine Test Structure

**Naming conventions:** Test names must describe behavior, not repeat the function name.

```python
# ✅ Good test naming
def test_discount_returns_zero_when_rate_exceeds_price(): ...
def test_login_fails_with_expired_token(): ...
def test_parse_handles_empty_input_gracefully(): ...

# ❌ Poor test naming
def test_discount(): ...
def test_login(): ...
```

**Naming pattern:** `test_<unit>_<expected_behavior>` or `test_<unit>_when_<condition>_then_<result>`

**AAA Structure (Arrange → Act → Assert):**

```python
def test_order_total_applies_discount_for_vip_user():
    # Arrange
    user = create_vip_user()
    items = [Item("book", price=100), Item("pen", price=20)]
    order = Order(user=user, items=items)

    # Act
    total = order.calculate_total()

    # Assert
    assert total == 96.0  # VIP 20% discount: (100 + 20) * 0.8
```

**Each test verifies only one behavior.** Multiple asserts checking different aspects of the same behavior are acceptable.

### Step 4: Handle External Dependencies

| Dependency Type | Isolation Method | Use Case |
|----------------|-----------------|----------|
| Pure interface dependency (Repository, Client) | Mock objects | Unit tests |
| Database | In-memory database / test containers | Integration tests |
| External API | HTTP mock (e.g., responses, wiremock) | Integration tests |
| File system | Temp directory + teardown cleanup | File read/write logic |
| Time/random | Inject controllable clock / fixed seed | Deterministic results needed |

**Mock discipline:**

- Mocked behavior must be consistent with the real implementation's contract
- Don't mock internal implementations of code you don't own — mock interfaces
- If mock setup is more complex than the code under test, the code has too-heavy dependencies

🔗 When GitNexus is available, use the `context` tool's callees to distinguish internal vs. external dependencies.

### Step 5: Implement Tests

**Recommended implementation order** (see the sketch after this list):

1. Write the most typical happy path case first — verify AAA structure and mock setup are correct
2. Add boundary conditions — cover boundary items in the Step 2 matrix one by one
3. Write error paths — verify error handling meets expectations
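A compressed sketch of that order, using a hypothetical `UserService` with a mocked repository (per the Step 4 table). All names and behavior here are illustrative, not from your codebase:

```python
from unittest.mock import Mock

import pytest


class UserService:
    """Hypothetical code under test (stand-in for your real module)."""

    def __init__(self, repo):
        self.repo = repo

    def get_user_name(self, user_id: int) -> str:
        if user_id <= 0:
            raise ValueError("user_id must be positive")
        record = self.repo.find(user_id)
        return record["name"] if record else "<unknown>"


# 1) Happy path first: proves the AAA structure and mock setup are sound.
def test_get_user_name_returns_name_for_existing_user():
    repo = Mock()
    repo.find.return_value = {"id": 7, "name": "Ada"}
    assert UserService(repo).get_user_name(7) == "Ada"


# 2) Boundary: the collaborator returns nothing for this user.
def test_get_user_name_falls_back_when_user_is_missing():
    repo = Mock()
    repo.find.return_value = None
    assert UserService(repo).get_user_name(7) == "<unknown>"


# 3) Error path: invalid input fails before the collaborator is touched.
def test_get_user_name_rejects_non_positive_id():
    repo = Mock()
    with pytest.raises(ValueError):
        UserService(repo).get_user_name(0)
    repo.find.assert_not_called()
```

Note that the error-path test pins down behavior (validation happens before the repository call) without asserting on implementation details.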
**Reference existing test style:** new tests' file organization, naming style, assertion library, and fixture patterns must be consistent with existing tests in the project.

🔗 When GitNexus is available, use `query "<target name> test"` to find reference tests.

### Step 6: Validate Test Quality

```
─── Test Quality Checklist ──────────────────────────
[ ] Independence: No dependencies between tests? Can run in any order? Can run individually?
[ ] Determinism: No dependency on system time, random numbers, external services?
[ ] Readability: Does reading the test name tell you what's being tested? Is AAA structure clear?
[ ] Validity: If a line of logic in the code under test is deleted, will the test fail?
[ ] Non-brittleness: Does the test depend on implementation details rather than behavior?
[ ] Boundary coverage: Does each case in the Step 2 matrix have a corresponding test?
[ ] Error messages: Is the error message on assertion failure sufficient to locate the problem?
[ ] Speed: Is a single test < 100ms?
─────────────────────────────────────────────────────
```

**"Validity" check technique (mutation testing mindset):** make these changes to the code under test and check whether the tests fail:

- Change `>` to `>=`
- Change a return value to `None` / `null`
- Comment out a key conditional branch

If the tests still pass, they cover lines but don't verify behavior.

---

## Forbidden Actions

- **Writing business logic in tests**: assert expected concrete values directly; don't replicate the computation logic
- **Asserting `assertTrue(result is not None)`**: it doesn't verify any behavior
- **Depending on execution order**: each test must be self-contained
- **Testing private methods**: test the behavior of public interfaces, not implementation details
- **Ignoring flaky tests**: fix or delete them — flaky tests cause teams to ignore all failures

---

## Sprint State Integration

If `.sextant/state.json` exists in the project root and the current task matches a sprint task:

- **On start:** offer to update the task's `status` from `pending` → `in_progress`. Ask: *"Update sprint state to mark Task N as in_progress?"*
- **On completion** (acceptance condition met): offer to update `status` to `done`. Ask: *"Update sprint state to mark Task N as done?"*
- **On blocker** (test failure, missing dependency, or unresolvable ambiguity that halts progress): surface the issue, then ask: *"Mark Task N as blocked and record the reason in flags?"* If confirmed, set `status: "blocked"` and append `{"task": N, "reason": "<short description>"}` to the top-level `flags` array. Do not proceed to the next task while a task is blocked.

Do not write the file without explicit user confirmation. If the user declines, continue without state updates.

---

## Reply Format

**Lightweight task** (1–3 tests for a single function): one sentence only.

```
✅ Added tests for `<target>` covering <N> scenarios (<happy path / boundary / error>).
```

**Medium/large task** (new test module or full coverage pass): full block.

Test Summary:

| # | Item | Detail |
|---|------|--------|
| [1] | Conclusion | |
| [2] | Changes | |
| [3] | Risks / Assumptions | |
| [4] | Verification | |
| [5] | Needs your input | |
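For reference, a minimal `.sextant/state.json` shape consistent with the fields named in Sprint State Integration above (a hypothetical sketch; the authoritative schema belongs to sextant):

```json
{
  "tasks": [
    {"task": 1, "status": "done"},
    {"task": 2, "status": "blocked"}
  ],
  "flags": [
    {"task": 2, "reason": "reproduction test still failing after fix"}
  ]
}
```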