---
name: tdd
description: Guides agent through test-driven development using red-green-refactor. Use when user mentions TDD, red-green-refactor, test-first development, outside-in TDD, mockist TDD, London-school TDD, acceptance TDD, or double-loop TDD. Do not use for writing E2E/Playwright tests, configuring test runners or frameworks, adding tests without TDD methodology, or general testing advice.
---

# Test-Driven Development

## Philosophy

**Core principle**: Tests should verify behavior through public interfaces, not implementation details. Code can change entirely; tests shouldn't.

**Good tests** are integration-style: they exercise real code paths through public APIs. They describe _what_ the system does, not _how_ it does it. A good test reads like a specification - "user can checkout with valid cart" tells you exactly what capability exists. These tests survive refactors because they don't care about internal structure.

**Bad tests** are coupled to implementation. They mock internal collaborators, test private methods, or verify through external means (like querying a database directly instead of using the interface). The warning sign: your test breaks when you refactor, but behavior hasn't changed. If you rename an internal function and tests fail, those tests were testing implementation, not behavior.
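To make the contrast concrete, here is a minimal sketch assuming a Vitest-style runner; `Cart`, its methods, and the internal `recalculateTotals` are hypothetical names for illustration, not part of this skill's references:

```typescript
import { describe, it, expect, vi } from "vitest";
import { Cart } from "./cart"; // hypothetical module

describe("Cart", () => {
  // GOOD: drives the public API and asserts observable behavior.
  // Survives any internal rewrite that preserves behavior.
  it("user can checkout with valid cart", () => {
    const cart = new Cart();
    cart.add({ sku: "book-1", priceCents: 1200, quantity: 2 });

    const order = cart.checkout();

    expect(order.totalCents).toBe(2400);
    expect(order.lines).toHaveLength(1);
  });

  // BAD: pins an internal collaborator. Renaming or inlining
  // recalculateTotals fails this test even though behavior is unchanged.
  it("checkout calls recalculateTotals", () => {
    const cart = new Cart();
    const spy = vi.spyOn(cart as any, "recalculateTotals");

    cart.checkout();

    expect(spy).toHaveBeenCalled();
  });
});
```

Rename `recalculateTotals` and the first test still passes while the second fails with behavior unchanged - exactly the warning sign above.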
Read [references/tests.md](references/tests.md) for test naming rules and examples when writing tests.
Read [references/mocking.md](references/mocking.md) for when and how to mock.
Read [references/test-tiers.md](references/test-tiers.md) for organizing unit vs component tests.
Read [references/component-testing.md](references/component-testing.md) for UI component testing patterns.

## Task Analysis

Two legitimate TDD styles. Pick one per feature; don't mix mid-feature. **Actively classify the task before writing any tests.**

### Step 1 — Check for explicit request

If user said "classical", "Detroit", or "inside-out" → **Classical**. Done.
If user said "outside-in", "London", "mockist", or "double-loop" → **[Outside-in](references/outside-in.md)**. Done.
Otherwise, proceed to Step 2.

### Step 2 — Examine task and codebase

- [ ] **Entry point** — where does the change enter? (UI component, route, API handler, domain service, utility)
- [ ] **Scope** — single module or crosses multiple layers?
- [ ] **Collaborators** — existing with stable interfaces, or new ones to discover?
- [ ] **Task language** — user-visible behavior ("user can...") or internal logic ("calculate", "transform")?

**[Outside-in](references/outside-in.md)** when ANY:

- Entry point is UI / route / controller
- Spans 2+ layers with new collaborators to discover
- Describes user journey or end-to-end flow
- Greenfield multi-component feature

**Classical** when:

- Single module with existing interface
- Algorithms, data transforms, domain rules
- Integrating with existing deep modules
- No outside-in signals

**Default: classical** when ambiguous. Outside-in requires discipline to replace mocks with real implementations; classical has no such cleanup tax.

### Step 3 — State classification

Before proceeding, copy the template from [assets/classification-template.md](assets/classification-template.md), fill it in, and present it to the user. Wait for user confirmation before writing any tests.

## Anti-Pattern: Horizontal Slices

**DO NOT write all tests first, then all implementation.** This is "horizontal slicing" - treating RED as "write all tests" and GREEN as "write all code." This produces **crap tests**:

- Tests written in bulk test _imagined_ behavior, not _actual_ behavior
- You end up testing the _shape_ of things (data structures, function signatures) rather than user-facing behavior
- Tests become insensitive to real changes - they pass when behavior breaks, fail when behavior is fine
- You outrun your headlights, committing to test structure before understanding the implementation

**Correct approach**: Vertical slices via tracer bullets. One test → one implementation → repeat. Each test responds to what you learned from the previous cycle. Because you just wrote the code, you know exactly what behavior matters and how to verify it.

```
WRONG (horizontal):
  RED: test1, test2, test3, test4, test5
  GREEN: impl1, impl2, impl3, impl4, impl5

RIGHT (vertical):
  RED→GREEN: test1→impl1
  RED→GREEN: test2→impl2
  RED→GREEN: test3→impl3
  ...
```

## Workflow (Classical / Inside-Out)

### 1. Planning

Using the entry point and scope identified in [Task Analysis](#task-analysis):

- [ ] Confirm with user what interface changes are needed
- [ ] Confirm with user which behaviors to test (prioritize)
- [ ] Identify opportunities for [deep modules](references/deep-modules.md) (small interface, deep implementation)
- [ ] Design interfaces for [testability](references/interface-design.md)
- [ ] List the behaviors to test (not implementation steps)
- [ ] Get user approval on the plan

Ask: "What should the public interface look like? Which behaviors are most important to test?"

**You can't test everything.** Confirm with the user exactly which behaviors matter most. Focus testing effort on critical paths and complex logic, not every possible edge case.

### 2. Tracer Bullet

Write ONE test that confirms ONE thing about the system:

```
RED: Write test for first behavior → test fails
GREEN: Write minimal code to pass → test passes
```

This is your tracer bullet - it proves the path works end-to-end.
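For instance, a sketch of one full cycle, assuming Vitest; the `applyDiscount` behavior and file names are hypothetical:

```typescript
// RED (discount.test.ts): one test for the first behavior.
// It fails at first because ./discount does not exist yet.
import { it, expect } from "vitest";
import { applyDiscount } from "./discount";

it("applies a percentage discount to the total", () => {
  expect(applyDiscount(1000, 10)).toBe(900); // 10% off 1000 cents
});

// GREEN (discount.ts): minimal code to pass, nothing more.
// No input validation, no stacking rules; each waits for its own test.
export function applyDiscount(totalCents: number, percent: number): number {
  return Math.round(totalCents * (1 - percent / 100));
}
```

The next cycle would add one more test (say, rejecting a negative percent) and only then the code to pass it.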
### 3. Incremental Loop

For each remaining behavior:

```
RED: Write next test → fails
GREEN: Minimal code to pass → passes
```

Rules:

- One test at a time
- Only enough code to pass current test
- Don't anticipate future tests
- Keep tests focused on observable behavior

### 4. Refactor

After all tests pass, read [references/refactoring.md](references/refactoring.md) and look for refactor candidates:

- [ ] Extract duplication
- [ ] Deepen modules (move complexity behind simple interfaces)
- [ ] Apply SOLID principles where natural
- [ ] Consider what new code reveals about existing code
- [ ] Run tests after each refactor step

**Never refactor while RED.** Get to GREEN first.

## Checklist Per Cycle

```
[ ] Test describes behavior, not implementation
[ ] Test uses public interface only
[ ] Test would survive internal refactor
[ ] Code is minimal for this test
[ ] No speculative features added
```

---

When using outside-in TDD, read [references/outside-in.md](references/outside-in.md) for the full double-loop workflow.
When deciding test file placement, read [references/test-tiers.md](references/test-tiers.md) for unit vs component tier rules.
When testing UI components, read [references/component-testing.md](references/component-testing.md) for mount helpers and boundary faking patterns.

## Validation

To validate changes to this skill, run the 4-stage process: discovery validation (test frontmatter triggers), logic validation (simulate execution), edge case testing (attack the logic), and architecture refinement (enforce progressive disclosure). See [skills best practices](https://github.com/mgechev/skills-best-practices) for prompts.