---
name: tdd-workflow
description: Test-driven development execution workflow with red-green-refactor cycle, increment decomposition, and pause/continue rules. Use when building features or fixing bugs using TDD. Framework-agnostic with language-specific configs for Python, TypeScript, Go, and Ruby delegation.
---

# TDD Workflow

## Core Principle

**Never write implementation code before a failing test exists for that behavior.**

## Execution Flow

### 1. Setup

```
LANG = config_read("tech_stack", "unknown")
If LANG == "unknown": detect from project files (package.json → TypeScript, Gemfile → Ruby, go.mod → Go, pyproject.toml → Python)
If LANG == "unknown": AskUserQuestion("What language/framework?")

RUNNER = lookup LANG in [references/language-configs.md]
If LANG in [ruby]: delegate to `rspec-coder` or `minitest-coder` skill for runner details

INCREMENTS = decompose feature using patterns from [references/increments.md]
  - Read requirements/plan
  - Identify pattern (Data Transformation, CRUD, State Machine, Calculation, Integration)
  - Break into ordered increments: degenerate → happy path → variations → edge cases → errors

For each INCREMENT in INCREMENTS:
  TaskCreate(subject: INCREMENT.name, description: INCREMENT.test_description)

AskUserQuestion("Review increments. Adjust ordering or scope?", options: ["Looks good", "Modify"])
```

### 2. TDD Loop (per increment)

```
For each INCREMENT in INCREMENTS:
  TaskUpdate(INCREMENT.task_id, status: "in_progress")

  # RED
  Write failing test to real file (not code block)
  RUN = Bash(RUNNER.test_command)
  If RUN.status == PASS: STOP — investigate unexpected pass
  If RUN.status == FAIL (wrong reason): fix test, rerun
  PAUSE → show test + failure output, wait for user

  # GREEN
  Write minimal implementation code
  RUN = Bash(RUNNER.full_suite_command)
  If RUN.status == FAIL: show output, PAUSE → ask user to fix or hand off
  If RUN.status == PASS: auto-continue (no pause)

  # REFACTOR
  If improvement opportunities exist:
    Refactor implementation and/or test code
    RUN = Bash(RUNNER.full_suite_command)
    If RUN.status == FAIL: revert refactor, PAUSE → discuss
    If RUN.status == PASS: auto-continue (no pause)

  TaskUpdate(INCREMENT.task_id, status: "completed")
  Brief summary of what was implemented
  → immediately begin next increment (no pause between increments)
```

### 3. Wrap-up

```
RUN = Bash(RUNNER.full_suite_command)
Report: increments completed, total tests, pass/fail status
Suggest: remaining work, missed edge cases, integration tests needed
```

## Pause/Continue Rules

| Situation | Action |
|-----------|--------|
| RED: test fails (expected) | **Pause** — show test + failure, wait for user |
| RED: test passes unexpectedly | **Stop** — investigate, don't proceed |
| GREEN: all tests pass | **Auto-continue** to REFACTOR |
| GREEN: tests fail | **Pause** — show output, ask user |
| REFACTOR: tests pass | **Auto-continue** to next increment |
| REFACTOR: tests fail | **Revert + Pause** — discuss approach |
| Between increments | **Auto-continue** — no pause |

## Task Tracking Integration

```
TASK_TRACKING = config_read("task_tracking.enabled", "false")
WORKFLOW_ID = "tdd-{timestamp}"

If TASK_TRACKING:
  For each TaskCreate: add metadata {workflow: WORKFLOW_ID, phase: "tdd-loop"}
  For each TaskUpdate: wrap with If TASK_TRACKING: TaskUpdate(...)
  On Wrap-up: update ledger if LEDGER_ENABLED

LEDGER_ENABLED = config_read("task_tracking.ledger", "false")
LEDGER_PATH = config_read("task_tracking.ledger_path", ".agents/workflow-ledger.yml")
```

## Red-Green-Refactor Cycle

### 1. Red Phase: Write a Failing Test

- Select one small, specific behavior from the increment
- Write a descriptive test that expresses the expected behavior
- Run the test to confirm it **fails** (red)
- The failure should be for the right reason (not syntax error or missing dependency)

### 2. Green Phase: Minimal Implementation

- Implement only the minimal code necessary to pass the failing test
- Resist adding extra features or handling edge cases not yet covered by tests
- Run the test to confirm it **passes** (green)

### 3. Refactor Phase: Improve Quality

- Review both implementation and test code for improvements
- Remove duplication, improve naming, extract methods
- Apply language-specific idioms and patterns
- Run tests after each refactoring step to ensure they still **pass**

## Test Sequencing Strategy

**Order tests from simple to complex:**

1. **Happy path** — The core behavior with valid inputs
2. **Validation tests** — Required fields, format constraints
3. **Edge cases** — Boundary conditions, empty values, unusual inputs
4. **Error handling** — Invalid inputs, failure scenarios
5. **Integration points** — Interactions with other components

## Test Quality Guidelines

**Each test should:**

- Cover exactly one behavior
- Be isolated — no shared state between tests
- Have a clear, descriptive name
- Fail for only one reason

**Avoid:**

- Testing implementation details instead of behavior
- Writing tests after the code
- Sharing test setup that creates hidden dependencies
- Skipping the refactor phase

## Framework-Specific Implementation

For language-specific test runner commands, see [references/language-configs.md](references/language-configs.md).

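Before a runner can be looked up, the Setup step has to settle on a language. The following is a minimal sketch of that detection, assuming only the four marker files named in Setup. `MARKERS`, `detect_language`, and `detect_language_in` are hypothetical names, and checking the markers in the listed order is an assumption.

```python
from pathlib import Path
from typing import Iterable

# Marker files and languages as listed in the Setup step.
# Assumption: markers are checked in this order and the first match wins.
MARKERS = {
    "package.json": "TypeScript",
    "Gemfile": "Ruby",
    "go.mod": "Go",
    "pyproject.toml": "Python",
}


def detect_language(filenames: Iterable[str]) -> str:
    """Return the language of the first marker present, else "unknown"."""
    present = set(filenames)
    for marker, language in MARKERS.items():
        if marker in present:
            return language
    return "unknown"


def detect_language_in(project_root: str) -> str:
    """Apply detect_language to the entries at the top of a project directory."""
    return detect_language(p.name for p in Path(project_root).iterdir())
```

An `"unknown"` result corresponds to the branch in Setup that falls back to asking the user.
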
For Ruby projects:

- **RSpec:** Apply `rspec-coder` skill
- **Minitest:** Apply `minitest-coder` skill

A Rails example is available in [references/rails-tdd-workflow.md](references/rails-tdd-workflow.md).

## Test Generation Patterns

When writing tests outside a TDD loop (e.g., adding coverage to existing code), follow these patterns.

### Framework Detection

| Evidence | Framework |
|----------|-----------|
| `spec/` + `_spec.rb` + `.rspec` | RSpec |
| `test/` + `_test.rb` + `test_helper.rb` | Minitest |
| `*.test.js` + `jest.config.js` | Jest |
| `*.spec.ts` in `tests/` or `e2e/` | Playwright |
| `vitest.config.js` | Vitest |

### Test Plan Structure

Before writing tests, create a plan covering:

1. **Scope** - What functionality will be tested
2. **Happy path scenarios** - Expected successful flows
3. **Sad path scenarios** - Error handling, validations, failures
4. **Edge cases** - Boundary conditions, null/empty values, unusual inputs
5. **Auth checks** - Authorization/authentication (if applicable)
6. **Test data requirements** - Fixtures or data needed
7. **Mocking strategy** - External services/dependencies to mock

### Test Case Matrix

Use for complex scenarios with multiple parameters:

| Objective | Inputs | Expected Output | Test Type |
|-----------|--------|-----------------|-----------|
| Validate creation | valid params | Created, 201 | Happy Path |
| Reject duplicate | existing data | Error, 422 | Sad Path |
| Handle empty | missing field | Validation error | Edge Case |

### Completion Criteria

Tests are complete when ALL of these are met:

**Coverage:** All public methods tested, happy/sad/edge paths covered, auth checks included.

**Quality:** Tests pass (verified by running), isolated (no shared state), follow AAA pattern (Arrange-Act-Assert), descriptive names.

**Framework compliance:** Proper matchers, appropriate mocking, follows project patterns.

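The three rows of the Test Case Matrix, written in the Arrange-Act-Assert shape named under Quality, might look like the sketch below. It is an illustration only: `create_user`, its in-memory `store`, and the `(status, body)` return shape are hypothetical stand-ins for real code under test. The 201 and 422 codes come from the matrix. Returning 422 for the missing-field row is an assumption, since the matrix only says "Validation error".

```python
import unittest


def create_user(store, params):
    """Hypothetical function under test; returns (status_code, body)."""
    email = params.get("email")
    if not email:
        # Assumption: a missing field is reported as 422.
        return 422, {"error": "email is required"}
    if email in store:
        return 422, {"error": "email already exists"}
    store[email] = params
    return 201, {"email": email}


class CreateUserTest(unittest.TestCase):
    def test_valid_params_create_a_user(self):
        # Arrange: each test builds its own store, so no state is shared
        store = {}
        # Act
        status, _body = create_user(store, {"email": "a@example.com"})
        # Assert
        self.assertEqual(status, 201)

    def test_duplicate_email_is_rejected(self):
        # Arrange
        store = {"a@example.com": {"email": "a@example.com"}}
        # Act
        status, body = create_user(store, {"email": "a@example.com"})
        # Assert
        self.assertEqual((status, body["error"]), (422, "email already exists"))

    def test_missing_email_returns_validation_error(self):
        # Arrange
        store = {}
        # Act
        status, body = create_user(store, {})
        # Assert
        self.assertEqual((status, body["error"]), (422, "email is required"))
```

Saved as a module, this can be run with `python -m unittest`. Each test name states the expected behavior, which keeps a failure report readable without opening the test body.
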
### Test Writing Best Practices

- Test behavior, not implementation details
- One assertion focus per test
- Use descriptive test names that document expected behavior
- Prefer explicit assertions over implicit ones
- Use test doubles sparingly and purposefully
- Group related tests with describe/context blocks
- Test data should be minimal but sufficient
- For Rails: use transactional fixtures and database cleaner
- For Playwright: proper waiting strategies, avoid flaky selectors
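
Two of these practices, testing behavior rather than implementation and using test doubles purposefully, can be illustrated with a small sketch. `Signup`, `FakeMailer`, and their methods are hypothetical names, not part of this skill. The only double is a hand-written fake at the external boundary (email delivery), and the tests assert on observable outcomes rather than on `Signup` internals.

```python
class FakeMailer:
    """Hand-written test double for an external mail service (hypothetical)."""

    def __init__(self):
        self.sent = []

    def send(self, to, subject):
        # Record the call instead of delivering anything.
        self.sent.append((to, subject))


class Signup:
    """Hypothetical unit under test."""

    def __init__(self, mailer):
        self._mailer = mailer
        self._emails = set()

    def register(self, email):
        if email in self._emails:
            return False
        self._emails.add(email)
        self._mailer.send(email, "Welcome")
        return True


def test_registering_sends_one_welcome_email():
    mailer = FakeMailer()
    signup = Signup(mailer)

    signup.register("a@example.com")

    assert mailer.sent == [("a@example.com", "Welcome")]


def test_registering_twice_does_not_resend():
    mailer = FakeMailer()
    signup = Signup(mailer)
    signup.register("a@example.com")

    accepted = signup.register("a@example.com")

    assert accepted is False
    assert len(mailer.sent) == 1
```

Neither test inspects `Signup._emails`, so the storage could change from a set to a database table without touching the tests.
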