---
name: code-test
description: Enforce test-driven development (RED-GREEN-REFACTOR). Use when implementing features, fixing bugs, or changing behavior - write failing test first, then minimal code to pass.
---

# Test-Driven Development

Write the test first. Watch it fail. Write minimal code to pass.

**Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing.

---

## When to Use

**Always use for:**
- New features
- Bug fixes
- Behavior changes
- Refactoring

**Exceptions (confirm with user):**
- Throwaway prototypes
- Generated code
- Configuration files

**Workflow Integration:**
- **Multiple independent tasks from a plan?** → Use `task-dispatch` skill (it enforces TDD per task with quality gates)
- **Single implementation task?** → Use this skill directly for TDD workflow

---

## The Iron Law

```
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
```

Wrote code before the test? Delete it. Start over. No exceptions.

---

## Red-Green-Refactor Cycle

### 1. RED - Write Failing Test

Write one minimal test showing what should happen.

**Requirements:**
- One behavior per test
- Clear name describing behavior
- Real code (avoid mocks unless unavoidable)

### 2. Verify RED - Watch It Fail

**MANDATORY. Never skip.**

Run the test. Confirm:
- Test fails (not errors from typos)
- Failure message matches expectation
- Fails because feature is missing

**Test passes immediately?** You're testing existing behavior. Fix the test.

### 3. GREEN - Minimal Code

Write the simplest code to pass the test.

**DO:** Just enough to pass, simple implementation
**DON'T:** Add features, refactor other code, add configurability

### 4. Verify GREEN - Watch It Pass

**MANDATORY.**

Run the test. Confirm:
- Test passes
- Other tests still pass
- No errors or warnings

### 5. REFACTOR - Clean Up

Only after green:
- Remove duplication
- Improve names
- Extract helpers

Keep tests green. Don't add behavior.

### 6. Repeat

Next failing test for next behavior.

---

## Common Rationalizations

| Excuse | Reality |
|--------|---------|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Already manually tested" | Ad-hoc is not systematic. No record, can't re-run. |
| "Deleting X hours is wasteful" | Sunk cost fallacy. Unverified code is debt. |
| "Need to explore first" | Fine. Throw away exploration, then TDD. |
| "Test hard to write" | Hard to test = hard to use. Simplify design. |

---

## Red Flags - Stop and Start Over

- Code written before test
- Test passes immediately
- Can't explain why test failed
- "Just this once" rationalization
- Keeping code "as reference"

**All of these mean:** Delete code. Start with TDD.

---

## Verification Checklist

Before marking work complete:

- [ ] Every new function has a test
- [ ] Watched each test fail before implementing
- [ ] Each test failed for expected reason
- [ ] Wrote minimal code to pass each test
- [ ] All tests pass
- [ ] No errors or warnings in output

---

## TDD Evidence Format (For Subagent Verification)

When implementing as a subagent, you MUST output this evidence block:

```yaml
tdd_evidence:
  tests_written:
    - name: "test_feature_x"
      file: "tests/test_x.py"
      red_output: "FAILED - [actual failure message]"
      green_output: "PASSED - 1 passed in 0.05s"
  implementation_files:
    - path: "src/feature.py"
  all_tests_pass: true
  test_command: "pytest tests/test_x.py -v"
  final_output: "[full test output]"
```

**This is REQUIRED for SubagentStop hook verification.**

---

## Integration

**Use with:**
- `code-debug` - Write failing test to reproduce bug before fixing
- `task-dispatch` - Subagents follow TDD for each task
- `completion-verify` - Run tests before claiming completion