--- name: characterization-test-generator description: Generates tests that capture current behavior of existing code before refactoring. Use when you need a safety net before AI-assisted refactoring or modifying legacy code. allowed-tools: [Read, Write, Edit, Glob, Grep, Bash, AskUserQuestion] --- # Characterization Test Generator Generate tests that document what existing code **actually does** — not what it should do. These tests capture current behavior so you can refactor with confidence, especially when using AI to modify code. ## When This Skill Activates Use this skill when the user: - Says "I need to refactor this" or "help me refactor" - Wants to "add tests before changing code" - Asks for "characterization tests" or "golden master tests" - Says "I want to safely modify this class/module" - Mentions "legacy code" or "untested code" - Wants AI to refactor but is worried about breaking things ## Why This Matters for AI-Generated Code AI refactoring is powerful but risky: - AI doesn't remember **why** code works a certain way - AI may "improve" code and break subtle behavior - Characterization tests freeze current behavior as a regression suite - If any test fails after refactoring, you know something changed ## Process ### Phase 1: Discover Code Under Test ``` Glob: **/*.swift (in the target module) Grep: "class |struct |enum |protocol |func " (public API surface) ``` Identify: - [ ] Public types and their public methods - [ ] Input types and return types - [ ] Side effects (network, disk, UserDefaults, notifications) - [ ] Dependencies (injected or created internally) - [ ] State mutations (properties that change) ### Phase 2: Classify Behavior For each public method, classify: | Category | Example | Test Strategy | |----------|---------|---------------| | Pure computation | `calculate(a, b) -> c` | Input/output pairs | | State mutation | `addItem(_:)` modifies array | Before/after state checks | | Async operation | `fetchData() async -> [Item]` | Mock dependencies, verify results | | Side effect | `save()` writes to disk | Verify mock was called | | Event emission | `delegate?.didUpdate()` | Capture delegate calls | | Error path | Throws on invalid input | Verify correct error type | ### Phase 3: Generate Tests #### Test Naming Convention Characterization tests should be clearly labeled: ```swift import Testing @testable import YourApp @Suite("Characterization: ItemManager") struct ItemManagerCharacterizationTests { // Tests document CURRENT behavior, not ideal behavior } ``` #### Template: Pure Computation ```swift @Test("current behavior: calculates total with tax") func calculatesTotal() { let calculator = PriceCalculator() let result = calculator.total(subtotal: 100.0, taxRate: 0.08) // Document actual behavior — even if the rounding seems wrong #expect(result == 108.0) } ``` #### Template: State Mutation ```swift @Test("current behavior: addItem appends and sorts by date") func addItemSortsbyDate() { let manager = ItemManager() let older = Item(title: "A", date: .distantPast) let newer = Item(title: "B", date: .now) manager.addItem(older) manager.addItem(newer) // Document actual ordering behavior #expect(manager.items.first?.title == "A") #expect(manager.items.last?.title == "B") #expect(manager.items.count == 2) } ``` #### Template: Async with Dependencies ```swift @Test("current behavior: loadItems returns cached data when offline") func loadItemsOffline() async throws { let mockNetwork = MockNetworkClient(shouldFail: true) let mockCache = MockCache(items: [Item.sample]) let service = ItemService(network: mockNetwork, cache: mockCache) let items = try await service.loadItems() // Document: falls back to cache when network fails #expect(items.count == 1) #expect(items.first?.title == Item.sample.title) } ``` #### Template: Side Effects ```swift @Test("current behavior: save writes to UserDefaults") func saveWritesToDefaults() { let defaults = MockUserDefaults() let settings = SettingsManager(defaults: defaults) settings.setTheme(.dark) settings.save() // Document: saves as string, not as enum raw value #expect(defaults.lastSetValue as? String == "dark") #expect(defaults.lastSetKey == "app_theme") } ``` #### Template: Error Paths ```swift @Test("current behavior: throws on empty title") func throwsOnEmptyTitle() { let validator = ItemValidator() #expect(throws: ValidationError.emptyField("title")) { try validator.validate(Item(title: "", date: .now)) } } ``` #### Template: Edge Cases ```swift @Test("current behavior: handles nil optional gracefully") func handlesNilOptional() { let parser = DataParser() let result = parser.parse(data: nil) // Document: returns empty array on nil, doesn't crash #expect(result.isEmpty) } @Test("current behavior: handles empty collection") func handlesEmptyCollection() { let aggregator = StatsAggregator() let stats = aggregator.compute(values: []) // Document: returns zeroes, not NaN or crash #expect(stats.average == 0.0) #expect(stats.count == 0) } ``` ### Phase 4: Fill in Actual Values This is the key step. For each test: 1. **Read the source code** to understand what the method actually returns 2. **Trace the logic** for the given inputs 3. **Write the assertion with the actual value**, even if it seems wrong ```swift // ❌ Wrong — this is what you WANT it to do #expect(result == expectedCorrectValue) // ✅ Right — this is what it ACTUALLY does #expect(result == actualCurrentValue) // Note: off-by-one, but current behavior ``` **Add comments for surprising behavior:** ```swift // CHARACTERIZATION: This returns 11 not 10 due to inclusive range. // Don't "fix" this until intentionally changing behavior. #expect(range.count == 11) ``` ### Phase 5: Verify and Lock 1. **Run all characterization tests** — they must ALL pass 2. **If any fail**, adjust assertions to match actual behavior 3. **Tag them** so they're easy to find later: ```swift @Suite("Characterization: ItemManager") @Tag(.characterization) struct ItemManagerCharacterizationTests { ... } // Define the tag extension Tag { @Tag static var characterization: Self } ``` 4. **Run selectively:** ```bash # Run only characterization tests xcodebuild test -scheme YourApp \ -only-testing "YourAppTests/ItemManagerCharacterizationTests" ``` ## Output Format ```markdown ## Characterization Tests Generated **Module**: [Module name] **Classes tested**: [List] **Tests generated**: [Count] ### Coverage Summary | Class | Methods Covered | Edge Cases | Notes | |-------|----------------|------------|-------| | ItemManager | 5/7 | 3 | 2 private methods skipped | | PriceCalculator | 3/3 | 2 | All public API covered | ### Files Created - `Tests/CharacterizationTests/ItemManagerCharacterizationTests.swift` - `Tests/CharacterizationTests/PriceCalculatorCharacterizationTests.swift` ### Surprising Behaviors Found - `PriceCalculator.total()` rounds DOWN, not to nearest cent - `ItemManager.sort()` is unstable — equal dates may reorder ### Ready to Refactor All [X] characterization tests passing. Safe to refactor with AI. Run `xcodebuild test` after each change to verify no behavior changed. ``` ## When NOT to Use This - Code is already well-tested (check coverage first) - You're deleting the code entirely (no need to characterize) - The code is trivially simple (single-line computed properties) - You **want** to change the behavior (use `tdd-bug-fix` instead) ## References - Michael Feathers, *Working Effectively with Legacy Code* — coined "characterization test" - `generators/test-generator/` — for standard test generation (not characterization) - `testing/tdd-refactor-guard/` — pre-refactor checklist that uses these tests