--- name: legacy-code-safety description: Use when modifying, removing, or refactoring code that lacks test coverage. Emphasizes the danger of untested changes and the RGR workflow to add characterization tests before modifications. allowed-tools: - Read - Write - Edit - Bash - Grep - Glob --- # Legacy Code Safety > "Changing untested code is like performing surgery blindfolded." ## DANGER: The Risk of Modifying Untested Code **Modifying code without tests is one of the HIGHEST-RISK activities in software development.** When you change untested code: - You CANNOT know what behaviors you're breaking - You have NO safety net to catch mistakes - You're relying on hope instead of evidence - You're creating technical debt on top of technical debt **In dynamically-typed languages (Python, JavaScript, Ruby, PHP), this risk is EXPONENTIALLY higher:** - No compiler to catch type errors - No static analysis to verify correctness - Tests are your ONLY safety mechanism - Runtime is when you discover your mistakes ## The Golden Rule **NEVER modify untested code without adding tests first.** If code lacks tests: 1. STOP 2. Add characterization tests 3. Verify tests pass 4. THEN make your changes ## The RGR Workflow for Legacy Code This is NOT standard TDD. This is **characterization testing** - capturing current behavior before changing it. ### Standard TDD (New Features) 1. RED: Write failing test for NEW behavior 2. GREEN: Implement to make test pass 3. REFACTOR: Clean up implementation ### Legacy Code RGR (Existing Code) 1. RED: Write test that SHOULD pass (captures current behavior) 2. GREEN: Verify it passes (confirms you captured reality correctly) 3. REFACTOR: NOW you can safely modify with a safety net ### The Critical Difference **TDD:** You know what the behavior SHOULD be, write test for ideal **Legacy RGR:** You DON'T know all behaviors, capture what EXISTS ### Why "RED" for Passing Tests? The "RED" step means "you're in the red zone" - working with dangerous untested code. Even though you expect the test to pass, you're at risk until you have that passing test as proof. ## Characterization Testing Process ### Step 1: Identify All Behaviors Before writing tests, understand what the code currently does: ```typescript // Legacy function - no tests function calculateShipping(order) { let cost = 9.99 if (order.total > 100) { cost = 0 } if (order.weight > 50) { cost += 25 } if (order.destination === 'HI' || order.destination === 'AK') { cost += 15 } return cost } ``` **Questions to answer:** - What inputs does it accept? - What outputs does it produce? - What are the edge cases? - Are there side effects? - What happens with invalid input? ### Step 2: Write Tests for Current Behavior **Document what EXISTS, not what should exist:** ```typescript describe('calculateShipping (characterization)', () => { // Capture base case test('standard shipping', () => { const order = { total: 50, weight: 10, destination: 'CA' } expect(calculateShipping(order)).toBe(9.99) }) // Capture free shipping threshold test('free shipping over $100', () => { const order = { total: 150, weight: 10, destination: 'CA' } expect(calculateShipping(order)).toBe(0) }) // Capture weight surcharge test('heavy package surcharge', () => { const order = { total: 50, weight: 75, destination: 'CA' } expect(calculateShipping(order)).toBe(34.99) // 9.99 + 25 }) // Capture regional surcharge test('Hawaii surcharge', () => { const order = { total: 50, weight: 10, destination: 'HI' } expect(calculateShipping(order)).toBe(24.99) // 9.99 + 15 }) // Discovered edge case: free shipping + weight test('free shipping with heavy package', () => { const order = { total: 150, weight: 75, destination: 'CA' } expect(calculateShipping(order)).toBe(25) // 0 + 25 (is this right?) }) // Discovered edge case: all conditions test('all surcharges with free shipping threshold', () => { const order = { total: 150, weight: 75, destination: 'HI' } expect(calculateShipping(order)).toBe(40) // 0 + 25 + 15 }) }) ``` ### Step 3: Verify Tests Pass **These tests MUST pass immediately:** ```bash npm test ``` If they don't pass: - You misunderstood the current behavior - Fix the test, not the code - The test should document reality ### Step 4: NOW You Can Refactor **With tests in place, you have a safety net:** ```typescript // NOW we can safely refactor function calculateShipping(order: Order): number { const BASE_COST = 9.99 const FREE_SHIPPING_THRESHOLD = 100 const HEAVY_PACKAGE_WEIGHT = 50 const HEAVY_PACKAGE_SURCHARGE = 25 const REMOTE_DESTINATIONS = ['HI', 'AK'] const REMOTE_SURCHARGE = 15 let cost = order.total > FREE_SHIPPING_THRESHOLD ? 0 : BASE_COST if (order.weight > HEAVY_PACKAGE_WEIGHT) { cost += HEAVY_PACKAGE_SURCHARGE } if (REMOTE_DESTINATIONS.includes(order.destination)) { cost += REMOTE_SURCHARGE } return cost } ``` **Run tests again:** ```bash npm test # All tests still pass ``` You've improved the code WITHOUT breaking existing behavior. ## When This Skill Applies Use this skill when: - [ ] Deleting code that has no tests - [ ] Modifying functions without test coverage - [ ] Refactoring untested modules - [ ] Changing behavior in legacy systems - [ ] Test coverage is below 80% for code you're touching - [ ] You inherit code from previous developers - [ ] You're working in a codebase with poor test discipline **Even if the change seems "simple", if there are no tests, use this workflow.** ## Language-Specific Risk Levels ### HIGH RISK: Dynamic/Weakly-Typed Languages **Python, JavaScript, Ruby, PHP:** - NO compile-time type checking - NO static analysis of type correctness - Tests are your ONLY safety net - Runtime is when errors appear - User discovers bugs, not compiler **Extra caution required:** - Write MORE tests - Test edge cases thoroughly - Check for None/null/undefined - Verify types explicitly - Add type hints/annotations when possible ### MEDIUM RISK: Static/Strongly-Typed Languages **Go, Rust, TypeScript (strict mode), Java:** - Type system catches some errors - Compiler validates correctness - Still need tests for BEHAVIOR - Types don't test business logic **Type safety helps, but:** - Types verify structure, not logic - Tests still required for correctness - Don't rely solely on type system ## The Strangler Fig Pattern For large legacy systems, don't try to test everything at once: ### What Is It? Named after the strangler fig vine that grows around a tree: 1. Vine grows alongside tree 2. Eventually replaces tree entirely 3. Original tree can be removed Applied to code: 1. Wrap legacy code with new interface 2. Add tests to the wrapper 3. Gradually move logic to new code 4. Eventually remove legacy code ### Implementation **Phase 1: Wrap** ```typescript // Legacy code (untested, scary) function legacyCalculatePrice(item) { // ... 500 lines of spaghetti ... } // New wrapper (tested) function calculatePrice(item: Item): number { // Add validation and tests here validateItem(item) // Call legacy for now return legacyCalculatePrice(item) } ``` **Phase 2: Test the Boundary** ```typescript describe('calculatePrice wrapper', () => { test('validates input', () => { expect(() => calculatePrice(null)).toThrow() }) test('handles normal items', () => { const item = { id: 1, price: 10 } expect(calculatePrice(item)).toBe(10) }) // Characterization tests for legacy behavior test('matches legacy for discounts', () => { const item = { id: 1, price: 10, discount: 0.2 } expect(calculatePrice(item)).toBe(8) }) }) ``` **Phase 3: Replace Internals** ```typescript function calculatePrice(item: Item): number { validateItem(item) // New implementation (tested, clean) const basePrice = item.price const discount = item.discount ?? 0 return basePrice * (1 - discount) // Legacy code removed! } ``` **Phase 4: Expand** Now that one function is safe, repeat for next function. ### Benefits - Incremental progress - Always have working system - Tests added gradually - Risk minimized - Can stop anytime ### When to Use - Large untested codebase - Can't stop for months to add tests - Need to deliver features while improving - High-risk production system ## Pre-Modification Checklist Before touching ANY untested code: - [ ] Identified all entry points (who calls this code?) - [ ] Identified all exit points (what does this code call?) - [ ] Understood current behavior (what does it do now?) - [ ] Written characterization tests for current behavior - [ ] All characterization tests pass - [ ] Documented any surprising behaviors found - [ ] Considered using Strangler Fig for large changes **ONLY AFTER ALL CHECKS PASS:** - [ ] Make your intended changes - [ ] Run tests (verify behavior preserved or intentionally changed) - [ ] Add tests for NEW behavior if applicable - [ ] Document breaking changes if any ## Red Flags Stop immediately if: - "I'll just make a quick change, no need for tests" - "It's a small change, what could go wrong?" - "I don't have time to write tests" - "The code is too complex to test" - "I'll add tests later" (you won't) **These are rationalizations for dangerous behavior.** ## Common Objections (and Responses) ### "But tests will take longer than the change!" **Response:** The initial change is fast. Finding and fixing the bugs you introduced takes much longer. Tests save time. ### "The code is too complex to test!" **Response:** If it's too complex to test, it's too complex to change safely. Simplify first, or use Strangler Fig pattern. ### "We need to ship this feature NOW!" **Response:** Shipping a broken feature is slower than shipping a tested feature late. Technical debt compounds. ### "This code has worked fine without tests for years!" **Response:** Until now. The moment you change it, that safety is gone. Past stability doesn't predict future stability. ### "I understand the code, I won't break it!" **Response:** Confidence is not a substitute for evidence. Even experts make mistakes. ## Success Metrics How to know you've done this right: **Green flags:** - All characterization tests pass before changes - All tests still pass after changes - New behavior has new tests - Test coverage increased - Code is more readable - Confidence is high **Red flags:** - Tests fail after changes (and you don't know why) - Skipped edge cases "to save time" - Changed behavior without updating tests - Test coverage decreased or stagnant - More confused than before ## Integration with Other Skills Use these related skills: - **test-driven-development**: For new behavior, use TDD after characterization tests exist - **refactoring**: With tests in place, refactor safely - **boy-scout-rule**: While adding tests, improve code quality - **proof-of-work**: Document test results as evidence - **debugging**: When characterization tests reveal unexpected behavior ## Tools and Techniques ### Code Coverage Tools Use coverage tools to identify untested code: ```bash # JavaScript/TypeScript npm test -- --coverage # Python pytest --cov=mypackage # Ruby bundle exec rspec --require spec_helper # Go go test -cover ./... ``` **Target coverage:** - Critical paths: 100% - Business logic: 90%+ - Overall codebase: 80%+ ### Mutation Testing Verify tests actually catch bugs: ```bash # JavaScript npm install -D stryker # Python pip install mutmut ``` Mutation testing changes your code slightly and verifies tests fail. If tests still pass after mutation, they're not effective. ### Approval Testing For complex outputs (HTML, JSON, reports): ```typescript test('generates invoice', () => { const invoice = generateInvoice(order) expect(invoice).toMatchSnapshot() }) ``` First run captures current output. Future runs verify it hasn't changed. ## Real-World Example ### Scenario Legacy payment processor, 2000 lines, no tests, needs to add new payment method. ### Step 1: Identify Scope Don't test everything - test what you'll touch: ```typescript // Will modify this function processPayment(payment, method) { // ... many lines ... } // Will NOT modify these function logTransaction(transaction) { } function sendEmail(user, content) { } ``` ### Step 2: Add Characterization Tests ```typescript describe('processPayment (characterization)', () => { test('processes credit card payment', () => { const payment = { amount: 100, currency: 'USD' } const method = { type: 'credit_card', number: '4111...' } const result = processPayment(payment, method) expect(result.success).toBe(true) expect(result.transactionId).toBeDefined() expect(result.amount).toBe(100) }) // More tests for existing payment methods... }) ``` ### Step 3: Tests Pass ```bash npm test # ✅ All tests passing ``` ### Step 4: Add New Feature ```typescript function processPayment(payment, method) { // Existing code preserved if (method.type === 'credit_card') { return processCreditCard(payment, method) } // NEW: Add cryptocurrency support if (method.type === 'crypto') { return processCrypto(payment, method) } throw new Error(`Unknown payment method: ${method.type}`) } ``` ### Step 5: Add Tests for New Behavior ```typescript test('processes cryptocurrency payment', () => { const payment = { amount: 100, currency: 'USD' } const method = { type: 'crypto', address: '0x123...' } const result = processPayment(payment, method) expect(result.success).toBe(true) expect(result.transactionId).toBeDefined() }) ``` ### Step 6: All Tests Pass ```bash npm test # ✅ All tests passing (old + new) ``` ### Result - New feature added safely - Existing behavior preserved - Tests ensure no regression - Future changes safer ## Remember 1. **Untested code is dangerous code** - especially in dynamic languages 2. **Characterize before changing** - Capture current behavior first 3. **RGR workflow for legacy** - Different from TDD, specific to untested code 4. **Strangler Fig for large systems** - Gradual improvement beats big bang 5. **Tests are proof** - Confidence without evidence is just hope 6. **No shortcuts** - "Quick changes" create slow bugs **Legacy code is not an excuse for reckless changes. It's a reason for EXTRA caution.**