--- name: typed-holes-refactor description: Refactor codebases using Design by Typed Holes methodology - iterative, test-driven refactoring with formal hole resolution, constraint propagation, and continuous validation. Use when refactoring existing code, optimizing architecture, or consolidating technical debt through systematic hole-driven development. --- # Typed Holes Refactoring Systematically refactor codebases using the Design by Typed Holes meta-framework: treat architectural unknowns as typed holes, resolve them iteratively with test-driven validation, and propagate constraints through dependency graphs. ## Core Workflow ### Phase 0: Hole Discovery & Setup **1. Create safe working branch:** ```bash git checkout -b refactor/typed-holes-v1 # CRITICAL: Never work in main, never touch .beads/ in main ``` **2. Analyze current state and identify holes:** ```bash python scripts/discover_holes.py # Creates REFACTOR_IR.md with hole catalog ``` The Refactor IR documents: - **Current State Holes**: What's unknown about the current system? - **Refactor Holes**: What needs resolution to reach the ideal state? - **Constraints**: What must be preserved/improved/maintained? - **Dependencies**: Which holes block which others? **3. Write baseline characterization tests:** Create `tests/characterization/` to capture exact current behavior: ```python # tests/characterization/test_current_behavior.py def test_api_contracts(): """All public APIs must behave identically post-refactor""" for endpoint in discover_public_apis(): old_result = run_current(endpoint, test_inputs) save_baseline(endpoint, old_result) def test_performance_baselines(): """Record current performance - don't regress""" baselines = measure_all_operations() save_json("baselines.json", baselines) ``` Run tests on main branch - they should all pass. These are your safety net. ### Phase 1-N: Iterative Hole Resolution For each hole (in dependency order): **1. Select next ready hole:** ```bash python scripts/next_hole.py # Shows holes whose dependencies are resolved ``` **2. Write validation tests FIRST (test-driven):** ```python # tests/refactor/test_h{N}_resolution.py def test_h{N}_resolved(): """Define what 'resolved correctly' means""" # This should FAIL initially assert desired_state_achieved() def test_h{N}_equivalence(): """Ensure no behavioral regressions""" old_behavior = load_baseline() new_behavior = run_refactored() assert old_behavior == new_behavior ``` **3. Implement resolution:** - Refactor code to make tests pass - Keep characterization tests passing - Commit incrementally with clear messages **4. Validate resolution:** ```bash python scripts/validate_resolution.py H{N} # Checks: tests pass, constraints satisfied, main untouched ``` **5. Propagate constraints:** ```bash python scripts/propagate.py H{N} # Updates dependent holes based on resolution ``` **6. Document and commit:** ```bash git add . git commit -m "Resolve H{N}: {description} - Tests: tests/refactor/test_h{N}_*.py pass - Constraints: {constraints satisfied} - Propagates to: {dependent holes}" ``` ### Phase Final: Reporting **Generate comprehensive delta report:** ```bash python scripts/generate_report.py > REFACTOR_REPORT.md ``` Report includes: - Hole resolution summary with validation evidence - Metrics delta (LOC, complexity, coverage, performance) - Behavioral analysis (intentional changes documented) - Constraint validation (all satisfied) - Risk assessment and migration guide ## Key Principles ### 1. Test-Driven Everything - Write validation criteria BEFORE implementing - Tests define "correct resolution" - Characterization tests are sacred - never let them fail ### 2. Hole-Driven Progress - Resolve holes in dependency order - Each resolution propagates constraints - Track everything formally in Refactor IR ### 3. Continuous Validation Every commit must validate: - ✅ Characterization tests pass (behavior preserved) - ✅ Resolution tests pass (hole resolved correctly) - ✅ Constraints satisfied - ✅ Main branch untouched - ✅ `.beads/` intact in main ### 4. Safe by Construction - Work only in refactor branch - Main is read-only reference - Beads are untouchable historical artifacts ### 5. Formal Completeness Design complete when: - All holes resolved and validated - All constraints satisfied - All phase gates passed - Metrics improved or maintained ## Hole Quality Framework ### SMART Criteria for Good Holes Every hole must be: - **Specific**: Clear, bounded question with concrete answer - ✓ Good: "How should error handling work in the API layer?" - ✗ Bad: "How to improve the code?" - **Measurable**: Has testable validation criteria - ✓ Good: "Reduce duplication from 60% to <15%" - ✗ Bad: "Make code better" - **Achievable**: Can be resolved with available information - ✓ Good: "Extract parsing logic to separate module" - ✗ Bad: "Predict all future requirements" - **Relevant**: Blocks meaningful progress on refactoring - ✓ Good: "Define core interface (blocks 5 other holes)" - ✗ Bad: "Decide variable naming convention" - **Typed**: Clear type/structure for resolution - ✓ Good: `interface Architecture = { layers: Layer[], rules: Rule[] }` - ✗ Bad: "Some kind of structure?" ### Hole Estimation Framework Size holes using these categories: | Size | Duration | Characteristics | Examples | |------|----------|-----------------|----------| | **Nano** | 1-2 hours | Simple, mechanical changes | Rename files, update imports | | **Small** | 4-8 hours | Single module refactor | Extract class, consolidate functions | | **Medium** | 1-3 days | Cross-module changes | Define interfaces, reorganize packages | | **Large** | 4-7 days | Architecture changes | Layer extraction, pattern implementation | | **Epic** | >7 days | **SPLIT THIS HOLE** | Too large, break into smaller holes | **Estimation Red Flags**: - More than 3 dependencies → Likely Medium+ - Unclear validation → Add time for discovery - New patterns/tools → Add learning overhead ### Hole Splitting Guidelines **Split a hole when**: 1. Estimate exceeds 7 days 2. More than 5 dependencies 3. Validation criteria unclear 4. Multiple distinct concerns mixed **Splitting strategy**: ``` Epic hole: "Refactor entire authentication system" → Split into: R10_auth_interface: Define new auth interface (Medium) R11_token_handling: Implement JWT tokens (Small) R12_session_management: Refactor sessions (Medium) R13_auth_middleware: Update middleware (Small) R14_auth_testing: Comprehensive test suite (Medium) ``` **After splitting**: - Update dependencies in REFACTOR_IR.md - Run `python scripts/propagate.py` to update graph - Re-sync with beads: `python scripts/holes_to_beads.py` ## Common Hole Types ### Architecture Holes ```python "?R1_target_architecture": "What should the ideal structure be?" "?R2_module_boundaries": "How should modules be organized?" "?R3_abstraction_layers": "What layers/interfaces are needed?" ``` **Validation:** Architecture tests, dependency analysis, layer violation checks ### Implementation Holes ```python "?R4_consolidation_targets": "What code should merge?" "?R5_extraction_targets": "What code should split out?" "?R6_elimination_targets": "What code should be removed?" ``` **Validation:** Duplication detection, equivalence tests, dead code analysis ### Quality Holes ```python "?R7_test_strategy": "How to validate equivalence?" "?R8_migration_path": "How to safely transition?" "?R9_rollback_mechanism": "How to undo if needed?" ``` **Validation:** Test coverage metrics, migration dry-runs, rollback tests See [HOLE_TYPES.md](references/HOLE_TYPES.md) for complete catalog. ## Constraint Propagation Rules ### Rule 1: Interface Resolution → Type Constraints ``` When: Interface hole resolved with concrete types Then: Propagate type requirements to all consumers Example: Resolve R6: NodeInterface = BaseNode with async run() Propagates to: → R4: Parallel execution must handle async → R5: Error recovery must handle async exceptions ``` ### Rule 2: Implementation → Performance Constraints ``` When: Implementation resolved with resource usage Then: Propagate limits to dependent holes Example: Resolve R4: Parallelization with max_concurrent=3 Propagates to: → R8: Rate limit = provider_limit / 3 → R7: Memory budget = 3 * single_operation_memory ``` ### Rule 3: Validation → Test Requirements ``` When: Validation resolved with test requirements Then: Propagate data needs upstream Example: Resolve R9: Testing needs 50 examples Propagates to: → R7: Metrics must support batch evaluation → R8: Test data collection strategy needed ``` See [CONSTRAINT_RULES.md](references/CONSTRAINT_RULES.md) for complete propagation rules. ## Success Indicators ### Weekly Progress - 2-4 holes resolved - All tests passing - Constraints satisfied - Measurable improvements ### Red Flags (Stop & Reassess) - ❌ Characterization tests fail - ❌ Hole can't be resolved within constraints - ❌ Constraints contradict each other - ❌ No progress for 3+ days - ❌ Main branch accidentally modified ## Validation Gates | Gate | Criteria | Check | |------|----------|-------| | Gate 1: Discovery Complete | All holes cataloged, dependencies mapped | `python scripts/check_discovery.py` | | Gate 2: Foundation Holes | Core interfaces resolved, tests pass | `python scripts/check_foundation.py` | | Gate 3: Implementation | All refactor holes resolved, metrics improved | `python scripts/check_implementation.py` | | Gate 4: Production Ready | Migration tested, rollback verified | `python scripts/check_production.py` | ## Claude-Assisted Workflow This skill is designed for effective Claude/LLM collaboration. Here's how to divide work: ### Phase 0: Discovery **Claude's Role**: - Run `discover_holes.py` to analyze codebase - Suggest holes based on code analysis - Generate initial REFACTOR_IR.md structure - Write characterization tests to capture current behavior - Set up test infrastructure **Your Role**: - Confirm holes are well-scoped - Prioritize which holes to tackle first - Review and approve REFACTOR_IR.md - Define critical constraints ### Phase 1-N: Hole Resolution **Claude's Role**: - Write resolution tests (TDD) BEFORE implementation - Implement hole resolution to make tests pass - Run validation scripts: `validate_resolution.py`, `check_foundation.py` - Update REFACTOR_IR.md with resolution details - Propagate constraints: `python scripts/propagate.py H{N}` - Generate commit messages documenting changes **Your Role**: - Make architecture decisions (which pattern, which approach) - Assess risk and determine constraint priorities - Review code changes for correctness - Approve merge to main when complete ### Phase Final: Reporting **Claude's Role**: - Generate comprehensive REFACTOR_REPORT.md - Document all metrics deltas - List all validation evidence - Create migration guides - Prepare PR description **Your Role**: - Final review of report accuracy - Approve for production deployment - Conduct post-refactor retrospective ### Effective Prompting Patterns **Starting a session**: ``` "I need to refactor [description]. Use typed-holes-refactor skill. Start with discovery phase." ``` **Resolving a hole**: ``` "Resolve H3 (target_architecture). Write tests first, then implement. Use [specific pattern/approach]." ``` **Checking progress**: ``` "Run check_completeness.py and show me the dashboard. What's ready to work on next?" ``` **Generating visualizations**: ``` "Generate dependency graph showing bottlenecks and critical path. Use visualize_graph.py with --analyze." ``` ### Claude's Limitations **Claude CANNOT**: - Make subjective architecture decisions (you must decide) - Determine business-critical constraints (you must specify) - Run tests that require external services (mock or you run them) - Merge to main (you must approve and merge) **Claude CAN**: - Analyze code and suggest holes - Write comprehensive test suites - Implement resolutions within your constraints - Generate reports and documentation - Track progress across sessions (via beads + REFACTOR_IR.md) ### Multi-Session Continuity **At session start**: ``` "Continue typed-holes refactoring. Import beads state and show current status from REFACTOR_IR.md." ``` **Claude will**: - Read REFACTOR_IR.md to understand current state - Check which holes are resolved - Identify next ready holes - Resume where previous session left off **You should**: - Keep REFACTOR_IR.md and .beads/ committed to git - Export beads state at session end: `bd export -o .beads/issues.jsonl` - Use /context before starting to ensure Claude has full context ## Beads Integration **Why beads + typed holes?** - Beads tracks issues across sessions (prevents lost work) - Holes track refactoring-specific state (dependencies, constraints) - Together: Complete continuity for long-running refactors ### Setup ```bash # Install beads (once) go install github.com/steveyegge/beads/cmd/bd@latest # After running discover_holes.py python scripts/holes_to_beads.py # Check what's ready bd ready --json ``` ### Workflow Integration **During hole resolution**: ```bash # Start work on a hole bd update bd-5 --status in_progress --json # Implement resolution # ... write tests, implement code ... # Validate resolution python scripts/validate_resolution.py H3 # Close bead bd close bd-5 --reason "Resolved H3: target_architecture" --json # Export state bd export -o .beads/issues.jsonl git add .beads/issues.jsonl REFACTOR_IR.md git commit -m "Resolve H3: Define target architecture" ``` **Syncing holes ↔ beads**: ```bash # After updating REFACTOR_IR.md manually python scripts/holes_to_beads.py # Sync changes to beads # After resolving holes python scripts/holes_to_beads.py # Update bead statuses ``` **Cross-session continuity**: ```bash # Session start bd import -i .beads/issues.jsonl bd ready --json # Shows ready holes python scripts/check_completeness.py # Shows overall progress # Session end bd export -o .beads/issues.jsonl git add .beads/issues.jsonl git commit -m "Session checkpoint: 3 holes resolved" ``` **Bead advantages**: - Tracks work across days/weeks - Shows dependency graph: `bd deps bd-5` - Prevents context loss - Integrates with overall project management ## Scripts Reference All scripts are in `scripts/`: - `discover_holes.py` - Analyze codebase and generate REFACTOR_IR.md - `next_hole.py` - Show next resolvable holes based on dependencies - `validate_resolution.py` - Check if hole resolution satisfies constraints - `propagate.py` - Update dependent holes after resolution - `generate_report.py` - Create comprehensive delta report - `check_discovery.py` - Validate Phase 0 completeness (Gate 1) - `check_foundation.py` - Validate Phase 1 completeness (Gate 2) - `check_implementation.py` - Validate Phase 2 completeness (Gate 3) - `check_production.py` - Validate Phase 3 readiness (Gate 4) - `check_completeness.py` - Overall progress dashboard - `visualize_graph.py` - Generate hole dependency visualization - `holes_to_beads.py` - Sync holes with beads issues Run any script with `--help` for detailed usage. ## Meta-Consistency This skill uses its own principles: | Typed Holes Principle | Application to Refactoring | |-----------------------|----------------------------| | Typed Holes | Architectural unknowns cataloged with types | | Constraint Propagation | Design constraints flow through dependency graph | | Iterative Refinement | Hole-by-hole resolution cycles | | Test-Driven Validation | Tests define correctness | | Formal Completeness | Gates verify design completeness | **We use the system to refactor the system.** ## Advanced Topics For complex scenarios, see: - [HOLE_TYPES.md](references/HOLE_TYPES.md) - Detailed hole taxonomy - [CONSTRAINT_RULES.md](references/CONSTRAINT_RULES.md) - Complete propagation rules - [VALIDATION_PATTERNS.md](references/VALIDATION_PATTERNS.md) - Test patterns for different hole types - [EXAMPLES.md](references/EXAMPLES.md) - Complete worked examples ## Quick Start Example ```bash # 1. Setup git checkout -b refactor/typed-holes-v1 python scripts/discover_holes.py # 2. Write baseline tests # Create tests/characterization/test_*.py # 3. Resolve first hole python scripts/next_hole.py # Shows H1 is ready # Write tests/refactor/test_h1_*.py (fails initially) # Refactor code until tests pass python scripts/validate_resolution.py H1 python scripts/propagate.py H1 git commit -m "Resolve H1: ..." # 4. Repeat for each hole # ... # 5. Generate report python scripts/generate_report.py > REFACTOR_REPORT.md ``` ## Troubleshooting ### Characterization tests fail **Symptom**: Tests that captured baseline behavior now fail **Resolution**: 1. Revert changes: `git diff` to see what changed 2. Investigate: What behavior changed and why? 3. Decision tree: - **Intentional change**: Update baselines with documentation ```python # Update baseline with reason save_baseline("v2_api", new_behavior, reason="Switched to async implementation") ``` - **Unintentional regression**: Fix the code, tests must pass **Prevention**: Run characterization tests before AND after each hole resolution. ### Hole can't be resolved **Symptom**: Stuck on a hole for >3 days, unclear how to proceed **Resolution**: 1. **Check dependencies**: Are they actually resolved? ```bash python scripts/visualize_graph.py --analyze # Look for unresolved dependencies ``` 2. **Review constraints**: Are they contradictory? - Example: C1 "preserve all behavior" + C5 "change API contract" → Contradictory - **Fix**: Renegotiate constraints with stakeholders 3. **Split the hole**: If hole is too large ```bash # Original: R4_consolidate_all (Epic, 10+ days) # Split into: R4a_consolidate_parsers (Medium, 2 days) R4b_consolidate_validators (Small, 1 day) R4c_consolidate_handlers (Medium, 2 days) ``` 4. **Check for circular dependencies**: ```bash python scripts/visualize_graph.py # Look for cycles: R4 → R5 → R6 → R4 ``` - **Fix**: Break cycle by introducing intermediate hole or redefining dependencies **Escalation**: If still stuck after 5 days, consider alternative refactoring approach. ### Contradictory Constraints **Symptom**: Cannot satisfy all constraints simultaneously **Example**: - C1: "Preserve exact current behavior" (backward compatibility) - C5: "Reduce response time by 50%" (performance improvement) - Current behavior includes slow, synchronous operations **Resolution Framework**: 1. **Identify the conflict**: ```markdown C1 requires: Keep synchronous operations C5 requires: Switch to async operations → Contradiction: Can't be both sync and async ``` 2. **Negotiate priorities**: | Option | C1 | C5 | Tradeoff | |--------|----|----|----------| | A: Keep sync | ✓ | ✗ | No performance gain | | B: Switch to async | ✗ | ✓ | Breaking change | | C: Add async, deprecate sync | ⚠️ | ✓ | Migration burden | 3. **Choose resolution strategy**: - **Relax constraint**: Change C1 to "Preserve behavior where possible" - **Add migration period**: C implemented over 2 releases - **Split into phases**: Phase 1 (C1), Phase 2 (C5) 4. **Document decision**: ```markdown ## Constraint Resolution: C1 vs C5 **Decision**: Relax C1 to allow async migration **Rationale**: Performance critical for user experience **Migration**: 3-month deprecation period for sync API **Approved by**: [Stakeholder], [Date] ``` ### Circular Dependencies **Symptom**: `visualize_graph.py` shows cycles **Example**: ``` R4 (consolidate parsers) → depends on R6 (define interface) R6 (define interface) → depends on R4 (needs parser examples) ``` **Resolution strategies**: 1. **Introduce intermediate hole**: ``` H0_parser_analysis: Analyze existing parsers (no dependencies) R6_interface: Define interface using H0 analysis R4_consolidate: Implement using R6 interface ``` 2. **Redefine dependencies**: - Maybe R4 doesn't actually need R6 - Or R6 only needs partial R4 (split R4) 3. **Accept iterative refinement**: ``` R6_interface_v1: Initial interface (simple) R4_consolidate: Implement with v1 interface R6_interface_v2: Refine based on R4 learnings ``` **Prevention**: Define architecture holes before implementation holes. ### No Progress for 3+ Days **Symptom**: Feeling stuck, no commits, uncertain how to proceed **Resolution checklist**: - [ ] **Review REFACTOR_IR.md**: Are holes well-defined (SMART criteria)? - If not: Rewrite holes to be more specific - [ ] **Check hole size**: Is current hole >7 days estimate? - If yes: Split into smaller holes - [ ] **Run dashboard**: `python scripts/check_completeness.py` - Are you working on a blocked hole? - Switch to a ready hole instead - [ ] **Visualize dependencies**: `python scripts/visualize_graph.py --analyze` - Identify bottlenecks - Look for parallel work opportunities - [ ] **Review constraints**: Are they achievable? - Renegotiate if necessary - [ ] **Seek external review**: - Share REFACTOR_IR.md with colleague - Get feedback on approach - [ ] **Consider alternative**: Maybe this refactor isn't feasible - Document why - Propose different approach **Reset protocol**: If still stuck, revert to last working state and try different approach. ### Estimation Failures **Symptom**: Hole taking 3x longer than estimated **Analysis**: 1. **Why did estimate fail?** - Underestimated complexity - Unforeseen dependencies - Unclear requirements - Technical issues (tool problems, infrastructure) 2. **Immediate actions**: - Update REFACTOR_IR.md with revised estimate - If >7 days, split the hole - Update beads: `bd update bd-5 --note "Revised estimate: 5 days (was 2)"` 3. **Future improvements**: - Use actual times to calibrate future estimates - Add buffer for discovery (20% overhead) - Note uncertainty in IR: "Estimate: 2-4 days (high uncertainty)" **Learning**: Track actual vs estimated time in REFACTOR_REPORT.md for future reference. --- **Begin with Phase 0: Discovery. Always work in a branch. Test first, refactor second.**