--- name: Fixing Bugs Systematically description: Diagnose and fix bugs through systematic investigation, root cause analysis, and targeted validation. Use when something is broken, errors occur, performance degrades, or unexpected behavior manifests. --- # Fixing Bugs Systematically Structured protocol for isolating root causes and implementing focused fixes in existing features. ## When to Use - Something is broken and needs diagnosis and repair - Error messages or unexpected behavior occurs - Performance degradation in existing functionality - Intermittent or hard-to-reproduce issues ## Core Steps ### 1. Context & Reproduction Read relevant documentation: - `docs/feature-spec/F-##-*.md` for affected feature - `docs/user-stories/US-###-*.md` for expected behavior and acceptance criteria - `docs/api-contracts.yaml` if API-related - `docs/system-design.md` for architecture context Document the bug: - **Expected behavior** (cite story AC or spec) - **Actual behavior** (what's broken) - **Reproduction steps** - **Feature ID** (F-##) and **Story ID** (US-###) if known ### 2. Investigation #### Simple bugs (obvious entry point) Use direct investigation: - Grep to locate error messages or related code - Read suspected files to examine implementation - Trace function calls and data transformations - Check related files for connected logic #### Complex bugs (multiple subsystems or unclear origin) Delegate to async agents in parallel: **Spawn `senior-engineer` agents to:** - Trace error flow through specific subsystem - Analyze related failure patterns - Investigate runtime conditions **Spawn `Explore` agents to:** - Map data flow across multiple files - Find all error handling for specific operation - Locate configuration and integration points **Example:** For authentication bug, spawn: - Agent 1: "Trace auth flow from login endpoint to session creation" - Agent 2: "Find all error handling and validation in auth module" - Agent 3: "Locate session storage config and related code" Wait for results using `./agent-responses/await {agent_id}` ### 3. Root Cause Analysis **Generate hypotheses:** - List 3-8 potential root causes from investigation - Rank by probability (evidence from code) and impact - Select most likely cause(s) **Decision point:** - **Fix immediately** if root cause is obvious and confirmed - **Add validation** if multiple plausible causes or runtime-dependent behavior ### 4. Validation (if needed) Add minimal debugging: - Logging at decision points - Data inspection at boundaries - Input/output logging at integration points Test to confirm root cause before proceeding to fix. ### 5. Implementation Fix the confirmed root cause: - Keep changes minimal and focused - Maintain API stability unless approved - Follow existing patterns in codebase **Update documentation if needed:** - Add note in feature spec or changelog - Update `docs/api-contracts.yaml` if contract changed (requires approval) - For slash commands: - `/manage-project/update/update-feature` to correct spec - `/manage-project/update/update-story` if ACs were ambiguous - `/manage-project/update/update-api` if API changed (with approval) ### 6. Validation & Testing Verify fix against acceptance criteria: - Test all ACs from affected user stories - Check 1-2 key edge cases and error states - Run contract tests if API changed - Verify events in `docs/data-plan.md` still fire correctly ### 7. Cleanup - Remove all debugging and logging code - Verify no temporary files remain ## Investigation Strategy **For direct investigation:** - Use grep, read_file to understand subsystem - Trace flows manually through related files - Focus on specific area where bug manifests **When to validate before fixing:** - Multiple plausible root causes exist - Runtime-dependent behavior - Intermittent or hard-to-reproduce issues **For async investigation:** - Each agent investigates independent subsystem - Run in parallel for speed - Maximum 6 agents (diminishing returns) ## Artifacts **Inputs:** - `docs/feature-spec/F-##-*.md` — Feature specs - `docs/user-stories/US-###-*.md` — Expected behavior and ACs - `docs/api-contracts.yaml` — API specs - `docs/system-design.md` — Architecture context **Outputs:** - Investigation findings (inline notes or agent reports) - Updated feature spec with bug resolution notes - Fixed code with accompanying tests ## Quick Reference | Scenario | Approach | |----------|----------| | Single subsystem, obvious entry | Direct investigation → immediate fix | | Multiple subsystems, unclear origin | Spawn 2-4 agents in parallel → synthesize findings → fix | | Runtime-dependent or intermittent | Add targeted logging → reproduce → analyze logs → fix | | Multiple independent fixes needed | Pass investigation results to fix agents via artifact files |