---
name: bugfixforever
description: State-of-the-art procedure for fixing bugs in software projects. Use this skill when a bug has been detected or declared by the user, agent, or another skill (not during early-stage work in progress). Enforces a disciplined test-driven approach - understand, reproduce experimentally, write failing tests, fix the code, and clean up.
---

# BugFixForever (BFF)

## Overview

This skill guides a reliable, test-driven procedure for fixing bugs in software projects. The approach emphasizes experimental reproduction, comprehensive test coverage at appropriate abstraction levels, and clean commits.

## When to Use This Skill

Activate this skill when:
- A user reports a bug or issue in the system
- An agent detects unexpected behavior or failures
- Another skill identifies a problem that needs fixing
- Do NOT interrupt early-stage work in progress for minor issues

## The BFF Workflow

Follow these phases in order. Do not skip ahead to fixing code before establishing reproducibility and test coverage.

### Phase 1: Understand the Bug

Gather context about the bug's behavior:
- What is the expected behavior vs. actual behavior?
- When does it occur? What triggers it?
- What is the impact? (User-facing, data corruption, performance, etc.)
- What has changed recently that might be related?
- Is this reproducible consistently or intermittent?

Review relevant code, recent commits, and existing tests to build mental model of where the bug likely exists.

### Phase 2: Experimentally Reproduce the Bug

Prove the bug exists through experimentation. All experimental code must live on the filesystem - do NOT write one-off scripts directly into stdin of a scripting language.

**Acceptable reproduction methods:**
- Run the software normally (web server, CLI, etc.)
- Run existing tests to confirm failures
- Write temporary test files to isolate the issue
- Use a web browser and developer tools to reproduce user-facing issues
- Tail logs and add temporary logging statements
- Run the same test multiple times for intermittent issues
- Manipulate mocks/stubs to isolate components

**Code inspection & debugging techniques:**
- Add temporary logging/print statements to trace execution flow
- Use debuggers with breakpoints to step through code
- Add strategic assertions to catch bad state earlier
- Temporary console.log/print statements in production code (to be removed in cleanup phase)

**System-level tools (when appropriate):**
- System call tracing (strace, dtrace) for low-level issues
- Performance profiling tools for performance bugs
- Container/process inspection (docker logs, ps, top)
- Network analysis tools

**Important:** Create reusable files for all experimental code. Temporary test files, debugging scripts, and reproduction cases should be saved to the filesystem, not executed as one-liners.

Continue experimentation until the bug's behavior is proven and understood mechanistically.

### Phase 3: Write Failing Tests

Once reproduction is confirmed, write one or more declarative tests into the project's test suites that fail because the bug exists.

**The Drill-Down Approach:**

For user-facing bugs or integration issues:
1. **Start at the top**: Write an E2E or integration test that captures the user-visible symptom
2. **Drill down**: Write a unit test that captures the root cause at the component/function level

For internal bugs discovered during development:
- A focused unit test may be sufficient

**Test quality criteria:**
- Tests should be declarative and clearly document what behavior is expected
- Tests should fail reliably when the bug exists
- Tests should be placed in appropriate test suites (not temporary files)
- Choose abstraction level that matches the bug's nature

Run the new tests to confirm they fail with clear, expected error messages.

### Phase 4: Fix the Bug

Only after failing tests are in place, modify the production code to fix the bug.

**Fix principles:**
- Make the minimal change necessary to fix the root cause
- Preserve existing functionality (don't introduce regressions)
- Follow existing code patterns and conventions
- Consider edge cases and similar bugs that might exist

Run the new tests to confirm they now pass. Run the full test suite to ensure no regressions.

### Phase 5: Cleanup and Commit

Remove all temporary artifacts and prepare a clean commit:

**Cleanup tasks:**
- Delete temporary test files, debugging scripts, and experimental code
- Remove temporary logging/print/console.log statements added during reproduction
- Remove extraneous comments added during debugging
- Update or remove outdated documentation affected by the fix

**Commit preparation:**
Use the `no-more-commitment-issues` skill to ensure commit hygiene:
- Clear, concise commit message explaining what was fixed and why
- Only include files relevant to the fix and its tests
- No debug statements, temporal language, or cruft

## Examples

### Example 1: User-facing bug in web application

**User report:** "When I click 'Save' on the profile page, nothing happens."

**Phase 1:** Understand - profile save button should update user data but appears non-functional

**Phase 2:** Reproduce experimentally:
```bash
# Run dev server
pnpm run dev

# Open browser, navigate to profile page, reproduce issue
# Check browser console - see "TypeError: Cannot read property 'id' of undefined"
# Add temporary console.log in src/components/Profile.tsx to trace data flow
```

**Phase 3:** Write failing tests (drill-down approach):
```typescript
// E2E test capturing user-facing symptom
test('user can save profile changes', async ({ page }) => {
  // ... test fails
});

// Unit test capturing root cause
test('ProfileService.save handles missing user ID', () => {
  // ... test fails
});
```

**Phase 4:** Fix - add null check in ProfileService.save()

**Phase 5:** Remove temporary console.log, commit with no-more-commitment-issues skill

### Example 2: Intermittent test failure

**Agent observation:** "Tests are flaky - UserAuthTest sometimes fails."

**Phase 1:** Understand - race condition suspected in authentication flow

**Phase 2:** Reproduce experimentally:
```bash
# Create temporary script to run test 100 times
# File: debug/test_flakiness.sh
for i in {1..100}; do
  npm test -- UserAuthTest
done

# Add temporary logging to trace async timing issues
```

**Phase 3:** Write failing test that reliably exposes race condition

**Phase 4:** Fix - add proper async/await handling

**Phase 5:** Delete debug/test_flakiness.sh and temporary logging, commit

## Key Principles

1. **Filesystem-first**: All experimental code lives in files, not stdin one-liners
2. **Drill-down testing**: Start with high-level tests for user impact, drill to unit tests for root cause
3. **Prove before fixing**: Never fix code until bug is reproducible and tests are written
4. **Clean commits**: Remove all temporary artifacts before committing