---
name: debug
description: "Systematic debugging with hypothesis testing. Persistent across sessions."
allowed-tools: Read, Write, Edit, Bash, Glob, Grep, Task, AskUserQuestion
argument-hint: "[issue description]"
---

**STOP — DO NOT READ THIS FILE. You are already reading it. This prompt was injected into your context by Claude Code's plugin system. Using the Read tool on this SKILL.md file wastes ~7,600 tokens. Begin executing Step 1 immediately.**

## Step 0 — Immediate Output

**Before ANY tool calls**, display this banner:

```
╔══════════════════════════════════════════════════════════════╗
║  PLAN-BUILD-RUN ► DEBUGGING                                  ║
╚══════════════════════════════════════════════════════════════╝
```

Then proceed to Step 1.

# /pbr:debug — Systematic Debugging

You are running the **debug** skill. Your job is to run a structured, hypothesis-driven debugging session that persists across conversations. You track every hypothesis, test, and finding in a debug file so work is never lost.

This skill **spawns Task(subagent_type: "pbr:debugger")** for investigation work.

---

## References

- `references/questioning.md` — Investigation questioning patterns: progressive depth, exploratory questions, concrete options (not open-ended)
- `references/ui-brand.md` — Status symbols, banners, error boxes for issue reporting

---

## Context Budget

Reference: `skills/shared/context-budget.md` for the universal orchestrator rules.
Reference: `skills/shared/agent-type-resolution.md` for agent type fallback when spawning Task() subagents.

Additionally for this skill:
- **Never** perform investigation work yourself — delegate ALL analysis to the debugger subagent
- **Minimize** reading debug file content — read only the latest hypothesis and result section
- **Delegate** all code reading, hypothesis testing, and fix attempts to the debugger subagent

---

## Core Principle

**Debug systematically, not randomly.** Every investigation step must have a hypothesis, a test, and a recorded result. No "let me just try this" — every action has a reason and is documented.

---

## Flow

### Step 1: Ensure Debug Directory Exists

**CRITICAL (no hook) -- DO NOT SKIP: Create debug directory before any file writes.**

Before any file operations, ensure both directories exist by running:

```bash
mkdir -p .planning/debug
```

This handles the case where neither `.planning/` nor `.planning/debug/` exist yet (debug can be run before other skills that create `.planning/`). Do NOT skip this step — writing files to a non-existent directory will fail.

### Step 2: Check for Active Debug Sessions

**Load depth profile:** Run `pbr-tools config resolve-depth` to get `debug.max_hypothesis_rounds`. If the command fails (no config.json or CLI error), default to 5 rounds. Initialize a round counter at 0. This counter increments each time a continuation debugger is spawned.

Scan `.planning/debug/` for existing debug files:

```
.planning/debug/
  {NNN}-{slug}.md     # Each debug session is a file
```

Read each file's frontmatter to check status:
- `status: gathering` — collecting symptoms from user
- `status: investigating` — testing hypotheses
- `status: fixing` — applying fix
- `status: verifying` — confirming fix works
- `status: resolved` — session complete

**If active sessions found:**

**CRITICAL -- DO NOT SKIP**: Present the following choice to the user via AskUserQuestion before proceeding:
Use the **debug-session-select** pattern from `skills/shared/gate-prompts.md`:
  question: "Found active debug sessions. Which would you like?"

Generate options dynamically from active sessions:
- Each active session becomes an option: label "#{NNN}: {title}", description "Started {date}, last: {last hypothesis}"
- Always include "New session" as the last option: description "Start a fresh debug investigation"
- If more than 3 active sessions exist, show only the 3 most recent plus "New session" (max 4 options)

Handle responses:
- If user selects an existing session: go to **Resume Flow** (Step 3b)
- If user selects "New session": go to **New Session Flow** (Step 3a)
- If user types a session number not in the list: look it up and resume it

**If no active sessions found:**
- Go to **New Session Flow** (Step 3a)

### Step 2b: UAT Gap Intake (conditional)

**Trigger:** If `$ARGUMENTS` contains `--from-uat`, OR if no arguments provided and UAT.md gaps exist for the current phase.

When triggered:

1. Read the current phase's VERIFICATION.md (or UAT.md if it exists):
   - Look for `status: gaps_found` in frontmatter
   - Parse the gaps section for individual gap entries
2. For each gap entry, extract:
   - Gap description (symptom)
   - Expected vs. actual behavior
   - Affected files/components
3. Pre-fill the debug session with UAT context:
   - **Issue title**: derived from the first/primary gap description
   - **Symptoms**: populated from gap details (expected, actual, scope)
   - **Category**: set to the gap's category if available (e.g., "test", "integration")
   - **Context**: include the phase context and related plan information
4. Skip the symptom-gathering questions in Step 3a (symptoms are already populated from UAT gaps)
5. Proceed directly to **Generate Session ID** and **Create Debug File** with the pre-filled data

**If `--from-uat` is specified but no gaps exist:**
- Display: "No UAT/verification gaps found for the current phase. Starting a standard debug session."
- Fall through to normal Step 3a flow

### Step 3a: New Session Flow

#### Gather Symptoms

If `$ARGUMENTS` is provided and descriptive:
- Use it as the initial issue description
- Still ask targeted follow-up questions

If `$ARGUMENTS` is empty or minimal:
- Ask the user for symptoms

**Symptom gathering questions** (ask as plain text — these are freeform, do NOT use AskUserQuestion):

1. **Expected behavior**: "What should happen?"
2. **Actual behavior**: "What actually happens? Include error messages if any."
3. **Reproduction**: "How do you trigger this? Steps to reproduce?"
4. **Onset**: "When did this start? Did anything change recently (new code, dependency update, config change)?"
5. **Scope**: "Does this affect everything or just specific cases? Any patterns?"

**Optional follow-ups** (ask if relevant):
- "What have you already tried?"
- "Does this happen in all environments (dev, prod, test)?"
- "Any relevant log output?"

#### Generate Session ID

1. Scan `.planning/debug/` for existing files
2. Extract NNN prefixes
3. Next number = highest + 1 (start at 001)
4. Generate slug from issue title via CLI:
   ```bash
   pbr-tools slug-generate "{issue title}"
   ```
   Parse the JSON output to get the `slug` field.

#### Create Debug File

**CRITICAL (no hook) -- DO NOT SKIP: Write debug session file to disk.**

Create `.planning/debug/{NNN}-{slug}.md`:

```markdown
---
id: "{NNN}"
title: "{issue title}"
status: gathering
created: "{ISO date}"
updated: "{ISO date}"
severity: "{critical|high|medium|low}"
category: "{runtime|build|test|config|integration|unknown}"
---

# Debug Session: {title}

## Symptoms

**Expected:** {expected behavior}
**Actual:** {actual behavior}
**Reproduction:** {steps}
**Onset:** {when it started}
**Scope:** {affected areas}

## Environment

- OS: {detected or reported}
- Runtime: {node version, python version, etc.}
- Relevant config: {any config that matters}

## Investigation Log

### Round 1 (automated)

{This section is filled by debugger}

## Hypotheses

| # | Hypothesis | Status | Evidence |
|---|-----------|--------|----------|
| 1 | {hypothesis} | {testing/confirmed/rejected} | {evidence} |

## Root Cause

{Filled when found}

## Fix Applied

{Filled when fixed}

## Timeline

- {ISO date}: Session created
```

#### Spawn Debugger

Display to the user: `◆ Spawning debugger...`

Spawn `Task(subagent_type: "pbr:debugger")` with the prompt template.

Read `${CLAUDE_SKILL_DIR}/templates/initial-investigation-prompt.md.tmpl` for the spawn prompt. Fill in the `{NNN}`, `{slug}`, and symptom placeholders with values from the debug file created above.

### Step 3b: Resume Flow

1. Read the debug file content
2. Parse the investigation log and hypotheses table
3. Display to user:

```
Resuming debug session #{NNN}: {title}

Last state:
- Hypotheses tested: {N}
- Confirmed: {list or "none yet"}
- Rejected: {list}
- Current lead: {most promising hypothesis}

Continuing investigation...
```

4. **Increment the round counter** (resuming counts as a new round). Display to the user: `◆ Spawning debugger (resuming session #{NNN}, round {N})...`

   Spawn `Task(subagent_type: "pbr:debugger")` with the continuation prompt template.

   Read `${CLAUDE_SKILL_DIR}/templates/continuation-prompt.md.tmpl` for the spawn prompt. Fill in the `{NNN}`, `{slug}`, and `{paste investigation log...}` placeholders with data from the debug file.

### Step 4: Handle Debugger Results

When the debugger agent completes, first check for completion markers in the Task() output before routing:

| Marker in Task() Output | Route To |
|--------------------------|----------|
| `## DEBUG COMPLETE` | ROOT CAUSE FOUND + FIX path |
| `## ROOT CAUSE FOUND` | ROOT CAUSE FOUND (no fix) path |
| `## DEBUG SESSION PAUSED` | CHECKPOINT path |
| No marker found | INCONCLUSIVE path |

**Spot-check:** Before routing, verify `.planning/debug/{NNN}-{slug}.md` exists and was recently updated (modified timestamp is newer than the Task() spawn time). If the debug file was not updated, warn: `⚠ Debug file not updated by agent — results may be incomplete.`

**Memory capture:** Reference `skills/shared/memory-capture.md` — check debugger output for `<memory_suggestion>` blocks and save any reusable knowledge discovered during debugging.

Display: `✓ Debug session complete — {N} hypotheses tested` (read the hypothesis count from the debug file's Hypotheses table).

The debugger returns one of four outcomes:

#### ROOT CAUSE FOUND + FIX

```
Root cause identified: {cause}
Fix applied: {description}
Commit: {hash}
```

Actions:
1. Update debug file:
   - Set `status: resolved`
   - Fill "Root Cause" section
   - Fill "Fix Applied" section
   - Add timeline entry
2. Update STATE.md if it has a Debug Sessions section
3. Report to user with branded output:

```
╔══════════════════════════════════════════════════════════════╗
║  PLAN-BUILD-RUN ► BUG RESOLVED ✓                             ║
╚══════════════════════════════════════════════════════════════╝

**Session #{NNN}:** {title}
**Root cause:** {cause}
**Fix:** {description}
**Commit:** {hash}


╔══════════════════════════════════════════════════════════════╗
║  ▶ NEXT UP                                                   ║
╚══════════════════════════════════════════════════════════════╝

**Continue your workflow** — the bug is fixed

`/pbr:progress`

<sub>`/clear` first → fresh context window</sub>


**Also available:**
- `/pbr:continue` — execute next logical step
- `/pbr:verify-work {N}` — verify the current phase


```

#### ROOT CAUSE FOUND (no fix)

Used when the debugger was invoked with `find_root_cause_only` or when the fix is too complex for auto-application.

```
Root cause identified: {cause}
Suggested fix: {approach}
```

Actions:
1. Update debug file:
   - Set `status: resolved`
   - Fill "Root Cause" section
   - Add suggested fix to notes
2. Suggest next steps to user:

```


╔══════════════════════════════════════════════════════════════╗
║  ▶ NEXT UP                                                   ║
╚══════════════════════════════════════════════════════════════╝

**Apply the fix** — root cause identified, fix needed

`/pbr:quick {fix description}`

<sub>`/clear` first → fresh context window</sub>


**Also available:**
- `/pbr:plan-phase` — for complex fixes that need planning
- `/pbr:progress` — see project status


```

#### CHECKPOINT

The debugger found something but needs user input or more investigation.

```
Investigation progress:
- Tested: {hypotheses tested}
- Found: {key finding}
- Need: {what's needed to continue}
```

Actions:
1. Update debug file with findings so far
2. Present checkpoint to user
3. **CRITICAL -- DO NOT SKIP**: Present the following choice to the user via AskUserQuestion before proceeding:
   Use the **debug-checkpoint** pattern from `skills/shared/gate-prompts.md`:
   question: "Investigation has reached a checkpoint. How should we proceed?"

Handle responses:
- "Continue": **Increment the round counter** (e.g., round 1 becomes round 2). Then display `◆ Spawning debugger (continuing investigation, round {N})...` and spawn another `Task(subagent_type: "pbr:debugger")` with updated context from the debug file
- "More info": **Increment the round counter.** Ask the user freeform what additional context they have, then update the debug file and spawn another debugger
- "New approach": **Increment the round counter.** Ask the user freeform what alternative angle to try, then update hypotheses and spawn another debugger

#### INCONCLUSIVE

The debugger exhausted its hypotheses without finding the root cause.

```
Investigation exhausted:
- Tested: {all hypotheses}
- Rejected: {list}
- Remaining unknowns: {list}
```

Actions:
1. Update debug file with all findings
2. Report to user:
   - What was tested and eliminated
   - What remains unknown
   - Suggested next investigation approaches:
     - Different reproduction steps
     - Log-level debugging
     - Environment comparison
     - Bisect (git bisect to find the breaking commit)
     - External help (stack overflow, docs)
3. Keep session active for future resumption

---

## Debugger Investigation Protocol

The debugger agent follows this protocol internally:

### Hypothesis-Driven Investigation

```
1. OBSERVE: Read error messages, logs, code around the failure point
2. HYPOTHESIZE: "The most likely cause is X because Y"
3. PREDICT: "If X is the cause, then test Z should show W"
4. TEST: Execute test Z
5. EVALUATE:
   - Result matches prediction → hypothesis supported → investigate deeper
   - Result contradicts → hypothesis rejected → try next hypothesis
   - Result is unexpected → new information → form new hypothesis
```

### Investigation Techniques

| Technique | When to Use |
|-----------|------------|
| **Stack trace analysis** | Error with stack trace available |
| **Code path tracing** | Logic error, wrong behavior |
| **Log injection** | Need to see runtime values |
| **Binary search** | Know it worked before, need to find when it broke |
| **Isolation** | Complex system, need to narrow scope |
| **Comparison** | Works in one case, fails in another |
| **Dependency audit** | Recent dependency changes |
| **Config diff** | Works in one environment, not another |

### Evidence Quality

| Quality | Description | Action |
|---------|-------------|--------|
| **Strong** | Directly proves/disproves hypothesis | Record and move on |
| **Moderate** | Suggests but doesn't prove | Record, seek corroboration |
| **Weak** | Tangentially related | Note but don't base decisions on it |
| **Misleading** | Red herring | Record as eliminated, explain why |

### Hypothesis Round Limit

The maximum number of investigation rounds is controlled by the depth profile's `debug.max_hypothesis_rounds` setting:
- `quick`: 3 rounds (fast, surface-level investigation)
- `standard`: 5 rounds (default)
- `comprehensive`: 10 rounds (deep investigation)

The orchestrator tracks the round count. Before spawning each continuation debugger (Step 4 "CHECKPOINT" -> "Continue"), increment the round counter. If the counter reaches the limit:
- Do NOT spawn another debugger
- Present to user: "Debug session has reached the hypothesis round limit ({N} rounds for {depth} mode). Options:"

Use AskUserQuestion:
  question: "Reached {N}-round hypothesis limit. How should we proceed?"
  header: "Debug Limit"
  options:
    - label: "Extend"    description: "Allow {N} more rounds (doubles the limit)"
    - label: "Wrap up"   description: "Record findings so far and close the session"
    - label: "Escalate"  description: "Save context for manual debugging"

- If "Extend": double the limit and continue
- If "Wrap up": update debug file `status: resolved` with `resolution: abandoned`, record all findings, suggest next steps
- If "Escalate": write a detailed handoff document to the debug file with all hypotheses, evidence, and suggested manual investigation steps

---

## Debug File Management

### Lifecycle

```
gathering → investigating → fixing → verifying → resolved
(any non-resolved) → resolved  (with resolution: abandoned, if 7+ days stale)
(any non-resolved) → (same)    (resumed after pause)
```

### Staleness Detection

When scanning for active sessions, check the `updated` date. If more than 7 days old:
- Set `status: resolved` with `resolution: abandoned` in frontmatter
- Still offer to resume, but warn: "This session is {N} days old. Context may have changed."

### Cleanup

Old resolved debug files can accumulate. They serve as a knowledge base for similar issues. Do NOT auto-delete them.

---

## State Integration

Update STATE.md Debug Sessions section (create if needed):

```markdown
### Debug Sessions

| # | Issue | Status | Root Cause |
|---|-------|--------|------------|
| 001 | Login timeout | resolved | DB connection pool exhausted |
| 002 | CSS not loading | active | investigating |
```

---

## Git Integration

Reference: `skills/shared/commit-planning-docs.md` for the standard commit pattern.

If `planning.commit_docs: true` in config.json:
- New session: `docs(planning): open debug session {NNN} - {slug}`
- Resolution: `docs(planning): resolve debug session {NNN} - {root cause summary}`
- Fix commits use standard format: `fix({scope}): {description}`

---

Reference: `skills/shared/error-reporting.md` for branded error output patterns.

## Edge Cases

### User provides a stack trace or error in arguments
- Parse it as the "Actual behavior" symptom
- Extract key information: error type, file, line number
- Use this to form initial hypotheses immediately

### Debugger agent fails
If the debugger Task() fails or returns an error, display:
```
╔══════════════════════════════════════════════════════════════╗
║  ERROR                                                       ║
╚══════════════════════════════════════════════════════════════╝

Debugger agent failed for session #{NNN}.

**To fix:**
- Check the debug file at `.planning/debug/{NNN}-{slug}.md` for partial findings
- Re-run with `/pbr:debug` to resume the session
- If the issue persists, try a fresh session with different symptom details
```

### Issue is in a dependency (not user code)
- Document which dependency and version
- Check if there's a known issue (search patterns in node_modules, site-packages, etc.)
- Suggest: update dependency, pin version, or work around

### Issue is intermittent
- Note intermittency in symptoms
- Suggest: run multiple times, add timing/logging, check for race conditions
- Investigation must account for non-deterministic reproduction

### Multiple issues interacting
- If investigation reveals multiple separate issues, split into separate debug sessions
- Create additional debug files
- Track each independently

### Fix would be a breaking change
- Report the root cause but DO NOT auto-apply the fix
- Present the trade-offs to the user
- Let the user decide how to proceed

---

## Anti-Patterns

1. **DO NOT** skip hypothesis formation — every test must have a reason
2. **DO NOT** make random changes hoping something works
3. **DO NOT** ignore failed hypotheses — record why they failed
4. **DO NOT** exceed the depth profile's `debug.max_hypothesis_rounds` limit without user confirmation (default: 5 for standard mode)
5. **DO NOT** fix the symptom instead of the root cause
6. **DO NOT** auto-apply fixes for breaking changes
7. **DO NOT** delete debug files — they're a knowledge base
8. **DO NOT** combine multiple bugs into one debug session
9. **DO NOT** skip updating the debug file after each investigation round
10. **DO NOT** start a new session when an active one covers the same issue