---
name: ccpm-debugging
description: Systematic debugging with defense-in-depth approach (symptoms → immediate cause → root cause → systemic issues → prevention). Auto-activates when user mentions "error", "failing", "broken", "debug", "bug", "issue" or when /ccpm:verify runs. Uses Observe → Hypothesize → Test → Confirm → Fix → Verify → Document workflow. Updates Linear with findings and automatically logs blockers that require external fixes. Traces root causes instead of patching symptoms. Integrates with ccpm-code-review to verify fixes pass all gates before marking complete. Suggests binary search for intermittent issues and five-whys analysis for complex problems.
---

# CCPM Debugging

Systematic debugging with Linear integration and structured troubleshooting workflow.

## When to Use

This skill auto-activates when:

- User mentions: **"error"**, **"failing"**, **"broken"**, **"debug"**, **"bug"**, **"issue"**
- Running **`/ccpm:verify`** command
- Tests failing during `/ccpm:verify`
- Build errors during implementation
- Runtime errors or exceptions
- Unexpected behavior

## Core Debugging Philosophy

### Defense-in-Depth Approach

**Multiple layers of investigation**:
1. **Symptoms** - What's observable?
2. **Immediate cause** - What directly caused the error?
3. **Root cause** - Why did it happen?
4. **Systemic issues** - Could this happen elsewhere?
5. **Prevention** - How to prevent recurrence?

### Systematic Process

```
Observe → Hypothesize → Test → Confirm → Fix → Verify → Document
```

Never skip steps. Each step builds on the previous.

## Integration with CCPM

### Use with `/ccpm:verify`

When verification fails, this skill provides structured debugging:

```
User: "/ccpm:verify AUTH-123"

Claude: [ccpm-debugging activates]

Starting systematic debugging for AUTH-123...

Step 1: GATHER SYMPTOMS
Running /ccpm:verify to collect error information...

Found failures:
- 3 failing tests
- 1 TypeScript compilation error
- 0 linting errors

Step 2: ANALYZE FAILURES

Test failures:
1. test/auth/login.test.ts:45
   ❌ "should handle invalid credentials"
   Expected: 401 Unauthorized
   Actual: 500 Internal Server Error

2. test/auth/jwt.test.ts:78
   ❌ "should refresh expired tokens"
   Error: Cannot read property 'id' of undefined

3. test/api/protected.test.ts:23
   ❌ "should reject missing tokens"
   Timeout: Request exceeded 5000ms

TypeScript error:
src/auth/jwt.ts:45:12
- error TS2345: Argument of type 'string | undefined' is not assignable to parameter of type 'string'

Step 3: FORM HYPOTHESES

Hypothesis 1: JWT token handling has type safety issue (TS error)
Hypothesis 2: Error handling missing (test 1 shows 500 instead of 401)
Hypothesis 3: Token refresh logic has null reference bug (test 2)
Hypothesis 4: Protected route timeout suggests authentication middleware hanging (test 3)

Step 4: TEST HYPOTHESES
[Proceeds with systematic testing...]

Step 5: UPDATE LINEAR
Adding debugging findings to AUTH-123...
```

### Linear Integration

**Automatic Linear updates during debugging**:

```typescript
// After each debugging step (pseudocode; the actual call depends on the Linear integration in use)
Linear.addComment(taskId, {
  title: "Debugging Progress",
  content: `
## Symptoms
- 3 failing tests (login, jwt, protected routes)
- TypeScript type error in jwt.ts:45

## Root Cause
Missing null check before accessing user.id in token refresh

## Fix Applied
Added null guard and proper error handling

## Verification
Re-running tests...
  `
});
```
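The comment body above can be assembled from structured findings rather than written by hand. The following is a minimal sketch under stated assumptions: the `DebugFindings` shape and the `formatDebugComment` helper are hypothetical names introduced here for illustration, and the sketch only builds the Markdown string. How that string is posted to Linear is left to whatever client is in use.

```typescript
// Hypothetical shape for findings collected during a debugging session.
interface DebugFindings {
  symptoms: string[];
  rootCause?: string;     // unknown until the Confirm step
  fixApplied?: string;
  verification?: string;
}

// Render findings as the Markdown body of a progress comment.
// Sections with no content yet are omitted rather than left empty.
function formatDebugComment(findings: DebugFindings): string {
  const sections: string[] = [];
  if (findings.symptoms.length > 0) {
    sections.push("## Symptoms\n" + findings.symptoms.map((s) => `- ${s}`).join("\n"));
  }
  if (findings.rootCause) sections.push(`## Root Cause\n${findings.rootCause}`);
  if (findings.fixApplied) sections.push(`## Fix Applied\n${findings.fixApplied}`);
  if (findings.verification) sections.push(`## Verification\n${findings.verification}`);
  return sections.join("\n\n");
}

const commentBody = formatDebugComment({
  symptoms: [
    "3 failing tests (login, jwt, protected routes)",
    "TypeScript type error in jwt.ts:45",
  ],
  rootCause: "Missing null check before accessing user.id in token refresh",
});
console.log(commentBody);
```

Omitting empty sections lets the same helper serve every step of the workflow: early comments carry only symptoms, later ones add the root cause and fix.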

### Blocker Logging

**When encountering blockers**:

```
Claude: [ccpm-debugging detects blocker]

Found BLOCKER: Redis connection failing in staging

Adding blocker to Linear AUTH-123:
---
🚫 BLOCKER: Redis Connection Issue

Environment: Staging
Error: ECONNREFUSED 127.0.0.1:6379
Impact: Cannot test session management

Investigation:
- Redis service not running in staging
- Docker compose missing Redis container
- Connection string using localhost instead of service name

Temporary workaround:
- Skip Redis-dependent tests in staging
- Test locally with Docker

Resolution needed:
- Update staging docker-compose.yml
- Fix Redis service configuration
---

Blocker logged. Continuing with fixable issues...
```
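One possible way to separate blockers from fixable issues is to look at the error code. The sketch below assumes Node.js-style system errors, which carry a string `code` property such as `ECONNREFUSED`. The helper name `isLikelyExternalBlocker` and the particular set of codes are assumptions made for illustration, not part of CCPM.

```typescript
// Node.js attaches a string `code` to system errors (e.g. "ECONNREFUSED").
// This selection is an assumption for illustration, not an exhaustive list.
const EXTERNAL_ERROR_CODES = new Set(["ECONNREFUSED", "ENOTFOUND", "ETIMEDOUT", "ECONNRESET"]);

// Returns true when the error looks like an unreachable dependency
// (network or service problem) rather than a bug in the code under test.
function isLikelyExternalBlocker(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const code = (error as { code?: unknown }).code;
  return typeof code === "string" && EXTERNAL_ERROR_CODES.has(code);
}

const redisDown = Object.assign(new Error("connect ECONNREFUSED 127.0.0.1:6379"), {
  code: "ECONNREFUSED",
});
console.log(isLikelyExternalBlocker(redisDown));             // true
console.log(isLikelyExternalBlocker(new TypeError("boom"))); // false
```

A heuristic like this only flags candidates. Whether something is really a blocker still depends on who can fix it.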

## Debugging Strategies

### Strategy 1: Error Message Analysis

**Read error messages carefully**:

```
Error: Cannot read property 'id' of undefined

Analysis:
1. What's undefined? → The value whose 'id' is being read
2. What property? → 'id'
3. Where? → Line number in stack trace
4. When? → During what operation?

Investigation:
- Check stack trace for call site
- Identify which object is undefined
- Trace back to why it's undefined
- Find where it should be defined
```
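"Check stack trace for call site" can be partly automated. The sketch below parses a V8-style stack trace (the format Node.js and Chrome produce) and picks the first frame that belongs to the project. The helper names and the file paths in the sample are hypothetical, and the pattern covers only the two most common frame shapes.

```typescript
// One parsed frame from a V8-style stack trace (Node.js, Chrome).
interface StackFrame {
  fn: string | null; // function name, or null for anonymous/top-level frames
  file: string;
  line: number;
  column: number;
}

// Matches "    at fn (file:line:col)" and "    at file:line:col".
const FRAME_PATTERN = /^\s*at (?:(.+?) \()?(.+?):(\d+):(\d+)\)?$/;

function parseStack(stack: string): StackFrame[] {
  const frames: StackFrame[] = [];
  for (const rawLine of stack.split("\n")) {
    const match = FRAME_PATTERN.exec(rawLine);
    if (!match) continue; // skips the "TypeError: ..." message line
    frames.push({
      fn: match[1] ?? null,
      file: match[2],
      line: Number(match[3]),
      column: Number(match[4]),
    });
  }
  return frames;
}

// First frame in project code rather than in dependencies or Node internals.
function firstProjectFrame(frames: StackFrame[]): StackFrame | undefined {
  return frames.find((f) => !f.file.includes("node_modules") && !f.file.startsWith("node:"));
}

// Hypothetical trace for illustration.
const exampleStack = [
  "TypeError: Cannot read properties of undefined (reading 'id')",
  "    at /app/node_modules/some-lib/index.js:10:3",
  "    at refreshToken (/app/src/auth/jwt.ts:45:12)",
].join("\n");
console.log(firstProjectFrame(parseStack(exampleStack)));
```

Note that recent Node.js versions phrase this error as `Cannot read properties of undefined (reading 'id')`. The analysis steps are the same for both wordings.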

### Strategy 2: Reproduce Minimally

**Create minimal reproduction**:

```
Full test failing:
- 100 lines of setup
- Multiple database calls
- Complex state management

Minimal reproduction:
- 10 lines
- Mock database
- Isolated function call

Result: Easier to identify exact issue
```
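As a rough illustration of what such a reproduction might look like, the sketch below isolates a single function and swaps the database for an in-memory stub. Every name here (`buildRefreshPayload`, `FindUser`, `stubFindUser`) is hypothetical. The point is the shape: one function, one stub, one call.

```typescript
// Hypothetical function under investigation: builds a refresh-token payload.
interface User { id: string; email: string }
type FindUser = (id: string) => User | undefined;

function buildRefreshPayload(findUser: FindUser, userId: string): { sub: string } {
  const user = findUser(userId);
  // Bug being reproduced: no guard for a missing user.
  return { sub: (user as User).id };
}

// Minimal reproduction: an in-memory stub replaces the database,
// and a single call replaces the full test setup.
const stubFindUser: FindUser = () => undefined;

let reproduced: unknown = null;
try {
  buildRefreshPayload(stubFindUser, "missing-user");
} catch (error) {
  reproduced = error;
}
console.log(reproduced instanceof TypeError); // true: the crash is reproduced in isolation
```

Passing the lookup in as a parameter is what makes the stub possible. If the function under test reaches the database directly, extracting that dependency is often the first step of the reproduction.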

### Strategy 3: Binary Search

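**For regressions, including intermittent ones**:

```
Problem: Feature worked before, now fails (possibly only sometimes)

Binary search approach:
1. Start bisecting (git bisect start)
2. Mark the broken commit (git bisect bad) and a known working commit (git bisect good <commit>)
3. Test the commit git checks out (for intermittent failures, run the test several times)
4. Mark it as good or bad
5. Repeat until the culprit commit is found, then clean up (git bisect reset)

Result: Exact commit that introduced the bug
```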
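The search that bisecting performs can be sketched in a few lines. This is an illustration of the idea, not of how git implements it: `findFirstBad` and `failsWithin` are hypothetical helpers, and the sketch assumes a linear history in which every commit before the culprit is good and every commit from it onward is bad.

```typescript
// Find the first bad item in a history ordered oldest → newest.
function findFirstBad<T>(history: T[], isBad: (item: T) => boolean): T | undefined {
  let lo = 0;              // lowest index that could be the first bad one
  let hi = history.length; // history.length means "no bad item found"
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (isBad(history[mid])) {
      hi = mid;     // culprit is at mid or earlier
    } else {
      lo = mid + 1; // culprit is after mid
    }
  }
  return lo < history.length ? history[lo] : undefined;
}

// For intermittent failures a single passing run proves little:
// treat a commit as bad if any of `runs` attempts fails.
// `runOnce` returns true when the test passes.
function failsWithin(runs: number, runOnce: () => boolean): boolean {
  for (let i = 0; i < runs; i++) {
    if (!runOnce()) return true;
  }
  return false;
}

// Hypothetical history in which the bug was introduced at "d4".
const commits = ["a1", "b2", "c3", "d4", "e5"];
const culprit = findFirstBad(commits, (c) => commits.indexOf(c) >= 3);
console.log(culprit); // "d4"
```

Each step halves the remaining range, so a history of N commits needs about log2(N) test runs. With an intermittent bug, a false "good" verdict sends the search in the wrong direction, which is why repeated runs per commit matter.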

### Strategy 4: Rubber Duck Debugging

**Explain the problem out loud**:

```
User: "I don't understand why this is failing"

Claude: [ccpm-debugging activates]

Let's walk through this step by step:

1. What are you trying to do?
   → "Log in a user with JWT"

2. What do you expect to happen?
   → "Return 200 with token"

3. What actually happens?
   → "Returns 500 error"

4. What's different between expectation and reality?
   → "Error handling is missing"

5. Why would error handling be missing?
   → "Forgot to wrap async call in try-catch"

Often explaining the problem reveals the solution!
```

## Root-Cause Tracing

### The 5 Whys Technique

**Keep asking "why?" until you find root cause**:

```
Problem: Tests failing in CI but passing locally

Why? → Database connection timeout in CI
Why? → Database takes longer to start in CI
Why? → No health check waiting for database
Why? → Docker Compose doesn't have healthcheck configured
Why? → Template missing this configuration

Root cause: Missing healthcheck in docker-compose.yml template

Fix: Add healthcheck to template, not just local override
```

### Trace Backwards

**Start from symptom, trace backwards**:

```
Symptom: User sees "Internal Server Error"
       ↓
Application log: TypeError: Cannot read property 'email' of null
       ↓
Code: const email = user.email
       ↓
user comes from: await db.findUser(id)
       ↓
findUser returned: null (user not found)
       ↓
Why null? User ID was: undefined
       ↓
ID came from: req.params.userId
       ↓
Route defined as: /api/users/:id (not :userId)
       ↓
Root cause: Route parameter mismatch
```
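The last link in the chain above is easy to demonstrate in isolation. The sketch below uses a deliberately simplified, hypothetical `matchRoute` helper (no wildcards, optional segments, or regex parameters) to show why reading `userId` yields `undefined` when the route declares `:id`.

```typescript
// Extract named parameters from a path using an Express-style pattern.
// Simplified for illustration; real routers support far more syntax.
function matchRoute(pattern: string, path: string): Record<string, string> | null {
  const patternParts = pattern.split("/");
  const pathParts = path.split("/");
  if (patternParts.length !== pathParts.length) return null;

  const params: Record<string, string> = {};
  for (let i = 0; i < patternParts.length; i++) {
    if (patternParts[i].startsWith(":")) {
      // The parameter is keyed by the name used in the route definition.
      params[patternParts[i].slice(1)] = pathParts[i];
    } else if (patternParts[i] !== pathParts[i]) {
      return null;
    }
  }
  return params;
}

const routeParams = matchRoute("/api/users/:id", "/api/users/42");
console.log(routeParams?.id);     // "42"
console.log(routeParams?.userId); // undefined: the handler reads a name the route never defines
```

Nothing throws at the point of the mistake, which is why the symptom shows up several layers later as a null user.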

## Common Debugging Patterns

### Pattern 1: Null/Undefined Issues

```typescript
// ❌ Crash waiting to happen
function getEmail(user) {
  return user.email;  // Crashes if user is null
}

// ✅ Defensive
function getEmail(user) {
  if (!user) {
    throw new Error('User is required');
  }
  return user.email;
}

// ✅ Even better with TypeScript
function getEmail(user: User | null): string {
  if (!user) {
    throw new Error('User is required');
  }
  return user.email;
}
```

### Pattern 2: Async/Await Errors

```typescript
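// ❌ No error handling: database failures surface as generic errors,
//    and a missing user is passed straight to generateToken
async function login(email, password) {
  const user = await db.findUser(email);  // Could throw, or return null
  return generateToken(user);
}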

// ✅ Proper error handling
async function login(email, password) {
  try {
    const user = await db.findUser(email);
    if (!user) {
      throw new UnauthorizedError('Invalid credentials');
    }
    return generateToken(user);
  } catch (error) {
    if (error instanceof DatabaseError) {
      logger.error('Database error during login', error);
      throw new ServiceUnavailableError();
    }
    throw error;
  }
}
```

### Pattern 3: Race Conditions

```typescript
// ❌ Race condition
let counter = 0;
async function increment() {
  const current = counter;
  await delay(10);
  counter = current + 1;  // Lost updates!
}

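// ✅ Read-modify-write inside a critical section
// (Mutex as provided by, for example, the async-mutex package)
let counter = 0;
const lock = new Mutex();
async function increment() {
  const release = await lock.acquire();
  try {
    const current = counter;
    await delay(10);
    counter = current + 1;  // No other increment can interleave here
  } finally {
    release();
  }
}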
```

## Debugging Workflows

### Workflow 1: Test Failure

```
1. Read test failure message carefully
2. Identify what's expected vs actual
3. Find the code being tested
4. Add console.log or debugger
5. Re-run test in isolation
6. Step through with debugger
7. Identify exact line causing failure
8. Fix the issue
9. Verify test passes
10. Update Linear with fix
```

### Workflow 2: Runtime Error

```
1. Capture full error message + stack trace
2. Identify error location from stack trace
3. Reproduce error consistently
4. Add error handling/logging at error site
5. Trace backwards to root cause
6. Fix root cause (not just symptom)
7. Add test to prevent regression
8. Update Linear with findings
```

### Workflow 3: Performance Issue

```
1. Measure baseline performance
2. Profile to find bottleneck
3. Hypothesize cause
4. Test hypothesis (enable/disable features)
5. Confirm bottleneck
6. Optimize bottleneck
7. Measure improvement
8. Document in Linear
```
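Steps 1 and 7 only mean something if both measurements are taken the same way. A minimal sketch, assuming a runtime with a global `performance.now()` (Node.js 16+ or a browser): `median` and `measureMs` are hypothetical helpers, and the two lookups are a made-up before/after pair.

```typescript
// Median of a non-empty list of numbers.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Time `fn` over several runs and return each duration in milliseconds.
function measureMs(fn: () => void, runs = 10): number[] {
  const durations: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    durations.push(performance.now() - start);
  }
  return durations;
}

// Hypothetical before/after pair: the same lookup with and without a linear scan.
const ids = Array.from({ length: 2000 }, (_, i) => i);
const idSet = new Set(ids);
const slowLookup = () => { for (const id of ids) ids.includes(id); };
const fastLookup = () => { for (const id of ids) idSet.has(id); };

const baselineMs = median(measureMs(slowLookup));
const optimizedMs = median(measureMs(fastLookup));
console.log(`baseline ${baselineMs.toFixed(2)}ms, optimized ${optimizedMs.toFixed(2)}ms`);
```

The median is used instead of the mean because one-off spikes (garbage collection, cold caches) skew the mean. For anything beyond a quick check, a profiler gives a more reliable picture than wall-clock timing.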

## Integration with Other CCPM Skills

### Works with `ccpm-code-review`

```
Debugging fixes issue
       ↓
ccpm-code-review verifies fix
       ↓
- Tests now passing?
- Build successful?
- Issue documented?
       ↓
Only then mark as complete
```

### Works with `sequential-thinking`

```
Complex debugging scenario
       ↓
sequential-thinking structures investigation
       ↓
Thought 1: Gather symptoms
Thought 2: Form hypotheses
Thought 3: Test hypothesis A
Thought 4: Root cause identified
Thought 5: Fix applied
Thought 6: Verification complete
```

### Works with `pm-workflow-guide`

```
Debugging started
       ↓
pm-workflow-guide suggests /ccpm:verify
       ↓
ccpm-debugging provides systematic approach
       ↓
After fix: pm-workflow-guide suggests /ccpm:verify
```

## Examples

### Example 1: Failing Tests

````
User: "Tests are failing, help me debug"

Claude: [ccpm-debugging activates]

Starting systematic debugging...

Step 1: GATHER SYMPTOMS
Running tests to collect failure information...

Found 2 failures:
1. test/api/users.test.ts:34
   "GET /api/users should return user list"
   Expected status 200, got 404

2. test/api/users.test.ts:45
   "POST /api/users should create user"
   Expected status 201, got 404

Step 2: PATTERN RECOGNITION
Both tests hitting /api/users route → routing issue likely

Step 3: HYPOTHESIS
Route might not be registered or path is wrong

Step 4: INVESTIGATION
Checking route registration...

Found in src/app.ts:
```typescript
app.use('/api/user', userRouter);  // ← Missing 's'
```

Should be:
```typescript
app.use('/api/users', userRouter);
```

Step 5: ROOT CAUSE
Typo in route registration path

Step 6: FIX
Updated route path to '/api/users'

Step 7: VERIFICATION
Re-running tests...
✅ All tests passing (2/2)

Step 8: LINEAR UPDATE
Adding fix summary to Linear task...

Issue: Route registration typo
Fix: Corrected /api/user → /api/users
Tests: Now passing
Commit: abc123f

Debugging complete! ✅
````

### Example 2: Memory Leak

````
User: "Application memory usage keeps growing"

Claude: [ccpm-debugging activates]

Investigating memory issue...

Step 1: MEASURE BASELINE
Starting memory: 150MB
After 1 hour: 450MB
After 2 hours: 750MB
Pattern: Linear growth → memory leak

Step 2: PROFILE
Using Node.js --inspect and Chrome DevTools...

Heap snapshot shows:
- EventEmitter listeners growing
- WebSocket connections not being cleaned up

Step 3: HYPOTHESIS
Event listeners not being removed after WebSocket disconnect

Step 4: CODE INVESTIGATION
Found in src/websocket/handler.ts:
```typescript
wss.on('connection', (ws) => {
  ws.on('message', handleMessage);
  ws.on('close', () => {
    console.log('Connection closed');
    // ❌ Not removing listeners!
  });
});
```

Step 5: FIX
```typescript
wss.on('connection', (ws) => {
  const messageHandler = handleMessage.bind(null, ws);
  ws.on('message', messageHandler);
  ws.on('close', () => {
    ws.removeListener('message', messageHandler);
    ws.removeAllListeners();
    console.log('Connection closed and cleaned up');
  });
});
```

Step 6: VERIFICATION
Running memory test for 2 hours...
Memory stable at ~160MB ✅

Step 7: LINEAR UPDATE
Blocker resolved: Memory leak in WebSocket handler
Fix: Proper cleanup of event listeners
Testing: 2-hour stability test passed

Debugging complete! ✅
````

## Tips for Effective Debugging

### Do's

- ✅ Read error messages completely
- ✅ Check stack traces for exact line numbers
- ✅ Reproduce issue consistently
- ✅ Create minimal reproduction
- ✅ Test one hypothesis at a time
- ✅ Document findings in Linear
- ✅ Add regression tests
- ✅ Fix root cause, not symptoms

### Don'ts

- ❌ Make random changes hoping to fix it
- ❌ Skip error messages
- ❌ Change multiple things at once
- ❌ Ignore warnings
- ❌ Forget to verify the fix
- ❌ Leave debugging code in production
- ❌ Skip documentation

## Summary

This skill provides:

- ✅ Systematic debugging approach
- ✅ Root-cause tracing
- ✅ Linear integration for tracking
- ✅ Blocker logging
- ✅ Defense-in-depth investigation
- ✅ Integration with CCPM verification workflow

**Philosophy**: Systematic over random, root cause over symptoms, documented for the future.

---

- **Source**: Adapted from [claudekit-skills/debugging](https://github.com/mrgoonie/claudekit-skills)
- **License**: MIT
- **CCPM Integration**: `/ccpm:verify`, `/ccpm:sync`, Linear blocker tracking