--- name: fix-issue license: MIT compatibility: "Claude Code 2.1.148+. Requires memory MCP server, context7 MCP server, gh CLI." description: "Fixes GitHub issues using parallel analysis agents for root cause investigation, code exploration, and regression detection. Reads issue context from gh CLI, searches codebase and memory for related patterns, generates a fix with tests, and links the resolution back to the issue via PR. Includes prevention analysis to avoid recurrence. Use when debugging errors, resolving regressions, fixing bugs, or triaging issues." argument-hint: "[issue-number]" context: fork version: 2.5.0 author: OrchestKit tags: [issue, bug-fix, github, debugging, rca, prevention] user-invocable: true allowed-tools: [AskUserQuestion, Bash, Read, Write, Edit, Task, TaskCreate, TaskUpdate, TaskStop, Grep, Glob, ToolSearch, CronCreate, CronDelete, PushNotification, mcp__memory__search_nodes, mcp__context7__get_library_docs] skills: [commit, explore, verify, memory, remember, chain-patterns] complexity: medium persuasion-type: guidance model: sonnet hooks: PreToolUse: - matcher: "Read" command: "${CLAUDE_PLUGIN_ROOT}/hooks/bin/run-hook.mjs skill/issue-context-loader" once: true metadata: category: workflow-automation mcp-server: memory, context7 triggers: keywords: [fix, debug, "bug report", broken, "500 errors", investigate, resolve, regression, "track down", "figure out why", "issue #"] examples: - "fix issue #234" - "there's a bug where users can't reset their passwords" - "something's causing 500 errors on the /api/users endpoint" anti-triggers: [implement, build, create, explore, review, brainstorm] paths: ["src/**/*.{ts,tsx,js,jsx}", "package.json", "CLAUDE.md"] --- # Fix Issue Systematic issue resolution with hypothesis-based root cause analysis, similar issue detection, and prevention recommendations. ## Quick Start ```bash /ork:fix-issue 123 /ork:fix-issue 456 ``` > **Opus 4.8**: Root cause analysis uses native adaptive thinking. Dynamic token budgets scale with context window for thorough investigation. > **CC ≥ 2.1.119 multi-host note (M122):** Issue fetching works against GitHub, GitLab, Bitbucket, and GitHub Enterprise. The argument is either a numeric ID (use the configured default remote's host) or a full URL (parsed via `parsePrUrl`/`parseIssueUrl` from `src/hooks/src/lib/pr-host-parser.ts`). Branch on the detected host family for the right CLI: `gh issue view` (GitHub/GHE), `glab issue view` (GitLab), `bb issue view` (Bitbucket). Reference: `src/skills/chain-patterns/references/pr-from-platform.md`. ## Argument Resolution ```python ISSUE_NUMBER = "$ARGUMENTS[0]" # e.g., "123" (CC 2.1.59 indexed access) # $ARGUMENTS contains the full argument string # $ARGUMENTS[0] is the first space-separated token ``` ## STEP -1: MCP Probe + Resume Check **Run BEFORE any other step.** Detect available MCP servers and check for resumable state. ```python # Probe MCPs (parallel — all in ONE message): # memory is alwaysLoad in .mcp.json (CC 2.1.121+, #1541) — probe below kept as fallback for older CC: ToolSearch(query="select:mcp__memory__search_nodes") ToolSearch(query="select:mcp__context7__resolve-library-id") # Write capability map: Write(".claude/chain/capabilities.json", JSON.stringify({ "memory": , "context7": , "timestamp": now() })) # Check for resumable state: Read(".claude/chain/state.json") # If exists and skill == "fix-issue": # Read last handoff, skip to current_phase # Tell user: "Resuming from Phase {N}" # If not exists: write initial state Write(".claude/chain/state.json", JSON.stringify({ "skill": "fix-issue", "issue": ISSUE_NUMBER, "current_phase": 1, "completed_phases": [], "capabilities": capabilities })) ``` > Load pattern details: `Read("${CLAUDE_PLUGIN_ROOT}/skills/chain-patterns/references/mcp-detection.md")` ## Phase 0b — Prior-fix lookup (signal-fired, optional) Before diagnosis kicks off, optionally invoke `scripts/prior_fix_lookup.py ` to surface similar fixes already recorded in the memory MCP. READ-ONLY — no writeback. Self-skips on every non-happy-path so it never blocks the fix: ```bash python3 plugins/ork/skills/fix-issue/scripts/prior_fix_lookup.py "$CLAUDE_JOB_DIR" ``` Auto-skip conditions (all exit 0, all WARN-logged): | Skip reason | Trigger | |-------------|---------| | `signal absent` | `error_text` missing OR signature extractor returns `None` | | `yg-mcp-core not importable` | `yg-mcp-core>=0.3.0` not installed (orchestkit is public; yg-mcp-core lives on private `pypi.yonyon.ai` — HQ-only) | | `memory MCP unreachable` | MCP server down OR `.mcp.json` doesn't define `memory` | Session dir must contain `fix-issue-input.json` (with `error_text: str`). The signature extractor (`signature_lib.extract_signature`) normalizes Python tracebacks, JS stack traces, and generic `: ` errors to a ` :` shape used as the `search_nodes` query. Handoff JSON at `/prior-fix-matches.json` records `status`, `signature`, and `matches_count`; the top-3 matches land in `/prior-fix-matches.md` as a Markdown table. Mirrors the memory-consumer pattern from PR #1889 but read-only. Closes orchestkit#1895. ## CRITICAL: Task Management is MANDATORY (CC 2.1.16) **BEFORE doing ANYTHING else (after MCP probe), create tasks to track progress:** ```python # 1. Create main task IMMEDIATELY TaskCreate( subject="Fix Issue: #{ISSUE_NUMBER}", description="Systematic issue resolution with RCA and prevention", activeForm="Fixing issue #{ISSUE_NUMBER}" ) # 2. Create subtasks for each key phase TaskCreate(subject="Understand issue", activeForm="Reading issue details") TaskCreate(subject="Hypothesis & RCA", activeForm="Analyzing root cause") TaskCreate(subject="Implement fix", activeForm="Applying fix with tests") TaskCreate(subject="Validate & prevent", activeForm="Validating fix and prevention") TaskCreate(subject="Commit and PR", activeForm="Creating PR for fix") # 3. Set dependencies for sequential phases TaskUpdate(taskId="3", addBlockedBy=["2"]) TaskUpdate(taskId="4", addBlockedBy=["3"]) TaskUpdate(taskId="5", addBlockedBy=["4"]) TaskUpdate(taskId="6", addBlockedBy=["5"]) # 4. Before starting each task, verify it's unblocked task = TaskGet(taskId="2") # Verify blockedBy is empty # 5. Update status as you progress TaskUpdate(taskId="2", status="in_progress") # When starting TaskUpdate(taskId="2", status="completed") # When done ``` ## STEP 0: Effort-Aware Fix Scaling (CC 2.1.76) Scale investigation depth based on `/effort` level: | Effort Level | Approach | Agents | Phases | |-------------|----------|--------|--------| | **low** | Quick fix: read → fix → test → done | 0 agents | 1, 6, 7, 11 | | **medium** | Standard: investigate → fix → test → prevent | 2-3 agents | 1-4, 6-8, 11 | | **high** (default) | Full RCA: 5 parallel agents → fix → prevent → lessons | 5 agents | All 11 phases | | **xhigh** (Opus 4.8, CC 2.1.111+) | Full RCA + extra regression-scan pass across sibling modules | 5 agents | All 11 phases + regression sweep | > **Override:** Explicit user selection (e.g., "Proper fix") overrides `/effort` downscaling. Hotfix always uses minimal phases regardless of effort. ## STEP 0a: Verify User Intent **BEFORE creating tasks**, clarify fix approach: ```python AskUserQuestion( questions=[{ "question": "How do you want to approach this fix?", "header": "Fix Approach", "options": [ {"label": "Proper fix (Recommended)", "description": "Full RCA, regression test, prevention plan"}, {"label": "Quick fix", "description": "Minimal investigation, fix and test"}, {"label": "Investigate first", "description": "Deep analysis before deciding on approach"}, {"label": "Hotfix", "description": "Emergency fix, minimal process"} ], "multiSelect": false }] ) ``` **Based on answer, adjust workflow:** - **Proper fix**: All 11 phases, 5 parallel RCA agents - **Quick fix**: Phases 1, 6, 7, 11 only — skip RCA agents and prevention - **Investigate first**: Enter plan mode for read-only analysis, then decide - **Hotfix**: Phases 1, 6, 11 only — emergency path ### Sub-question: Local-CI Strategy (AskUserQuestion — M118 #1467) Once the approach is chosen, ask whether to run CI locally before pushing — orthogonal to fix depth: ```python # Skip when invocation flag is explicit: # /ork:fix-issue 123 --local-ci → skip, run full suite locally # /ork:fix-issue 123 --security-only → skip, security tests only # /ork:fix-issue 123 --push-and-let-ci → skip, no local run # # Force local-CI when issue has security or data-loss labels (warns user it overrode their choice). AskUserQuestion(questions=[{ "question": "Before push?", "header": "Local CI", "options": [ {"label": "Push and let CI run (default)", "description": "Fastest round-trip, CI catches failures"}, {"label": "Run full suite locally first", "description": "~2-3 min extra; catches CI failures locally before push"}, {"label": "Run security tests only", "description": "~30s; covers the usual blocker class — secrets, deps, common vulns"} ] }]) ``` **Override rule:** if the issue's GitHub labels include `security` or `data-loss`, override the user's selection with **"Run full suite locally first"** and surface a one-line notification: *"Security/data-loss label detected — running full local suite as a precaution."* The user can still bypass with the `--push-and-let-ci` arg, which logs the bypass for audit. **If 'Investigate first' selected:** ```python # 1. Enter read-only plan mode EnterPlanMode("Investigate issue: $ISSUE_REF") # 2. Investigation phase — Read/Grep/Glob ONLY, no Write/Edit # - Read the issue description and linked context # - Trace the error path through relevant code # - Search for related issues, past fixes, test failures # - Build hypothesis list with evidence # 3. Produce RCA report: # - Root cause hypothesis (ranked by confidence) # - Affected files and blast radius # - Recommended approach (proper fix vs quick fix) # - Risk assessment # 4. Exit plan mode — returns analysis for user decision ExitPlanMode() # 5. User reviews RCA. If "proceed with fix" → continue to Phase 5 (Fix). # If "need more info" → re-enter investigation. ``` Load `Read("${CLAUDE_SKILL_DIR}/rules/evidence-gathering.md")` for detailed workflow adjustments per approach. ## STEP 0b: Select Orchestration Mode Choose **Agent Teams** (mesh) or **Task tool** (star). Load `Read("${CLAUDE_SKILL_DIR}/references/agent-selection.md")` for the selection criteria, cost comparison, and task creation patterns. ## Service Discovery & Visual Inspection When the issue involves a running web app, API, or UI bug, discover services and inspect visually **before** forming hypotheses: ```bash # 1. Discover services via Portless (preferred) portless list 2>/dev/null # api → api.localhost:1355 (port 8080) # app → app.localhost:1355 (port 3000) # 2. Fallback: discover ports manually lsof -iTCP -sTCP:LISTEN -nP | grep -E 'node|python|java' # 3. Visual inspection with agent-browser agent-browser open "http://app.localhost:1355" agent-browser screenshot /tmp/issue-before.png # capture broken state agent-browser console # check for JS errors agent-browser network log # inspect failed API calls agent-browser get text @error-banner # extract error messages ``` Use Portless named URLs (`*.localhost:1355`) in all investigation steps — they're stable, self-documenting, and eliminate port-guessing failures. Install with `npm i -g portless`. ## Workflow Overview | Phase | Activities | Output | |-------|------------|--------| | **1. Understand Issue** | Read GitHub issue details | Problem statement | | **1b. Service Discovery** | Portless list, agent-browser visual inspection | Service URLs, screenshots | | **2. Similar Issue Detection** | Search for related past issues | Related issues list | | **3. Hypothesis Formation** | Form hypotheses with confidence scores | Ranked hypotheses | | **4. Root Cause Analysis** | 5 parallel agents investigate | Confirmed root cause | | **5. Fix Design** | Design approach based on RCA | Fix specification | | **6. Implementation** | Apply fix with tests | Working code | | **7. Validation** | Verify fix resolves issue, screenshot after state | Evidence | | **8. Prevention** | How to prevent recurrence | Prevention plan | | **9. Runbook** | Create/update runbook entry | Runbook | | **10. Lessons Learned** | Capture knowledge | Persisted learnings | | **11. Commit and PR** | Create PR with fix | Merged PR | ### Progressive Output (CC 2.1.76) Output results **incrementally** as each phase completes — don't batch until the PR: | After Phase | Show User | |-------------|-----------| | 1. Understand Issue | Problem statement, affected files | | 3. Hypothesis Formation | Ranked hypotheses with confidence scores | | 4. RCA | Confirmed root cause, evidence chain | | 6. Implementation | Fix description, files changed | | 7. Validation | Test results, before/after behavior | For the proper fix path with 5 parallel RCA agents, output each agent's findings **as they return** — don't wait for all 5. If one agent identifies the root cause with high confidence early, flag it immediately so the user can confirm and skip remaining agents. ### Phase Handoffs (CC 2.1.71) Write handoff JSON after phases 3, 4, 6, 7 to `.claude/chain/`. See `chain-patterns` skill for schema. | After Phase | Handoff File | Key Outputs | |-------------|-------------|-------------| | 3. Hypothesis | `03-hypotheses.json` | Ranked hypotheses with confidence scores | | 4. RCA | `04-rca.json` | Confirmed root cause, evidence, affected files | | 6. Implementation | `06-fix.json` | Fix description, files changed, test plan | | 7. Validation | `07-validation.json` | Test results, coverage delta | ### Worktree-Isolated RCA Agents (CC 2.1.50) Phase 4 agents SHOULD use `isolation: "worktree"` when they need to edit files: ```python Agent(subagent_type="debug-investigator", prompt="Investigate hypothesis: {desc}...", isolation="worktree", run_in_background=true) ``` ### Post-Fix Monitoring (CC 2.1.71) After Phase 11 (commit + PR), schedule CI monitoring: ```python # Guard: Skip cron in headless/CI (CLAUDE_CODE_DISABLE_CRON) # if env CLAUDE_CODE_DISABLE_CRON is set, run a single check instead CronCreate( schedule="*/5 * * * *", prompt="Check CI for PR #{pr_number}: gh pr checks {pr_number} --repo {repo}. All pass → CronDelete this job. Any fail → alert with details." ) ``` ### Worktree Cleanup (CC 2.1.72) If worktree isolation was used in Phase 4, clean up after validation: ```python # After Phase 7 validation passes — exit worktree, keep branch for PR ExitWorktree(action="keep") ``` Every `EnterWorktree` or `isolation: "worktree"` agent must have a matching cleanup. If agents used `isolation: "worktree"`, they handle their own exit — but if the lead entered a worktree in Step 0, it must call `ExitWorktree` before Phase 11 commit. ### Fix Pattern Memory If memory MCP is available (from Step -1 probe), save the fix pattern: ```python if capabilities.memory: mcp__memory__create_entities([{ name: "fix-pattern-{slug}", entityType: "fix-pattern", observations: [root_cause, fix_description, regression_test, issue_ref] }]) ``` > **Full phase details**: Load `Read("${CLAUDE_SKILL_DIR}/references/fix-phases.md")` for bash commands, templates, and procedures for each phase. ## Critical Constraints - **Feature branch MANDATORY** -- NEVER commit directly to main or dev - **Regression test MANDATORY** -- write failing test BEFORE implementing fix - **Prevention required** -- at least one of: automated test, validation rule, or process check - Make minimal, focused changes; DO NOT over-engineer ## CC 2.1.49 Enhancements > Load `Read("${CLAUDE_SKILL_DIR}/references/cc-enhancements.md")` for session resume, task metrics, tool guidance, worktree isolation, and adaptive thinking. ## Rules Quick Reference | Rule | Impact | What It Covers | |------|--------|----------------| | evidence-gathering (load `${CLAUDE_SKILL_DIR}/rules/evidence-gathering.md`) | HIGH | User intent verification, confidence scale, key decisions | | rca-five-whys (load `${CLAUDE_SKILL_DIR}/rules/rca-five-whys.md`) | HIGH | 5 Whys iterative causal analysis | | rca-fishbone (load `${CLAUDE_SKILL_DIR}/rules/rca-fishbone.md`) | MEDIUM | Ishikawa diagram, multi-factor analysis | | rca-fault-tree (load `${CLAUDE_SKILL_DIR}/rules/rca-fault-tree.md`) | MEDIUM | Fault tree analysis, AND/OR gates, critical systems | > **Push notifications (CC 2.1.110+):** Issue-fix flows can span 10–20 min with RCA → fix → test → PR. When the fix lands and tests pass, call `PushNotification` so the user knows the fix is ready for review. Requires Remote Control + "Push when Claude decides" config; fails silently if unavailable. > > ```python > PushNotification( > title="ork:fix-issue complete", > body=f"#{issue_number}: fix pushed, {tests_passing}/{tests_total} tests · PR #{pr_number}" > ) > ``` ## Agent Coordination ### Dispatch envelope (CC 2.1.142+ flags — M146-6 / #1849) When spawning the 5 RCA agents (debug-investigator, code-quality-reviewer, test-generator, etc.) — whether in-session via the `Agent` tool or headless via `claude -p --bare` — set explicit per-role flags so behaviour is deterministic across interactive and CI runs: | Agent role | `--permission-mode` | `--effort` | |---|---|---| | RCA / investigation (`debug-investigator`, `Explore`) | `dontAsk` | `low` — `medium` | | Test reproduction (`test-generator`) | `acceptEdits` | `medium` | | Fix authoring (production code) | `default` (keep user in loop) | `medium` — `high` | | Verification (`code-quality-reviewer`) | `dontAsk` | `low` | **Never** use `bypassPermissions` — fix-issue's RCA phase often touches code paths; the audit trail matters. For headless invocations (e.g. from `/ork:ci-sentinel` or a cron-driven bug sweep), pass the flags explicitly: ```bash claude -p --bare \ --permission-mode dontAsk \ --effort medium \ --max-turns 12 \ "/ork:fix-issue " ``` ### SendMessage (Evidence Sharing) When an RCA agent discovers the root cause, share with the fix agent: ```python SendMessage(to="debug-investigator", message="Root cause: race condition in cache invalidation — see git blame for commit abc123") ``` ### Context Passing All 5 RCA agents receive: issue description, ranked hypotheses, reproduction steps, and affected file paths — not just "investigate issue #N". ### Skill Chain After fix is applied: `TaskCreate(subject="Verify fix", addBlockedBy=[fix_task_id])` → `/ork:verify`. ### Verification Gate Before claiming any fix is complete, apply the 5-step verification gate: `Read("${CLAUDE_PLUGIN_ROOT}/skills/shared/rules/verification-gate.md")`. "Should work now" is not evidence — run the test, read the output, cite the result. ### Response Protocol When reporting fix status, follow `Read("${CLAUDE_PLUGIN_ROOT}/skills/shared/rules/anti-sycophancy.md")` — state findings directly, no performative language. Use the agent status protocol: DONE, DONE_WITH_CONCERNS, BLOCKED, or NEEDS_CONTEXT. ## Related Skills - `ork:commit` - Commit issue fixes - `debug-investigator` - Debug complex issues - `browser-tools` - Visual inspection with agent-browser + Portless - `ork:issue-progress-tracking` - Auto-updates from commits - `ork:remember` - Store lessons learned > **Session recovery (CC 2.1.108+):** After idle periods or interruptions, use `/recap` to restore conversational context alongside checkpoint-resume state. Enabled by default since CC 2.1.110 (even with telemetry disabled). ## Picker fallback (#1795) If the `AskUserQuestion` picker stalls (schema break, not a CC input bug — orchestkit#1795, now guarded by tests/skills/structure/test-askuserquestion-schema.sh), set `ORK_ASK_FALLBACK=text` before starting CC. The `lifecycle/ask-fallback-injector` hook injects a reminder telling the assistant to pose options inline as a numbered list and ask the user to reply with the option number. ## References Load on demand with `Read("${CLAUDE_SKILL_DIR}/references/")`: | File | Content | |------|---------| | `fix-phases.md` | Bash commands, templates, procedures per phase | | `agent-selection.md` | Orchestration mode selection criteria and cost comparison | | `similar-issue-search.md` | Similar issue detection patterns | | `hypothesis-rca.md` | Hypothesis-based root cause analysis | | `agent-teams-rca.md` | Agent Teams RCA workflow | | `prevention-patterns.md` | Recurrence prevention patterns | | `cc-enhancements.md` | CC 2.1.49 session resume, task metrics, adaptive thinking | --- **Version:** 2.4.0 (March 2026) — Rich elicitation with options for fix approach, progressive output for incremental phase results