--- name: repo-scanning description: | Internal process for the repo-scanner agent. Defines the step-by-step procedure for scanning GitHub repos for evidence that supports or explains bug clusters. Not user-invocable — loaded by the repo-scanner agent through its skills frontmatter. allowed-tools: "Read, Bash(cat:*), Grep, Glob" user-invocable: false version: 0.1.0 author: Jeremy Longshore license: SEE LICENSE IN LICENSE model: inherit effort: medium compatibility: "Designed for Claude Code; internal agent-loaded skill (not user-invocable)" tags: [triage, repo-scanning, evidence, github, internal-agent-skill] --- # Repo Scanning Process Step-by-step procedure for scanning GitHub repos to gather corroborating evidence for bug clusters, assigning confidence tiers to each finding. ## Overview Loaded by the `repo-scanner` agent inside the `x-bug-triage` plugin. Walks each clustered bug through a fixed evidence-gathering pipeline against the up-to-three repos most likely to host the bug: matching open/recent issues, recent commits in the impact window, affected code paths, and recent deploys correlated to the cluster's first_seen timestamp. Each finding is graded against the evidence-tier policy and recorded as cluster evidence. ## Prerequisites - Cluster table populated upstream by the bug-clustering stage - `surface_repo_mapping` configured for every product_surface present in clusters - Triage MCP server reachable for `mcp__triage__search_issues`, `mcp__triage__inspect_recent_commits`, `mcp__triage__inspect_code_paths`, and `mcp__triage__check_recent_deploys` - GitHub access tokens with read scope on the target repos ## Instructions ### Step 1: Select Repos For each cluster: 1. Look up repos from surface_repo_mapping using the cluster's product_surface 2. Cap at top 3 repos per cluster (hard limit — never scan more) 3. If no mapping exists, note it as a warning and skip ### Step 2: Search Issues For each repo, call `mcp__triage__search_issues` with the cluster's symptoms and error_strings: - Match error strings against open/recent issues - Assign evidence tier based on match confidence ### Step 3: Inspect Recent Commits Call `mcp__triage__inspect_recent_commits` for each repo: - 7-day window from current date - Filter by affected paths if known from the cluster's feature_area - Look for commits that touch relevant code paths ### Step 4: Inspect Code Paths Call `mcp__triage__inspect_code_paths` with the cluster's surface and feature_area: - Identify likely affected code paths - Check for recent changes or known fragile areas ### Step 5: Check Recent Deploys Call `mcp__triage__check_recent_deploys` for each repo: - Correlate deploy/release timing with cluster's first_seen timestamp - Recent deploy near first_seen is a stronger signal ### Step 6: Assign Evidence Tiers For each piece of evidence, assign a tier: | Tier | Name | Criteria | |------|------|----------| | 1 | Exact | issue_match at >=0.9 confidence | | 2 | Strong | issue_match >=0.7, recent_commit >=0.8, affected_path >=0.7, recent_deploy >=0.8 | | 3 | Moderate | Lower confidence matches, sibling_failure | | 4 | Weak | external_dependency, heuristic proximity | ### Step 7: Handle Degradation If a repo is inaccessible or an API call fails: 1. Log a degraded scan result with the error reason 2. Continue scanning remaining repos — never abort the whole scan 3. Include degradation warnings in output ## Output - `cluster_evidence` rows tagged with tier (1–4), source kind, repo, and finding link - `cluster_scan_warnings` rows for any repo skipped or degraded during scanning - Updated cluster `evidence_summary` field with per-tier counts ## Error Handling - Missing surface→repo mapping: warn, skip the cluster's scan, proceed with remaining clusters - GitHub API rate limit hit: pause and resume, or degrade to "rate_limited" warning if budget exhausted - Single tool call failure: capture error reason, continue to next step, never abort a multi-step scan - All four signal sources empty for a cluster: record evidence_summary="empty" — downstream stages still run ## Examples Triggered automatically after owner-routing for each clustered bug. Typical output for a 12-cluster batch against 3 repos each: "12 clusters scanned, 7 with Tier 1 evidence, 3 with Tier 2 only, 2 with no evidence (Tier 4 weak only). 1 repo skipped (rate limit)." ## Resources Load evidence tier definitions for proper tier assignment: ``` !cat skills/x-bug-triage/references/evidence-policy.md ```