---
name: verify-llm-artifacts
description: Confirms or rejects findings from review-llm-artifacts before deletes or risky refactors. Loads review-verification-protocol-style checks per finding. Use after a review run, when the user wants to reduce false positives, before fix-llm-artifacts on dead code, or when validating a full-project scan.
disable-model-invocation: true
---

# Verify LLM Artifacts Findings

Second-pass verification for `.beagle/llm-artifacts-review.json`. The detection pass optimizes for recall; this pass optimizes for **precision** so agents do not remove or “clean” code that is still required.

## When to run

- After `/beagle-core:review-llm-artifacts` (especially full-project scans).
- Before `/beagle-core:fix-llm-artifacts` when findings include **deletions**, **dead code**, or **High** risk.
- Whenever past runs flagged artifacts that should not have been removed.

## Inputs

- **Required:** `.beagle/llm-artifacts-review.json` from a completed review.
- **Optional:** `$ARGUMENTS`: `--priority-only` (verify only `dead_code` findings and any finding with a `fix_action` of `delete`, then stop; see step 2), `--id N` (verify a single finding id).

If the review file is missing, exit with: `Run /beagle-core:review-llm-artifacts first.`

## Prerequisite skills

1. Load `Skill(skill: "beagle-core:review-verification-protocol")` — general anti–false-positive discipline.
2. Load `Skill(skill: "beagle-core:llm-artifacts-detection")` — category criteria for what counts as a real issue.

## Instructions

### Hard gates

Objective pass conditions before you claim verification is done:

1. **Input parse:** The JSON load command in step 1 exits 0 (no traceback). **Pass:** valid JSON on disk at `.beagle/llm-artifacts-review.json`.
2. **Evidence before verdict:** For each finding you adjudicate, you have applied [references/verification-checklist.md](references/verification-checklist.md) for its `category` (or documented why the category is N/A) and recorded the names of the checks you ran in `checks_performed`. **Pass:** no finding has a `status` without at least one checklist-backed check or an explicit N/A note in `notes`.
3. **Output contract:** After writing `.beagle/llm-artifacts-verification.json`, the validate command in step 4 exits 0; `summary` counts equal the number of `results` entries by `status`; every `id` matches the source report. **Pass:** schema-valid JSON and consistent ids/counts.

### 1. Load and validate JSON

```bash
python3 -c "import json; json.load(open('.beagle/llm-artifacts-review.json'))"
```

**Pass:** command exits 0.

Record `git_head` and `scope` from the report. If the working tree no longer matches the recorded `git_head` (optional strict mode: compare it to `git rev-parse HEAD`), warn that the line numbers cited in findings may have drifted.
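
The strict-mode comparison could be sketched as below. This is a minimal illustration, not part of the skill: the function names `head_drifted` and `current_git_head` are hypothetical, and it assumes `git_head` holds a full or abbreviated commit hash.

```python
import subprocess


def head_drifted(recorded_head: str, current_head: str) -> bool:
    """Return True when the recorded commit differs from the current one.

    Abbreviated hashes are accepted: a prefix match counts as the same commit.
    """
    recorded = recorded_head.strip().lower()
    current = current_head.strip().lower()
    if not recorded or not current:
        return True  # nothing to compare against, so assume drift
    return not (current.startswith(recorded) or recorded.startswith(current))


def current_git_head() -> str:
    """Ask git for the commit hash of HEAD in the current directory."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

A caller would pass the report's `git_head` and the value of `current_git_head()` to `head_drifted` and print the drift warning when it returns `True`. Comparing commits does not detect uncommitted edits, so a matching hash is a weaker guarantee than a clean working tree.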

### 2. Order findings

Default order:

1. `category == "dead_code"` or `fix_action == "delete"` or `risk == "High"`
2. Remaining findings by `(risk descending, id ascending)`

With `--priority-only`, stop after processing every `dead_code` finding and every finding with `fix_action: delete`; still write the full output file for the findings you processed.
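
One way to express this ordering is sketched below. It rests on assumptions: the risk labels `Medium` and `Low` are guesses (only `High` appears in this document), `order_findings` is a hypothetical helper, and findings inside the first tier keep their report order because no tie-break is specified for them.

```python
# Assumed risk labels; only "High" is named in this skill.
RISK_RANK = {"High": 3, "Medium": 2, "Low": 1}


def is_delete_or_dead(finding: dict) -> bool:
    """True for the findings that --priority-only covers."""
    return (
        finding.get("category") == "dead_code"
        or finding.get("fix_action") == "delete"
    )


def is_priority(finding: dict) -> bool:
    """True for tier 1 of the default order."""
    return is_delete_or_dead(finding) or finding.get("risk") == "High"


def order_findings(findings: list[dict], priority_only: bool = False) -> list[dict]:
    """Return findings in verification order without mutating the input."""
    if priority_only:
        # Stop after dead_code and delete findings; nothing else is processed.
        return [f for f in findings if is_delete_or_dead(f)]
    first = [f for f in findings if is_priority(f)]  # report order kept
    rest = [f for f in findings if not is_priority(f)]
    # Tier 2: risk descending, then id ascending; unknown risk sorts last.
    rest.sort(key=lambda f: (-RISK_RANK.get(f.get("risk"), 0), f["id"]))
    return first + rest
```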

### 3. Verify each finding

For each finding, follow [references/verification-checklist.md](references/verification-checklist.md).

**Minimum evidence per finding:**

- Read the **file** at the cited location and enough context to judge (parent symbol, imports).
- For unused/dead claims: **search** the repo (symbols, exports, string hooks) unless the issue is purely stylistic with no removal.

**Pass:** `checks_performed` lists only checks you actually ran (e.g. `read_symbol`, `ripgrep_symbol`); `notes` cite the decisive observation.
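
The repository search for unused or dead claims could look like the pure-Python sketch below, for cases where a tool such as ripgrep is not at hand. `find_references` and its `skip_dirs` default are hypothetical. A plain text search cannot see dynamic lookups or usage from other repositories, so zero hits is evidence rather than proof.

```python
import re
from pathlib import Path


def find_references(
    root: str,
    symbol: str,
    skip_dirs: tuple = (".git", "node_modules", ".beagle"),
) -> list[tuple[str, int]]:
    """Return (path, line number) pairs where `symbol` appears as a whole word."""
    pattern = re.compile(rf"\b{re.escape(symbol)}\b")
    base = Path(root)
    hits = []
    for path in sorted(base.rglob("*")):
        if not path.is_file():
            continue
        # Skip files that live under an excluded directory below the root.
        if any(part in skip_dirs for part in path.relative_to(base).parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), lineno))
    return hits
```

If the only hit is the definition site cited in the finding, that supports the dead-code claim; any other hit is worth reading before choosing a status.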

Assign one status:

| `status` | Meaning |
|----------|---------|
| `confirmed_issue` | The finding is valid; acting on it is appropriate. |
| `false_positive` | The finding should be discarded; do not auto-fix. |
| `inconclusive` | Needs human or product context; `fix-llm-artifacts` treats it like a risky fix. |

Set `confidence`: `high` | `medium` | `low` based on how direct the evidence was.

### 4. Write output

Create `.beagle` if needed. Write **`.beagle/llm-artifacts-verification.json`**:

```json
{
  "version": "1.0.0",
  "created_at": "2026-04-19T12:00:00Z",
  "source_report": ".beagle/llm-artifacts-review.json",
  "source_git_head": "<from review>",
  "review_scope": "all|changed",
  "results": [
    {
      "id": 1,
      "status": "confirmed_issue|false_positive|inconclusive",
      "confidence": "high|medium|low",
      "checks_performed": ["read_symbol", "ripgrep_symbol", "export_trace"],
      "notes": "1-3 sentences of evidence"
    }
  ],
  "summary": {
    "confirmed_issue": 0,
    "false_positive": 0,
    "inconclusive": 0
  }
}
```

Validate the file you wrote:

```bash
python3 -c "import json; json.load(open('.beagle/llm-artifacts-verification.json'))"
```

**Pass:** command exits 0; re-open the file and confirm `summary` matches `results` (count each `status`).
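
The parse check above only proves the file is valid JSON. The remaining gate conditions (ids, summary counts, evidence) could be checked with a sketch like the one below. `contract_errors` is a hypothetical helper; it takes both files already parsed into dicts and assumes the review report lists its entries under `findings`.

```python
from collections import Counter

STATUSES = ("confirmed_issue", "false_positive", "inconclusive")


def contract_errors(review: dict, verification: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the gates pass."""
    errors = []
    source_ids = {f["id"] for f in review.get("findings", [])}
    results = verification.get("results", [])

    for r in results:
        rid = r.get("id")
        if rid not in source_ids:
            errors.append(f"id {rid} is not in the source report")
        if r.get("status") not in STATUSES:
            errors.append(f"id {rid} has unknown status {r.get('status')!r}")
        # Evidence gate: at least one check, or a note explaining the N/A.
        if not r.get("checks_performed") and not str(r.get("notes", "")).strip():
            errors.append(f"id {rid} has no checks_performed and no notes")

    counts = Counter(r.get("status") for r in results)
    summary = verification.get("summary", {})
    for status in STATUSES:
        if summary.get(status) != counts.get(status, 0):
            errors.append(
                f"summary.{status} is {summary.get(status)}, "
                f"results contain {counts.get(status, 0)}"
            )
    return errors
```

Load both files with `json.load` and treat any non-empty return value as a failed gate.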

### 5. Summarize for the user

Print a short markdown table: id, category, original one-line description, **verdict**, confidence.

End with:

- Counts of confirmed vs false positive vs inconclusive.
- Recommendation: run `fix-llm-artifacts` only on `confirmed_issue` findings (see that skill for how it behaves when the verification file is present).
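
The table and closing counts could be produced with a sketch like the following. The `description` key is an assumption about the review report's field name for the one-line description, and `summary_table` is a hypothetical helper.

```python
def summary_table(findings: list[dict], results: list[dict]) -> str:
    """Render one markdown row per verification result."""
    by_id = {f["id"]: f for f in findings}
    lines = [
        "| id | category | description | verdict | confidence |",
        "|----|----------|-------------|---------|------------|",
    ]
    for r in results:
        finding = by_id.get(r["id"], {})
        # Escape pipes so a description cannot break the table layout.
        desc = str(finding.get("description", "")).replace("|", "\\|")
        lines.append(
            f"| {r['id']} | {finding.get('category', '')} | {desc} "
            f"| **{r['status']}** | {r['confidence']} |"
        )
    return "\n".join(lines)
```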

## Rules

- Do **not** invent new issues; only adjudicate existing `findings[]` entries.
- Prefer `inconclusive` over `confirmed_issue` when removal could break dynamic or cross-repo usage.
- Preserve finding `id` values exactly as in the source report.

## Integration

- **`fix-llm-artifacts`:** When `.beagle/llm-artifacts-verification.json` exists, that skill uses it to skip `false_positive` ids and to treat `inconclusive` findings like risky fixes.
- **`fix_action` custody:** The `fix_action` field (`refactor`/`delete`/`simplify`/`extract`) is emitted by `review-llm-artifacts` and consumed by `fix-llm-artifacts` as a risk gate; verification carries it through unchanged and does **not** re-validate it.
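
How a consumer might apply the verification file is sketched below. This is an illustration of the contract described above, not the actual logic of `fix-llm-artifacts`: `partition_findings` and its bucket names are hypothetical, and the separate `unverified` bucket for findings with no result (for example after `--priority-only`) is an assumption.

```python
def partition_findings(findings: list[dict], verification: dict) -> dict:
    """Group review findings by their verification status."""
    status_by_id = {r["id"]: r["status"] for r in verification.get("results", [])}
    buckets = {"confirmed": [], "risky": [], "skipped": [], "unverified": []}
    for finding in findings:
        status = status_by_id.get(finding["id"])
        if status == "confirmed_issue":
            buckets["confirmed"].append(finding)
        elif status == "inconclusive":
            buckets["risky"].append(finding)  # handled like a risky fix
        elif status == "false_positive":
            buckets["skipped"].append(finding)  # never auto-fixed
        else:
            buckets["unverified"].append(finding)  # no verdict recorded
    return buckets
```

Each finding keeps its original `fix_action`, so the risk gate in the fix step still sees the value emitted by the review.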