--- name: verify description: "Evidence-based completion verification against all acceptance criteria. Must run before claiming work is done. No claims without evidence." --- ## Session Start Check (required) If `session-start` has not run in this session, run `/atdd-kit:session-start` first. # verify Skill If ARGUMENTS does not contain `--autopilot` (user invoked directly via slash command): - Display message: "This skill is autopilot-only. Use `/atdd-kit:autopilot ` instead." - **STOP.** Do not proceed with execution. If ARGUMENTS contains `--autopilot` (invoked by autopilot): skip this guard silently. Do NOT invoke ship or claim work is complete until ALL ACs have FRESH verification evidence from commands executed in THIS session. No exceptions. ## State Gate (required) 1. **Check `in-progress` label:** `gh issue view --json labels --jq '[.labels[].name] | index("in-progress")'` - Present: proceed. - Missing: STOP. "Issue #N does not have `in-progress` label. Run `atdd` first." 2. **Check implementation branch:** Verify current branch is not `main`. - On `main`: STOP. "Cannot verify on main. Check out the implementation branch first." ## Iron Law NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE. "should work", "looks correct", "probably passes" are NOT evidence. Only command output is evidence. **FRESH means NOW:** Execute in this session, in this message. Do not use cached results, previous session output, or sub-agent reports. ## Spec Authority Check (before Verification Flow) Resolve the authoritative AC source via `bash lib/spec_check.sh`: 1. `slug=$(bash lib/spec_check.sh derive_slug )`; AC6 fallback applies (missing / continuation-fallback / tbd-persona, all prefixed `[spec-warn]`). 2. `status=$(bash lib/spec_check.sh spec_status "$slug")` — tiebreak: - `approved` / `implemented` → **spec is authoritative.** - `draft` / `deprecated` → Issue comment ACs win; emit `[spec-warn] draft:` or `[spec-warn] deprecated:`. 3. Report divergences using `docs/methodology/us-ac-format.md` § Spec ↔ Issue Divergence Matrix (5 patterns: Added / Removed / Modified / Reordered / Status drift). Full matrix, fallback cases, and examples: `docs/guides/spec-reference.md`. ## Verification Flow 1. Read ACs from the authoritative source determined above 2. For each AC, identify the verification command: - Unit test AC → run specific test: `[test command] [test file/class]` - Snapshot test AC → run snapshot tests - E2E test AC → run E2E suite - Build AC → run build command 3. Execute each command 4. Read the full output 5. Map output to AC: PASS with evidence or FAIL with reason ## Verification Checklist ``` ## Verification Results | AC | Test | Result | Evidence | |----|------|--------|----------| | AC1: [name] | [test command] | PASS / FAIL | [output excerpt] | | AC2: [name] | [test command] | PASS / FAIL | [output excerpt] | | ... | ... | ... | ... | ### Additional Verification - [ ] Build passes with zero warnings: [command] -> [result] - [ ] Lint passes: [command] -> [result] - [ ] All existing tests pass: [command] -> [result] ``` ## What Counts as Evidence - Command output showing PASS with test count - Build log showing "Build Succeeded" with 0 warnings - Lint output showing 0 violations - NOT: "I ran the tests and they passed" (no output shown) - NOT: "Should be fine" / "looks good" - NOT: Previous session's test results (must be FRESH) - NOT: Partial test run (must run ALL tests) ## If Any AC Fails - Report which AC failed and why - Do NOT proceed to ship - Return to `atdd` skill to fix ## Status Output **Autopilot mode only** (ARGUMENTS contains `--autopilot`). Skip in standalone mode. Output a `skill-status` fenced code block as the **last element** of your response at every terminal point. Terminal points: - **COMPLETE:** All ACs pass with fresh evidence and ship about to be invoked. - **PENDING:** Waiting for user input. - **BLOCKED:** State Gate failed. - **FAILED:** One or more ACs failed -- return to atdd. ```skill-status SKILL_STATUS: COMPLETE | PENDING | BLOCKED | FAILED PHASE: verify RECOMMENDATION: ``` Examples: ```skill-status SKILL_STATUS: COMPLETE PHASE: verify RECOMMENDATION: All ACs verified. Running ship. ``` ```skill-status SKILL_STATUS: BLOCKED PHASE: verify RECOMMENDATION: Issue #N does not have in-progress label. Run atdd first. ``` ```skill-status SKILL_STATUS: FAILED PHASE: verify RECOMMENDATION: AC3 failed: test output shows assertion error. Return to atdd to fix implementation. ``` See `docs/guides/skill-status-spec.md` for full field definitions, BLOCKED vs FAILED distinction, and autopilot action matrix. ## If All ACs Pass - Post verification results as Issue comment - "All verification items passed. Running `atdd-kit:ship` to finalize the PR." - Invoke `atdd-kit:ship` via the Skill tool. ## Red Flags (STOP) | Thought | Reality | |---------|---------| | "Tests should pass now" | RUN the verification. "Should" is not evidence. | | "I'm confident this works" | Confidence is not evidence. Run the command. | | "Just this once I'll skip lint" | No exceptions. Run every check. | | "The sub-agent said it passed" | Verify independently. Trust no report without fresh output. | | "I already ran these tests earlier" | Earlier is not now. Run FRESH. | | "Partial test run is enough" | Must run ALL tests. Partial results hide regressions. | | "The build is obviously fine" | Run the build command. "Obviously" is a red flag word. | ## Gate Function Pattern ``` IDENTIFY -> what command proves this AC? RUN -> execute the command READ -> read the full output VERIFY -> does output match the claim? ONLY THEN -> state the result ```