# Gate A Statement-Binding Red Team — Round 002 (2026-06-10) Round 001 ([gate-a-redteam-001.md](gate-a-redteam-001.md)) closed every *soundness* hole but recorded a **deeper limitation**: no Gate A layer bound a library theorem's *statement* to its claimed canonical goal, so a vacuously-true or mis-stated theorem under a plausible name still passed (sound but meaningless). That gap is exactly what the **statement-binding gate** (Stage D, ADR-011 / SPEC-011-A, landed in [#123](https://github.com/agenticsnz/unsorry/pull/123) / [#128](https://github.com/agenticsnz/unsorry/pull/128)) was built to close. Round 002 attacks that gate directly. Three adversarial agents each opened a **real PR** that keeps the goal's *name* and ships an honest-building library term whose *type is not the goal's statement*, while pointing the index entry at the goal's real canonical sha — the smuggle is the asymmetry between what the index claims and what the library proves. A fourth PR is an honest control. An independent adjudicator verified every PR's CI state from `gh pr checks` / `gh run view --log-failed`. **An attack is "blocked" iff gate-a is RED (cannot merge); the control must be GREEN; a survivor is any attack PR whose gate-a is green.** ## Verdict **The ADR-011 binding gate holds against every vector. 3/3 attacks blocked, 0 survivors; the honest control passes gate-a.** Each attack is caught at the regenerated statement-binding obligation: Gate A lowers the goal's canonical statement to a Lean obligation `_binding_check : := ` and type-checks it. Because the obligation demands the library term *inhabit the goal's type*, a vacuous `True`, a specialised `0 ≤ 0`, and an `autoImplicit`-vacuous `∀ {p} {n}, p → ¬p → 0 ≤ n` all fail to type-check — the meaningfulness gap recorded in round 001 is now gated, not merely reviewed. ## Per-vector results | Vector | PR | gate-a | Caught by | Evidence (the binding obligation rejected the smuggle) | |---|---|---|---|---| | vacuous-true | [#124](https://github.com/agenticsnz/unsorry/pull/124) | ❌ fail | **statement-binding obligation** (built under `--wfail`) | `✖ [575/591] Building Unsorry.NatMulOneThmBinding` → `nat_mul_one_thm has type True but is expected to have type ∀ (n : Nat), n * 1 = n` → Lean exited 1 (run 27300731227) | | weakened-hyp | [#125](https://github.com/agenticsnz/unsorry/pull/125) | ❌ fail | **statement-binding obligation** (built under `--wfail`) | `✖ [595/600] Building Unsorry.NatZeroLeBinding` → `nat_zero_le_thm has type 0 ≤ 0` (expected `∀ (n : Nat), 0 ≤ n`) → build failed (run 27300741087) | | autoimplicit-vacuity | [#127](https://github.com/agenticsnz/unsorry/pull/127) | ❌ fail | **statement-binding obligation** (built under `--wfail`) | `✖ [594/600] Building Unsorry.NatZeroLeBinding` → `nat_zero_le_thm has type ?m.5 → ¬?m.5 → 0 ≤ ?m.6` (expected `∀ (n : Nat), 0 ≤ n`) → build failed (run 27300765050) | **Control:** | Vector | PR | gate-a | Result | Evidence | |---|---|---|---|---| | control-correct | [#126](https://github.com/agenticsnz/unsorry/pull/126) | ✅ pass | honest proof not false-rejected | `nat_mul_one_thm := Nat.mul_one n`; binding obligation regenerated to `∀ (n : Nat), n * 1 = n` and type-checked: `✔ [589/600] Built Unsorry.NatMulOneThmBinding`; gate-a passed (run 27300751229, 5m30s) | All four PRs are **OPEN, unmerged, autoMerge=null** — verified, nothing merged or closed by the adjudicator. ## Why the binding step is the load-bearing layer here Every attack module **built fine on its own** (`✔ Built Unsorry.NatMulOneThm`, `✔ Built Unsorry.NatZeroLe`) — these are honest, sound Lean terms; round-001's sorry/axiom layers have nothing to bite on. The defense is the *binding obligation*, regenerated by `tools.gate_a.check_statement_binding generate` from the goal's canonical AISP statement, then type-checked in the `--wfail` library build: - The obligation `_binding_check : := ` forces the merged term to inhabit **the goal's type**, not merely *a* type. `True`, `0 ≤ 0`, and the autoImplicit metavariable type `?m → ¬?m → 0 ≤ ?m` are all distinct from `∀ (n : Nat), …`, so elaboration fails with a type mismatch on exactly the tampered module — every *other* `*Binding.lean` in all three runs built cleanly, isolating the failure to the smuggle. - The obligation is **CI-ephemeral and regenerated from the goal**, never read from the PR: no `*Binding.lean` is committed on any attack branch (verified via `git ls-tree`). This is the non-bypassable-by-construction property — an attacker cannot ship a forged-passing binding, because Gate A overwrites it from the canonical goal before building. - The index `goal`/`sha` linkage that round 001 called *advisory* is now **load-bearing**: the binding generator keys off the goal the index entry claims, so claiming the goal's real sha is precisely what summons the obligation that catches the mismatched body. The autoimplicit-vacuity vector (#127) is the explicit replay of round 001's PR-#64 survivor class. In round 001 it was only a *soundness*-clean vacuity that `check_library_options` had to catch as a forbidden option; in round 002 the binding obligation rejects it one step earlier on type-mismatch (`?m → ¬?m → 0 ≤ ?m` ≠ `∀ (n : Nat), 0 ≤ n`). `check_library_options` is sequenced after the `--wfail` build and so never ran (the build halted the job first); the agent verified locally that it would also exit 1 on the payload, so the defense is layered, not single-point. ## Caveats (declared, not silently estimated) - **gate-b also failed on all three attacks and on the control.** This is orthogonal to the gate under test. On the control it is a real Gate B invariant, not a binding result: GB005 rejects `goals/nat-mul-one-thm.aisp: status≡proved requires sha as 64 lowercase hex` — the red-team task flipped `status` to `proved` while leaving the goal record's content-address `sha≜∅`, which Gate B's validator rejects (a correctly-merged proof, e.g. `nat-mul-comm-thm.aisp`, also carries the sha). gate-b's state does **not** change any verdict here: the authoritative result for every vector is gate-a's binding/build outcome, and the control's *gate-a* (the gate under test) is green. - The failing CI step is literally named "Library build — zero-warning bar (`--wfail`)", because the regenerated binding obligations are type-checked inside that build rather than in a separate step. The *cause* of each failure is the binding obligation (`*Binding.lean`), as the per-module trace shows. We classify all three as **caught by the statement-binding gate**. ## Reproduce The three attack PRs (#124 vacuous-true, #125 weakened-hyp, #127 autoimplicit-vacuity) and the control (#126 control-correct) are OPEN/unmerged, branches `redteam2/*` retained for audit. Each attack branch carries its tampered `library/Unsorry/*.lean`, an index entry under `library/index/.aisp` claiming the goal's canonical statement, and the flipped `goals/*.aisp`. Re-running gate-a on any attack branch regenerates the binding obligation from the goal and fails the `--wfail` build on the corresponding `*Binding.lean`.