# Phase-3 run 001 — the chain, in anger (Platonic Schläfli core)

**run_id:** `phase3-run-001` · **date:** 2026-06-11/12 (UTC) · **trial:** thread A — force decompose → prove-subs → recompose to carry a real proof end-to-end.

Machine record: [`phase3-run-001.json`](phase3-run-001.json).

## Exit-metric verdict (read this first)

**The exit metric is: did the decompose → prove-subs → recompose chain carry a real proof end-to-end?**

**MET.** `platonic-schlafli-core` — the Schläfli arithmetic core of the Platonic-solids classification, mathlib-absent — is kernel-verified on `main` (#211, gate-a green, binding obligation regenerated and satisfied). The proof file imports `Unsorry.PlatonicSchlafliCoreS2/S3/S4` and **is literally the composition of its sub-lemmas**:

```lean
theorem platonic_schlafli_pairs (p q : ℕ) (hp : 3 ≤ p) (hq : 3 ≤ q)
    (h : (p : ℚ)⁻¹ + (q : ℚ)⁻¹ > 2⁻¹) :
    (p, q) ∈ ({(3,3),(3,4),(4,3),(3,5),(5,3)} : Finset (ℕ × ℕ)) :=
  platonic_schlafli_pairs_of_bounds p q hp hq
    (platonic_schlafli_fst_lt_six p q hp hq h)
    (platonic_schlafli_snd_lt_six p q hp hq h) h
```

Three interior nodes recomposed the same way first (s1-s1 → #197, s2 → #202, s1 → #207), so the chain was exercised at every level of a depth-3 tree, not just at the root.

**Honesty about the forcing:** the parent was *made* to decompose with a strangled stage-1 budget (`UNSORRY_ATTEMPTS=1`, `UNSORRY_WALL=420`). A frontier model at full budget might have one-shot this statement. What this run tested — and what is now demonstrated — is the *chain*: budget exhaustion → machine-proposed sub-lemmas through Gate B → bottom-up proving → unblock sweeps → recompose through Gate A, three levels deep, 13 goals, 4 recompositions.

## The tree

```
platonic-schlafli-core                       #149 decompose (4 subs) → #209 unblock → #211 PROVED
├── s1  (p<6 bound chain)                    #152 decompose (3 subs) → #203 unblock → #207 PROVED
│   ├── s1-s1 (cast monotonicity)            #167 decompose (2 subs) → #193 unblock → #197 PROVED @xhigh
│   │   ├── s1-s1-s1 nat_cast_le_rat_of_le   #184 PROVED @high
│   │   └── s1-s1-s2 nat_cast_six_eq_rat_six #186 PROVED @high
│   ├── s1-s2 rat_inv_le_inv_six_of_six_le   #166 PROVED @xhigh (see incidents)
│   └── s1-s3 nat_six_le_of_not_lt_six       #187 PROVED @high
├── s2  platonic_schlafli_fst_lt_six         #155 decompose (3 subs) → #200 unblock → #202 PROVED @high
│   ├── s2-s1 nat_inv_le_third_of_three_le   #191 PROVED @high
│   ├── s2-s2 rat_gt_sixth_of_add_gt_half    #195 PROVED @high
│   └── s2-s3 nat_lt_six_of_sixth_lt_inv     #194 PROVED @high
├── s3  platonic_schlafli_snd_lt_six         #201 PROVED @high
└── s4  platonic_schlafli_pairs_of_bounds    #208 PROVED @high  (the braced Finset literal)
```

13/13 proved · 4 decompositions · 4 recompositions · depth 3 · binding held everywhere · 0 soundness incidents.

## The effort ladder, measured (ADR-015)

11 of 13 final proofs closed at the **cheapest rung** (`high`, typically 3–8 minutes claim→PR); 2 needed `xhigh`; the only `max` spends were on the genuinely-stuck pre-decomposition passes. The ladder did exactly what it was designed to do: pay for deep reasoning only where a cheaper pass failed — and it cut quota burn proportionally, which mattered (below).

## Incidents — and the machinery they produced

This run ate three CLI quota outages and surfaced two coordination bugs. Each incident either validated existing guards or produced a new one, all shipped in [v1.3.0](https://github.com/agenticsnz/unsorry/releases/tag/v1.3.0):

1. **Outage 1 (05:48Z)** — every claude call died in ~1 min; the loop, unable to tell "model tried and failed" from "model never ran", demoted all 8 leaves below τ_v (#156–#163) and starved itself. Manual restore #165.
2. **Outage 2 (10:43Z)** — same, 9 more spurious demotes (#168–#181). Manual restore #183 — and **ADR-016**: fast-dead call + failed health probe = infrastructure failure, zero queue writes, exit 3.
3. **#166's silent stall** — a spurious demote raced the open prove PR for the same goal record; the PR went CONFLICTING, and GitHub runs **no checks** on a conflicted PR, so its armed auto-merge waited 8 hours in silence. (Also: gate-a's text lint rejected the word "axiom" in a *doc comment* — kernel checks were green; reworded.) Produced **ADR-017**: claim guard + supervisor + CONFLICTING flag.
4. **Claim races (19:00Z)** — pre-ADR-017 loops double-claimed past `PROVE_CLAIM_CAP=1`; 6 duplicate prove PRs, all closed, identical content where compared (the gates are idempotent — soundness was never at risk; the cost was redundant runs). Root-cause interleave still unreproduced — known issue.
5. **Outage 3 (19:45Z) — the validation.** First outage under the new guards: alpha's call died in 137s, probe failed, claim released, **zero penalties**, exit 3; both supervisors rode ~3.5 h of outage with 1h-capped exponential backoff and resumed unattended. Queue repair PRs required: **none.**
6. **Claim-guard false positive (23:4xZ)** — GitHub search tokenizes punctuation, so leftover duplicate #198 (`…-s2-s2`) matched the *parent's* guard query and both agents skipped the recompose-ready root. The supervisor's CONFLICTING flag surfaced it; exact-prefix matching shipped (#210).

Wall clock start→finish was ~24 h, of which ~12 h was quota outage and ~3.5 h was actual healthy-quota proving for the entire tree.

## What this run proves, and what it doesn't

- **Proved:** the architecture's central claim — a goal too big for one attempt becomes a tree, the tree becomes lemmas, the lemmas recompose into the goal, with the kernel as the only authority at every step. Also: an unattended swarm now survives its operational environment (outages, races, silent CI stalls) without corrupting queue state.
- **Not proved:** that decomposition beats one-shotting *for this statement* (the forcing was artificial), and nothing about the difficulty ceiling — these sub-lemmas were elementary. The next escalation is a target that resists a full-budget frontier model on its own merits.
- A bonus datapoint for the open-swarm thread: a second contributor machine (`mac-158f`) proved `alternating-sum-naturals` (#204) mid-run, through the same gates, with no coordination beyond the claims branch.