# ADR-010: Affinity-Weighted, Gap-Based Goal Selection | Field | Value | |-------|-------| | **Decision ID** | ADR-010 | | **Initiative** | unsorry Phase 2 — open lemmas and target decomposition | | **Proposed By** | unsorry maintainers | | **Date** | 2026-06-10 | | **Status** | Accepted | ## WH(Y) Decision Statement **In the context of** Phase 2, where the swarm must drive verified proofs toward a chosen unformalised target through a growing tree of decomposed sub-lemmas rather than chew through a flat Phase-1 backlog of ~20 independent one-liners, **facing** the fact that the loop as built (SPEC-007-A) selects the first claimable goal in lexicographic id order — a deterministic but blind order that ignores both which tactic and decomposition families actually merge and how far a goal sits from importable, already-proved lemmas, so the swarm would re-attempt patterns that have already failed and pick goals it cannot yet reach, **we decided for** affinity-weighted, gap-based selection as specified in design doc Components §6 and protocol `⟦Γ:Affinity⟧`: rank claimable goals by `(affinity desc, gap asc)` with a lexicographic id tie-break for reproducible trials, where affinity earns `+1` on a merge and `−10` on a failed attempt and a pattern below the viability threshold `τ_v = −5` is skipped and its goal re-queued for re-decomposition, and gap is the count of a goal's unproved dependencies (its distance to the merged library); affinity and usage live on the existing `library/index/.aisp` entries (which already carry `aff` and `use` fields) and/or a numeric field on goal records, updated by the same gated PRs that merge proofs, so the score is eventually-consistent and strictly advisory, **and neglected** pure gap-based selection, pure affinity selection, a central scoring service or database, and keeping the lexicographic Phase-1 order, **to achieve** a self-sharpening work queue that concentrates agent budget on proven approaches that are within reach of the merged library and steers the swarm toward the target instead of scattering it, **accepting that** affinity is a heuristic and not a guarantee — it can settle into a local optimum (mitigated with an affinity floor and occasional exploration), concurrent affinity updates race (acceptable because the score is advisory and eventually-consistent — never trust-bearing for correctness), and trial determinism depends on the lexicographic tie-break. ## Context The design doc's compounding mechanism (Components §6) and the protocol's `⟦Γ:Affinity⟧` block both specify a self-sharpening queue: a merge adds `+1` affinity to the goal pattern and decomposition that produced it, a failed attempt subtracts `10`, patterns below a viability threshold are skipped and re-queued for re-decomposition, and agents prefer goals whose gap to the merged library is smallest. The asymmetry (`+1` vs `−10`) deliberately favours approaches that have already been proven to work. None of this is wired today. SPEC-007-A's selection step is explicit that Phase 0 and Phase 1 have no affinity data and that the loop therefore takes the first claimable goal in lexicographic id order — a choice it justifies only as a reproducible default for trials under deliberate collision pressure, not as a strategy for reaching a target. That default was adequate for the Phase-1 reality. phase1-run-001 ran against a flat backlog of independent `Int`/`Nat` one-liners delegating to mathlib; every goal had a gap of zero (no unproved dependencies) and there was no decomposition tree to navigate, so any selection order reached the same merges. The run landed a merge rate of 0.6, and its dominant throughput problem was not selection quality but fan-out: because claim markers do not persist to `main`, a goal stays claimable until its prove PR actually merges, so under pending auto-merge multiple agents re-selected the same highest-priority unproved goal and produced redundant duplicate PRs (#71 dup of #70, #73 dup of #72). That dup-PR pressure is a property the Phase-2 selection order inherits and, if anything, worsens once decomposition floods the queue with siblings. Phase 2 changes the shape of the queue. The swarm is pointed at a curated, genuinely unformalised target (the recommended first target is the Nicomachus identity `Σk³ = (Σk)²`, which needs two or three sub-lemmas over an existing mathlib `Σk` lemma and so exercises decomposition records and gap selection without re-opening the statement-binding gap) and drives toward it by decomposition. ADR-009 produces the sub-goals — a decomposition record plus typed `Post(A) ⊆ Pre(B)` dependency edges (SPEC-003-C) — and this ADR decides the order in which the resulting tree of claimable goals is worked. With a real dependency tree, gaps are no longer uniformly zero and patterns genuinely differ in their track record, so lexicographic order stops being free: it would send agents at goals whose dependencies are not yet proved and would re-burn budget on decomposition families that have already failed. Two boundaries from the design principles constrain the mechanism. First, the repository is the only infrastructure: there is no scoring server, queue server, or central judge, and a second source of truth that could drift from the repo is disallowed. Second, and decisively, nothing outside the kernel is load-bearing for correctness — the index's usage and affinity metadata are advisory, content addressing protects only the integrity of the fetched artifact, and Gate B can never admit anything to the library. Affinity therefore sits firmly on the advisory side of that line: a stale, raced, or even maliciously-wrong affinity score can only misroute effort, never admit a bad proof. This is the property that makes eventual consistency an acceptable cost rather than a soundness hazard. ## Options Considered ### Option 1: Affinity-weighted, gap-based selection, scores on repo artifacts (Selected) Rank claimable goals by `(affinity desc, gap asc)`, lexicographic id tie-break. Affinity `+1` on merge, `−10` on failed attempt; below `τ_v = −5` a pattern is skipped and its goal re-queued for re-decomposition. Gap is the count of a goal's unproved dependencies (`gap ≜ |deps(g) ∖ proved|`), so goals nearest to importable lemmas rank first and goals with no nearby lemmas are deprioritised until decomposition brings them into range. Scores live on the existing `library/index/.aisp` `aff`/`use` fields and/or a numeric field on goal records, and updates ride on the same Gate-A/Gate-B PRs that merge proofs. Pros: implements design doc §6 and protocol `⟦Γ:Affinity⟧` directly; the two signals are complementary — affinity captures *what works*, gap captures *what is reachable* — so neither failure mode of the pure variants applies; storage reuses fields the index schema already carries, adds no new infrastructure, and keeps a single source of truth in the repo; updates are gated, so a score change rides an already-verified merge and cannot be injected out-of-band; the lexicographic tie-break preserves the reproducible-trial property SPEC-007-A relies on; and because the score is advisory, the race on concurrent updates is benign. Cons: affinity is a heuristic that can get stuck in a local optimum (needs an explicit floor and/or an exploration allowance); eventual consistency means an agent can select on a slightly stale score; and the selection step is now stateful in a way that must be kept deterministic for trials. ### Option 2: Pure gap-based selection (Rejected) Rank only by distance to the merged library, ignoring track record. Rejected because it is blind to which tactic and decomposition families actually merge: it would repeatedly select a near-library goal whose pattern has already failed, re-burning budget on the same dead approach. It discards exactly the `−10` signal the asymmetry exists to capture. ### Option 3: Pure affinity selection (Rejected) Rank only by pattern track record, ignoring reachability. Rejected because the swarm would chase proven patterns into goals whose dependencies are not yet proved — a high-affinity goal with a large gap is not yet workable, and selecting it wastes a claim and an attempt before the sub-lemmas it needs exist. Pure affinity also accelerates the local-optimum risk by doubling down on whatever has worked so far. ### Option 4: Central scoring service / database (Rejected) Hold affinity and usage in a separate service or database that agents query at selection time. Rejected because it violates the repository-only-infrastructure principle and introduces a second source of truth that can drift from the merged library — the precise drift hazard the design's anti-drift discipline exists to prevent. It also adds an availability dependency to the selection step for no soundness gain, since the scores are advisory regardless of where they live. ### Option 5: Keep lexicographic order (Rejected) Retain SPEC-007-A's first-in-id-order selection. Rejected because it is fit for a flat ~20-goal Phase-1 backlog with uniform zero gaps but useless for driving toward a target through a growing sub-lemma tree: it cannot distinguish a reachable, proven-pattern goal from an unreachable or already-failed one, and it does nothing to steer the swarm at the target. Phase 2 is exactly the regime where the order starts to matter. ## Dependencies | Relationship | ADR ID | Title | Notes | |--------------|--------|-------|-------| | Depends On | ADR-007 | Agent Identity and Budgets | Selection runs inside the agent loop and budgets this ADR's `−10`-then-skip behaviour governs; the loop the score steers is the one ADR-007 bounds | | Relates To | ADR-009 | Decomposition Records and Sub-Goal Generation | Decomposition produces the sub-goals and `Post(A) ⊆ Pre(B)` dependency edges whose unproved count this ADR ranks by; the re-queue-on-skip path feeds back into re-decomposition | | Refines | SPEC-007-A | Agent Loop Script | Replaces the lexicographic selection step with the affinity/gap ranking; the lexicographic order is retained only as the tie-break | ## References | Reference ID | Title | Type | Location | |--------------|-------|------|----------| | REF-1 | Design doc §Components 6 (compounding mechanism), §Components 7 (library index) | Design document | ../proposals/distributed-research-swarm-plan.md | | REF-2 | swarm/protocol.aisp ⟦Γ:Affinity⟧ — `+1/−10`, `τ_v`, `select`/`gap` | Contract | ../../swarm/protocol.aisp | | REF-3 | SPEC-003-A — Goal Record Schema (`deps` edges, status, sha) | Specification | specs/SPEC-003-A-Goal-Record-Schema.md | | REF-4 | SPEC-003-C — Translation and Decomposition Records (`Index` `aff`/`use`, edges) | Specification | specs/SPEC-003-C-Translation-and-Decomposition-Records.md | | REF-5 | SPEC-007-A — Agent Loop Script (lexicographic selection as built) | Specification | specs/SPEC-007-A-Agent-Loop-Script.md | | REF-6 | phase1-run-001 — first prove-cycle trial (merge rate 0.6, dup-PR fan-out) | Metrics | ../metrics/phase1-run-001.md | ## Status History | Status | Approver | Date | |--------|----------|------| | Proposed | unsorry maintainers | 2026-06-10 | | Accepted | unsorry maintainers | 2026-06-10 |