# The deliberation council _Why ContextDevKit convenes a relevant, cheaply-briefed panel of experts before a high-stakes choice — automatically, at the two moments it matters — and why that panel still never gets to write the decision._ ## The problem this solves ContextDevKit has always had a place to record *decisions* (ADRs) and their trade-offs. What it lacked initially was an artifact for the **adversarial reasoning that should precede a decision** — the argument where competing positions are weighed before one is chosen. Single-pass reasoning has a known failure mode: it anchors on the first plausible approach and then rationalizes it. The `/debate` command answered that with a multi-agent *deliberation* — independent voices argue, a distinct synthesizer converges, and the result feeds an ADR as pre-decision working material. That design was correct but under-used. Three structural weaknesses kept the deliberation from firing where it was needed: 1. **Manual-only.** A developer had to *remember* to run `/debate` before a hard call. In practice it was almost never invoked at the two moments it was built for — opening a new feature and recording an architectural decision — so high-stakes choices were still made by a single generalist voice with no adversarial check and no recorded rationale. 2. **A fixed, generic roster.** Every debate convened the same three unnamed voices regardless of the question. A database-migration debate and a UX-copy debate got the identical council, diluting relevance. 3. **Uniform model cost.** Every voice ran at the same (premium) tier, so *gathering evidence* — cheap, parallelizable, and exactly the work that makes a deliberation worth convening — cost the same as the reasoning itself. That tax discouraged the very legwork the mechanism depends on. The deliberation council addresses all three without abandoning the original `/debate` contract. ## Three moves ### 1. Auto-invoke at the two designed moments Two new autonomy areas — `feature-deliberation` and `decision-deliberation` — resolve to `debate` mode at **grade ≥ 3**, reusing the exact shape and fail-safe that the `/ship` checkpoint already had. The mode table reads `['manual', 'manual', 'debate', 'debate']`: at grades 1–2 nothing auto-fires, at grade 3 (the default install posture) and above the council convenes on its own. Starting a new feature (the `/workflow` spec phase) or recording a decision (`/new-adr`) now pulls a council together without anyone remembering to. **The crucial nuance — state it plainly: the ADR *write* stays `manual` at every grade.** The separate `adr` area is `['manual', 'manual', 'manual', 'manual']` — an unmovable floor. The deliberation **precedes** the decision; it never **authorizes** it. The council argues the forces and pre-fills the ADR's Context; a human still signs the verdict. Auto-invocation buys you the adversarial pass for free, not the commitment. This keeps the platform's founding posture intact: governance is *enforced* by deterministic triggers, but the irreversible act — the recorded decision — remains a human signature, exactly as it was before ADR-0070. ### 2. A dynamic, named, deterministic council `deliberation-council.mjs` replaces the flat count of anonymous positions with a roster that *fits the question* — and computes it deterministically, because the kit distrusts AI goodwill (the same question must always yield the same council, so roster selection is computed, not vibed). The flow: - **Classify the question into advisor lanes.** A keyword pass maps the framed question onto the six advisor lanes (`architecture`, `security`, `features`, `deepen`, `ux`, `growth`). The `architecture` lane is *always* seated — the architect is in every council, the protected spine. - **Map each lane to a named specialist agent** through the project's `advisor.lanes` owners (architect / security / ux-designer / product-owner / …). A database-migration question seats the architect and security; a UX-copy question seats the architect and ux-designer. Relevance by construction. - **Scale the size to the question** via `clamp(matchedLanes, council.min, council.max)` (defaults 3–6). Too few lanes matched? Pad up to `min` from a fixed perspective pool so there are always at least `min` independent voices. Too many? Trim past `max`, keeping the highest-priority lanes — architecture and security survive the cut. Turning `council.autoSelect` off restores the original ADR-0035 behaviour: a flat roster of `N` generic, unnamed positions. The new roster is an *upgrade* layered over a preserved fallback, not a replacement that strands old configs. ### 3. A tiered research swarm The third move attacks the cost tax directly, using the cost-tiered routing policy (*expensive models think, cheap models execute* — see [model-tier routing study](model-tier-routing-study.md)). A deliberation now runs in phases on deliberately different tiers: - **Scouts gather evidence on the `fast` tier (Haiku).** Cheap, parallelizable agents assemble the evidence pack *before* anyone argues. This is the legwork that makes a deliberation worth convening — and now it is cheap enough to always do. - **Voices reason on the `reasoning` tier (Opus).** The independent positions argue the briefed question. - **Verification runs on the `powerful` tier (Sonnet)** for hard claims that need checking. The contract — and the line ADR-0070 will not cross — is that **the voices and the synthesizer are never downgraded.** The specialist agent supplies the *perspective* (its lane), never a cheaper argument. Only the scout (evidence) and verify phases run on cheaper tiers. Models resolve through `model-policy.mjs`; if the routing policy is absent (low level or a host gap) the plan still returns its tiers with `model: null` — it degrades, it never throws. ## The mental model Think of the council as **convening the right experts, cheaply briefed, before a high-stakes choice.** Cheap scouts do the reading and lay out the evidence; the expensive specialists — chosen for *this* question, not a generic three — argue it; a separate synthesizer converges (or records an honest `unresolved` tension). The whole exchange is written down as a deliberation artifact that flows *into* the ADR's Context. The council is the briefing and the debate; the ADR is the verdict; the human is the signature. None of those three roles collapses into another — that separation is the entire point. This complements, rather than competes with, the kit's other multi-agent machinery. Hierarchical fan-out (the QA and forge orchestrators, the [squad pipeline](../SQUAD-PIPELINE-FORMAT.md)) routes work to specialists and consolidates results. Deliberation is the kit's only *deliberative* orchestration: independent voices arguing opposing positions toward a reasoned consensus. The tier discipline it borrows is documented in the [model-tier routing study](model-tier-routing-study.md); the parallel-workstream cousin is the [swarm feasibility study](swarm-feasibility-study.md). The council is one of three governance systems that landed together — the others enforce the [workflow journey](workflow-governance.md) and route work through [active squads](active-squads.md). ## Trade-offs Nothing here is free, and ADR-0070 is explicit about the costs: - **Added latency at the two gates.** Auto-invoking a council adds time to `/new-adr` and to the `/workflow` spec phase. *Mitigation:* the tiered-research design keeps the expensive phase small (cheap scouts do the bulk), and the gate is skippable below grade 3 — drop to grade 2 and it never auto-fires. - **More orchestration surface.** A new council script, three new config blocks (`council`, `autoInvoke`, `research`), and four rewired entry points (`/debate`, `/new-adr`, `/workflow`, `/ship`). *Mitigation:* the council core is pure and zero-dependency (keyword classification + a clamp), it degrades to `model: null` rather than throwing, and the auto-invoke wiring + deterministic roster scaling + tier floors are pinned by `integration-test-deliberation.mjs` (20 checks) plus selfcheck gate cells — so a regression fails the build. - **Ceremony on trivial decisions would be net-negative.** A council convened for a one-line change manufactures false confidence and burns calls. *Mitigation:* the triggers are narrow and deterministic — only the new-feature and new-decision moments — and the nudge hook only ever *suggests*; it never blocks an edit (immutable rule 2). ## The floor, restated Two invariants survive every move above, and they are the reason the council is safe to auto-invoke: 1. **The ADR write stays human at every grade.** Deliberation precedes the decision; it never authorizes it. 2. **Voices never downgrade below their tier.** The cheap tiers gather and verify; the reasoning never goes on sale. Everything else — when the council fires, who sits on it, how big it is, how cheaply it is briefed — is computed deterministically from the question and the config, never improvised.