# MISHKAN — Build Decisions Decisions made at Phase 0 that govern the entire build. Each is locked unless explicitly revisited with a dated entry below. --- ## D-001 — Cognee deployment: Local Docker **Decision:** Cognee runs as a containerised service under `~/.claude/mishkan/cognee/`, managed by Docker Compose. **Rationale:** Aligns with the existing infrastructure discipline — everything else runs through Docker Compose with multi-environment overlays, SOPS-managed secrets, and hardening overlays. The knowledge graph stays local; no external account or billing surface. Fastest install. **Implications:** - `.mcp.json` points the Cognee MCP at the local containerised endpoint. - Secrets (DB password, API keys if any) managed via SOPS, never plaintext. - A hardening overlay is applied on every container recreate. - Backups are local; no cloud egress. --- ## D-002 — Model backend: Claude Code models only **Decision:** Every agent runs on a Claude model tier. There is no local model runtime and no local-model MCP wrapper. **Rationale:** The target is Claude Code's native models. Introducing a local runtime (Ollama / LM Studio / Docker Model Runner / llama.cpp) would add a whole subsystem — an MCP wrapper, runtime health-checks, fallback logic, per-agent runtime selection — for no benefit given the target. Removing it simplifies the build materially. **Implications:** - Tier *values* the hook accepts: Fable, Opus, Sonnet, Haiku — but **Fable is dormant** (0 agents) since its 2026-06-12 suspension (amendment below). Live distribution is Opus 9 · Sonnet 22 · Haiku 14. - **Fable (0 — dormant):** briefly the producing specialists of Mishmar + Migdal (2026-06-11 amendment), reverted to Sonnet on 2026-06-12 (suspension amendment). The value stays valid so the tier can be re-enabled if access is restored. - **Opus (9):** Nehemiah, Bezalel, all Team Leads, Jehonathan. - **Sonnet (22):** every agent that **writes code/config into the codebase** (precision matters on Y4NN's code) + senior specialists + research clarify/formulate/research. Includes implementation specialists Hizkiah, Salma, Hiram, Obed, Asaph, Palal, Meremoth, Hanun — plus Nathan, Zadok, Shallum, Ira, Benaiah, Joab, Hushai, Oholiab, Meshullam, Seraiah, Joah, Jakin, Ezra, Caleb. - **Haiku (14):** agents that do **not** write code — QA (Uriah, Jahaziel), all Team Reporters, pure advisors (Deborah, Rehum), Sefer team-layer docs (Shevna), research summarise/evaluate/report (Shaphan, Shemaiah, Baruch). **Amendment 2026-05-27:** original split put implementation specialists on Haiku for cost. Revised on Y4NN's preference — Sonnet writes his code more precisely. Haiku retained only where no code is written (evaluate/collect/advise). **Amendment 2026-06-11 (four-tier — Fable 5 for Migdal + Mishmar specialists):** the eight producing specialists of the infrastructure (Migdal) and security (Mishmar) teams move from Sonnet to **Fable 5** (`claude-fable-5`). Fable 5 is positioned for complex, long-horizon work — threat modelling, infra topology, hardening — which is exactly these agents' load. Note the selector here is **reasoning load, not code-writing**: that is why Hushai (a security *advisor* that writes no code) qualifies for Fable on the strength of its analysis depth, whereas the Sonnet tier's selector is *code-writing precision on Y4NN's code* and the Haiku tier's is *no reasoning or code at all* (collect/evaluate/advise-lite). The two "advisor" agents split on exactly this axis — Hushai (reasoning-heavy security counsel) → Fable; Rehum (health/SRE advice-lite) → Haiku. Scope is deliberate: team **leads** (Phinehas, Eliashib) stay Opus (orchestration/judgement), and the **reporters** (Maaseiah, Zaccur) stay Haiku (collect only). This amends "three tiers only" to four; the "Claude Code models only" decision is unchanged (Fable 5 is a Claude model). Cost: Fable 5 is **$10 / $50 per MTok** — above Opus-tier — and is covered by plan limits until **2026-06-22**, then usage credits (an accepted cost). Verified before rollout: `fable` is a valid bare subagent alias resolving to `claude-fable-5` (confirmed against the API-recorded model field in a subagent transcript). The `model-route.py` hook's `VALID` set, the `observability-log` schema `model_tier` enum, and the `usage_parser.py` price table were extended for the new tier in the same change. **Amendment 2026-06-12 (Fable suspended — reverted to Sonnet):** less than a day after the four-tier amendment landed, Anthropic disabled Claude Fable 5 (and Mythos 5) for **all customers** under a US export-control directive (anthropic.com/news/fable-mythos-access, 2026-06-12 17:21 ET) — `claude-fable-5` now hard-errors on spawn (verified: a subagent routed to it returns "model … may not exist or you may not have access", 0 tokens). The eight Migdal+Mishmar specialists are therefore **reverted to Sonnet** (frontmatter + `model-routing.yaml`), restoring the pre-amendment distribution (Opus 9 · Sonnet 22 · Haiku 14). The four-tier **mechanism is kept dormant** — `fable` stays in the hook's `VALID` set, the telemetry enum, and the price table — so re-enabling is a one-line routing change if Anthropic restores access ("working to restore access as soon as possible"). This episode is the concrete motivation for D-017 (user-editable dynamic tier routing): model availability can change abruptly, and routing should adapt without a code change + reinstall. - Tier declared per-agent in frontmatter `model:` field. - Overridable centrally via `~/.claude/mishkan/config/model-routing.yaml`. - Cost discipline lives entirely in tier assignment + prompt caching + Cognee offloading. The observability loop surfaces expensive agents. **Supersedes:** the original design §16 model assignment matrix, which assumed local models for the fast tier. Local tiers are replaced by Haiku. --- ## D-003 — Install scope: User + Project hierarchy **Decision:** `~/.claude/` carries permanent standards, agents, hooks, and rules common across all work. A per-project `.claude/` carries project-specific state, seeded by `/mishkan-init`. **Rationale:** Matches the design doc's CLAUDE.md hierarchy. The user-level layer is always warm and travels every project; the project layer holds sprint state, the project CLAUDE.md, and project-scoped settings. **Implications:** - All MISHKAN artifacts live under `~/.claude/mishkan/` to avoid clobbering the existing user-level surface (5 agents, 8 commands, 152 skills, settings, command-validator script). - `~/.claude/CLAUDE.md` and `~/.claude/rules/y4nn-standards.md` are introduced by MISHKAN (neither existed before). - Commands are symlinked into `~/.claude/commands/` only after confirming no name collision. - `/mishkan-init` seeds the project layer: `./CLAUDE.md`, `docs/`, project `.claude/settings.json`, Cognee project namespace. --- ## D-004 — Existing user-level surface is preserved, never overwritten **Decision:** MISHKAN extends `~/.claude/`; it does not replace anything. **Preserved as-is:** any pre-existing user-level `~/.claude/agents/*.md`, `~/.claude/commands/*.md`, `~/.claude/skills/*`, `~/.claude/settings.local.json`, and any existing helper scripts (e.g. a command-validator). The installer never overwrites or removes files it did not place. **Extended:** `~/.claude/settings.json` gains the MISHKAN hook registrations. If a pre-existing `Bash` PreToolUse validator is present, the new security hook chains alongside it rather than replacing it. **Leveraged:** if the project provides its own ops specialist agent, the Migdal and Mishmar teams reference it for environment-specific operational knowledge. --- ## D-005 — MISHKAN is a distributable npm package (added 2026-05-27) **Decision:** MISHKAN ships as an npm package (`mishkan-harness`) installed via a **dependency-free `npx` one-shot installer** (`npx mishkan-harness install`). The installer **copies** the payload into `~/.claude/mishkan` (not symlinked to node_modules), creates relative symlinks for agent/skill/command discovery, and merges hooks into `~/.claude/settings.json` with paths resolved from `os.homedir()` at install time. **Rationale:** the harness must be portable and shareable, not bound to one machine. The earlier hand-placed build hardcoded absolute paths (`/home/ogu/...`) in settings.json and `projects.yaml`. The installer removes all machine-binding. **Implications:** - **Zero npm dependencies** in the installer — a security-first harness must not carry supply-chain risk, and Mishmar's own rules flag postinstall scripts, so a no-deps `npx` installer is the only consistent choice. - Package layout: `bin/mishkan.js` (installer), `payload/mishkan/` (→ `~/.claude/mishkan`), `payload/user/` (→ user-level `CLAUDE.md` + `rules/`, placed only if absent), `payload/install/settings.hooks.json` (hook fragment with a `{{MISHKAN}}` placeholder resolved at install), `docs/engineer/` (canonical profile). - Install is **idempotent** and **non-clobbering**: never overwrites a user's `CLAUDE.md`, `rules/y4nn-standards.md`, or any real (non-symlink) agent/command. - `uninstall` removes the harness, its symlinks, and its hooks while preserving user-level files (`--purge` to also remove the user rule). - `projects.yaml` is **discovery-based** (env / workspace-root / git-repo scan), carrying no hardcoded paths. - Verified: full install→status→uninstall cycle in a throwaway `$HOME` with zero source-machine path leakage. ## D-006 — Engineer profile is canonical, replaceable, and propagated (added 2026-05-27) **Decision:** the engineer the harness serves is described in `docs/engineer/profile.md` — a single, replaceable source of truth. The runtime load path is the generic `~/.claude/mishkan/profile.md` (not a person-specific filename), so any engineer can adopt the harness by replacing one file. **Propagation is two-layer:** `scripts/sync-profile.sh` does the mechanical copy + reference/drift audit; **Seraiah** (Sefer org-layer agent) owns the semantic re-derivation of digests drawn from the profile (the user-level `CLAUDE.md` non-negotiables, engineering-identity docs) when it materially changes. ## D-007 — Curated library is a separate cognee store from project knowledge (added 2026-05-28) **Decision:** the cross-project **curated library** lives in its own isolated cognee store (`mishkan-curated-*`, MCP alias `cognee-curated`, port 7730), physically separate from the **work** store that holds per-project knowledge (`mishkan-cognee-*`, MCP alias `cognee`, port 7777). A project's `.mcp.json` declares both: `cognee` (read+write its own graph) and `cognee-curated` (read-only reference). The per-client memory dataset (`_memory`, e.g. `claude_code_memory`) is a legitimate part of the work store — never pruned. **Why physical, not logical:** project ingestion pulls in code and data that can include PII (the aiobi-mail test ingested real Gmail addresses), and with `ENABLE_BACKEND_ACCESS_CONTROL=false` all datasets share one Neo4j graph — so logical dataset tags alone leave them commingled in one store and one UI. Neo4j Community allows only one database per instance, so true graph isolation requires a separate Neo4j container. The curated box reuses the shared Ollama and the shared Postgres *server* (own database `curated_db`) to keep the cost to one small extra Neo4j. The curated library is small and regenerable (`seed-curated-library.sh` → the curated box), so the split is cheap to maintain. **Embeddings caveat (inherited):** the curated box embeds via **local Ollama** — bulk seeding bursts embedding calls and cloud free-tier embeddings 429 (RESOURCE_EXHAUSTED). *(Superseded on the work-store axis by D-012: per-project physical isolation. The `cognee` MCP alias now points to a per-project Ladybug store, not port `:7777`. Port `:7777` is repurposed as `cognee-memory`, holding only `claude_code_memory`.)* ## D-008 — Three-layer memory epistemology: structure / project semantics / curated cross-project (added 2026-06-05) **Decision:** MISHKAN's knowledge surface is split into **three physically separate stores**, each owning one epistemic question, with no overlap of write authority: | Store | Question it answers | Source of truth | Write authority | |---|---|---|---| | **Graphify** (per-project, local artifacts) | *How is the code structured?* — call graphs, dependents, god nodes, schema-to-code edges, file-to-symbol provenance | tree-sitter AST + optional LLM enrichment, deterministic, re-derivable from the repo | the build (`graphify` CLI), no agent writes by hand | | **Cognee work** (`:7777`, `cognee` MCP) | *Why does this code exist? what did we decide? what did we learn on this project?* | curated project artifacts (PRD, SRS, ADRs, sprint reports, agent learnings) ingested via `mishkan-ingest` | agents, gated by `mishkan: ingest` frontmatter | | **Cognee curated** (`:7730`, `cognee-curated` MCP, read-only from projects) | *What have we learned across all projects?* | promoted cross-harness knowledge, reference library | `/sprint-close` + `seed-curated-library.sh`, only | Graphify v0.8.31 (MIT, [github.com/safishamsi/graphify](https://github.com/safishamsi/graphify)) joins the stack as a **third store**, not as a Cognee feeder, not as a Cognee replacement. ### Force-tension **What pushes toward a third layer.** Structural questions ("who calls `apply_overlay`?", "what depends on `models.User`?", "where are the god nodes?", "what tables does this service read?") are answered today by repeated grep and file reads, which is exactly the failure mode the engineer profile names: token waste, context bloat, and answers that drift because they reconstruct structure from prose instead of reading it from the AST. Graphify gives a deterministic graph, locally extracted, re-derivable from the repo at any time — the *opposite* epistemic shape from Cognee work, which is a curated, lossy, LLM-summarised narrative of decisions. Conflating the two in one store has already been rejected once (D-007 separated curated from work for the same reason: different write discipline, different trust shape). **What pushes back.** Three stores is one more runtime, one more failure mode, one more place for an agent to look in the wrong order. Graphify is a young project (v0.8.x, ~10 weeks old at this writing) with a credible breaking-change risk. Cognee already has a code-extraction notion (`codify`); adding Graphify risks duplicating capability the harness already paid for. The boundary between "structure" and "semantics" is not always crisp ("why does this function call `Y`?" is both). The tension resolves toward the split: the *write discipline* is what matters, and the three stores have three different write disciplines (deterministic build output / curated agent ingestion / cross-harness promotion). Collapsing any two of them collapses one discipline into another and loses the property D-007 was introduced to protect. ### Alternatives considered 1. **Keep only the two Cognee stores (status quo).** *Bad.* Structural questions stay grep-shaped — high token cost per question, answers that miss transitive edges and god-node patterns by construction. Cognee `codify` produces an LLM-summarised view of code, not a deterministic call graph: it is the wrong tool for "who depends on X" and was never meant to be that tool. Leaves the engineer's documented complaint about token waste unaddressed. 2. **Add Graphify as a third store with a hard write-discipline boundary.** *Chosen.* Each store answers one question, each has one writer, agents consult them in a documented order (Graphify first for structure, Cognee work for rationale, Cognee curated for cross-project precedent). Mirrors D-007's logic: physical separation when write discipline differs. 3. **Make Graphify a pre-processor / feeder into Cognee work.** Rejected. Graphify's value is that the graph is *deterministic and re-derivable* — pushing it through Cognee's LLM extraction layer destroys both properties. The output would be a lossy paraphrase of a graph that was exact, ingested into a store optimised for semantics not topology. It would also entangle Graphify's update cycle (per-commit) with Cognee's ingestion cycle (curated, sparse), forcing one to the cadence of the other. 4. **Substitute Cognee with Graphify entirely.** *Bad — door explicitly closed.* Graphify does not hold decisions, rationale, sprint reports, agent learnings, or the curated cross-project library. Replacing Cognee with Graphify would delete the "why" layer to gain a "how" layer. The two are orthogonal, not competing. 5. **Defer Graphify and re-evaluate after a measurement POC.** Rejected as the primary path because the write-discipline argument is independent of any token-saving number: even if Graphify saved zero tokens, the deterministic structural graph still belongs in its own store. A measurement POC is still useful but is out of scope here (see below). ### Invariants of boundary (the routing matrix agents follow) - **Structure → Graphify.** "Who calls X", "what depends on Y", "god nodes", "files touching table Z", "transitive dependents of module M", "what does this file import / export". Deterministic; re-derivable; cite the graph node id. - **Project rationale / decisions / learnings → Cognee work.** "Why did we choose X over Y", "what did sprint S3 conclude on auth", "what did Hizkiah learn about the embedding 429 issue". Curated; cite the ingested artifact. - **Cross-project reference → Cognee curated.** "How have we handled rate limiting elsewhere", "what does the engineer profile say about commit format". Read-only from projects; cite the curated node id. - **Ambiguous questions ("why does this function call Y?")** decompose into a structure half (Graphify: it calls Y at file:line, via path P, in branch B) and a semantics half (Cognee work: ADR D-00x decided that B owns Y because reasons). Agents answer both halves explicitly; they never fuse them into one store. - **Write authority is exclusive.** No agent writes to Graphify (only the `graphify` CLI does, on its update trigger). No build process writes to Cognee work (only agents do, gated by `mishkan: ingest`). No project writes to Cognee curated (only `/sprint-close` and the seed script do). ### Integration - **Runtime placement.** Graphify runs **per-project**, artifacts live under `.graphify/` in the project (`graph.html`, `GRAPH_REPORT.md`, `graph.json`), gitignored by default. No shared service, no port. The Neo4j export (`--neo4j-push`) targets a **dedicated Graphify Neo4j container** when used — it does **not** share the Cognee work store's Neo4j (port 7687 in that container is Cognee's; collision would commingle a deterministic AST graph with an LLM-summarised semantic graph, the exact conflation D-008 forbids). Most agent queries hit `graph.json` directly; Neo4j is opt-in for cross-repo graph queries and not part of the default install. - **Re-extraction trigger.** Incremental refresh via `graphify --update` runs: (a) on a **post-commit hook** when files in `src/`, `lib/`, or language-specific source roots change, and (b) on `/sprint-close` as a belt-and-braces full re-extract. No cron — the engineer's stateful-operation rule applies: AI prepares the command, Y4NN runs it (the post-commit hook is local to his machine, run by his shell, not by an agent). - **Agent consult order (PreToolUse hook).** Graphify ships a PreToolUse hook that nudges Claude Code toward graph-first queries. The MISHKAN integration is **deferred to a follow-up ADR** (see Out of Scope) — landing it requires threading through the existing Bash PreToolUse validator chain (D-004) and per-agent opt-in for the five code-writing specialists (Hizkiah, Salma, Oholiab, Nathan, Zadok). The hook is *available* but *not enabled by default* in this decision. - **Citation discipline.** When an agent answers from Graphify, it cites the graph node id and the source `file:line`. When it answers from Cognee work, it cites the ingested artifact. No "according to the graph" without an id. ### Out of scope (explicitly not decided here) 1. ~~**Token-saving measurement POC.**~~ **CLOSED 2026-06-07** — POC executed on the MISHKAN harness with Graphify v0.8.33: **88.1× average reduction** across the 5 canonical structural questions (range 69.4× to 141.8×). The third-party 71.5× claim is **verified** in spirit — MISHKAN's actual measurement is +23 % higher (likely driven by Python-heavy AST shape vs the mixed TS corpora of the third-party benchmark). Full method, query-by-query results, and honest gaps in `docs/research/graphify-token-saving-poc.md`. The 88.1× figure may be cited as a MISHKAN claim. The 71.5× figure may be cited as a third-party point reference within the observed range. 2. **Refactor of the Explore agent / Hiram's exploration playbook.** Whether Hiram should consult Graphify before grep is a downstream agent change, not decided here. 3. **Unified Graph Explorer UI** combining Graphify's `graph.html` and Cognee's Neo4j browser. Not decided; each store keeps its own UI. 4. **Graphify PreToolUse hook enablement and routing.** Whether and how the hook fires for the five code-writing specialists, and how it composes with the existing Bash PreToolUse chain, is deferred to a follow-up ADR. 5. **Cognee `codify` deprecation.** Whether Cognee's own code-extraction feature is now redundant in MISHKAN given Graphify is not decided here — it stays available; `mishkan-ingest` continues to gate its use. 6. **Cross-project Graphify federation** (one graph across all projects in `projects.yaml`). Not in scope; Graphify is per-project for now. ### Consequences **Positive.** - Code-writing specialists (Hizkiah, Salma, Oholiab, Nathan, Zadok) get a deterministic structural answer for "who calls X" / "what depends on Y" without grep-shaped token spend. - Write discipline stays sharp: three stores, three writers, three citation shapes. No agent has to guess where to write. - Cognee work stays small and curated — it does not get polluted with AST-derived nodes that change on every commit. - Graphify artifacts are re-derivable from the repo, so the third store has effectively zero backup obligation: delete `.graphify/` and rebuild. - The boundary is testable: a CI check can refuse a commit that ingests AST-shaped content into Cognee work. **Negative.** - One more runtime to install, document, and teach 45 agents to route to. - Graphify v0.8.x is young; a breaking change in node schema or CLI flags would touch every agent that cites a graph node id. Mitigation: cite by `file:line` alongside node id so the answer survives a schema change. - The "structure vs semantics" line is not always crisp; some questions require both stores and a careful answer. Documented in the routing matrix above, but it adds cognitive load. - The Neo4j-push path introduces a second Neo4j container if enabled — more memory, more secrets to manage. Mitigation: keep it opt-in; default is `graph.json` only. - The deferred PreToolUse hook means agents must be *told* to consult Graphify first; the harness does not enforce it until the follow-up ADR lands. **Supersedes / amends:** none. Extends D-001 (Cognee local Docker) and D-007 (curated vs work split) by adding a third epistemic layer on the same discipline. *(The "Cognee work (`:7777`)" row in the table above describes the model as of D-008. Superseded on the work-store axis by D-012: the `cognee` MCP alias now points to a per-project Ladybug store on its own port; `:7777` is repurposed as `cognee-memory`, session memory only.)* ## D-009 — Graph-first PreToolUse gate for the five code-writing specialists (added 2026-06-05) **Decision:** introduce a MISHKAN PreToolUse hook — `pre-tool-knowledge-route.py` — that, for **exactly the five code-writing specialist agents** (Hizkiah, Salma, Oholiab, Nathan, Zadok), runs alongside the existing security (D-004) and model-routing hooks and **advises** — does not block — when a `Read` or `Grep` call on source code looks like a structural query that Graphify (D-008) could have answered deterministically. The hook is **advisory** (soft gate), not hard-deny: it injects a permissionDecisionReason nudge plus, where supported, a concrete `graphify search` command into the tool input metadata, but always returns `allow`. Conformance is measured via the existing PostToolUse observer chain; the gate **never** refuses the underlying tool call. ### Force-tension **What pushes toward enforcement.** The five named agents are the harness's heaviest token consumers — Hizkiah and Salma in particular routinely Read 6–10 files to answer "who calls X" or "what depends on Y", which D-008 just declared the wrong store for that question. Without a runtime mechanism, the D-008 routing matrix is doctrine on paper that drifts the moment an agent is mid-task and reaches for the familiar tool. The whole reason D-008 exists is to make structural queries deterministic; leaving the enforcement at "we told them to" collapses the discipline into a code-review aspiration. **What pushes against a hard gate.** Three failure modes make hard-deny costly: (a) the project may not yet be scanned (first run, fresh clone) — denying Read would brick the agent until Y4NN runs `graphify`, violating the stateful- operation contract that says AI prepares but does not execute scans; (b) the graph can be stale (HEAD has moved since the last `--update`), so the agent's correct move is to Read the source of truth, not the cached graph; (c) the heuristic for "this query is structural" is necessarily fuzzy — a query like "how does this function handle errors" is semantic and Graphify is the wrong store for it. A hard gate would generate false-positive blocks on cases (a), (b), and (c), creating the exact friction the engineer profile names as a top complaint with AI tooling. The tension resolves toward **soft gate + telemetry**: the hook nudges and records, never blocks. If telemetry shows ≥80% of code-writing Reads now preceded by `graphify search` after one sprint, the doctrine is working without enforcement. If it does not, a future ADR can revisit hardening. ### Alternatives considered 1. **Hard gate — deny Read/Grep when no `graphify search` was observed in the agent's session within the last N tool calls.** *Bad.* Brittle on first-scan, stale-graph, and semantic-query cases; turns a coordination problem into a runtime block; violates the asymmetric-delegation contract by effectively forcing the agent to ask Y4NN to run a scan mid-task. 2. **Soft gate — advisory injection on suspected structural queries; always allow.** *Chosen.* Surfaces the doctrine at the exact moment it would otherwise be skipped, names the concrete command (`graphify search `), and degrades gracefully when the graph is absent or stale. Composes cleanly with the existing Bash PreToolUse chain (D-004) and the Python model-routing hook because all three already follow the fail-open-on-error contract. 3. **Silent telemetry only — count the conformance ratio in the PostToolUse bus, no advisory in the prompt.** *Useful, but insufficient alone.* Without the in-prompt nudge, the agent has no feedback signal mid-task; the ratio would document drift rather than reduce it. Adopted as **phase 1**: ship the telemetry first (single sprint), then layer the advisory nudge on top once the baseline is measured. The full soft-gate behaviour is the **phase 2** target documented here. 4. **Skill-only doctrine — encode "graphify first on structural queries" in the five craft skills (`hizkiah-backend-impl-craft.md` etc.) and rely on agents to follow it.** *Insufficient.* The harness already has the precedent (D-004 PreToolUse security) that doctrine without a mechanism drifts; skills inform behaviour, hooks enforce shape. Adopted **alongside** the hook, not as a substitute. 5. **Expand the gate to all 45 agents.** Rejected as scope creep. The QA, reporter, research, and orchestration agents do not write code and rarely ask structural questions; gating them adds noise without changing behaviour. Expansion to Hiram (Explore) is the obvious next candidate but is deferred to the Explore-refactor ADR (D-008 Out of Scope #2). ### Invariants of the gate - **Scope — exactly five agents.** The hook activates only when the invoking subagent is one of: **Hizkiah, Salma, Oholiab, Nathan, Zadok**. For any other agent (including Hiram, Caleb, all QA/reporters/orchestrators), the hook is a no-op. Adding an agent requires a new ADR. - **Trigger condition (precise).** - **`Read`** on a file whose extension matches the Graphify-supported set: `.py .ts .tsx .js .jsx .mjs .cjs .go .rs .java .php .rb`. Configs, Markdown, YAML, lockfiles, and dotfiles are **not** triggers. - **`Grep`** when the `pattern` is a bare identifier (matches `^[A-Za-z_][A-Za-z0-9_]*$`) — i.e. clearly a symbol lookup, not a semantic regex. Patterns with `.*`, alternation, multiline, or non-word characters do **not** trigger. - **Explore** tool calls are out of scope (deferred ADR). - **Fallback behaviour (graceful degradation).** - *No `.graphify/graph.json` in the project* → emit a single advisory line "Graphify not yet scanned for this project; ask Y4NN to run `graphify` (stateful op). Falling back to Read is correct for now." and `allow`. - *Graph stale* (`graph.json` mtime older than the most recent commit on HEAD) → emit "Graphify graph is older than HEAD; structural answer may be stale, prefer Read+cite for changes after ``." and `allow`. - *Trigger heuristic likely wrong* (the agent has already issued a `graphify search` in this session for a related symbol) → no nudge, silent allow. The PostToolUse counter still records the read. - **Opt-out path.** Two mechanisms, in order of normalcy: 1. Per-tool-call: a `tool_input.metadata.skip_graphify_nudge: true` field suppresses the advisory for that single call. Used when an agent has explicitly decided the query is semantic. 2. Session-wide: env `MISHKAN_GRAPHIFY_NUDGE=off` disables the hook entirely (Y4NN debug escape). Recorded in the session-start observer so disablement is visible in sprint reports. - **Performance budget.** The hook must add **≤ 50 ms p95** to PreToolUse latency. Implementation must avoid invoking `graphify search` itself inside the hook — it inspects `graph.json` metadata (mtime, presence) and the tool input shape only. If the hook exceeds 200 ms on any call, it self-disables for the remainder of the session and logs to the PostToolUse observer ("graphify nudge self-disabled: budget exceeded"). - **Fail-open contract.** Identical to D-004 and to `model-route.py`: any parse, IO, or format error → emit nothing, exit 0, never block. - **Conformance metric.** Two numbers, recorded by `post-tool-observe.sh` per session and aggregated at `/sprint-close`: 1. **Nudge-respect ratio** — of the Read/Grep calls that triggered an advisory, the fraction followed by a `graphify search` within the next 3 tool calls of the same agent. 2. **Pre-Read graph consultation rate** — of all triggering Read/Grep calls by the five agents, the fraction preceded by **any** `graphify search` in the same session. Target ≥ 80% after one sprint of phase-2 operation. Both metrics are reported, neither is a gate. ### Integration - **Hook file.** `payload/mishkan/hooks/pre-tool-knowledge-route.py`, registered alongside `pre-tool-security.sh` and `model-route.py` in the PreToolUse chain. Order: security (deny on violation) → model-route (inject model) → knowledge-route (advise). Each is independent and fail-open; chain order is for clarity, not correctness. - **Subagent detection.** Reuses the same `subagent_type` field `model-route.py` already reads from `tool_input`. When the field is absent (top-level Claude Code session, not a subagent), the hook is a no-op — Y4NN is not in scope of the gate. - **Advisory shape.** The hook returns `hookSpecificOutput.permissionDecision = "allow"` with a populated `permissionDecisionReason` quoting the exact `graphify search` command to try first. The agent sees the reason; nothing is enforced. - **Phase 1 (this sprint).** Ship the hook in **telemetry-only mode**: trigger detection runs, metrics are recorded, no advisory text emitted. Establishes the baseline number for "how often do the five agents already consult Graphify before Read?". - **Phase 2 (next sprint, conditional on phase-1 baseline).** Enable the advisory text. Re-measure. If nudge-respect ratio < 50% after one sprint, open a follow-up ADR — do not unilaterally promote to hard gate. ### Out of scope (explicitly not decided here) 1. **Token-saving measurement POC** (the 71.5× figure) — still separate, as in D-008 Out of Scope #1. The conformance metric here is behavioural (did the agent consult the graph?), not economic (how many tokens did it save?). 2. **Refactor of the Explore agent / Hiram's exploration playbook** — deferred per D-008 Out of Scope #2. The gate explicitly excludes Explore tool calls. 3. **Unified Graph Explorer UI** — out of scope, per D-008. 4. **Adding further agents to the gate** (e.g. Hiram once refactored, or any future code-writing specialist) — requires a new ADR amendment. 5. **Cognee-work consultation gate** — a symmetric soft-gate for semantic queries ("have we decided this before?") that would nudge toward Cognee MCP before a freeform reasoning answer. Plausible and consistent with the D-008 routing matrix, but not decided here. If pursued, it would follow this ADR's shape (advisory, fail-open, telemetry-first). 6. **Incremental re-extract on commit** — D-008 Out of Scope. The stale- graph fallback above accommodates the absence of incremental refresh; it does not commit the harness to provide one. 7. **Hard-deny mode** — explicitly deferred. The phase-2 advisory is the strongest enforcement this ADR sanctions. Any future hardening is a separate, dated decision. ### Consequences **Positive.** - The D-008 routing matrix gains a runtime mechanism for its highest- traffic edge (structural queries by code-writing specialists), without introducing a friction failure mode the engineer profile rejects. - Telemetry-first phasing means the harness measures before it constrains; the phase-2 decision will be grounded in a baseline, not a guess. - Fail-open contract preserves the property all MISHKAN hooks share: a broken hook never bricks delegation. - Composable with the existing PreToolUse chain (D-004 security, model-route) without rewriting any of them; each hook stays small and inspectable. **Negative.** - One more hook to maintain, with its own heuristic surface (extension list, regex for "bare identifier", stale-graph detection). Each is a small calibration debt — false-positive nudges on semantic queries will accumulate small annoyance until the advisory wording is tuned. - The stale-graph fallback depends on file mtime vs. HEAD commit time, which is approximate; an agent could pull a graph from a sibling checkout and have a wrong mtime [UNKNOWN — to verify in implementation whether this corner case warrants a content-hash check]. - Phase-2 enablement creates a sprint-boundary coordination point: the decision to flip the advisory on depends on phase-1 telemetry being reviewed, which is a `/sprint-close` agenda item that did not previously exist. - The 50 ms p95 budget is tight for Python startup on cold cache; implementation may need to be a small Bash inspector instead, mirroring `pre-tool-security.sh`'s shape. [UNKNOWN — measure cold-start cost of the Python interpreter on Y4NN's machine before committing to language.] - Documents an opt-out (`MISHKAN_GRAPHIFY_NUDGE=off`) which, like every opt-out, can become a habit that erodes the discipline. The session- start observer logging is the partial mitigation; review at `/sprint-close` is the rest. **Supersedes / amends:** none. Extends D-008 (Graphify as third store) by providing the runtime mechanism D-008 explicitly deferred (D-008 Out of Scope #4). Composes with D-004 (existing PreToolUse Bash chain) without modifying it. ## D-010 — Workflow portfolio discipline: four anti-patterns, two caps, PM+CTO co-ownership (added 2026-06-07) **Decision:** the dynamic-workflow portfolio is governed by an explicit discipline rather than ad-hoc accretion. Three rules together: (a) **hard caps** — 10 top-level (org) workflows + 4 team workflows per team; (b) **four named anti-patterns** that disqualify a candidate; (c) **PM + CTO joint ownership** — every addition, retirement, or substitution goes through Nehemiah and Bezalel together, never through a single agent's unilateral push. A fire-count rule retires workflows that fire < 2× across 3 sprints, gated by the same PM+CTO review. ### Force-tension **What pushes toward accretion.** Workflows are the most powerful primitive the harness exposes — typed contracts, parallel fan-out, adversarial verify panels. Every recurring task looks like a candidate. Each team has 5–7 specialists, so "one workflow per team's main shipping flow" feels natural. Six team-ship workflows × six teams + the 10 org-level = a portfolio of 46. Without a cap, that is the asymptote: most workflows fire once, contribute nothing, and add maintenance debt. **What pushes against.** Workflow runtime cost is real (Bun-shape migrations hit hundreds of agents and thousands of subagent-tokens per run). Each workflow carries a contract surface that must be kept correct as agents evolve. Workflows that fire rarely cannot pay back their codification cost. Worse, workflows that look load-bearing but encode skill-shape (linear sequence, no panel, no termination predicate) blur the distinction between workflow and skill, eroding the discipline that makes the workflow tool worth reaching for. ### Adopted shape **(a) Hard caps.** - **10 org-level workflows.** Current: `mishkan-sprint-close`, `mishkan-deep-research`, `mishkan-codebase-audit`, `mishkan-migration-wave`, `mishkan-architecture-panel`, `mishkan-release-readiness`, `mishkan-init`, `mishkan-blast-radius`, `mishkan-knowledge-gap-discovery`, `mishkan-standards-rollout`. Adding an 11th forces a retirement vote. - **4 team-level workflows per team.** Current shipped count varies by team (Chosheb 1, Panim 2, Yasad 3, Mishmar 1, Migdal 2, Sefer 1). Spare slots per team are deliberately left open — candidates compete for them at PM+CTO review, not on a team lead's word. **(b) Four anti-patterns.** A candidate that exhibits any of these is rejected or reworked, regardless of fit: 1. **Skill-in-workflow-clothing.** Linear sequence, no parallelism, no termination predicate, no panel. That shape is a skill, not a workflow. If the cost-to-fan-out gain is < 2× wall-clock vs Task delegation, it does not earn the workflow tool. 2. **Workflow calling workflow without a contract.** Nested workflows are valid (cf. `release-readiness` → `codebase-audit`) **only** when the inner workflow's output schema is consumed structurally by the outer. Free-form nesting hides token cost, breaks retry semantics, and produces an opaque blob of subagent transcript that the orchestrator cannot reason over. 3. **Judge panels with non-orthogonal reviewers.** If two reviewers in a panel share ≥ 70% of their evaluation criteria, the panel is theatre — redundant votes from correlated judges. Each lens must be load-bearing and distinct (the canonical example is `mishkan-blast-radius`'s caller-side / data-contract / runtime-behavior triad). 4. **Workflow-as-status-page.** Orchestration that fans out to gather state without synthesis is a dashboard query, not a workflow. The synthesis stage is the workflow's reason to exist; if it is missing or trivial, the work belongs in observability, not in `Workflow()`. **(c) PM + CTO joint ownership.** New workflow proposals are written as a brief (problem, fan-out shape, termination predicate, expected fire-count, anti-pattern self-check). Nehemiah owns delivery / recurrence justification; Bezalel owns orchestration shape / schema contracts. Joint approval lands the workflow under `payload/mishkan/workflows/`; unanimous rework lands it under `payload/mishkan/workflows/proposed/` with the rework note; rejection returns the brief to its proposer with the failing anti-pattern named. **Soft-retirement rule.** A workflow that fires < 2 times across 3 consecutive sprints surfaces in the next `/sprint-close` for PM+CTO review. The default disposition is retirement to `proposed/`; the rebuttal is a concrete upcoming-use justification. ### Alternatives considered 1. **No cap; allow accretion.** *Rejected.* This is the default state of every ungoverned workflow portfolio in the wild. The reference cases (OneRedOak, Bun-shape) show that production teams converge on 3–6 workflows after accretion; the cap codifies that ceiling rather than letting the harness relearn it under load. 2. **Per-team unilateral additions, no PM+CTO review.** *Rejected.* Without a joint gate the four anti-patterns reappear — every team adds the workflow that feels load-bearing from their vantage point, and the harness ends up with six near-identical feature-ship orchestrations. The June 2026 portfolio review surfaced exactly this drift (Sefer proposed two doc-generation workflows; both folded into skills under the gate). 3. **Workflow router that auto-selects per task.** *Deferred.* Selecting among 18 workflows by task description is the same problem as skill discovery (cf. `mishkan-skill-discovery`). Layering a router on top of workflows duplicates that infrastructure. Wait for the skill-discovery layer's telemetry before deciding whether workflows need their own router or can be discovered through the same surface. 4. **Cap of 6 org-level + 6 team workflows.** *Rejected after PM/CTO split verdict.* Bezalel preferred 6 load-bearing org-level; Nehemiah preferred 10 with retirement-based pruning. Adopted 10 because the existing portfolio already contains 10 that pass the anti-pattern check; cutting to 6 would retire workflows that clear the bar by lottery, not by discipline. The retirement rule is the steady-state pressure. ### Invariants of the discipline - **Org-level cap = 10.** Always. To add, retire. - **Team-level cap = 4 per team.** Always. To add, retire from that team. - **Anti-pattern self-check** is part of the proposal brief — proposer must name how the workflow avoids each of the four. - **No solo additions.** Even Bezalel cannot land a workflow alone; the rules-rollout workflow exists precisely to prevent that drift. - **`proposed/` is not a parking lot.** Workflows there carry a written promotion criterion (concrete fire-count, named use case). ### Consequences **Positive.** - The portfolio stays legible. An engineer (or new agent) can read all 18 in an afternoon and know what each is for. - The cost ceiling is bounded. Worst-case spend is the sum of 18 known shapes, not an unbounded sprawl. - The anti-pattern catalogue gives proposers concrete language for self- review before bringing a brief, shortening the loop. - The PM+CTO gate replays the same discipline used for architecture (D-002, D-007) — consistency across the harness's governance surfaces. **Negative.** - Some legitimate one-off orchestrations will fail the cap and have to wait for a retirement slot. The escape valve is `proposed/` — the work is not lost, just not active. - The fire-count rule is approximate; a workflow that fires 1× per sprint but is genuinely load-bearing (e.g. quarterly audit) will trip the retirement default and need to argue itself back each time. Tuned by raising the window from 3 sprints to N if false-retirements appear. - "Anti-pattern self-check" relies on the proposer's honesty about their candidate; a determined push can word-paint around it. Mitigation: the CTO half of the gate has explicit authority to reject on shape. ### Amendment (2026-06-10) — loop-until-QA-passes feature-ship workflows **Trigger.** The engineer observed that the designed Lead → Specialist → QA collaboration was not enforced in practice — the main session dispatched specialists directly and self-graded, bypassing the chain, with no loop driving work to a QA-clean state. The remedy is two-layer: a cross-cutting discipline in `team-lead-craft` §6.1 (all six teams), and a deterministic, unskippable workflow form where stakes justify it. **Portfolio change (PM + CTO joint brief).** - **Two NEW team workflows:** `yasad-feature-ship`, `panim-feature-ship`. - **Three AMENDED workflows** gain a bounded retry loop in place (no new slot): `chosheb-feature-ship` (loop-until-ready), `mishmar-security-gate` and `migdal-infra-change` (conservative — one remediation-proposal cycle then escalate). Sefer is covered by the discipline layer only (no product to gate — a workflow there would be anti-pattern #4). **Brief, per the proposal format.** - *Problem.* The chain is bypassable; QA convergence is not enforced. - *Fan-out shape.* `route → implement → parallel orthogonal QA panel → loop-until-zero-blockers (cap 3; 1–2 conservative for security/infra) → escalate`. ≥6 agents across panel × cycles. - *Termination predicate.* Zero `blocker`-severity findings (keys off the existing structured QA verdicts — Uriah/Jahaziel agent contracts and the existing workflow `{ready}`/`{decision}`/`{safe}` fields; no new contract). - *Expected fire-count.* High — once per shipped feature on the active team; comfortably clears the ≥10/quarter bar. - *Anti-pattern self-check.* #1 cleared (termination predicate + parallel panel, not a linear skill); #2 cleared (no nesting); #3 cleared (panels are orthogonal — contract/tests/data; a11y/DS/QA; the panel excludes the implementer so no lens self-reviews); #4 cleared (synthesis is a QA-clean artifact or a structured escalation). **Caps after this amendment.** Team-level total 8 → 10. Per-team: Yasad 3/4, Panim 2/4; Chosheb/Mishmar/Migdal/Sefer unchanged (amendments use no new slot). All within the 4/team cap. **Loop-until-X is no longer single-use.** This supersedes the earlier framing (`workflows/README.md`) that `knowledge-gap-discovery` was the only legitimate loop-until-X. Bounded QA-convergence is now a sanctioned second case; the bound (hard cycle cap + mandatory escalation, never silent settle) is what keeps it inside the discipline. Security and infra loops are deliberately conservative: no stateful op runs inside the loop (asymmetric delegation, rules §5). **Supersedes / amends:** none. Codifies the discipline implicit in D-002 (Claude Code models only — capability discipline) and D-007 (separate stores — epistemic discipline) onto the orchestration layer. ## D-011 — Universal skill-discovery layer (added 2026-06-07) **Status:** adopted, Phase 1 canary. **Context — the tension.** The harness now ships 40+ MISHKAN craft skills, and the user surface adds dozens more (`~/.claude/skills/`, plugin-bundled, project-local). When a task arrives, the main session cannot reliably remember which skill applies — the list itself does not fit in working memory without bloating context. Two failure modes compete: 1. **Reinvention.** The model improvises work that a skill already encodes, because the right skill was never surfaced. This is the silent failure — no error, just degraded output. 2. **Context bloat.** The model loads too many skills "just in case", spending tokens on dormant guidance that crowds out the work itself. The tension is not "should we make skills discoverable" — it is *who* decides which skills surface and *how much context* that decision consumes. D-010 governed workflow portfolio discipline by cap + anti-pattern self-check; the same discipline applies one layer down, but the right instrument is a router, not a cap (we *want* the long tail of installed skills, we just don't want to pay for all of them at once). ### Alternatives considered 1. **TUI-only browsing.** *Rejected.* A `mishkan-skills browse` TUI is useful, but it puts the discovery cost on the engineer's reading time for every task. The model still wouldn't know which skill to mention in the first place, so the silent-reinvention failure mode persists. Kept as a Phase 2 nice-to-have, not the primary surface. 2. **Embedding-based matching.** *Rejected.* Local sentence-embedding model + cosine similarity would give better recall than TF-IDF on ambiguous queries. But: (a) it pulls in a model dependency (200MB+ at minimum) for a layer that runs in every session, (b) the inference latency is non-trivial at session boot, (c) the trigger-phrase match on `Use when…` lines already covers the common case because skill descriptions are written *for* matching. TF-IDF is the right fallback for the long tail. Revisit if miss-log analysis shows recurring semantic misses that token overlap cannot recover. 3. **MCP server.** *Rejected for Phase 1.* An MCP server for skill discovery would let other tools query the same index, but it adds a process to manage and a network surface. The Python-script + flat-file index has the same data shape with zero process overhead, and an MCP wrapper around it remains a free Phase 2 option once the index format stabilises. 4. **No discovery layer; keep the implicit reach.** *Rejected.* The forty-skill surface already exceeds what the model reaches reliably. The reinvention rate observed in sessions (anecdotal but consistent) was the trigger for this decision. ### Adopted shape - **Universal indexer** (`payload/mishkan/scripts/skill-discovery-indexer.py`). Stdlib-only Python, scans four roots in precedence order: `~/.claude/mishkan/skills/` → `~/.claude/skills/` → `~/.claude/plugins/*/skills/` → `/.claude/skills/`. Output: single flat JSON at `~/.claude/mishkan/skill-discovery/index.json`. Each entry: `{name, source_path, origin, description, triggers, category, frontmatter_raw, sha256, indexed_at, mtime}`. Refresh triggers: install/update (full rebuild), session boot (`--stat-only` mtime sweep against `meta.last_scan`), manual `/mishkan-skills-reindex`. Stale entries (source_path gone) are dropped at routing time. - **Router** (`payload/mishkan/scripts/skill-discovery-router.py`). Three matching mechanisms layered: (a) trigger-phrase weighted match (trigger = 3.0, description keyword = 1.0), (b) category prior (×1.5 multiplier when invoked from a workflow declaring `relevant_skill_categories`), (c) TF-IDF fallback when trigger pass yields < 3 results. Output: 3 buckets capped at 13 total — `must_load` ≤ 3, `should_consider` ≤ 5, `adjacent` ≤ 5. - **Skill** (`payload/mishkan/skills/skill-discovery/SKILL.md`). Main-session-side. Tells the model how to interpret the 3 buckets and the trust asymmetry (mark, never auto-load, non-mishkan entries for stateful operations). - **Slash commands.** `/skills` (invoke router on current task); `/mishkan-skills-reindex` (manual rebuild). - **Phase 1 canary integration.** `mishkan-init` workflow declares `relevant_skill_categories` and runs a `SkillRouter` phase before PRD. Result is advisory and folded into Bezalel's signoff context. No other workflow is wired in Phase 1. ### Risks 1. **Context bloat from over-trusting `must_load`.** Mitigation: cap of 3, bias-rule in the skill (prefer should_consider over padding must_load), miss-log surfaces over-firing. 2. **Name collisions across roots.** Mitigation: precedence order at index time (mishkan wins); collisions recorded in `meta.collisions` rather than silently dropped. 3. **Trust asymmetry violations.** Mitigation: every non-mishkan entry carries an explicit `trust` warning; skill rule forbids auto-load of non-mishkan skills for stateful operations (D-002 / y4nn-§5). 4. **Stale index hiding real skills.** Mitigation: session-boot `--stat-only` sweep + drop-at-routing-time for dead `source_path`. 5. **Fail-closed makes the harness brittle.** Mitigation: the layer is fail-open end-to-end — indexer errors are per-skill skips, router exceptions return empty buckets, every miss is logged to `~/.claude/mishkan/skill-discovery/misses.jsonl` for tuning. ### Phase 1 → Phase 2 path Phase 1 ships indexer + router + skill + slash commands + canary in `mishkan-init`. Promotion to Phase 2 requires: - **2 sprints of miss-log signal.** If the misses file shows recurring patterns the trigger-match cannot catch, tune `description` text on the implicated skills *before* reaching for embeddings. - **Threshold validation.** Default thresholds (`high=4.0, mid=1.5`) are guesses on Phase 1 data; revisit at sprint close with the actual score distribution from the miss log and successful routings. - **Broader workflow integration.** Once thresholds settle, add the `relevant_skill_categories` field + SkillRouter phase to the other org-level workflows that benefit (codebase-audit, release-readiness, knowledge-gap-discovery). - **TUI browse surface** (optional). A `mishkan-skills` CLI that wraps the same index for cold-browsing without a task. - **MCP wrapper** (optional). Exposes the router as an MCP tool so agents in other contexts (project-local subagents) can query the same surface. **Supersedes / amends:** none. Layers under D-010 (workflow portfolio discipline) one level down — the same discipline (cap + anti-pattern + miss-log feedback) applied to the skill surface rather than the workflow surface. ### Amendment 2026-06-07: Phase 2 shipped Phase 1 ended at "the router exists." Phase 2 turns that into "agents auto-discover skills without being asked," through three injection mechanisms — none of them blocking, all fail-open. 1. **Install-time rebuild.** `bin/mishkan.js` install phase 1 spawns the indexer in `--rebuild` mode after the payload copy so `~/.claude/mishkan/skill-discovery/index.json` lands seeded before the first session boots. Fail-open: a missing `python3` or an indexer error logs a warning and the install continues; recovery path is `/mishkan-skills-reindex`. 2. **SessionStart drift check.** `hooks/session-start-skill-index.sh` runs the indexer in `--stat-only` mode on session boot. The indexer compares each SKILL.md mtime against `meta.last_scan` and rebuilds only on drift (or if the index file is missing). p95 budget: 200 ms, well within the SessionStart budget shared with the other boot hooks. Fail-open: any error → exit 0 silently. 3. **PreToolUse auto-injection on Task / Agent.** `hooks/pre-tool-task-skill-route.sh` reads the Task payload from stdin, extracts `tool_input.prompt`, runs the router in `--format injection` mode, and returns the resulting compact markdown block via `hookSpecificOutput.additionalContext` (the documented Claude Code PreToolUse field for prepending advisory context). Hard caps: ≤ 3 `must_load` + ≤ 3 `should_consider` entries, ≤ 600 tokens of prepended markdown, `adjacent` dropped at injection time (awareness-only doesn't justify its tokens here). Skip injection entirely on empty buckets — never pollute with "no skills found". p95 budget: 100 ms (worst-case timeout: 1.5 s as a hard floor). The `mishkan-init` Phase 1 canary stays in place; this becomes the dominant path for every other Task call. **Trust marker.** The injection renderer suffixes every non-`mishkan` entry with `(community)`. The skill-discovery skill's existing rule — non-`mishkan` skills are never auto-loaded for stateful operations (D-002 / y4nn-§5) — applies unchanged. **Telemetry.** Every Task hook invocation emits a `hook_fire` event on the observability bus (decision = `allow` with injection, `ok` without). Every empty-bucket routing continues to land in `~/.claude/mishkan/skill-discovery/misses.jsonl`. A new aggregator script `scripts/skill-discovery-misses.py` (and `/mishkan-skills-misses` slash command) clusters misses by sorted-keyword signature and surfaces top-N patterns + by-reason breakdown. **Threshold-tuning process.** The mid + high thresholds remain `mid=1.5, high=4.0` from Phase 1. Tuning is now data-driven: - After each sprint, run `/mishkan-skills-misses --top 10`. - For any pattern with count ≥ 5 *and* a clearly-applicable skill that should have matched, **edit the skill's description** to include the pattern's keywords. Description tuning is free and bounded; it should exhaust before threshold tuning. - If a recurring pattern still misses after description tuning, *and* the scores are clustering just under the `must_load` boundary, lower `--threshold-high` to 3.5; if `must_load` is over-firing on marginally-relevant skills, raise it to 4.5. Move in 0.5 steps and one sprint at a time. Same recipe applies to `--threshold-mid`. - `router_exception:*` reasons are bugs — escalate to Bezalel; never tune around an exception. The Phase 1 → Phase 2 promotion gate ("2 sprints of miss-log signal") applied to broadening to Phase 2 hook mechanics; the same 2-sprint gate now applies to threshold tuning before any default is moved in the shipped router. ## D-009 — Amendment 2026-06-07: Phase 2 advisory injection shipped **Original D-009 §6.2** scoped Phase 1 to telemetry-only, with the advisory injection deferred until "the baseline rate is measured." The phase-2 promotion has happened. **Why Phase 2 now.** The skill-discovery PreToolUse router (D-011 Phase 2) injects skill suggestions based on the Task prompt — it matches "implement payment" to implementation skills, not to `graphify-query-craft` whose triggers describe structural questions. The router will not surface graphify for the vast majority of agent dispatches; only a runtime hook on the specific tool calls Graphify can answer (Read on source / Grep on bare identifier) closes that gap. Without Phase 2, the carrier-set expansion in the §6 amendment below has no enforcement path. **Mechanism.** `pre-tool-knowledge-route.sh` now emits an `hookSpecificOutput.additionalContext` advisory in addition to the telemetry event: - **Read** on a source extension → "about to Read ``. For structural questions, `graphify query` is cheaper." - **Grep** on a bare identifier → "`graphify query \"who calls X\"` or `graphify affected X --depth 2` is the dispatch-aware alternative." Suppressed when the project has no `graphify-out/` directory (no graph, no recommendation). Never sets `permissionDecision` — purely advisory, the tool call proceeds whether the agent acts on it or not. **Invariants preserved.** - Fail-open everywhere (missing jq, missing bus, malformed input → silent exit 0). - ≤ 50 ms p95 — bash hot path, single jq subprocess, no Python. - Telemetry event still fires (the Knowledge tab's Graphify activity counter is unchanged). - Trigger surface unchanged from Phase 1 — same source extensions, same bare-identifier Grep pattern. **What this changes.** A Hizkiah / Salma / Joah / Ira / etc. who Reads source or Greps an identifier now gets, in the tool-call context, a specific `graphify` command to consider. The advisory is structurally precise (suggests the right Graphify subcommand for Read vs Grep) and project-aware (suppressed where graphify-out/ doesn't exist), so it doesn't add noise on projects that don't use Graphify. ## D-009 — Amendment 2026-06-07: scope expanded to all code-touching dev agents **Original D-009** scoped the Graphify advisory nudge to **exactly five agents** (Hizkiah, Salma, Oholiab, Nathan, Zadok) — the highest token consumers writing code. Empirical use since the POC made the original "five only" boundary read narrower than the actual benefit space. **Amended scope** — **20 code-touching dev agents** carry `graphify-query-craft` and are reached by the PreToolUse nudge when they hit `Read`/`Grep` on source: - **Yasad backend (5):** Hizkiah, Nathan, Zadok, Shallum, Uriah - **Panim frontend (4):** Salma, Oholiab, Asaph, Jahaziel - **Chosheb UI (1):** Hiram - **Mishmar code-security (3):** Ira, Joab, Hushai - **Migdal infra-code (4):** Palal, Meshullam, Meremoth, Hanun - **Sefer code-documentation (3):** Joah, Shevna, Jehonathan **Why the broadening.** Graphify answers "who calls X" / "what depends on Y" / "where is the entry point" — structural questions useful to any agent reading code, not just writing it. A security reviewer auditing a payment flow benefits from the call graph as much as Hizkiah implementing it; a documentation specialist (Joah / Shevna) writing module docs benefits from knowing its surroundings. Restricting to writers left readers' queries — typically grep + Read — on the expensive path. **What did not change.** - The hook stays **advisory**, never blocks (D-009 invariant preserved). - Fail-open on missing/stale graph preserved. - `--context` discipline in the craft skill body preserved. - The hook's trigger predicate (Read on source ext / Grep on bare identifier) is agent-independent — adding agents to the carrier set is what changes coverage, the hook itself stays the same. - The two orchestrators, the research pipeline, the QA reporters who don't read source, and the team reporters remain out of scope — adding them would add noise without changing behaviour. **Implication for D-010 (workflow portfolio).** The `mishmar-security-gate` and `sefer-release-notes` workflows now invoke agents carrying the skill — their structural-query phases will be served by Graphify where before they would have grep+Read. Token savings carry through to those workflows. --- ## D-012 — Per-project isolation of the cognee work store (added 2026-06-10) **Status:** adopted (per-project physical isolation); engine **resolved** to embedded Ladybug (Benaiah-vetted, see below); the provisioning migration is a follow-on plan. Co-owned by Bezalel (CTO) + Phinehas (security). **Decision:** each project gets its **own physically-separate cognee work store**, instead of all projects sharing the single work box on `:7777`. Isolation is by **topology** (separate process + separate data), never by cognee's `datasets=` filter. This extends D-007's physical-separation principle from *curated vs work* to *project vs project* within work. The curated store (`:7730`) is unchanged — one corpus, no projects to isolate from each other. **Why (the assumption that broke):** D-007 separated curated from work but assumed `datasets=` would scope projects *within* the work store. It does not. Verified against cognee v1.1.0 source + issue #1023: with `ENABLE_BACKEND_ACCESS_CONTROL= false` (required because cognee's access control is Neo4j-incompatible), the `datasets=` filter is **advisory only** — search runs against the whole graph. cognee's only real isolation is a physical per-dataset DB created in access-control mode, and Neo4j is not a supported backend for it. A `datasets=["wisemoney"]` query returned aiobi-mail content **including a live Gemini API key** — a cross-tenant confidentiality failure (Phinehas: CRITICAL), not a latent weakness. **Engine (resolved — Benaiah dependency vet against cognee v1.1.0):** **embedded Ladybug.** `GRAPH_DATABASE_PROVIDER=ladybug` is cognee's *default* at v1.1.0 — `ladybug==0.16.0` is a core (non-optional) dependency with a fully-implemented adapter (`infrastructure/databases/graph/get_graph_engine.py`; the `kuzu` token aliases to the same `LadybugAdapter`). Each project = **one cognee-mcp container (the already-patched `mishkan/cognee-mcp` image) on its own port + own volume, with `GRAPH_DATABASE_PROVIDER=ladybug` and a per-project `GRAPH_FILE_PATH`** → physical isolation by container + volume + on-disk graph file, no Neo4j (embedded), no N×4g memory floor, no reliance on the failed `datasets=` filter. The Neo4j-per-project fallback is **not needed**. *Implementation note — container, not stdio (2026-06-10).* Bezalel proposed evaluating a host **stdio** cognee-mcp (zero container) as the default. Evaluated and rejected: the recall fix (`cognee-mcp-recall-user.py` + `cognee-mcp-core-align.py`) lives only in the **built Docker image**, so a host checkout (`_src/cognee`, unpatched + core not version-aligned) would reintroduce the recall bug and require maintaining the patches in two places. The container path reuses the image where recall is *proven* working — physical isolation is identical (separate container + volume), the only loss vs stdio is one lightweight container per active project (no Neo4j, so far cheaper than today's shared Neo4j-backed store). Revisit stdio if/when cognee-mcp ships the #2855 fix upstream and the patches can be dropped. Supply-chain: Ladybug is a credible MIT fork of the Apple-archived Kuzu (founder ex-FB/Google), actively maintained (v0.17.1, Jun 2026), no known CVEs, version-pinned — acceptable for an embedded local store with no network exposure; **re-vet at 12 months** (watch items: ~8-month-old fork, PyPI `ladybug` namespace recently reclaimed from an unrelated tools suite). **Provisioning:** lazy, per project at `/mishkan-init` — `ensure-work-store.sh` brings up the project's own Ladybug container and prints its port, substituted into the rendered `.mcp.json`. **Three aliases per project** (the alias is the doorway; the backend behind each differs): - **`cognee`** → this project's own Ladybug store — isolated project knowledge. - **`cognee-memory`** → a single **shared** session-memory store: the **kept** Neo4j box on `:7777`, holding **`claude_code_memory`** only. Per-client session memory is one continuous thing across all work and is **not re-derivable** from docs, so it stays a shared pillar rather than fragmenting per project. (Session memory is cross-project by nature; it must be kept scrubbed of project secrets/PII.) *Mechanism (verified, cognee v1.1.0):* two distinct layers — the **session conversational cache** (env `CACHING`) and the **`claude_code_memory` graph dataset** (env `COGNEE_MCP_AGENT_SCOPED`). Per-project stores set `CACHING=false` so the session cache accumulates only in `cognee-memory`; `CACHING=false` disables only that cache (not embedding/LLM/query caching, recall/cognify unaffected). The permanent `claude_code_memory` dataset is independent of `CACHING` — it is kept central by routing `remember()`/memory writes to the `cognee-memory` alias. - **`cognee-curated`** → the shared reference library (`:7730`, D-007). Isolation rides on the per-project `cognee` instance; the dataset name becomes cosmetic. The old shared Neo4j *work* graph (project data) is **discarded** — it is re-derivable from tagged docs — but the **box itself is kept and repurposed** as `cognee-memory`; `claude_code_memory` is never pruned (D-007). So the migration does not delete the Neo4j box, it narrows its job to session memory. **Security posture (Phinehas, gating):** the boundary must be physical and verifiable by topology; an application-layer filter whose enforcement just failed (#1023) is rejected *as the boundary* (acceptable only as defense-in-depth on top). **Interim controls, mandatory until per-project stores land:** the shared `:7777` is a **single trust domain** — no secrets, no PII; ingest opt-in + scrubbed only; a loud advisory-only warning at the boundary; and the already-leaked Gemini key is **revoked (engineer) then purged from the graph (engineer-run, Mishmar-specified)**. **Alternatives considered:** 1. *Re-enable access control on Neo4j.* Rejected — `multi_user_support_possible()` raises `EnvironmentError` for Neo4j; needs a backend migration first. 2. *Neo4j Enterprise multi-DB (one DB per project).* Rejected as primary — standing license liability for a single-engineer harness; still N cognee-mcp instances; one Neo4j process still spans tenants. Last-resort fallback only. 3. *Wrapper-side post-filter on the shared graph.* Rejected — no reliable per-node dataset tag to filter on; a non-durable workaround (rule 3). **Consequences.** *Positive:* per-project confidentiality by construction, no reliance on cognee's filter, bounded per-project cost (embedded ≈ free; fallback ≈ one small Neo4j). *Negative:* loses the work-store Neo4j browser (static-HTML `visualize_graph` export is the fallback); existing work graphs are re-ingested against the new backend (cheap, idempotent via `mishkan-ingest`); a dependency on Ladybug's maintenance health (Benaiah-vetted; Kuzu is archived). **Supersedes / amends:** extends **D-007** (physical store separation) to a per-project axis within work — D-007 stands unchanged. Touches **D-008**'s Cognee work store. The provisioning migration (`mishkan-init` rework, per-project store creation, re-ingest) is a follow-on `/plan` routed to Migdal, gated on the engine vet + Y4NN ratification. --- ## D-013 — Attach the MISHKAN ontology to ingested documentation (added 2026-06-11) **Status:** accepted · **Drivers:** Y4NN · **Authors:** drafted by the harness, ratify with Nathan + Bezalel. **Context.** `mishkan-ingest` ran `cognee.cognify(datasets=[…])` with **no ontology** — the live log says *"No ontology file provided. No owl ontology will be attached to the graph. [OntologyAdapter]"*, and ingested nodes carry `ontology_valid: false`. The harness already defines a schema in `payload/mishkan/ontology.md` (14 entity types, 16 edges, `blast_radius`), but it was a **markdown convention for agent-authored nodes** (`cognee-promote`, `context-compress`, Baruch) — never a machine ontology fed to cognify. So ingested documentation was graphed by unconstrained LLM extraction, with no link to the MISHKAN type system. **Decision.** Ship a machine ontology and attach it at every `mishkan-ingest` cognify. - `payload/mishkan/ontology.ttl` — **Turtle/OWL**, the 1:1 machine mirror of `ontology.md` (14 `owl:Class`, 16 `owl:ObjectProperty` with `rdfs:domain`/`range` via `owl:unionOf`, `blast_radius` as `owl:DatatypeProperty`). Turtle over RDF/XML for reviewability; hand-authored because the schema is "locked v1." `ontology.md` stays the canonical human source; `ontology.ttl` is its machine form. - cognee v1.1.0 reads the ontology from the **environment** (Pydantic `OntologyEnvConfig`), **not** a cognify kwarg (verified live — the kwarg lands in `**kwargs` and is silently ignored). `docker-compose.work.yml` sets `ONTOLOGY_RESOLVER=rdflib` · `MATCHING_STRATEGY=fuzzy` (already the defaults) · `ONTOLOGY_FILE_PATH=/home/cognee/ontology.ttl`; `ensure-work-store.sh` stages the ttl to that path at provision (idempotent). **Fail-open:** missing file → cognee warns, ingests ontology-free. So the ontology applies to *every* cognify in the work store, not just `mishkan-ingest`. - ONE shared ontology for all projects (the MISHKAN schema is global), even though work stores are per-project (D-012). **Caveat (sets expectations).** cognee's resolver marks `ontology_valid=true` and adds parent/object- property edges **only where the LLM-extracted entity *type names* match ontology class names**. The default `KnowledgeGraph` extraction assigns free-form types; it will not emit `Decision` / `ResearchOutput` / `Incident` unless steered. So this pass gives **validation + enrichment of matches**, not full retyping of prose into MISHKAN classes. Forcing the taxonomy (custom prompt / graph_model) is a **Phase 2**, opened only if Phase-1 `ontology_valid` coverage proves too thin. **Alternatives considered.** (1) *markdown→ttl generator* — rejected for now: a locked v1 schema does not justify the build step (revisit if it starts churning). (2) *RDF/XML* — rejected: far less reviewable than Turtle for a hand-authored, versioned artifact. **Consequences.** *Positive:* ingested docs are linked to the type system where they match; the `CuratedResource`/`CaseNode`/etc. classes become real, queryable types; sets up the D-014 research→ curated feed. *Negative:* a second artifact (`ontology.ttl`) to keep in sync with `ontology.md` (guarded by a label-coverage check); ontology benefit is partial until Phase-2 extraction steering. **Refs:** extends **D-008** (knowledge surfaces) and rides on **D-012** (per-project store). Mechanism confirmed against the installed cognee v1.1.0 source (`modules/ontology/ontology_env_config.py` + `get_ontology_resolver_from_env`): env-config only, requires resolver `rdflib` + strategy `fuzzy` + a file path (resolver/strategy already default to those values). Out of scope: agent-write paths (already convention-bound); the **curated** store (its nodes are pre-typed by the structured seed — `add_data_points`, no cognify extraction — so no ontology needed there); back-filling existing graphs. --- ## D-015 — Unified semantic `mishkan` control surface for the knowledge stack (added 2026-06-11) **Status:** accepted · **Drivers:** Y4NN · **Authors:** drafted by the harness, ratify with Bezalel. **Context.** The engineer drove every stateful cognee op by hand across scattered compose files and scripts (a 3-file `docker compose` incantation, `ensure-work-store.sh`, manual `docker rm -f`/`volume rm`, `ensure-curated-box.sh`, ad-hoc `docker ps`). High friction, high recall load. Two robustness gaps also surfaced this session: a docker rename/orphan race (fixed, `a9cf950`) and a TUI daemon-probe that mis-read a *wedged* `watchd` as healthy. **Decision.** Extend the `mishkan` CLI into one control surface, governed by a strict naming rule and the asymmetric-delegation boundary (D-005). - **Naming rule (semantic, object-first):** a command must name *what it operates on*. - **Bare verb IFF the object is the tool itself** — `mishkan install | uninstall | status`. (Not `harness install`: the npm pkg is `mishkan-harness`, so `npx mishkan-harness harness install` doubles "harness".) - **`mishkan ` for every subsystem** — `knowledge configure|ingest`, `knowledge-stack up|down|restart|status`, `project-work-store [] up|down|reset`, `code-graph status|open|scan`, `observability install|open`, `org show`. Descriptive compound objects (`knowledge-stack`, `project-work-store`) are preferred over brevity — a bare `start` ("start *what*?") or a topology noun the user must decode (`stack`/`memory`/`foundation`) are both rejected. `knowledge-stack` = infra lifecycle (`up`); `project-work-store` = data lifecycle (`reset`). - **Rule-5 boundary (the crux):** the CLI **executes** because the *human* invokes it — D-005 forbids *agents* running stateful ops, and they still can't (the bin is never in an agent tool set); the TUI still can't either (it only *surfaces* the command, never runs it). Destructive ops (`stop`, `reset`) gate on a confirm. - **Guided bring-up, not a thin wrapper:** `knowledge-stack up` preflights config (docker present, `.env`, `COGNEE_MCP_REF`) and, on a gap, prints the exact fix and stops — never a cryptic docker error. `project-work-store up` warns if the stack is down. `/mishkan-init` composes the two: `knowledge-stack up` (ensure infra, confirm-if-down) + `project-work-store up` (this project). - **TUI hardening:** `_probe_socket` now reads the daemon's on-connect `{"type":"snapshot"}` frame and validates it, so a *wedged* watchd is replaced rather than adopted (client-side only). **Alternatives considered.** (1) bare `start`/`stop` — rejected (object-less). (2) `stack`/`memory` nouns — rejected (require decoding internal topology). (3) start/stop *buttons* in the TUI — rejected, moves stateful control into a tool (breaks D-005). (4) the CLI re-implementing the scripts — rejected; it *wraps* them so the scripts stay the single source of truth. **Consequences.** *Positive:* one low-recall surface; preflight/guide kills cryptic failures; the TUI points at the fix without executing it; daemon self-heals on wedge. *Negative:* renames published commands (`configure-knowledge`→`knowledge configure`, `observability`→`observability install`, `org`→`org show`) — old flat names kept as hidden working aliases so nothing breaks mid-migration; help, install sign-off, docs/usage, README, and the org-reference command updated in the same pass. **Refs:** rides on D-005 (asymmetric delegation), D-008 (knowledge surfaces), D-012 (per-project store). --- ## D-016 — Engineer-gated promotion of research findings into the curated library (added 2026-06-11) **Context.** The curated library (`cognee-curated`, :7730) is the org-wide reference surface (D-007), but it was **seed-only**: `ingest-curated.py` *prunes then writes*, so it cannot grow additively without wiping curated. The research pipeline writes `ResearchOutput`/`CaseNode` to the per-project **work** store (D-012), never curated; and **no agent has a curated-write tool** — curated is read-only at the agent layer by design. So a reusable resource Caleb finds (a vendor doc, a spec, a primary reference) dies in one project. The ontology already ships `CuratedResource` + `CuratedLibraryHit` for exactly this growth — only the wire was missing. **Decision.** Add an **engineer-gated** path to grow curated, additive and deduped, with the human as the library's editor: 1. **Shemaiah** (evaluate) emits a structured `curated_promotion_candidate` (name, url, problem_class, team, source_tier, why) **only when** `verdict=resolved` + `confidence≥medium` + `curated_library_agreement=not_covered` + real cross-project reuse. It nominates; it writes nothing. 2. **Baruch** (report) copies the candidate into the research-log **and** appends it to an engineer queue `~/.claude/mishkan/curated-candidates.jsonl`. Baruch has `Write` but still **no** curated-write tool — the boundary holds. The research-log schema is `additionalProperties:false`, so the new optional field was added to **both** `templates/research-log.schema.json` and `scripts/validate-research-log.sh` (with a bash-layer shape check so a malformed candidate fails without ajv). 3. **The engineer** runs `mishkan knowledge curate` — it lists pending candidates, asks per candidate, and on approval runs `scripts/promote-curated.sh` → `cognee/promote-curated.py`: an **additive** write (`CuratedResource` DataPoint + `add_data_points`, **no prune**), deduped by url against the seed manifest and a promoted ledger, via `docker exec` into `mishkan-curated-mcp`. Stateful — the human runs it, never an agent, never automatic. 4. **Telemetry.** Ezra's curated short-circuit records the matched resource (a `CuratedLibraryHit` signal), so a promoted resource's usefulness is measurable and dead weight is auditable. **Why engineer-gated, not auto.** Auto-promotion was rejected: the curated surface is shared across every project, so an unvetted write pollutes it for all, and (per D-012) research output can carry PII/secrets — a human scrub before a shared write is the safety boundary. Phinehas reviews PII/secret-scrub on a candidate; Bezalel ratifies the additive-write design; Nathan authored the ADR. **Consequences.** *Positive:* curated grows from real work without a re-seed; the prune-based seed stays intact for bootstrap; the agent-layer read-only boundary is preserved end-to-end; dedup keeps the library clean; the hit signal makes promotions falsifiable. *Negative:* one more human-gated step (the engineer must run `mishkan knowledge curate`); the dedup ledger assumes these scripts own curated writes (a manual out-of-band write to curated would not be in the ledger — documented boundary). **Out of scope:** auto-promotion (rejected); the seed (`ensure-curated-box.sh` / `seed-curated-library.sh` unchanged); work-store promotion (`cognee-promote` work tiers unchanged); any agent curated-write tool (stays read-only). **Refs:** rides on D-005 (asymmetric delegation — the write is the engineer's hands), D-007 (curated library), D-008 (knowledge surfaces), D-012 (per-project store / PII boundary), D-013 (ontology types `CuratedResource` + `CuratedLibraryHit`), D-015 (the `mishkan knowledge curate` control surface). --- ## D-017 — User-editable model-tier routing via a preserved overlay (added 2026-06-13) **Context.** D-002 made `model-routing.yaml` the central tier map and the `model-route.py` hook reads it live, so a tier change takes effect on the next delegation with no reinstall. But two gaps made re-tiering impractical for the engineer: (1) it is developer-facing YAML with no friendly surface, and (2) `mishkan install` overwrites `model-routing.yaml` (it sits inside the `copyDir(payload/mishkan,…)` tree), so a hand edit is clobbered on the next update. The Fable suspension (D-002 amendment 2026-06-12) made this urgent: a tier can vanish overnight, and the engineer needs to adapt routing — for availability, cost, or preference — without editing source and reinstalling. **Decision.** Add a `mishkan model` control surface backed by a **preserved overlay file**: - **`model-routing.local.yaml`** holds the engineer's per-agent overrides. The hook reads the shipped default, then overlays local (local wins per-agent); fail-open is preserved (absent or malformed overlay → behaves exactly as the single-file routing). The installer **places it once and preserves it** — and because `copyDir` ships no such file and never deletes destination extras, a refresh can never clobber it (the same place-once philosophy as `engineer-standards.md`). This is an **overlay**, not a preserve-the-whole-file scheme: the shipped default keeps flowing on every update (new agents, baseline changes), while the engineer's deltas persist on top — no drift. - **`mishkan model show | set | reset [target]`** edits ONLY the overlay — never the shipped default, never the 45 frontmatter files. `team` expands via `org/org.json` (the same source as `mishkan org show`). `set` validates the tier and, for a **dormant** tier (`fable`, currently suspended), warns and confirms. `show` flags any agent routed to a dormant tier. **Why overlay over preserve-the-file (Bezalel ratified).** Preserving a user-edited `model-routing.yaml` would freeze it: future baseline changes (the Fable revert, a new agent, a re-tier) would never reach an engineer who once edited it — the installed-runtime-vs-source drift trap. The overlay keeps the two concerns separate: harness owns the default, engineer owns the deltas. **Consequences.** *Positive:* routing is now the engineer's to change at will, instantly (hook reads live) and durably (survives updates); responds to abrupt availability changes without a code change; no 45-frontmatter churn. *Negative:* two files now express routing (default + overlay) — mitigated by `mishkan model show` rendering the effective merge; an overlay entry set equal to the default is harmless redundancy. **Out of scope:** named profiles (`thrifty`/`max`) — `set`/`reset` cover edit-by-intent; profiles are a thin future follow-up over the same overlay. No auto-detection of model availability (the dormant warning is static). Agent frontmatter `model:` stays the shipped fallback used only if the hook is absent. **Refs:** rides on D-002 (the tier model + the routing hook), D-005 (the CLI is human-run), D-015 (the semantic `mishkan ` surface this extends with `model`). --- *Decisions locked May 2026. Revisit only with a dated amendment below.*