# Changelog All notable changes to Myco Brain are documented here. This project follows [Semantic Versioning](https://semver.org/). **Tool contracts are stable within a major version** — the inputs and outputs of the `brain_*` MCP tools will not break in a 1.x release. ## [1.2.9] — 2026-06-24 A first-run and surfacing release: connecting Myco is now a guided, self-healing walkthrough, every memory remembers which tool wrote it, and an agent can ask the brain how it is doing and get an answer. **Additive only** — two new MCP tools (`brain_set_mode`, `brain_self_check`) plus additive fields on existing tools; no breaking changes to any `brain_*` contract, and no migrations. ### Added - **`mycobrain-setup` — a guided, self-healing install.** A nine-step, consent-driven walkthrough replaces the cryptic fresh-machine failure: it checks prerequisites (with a fix for each), wires every detected client, verifies the connection for real (semantic search present and a live write test), offers a one-tap import of your ChatGPT or Claude history, and runs the doctor. Experts skip the asking with `--yes` / `--all` / `--client`. - **Memory provenance by tool.** Each connected client gets its own agent identity, so `brain_recall_memory` and `brain_context_pack` show which tool a memory came from, and a cross-tool recall is credited ("recalled from Cursor's memory"). `brain_stats` and `brain_self_check` show a per-tool breakdown so compounding across tools is visible. - **Surfacing modes and the self-check that talks.** `brain_set_mode` sets how loudly Myco surfaces (silent, ambient, or audit; silent by default). `brain_self_check` is a pull-only call that reports what is working, what is awaiting your approval, and any problem with a concrete fix. ### Fixed - `brain_context_pack` hybrid retrieval referenced an out-of-scope column that could fail whenever the embedding path ran; the query now joins it correctly. ## [1.2.8] — 2026-06-18 Version alignment only — no functional changes. The npm package was bumped to 1.2.8 during publishing (the compiled output is identical to 1.2.6); this entry brings the documented history in line with the published version, and adds the `mcpName` field (`io.github.thegoodguysla/myco-brain`) for the Official MCP Registry. ## [1.2.7] — 2026-06-18 Version alignment only — republish, no functional changes from 1.2.6. ## [1.2.6] — 2026-06-17 An agent-experience release: the agent contract now teaches Myco as the adjudication engine it is, and the doctor verifies your setup for real instead of trusting config. **No tool-contract changes** (the `brain_*` inputs/outputs are unchanged; `brain_ingest` gains an additive `extraction` field). No migrations. ### Changed - **Agent contract rewritten to embody "the program writes the facts, not the LLM."** The runtime contract, the pasteable CLAUDE.md / .cursorrules / AGENTS.md manual, the `brain_*` tool descriptions, and the website now teach the source-first write ladder (ingest a source -> propose a claim -> private save_memory) and the engine's mechanics: compounding confidence across independent sources, supersede-don't-overwrite on conflict, provenance via `brain_why`, and propose -> review -> promote. `brain_save_memory` is honestly fenced as a private scratchpad, never workspace truth. The runtime contract is single-sourced so the surfaces cannot drift (CI guard: `npm run check:contract-drift`). ### Added - **`mycobrain-doctor` now verifies live, not just config.** For the local Ollama path it pings the server, confirms the model is pulled, and runs a real embed and generation ("live-verified"); it adds an extraction-backlog check and a `--fix` mode that offers to pull missing models. `brain_ingest` returns an honest `extraction: "graph" | "search-only"` receipt so an agent knows whether a fact graph was actually built from the source. - **Agent-contract eval harness** (`npm run eval:contract`, gated on `ANTHROPIC_API_KEY`): scores whether an agent routes to the right tool across 20 adversarial scenarios, so the contract is measured and regression-guarded. ## [1.2.5] — 2026-06-17 A frictionless-install and value-surfacing release. **No tool-contract changes** — every `brain_*` input is unchanged; the new response fields below are additive and optional. No migrations. ### Added - **`npx @mycobrain/install` — one command to connect any client.** A new installer (`mycobrain-install`, fronted by the `@mycobrain/install` launcher) detects your MCP client and writes the right config for Claude Code, Claude Desktop, Cursor, Codex, or Windsurf (with `--print` snippets for Zed, Continue, Cline), then runs onboarding. Flags: `--client`, `--all`, `--print`, `--scope`, `--no-onboard`. - **Onboarding now indexes your own repo (opt-in).** On a fresh brain, `mycobrain-onboard` asks before indexing the current project, then proves recall on your own code in a fresh context — the fastest felt "it just knew", with no export wait. Decline to leave the workspace untouched; `--tour` runs on throwaway sample data instead. - **"Recalled from your memory" attribution.** `brain_recall_memory` and `brain_context_pack` attach an optional `attribution` credit line that decays as the workspace matures (`BRAIN_ATTRIBUTION`, `BRAIN_ATTRIBUTION_DECAY`). It rides in a structured field, never the result body. - **Pushed stats.** `brain_context_pack` adds a once-per-session `session_greeting` ("Myco has N facts indexed for this workspace"); `brain_save_memory` adds a log-spaced `milestone` toast (10/50/100/250, then every 250). - **`mycobrain-ingest --watch-downloads`.** Opt-in watcher that auto-imports a ChatGPT or Claude export `.zip` the moment it lands in `~/Downloads`. - **Paste-anywhere agent instructions.** `mycobrain-install` prints, and `docs/agent-instructions.md` documents, a portable CLAUDE.md / `.cursorrules` / AGENTS.md contract for when to recall, save, and cite. ## [1.2.4] — 2026-06-16 A reliability, security, and docs release. **No tool-contract changes** — the `brain_*` MCP tool inputs and outputs are unchanged. **Includes two database migrations** (`052`, `053`); apply on upgrade. (The `1.2.3` release was a docs-only benchmark correction; this release ships the prelaunch engine hardening that landed afterward.) ### Fixed - **Extraction-worker durability.** A worker that crashed or restarted mid-chunk left that chunk stranded in `processing` forever, and a retry-exhausted chunk was mislabeled `pending` instead of `failed`. The worker now reclaims stale `processing` chunks once their lease expires and marks retry-exhausted chunks terminally `failed`. *Proof: `npm run test:reliability`.* - **Contradiction / supersession robustness.** Concurrent contradictions of the same functional fact are serialized (no two active objects can result), predicate matching is separator-insensitive (`reports_to` ≡ `reports to`), and the claims ledger no longer duplicates on re-fired contradictions. *Proof: `npm run test:contradiction`.* - **Schema-proposal corroboration counts distinct documents.** `seen_count` is derived from the true distinct-source set, so two documents alternating can no longer reach the auto-promote gate; `brain_why` source counts are per fact, not per edge row. *Proof: `npm run test:proposal-sources`.* ### Changed - **Workspace-scoped dynamic type catalogs.** Under `BRAIN_SCHEMA_AUTO_PROMOTE=1`, a workspace's auto-promoted entity-kind / relation-type names were written into the global catalog (visible to other workspaces). Promoted types are now scoped to their workspace; the canonical seed stays global. *Proof: `npm run test:schema-promotion`.* ### Security - **`form-data` advisory (CRLF injection).** Resolved the transitive `form-data` dependency pulled via `@anthropic-ai/sdk`; `npm audit` reports 0 vulnerabilities. - **stdio auth hardened (defense-in-depth).** The stdio MCP server now derives agent/workspace identity from the environment and ignores caller-supplied `api_key` / `workspace_id` / `agent_id` by default — set `BRAIN_TRUST_REQUEST_IDENTITY=1` to opt back in for a real multi-tenant gateway — and a service-role JWT must now **equal** `BRAIN_SERVICE_ROLE_KEY` rather than merely look like a JWT. Closes a prompt-injection path to another workspace in a multi-tenant deployment; no change for single-tenant self-host (identity already came from env). *Proof: `auth.test.ts`.* ### Docs - README / SECURITY: honest RLS/superuser disclosure (the default `brain` role is a Postgres superuser that bypasses RLS — multi-tenant isolation binds only under the least-privilege `brain_app` role), edge survival cited as **~80% (11–12 of 14, ≥75% gate)** rather than a bare 79%, and a reframed comparison table. Documented the `brain_search` `reranker` argument. The LongMemEval headline (73.6% oracle QA) is now backed by committed n=500 result files so it reproduces from a clone. - Added a consolidated **environment-variable reference** to the README and documented the identity vars `BRAIN_TRUST_REQUEST_IDENTITY`, `BRAIN_AGENT_ID`, and `BRAIN_SERVICE_ROLE_KEY` (with a matching `.env.example`); corrected the api-reference note that per-call `workspace_id`/`api_key` are honored on stdio (they are ignored by default post stdio-auth hardening). ### Migrations - `20260616000052_workspace_scoped_catalogs.sql` - `20260616000053_schema_proposal_distinct_sources.sql` ## [1.2.3] — 2026-06-16 A documentation-accuracy release. **No code or tool-contract changes** — purely a correction to a published benchmark figure, in keeping with our rule that every number we publish reproduces on the full set. ### Fixed - **recall@5 corrected to the full 500-question run.** The recency-reranker recall@5 figures introduced in 1.2.2 ("86% → ~92%") were measured on a **100-question sample**. Re-run on the **full 500-question** LongMemEval `longmemeval_s` set, the numbers are **89.2%** (default `hybrid`) → **91.6%** (`recency`) — the default is actually *higher* than the sample showed, and the reranker holds ~92%. Independently re-run end-to-end and reproduced **exactly** (zero drift; retrieval is deterministic). Reproduce: `python -m evals.longmemeval.run --subset longmemeval_s -n 500 --no-qa` (compare the `hybrid` and `temporal` rows, `ev_at_5`). ## [1.2.2] — 2026-06-16 A retrieval-quality release. No tool-contract changes (the reranker is an optional, additive `brain_search` argument). ### Added - **Keyless recency reranker.** `brain_search(reranker: 'recency')` reorders results by `final = 0.7·relevance + 0.3·recency_norm` — deterministic, no API key, no network call. Lifts recall@5 from 86% to ~92% on the LongMemEval `longmemeval_s` subset (the recency reranker scores ~92% recall@5 in the eval harness, which mirrors the production retrieval path and runs the identical formula). 15/15 reranker tests pass; reproduce with `python -m evals.longmemeval.run --subset longmemeval_s --no-qa` (compare the `hybrid` and `temporal` rows). ### Fixed - **Eval robustness.** Embedder inputs are sanitized so a single bad batch can no longer silently skew benchmark results. - **README accuracy.** The relation edge-survival figure is corrected to `0% → 79%` (86% is the directed-extraction accuracy, not edge survival). ## [1.2.1] — 2026-06-13 A reliability, security, and onboarding release. No tool-contract changes. ### Added - **Import your ChatGPT / Claude history.** `mycobrain-ingest --from chatgpt-export ./export.zip` and `--from claude-export ./export.zip` turn an OpenAI or claude.ai data export into provenance-tracked, deduplicated, searchable memory — one document per conversation. ChatGPT branched conversations import the ACTIVE transcript (not rejected regenerations); re-importing the same export never duplicates (content-hash dedup); `brain_why` traces every imported fact back to its export file. Proof: `npm run test:export-import`. - **Out-of-box agent instructions.** The MCP server now ships a usage contract (`instructions`) to every connected client at initialization, so agents know WHEN to recall, save, and cite without per-project setup. Plus a copy-paste behavioral block in `docs/agent-setup.md`. - **Demo corpus contradiction + "magic moment."** The bundled demo corpus carries a deliberate contradiction (a person changes employers); with the keyless local graph running, the trust engine supersedes the old fact (kept, not deleted) and `brain_why` shows both sources — supersession on the user's own ingested data, not a scripted demo. ### Fixed - **`brain_save_memory` works out of the box.** Previously failed with "Validation error: Required" because `idempotency_key` / `trace_id` / `raw_payload` were required but not advertised; they now auto-default. - **Security — agent identity comes only from the key.** For `brain_*` API keys, caller-supplied `workspace_id` / `agent_id` tool arguments are now ignored (they could let one agent impersonate another and read its private memories); identity overrides remain a service-role-only privilege. `api_key`/secrets are redacted before being written to the `brain_queries` audit table, and the docker-compose ports bind to `127.0.0.1`. Regression in `npm run test:sharing`. - **Local-model extraction hardening.** Ollama calls cap generation and time out (no more runaway-loop stalls that froze the extraction queue), retry transient drops, and recover facts from truncated JSON; relation predicates are canonicalized (so `now works for` matches `works for` for supersession), duplicate triples deduped, and sub-threshold relations queue for review instead of being silently dropped. - **Onboarding friction.** The README hero block now runs verbatim from a fresh clone (the ingest CLI defaults to the quickstart stack when no env is set); a "Connect your client" section leads with a Claude Code one-liner. - **Honesty/accuracy.** Server version is read from `package.json`; the direction-accuracy figure is corrected to the re-measured **86%** (the confidence-emission prompt fix traded a few fixture points to unblock the trust engine on local models); an internal ticket id was removed from a tool description. ## [1.2.0] — 2026-06-12 ### Added - **Full dynamic schema — gated auto-promotion.** The propose-and-surface loop (phase 1) now completes: proposals corroborated by enough DISTINCT documents (`seen_count`, tracked per source) at high confidence **auto-promote into the live catalogs** (`entity_kinds` / `relation_types`) — and the promoted type is immediately usable by the next extraction batch. Strictly opt-in (`BRAIN_SCHEMA_AUTO_PROMOTE=1`, thresholds `BRAIN_SCHEMA_PROMOTE_MIN_SEEN`=3 / `BRAIN_SCHEMA_PROMOTE_MIN_CONFIDENCE`=0.8); strict curation mode always wins. Audit trail lives on the proposal row (`extracted_by`, evidence counts, `state='auto_promoted'`, `applied_id`, `reviewed_at`); `brain_stats.schema` gains `types_auto_promoted`. Covered by the full-loop gated check `npm run test:schema-promotion` (default-off verified, promotion verified, promoted-kind-becomes-usable verified — no LLM required) plus 8 unit tests. Migration `20260612000049` (idempotent) adds the corroboration counter. - **Compounding confidence — the full engine.** A fact's confidence now **rises with independent corroboration and falls on contradiction**, and contradicted facts are **superseded, never silently overwritten**: - *Corroboration*: every relation sighting is recorded as `relation_evidence` (one row per source document per edge — re-extractions used to be silently discarded); the edge's confidence is recomputed with a damped noisy-OR anchored on the strongest source (α=0.4, cap 0.95; a single-source edge keeps exactly the confidence extraction gave it). - *Contradiction*: on **functional predicates** (one current object per subject — defaults `works for`, `reports to`, `located in`; extend via `BRAIN_FUNCTIONAL_PREDICATES`), a confident conflicting observation closes the old edge (`valid_to`), weakens it (`conf × (1 − α·c)`), and records the supersession in the **claims ledger** (`superseded_by` chain) — history stays queryable. - *Surfaces*: `brain_why` pairwise provenance gains `independent_sources`, a `confidence_trend` derived from the `vc` audit trail (e.g. `"0.8 → 0.86"`), and a `superseded_relations` list (contradictions are visible, not hidden). `brain_stats` gains an `evidence` section (corroborated / superseded counts, mean edge confidence) and a summary clause. All additive — no tool contract changes. - Covered by `npm run test:compounding` (full lifecycle end-to-end against a live DB, no LLM needed) plus 17 new unit tests; proven live with `ollama:llama3.2:3b` (two documents asserting different employers → old edge `[SUPERSEDED]`, new edge `[ACTIVE]`, claims chain recorded). The LongMemEval harness never runs extraction, so the benchmark number is structurally unaffected. - **Per-object sharing enforcement.** Documents marked `private` (`sharing_type_id = 1`) are now actually private: readable only by the agent that created them (plus service-role callers) across `brain_search`, `brain_context_pack`, `brain_recall_memory`, `brain_why`, `brain_neighbors`, and `brain_get_related`. Ingest now records the creating agent (`hyobjects.agent_id`); private rows with no recorded creator stay hidden from non-service callers (conservative by design). Workspace/org/public/ llm_readable documents behave exactly as before. Covered by a two-agent visibility-matrix check (`npm run test:sharing`, no LLM required). - **LongMemEval benchmark harness ships in-repo** (`evals/longmemeval/`) so the headline number is reproducible by anyone, not asserted: **73.6% end-to-end QA accuracy on the complete 500-question `oracle` subset** (no sampling) with **100% evidence-retrieval recall** — reader `gpt-4o-mini`, judge `gpt-4o`. Self-contained Python harness (own `requirements.txt`, offline unit tests, per-question workspace isolation, automatic purge); methodology and per-category breakdown in `evals/longmemeval/README.md`; README gains a "Benchmark — run it yourself" section. ## [1.1.0] — 2026-06-11 ### Fixed - **Relation endpoint recovery — graph edges now survive small-model extraction.** Small local models frequently emit a correct relation while omitting one endpoint from `entities`; the worker's anti-hallucination guard (relations never create entities) then silently dropped the edge. Measured on the gold fixture with `llama3.2:3b`: **0% of emitted relations had both endpoints listed — i.e. zero graph edges survived** on a fresh keyless install. `extract()` now runs a best-effort second pass that asks the model to classify exactly the missing endpoint names against the same text (example-anchored prompt; anything beyond the requested names is discarded). Edge survival: **0% → 79%**, direction accuracy unchanged at 86%; the remainder are junk phrase-objects that are correctly rejected. The direction check now also measures and gates **endpoint completeness** (`BRAIN_ENDPOINT_MIN_COMPLETENESS`, default 0.75). ### Changed - **Relationship extraction is now direction-aware.** The extraction prompt states that relations are directed (`subject → predicate → object`), defines subject vs. object, and gives worked directional examples (including the passive-voice trap that flips small models). On the gold fixture this lifts `llama3.2:3b` direction accuracy from ~79% to ~86%. The prompt and provider calls moved to `extraction.ts` (importable without starting the worker loop). ### Added - **Strict curation mode.** `BRAIN_REQUIRE_HUMAN_REVIEW=1` disables all auto-promotion: every extracted entity and relation waits in the review queue (`proposed_*`, state=`pending`) and nothing reaches the canonical graph without a human decision — the LLM writes proposals, the human writes the graph. Covered by a differential end-to-end check (`npm run test:strict-mode`, no LLM required). - **Seeded relationship-type catalog.** Migration `20260611000048` seeds `relation_types` with the eight canonical predicates the extraction prompt teaches (idempotent; tolerant of `ASSIGNED_TO`-style naming from existing deployments). The dynamic-schema known-predicate filter now has a real baseline instead of an empty catalog. - **Dynamic schema (phase 1) — propose and surface.** The extraction worker now records entity kinds and relationship predicates it observes that aren't in the `entity_kinds` / `relation_types` catalogs as **pending** `schema_proposals` rows (provenance-tagged, deduped per workspace by the unique type+name constraint; min confidence via `BRAIN_SCHEMA_PROPOSAL_MIN_CONFIDENCE`, default 0.6, inclusive). `brain_stats` gains a `schema` section and the summary surfaces *"Brain proposed N new types from your data (pending review)"*. Promotion stays manual — nothing auto-applies: entities with a not-yet-cataloged kind stay in the review queue and are **not** auto-promoted into the canonical graph (so a novel "product" can never be mislabeled as an organization or fuzzy-merged into one). Migration `20260611000047` (idempotent) extends the `proposal_type` CHECK with `entity_kind`; the proposal write runs under a savepoint, so a database that hasn't applied the migration degrades to "no proposals recorded" instead of failing extraction. The extraction prompt now lets the model name a novel kind when none of the canonical four fits; the prompt's own example predicates are treated as canonical and never proposed as "new". - **Direction regression test.** `src/extraction-direction.ts` defines a gold fixture of clearly-directed facts (from `examples/demo-corpus`, with passive-voice traps) and a `scoreDirection` scorer (deterministic unit tests in `extraction-direction.test.ts`). `test/extraction-direction-check.mjs` (`npm run test:direction`) measures a model's directed accuracy and fails below `BRAIN_DIRECTION_MIN_ACCURACY` (default 0.8); it skips when no real extraction model is configured, so the keyless CI quickstart is unaffected. - Recommended-model + measured-accuracy note in `docs/relationship-extraction.md`. - **Local (keyless) vector search** via Ollama `nomic-embed-text` (768d), selected with `BRAIN_EMBED_PROVIDER=ollama` (auto-selected when an Ollama base URL is configured and no OpenAI key is present). Runs alongside the existing OpenAI `text-embedding-3-small` (1536d) path; each provider stores vectors in its own table (`chunks_ollama_nomic` / `chunks_openai3small`), so switching providers never causes a dimension clash. `brain_search`, `brain_context_pack`, and `brain_recall_memory` join the active provider's table, and `brain_stats` counts embedded chunks across both. Migration `20260611000046_chunks_ollama_nomic.sql` (idempotent) adds the 768d table. Semantic search now works with no hosted API key at all. ## [1.0.0] First public release. ### Added - Self-hosted MCP server exposing **11 tools** (`brain_context_pack`, `brain_search`, `brain_why`, `brain_neighbors`, `brain_ingest`, `brain_propose_fact`, `brain_annotate`, `brain_save_memory`, `brain_recall_memory`, `brain_get_related`, `brain_stats`). - Postgres 16 + pgvector schema (42 migrations) with a seeded default workspace so `docker compose up` works with zero configuration. - Keyless full-text (BM25) search and document ingestion — no API keys needed. - Content-hash **deduplication** on ingest and queryable **provenance** (`brain_why`), with an evidence summary ("supported by N mentions across M sources"). - **Bulk ingest CLI** (`mycobrain-ingest`) for local folders and GitHub repos. - **Knowledge graph**: entity extraction, entity resolution, and entity-to-entity relationships — built fully locally via Ollama (no API key) or via Anthropic for best accuracy. - Reproducible dedup + provenance benchmark (`examples/benchmark/`). - CI that boots the real Docker stack and runs all tools end-to-end. ### Notes - Vector (semantic) search requires an OpenAI key; full-text search is keyless. Local embeddings are on the [roadmap](./ROADMAP.md). - Managed cloud is waitlist-only and not part of this release.