# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [2.5.3](https://github.com/cameronrye/openzim-mcp/compare/v2.5.2...v2.5.3) (2026-07-01)


### Fixed

* invalidate caches when ZIM files change at runtime ([#308](https://github.com/cameronrye/openzim-mcp/issues/308)) ([0b95461](https://github.com/cameronrye/openzim-mcp/commit/0b9546177d0b0242b41a4e58c89878a4adfe9450))

## [2.5.2](https://github.com/cameronrye/openzim-mcp/compare/v2.5.1...v2.5.2) (2026-06-24)


### Fixed

* resolve v2.5.1 end-to-end test findings (cuts v2.5.2) ([#306](https://github.com/cameronrye/openzim-mcp/issues/306)) ([d58d7b6](https://github.com/cameronrye/openzim-mcp/commit/d58d7b6d41e41b821e99cf29ba5ef8fb50bb8132))


### Documentation

* refresh README and site copy off the out-of-date v2.0.0/Phase F framing ([2956d54](https://github.com/cameronrye/openzim-mcp/commit/2956d54ffdd334fbceaf3a56467174a1e83bf628))

## [2.5.1](https://github.com/cameronrye/openzim-mcp/compare/v2.5.0...v2.5.1) (2026-06-22)


### Fixed

* **deps:** bump msgpack to 1.2.1 and pydantic-settings to 2.14.2 ([#302](https://github.com/cameronrye/openzim-mcp/issues/302)) ([c58baa9](https://github.com/cameronrye/openzim-mcp/commit/c58baa9bf0e0bece36e97ffccac11f5b17d06f1c))

## [2.5.0](https://github.com/cameronrye/openzim-mcp/compare/v2.4.5...v2.5.0) (2026-06-18)


### Added

* **dist:** publish to Smithery + MCP Registry via a reproducible .mcpb pipeline ([#298](https://github.com/cameronrye/openzim-mcp/issues/298)) ([bc1898d](https://github.com/cameronrye/openzim-mcp/commit/bc1898d9b90055991408870318f59ad3c1f15609))

## [2.4.5](https://github.com/cameronrye/openzim-mcp/compare/v2.4.4...v2.4.5) (2026-06-16)


### Documentation

* **changelog:** backfill 2.4.4 entries for [#294](https://github.com/cameronrye/openzim-mcp/issues/294); exclude generated changelog from markdownlint ([3692796](https://github.com/cameronrye/openzim-mcp/commit/3692796be3b4c15184a2ae576193972550ec6f1a))

## [2.4.4](https://github.com/cameronrye/openzim-mcp/compare/v2.4.3...v2.4.4) (2026-06-16)


### Added

* **presets:** trim the Stack Exchange answer vote-score prefix from `q_and_a` summaries (v2.5 [#17](https://github.com/cameronrye/openzim-mcp/issues/17) a2) ([#294](https://github.com/cameronrye/openzim-mcp/issues/294)) ([6a583fe](https://github.com/cameronrye/openzim-mcp/commit/6a583fea4bd8eea7d172cbaee6b99bad2c31ac11))


### Fixed

* close the H14/M31 residuals from the 2026-06-10 review — strip the rejected `cursor` param copy-on-write in `zim_search`, thread `validated_path` into the synthesize snippet render cache, and rebuild the title-merge result copy-on-write to stop cache poisoning ([#294](https://github.com/cameronrye/openzim-mcp/issues/294)) ([6a583fe](https://github.com/cameronrye/openzim-mcp/commit/6a583fea4bd8eea7d172cbaee6b99bad2c31ac11))
* **deps:** clear website vite + astro security advisories (npm audit → 0) ([#295](https://github.com/cameronrye/openzim-mcp/issues/295)) ([0e86731](https://github.com/cameronrye/openzim-mcp/commit/0e86731bdaf030575daef6f172738ec9e6f8d723))

## [2.4.3](https://github.com/cameronrye/openzim-mcp/compare/v2.4.2...v2.4.3) (2026-06-15)


### Fixed

* cap search-family limit to prevent unbounded result materialization ([#291](https://github.com/cameronrye/openzim-mcp/issues/291)) ([650a150](https://github.com/cameronrye/openzim-mcp/commit/650a15045d1b4e02d64c1fab5999e0d03262d850))
* **deps:** floor cryptography>=48.0.1, python-multipart>=0.0.31, starlette>=1.3.1 to clear 8 pip-audit advisories (GHSA-537c-gmf6-5ccf; CVE-2026-53538/-53539/-53540; CVE-2026-48817/-48818/-54282/-54283). ([650a150](https://github.com/cameronrye/openzim-mcp/commit/650a15045d1b4e02d64c1fab5999e0d03262d850))

## [2.4.2](https://github.com/cameronrye/openzim-mcp/compare/v2.4.1...v2.4.2) (2026-06-15)


### Fixed

* **deps:** clear website npm-audit advisories (esbuild, yaml) ([#288](https://github.com/cameronrye/openzim-mcp/issues/288)) ([41dffd6](https://github.com/cameronrye/openzim-mcp/commit/41dffd61fa07cc6c0cfacc5035d77b3cc3b85506))
* resolve 16 findings from real-world zim_query testing ([#287](https://github.com/cameronrye/openzim-mcp/issues/287)) ([bbb9382](https://github.com/cameronrye/openzim-mcp/commit/bbb93823d89739e94fa2b2df1d0a8ec05a1588b8))

## [2.4.1](https://github.com/cameronrye/openzim-mcp/compare/v2.4.0...v2.4.1) (2026-06-11)


### Fixed

* address all confirmed findings from the 2026-06-10 code review ([#284](https://github.com/cameronrye/openzim-mcp/issues/284)) ([32c39ad](https://github.com/cameronrye/openzim-mcp/commit/32c39ad55abc19345a5dd8821840e468ccd94068))

## [2.4.0](https://github.com/cameronrye/openzim-mcp/compare/v2.3.0...v2.4.0) (2026-06-09)


### Added

* dispatch-eval honours either_acceptable for prose sub-modes (v2.5 [#199](https://github.com/cameronrye/openzim-mcp/issues/199)) ([#281](https://github.com/cameronrye/openzim-mcp/issues/281)) ([d8ab040](https://github.com/cameronrye/openzim-mcp/commit/d8ab040fcd41ab1bb040cbbecf856efb9ded7578))
* finer-grained build link-graph CLI errors (v2.5 [#16](https://github.com/cameronrye/openzim-mcp/issues/16)) ([#279](https://github.com/cameronrye/openzim-mcp/issues/279)) ([75fb0e8](https://github.com/cameronrye/openzim-mcp/commit/75fb0e8db38dfcabb4a008db540ab34bf4347d9d))
* link-graph sidecar v2 — builder_version + per-edge anchor_text (v2.5 [#16](https://github.com/cameronrye/openzim-mcp/issues/16)) ([#280](https://github.com/cameronrye/openzim-mcp/issues/280)) ([709d4c3](https://github.com/cameronrye/openzim-mcp/commit/709d4c3d5ca52633fb1070c25fe46666a0a9bb0a))
* zim_get_section true raw-text path (v2.5 [#18](https://github.com/cameronrye/openzim-mcp/issues/18)) ([#277](https://github.com/cameronrye/openzim-mcp/issues/277)) ([6057b6c](https://github.com/cameronrye/openzim-mcp/commit/6057b6c6d554497677231345b1c76049662c67c7))


### Documentation

* **roadmap:** v2.5.0a2 shipped (v2.3.0), v2.5.0a3 landed ([#282](https://github.com/cameronrye/openzim-mcp/issues/282)) ([0408551](https://github.com/cameronrye/openzim-mcp/commit/0408551bae05211eef70ef8f455a2f2897f39beb))

## [2.3.0](https://github.com/cameronrye/openzim-mcp/compare/v2.2.2...v2.3.0) (2026-06-08)


### Added

* inbound link-graph sidecar (v2.5 [#16](https://github.com/cameronrye/openzim-mcp/issues/16)) ([#274](https://github.com/cameronrye/openzim-mcp/issues/274)) ([f87a214](https://github.com/cameronrye/openzim-mcp/commit/f87a21410b6a6395905f90816563fa52351107c5))


### Fixed

* **prompts:** route MCP prompts to the v2 tool surface + correct response docs ([f625ba4](https://github.com/cameronrye/openzim-mcp/commit/f625ba4bae5ce4c8c705f1503baa39c0d7102c87))
* **zim_browse,zim_links:** wire cursor pagination through to the data layer ([7fd0222](https://github.com/cameronrye/openzim-mcp/commit/7fd0222ac8732bea765b07fb685f2f7edadc8ce5))


### Documentation

* align tool-surface references with the v2 contract ([3d9168c](https://github.com/cameronrye/openzim-mcp/commit/3d9168c8a5358a3334ebaa395159c4e9904f8e38))
* **roadmap:** make the sub-D-3/sub-D-4 close-by-default decision explicit ([85a5d8b](https://github.com/cameronrye/openzim-mcp/commit/85a5d8b85ad57f5bc45d68ebbbea613fc9ba23bf))

## [2.2.2](https://github.com/cameronrye/openzim-mcp/compare/v2.2.1...v2.2.2) (2026-06-07)


### Documentation

* sync roadmap to v2.2.1 + record the [#17](https://github.com/cameronrye/openzim-mcp/issues/17) presets reprobe ([#270](https://github.com/cameronrye/openzim-mcp/issues/270)) ([d31de36](https://github.com/cameronrye/openzim-mcp/commit/d31de36b85c1b2a587ab538427cfaa17286ad3b1))

## [2.2.1](https://github.com/cameronrye/openzim-mcp/compare/v2.2.0...v2.2.1) (2026-06-05)


### Fixed

* **docker:** default image to stdio so bare `docker run` works ([#267](https://github.com/cameronrye/openzim-mcp/issues/267)) ([36a6258](https://github.com/cameronrye/openzim-mcp/commit/36a625851ec0f11f5decb1f784e549e49a619856))

## [2.2.0](https://github.com/cameronrye/openzim-mcp/compare/v2.1.8...v2.2.0) (2026-06-05)


### Added

* archive-type presets (v2.5 [#17](https://github.com/cameronrye/openzim-mcp/issues/17)) ([#265](https://github.com/cameronrye/openzim-mcp/issues/265)) ([f0cd9ad](https://github.com/cameronrye/openzim-mcp/commit/f0cd9adf2ff1b90ca4c68ca9465a04c0dd2a0a02))

## [2.1.8](https://github.com/cameronrye/openzim-mcp/compare/v2.1.7...v2.1.8) (2026-06-04)


### Documentation

* sync version facts to v2.1.7 ([#263](https://github.com/cameronrye/openzim-mcp/issues/263)) ([17d8f6a](https://github.com/cameronrye/openzim-mcp/commit/17d8f6ab8a467af99ab2ab556492f03a04251980))

## [2.1.7](https://github.com/cameronrye/openzim-mcp/compare/v2.1.6...v2.1.7) (2026-06-04)


### Fixed

* **synthesize:** resolve 2-token tail-hijack defect class ([#252](https://github.com/cameronrye/openzim-mcp/issues/252), [#253](https://github.com/cameronrye/openzim-mcp/issues/253)) ([#261](https://github.com/cameronrye/openzim-mcp/issues/261)) ([67c4f9c](https://github.com/cameronrye/openzim-mcp/commit/67c4f9c5c4b3f78fccecca87dc4db903e6e71e7f))

## [2.1.6](https://github.com/cameronrye/openzim-mcp/compare/v2.1.5...v2.1.6) (2026-06-04)


### Fixed

* pyjwt 2.13.0 security floor + repair manifest-mode release tagging ([#258](https://github.com/cameronrye/openzim-mcp/issues/258)) ([9b290ad](https://github.com/cameronrye/openzim-mcp/commit/9b290ad617ae97173c50366ff711128879d2acc4))

## [2.1.5](https://github.com/cameronrye/openzim-mcp/compare/v2.1.4...v2.1.5) (2026-06-03)


### Fixed

* **release:** run release-please in pure manifest mode so config + extra-files apply ([#256](https://github.com/cameronrye/openzim-mcp/issues/256)) ([7491aaa](https://github.com/cameronrye/openzim-mcp/commit/7491aaa9610d97d1ef9a8a0676c79a79e5e545a3))

## [2.1.4](https://github.com/cameronrye/openzim-mcp/compare/v2.1.3...v2.1.4) (2026-06-02)


### Bug Fixes

* guard synthesize tail-promotion against off-topic tail-hijacks ([#250](https://github.com/cameronrye/openzim-mcp/issues/250)) ([7512d60](https://github.com/cameronrye/openzim-mcp/commit/7512d60b2772e9a77d2f128769867761075e3288))

## [2.1.3](https://github.com/cameronrye/openzim-mcp/compare/v2.1.2...v2.1.3) (2026-06-02)


### Bug Fixes

* clear three deferred defects — synthesize cross-archive leak, MedlinePlus furniture, non-article-asset browse/walk filter ([c494e6a](https://github.com/cameronrye/openzim-mcp/commit/c494e6a978313e70bc17596f618ebf6e6dbba8ea))

## [2.1.2](https://github.com/cameronrye/openzim-mcp/compare/v2.1.1...v2.1.2) (2026-06-01)


### Bug Fixes

* **http:** port-expand bare allowed-hosts so proxied Host:port passes ([55ad602](https://github.com/cameronrye/openzim-mcp/commit/55ad6025c47e883da90da43354d5fcd4bc574a66))


### Documentation

* sync README + website to the shipped v2.1.1 surface ([#246](https://github.com/cameronrye/openzim-mcp/issues/246)) ([9264f15](https://github.com/cameronrye/openzim-mcp/commit/9264f15a79e6d51f1e518620667093343685e93a))

## [2.1.1](https://github.com/cameronrye/openzim-mcp/compare/v2.1.0...v2.1.1) (2026-05-30)


### Bug Fixes

* post-v2.1.0 beta-test sweep — highlighter empty-link protection + related-articles media exclusion ([#232](https://github.com/cameronrye/openzim-mcp/issues/232)) ([7d0d6df](https://github.com/cameronrye/openzim-mcp/commit/7d0d6dfa53c6d020a456e2f3d3144295e74fc9ee))

## [2.1.0](https://github.com/cameronrye/openzim-mcp/compare/v2.0.5...v2.1.0) (2026-05-29)


### Features

* surface native libzim reader capabilities (validate / identity / index / Counter / exact-title / cache tuning) ([#221](https://github.com/cameronrye/openzim-mcp/issues/221)) ([0ca65b5](https://github.com/cameronrye/openzim-mcp/commit/0ca65b5d9dadde2af17f3fdb77ad58a6ef93063d))


### Bug Fixes

* scope content tools to main content (chrome leak), restore TOC nesting, dedupe search URL variants ([#225](https://github.com/cameronrye/openzim-mcp/issues/225)) ([659d987](https://github.com/cameronrye/openzim-mcp/commit/659d987747a7e9ac8524d10f40a8e5780a0bb95b))

## [2.0.5](https://github.com/cameronrye/openzim-mcp/compare/v2.0.4...v2.0.5) (2026-05-29)


### Bug Fixes

* trigger release-please bump (parse-error recovery for [#219](https://github.com/cameronrye/openzim-mcp/issues/219)) ([251ba20](https://github.com/cameronrye/openzim-mcp/commit/251ba2014017d0719f847930deea270a9d058b24))

## [2.0.4](https://github.com/cameronrye/openzim-mcp/compare/v2.0.3...v2.0.4) (2026-05-28)


### Reverts

* PR [#215](https://github.com/cameronrye/openzim-mcp/issues/215) release.yml recovery hack — immutable-releases disabled at repo level ([#217](https://github.com/cameronrye/openzim-mcp/issues/217)) ([6951e09](https://github.com/cameronrye/openzim-mcp/commit/6951e09c698711bd37648d6d6bc409afadfb3386))

## [2.0.3](https://github.com/cameronrye/openzim-mcp/compare/v2.0.2...v2.0.3) (2026-05-28)


### Bug Fixes

* **release:** recover from immutable-release lockout when release-please publishes draft too early ([#215](https://github.com/cameronrye/openzim-mcp/issues/215)) ([2d8e458](https://github.com/cameronrye/openzim-mcp/commit/2d8e458f3579d177e7d4440e7d434661dbb7bbd6))

## [2.0.2](https://github.com/cameronrye/openzim-mcp/compare/v2.0.1...v2.0.2) (2026-05-28)


### Bug Fixes

* post-v2.0.0 beta-test sweep — 7 dispatcher defects + website 3-pass audit ([#213](https://github.com/cameronrye/openzim-mcp/issues/213)) ([d3402fb](https://github.com/cameronrye/openzim-mcp/commit/d3402fb564de4f4a5c4acaef933728df61c03917))

## [2.0.1](https://github.com/cameronrye/openzim-mcp/compare/v2.0.0...v2.0.1) (2026-05-27)


### Bug Fixes

* **codeql:** clear 2 py/empty-except alerts in dispatch-eval runner ([b3b47e8](https://github.com/cameronrye/openzim-mcp/commit/b3b47e8c15f11b06cdd60634bf0b7e0b29afa987))
* **docs:** zim_browse mode is 'page'|'walk', not 'browse'|'walk' ([#203](https://github.com/cameronrye/openzim-mcp/issues/203)) ([6fddc82](https://github.com/cameronrye/openzim-mcp/commit/6fddc82afd4887c10a7005d1f79b9e4c2ac0e219))


### Documentation

* Astro scaffold + landing v1→v2 refresh + Astro build in deploy workflow ([#205](https://github.com/cameronrye/openzim-mcp/issues/205)) ([3ef225e](https://github.com/cameronrye/openzim-mcp/commit/3ef225eae3b8e4282f0639d15c924b53740ef71f))
* comprehensive v2.0.0 surface update across README, website, deployment ([#201](https://github.com/cameronrye/openzim-mcp/issues/201)) ([3ba32f3](https://github.com/cameronrye/openzim-mcp/commit/3ba32f33df5f857f1f69f3d494f01ac84880ce8d))
* consolidate v2 planning artifacts into single roadmap ([5930dad](https://github.com/cameronrye/openzim-mcp/commit/5930dadf3437f7fad710b6441609c730d5c8f984))
* drop v2.5 references from README + website/llms.txt ([#204](https://github.com/cameronrye/openzim-mcp/issues/204)) ([3a91587](https://github.com/cameronrye/openzim-mcp/commit/3a9158748148bc4ca1d1a865ce8b354050b1a5d2))
* migrate 15-page wiki to Astro docs collection + retire docs/deployment.md ([#206](https://github.com/cameronrye/openzim-mcp/issues/206)) ([2beed02](https://github.com/cameronrye/openzim-mcp/commit/2beed027609bb324fb521945c9639802a0c97680))
* post-v2.0.0 documentation consolidation design spec + 4-PR plan ([#209](https://github.com/cameronrye/openzim-mcp/issues/209)) ([e8920a2](https://github.com/cameronrye/openzim-mcp/commit/e8920a2b6ae54f2ecb778108e4584d4f184a46fe))
* restore ❤️ in 'Made with' attribution (README + landing) ([#210](https://github.com/cameronrye/openzim-mcp/issues/210)) ([081397f](https://github.com/cameronrye/openzim-mcp/commit/081397f525144fa8cf129b392e1896efdf30d033))
* slim README to 147-line project card + relocate Dev/Test to CONTRIBUTING.md + fix /docs/ root URL ([#207](https://github.com/cameronrye/openzim-mcp/issues/207)) ([289b3ab](https://github.com/cameronrye/openzim-mcp/commit/289b3ab4edf19724c836c3204b06d47d33f3edb8))
* **website:** fix loose ends from docs-consolidation cut ([#211](https://github.com/cameronrye/openzim-mcp/issues/211)) ([085fb37](https://github.com/cameronrye/openzim-mcp/commit/085fb3787d07cef9eebfdef392e0ecdec1d65f7b))

## [2.0.0] — 2026-05-27 — Phase F Stage D ships: 8-tool surface

Final cut after `v2.0.0rc1` (PR #194). No surface or behavior changes vs rc1 — this is
the stabilization commit. See the `[2.0.0rc1]` section below for the full surface change
and migration table.

### Stage E verdict

- **E1 dispatch sweep** (Task E1, full 512-probe Gate 0b set, 5 reps): 2560 outcomes
  against Qwen3-8B-Q4 via `chat.owl-atlas.ts.net`, 0 errors. Overall dispatch accuracy
  76.7% (1964/2560), 5 spurious routes. Baseline committed at
  `tests/dispatch_eval/runs/rc1__advanced__qwen3-8b-q4__2026-05-27T03-31-27Z.jsonl`.
- **F2 enforcement** (Task E3 Step 3): **PASS** (`f2_pass=true`, `failures=[]`). Per-class
  delta ceiling at 10pp holds for every new Phase F operation class on the primary cell.
  (Haiku / Llama / Phi cells remain unavailable per `gate_0b_decision.json`
  `scope_limitations` — same posture as Gate 0b and rc1.)
- **E2 disposition** (Task E2): the dedicated 24-legal-probes test was not authored. The
  schema-bypass half (Task D15, `tests/test_phase_f_schema_bypass.py`) passes 15/15. The
  legal half is effectively covered by the 122 unique `zim_get-*` dispatch probes from E1
  against live Wikipedia plus the branch-level unit tests in `tests/test_zim_get.py`.
- **E4** (migration conformance in CI): `tests/test_phase_f_migration.py` runs in the
  default pytest suite; verified during PR #194 CI.

### Known limitation — natural-language dispatch on three new operation classes

Three of the new Phase F operation classes showed low absolute dispatch accuracy on
Qwen3-8B-Q4 in the Stage E1 sweep — but **F2 formally passes** (these are new classes
with no b13 baseline to regress against):

| Class | Accuracy | Where the model goes instead |
| --- | --- | --- |
| `zim_get-summary` | 20% (20/100) | 80/100 → `zim_query` |
| `zim_get-structure` | 53% (56/105) | 49/105 → `zim_query` |
| `zim_get-main-page` | 76% (76/100) | 14/100 → `zim_query`, 10/100 → `zim_metadata` |

The model interprets natural-language phrasings ("give me a brief summary of X") as
**query intent** rather than direct-fetch intent. This is **not a surface defect** — when
`zim_query` is dispatched the user still gets a working answer via the natural-language
entry path. Description tuning and/or probe-set relaxation are tracked at #199 for v2.5.
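
The percentages in the table follow directly from the per-class counts. The snippet below is only an arithmetic check: the counts are copied from the table, while `COUNTS` and `accuracy_percent` are illustrative names, not part of the eval runner.

```python
# Counts copied from the table above: class -> (correct, misroutes by tool).
COUNTS = {
    "zim_get-summary": (20, {"zim_query": 80}),
    "zim_get-structure": (56, {"zim_query": 49}),
    "zim_get-main-page": (76, {"zim_query": 14, "zim_metadata": 10}),
}


def accuracy_percent(correct: int, misroutes: dict[str, int]) -> float:
    """Share of outcomes routed to the expected tool, in percent."""
    total = correct + sum(misroutes.values())
    return 100.0 * correct / total


for name, (correct, misroutes) in COUNTS.items():
    print(f"{name}: {accuracy_percent(correct, misroutes):.1f}%")
```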

### v1.x maintenance scope

Per the v1.x maintenance commitment, the most recent v1.x tag is retained as a parallel
maintenance branch until the FIRST of `{v2.5.0 ships, 6 calendar months after v2.0.0}`.

- **Accepted backports to v1.x:** security fixes (always), data-corruption fixes (always),
  pre-v2.0.0 crash bugs.
- **Rejected backports to v1.x:** new features, new tools, performance work, refactors.

## [2.0.0rc1] — 2026-05-26 (release candidate) — Phase F Stage D: 8-tool surface consolidation

Second release candidate for v2.0.0. The 22-tool advanced surface
collapses to 8 consolidated tools — the largest API change in the
project's history. `tool_mode='simple'` is unchanged (still registers
only `zim_query`); the consolidation lands in `tool_mode='advanced'`.

### Surface change

22 tools → 8 tools in `tool_mode='advanced'`. All renamed with the
`zim_*` prefix:

- `zim_query` — natural-language entry point (unchanged from b13).
- `zim_search` — fulltext / title / suggest mode dispatch. Collapses
  `search_zim_file`, `search_all`, `search_with_filters`,
  `find_entry_by_title`, `get_search_suggestions` (5 → 1).
- `zim_get` — single / batch / binary / main-page entry fetch.
  Collapses `get_zim_entry`, `get_zim_entries`, `get_main_page`,
  `get_binary_entry`, `get_entry_summary`, `get_table_of_contents`,
  `get_article_structure` (7 → 1).
- `zim_get_section` — section-level fetch (renamed from
  `get_section`).
- `zim_browse` — namespace browse / walk mode dispatch.
  Collapses `browse_namespace` + `walk_namespace` (2 → 1).
- `zim_metadata` — combined archive metadata + namespaces.
  Collapses `get_zim_metadata` + `list_namespaces` (2 → 1).
- `zim_links` — outbound / related direction dispatch.
  Collapses `extract_article_links` + `get_related_articles`
  (2 → 1). `direction="inbound"` arrives in v2.5 #16.
- `zim_health` — combined server health, configuration, and
  loaded archives. Collapses `get_server_health` +
  `get_server_configuration` + `list_zim_files` (3 → 1).

Default `tool_mode` stays `'simple'`. The total advanced-mode wire
footprint lands at ~23.5KB (down from b13's ~36.1KB), well below the
25-50KB MCP Tax pain band the spec targets.

### Migrating from v1.x / v2 beta

v2 allows clean breaks; there are no aliases on the wire. The
mapping is mechanical:

| v1 / v2-beta call | v2.0 equivalent |
| --- | --- |
| `list_zim_files()` | `zim_health()` → `.loaded_archives` |
| `get_server_health()` | `zim_health()` → `.health` |
| `get_server_configuration()` | `zim_health()` → `.configuration` |
| `get_zim_metadata(path)` | `zim_metadata(path)` → `.metadata` |
| `list_namespaces(path)` | `zim_metadata(path)` → `.namespaces` |
| `get_main_page(path)` | `zim_get(path, main_page=True)` |
| `search_zim_file(path, q)` | `zim_search(q, zim_file_path=path)` |
| `search_all(q)` | `zim_search(q, cross_file=True)` |
| `search_with_filters(path, q, ns=, ct=)` | `zim_search(q, zim_file_path=path, namespace=ns, content_type=ct)` |
| `find_entry_by_title(path, title)` | `zim_search(title, zim_file_path=path, mode="title")` |
| `find_entry_by_title(cross_file=True)` | `zim_search(title, cross_file=True, mode="title")` — promotion disabled in cross-archive case |
| `get_search_suggestions(path, prefix)` | `zim_search(prefix, zim_file_path=path, mode="suggest")` |
| `get_zim_entry(path, entry_path)` | `zim_get(path, entry_path=entry_path)` — rename only; `compact` defaults to `False` (legacy behavior preserved) |
| `get_zim_entries(path, entry_paths)` | `zim_get(path, entry_paths=entry_paths)` — rename only; `compact` defaults to `False` |
| `get_binary_entry(path, entry_path)` | `zim_get(path, entry_path=entry_path, binary=True)` |
| `get_entry_summary(path, entry_path)` | `zim_get(path, entry_path=entry_path, view="summary")` |
| `get_table_of_contents(path, entry_path)` | `zim_get(path, entry_path=entry_path, view="toc")` |
| `get_article_structure(path, entry_path)` | `zim_get(path, entry_path=entry_path, view="structure")` |
| `get_section(path, entry_path, section_id)` | `zim_get_section(path, entry_path, section_id)` — rename only. `compact` parameter added for surface uniformity (default `True`) but is a no-op at v2.0; v2.5 #18 wires a true raw-text path. Same response shape as legacy `get_section`. |
| `browse_namespace(path, namespace)` | `zim_browse(path, namespace)` |
| `walk_namespace(path, namespace)` | `zim_browse(path, namespace, mode="walk")` |
| `extract_article_links(path, entry_path)` | `zim_links(path, entry_path)` |
| `get_related_articles(path, entry_path)` | `zim_links(path, entry_path, direction="related")` |
| inbound-link lookup (no v1 tool) | not available at v2.0 — `zim_links(..., direction="related")` is the closest approximation; `direction="inbound"` arrives in v2.5 #16 |
| `zim_query(...)` | unchanged |
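
For clients with many call sites, the rows that only add a selector argument can be captured as data. The following is a minimal sketch, not something the package ships: `LEGACY_TO_V2` and `translate_call` are hypothetical names, only a handful of rows from the table are included, and the caller's own keyword arguments are passed through untouched.

```python
from typing import Any

# Hypothetical lookup built from a few rows of the migration table above.
# Each legacy tool maps to its v2.0 tool plus the extra keyword arguments
# that select the matching branch.
LEGACY_TO_V2: dict[str, tuple[str, dict[str, Any]]] = {
    "get_main_page": ("zim_get", {"main_page": True}),
    "get_binary_entry": ("zim_get", {"binary": True}),
    "get_entry_summary": ("zim_get", {"view": "summary"}),
    "get_table_of_contents": ("zim_get", {"view": "toc"}),
    "get_article_structure": ("zim_get", {"view": "structure"}),
    "walk_namespace": ("zim_browse", {"mode": "walk"}),
    "get_related_articles": ("zim_links", {"direction": "related"}),
}


def translate_call(legacy_name: str, **kwargs: Any) -> tuple[str, dict[str, Any]]:
    """Return the v2.0 tool name and the merged keyword arguments."""
    try:
        v2_name, extra = LEGACY_TO_V2[legacy_name]
    except KeyError:
        raise ValueError(f"no mapping recorded for {legacy_name!r}") from None
    # Selector arguments from the table win over caller-supplied duplicates.
    return v2_name, {**kwargs, **extra}
```

Calling `translate_call("get_entry_summary", entry_path="Some_Article")` yields the `zim_get` name together with `view="summary"`; rows that reorder arguments (such as the `zim_search` family) would need hand-written handling instead.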

### Default behavior changes (silent breaks if not handled)

- **`zim_metadata` no longer exposes `main_page_path`.** Callers who
  used it to construct an explicit `entry_path` round-trip to
  `zim_get` should switch to `zim_get(path, main_page=True)` — a
  single-call, null-safe path. (Note: `main_page` is a dedicated
  boolean flag, NOT a value of the `view` enum — earlier Phase F
  drafts overloaded `view="main_page"` but it now stands as its
  own parameter so the `view` enum stays focused on body slicers.)

The `zim_get` rename from `get_zim_entry` is **behavior-preserving**
on the `compact` axis (default is `False`, matching legacy). v2.5
will revisit the `zim_get` default once telemetry shows adoption.

`zim_get_section` adds a `compact` parameter for surface uniformity
with the rest of the family, but at v2.0 it is a **no-op at the data
layer** — the bundle is always compact-rendered (see
`openzim_mcp/bundle.py` line 300+: load-bearing UX invariant that
section slices match `get_zim_entry` output on the same article).
v2.5 #18 wires a true raw-text path. Until then, the rename from
`get_section` is **behavior-preserving** despite the new parameter.

### Schema shape

Each `zim_get` and `zim_search` call still has multiple mutually-
exclusive branches. The spec's preferred wire shape is JSON Schema
`oneOf` over the branches, but Gate 0.3 (small-model `oneOf` parsing
benchmark) is `unvalidated` in `tests/dispatch_eval/gate_0b_decision.json`
at rc1 cut, so per the spec's fallback rule the schema ships **flat**
with handler-level invalid-combination validation. A small model
that flattens a `oneOf` payload still gets a structured
`tool_error("invalid_path_combination", ...)` envelope rather than a
silent dispatch.
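
As an illustration of handler-level validation over a flat schema, here is a sketch under stated assumptions. The selector names (`entry_path`, `entry_paths`, `main_page`) come from the migration table, but the "exactly one selector" rule is assumed for the example, and `tool_error` below is a stand-in that returns a plain dict; the project's real envelope shape and rules may differ.

```python
from typing import Any, Optional


def tool_error(code: str, message: str) -> dict[str, Any]:
    """Stand-in for the project's envelope helper; the real shape may differ."""
    return {"error": {"code": code, "message": message}}


def validate_zim_get_args(
    entry_path: Optional[str] = None,
    entry_paths: Optional[list[str]] = None,
    main_page: bool = False,
) -> Optional[dict[str, Any]]:
    """Return an error envelope when the branch selectors conflict, else None.

    Assumed rule (for illustration only): exactly one of the three
    selectors may be set on a single call.
    """
    selected = [
        name
        for name, is_set in (
            ("entry_path", entry_path is not None),
            ("entry_paths", entry_paths is not None),
            ("main_page", main_page),
        )
        if is_set
    ]
    if len(selected) == 1:
        return None
    detail = "none given" if not selected else "got " + ", ".join(selected)
    return tool_error(
        "invalid_path_combination",
        f"expected exactly one of entry_path, entry_paths, main_page; {detail}",
    )
```

The point of the design is that a flattened payload mixing two branches is rejected with a structured error instead of being dispatched to whichever branch happens to be checked first.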

### Gate 0b — surface-change non-regression

The 8-tool surface was validated against the b13 22-tool baseline via
a 300-probe dispatch eval (`tests/dispatch_eval/probes.jsonl`) before
rc1 opened. The Qwen-2.5-7B-Instruct primary cleared all gating
criteria (A: dispatch non-inferiority, B: parameter validity, C1/C3:
Z4 silent-wrong-answer ceilings, D: aggregate non-inferiority, F1/F2:
per-class deltas within 8pp / 10pp ceilings). Haiku / Llama / Phi
secondaries recorded as `unavailable` (Intel Mac i9 — no CUDA for
vLLM; documented in the gate decision artifact).

The full gate outcome ships at
`tests/dispatch_eval/gate_0b_decision.json` and the prototype's
per-tool wire-footprint snapshot ships at
`tests/dispatch_eval/prototype_schema_snapshot.json`. Drift between
the rc1 commit's baked Python constants and the recorded gate
outcome is caught by `tests/test_phase_f_gate_decision_consistency.py`;
drift between the rc1 schemas and the prototype baseline is caught
by `tests/test_phase_f_prototype_parity.py` (±5% bytes + structural
inputSchema identity + ≤30% description Levenshtein edit distance).
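
The three parity tolerances can be expressed compactly. This is a sketch only: it assumes each tool definition is a dict with `description` and `inputSchema` keys, that size is measured on compact sorted JSON, and that the edit distance is normalized by the longer description. The actual test may measure any of these differently, and `within_parity` is a hypothetical name.

```python
import json
from typing import Any


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance using two rows."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution
                )
            )
        previous = current
    return previous[-1]


def within_parity(candidate: dict[str, Any], baseline: dict[str, Any]) -> bool:
    """Check one tool definition against its baseline snapshot."""

    def size(tool: dict[str, Any]) -> int:
        encoded = json.dumps(tool, separators=(",", ":"), sort_keys=True)
        return len(encoded.encode("utf-8"))

    # Tolerance 1: serialized size within 5% of the baseline.
    bytes_ok = abs(size(candidate) - size(baseline)) <= 0.05 * size(baseline)
    # Tolerance 2: input schemas must be structurally identical.
    schema_ok = candidate["inputSchema"] == baseline["inputSchema"]
    # Tolerance 3: description edit distance at most 30% (assumed normalization).
    longest = max(len(candidate["description"]), len(baseline["description"]), 1)
    distance = levenshtein(candidate["description"], baseline["description"])
    description_ok = distance / longest <= 0.30
    return bytes_ok and schema_ok and description_ok
```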

### Tests added

- `tests/test_phase_f_schema_budget.py` — total + per-tool byte
  budgets, simple-mode 1-tool registration, gate-decision invariants.
- `tests/test_phase_f_schema_shapes.py` — `oneOf`/flat schema shape
  matches `gate_0_schema_shape`.
- `tests/test_phase_f_gate_decision_consistency.py` — rc1 constants
  match the recorded gate outcome.
- `tests/test_phase_f_prototype_parity.py` — rc1 surface stays
  within parity tolerances of the prototype snapshot.
- `tests/test_phase_f_schema_bypass.py` — 13 invalid-combination
  probes per oneOf-forbidden shape on `zim_get` — handler
  validation surfaces structured `tool_error` envelopes.
- `tests/test_phase_f_migration.py` — v1.x legacy tool names map
  exhaustively to v2.0 Phase F names.

### Files

New: `openzim_mcp/tools/zim_{query,search,get,get_section,browse,
metadata,links,health}.py` + sibling `*_description.md` per-tool
descriptions packaged via `[tool.setuptools.package-data]`. New:
`openzim_mcp/server_state.py` extracts `_build_health_report` and
`_build_configuration_report`. New: `openzim_mcp/tools/__init__.py`
`register_phase_f_tools` orchestrator. Deleted: legacy per-domain
`content_tools.py` / `file_tools.py` / `metadata_tools.py` /
`navigation_tools.py` / `search_tools.py` / `server_tools.py` /
`structure_tools.py` modules.

---

## [2.0.0rc0] — 2026-05-25 (release candidate) — Phase F Stage A: promotion-extraction refactor + Gate 0 transport verification

First release candidate for v2.0.0. Two structural changes land,
both pure refactor / architecture verification — no behavior change
against the b13 baseline.

### Phase F Gate 0 — `oneOf` transport verification (PR #189)

Phase F's eight-tool surface design depends on emitting JSON Schema
``oneOf`` branches over the MCP transport so small dispatch models
can route on a single discriminator. Gate 0 is a two-step probe to
confirm the wire actually carries `oneOf`:

- **Gate 0.1 — emission spike.** Three FastMCP registration patterns
  (Literal-gated signature, hand-authored ``Tool.parameters``
  override, Pydantic discriminated Union) inspected in-process.
  Verdict: Pattern B (``Tool.parameters`` override) is the only path
  that emits a literal ``"oneOf"`` key in the registered tool's
  schema.
- **Gate 0.2 — transport round-trip.** Pattern B exercised over
  three transports (in-memory, stdio JSON-RPC subprocess,
  streamable-HTTP subprocess). Verdict: ``"oneOf"`` survives the
  wire round-trip across all three transports — the design's
  primary assumption holds.
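
The round-trip property that Gate 0.2 checks can be illustrated without any transport at all. The sketch below is not the probe itself: `SCHEMA` is a hypothetical two-branch schema, `contains_key` is an illustrative helper, and a plain JSON encode/decode stands in for the three real transports.

```python
import json
from typing import Any

# Hypothetical two-branch schema; the real tool schemas are not shown here.
SCHEMA: dict[str, Any] = {
    "type": "object",
    "oneOf": [
        {"required": ["entry_path"], "properties": {"entry_path": {"type": "string"}}},
        {"required": ["main_page"], "properties": {"main_page": {"const": True}}},
    ],
}


def contains_key(node: Any, key: str) -> bool:
    """Recursively search dicts and lists for a literal key."""
    if isinstance(node, dict):
        return key in node or any(contains_key(v, key) for v in node.values())
    if isinstance(node, list):
        return any(contains_key(item, key) for item in node)
    return False


# Serialize and parse again, as a JSON-based transport would.
round_tripped = json.loads(json.dumps(SCHEMA))
```

A real probe would compare the schema as registered on the server with the schema a client receives from its tool listing; the recursive key search is the part that carries over.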

Both probes live under ``tests/dispatch_eval/`` and run only via the
explicit ``--dispatch-eval`` pytest flag (skip-guarded against the
default suite). No production code touched — the env-gated probe-
tool registration block was reverted before merge so v2.0.0rc0
ships the same surface as b13.

### Phase F Stage A — extract `promote_topic_via_title_index` + `auto_select_zim_file` (PR #190)

The rc0 refactor lifts two pure orchestration functions out of
``SimpleToolsHandler`` (in ``simple_tools.py``) into a new module
``openzim_mcp/topic_preprocessing.py``:

- ``promote_topic_via_title_index`` — the four-pass promotion
  orchestrator (full-topic, multi-entity, possessive, typo-tolerant
  passes) that all 17 b-series sweeps have hardened.
- ``auto_select_zim_file`` — the 0/1/N archives selection used by
  the dispatch entry points.

The original ``SimpleToolsHandler`` methods remain as thin wrappers
over the extracted module-level functions, so the public surface is
unchanged. Importers that patched ``openzim_mcp.simple_tools.find_title_match``
in tests have been updated to patch the new live binding at
``openzim_mcp.topic_preprocessing.find_title_match``.

### Why extract now

Phase F's eight-tool surface needs ``zim_search`` and ``zim_query``
to share one promotion pipeline without inheriting
``SimpleToolsHandler``. The pre-rc0 pipeline lived as a bound method
on the simple handler, which Phase F's new tools cannot easily call.
Lifting it to a module-level function (with the ``zim_operations``
dependency passed in explicitly) is the smallest change that lets
the rc1 tool implementations share the orchestrator without a
deeper inheritance refactor.
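
The shape of the extraction can be sketched in a few lines. Everything here is simplified and hypothetical: the real signatures are not shown in this changelog, ``TitleIndex`` and ``DictIndex`` are stand-ins for the ``zim_operations`` dependency, and ``promote_topic`` does a plain lookup instead of the four-pass promotion.

```python
from typing import Optional, Protocol


class TitleIndex(Protocol):
    """Minimal stand-in for the ``zim_operations`` dependency."""

    def find_title(self, topic: str) -> Optional[str]: ...


def promote_topic(zim_operations: TitleIndex, topic: str) -> Optional[str]:
    """Module-level function: the dependency is an explicit argument."""
    return zim_operations.find_title(topic.strip())


class SimpleHandler:
    """Handler that keeps its old method as a thin wrapper."""

    def __init__(self, zim_operations: TitleIndex) -> None:
        self.zim_operations = zim_operations

    def promote_topic(self, topic: str) -> Optional[str]:
        # Inside a method body this name resolves to the module-level
        # function, so both call paths share one implementation.
        return promote_topic(self.zim_operations, topic)


class DictIndex:
    """Tiny in-memory index used only for this example."""

    def __init__(self, titles: dict[str, str]) -> None:
        self.titles = titles

    def find_title(self, topic: str) -> Optional[str]:
        return self.titles.get(topic)
```

With this arrangement a new tool can call ``promote_topic(ops, topic)`` directly, while existing callers of the handler method see no change; comparing the two paths on the same inputs is the essence of a parity diff-test.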

### Verification

- **Promotion-extraction parity diff-test** (94 probes from b1–b13
  cumulative set) — bound-method path and extracted-function path
  return identical results.
- **Auto-select-extraction parity diff-test** (4 scenarios — zero
  files, one file, n files, exception) — log records and return
  values match across both paths.
- **Preprocessing-orchestration idempotency check** — calling
  ``promote_topic_via_title_index`` twice with the same inputs
  returns the same result (no hidden state mutation).
- **Direct unit tests for ``topic_preprocessing``** (45 new tests
  in ``tests/test_topic_preprocessing.py``) — documents the
  extracted module's contract independently of its call sites
  (Z3 probe-based discriminator, Z4 tangential rejection +
  biographical/digit/type-extension exemptions, OPP-1 possessive,
  auto_select_zim_file 0/1/N + exception handling).
- Full suite: **2573 passed, 246 skipped, 38 deselected**.

### Version bumps

| File | From | To |
|---|---|---|
| ``pyproject.toml`` | 2.0.0b13 | 2.0.0rc0 |
| ``.release-please-manifest.json`` | 2.0.0b13 | 2.0.0rc0 |
| ``website/llms.txt`` | 2.0.0b13 | 2.0.0rc0 |
| ``uv.lock`` | 2.0.0b13 | 2.0.0rc0 |

### What's next

The remaining Phase F work (Stage B + C — Gate 0b dispatch-eval
benchmarks + the eight-tool surface implementation as v2.0.0rc1)
proceeds on the ``v2-phase-f-prototype`` branch.

## [2.0.0b13] — 2026-05-24 (beta pre-release) — post-b12 beta-test sweep shipped — Play-style disambig phrasing variant + CodeQL #231 + test dedupe

Post-b12 live-MCP verification confirmed the Z4 multi-token canonical
fix lands cleanly (7/8 historical defects now route correctly) and
the Sub-pattern C disambig rejection works for Lincoln / O'Brien.
One new silent-wrong-answer slipped through:
``Shakespeare England plays`` at v2.0.0b12 still ships ``Play``
(disambig page) at cert=0.85.

### Root cause — phrasing variant not in ``_DISAMBIG_LEAD_PHRASES``

``_is_disambig_lead`` runs a trailing-tail ``endswith`` check against
the phrase set ``("may refer to", "may also refer to")``. The
Wikipedia ``Play`` disambig template ends its pre-H2 with:

  **Play** may refer also to:

Word order: may-refer-**also**-to (NOT may-**also**-refer-to). The
two-phrase set misses this variant, so ``_is_disambig_lead`` returns
False, the b12 Sub-pattern C rejection doesn't fire, and the Play
disambig page is served as the tell_me_about answer.

The b11 implementation comment at ``simple_tools.py:2660`` explicitly
anticipated this: "easier to extend with new phrasings if ZIM
exporters ever produce them".

### Fix — extend ``_DISAMBIG_LEAD_PHRASES`` with the third variant

One-line tuple extension:

```python
_DISAMBIG_LEAD_PHRASES = (
    "may refer to",
    "may also refer to",
    "may refer also to",  # b13 fix: Play-style word order
)
```

No regex, no backtracking risk, no architectural change. The
trailing-tail ``endswith`` check still position-anchors against
false-positives where the phrase appears earlier in the body but
not at the tail.
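
A minimal sketch of the tail-anchored check described above. The name `is_disambig_lead_sketch` and the normalization step (lowercase, then strip trailing whitespace and the closing colon) are assumptions for illustration; the project's actual `_is_disambig_lead` may normalize differently.

```python
# Illustrative only: the phrase tuple mirrors this entry; the normalization
# (lowercase, strip trailing whitespace and ":") is an assumption.
DISAMBIG_LEAD_PHRASES = (
    "may refer to",
    "may also refer to",
    "may refer also to",
)


def is_disambig_lead_sketch(lead: str) -> bool:
    """Return True when the lead text ends with a disambiguation phrase."""
    tail = lead.lower().rstrip().rstrip(":").rstrip()
    # str.endswith accepts a tuple, so one call covers every variant.
    return tail.endswith(DISAMBIG_LEAD_PHRASES)
```

Because the comparison is anchored at the end of the lead, a body that mentions one of the phrases mid-text and then continues with prose is not flagged.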

### Verification

Live-MCP probe of all documented preserved cases plus the 8 Z4
defect repros from b11. After b13: ``Shakespeare England plays``
falls to BM25 (Z4 + Sub-pattern C combine to reject ``Shakespeare's_Kings``
AND ``Play`` disambig). All other 7 Z4 defects continue routing
correctly (4 to head bios, 3 to tail concepts / BM25). 13/13
preserved cases hold; no regressions.

### CodeQL alert #231 — unquote forward refs to TYPE_CHECKING imports

CodeQL's ``py/unused-import`` flagged ``RerankerConfig`` as unused
in ``synthesize.py`` because two annotations used explicit
string-quoting (``"Optional[RerankerConfig]"``) which the static
analyzer treats as opaque string literals rather than deferred
forward references.

Under ``from __future__ import annotations`` (line 14 of synthesize.py),
ALL annotations are automatically stringified at runtime — explicit
quotes are redundant and serve only to hide the import usage from
static analyzers. Fix: remove the redundant string quotes from three
annotations (lines 1041 / 1444 / 1538). mypy / runtime behavior
unchanged.
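
The effect of the future import can be seen in a small standalone sketch. `Decimal` stands in for `RerankerConfig` and `pick` is a hypothetical function; neither comes from `synthesize.py`. The sample module is held in a string only so that the `__future__` import is guaranteed to be its first statement when executed.

```python
import textwrap

# Hypothetical sample module; Decimal is a stand-in for RerankerConfig.
MODULE_SOURCE = textwrap.dedent(
    """
    from __future__ import annotations

    from typing import TYPE_CHECKING, Optional

    if TYPE_CHECKING:
        # Only seen by type checkers; never executed at runtime.
        from decimal import Decimal


    def pick(config: Optional[Decimal] = None) -> Optional[Decimal]:
        return config
    """
)

namespace: dict = {}
exec(MODULE_SOURCE, namespace)

# Unquoted annotations are stored as plain strings, so the name that only
# exists under TYPE_CHECKING is never looked up when the function is defined.
annotations = namespace["pick"].__annotations__
print(annotations)
```

The unquoted annotation gives a static analyzer a visible use of the imported name while the runtime still never resolves it.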

### Test dedupe — extract ``make_disambig_handler`` to shared fixtures

SonarCloud flagged 6.2% new-code duplication (threshold 3%) because
the b13 sweep's ``TestPlayDisambigRejection._make_handler`` was a
copy of b11's ``TestSubPatternCDisambigRejection._make_handler``.
Extracted to ``tests/_promote_fixtures.make_disambig_handler``,
both sweep files now import the shared helper. Same dedup pattern
the post-b8 sweep used when it created ``_promote_fixtures.py``.

### Tests

7 new tests in ``tests/test_post_b12_beta_fixes.py``:

- 5 direct unit tests on ``_is_disambig_lead`` covering all three
  phrase variants + Play-style full pre-H2 + false-positive defense.
- 2 integration tests: ``Shakespeare England plays`` (multi-token →
  BM25 fallback) and ``tell me about Play`` (bare-head → preserve
  disambig).

```
2562 passed, 54 skipped (full suite, ~28s)
```

mypy / black / flake8 / pip-audit all clean.

### Methodology — "fix unlocks new paths" 20 sweeps strong

Smallest sweep since b6 — one-line phrase extension. The b11
Sub-pattern C rejection architecture was solid; only the underlying
detection primitive needed a phrase variant added. This is the
"easy to extend" promise of the b11 design paying off.

## [2.0.0b12] — 2026-05-23 (beta pre-release) — post-b11 beta-test sweep shipped — Z4 multi-token canonical tangential + Sub-pattern C disambig rejection

Post-b11 sweep packaged from PR #184. Live-MCP verification against
v2.0.0b11 confirmed the b11 probe-based multi-entity discriminator
fully ships its target shape (4/6 historical Z3 repros now route to
the correct head: ``Stalin USSR Russia`` → ``Joseph_Stalin``,
``Hitler Germany Berlin`` → ``Nazi_Germany``, ``Marie Curie polonium
discovery`` → ``Marie_Curie``, ``Big Rapids Michigan tourism`` →
``Big_Rapids,_Michigan``). One new HIGH-severity defect class
surfaced — Z4 multi-token canonical tangential — plus the
Sub-pattern C disambig promotion class noted but deferred from b8.

### Root cause — ``is_tail_hijack_shape`` is narrow by design

The b11 ``is_tail_hijack_shape`` predicate requires **(a)
single-token canonical AND (b) 3+-token topic**. Both preconditions are
sound for the b8 Z3 target shape, but the silent-wrong-answer
pattern manifests in two adjacent shapes that bypass the gate:

1. **2-token topics** with multi-token canonical that contains both
   topic tokens (head as possessive/parenthetical + extra modifier):
   ``Tesla electricity`` → ``Tesla's_Wireless_Electricity``,
   ``Mozart Vienna`` → ``Mozarthaus_Vienna``,
   ``Beethoven symphony`` → ``Symphony_No._1_(Beethoven)``,
   ``Lenin Russia`` → ``Leninist_Komsomol_of_the_Russian_Federation``.
2. **3+-token topics** with multi-token canonical that overlaps the
   topic only via stemming or non-head tokens:
   ``Marie Curie radioactivity`` → ``Radioactive_(Redniss_book)``,
   ``Darwin evolution Galapagos`` → ``Galápagos_Islands``,
   ``Mao China revolution`` →
   ``History_of_the_People's_Republic_of_China_(1949–1976)``,
   ``Shakespeare England plays`` → ``Shakespeare's_Kings``.

Additionally, the b8-noted Sub-pattern C cases still fire:
``Lincoln slavery emancipation`` → ``Lincoln`` (disambig page) and
``O'Brien character 1984`` → ``O'Brien`` (disambig page). The
canonical is single-token (no Z3 / Z4 shape) but is itself a
disambiguation page.

### Fix — Z4 tangential check with three exemptions + Sub-pattern C disambig render-time rejection

Four new helpers in ``title_promotion``:

- ``is_tangential_multi_token_shape(promoted, topic)`` — pure-logic
  shape: canonical is multi-token AND not a token-set subset of
  topic (after filtering ``_CANONICAL_FUNCTION_WORDS`` — articles,
  conjunctions, prepositions — from canonical). Subset preserves the
  ``Apollo 11 moon landing`` → ``Moon_landing`` and ``Lincoln
  Gettysburg Address`` → ``Gettysburg_Address`` invariants. The
  function-word filter additionally preserves
  ``Assassination_of_John_F._Kennedy`` for topic ``John F Kennedy
  assassination`` (canonical's only extra is the preposition
  ``of``).
- ``probed_head_matches_promoted(topic, promoted, title_probe)`` —
  biographical-canonical exemption: probes EACH non-stop-word topic
  token. True iff ANY probe's canonical path (or pre-redirect path)
  equals the promoted candidate AND the probed token literally
  appears in the promoted canonical's tokens. The probe-all approach
  catches tail-position subjects (``quantum mechanics Einstein`` →
  ``Albert_Einstein`` — subject ``einstein`` at the topic tail);
  the token-in-canonical guard prevents accidental over-acceptance
  (``Darwin evolution Galapagos`` → ``Galápagos_Islands`` would
  match on path-only but ``galapagos`` ≠ ``galápagos`` raw, so the
  guard correctly rejects).
- ``has_digit_specificity_match(promoted, topic)`` — digit
  specificity exemption: when canonical's extras (tokens NOT in
  topic) include a digit AND topic also has a digit-bearing token,
  the user explicitly signaled they want a numbered instance.
  Catches ``Beethoven 9th symphony`` → ``Symphony_No._9_(Beethoven)``
  without over-accepting ``Beethoven symphony`` →
  ``Symphony_No._1_(Beethoven)`` (no topic digit).
- ``has_topic_prefix_canonical_extension(promoted, topic)`` —
  type-extension exemption: canonical's leading tokens form a contiguous
  2+-token slice of topic, suffix tokens are all extras. Catches the
  b8 motivating case ``Big Rapids Michigan Ferris State`` →
  ``Ferris_State_University`` where the topic tail is the
  canonical's entity name without the type-word suffix.
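
Two of the helpers above reduce to token-set arithmetic, sketched here under stated assumptions. `TOKEN_RE` and `FUNCTION_WORDS` are hypothetical stand-ins for `_TOKEN_RE` and `_CANONICAL_FUNCTION_WORDS`, whose real definitions are not shown in this changelog, and the `_sketch` function names are illustrative.

```python
import re

# Assumed stand-ins; the project's real tokenizer and word list may differ.
TOKEN_RE = re.compile(r"[^\W_]+")
FUNCTION_WORDS = frozenset({"a", "an", "the", "and", "or", "of", "in", "on", "to"})


def tokens(text: str) -> list[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    return TOKEN_RE.findall(text.lower())


def _has_digit(token: str) -> bool:
    return any(ch.isdigit() for ch in token)


def is_tangential_shape_sketch(promoted: str, topic: str) -> bool:
    """Multi-token canonical whose content tokens are not all in the topic."""
    canonical = tokens(promoted)
    if len(canonical) < 2:
        return False
    content = {t for t in canonical if t not in FUNCTION_WORDS}
    return not content.issubset(tokens(topic))


def has_digit_specificity_sketch(promoted: str, topic: str) -> bool:
    """Canonical extras carry a digit and the topic asked for a number too."""
    topic_tokens = set(tokens(topic))
    extras = set(tokens(promoted)) - topic_tokens
    return any(map(_has_digit, extras)) and any(map(_has_digit, topic_tokens))
```

In this sketch the subset test runs after the function-word filter, which is what keeps ``Assassination_of_John_F._Kennedy`` out of the tangential shape for the topic ``John F Kennedy assassination``.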

Call-site wiring in ``_promote_topic_via_title_index``:

- **Pass 0 / Pass 3** consult ``accept_possessive_promotion`` AND
  ``_passes_z4`` directly (no Z3 multi-entity escape — that escape
  exists only for Pass 1's documented 1-token-tail filler-prose
  feature).
- **Pass 1 / Pass 2** consult ``_accept_with_multi_entity_check``,
  which layers the Z3 escape over the b9 unconditional tail-hijack
  rejection AND then applies ``_passes_z4``.
- The b3 invariant (first ``find_title_match`` call uses bare
  ``topic``) is preserved by hoisting the Pass 0 probe above the
  closure definitions.

Symmetric application in ``synthesize.py:_promote_title_match``
Pass 0 — synthesize was previously vulnerable to the same Z4
silent-wrong-answer pattern via ``zim_query(synthesize=True)``.

**Sub-pattern C disambig rejection** at render-time in
``_handle_tell_me_about``: when the auto-picked canonical's body
matches the disambig-lead pattern (``may refer to`` / ``may also
refer to``) AND the topic has 2+ non-stop-word content tokens, fall
back to plain BM25 search. Detection-at-render-time avoids a
separate content-peek round-trip (the body is already fetched for
normal rendering). Single-content-token topics (``tell me about
Lincoln``) legitimately want the disambig and are preserved by the
``len >= 2`` floor. Possessive queries bypass via
``has_apostrophe_possessive`` (OPP-1 handles those at the promotion
layer).
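
The render-time rule can be summarized as a sketch. The stop-word list, the possessive regex, and the name `should_fall_back_to_bm25` are hypothetical; only the two-content-token floor, the possessive bypass, and the two lead phrases come from the description above.

```python
import re

# Hypothetical stop-word list; the project's real list is not shown here.
STOP_WORDS = frozenset({"tell", "me", "about", "what", "is", "the", "of", "a", "an"})
DISAMBIG_LEAD_PHRASES = ("may refer to", "may also refer to")


def should_fall_back_to_bm25(topic: str, lead: str) -> bool:
    """Reject a disambiguation page only for multi-token, non-possessive topics."""
    lowered = topic.lower()
    if re.search(r"\w's\b", lowered):
        return False  # possessive topics are handled at the promotion layer
    words = re.findall(r"[^\W_]+", lowered)
    content = [word for word in words if word not in STOP_WORDS]
    if len(content) < 2:
        return False  # a bare head such as "Lincoln" legitimately wants the page
    tail = lead.lower().rstrip().rstrip(":").rstrip()
    return tail.endswith(DISAMBIG_LEAD_PHRASES)
```

Note that ``O'Brien`` contains an apostrophe but is not a possessive, so in this sketch it still reaches the disambiguation check.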

### Decision matrix

| Topic | Multi-token tangent | Bio exem | Digit exem | Type-ext exem | Decision |
| --- | --- | --- | --- | --- | --- |
| Tesla electricity | yes | no | no | no | REJECT |
| Mozart Vienna | yes | no | no | no | REJECT |
| Beethoven symphony | yes | no | no | no | REJECT |
| Lenin Russia | yes | no | no | no | REJECT |
| Marie Curie radioactivity | yes | no | no | no | REJECT |
| Shakespeare England plays | yes | no | no | no | REJECT |
| Darwin evolution Galapagos | yes | no | no | no | REJECT |
| Mao China revolution | yes | no | no | no | REJECT |
| Picasso Paris cubism | yes | YES | - | - | ACCEPT |
| Quantum mechanics Einstein | yes | YES (tail) | - | - | ACCEPT |
| Beethoven 9th symphony | yes | no | YES | - | ACCEPT |
| Big Rapids Michigan Ferris State | yes | no | no | YES | ACCEPT |
| John F Kennedy assassination | no (⊆ after func-word filter) | - | - | - | ACCEPT |
| Apollo 11 moon landing | no (⊆) | - | - | - | ACCEPT |
| Lincoln Gettysburg Address | no (⊆) | - | - | - | ACCEPT |
| Newton's gravity (possessive) | n/a | - | - | - | ACCEPT (OPP-1) |
| Hamlet Denmark prince | no (1tk) | - | - | - | ACCEPT |
| Berlin Germany | no (1tk) | - | - | - | ACCEPT |
| what is the population of detroit | no (1tk) | - | - | - | ACCEPT |
| Lincoln slavery emancipation | Sub-pattern C | - | - | - | REJECT (render-time, falls to BM25) |
| O'Brien character 1984 | Sub-pattern C | - | - | - | REJECT (render-time, falls to BM25) |
| tell me about Lincoln (1-tk) | Sub-pattern C | - | - | - | ACCEPT (single-content-token preserved) |

### Tests

52 new tests in ``tests/test_post_b11_beta_fixes.py`` covering:

- 8 Z4 defect-repro integration tests (one per live silent-wrong-answer)
- 9 preserved-case integration tests (Apollo / Lincoln / Hamlet / Berlin / Picasso / Newton's possessive / population-of-detroit / Ferris State / Beethoven-9th / quantum-Einstein / JFK)
- Direct unit tests on each new helper (shape predicate; biographical via probe-all + token-in-canonical guard; digit specificity in 4 directions; type-extension prefix length floor / subset overlap / anywhere-in-topic; function-word filter / lexical-word retention)
- 3 synthesize Z4 integration tests (Tesla reject / Picasso bio accept / possessive bypass)
- 4 Sub-pattern C disambig integration tests (Lincoln multi-token reject / Lincoln single-content-token preserve / stop-words filter / non-disambig body unaffected)

```
2555 passed, 54 skipped (full suite, ~29s)
```

mypy / black / flake8 / pip-audit all clean.

### Methodology — "fix unlocks new paths" 19 sweeps strong

The post-b11 sweep peeled the b11 narrow predicate in three concentric
layers across three commits:

1. **First pass** — Z4 shape predicate plus biographical
   (head-token-only), digit-specificity, and type-extension exemptions.
2. **Second pass** — static code review surfaced two regressions:
   tail-position subject (quantum mechanics Einstein) and canonical
   function-word extras (JFK). Fixed via probe-all-tokens with
   token-in-canonical guard, and the canonical-subset stop-word filter.
3. **Third pass** — closed two scoping gaps: synthesize Pass 0 Z4
   protection (symmetric to simple_tools), and Sub-pattern C disambig
   render-time rejection.

Each layer extends the b11 design without invalidating its core: the
b11 ``count_non_tail_strong_entities`` discriminator stays in place
for Pass 1/2's filler-prose escape; the b9 ``accept_possessive_promotion``
Z3 rejection stays unconditional at Pass 0/3; OPP-1 possessive logic
is untouched. b12 adds the multi-token canonical sibling of Z3 with
the supporting exemptions needed to keep all documented preserved
cases working.

## [2.0.0b11] — 2026-05-23 (beta pre-release) — post-b10 beta-test sweep shipped — probe-based multi-entity discriminator (case-independent Z3)

Post-b10 sweep packaged from PR #182. Live-MCP verification against
v2.0.0b10 confirmed OPP-1's redirect extension lands cleanly
(``Newton's gravity`` now auto-fetches
``Newton's_law_of_universal_gravitation``) but ALL SIX Z3
silent-wrong-answer repros STILL fire identically to b9.

### Root cause — Tier 1 Rule 1 lowercases the topic upstream

The b10 Z3 discriminator counted capitalized + digit tokens in the
ORIGINAL-case topic. But ``IntentParser._normalize_topic_case``
(Tier 1 Rule 1, ``intent_parser.py:1540``) lowercases the query
BEFORE topic extraction. By the time the discriminator sees the
topic, ``"Stalin USSR Russia"`` is ``"stalin ussr russia"`` — zero
capitalized tokens — the discriminator never fired on live data,
even though OPP-1 (which doesn't depend on case) worked perfectly.
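
The failure mode is easy to reproduce in isolation. The function below is a hypothetical reconstruction of a case-based counter (capitalized or digit-only tokens), not the b10 code itself:

```python
def count_specific_tokens(topic: str) -> int:
    """Count tokens that start with an uppercase letter or are digit-only."""
    return sum(1 for tok in topic.split() if tok[:1].isupper() or tok.isdigit())


original = "Stalin USSR Russia"
print(count_specific_tokens(original))          # original-case topic
print(count_specific_tokens(original.lower()))  # what the discriminator received
```

The original-case topic yields three specific tokens and the lowercased one yields none, so any threshold of 2+ can never be met once the topic has been lowercased upstream.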

### Fix — case-independent probe-based discriminator

Two new helpers in ``title_promotion``:

- ``is_tail_hijack_shape(promoted, topic)`` — pure-logic shape check.
- ``count_non_tail_strong_entities(topic, title_probe, limit=2)`` —
  probe-based multi-entity counter with TWO refinements that make
  it robust against the live data shapes:
    - **Stop-word filter**: skip non-entity tokens (``what``,
      ``is``, ``the``, ``of``, common auxiliaries / pronouns /
      connectives) that often have legitimate disambiguation-page
      matches on Wikipedia but aren't entities the user is
      querying jointly.
    - **Probed-token-in-canonical check**: only count a probe as
      a "strong" match when the probed token (lowercased) appears
      in the canonical path tokens OR the pre-redirect-path
      tokens. Filters out fuzzy/stemming hits (libzim resolving
      ``musicians`` to ``Musician`` via stem) AND defends against
      overly-permissive test mocks.
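
A sketch of the probe-based counter with both refinements, under stated assumptions: the stop-word list is hypothetical, the probe is modelled as a callable returning a row with ``path`` and optional ``pre_redirect_path`` keys, and "non-tail" is read as "every topic token except the last". The real ``count_non_tail_strong_entities`` may differ on each point.

```python
import re
from typing import Callable, Optional

# Hypothetical stop-word list; the project's real list is not shown here.
STOP_WORDS = frozenset({"what", "is", "the", "of", "who", "in", "a", "an"})
TitleProbe = Callable[[str], Optional[dict]]


def _path_tokens(path: Optional[str]) -> set[str]:
    return set(re.findall(r"[^\W_]+", (path or "").lower()))


def count_strong_entities_sketch(
    topic: str, title_probe: TitleProbe, limit: int = 2
) -> int:
    """Count non-tail topic tokens that probe to a title containing them."""
    head_tokens = re.findall(r"[^\W_]+", topic.lower())[:-1]
    count = 0
    for token in head_tokens:
        if token in STOP_WORDS:
            continue  # stop-word filter
        row = title_probe(token)
        if row is None:
            continue
        known = _path_tokens(row.get("path"))
        known |= _path_tokens(row.get("pre_redirect_path"))
        if token in known:  # probed-token-in-canonical check
            count += 1
            if count >= limit:
                break  # enough evidence for a multi-entity topic
    return count


# Fake title index used in place of a live libzim probe.
FAKE_INDEX = {
    "stalin": {"path": "Joseph_Stalin"},
    "ussr": {"path": "Soviet_Union", "pre_redirect_path": "USSR"},
    "population": {"path": "Population"},
    "musicians": {"path": "Musician"},  # stem hit: token absent from path
}
```

With this fake index, ``stalin ussr russia`` counts two strong entities, while ``what is the population of detroit`` counts one after the stop-word filter and the ``musicians`` stem hit is not counted at all.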

``_promote_topic_via_title_index`` Pass 1 / Pass 2 now consult both
helpers via a closure-scoped ``_accept_with_multi_entity_check``
wrapper. The multi-entity discriminator overrides
``accept_possessive_promotion``'s unconditional tail-hijack
rejection only when the topic probes as single-entity
(filler-prose pattern).

``_accept_non_possessive`` no longer carries the case-based
discriminator (which never fired in production). Tail-hijack
rejection there is unconditional; the call site is now the only
place that runs the multi-entity discriminator.

### Decision matrix

| Topic | Tail-hijack? | Multi-entity? | Decision |
| --- | --- | --- | --- |
| Stalin USSR Russia | yes | yes (2+) | REJECT |
| Hitler Germany Berlin | yes | yes | REJECT |
| Marie Curie polonium discovery | yes | yes | REJECT |
| Big Rapids Michigan tourism | yes | yes | REJECT |
| O'Brien character 1984 | yes | yes | REJECT |
| what is the population of detroit | yes | no (stop-filter, 1 left) | ACCEPT |
| people who live in michigan | yes | no | ACCEPT |
| Berlin Germany | no (<3 tk) | n/a | ACCEPT |

### Tests

16 new tests in ``tests/test_post_b10_beta_fixes.py``. Two b9 tests
updated to reflect post-b10 architecture (multi-entity mock probes +
structural pin asserts ``_accept_with_multi_entity_check`` wrapper).

```
2503 passed, 54 skipped (full suite, ~28s)
```

mypy / black / flake8 / isort all clean.

### Methodology — "fix unlocks new paths" 18 sweeps strong

The post-b10 sweep peeled three layers in concert:

- b10 case-based discriminator broken by upstream Sub-D-2 Rule 1.
- Probe-based replacement fooled by stop words that match
  disambiguation pages.
- Probe ALSO fooled by overly-permissive test mocks.

Discriminator now has three layers — shape, stop-word filter,
in-canonical check — each independently testable.

## [2.0.0b10] — 2026-05-23 (beta pre-release) — post-b9 beta-test sweep shipped — Z3 all match_types + Pass 1/2 gate + OPP-1 redirect extension

Post-b9 sweep packaged from PR #180. Live-MCP verification against
v2.0.0b9 confirmed the b9 Z3 + OPP-1 fixes land at the unit-test
level but BOTH bypass the actual live silent-wrong-answer code
paths because b9 gated on the wrong ``match_type``.

### Z3-bypass (HIGH) — tail-hijack lives on direct/redirect

The b9 Z3 rule only fired inside ``_accept_non_possessive`` when
``match_type == "fuzzy_suggest"``. The live silent-wrong-answers
route through Pass 1 ``iter_query_tails``: ``find_title_match``
returns ``None`` for the full topic, so the next pass kicks in,
where the 1-token tail (``"russia"``, ``"berlin"``, ``"discovery"``,
``"tourism"``, ``"1984"``) is passed to ``find_title_match``.
libzim treats the tail string as a case-insensitive exact title match and
returns ``match_type="direct"`` at score 1.0. The b9 short-circuit
``if match_type != "fuzzy_suggest": return True`` bypassed the Z3
check entirely.

ALSO: Pass 1 and Pass 2 in ``_promote_topic_via_title_index``
returned ``promoted`` directly without consulting
``accept_possessive_promotion``. Even after extending the gate to
direct/redirect, the Z3 rule wouldn't fire because the call site
didn't invoke it.

#### Fix — three changes in concert

1. **``_accept_non_possessive``** no longer short-circuits on
   ``match_type``. The tail-token-hijack premise is purely about
   the topic↔canonical token relationship; it doesn't depend on
   how libzim resolved the match. The zero-overlap stemming
   sub-rule stays gated on ``fuzzy_suggest`` (direct/redirect by
   definition share at least the matched token).
2. **``_promote_topic_via_title_index`` Pass 1 (tail iter) and
   Pass 2 (window iter)** now consult ``accept_possessive_promotion``
   on each candidate, matching what Pass 0 (full topic) and Pass 3
   (typo-tolerant) already did.
3. **Discriminator** preserves the documented Pass 1 1-token-tail
   feature. Queries like ``what is the population of detroit`` →
   ``Detroit`` and ``people who live in michigan`` → ``Michigan``
   keep working: the tail-hijack rejection only fires when the
   topic has 2+ "specific" tokens — tokens that are capitalized in
   the original case OR digit-only. The silent-wrong-answer
   pattern stacks multiple proper-noun-shaped tokens
   (``Stalin USSR Russia``, ``Hitler Germany Berlin``,
   ``O'Brien character 1984``); legitimate filler-prose queries
   have at most one capitalized entity (the tail itself).

#### Live cases this fix resolves (cert=0.85 silent-wrong-answers at v2.0.0b9)

- ``Stalin USSR Russia`` → ``Russia`` → BM25 / Pass 2 head probe
- ``Hitler Germany Berlin`` → ``Berlin`` → BM25 / Pass 2 head probe
- ``Marie Curie polonium discovery`` → ``Discovery`` (a disambig
  page!) → BM25
- ``Big Rapids Michigan tourism`` → ``Tourism`` → Pass 2 finds
  ``Big_Rapids,_Michigan``
- ``O'Brien character 1984`` → ``1984`` (the year) → BM25 /
  Pass 2 finds ``O'Brien_(Nineteen_Eighty-Four)``
- ``Marie Curie radioactivity`` → fuzzy-suggest stemming hit
  unchanged from b9

#### Regression guards preserved

- ``Hamlet Denmark prince`` → Pass 0 / Pass 2 finds ``Hamlet``
  (HEAD position)
- ``Napoleon France emperor`` → Pass 0 / Pass 2 finds ``Napoleon``
- ``Apollo 11 moon landing`` → ``Moon_landing`` (multi-token
  canonical, tail-hijack doesn't fire)
- ``quantum mechanics Einstein`` → ``Albert_Einstein`` (single
  capitalized token, discriminator skips)
- ``Lincoln Gettysburg Address`` → ``Gettysburg_Address``
  (multi-token canonical)
- ``Berlin Germany`` → ``Berlin`` (2-token topic, Z3 doesn't fire)
- ``population of detroit`` / ``people who live in michigan`` —
  legitimate Pass 1 1-token-tail feature preserved via
  discriminator (zero capitalized tokens)

### OPP-1-bypass (MEDIUM) — Newton's gravity redirect

The b9 OPP-1 carve-out only fired inside
``_accept_possessive_fuzzy_suggest``. The live ``Newton's gravity``
case routes through ``_accept_possessive_redirect``: libzim
returns ``Newton's_law_of_universal_gravitation`` with
``match_type="redirect"`` and
``pre_redirect_path="Newton_Laws_of_Gravity"``. The b7 Z1.1 subset
rule rejects because ``{newton, laws, of, gravity} ⊄ {newton, s,
gravity}``. OPP-1's possessor-in-canonical check never runs.

#### Fix — OPP-1 extension to redirect branch

When the b7 Z1.1 subset rule rejects,
``_accept_possessive_redirect`` NOW falls back to the same
possessor-in-canonical check OPP-1 uses for fuzzy_suggest: ACCEPT
if any of the topic's possessor tokens appears in the
post-redirect canonical path tokens.

Decision matrix:

| Topic | Resolved canonical | Decision |
| --- | --- | --- |
| ``Plato's cave`` | ``Allegory_of_the_cave`` via pre=``Plato's_cave`` | ACCEPT (b8 subset) |
| ``Einstein's theory`` | ``Theory_of_relativity`` via pre=``Einstein's_theory`` | ACCEPT (b8 subset) |
| ``Newton's gravity`` | ``Newton's_law_of_universal_gravitation`` via pre=``Newton_Laws_of_Gravity`` | ACCEPT (post-b9 OPP-1) |
| ``Darwin's evolution`` | ``Evolution`` via pre=``Darwin's_Theory_of_Evolution`` | REJECT (b7) |
| ``Plato's republic philosophy`` | ``Czech_philosophy`` | REJECT (b6) |
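
The first four rows of the matrix can be reproduced with a short sketch of the two-step decision: subset rule first, possessor-in-canonical as the fallback. The tokenizer, the possessor regex, and the function name are assumptions for illustration and do not reproduce ``_accept_possessive_redirect`` itself.

```python
import re


def _tokens(text: str) -> set[str]:
    # Apostrophe-splitting tokenization (assumed): "Newton's" -> {"newton", "s"}.
    return set(re.findall(r"[^\W_]+", text.lower()))


def _possessors(topic: str) -> set[str]:
    # Words immediately followed by "'s" in the topic.
    return set(re.findall(r"([^\W_]+)'s\b", topic.lower()))


def accept_possessive_redirect_sketch(
    topic: str, canonical: str, pre_path: str
) -> bool:
    """Subset rule first; possessor-in-canonical as the fallback."""
    if _tokens(pre_path) <= _tokens(topic):
        return True
    return bool(_possessors(topic) & _tokens(canonical))
```

``Newton's gravity`` fails the subset step because of ``laws`` and ``of`` in the pre-path, then passes the fallback because ``newton`` appears in the canonical; ``Darwin's evolution`` fails both steps.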

### Tests

39 new tests in ``tests/test_post_b9_beta_fixes.py`` across 5
classes. One b4 test mock updated to include ``pre_redirect_path``
reflecting the live libzim row shape since b6.

```
2487 passed, 54 skipped (full suite, ~28s)
pip-audit: no known vulnerabilities
```

mypy clean across 52 source files. black + flake8 + isort clean.

### Methodology — "fix unlocks new paths" 17 sweeps strong

The post-b9 sweep demonstrates the pattern again: b9's Z3 + OPP-1
fixes were conceptually correct but missed the actual live code
paths because I inferred the wrong match_types from upstream
behavior. Live diagnostic against the deployed b9 ZIM corpus
surfaced four new invariants this sweep pins down: (a) tail-hijack
hits direct match_type via Pass 1's tail probe, not fuzzy_suggest
via Pass 0; (b) Pass 1 / Pass 2 didn't call the accept gate; (c)
Newton's gravity redirect goes through a non-subset pre-path; (d)
discriminator needed to preserve the documented Pass 1 1-token-tail
feature.

## [2.0.0b9] — 2026-05-23 (beta pre-release) — post-b8 beta-test sweep shipped — Z3 non-possessive tail-hijack + OPP-1 possessor-in-canonical carve-out

Post-b8 sweep packaged from PR #178. Live-MCP verification against
v2.0.0b8 confirmed all prior b6/b7/b8 fixes land cleanly. ONE
HIGH-severity defect + ONE MEDIUM opportunity unlocked by deeper probing
of the non-possessive 3+ token shape.

### Z3 (HIGH) — Non-possessive multi-token tail-hijack

The b4 D2 raised-``min_len`` floor protected possessive topics from
trailing 1-token tails winning at strict 1.0. Non-possessive
multi-token queries still leaked the same hijack at Pass 0
(``_promote_topic_via_title_index``): libzim's title-suggest
fuzzy-matches a STRONG single token in the topic at score 0.95 and
returns just that token's canonical alone. The full-topic probe at
``min_score=0.95`` (added in b3) accepts the row because
``accept_possessive_promotion`` returned ``True`` for any
non-possessive topic.

Live silent-wrong-answer repros at v2.0.0b8 (all cert=0.85):

- ``Stalin USSR Russia`` → ``Russia`` (user wanted Stalin)
- ``Hitler Germany Berlin`` → ``Berlin`` (user wanted Hitler)
- ``Marie Curie polonium discovery`` → ``Discovery`` (a disambig
  page!)
- ``Marie Curie radioactivity`` → ``Radioactive_(Redniss_book)``
  (an obscure 2010 graphic novel surfaced via stemming match)
- ``Big Rapids Michigan tourism`` → ``Tourism`` (contradicts the
  ``iter_query_windows`` docstring's own canonical example,
  ``Big_Rapids,_Michigan``)
- ``O'Brien character 1984`` → ``1984`` (the year article)

#### Fix — non-possessive fuzzy_suggest gate

Two narrow rejections in the non-possessive branch when
``match_type="fuzzy_suggest"`` and the topic has 3+ tokens:

1. **Tail-token hijack** — canonical is a single token equal to
   the topic's LAST token. The user typed
   ``<subject> ... <generic>``; libzim returned the generic
   article. ``Hamlet Denmark prince`` → ``Hamlet`` stays accepted
   because the canonical sits at the HEAD position, not the tail.
2. **Zero-overlap stemming hit** — canonical's tokens have zero
   exact-overlap with topic's tokens (the match was via stemming
   only). The graphic novel surfaced for ``Marie Curie
   radioactivity`` because libzim's title index stems
   ``radioactivity`` to ``radioactive``; no other topic token
   matches the canonical, so the hit is one-stem-token-deep —
   too thin a signal to auto-fetch.

Topics with fewer than 3 tokens are unaffected.

Counter-cases the fix preserves: ``Hamlet Denmark prince`` →
``Hamlet``, ``Napoleon France emperor`` → ``Napoleon``,
``Apollo 11 moon landing`` → ``Moon_landing``,
``quantum mechanics Einstein`` → ``Albert_Einstein``,
``Lincoln Gettysburg Address`` → ``Gettysburg_Address``,
``Berlin Germany`` → ``Berlin``.
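
Both rejections and the counter-cases can be checked with a small sketch. The tokenizer and the function name are assumptions; the sketch covers only the token relationship and leaves out the ``match_type`` gate.

```python
import re


def _tokens(text: str) -> list[str]:
    return re.findall(r"[^\W_]+", text.lower())


def rejects_fuzzy_suggest_sketch(topic: str, canonical: str) -> bool:
    """True when either narrow rejection applies to a 3+-token topic."""
    topic_tokens = _tokens(topic)
    canonical_tokens = _tokens(canonical)
    if len(topic_tokens) < 3:
        return False
    # Rule 1: single-token canonical equal to the topic's last token.
    tail_hijack = canonical_tokens == topic_tokens[-1:]
    # Rule 2: no exact token shared between canonical and topic.
    zero_overlap = not set(canonical_tokens) & set(topic_tokens)
    return tail_hijack or zero_overlap
```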

### OPP-1 (MEDIUM) — Possessive fuzzy_suggest carve-out

The b6 D1 rule REJECTS every ``match_type="fuzzy_suggest"`` row
for a possessive topic. Live probe found this is too strict:
``Newton's gravity`` falls to BM25 even though
``Newton's_law_of_universal_gravitation`` is the obvious rank-1
BM25 canonical AND contains the possessor token ``newton``
literally.

#### Refinement

For possessive topics + ``fuzzy_suggest``, ACCEPT iff the
canonical path tokens include any of the topic's possessor tokens.
The canonical literally preserves the user's named entity,
signalling it's a longer-form expansion rather than the
``Darwin's evolution`` → ``Evolution`` shape that drops the
possessor.

Decision matrix for possessive + fuzzy_suggest:

| Topic | Canonical | Decision |
| --- | --- | --- |
| ``Newton's gravity`` | ``Newton's_law_of_universal_gravitation`` | ACCEPT (OPP-1) |
| ``Mary's lamb`` | ``Mary_Had_a_Little_Lamb`` | ACCEPT |
| ``Darwin's evolution`` | ``Evolution`` | REJECT (b6 D1 preserved) |
| ``Plato's republic philosophy`` | ``Czech_philosophy`` | REJECT (b6 Z1 preserved) |

Tokenization uses ``_TOKEN_RE`` (apostrophe-splitting), same as
the b8 Z1.1 subset rule for redirects, so ``newton's`` in the
canonical surfaces as the bare token ``newton`` for comparison.

### Refactor (Sonar S3776 + duplication)

Quality-gate-driven follow-ups landed in the same PR:

- Three per-branch helpers (``_accept_non_possessive``,
  ``_accept_possessive_fuzzy_suggest``,
  ``_accept_possessive_redirect``) were extracted from
  ``accept_possessive_promotion`` to bring its cognitive complexity
  from 21 to below the threshold of 15. No behaviour change.
- The three shared sweep test fixtures (``_make_simple_handler``,
  ``_fake_find_title_match``, ``_run_promote_simple``) moved to
  ``tests/_promote_fixtures.py``. b6/b7/b8 sweep test files now
  import from the shared module instead of duplicating locally.

### Tests

26 new tests in ``tests/test_post_b8_beta_fixes.py`` across 5
classes (``TestZ3NonPossessiveTailHijack``,
``TestZ3RegressionGuards``, ``TestOPP1PossessorInCanonical``,
``TestZ3PromoteIntegration``, ``TestStructuralGuards``).

```
2448 passed, 54 skipped (full suite, ~28s)
pip-audit: no known vulnerabilities
```

mypy clean across 52 source files. black + flake8 + isort clean.

### Methodology — "fix unlocks new paths" 16 sweeps strong

Each sweep peels back another layer; the post-b8 sweep generalised
the b4 D2 raised-min_len protection to non-possessive multi-token
topics, and relaxed b6 D1's blanket-reject when the canonical
preserves the possessor literally.

## [2.0.0b8] — 2026-05-22 (beta pre-release) — post-b7 beta-test sweep shipped — Z1.1 subset rule (Darwin's evolution truncation redirect)

Post-b7 sweep packaged from PR #176. Live-MCP verification against
v2.0.0b7 confirmed all prior fixes land cleanly EXCEPT the b6 Z1
fix for ``Darwin's evolution``: it still returned ``Evolution`` at
cert=0.85, the silent-wrong-answer the user originally flagged.

### Z1.1 (HIGH) — Pre-redirect-path containment check too lenient

The post-b6 Z1 filter rejected ``match_type="redirect"`` rows whose
pre-redirect path tokens didn't *contain* any of the topic's
possessor tokens. That correctly caught the
``Plato's republic philosophy`` → ``Czech_philosophy`` case (the
pre-path didn't contain ``plato`` at all).

But the post-b7 live probe surfaced a sibling shape: **2-token
possessive queries where the user typed a TRUNCATED form of a
longer canonical redirect**. libzim's suggestion-search returns a
redirect entry whose pre-path includes the possessor AND extra
tokens not in the topic; the redirect walks to a canonical that
loses the possessor entirely.

Live repro: ``tell me about Darwin's evolution`` →
``Evolution``. libzim returns a redirect entry like
``Darwin's_Theory_of_Evolution`` (pre-path tokens: ``{darwin, s,
theory, of, evolution}``). The b6 containment check accepts
because ``darwin`` IS in the pre-path — but the user's topic
``{darwin, s, evolution}`` doesn't contain ``theory`` / ``of``,
signalling that the user typed an abbreviated form. The resolved
canonical (``Evolution``) drops the possessor.

### Fix — subset rule

Tighten ``accept_possessive_promotion`` in ``title_promotion`` from
"any possessor token in pre-path" to "pre-path tokens ⊆ topic
tokens":

```python
# Before (b6 containment):
return bool(possessors & pre_tokens)
# After (b8 subset):
return pre_tokens.issubset(topic_tokens)
```

Strictly tighter than the containment check: any pre-path that's a
subset of the topic necessarily contains the possessor — so all
cases accepted by b6 with pre ⊆ topic continue to be accepted.
Cases accepted by b6 with pre having extras (the truncation
shape) are now rejected.

Decision matrix:

| Topic | Pre-path | Subset? |
| --- | --- | --- |
| ``Plato's cave`` | ``Plato's_cave`` | ✅ ACCEPT |
| ``Einstein's theory`` | ``Einstein's_theory`` | ✅ ACCEPT |
| ``Newton's gravity`` | ``Newton's_gravity`` | ✅ ACCEPT |
| ``Darwin's evolution`` | ``Darwin's_Theory_of_Evolution`` | ❌ REJECT |
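
As a worked example, the two rules can be run side by side on rows from the matrix. The tokenizer and possessor regex are assumptions, and `compare_rules` is a hypothetical helper, not part of ``title_promotion``.

```python
import re


def _tokens(text: str) -> set[str]:
    # Apostrophe-splitting tokenization (assumed): "Darwin's" -> {"darwin", "s"}.
    return set(re.findall(r"[^\W_]+", text.lower()))


def compare_rules(topic: str, pre_path: str) -> tuple[bool, bool]:
    """Return (b6 containment verdict, b8 subset verdict) for one redirect row."""
    topic_tokens, pre_tokens = _tokens(topic), _tokens(pre_path)
    possessors = set(re.findall(r"([^\W_]+)'s\b", topic.lower()))
    containment = bool(possessors & pre_tokens)
    subset = pre_tokens.issubset(topic_tokens)
    return containment, subset


for topic, pre_path in [
    ("Plato's cave", "Plato's_cave"),
    ("Darwin's evolution", "Darwin's_Theory_of_Evolution"),
]:
    print(topic, compare_rules(topic, pre_path))
```

Both rules accept the ``Plato's cave`` row; only the containment rule accepts the ``Darwin's evolution`` row, because ``theory`` and ``of`` are pre-path extras that the topic does not contain.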

Non-possessive topics, ``match_type="direct"``, and
``match_type="fuzzy_suggest"`` decisions are unchanged from b7.

### Tests

13 new tests in ``tests/test_post_b7_beta_fixes.py`` across 3
classes (TestSubsetRule with 6 parametrized + 4 standalone,
TestPromoteIntegration with 2, TestRegressionGuards with 1).

Full suite: **2423 passing, 54 skipped**. mypy clean across 52
source files. black + flake8 + pip-audit clean. All 14 CI checks
pass on PR #176 (first push — no cleanup waves needed, having
internalized the post-b6 Sonar feedback).

### Methodology — "fix unlocks new paths" 15 sweeps strong

Each prior sweep added a more discriminating signal until the
filter's behaviour aligns with user intent across every shape:

- **b6 Z1** introduced match_type (direct/redirect/fuzzy_suggest).
- **b6 Z1** sub-discriminates redirect via pre-path *containment*.
- **b8 Z1.1** (this sweep) refines pre-path containment to
  *subset*.

The pattern: each layer of discrimination catches a more specific
subset of the wrong-answer attack surface. Because the subset rule is
strictly tighter, the only previously accepted cases it now rejects
are those in the truncation-shape attack surface.

---

## [2.0.0b7] — 2026-05-22 (beta pre-release) — post-b6 beta-test sweep shipped — 2 defects (Z1 associative-redirect filter + Z2 synthesize insert shape)

Post-b6 sweep packaged from PR #174. Live-MCP verification against
v2.0.0b6 confirmed all prior fixes land cleanly (b3 Einstein's /
Plato's canonicals, b4 non-possessive carve-out, b3 trailing-modal
politeness, b2 D3 typo retry, all earlier b-series invariants).
TWO new HIGH-severity defects were unlocked by deeper probing of the
``match_type="redirect"`` shape and the synthesize-path
promotion's insert contract.

### Z1 (HIGH) — D1 filter misses associative redirects

The post-b4 D1 filter rejected ``fuzzy_suggest`` for possessive
topics but accepted ``redirect`` blindly. libzim's suggestion-
search occasionally produces an **associative redirect**: a
redirect entry whose pre-resolution path is unrelated to the user's
possessor entity, but whose redirect chain walks to a canonical
that shares one user-typed token.

Live silent-wrong-answers:

- ``tell me about Darwin's evolution`` → ``Evolution`` (cert=0.85)
- ``tell me about Plato's republic philosophy`` → ``Czech_philosophy``
  (cert=0.85)

### Z2 (HIGH) — Synthesize pass-0 produces malformed insert

The post-b4 D3 synthesize pass-0 inserted the raw ``find_title_match``
dict into ``top_hits``. The dict has shape ``{path, title, zim_file,
match_type, pre_redirect_path}`` but ``top_hits`` items expect the
``search_top_k`` shape ``{path, snippet, score}``. Downstream score-
sort demoted the canonical to the bottom when it wasn't already in
``top_hits``.

Live impact (``synthesize=true``): ``Einstein's theory`` →
``Theory_of_relativity`` surfaced at rank 6 with score 0 (BM25 hits
dominate; the buggy insert was demoted). ``Plato's cave`` happened
to work because ``Allegory_of_the_cave`` IS in BM25 top_hits — the
reorder branch fired with the existing properly-shaped entry.
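
The demotion can be reproduced with a toy score-sort. This is a sketch under assumptions: the scores are made up, the `rank` helper is hypothetical, and treating a missing `score` as 0 is inferred from the "score 0" observed live rather than taken from the source code.

```python
def rank(top_hits: list[dict]) -> list[dict]:
    """Sort hits best-first; a hit without a score sorts as 0 (assumption)."""
    return sorted(top_hits, key=lambda hit: hit.get("score", 0.0), reverse=True)


# Made-up BM25-style hits in the search_top_k shape {path, snippet, score}.
bm25_hits = [
    {"path": "Theory", "snippet": "...", "score": 0.62},
    {"path": "Albert_Einstein", "snippet": "...", "score": 0.58},
]

# Raw find_title_match-shaped dict: no "snippet" and no "score" key.
malformed = {
    "path": "Theory_of_relativity",
    "title": "Theory of relativity",
    "zim_file": "example.zim",  # hypothetical file name
    "match_type": "redirect",
    "pre_redirect_path": "Einstein's_theory",
}

# Properly shaped promoted hit, using the minimal fallback shape.
well_formed = {"path": "Theory_of_relativity", "snippet": "", "score": 1.0}

print([hit["path"] for hit in rank(bm25_hits + [malformed])])
print([hit["path"] for hit in rank(bm25_hits + [well_formed])])
```

With the malformed dict the canonical sorts last; with the `search_top_k`-shaped hit it sorts first.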

### Fixes

1. **``pre_redirect_path`` annotation** through
   ``find_entry_by_title_data`` (fast-path + suggestion-search).
   ``find_title_match`` propagates the field. Schema is
   non-breaking (``FindEntryHit.pre_redirect_path`` is
   ``NotRequired[str]``).
2. **New ``extract_possessor_tokens(topic)`` helper** pulls bare
   possessor tokens from each ``X's``/``X'`` shape.
   ``"Plato's cave"`` → ``["plato"]``; ``"O'Brien"`` → ``[]``
   (name, not possessive).
3. **New shared filter ``accept_possessive_promotion``** in
   ``title_promotion`` (single source of truth for ``simple_tools``
   AND ``synthesize``). Acceptance matrix:

   - Non-possessive topic: accept all match_types (b4 win preserved).
   - Possessive + direct: accept.
   - Possessive + fuzzy_suggest: REJECT (b6 D1).
   - Possessive + redirect: accept iff any query possessor token
     appears in the pre-redirect path's tokens.
   - Missing match_type: accept (backwards-compat).

4. **``search_top_k``-shaped pass-0 insert** in synthesize.
   ``_build_pass0_promoted_hit`` re-probes via
   ``search_handler.title_match_hit(archive, full_probe.title)``
   to produce the proper ``{path, snippet, score: 1.0}`` shape.
   Fallback to a minimal ``{path, snippet: "", score: 1.0}`` hit
   when the re-probe handler misses.
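
The acceptance matrix in fix 3 can be read as a small decision function. The sketch below is illustrative only: the regexes, the signatures, the handling of `match_type` values the matrix does not list, and the handling of a redirect with no recorded pre-redirect path are all assumptions.

```python
import re
from typing import Optional

# Assumed possessor pattern; the bounded quantifier keeps matching cheap.
_POSSESSOR_RE = re.compile(r"([a-z0-9]{1,64})['’](?:s(?![a-z0-9])|(?![a-z0-9]))")
_PATH_TOKEN_RE = re.compile(r"[a-z0-9]+")


def extract_possessor_tokens(topic: str) -> list[str]:
    """Return bare possessor tokens from each X's / X' shape."""
    return _POSSESSOR_RE.findall(topic.lower())


def accept_possessive_promotion(
    topic: str,
    match_type: Optional[str],
    pre_redirect_path: Optional[str] = None,
) -> bool:
    possessors = extract_possessor_tokens(topic)
    if not possessors:
        return True  # non-possessive topic: accept every match_type
    if match_type is None:
        return True  # missing match_type: backwards-compatible accept
    if match_type == "fuzzy_suggest":
        return False
    if match_type == "redirect":
        # Assumption: a redirect with no pre-redirect path is rejected.
        path_tokens = set(_PATH_TOKEN_RE.findall((pre_redirect_path or "").lower()))
        return any(token in path_tokens for token in possessors)
    return True  # "direct"; other values are accepted by assumption
```

For example, `extract_possessor_tokens("Plato's cave")` yields `["plato"]`, so a redirect whose pre-redirect path is `Plato's_cave` is accepted, while one whose pre-redirect path has no `plato` token is rejected.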

### Tests

20 new tests in ``tests/test_post_b6_beta_fixes.py`` across 5
classes (TestPreRedirectPathPropagation,
TestPossessorTokenExtraction with 12 parametrized cases,
TestRedirectFilterRejectsUnrelatedRedirect with 3 parametrized
cases, TestSynthesizePass0InsertShape, TestRegressionGuards).
Updated 2 b4 tests + 1 golden snapshot.

Full suite: **2410 passing, 54 skipped**. mypy clean across 52
source files. black + flake8 + pip-audit clean. All 14 CI checks
pass on PR #174 (after three cleanup waves: SonarCloud S1192 /
S5869 / S5799 deduplication; helper consolidation to
``title_promotion``; S5852 ReDoS bound on the possessor regex).

### Methodology — "fix unlocks new paths" 14 sweeps strong

Each prior sweep peeled back another layer; post-b6 added two:

1. ``match_type="redirect"`` was assumed semantic. The post-b6
   live probe revealed associative redirects where libzim's fuzzy
   token-matching produces a redirect entry whose pre-resolution
   path is unrelated to the user's possessor.
2. The synthesize pass-0 insert worked only when the canonical
   was already in BM25 top_hits. Otherwise the malformed insert
   leaked through and was demoted by score-sort.

Three new invariants pinned: pre-redirect-path propagation;
possessor-token filter for redirects; ``search_top_k`` shape for
synthesize pass-0 inserts.

---

## [2.0.0b6] — 2026-05-22 (beta pre-release) — CVE-driven lockfile bump (starlette PYSEC-2026-161)

Lockfile-only release re-rolling v2.0.0b5 after the release workflow's
`pip-audit` security gate caught a new starlette CVE that landed
between the v2.0.0b4 release and the v2.0.0b5 attempted publish.

### Vulnerability

- **PYSEC-2026-161** — starlette 1.0.0 → fix in 1.0.1. Transitive
  dependency via `mcp[cli]` and `sse-starlette`. Bumped via
  `uv lock --upgrade-package starlette`.

### Behavior changes

None. Code under `openzim_mcp/` and `tests/` is unchanged from the
v2.0.0b5 attempt. The full post-b4 sweep (FOUR defects + 1 latent +
2 audit defects, see v2.0.0b5 section below) ships in this release.

### Methodology note

The release workflow's `pip-audit` step at the start of "Test before
release" did its job: it caught a fresh CVE that landed between
PR-time CI (which doesn't run pip-audit) and release-time publish.
This matches the pattern of prior CVE-driven lockfile bumps (post-a19 idna
PR #151, post-a24 pyjwt PR #160). The v2.0.0b5 git tag exists on the
repo at the aborted merge commit (`385f72d`); v2.0.0b6 is the
released artifact.

---

## [2.0.0b5] — 2026-05-22 (beta pre-release) — post-b4 beta-test sweep shipped — 4 defects across 3 audit passes (aborted — see v2.0.0b6)

Post-b4 sweep packaged from PR #171 (commits `51158e9` → `8f9628d` →
`e6a778f`). FOUR defects + 1 latent surfaced by live-MCP probing of
v2.0.0b4, plus TWO additional defects caught by source-level
self-audits of the pass-1 fix itself. The "fix unlocks new paths"
methodology reproduced THREE times within a single sweep.

### D1 (HIGH) — b4 pass-0 gate can't distinguish redirect-0.95 from fuzzy-0.95

`find_entry_by_title_data` scores libzim's suggestion-search results on
a linearly-decaying rank formula capped at 0.95
(zim/search.py:2814-2822). The same 0.95 score covers both:

- a redirect walk (suggestion returned `Plato's_cave` redirect entry;
  `_follow_redirect_chain` walked to `Allegory_of_the_cave`)
- a pure fuzzy title-prefix match (suggestion returned `Evolution` for
  the query `Darwin's evolution`)

The b4 `min_score=0.95` gate accepted both. Live: `tell me about
Darwin's evolution` → `Evolution` at cert=0.85 (silent-wrong-answer).

### D2 (HIGH) — pass-1 `iter_query_tails` still strips apostrophes

The b4 fix only patched pass-0. Pass-1 (simple_tools.py:3925) still
consumed `_TAIL_TOKEN_RE` at title_promotion.py:188, which treated
apostrophes as token boundaries. `"plato's republic philosophy"` →
`["plato", "s", "republic", "philosophy"]`; 1-tail `"philosophy"`
matches canonical `Philosophy` at strict 1.0 → silently wins. Live
silent-wrong-answers (cert=0.85): `Plato's republic philosophy` →
`Philosophy`; `Einstein's theory history` → `History`; `Einstein's
theory tourism` → `Tourism`.

### D3 (HIGH) — synthesize `_promote_title_match` never got the b4 treatment

PR #169 only touched `_promote_topic_via_title_index`.
`_promote_title_match` in synthesize.py:869-950 iterated
`iter_query_tails(query)` at line 915 without the b4 pass-0 full-query
probe. Live (`synthesize=true`): `Einstein's theory` → rank-1 citation
`Theory` (expected `Theory_of_relativity`); `Plato's cave` → rank-1
`Cave` (correct article demoted to rank 2).

### D5 (LATENT) — pass-2 windows + pass-3 typo-tolerant tails

`iter_query_windows` and the pass-3 0.8-fuzzy tail probe share the
same tokenizer; the apostrophe-strip shape was masked in practice
because pass-1's strict-1.0 single-token tail short-circuited first.
Fixed for free by the tokenizer change.

### Pass-2 audit defect — synthesize pass-0 silently no-ops in production

The pass-1 fix called `find_title_match(archive, ...)` passing the
libzim `Archive` handle as arg-0. `find_title_match` calls
`arg0.find_entry_by_title_data(...)` — `Archive` has no such method,
so the call raised `AttributeError` inside the `except Exception`
wrapper and silently no-op'd in production. Tests passed only because
they `patch`-ed `find_title_match` at the import site, bypassing the
contract entirely. Fix: `search_handler` in production IS the
`ZimOperations` instance (simple_tools.py:5542:
`search_handler=self.zim_operations`). Pass it as arg-0.
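
The masking effect is easy to reproduce in isolation. In this sketch the two classes are stand-ins, not the real `Archive` or `ZimOperations`, and `promote` is a hypothetical wrapper that only imitates the broad `except Exception` shape.

```python
class Archive:
    """Stand-in for the libzim handle: no find_entry_by_title_data method."""


class ZimOperations:
    """Stand-in for the object that does expose the lookup method."""

    def find_entry_by_title_data(self, title: str) -> list[dict]:
        return [{"path": "Theory_of_relativity", "score": 1.0}]


def find_title_match(handler, title: str):
    """Delegate to arg-0; raises AttributeError if arg-0 lacks the method."""
    hits = handler.find_entry_by_title_data(title)
    return hits[0] if hits else None


def promote(handler, topic: str):
    """Broad except: any failure, including AttributeError, looks like a miss."""
    try:
        return find_title_match(handler, topic)
    except Exception:
        return None


print(promote(Archive(), "Einstein's theory"))        # silently no result
print(promote(ZimOperations(), "Einstein's theory"))  # the expected hit
```

A test that patches `find_title_match` never reaches the attribute lookup, so it passes with either object as arg-0. Calling the unpatched helper with a stand-in of the right shape is one way to exercise the contract.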

### Pass-3 audit defect — unconditional D1 filter regressed non-possessive prose

The pass-1 unconditional `match_type != "fuzzy_suggest"` gate silently
reverted a real b4 improvement for non-possessive prose queries of the
shape `<entity> <disambiguator>`. Trace `tell me about Berlin Germany`:

| Release | Pass-0 result | Final answer |
| --- | --- | --- |
| pre-b4 | (no pass-0) | `Germany` (pass-1 picks trailing tail) |
| b4 (pre-D1) | `Berlin` at fuzzy_suggest 0.95 — accepted | `Berlin` ✓ |
| b4 + pass-1 D1 (unconditional) | rejected | `Germany` ✗ |

Refine the gate: reject `fuzzy_suggest` ONLY when
`has_apostrophe_possessive(topic)` returns True. The
Darwin/Einstein/Plato silent-wrong-answer cases (possessive) are still
rejected; the Berlin/Apollo/Paris/Tokyo b4 improvements (non-possessive)
are preserved.
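
The refined gate can be sketched as follows. The regex behind `has_apostrophe_possessive`, the `passes_gate` name, and the hit dict layout are assumptions for the example, not the shipped code.

```python
import re

# Assumed possessive shape: X's or X' not followed by another alphanumeric.
_POSSESSIVE_RE = re.compile(r"[A-Za-z0-9]['’]s?(?![A-Za-z0-9])")


def has_apostrophe_possessive(topic: str) -> bool:
    return _POSSESSIVE_RE.search(topic) is not None


def passes_gate(topic: str, hit: dict, min_score: float = 0.95) -> bool:
    """Reject fuzzy_suggest only when the topic is possessive."""
    if hit["score"] < min_score:
        return False
    if hit.get("match_type") == "fuzzy_suggest" and has_apostrophe_possessive(topic):
        return False
    return True


fuzzy = {"path": "Berlin", "score": 0.95, "match_type": "fuzzy_suggest"}
print(passes_gate("Berlin Germany", fuzzy))      # non-possessive: accepted
print(passes_gate("Darwin's evolution", fuzzy))  # possessive: rejected
```

The score alone cannot separate the two cases, since both hits sit at 0.95; the decision comes from combining `match_type` with the shape of the topic.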

### Fixes

1. **`match_type` annotation through find_entry_by_title_data** — each
   result row now carries `match_type ∈ {"direct", "redirect",
   "fuzzy_suggest", "typo_corrected"}`. `find_title_match` propagates
   the field. Schema is non-breaking (`FindEntryHit.match_type` was
   already `NotRequired[str]` for the pre-existing typo annotation).

2. **Tokenizer fix** — `_TAIL_TOKEN_RE` keeps apostrophes (both
   straight `'` and curly `’`) inside otherwise-alphanumeric runs so
   `einstein's` stays one token.

3. **Possessive `min_len` floor** — new
   `has_apostrophe_possessive(topic)` helper. When True, pass-1 /
   pass-3 / synthesize-pass-1 use `min_len=2` in `iter_query_tails` /
   `iter_query_windows` so a generic 1-token tail can't silently
   outrank the canonical the pass-0 probe just missed.

4. **Pass-0 / pass-3 / synthesize pass-0 gate filter** — reject
   `fuzzy_suggest` ONLY when topic carries an apostrophe-possessive.

5. **Synthesize pass-0** — `_promote_title_match` mirrors the
   `_promote_topic_via_title_index` pass-0 probe at the start, with
   the same `match_type` filter. Receives `search_handler`
   (ZimOperations) as arg-0, not the libzim `Archive` handle.

### Tests

21 new tests in `tests/test_post_b4_beta_fixes.py` across 6 classes
(TestMatchTypePropagation, TestFuzzySuggestGateReject,
TestPossessiveTokenizer, TestPossessiveMinLenFloor,
TestSynthesizePromoteFullTopicProbe, TestRegressionGuards). Updated
1 pre-existing assertion in `tests/test_post_a17_beta_fixes.py`
(`O'Brien` tokenizes to `["o'brien"]` post-fix) and 1 in
`tests/test_post_b3_beta_fixes.py` (tail iteration no longer strips
apostrophes). Updated 1 golden snapshot to include
`match_type: "direct"` on the canonical fast-path hit.

Full suite: **2390 passing, 54 skipped**. mypy clean across 52 source
files. black + flake8 clean. All 14 CI checks pass on PR #171.

### Methodology evolution

- **"Fix unlocks new paths" — now 13 sweeps strong, reproduced 3x in
  this single sweep.** Pass-1 looked correct in isolation; pass-2
  found the synthesize API misuse silently no-op'd in production
  (tests masked it via `patch`-at-import-site); pass-3 found the
  unconditional gate regressed a real b4 improvement for non-
  possessive prose.
- **Three new invariants pinned**: (a) `_TAIL_TOKEN_RE` keeps
  apostrophes inside otherwise-alphanumeric runs; (b) at
  `min_score=0.95`, `match_type ∈ {direct, redirect, typo_corrected}`
  is safe to auto-fetch — `fuzzy_suggest` is only safe when the topic
  shape would naturally produce a meaningful first-token resolution
  (non-possessive); (c) when a new call site for `find_title_match`
  is added, verify arg-0 is the `ZimOperations`-shaped object — the
  `except Exception` wrapper masks `AttributeError` silently.

---

## [2.0.0b4] — 2026-05-22 (beta pre-release) — post-b3 beta-test sweep shipped — X's Y auto-fetch tokenization

Post-b3 sweep packaged from PR #169 (commit `f69db46`). ONE pre-existing
defect surfaced by deeper live-MCP probing of the `tell me about X's Y`
shape — the attack surface b2 D3 partially closed. All six post-b2 fix
families verified clean on live MCP first (D1-D4 + the pass-2/pass-3
siblings).

### Defect — `X's Y` auto-fetch silent-wrong-answer

`_promote_topic_via_title_index` (simple_tools.py:3868) iterates trailing
tails via `iter_query_tails` (title_promotion.py:191). `iter_query_tails`
tokenizes on alphanumeric runs, so the apostrophe in `X's Y` is treated
as a separator. The topic `"einstein's theory"` becomes the tokens
`["einstein", "s", "theory"]`; tails yielded longest-first:

- `"einstein s theory"` — no canonical match (the canonical is stored
  WITH the apostrophe — `Einstein's_theory` is a redirect to
  `Theory_of_relativity`)
- `"s theory"` — no canonical match
- `"theory"` — matches the generic `Theory` article at score 1.0 →
  wins → wrong article fetched

Live impact (post-b3 live probe):

- `tell me about Einstein's theory` → `Theory` (expected
  `Theory_of_relativity` — confirmed canonical at 1.00 via
  `find article titled einstein's theory`)
- `tell me about Plato's cave` → `Cave` (expected
  `Allegory_of_the_cave` — confirmed at 1.00)
- `tell me about Plato's Republic` → `Republic` (expected
  `Republic_(Plato)` — confirmed at 0.95)
- `tell me about Darwin's evolution` → `Evolution`

The bug is pre-existing — it would have affected any user typing `tell
me about X's Y` for years — but was masked because:

- Pre-b1, fuzzy-search rescues at search time often surfaced the right
  article through different ranking paths.
- Pre-b2 D3, the retry that exposes the probe gate's True/False
  decision didn't exist.
- The specific repros (Einstein's theory, Plato's cave/Republic)
  weren't in the prior adversarial set.

Why b2 D3 doesn't catch this: D3's probe gate correctly suppresses
decomposition when `title_probe(topic)` finds a canonical (because
`Einstein's_theory` redirects to `Theory_of_relativity` at score 1.0).
So topic stays as `einstein's theory` → the buggy old auto-fetch flow
runs → wrong tail wins. The defect is in the older auto-fetch flow's
tail-iteration, not in the b2 D3 retry.

### Fix

Probe the FULL topic (with original punctuation preserved) BEFORE
entering the tail iteration in `_promote_topic_via_title_index`.
`find_title_match` uses libzim's title index directly — it correctly
handles apostrophes and redirects. The new probe uses `min_score=0.95`
to mirror the canonical-or-fuzzy gate Rule 2/3/4 already use
(intent_parser.py:317), accepting both direct hits (1.0) and
high-confidence redirects (0.95). Live verification of the threshold:
`find article titled` returns score 1.00 for Einstein's theory /
Plato's cave; score 0.95 for Plato's Republic. All three need the 0.95
gate.

Non-possessive queries get the same result from this new probe that
they would get from pass-1's longest tail, so the extra call is redundant
for them but never less correct. The pre-existing prose-query case (`famous people
from big rapids michigan`) still falls through to tail iteration
cleanly because the prose phrase isn't itself canonical.
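
The ordering change can be illustrated with a toy title index. Everything here is a simplified stand-in: the `TITLE_INDEX` dict replaces libzim's title index, the strict 1.0 threshold on tails is an assumption, and `promote` only imitates the control flow of `_promote_topic_via_title_index`.

```python
import re
from typing import Optional

# Toy title index (hypothetical): lowercase title -> (resolved path, score).
TITLE_INDEX = {
    "einstein's theory": ("Theory_of_relativity", 1.0),
    "theory": ("Theory", 1.0),
}

# Pre-fix tokenizer shape: apostrophes act as separators.
_TOKEN_RE = re.compile(r"[a-z0-9]+")


def find_title_match(title: str, min_score: float) -> Optional[str]:
    hit = TITLE_INDEX.get(title.lower())
    if hit and hit[1] >= min_score:
        return hit[0]
    return None


def promote(topic: str, probe_full_topic: bool) -> Optional[str]:
    if probe_full_topic:
        # New step: probe the untouched topic first, punctuation included.
        hit = find_title_match(topic, min_score=0.95)
        if hit:
            return hit
    toks = _TOKEN_RE.findall(topic.lower())
    for i in range(len(toks)):  # tails, longest first
        hit = find_title_match(" ".join(toks[i:]), min_score=1.0)
        if hit:
            return hit
    return None


print(promote("Einstein's theory", probe_full_topic=False))  # generic tail wins
print(promote("Einstein's theory", probe_full_topic=True))   # full topic wins
```

A prose query with no canonical full-phrase title simply misses the first probe and continues into the tail loop, as before.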

### Tests

9 new tests in `tests/test_post_b3_beta_fixes.py` across 3 classes:

- `TestPossessiveAutofetchProbe` — verifies the full-topic probe fires
  first, uses `min_score=0.95`, runs before tail iteration, falls
  through cleanly on no-canonical, and preserves the apostrophe in the
  probed topic.
- `TestPossessivePromoteIntegration` — three end-to-end-shaped tests
  covering the live repros (Einstein's theory → Theory_of_relativity;
  Plato's cave → Allegory_of_the_cave; Plato's Republic →
  Republic_(Plato) at the 0.95 threshold).
- `TestRegressionGuards` — structural guard pinning that the first
  `find_title_match` call inside the method must use the bare `topic`
  argument (not a tail/window form).

Full suite: **2369 passing, 54 skipped**. mypy clean across 52 source
files. black + flake8 clean. All 14 CI checks pass.

### Methodology evolution

- **"Fix unlocks new paths" — now 10 sweeps strong.** Classic shape:
  b1 P1-D5 made `Photosythesis's reproduction` REACH the auto-fetch
  path; b2 D3 added the possessive retry to fix the immediate
  silent-wrong-answer; the post-b3 sweep's live probing of MORE
  possessive shapes (Einstein's theory, Plato's Republic, Plato's
  cave) revealed that the underlying auto-fetch flow has a
  tokenization bug for possessives where the full phrase is a
  canonical REDIRECT to a different article. Each sweep peels back
  another layer.
- **Single-defect sweep.** All eight b2 fix families + b3 sweep's
  three passes (b3 D1, pass-2 sibling, pass-3 sibling) verified
  clean on live MCP. The adversarial set was clean. The defect
  surfaced only via deeper probing of one specific query shape.
  Reinforces the post-a17 methodology refinement: "live re-probe
  is mandatory after deploy" — the auto-fetch tokenization bug
  is structurally invisible to unit tests but trivially visible
  to live-MCP probing of the right query shape.

---

## [2.0.0b3] — 2026-05-21 (beta pre-release) — post-b2 beta-test sweep shipped — 6 defects across 3 passes

Post-b2 sweep packaged from PR #167 (commits `45de8da` → `cc26b3d`).
Sweep shape: **4 → 1 → 1** across pass-1, pass-2, pass-3. All eight b2
user-facing fix families verified clean on live MCP first; sweep then
probed the adversarial shapes the b2 fixes unlocked. Both pass-2 and
pass-3 surfaced single narrow-scope siblings of pass-1 fixes —
consistent with the "narrow-scope sibling" pattern (now 8 sweeps
strong) and the "fix unlocks new paths" pattern (now 9 sweeps strong).

### Pass-1 defects (4, `45de8da`)

- **D1 — trailing modal politeness ≥2 words falls through.** The
  trailing-politeness regex in `_extract_tell_me_about` only matched
  `please` / `to me` / `for me`; the LEADING regex (line ~374)
  recognised the modal class (`could/can/would/will` + `you`) but
  the trailing twin was missing. Live: `tell me about Tokyo if you
  would` → `Would` (verb stub); `... if you could` → `Could`; `...
  would you` → `Would_You` disambig. Fix: add a trailing pattern
  symmetric to the leading one (both branches require a `you` so a
  bare trailing modal verb in real article titles isn't stripped).
- **D2 — reranker telemetry comment suppressed on no-results.** The
  b1 D-1 in-band telemetry contract promised `<!-- reranker=<state> -->`
  on every multi-token search. `_handle_search` compact path
  early-returned on `total == 0` BEFORE reaching
  `_maybe_rerank_compact`, so neither `_RERANKER_SKIPPED_NO_RESULTS`
  nor `_RERANKER_SKIPPED_NOT_INSTALLED` bumped and the envelope
  writer skipped the comment. Live: `search for asdfqwerzxcv
  nonexistent` → no reranker comment. Fix: invoke
  `_maybe_rerank_compact` on the empty payload before the bail
  (no-op aside from the counter bump; the rerank singleton is
  cached).
- **D3 — Rule 2 + multi-token possessive picks wrong token.** Live:
  `tell me about Photosythesis's reproduction` → `Reproduction`
  article (expected `Photosynthesis`). Rule 2's affix retry
  correctly fires (`Photosythesis's` → `Photosynthesis's`), but
  the b1 P1-D5 fix unlocked the path — pre-fix returned `No
  search results found`, post-fix returns a SILENT WRONG ANSWER.
  Root cause: Rule 4's `_POSSESSIVE_RE` is `^...$`-anchored and
  runs against the FULL query at parse time; the verb prefix
  prevents the match. Fix: in `_handle_tell_me_about`, when no
  decomposition hint was attached AND the topic carries an
  apostrophe-s followed by another token, retry
  `_decompose_x_of_y` on the bare topic. Scope narrowed to
  the possessive shape ONLY (NOT `X of Y`) to avoid regressing
  non-canonical X-of-Y queries.
- **D4 — compact filtered search drops "filtered" qualifier.**
  Live: `search Berlin in namespace C` → `Found 3 matches for
  "Berlin"` (legacy non-compact path emits `Found N filtered
  matches for "X"<filter_text>`). Both paths shared
  `_format_search_text`; pre-fix the formatter had no filter
  awareness. Fix: add optional `filter_text` kwarg to
  `_format_search_text` (mirrors `display_query`); compact filtered
  call site threads through `_format_filter_text` helper. Symmetric
  treatment for filtered no-results.
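
As an illustration of the D1 fix, a trailing-politeness pattern that shares one modal class between both modal branches might look like the sketch below. The pattern and helper name are assumptions, not the project's actual regex.

```python
import re

# One modal class shared by both modal branches (assumed spelling list).
_MODAL = r"(?:could|can|would|will)"

# Hypothetical trailing pattern: every modal branch requires a "you", so a
# bare trailing modal word in an article title is left alone.
_TRAILING_SKETCH_RE = re.compile(
    rf"(?:[\s,]+(?:please|to me|for me|if you {_MODAL}|{_MODAL} you))+[\s?.!]*$",
    re.IGNORECASE,
)


def strip_trailing_politeness(query: str) -> str:
    return _TRAILING_SKETCH_RE.sub("", query).strip()


print(strip_trailing_politeness("tell me about Tokyo if you would"))
print(strip_trailing_politeness("tell me about Free Will"))
```

Building both branches from the same `_MODAL` fragment is one way to keep leading and trailing patterns from drifting apart when a new modal is added.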

### Pass-2 sibling (1, `ed674b5`)

- **D1 universal-layer mirror.** Pass-1 added the modal-politeness
  strip inside `_extract_tell_me_about` only, but the universal
  `_TRAILING_POLITENESS_RE` (called by `_strip_trailing_politeness`
  at `parse_intent` line 1048) was added by the post-a20 PD2-1
  sweep specifically so every extractor sees the cleaned query.
  Every NON-tell_me_about intent kept leaking the modal class:
  `search for biology if you would` → `query="biology if you
  would"`; `find article titled Berlin if you would` → looks up
  `Berlin if you would` (not found). Fix: lift the modal class into
  `_TRAILING_POLITENESS_RE`. Pass-1 extractor-level strip kept as
  defense-in-depth. New invariant pinned:
  `TestD1RegexSync.test_leading_and_trailing_share_modal_class` —
  leading + trailing politeness regexes must share the modal class.

### Pass-3 sibling (1, `cc26b3d`)

- **Chained-intent trailing-politeness leak.**
  `_chained_intent_guidance` runs UPSTREAM of `parse_intent` on the
  raw user query. The post-a24 P1-D6 sweep mirrored the param-leak
  strip there; the equivalent mirror of `_strip_trailing_politeness`
  was never added. Pre-fix every trailing-politeness token (the
  full set, including the pass-2 modal class) leaked into chain
  rejection bullets — `tell me about Tokyo if you would then list
  namespaces` produced a rejection whose left bullet read
  `tell me about Tokyo if you would` verbatim, modal politeness and
  all. Caller would copy the suggested left half back,
  re-introducing the politeness on every iteration. Same structural
  sibling pattern as the post-a24 P1-D6 param-leak version. Fix:
  apply `_strip_trailing_politeness` to BOTH chain halves after the
  existing connector / punct trim loop, before bullets render.
  Per-half rather than full-query because the politeness can appear
  inside the chain (not just at the very end). Structurally safe —
  `_CHAINED_OPERATION_PREFIX_RE` checks the LEADING op verb, which
  the trailing strip never touches.

### Out of scope (deferred design call)

- **D5 — `death of stalin` → `Death_and_state_funeral_of_Joseph_Stalin`
  instead of the 2017 Iannucci film.** P1-D3 probe-gate correctly
  suppressed the Stalin disambig misroute; title-probe picked a
  different canonical X-related title rather than the film
  (canonical is `The_Death_of_Stalin`). Picking the film would
  require a prefix-widening probe (`The <query>`) — unwanted side
  effects on arbitrary bare topics — or a popularity ranker. Both
  are design choices beyond the b2 sweep scope.

### D2 / D3 / D4 sibling audits clean

- **D2**: `_handle_filtered_search` always routes through
  `_maybe_rerank_compact`; `_handle_search_all` uses its own
  rerank apply that bumps a counter on every path. `_handle_search`
  was the only early-return gap.
- **D3**: `_handle_tell_me_about` is the only handler that
  auto-fetches a single article based on the extracted topic.
  Other intents take the topic literally; synthesize uses RAG-style
  passage retrieval where decomposition would lose the attribute
  context (pre-existing design out of scope).
- **D4**: `_format_search_text` has three call sites — only the
  compact filtered one needed `filter_text`.
  `search_with_filters_with_canonical_splice` (non-compact filtered)
  already uses `_format_filtered_response` which natively emits
  the qualifier.

### Cross-feature composition verified

- `search for Photosythesis's reproduction in namespace C if you
  would` → universal trailing strip peels `if you would` → intent
  = filtered_search → `_maybe_rerank_compact` bumps counter →
  `_format_search_text` renders with `filter_text`. D1+D2+D4
  compose.
- `tell me about Photosythesis's reproduction if you would` →
  universal strip peels `if you would` → intent = tell_me_about
  → D3 retry fires on possessive topic → `photosynthesis`. D1+D3
  compose.

### Tests

- 40 new tests in `tests/test_post_b2_beta_fixes.py` across 10
  classes (`TestD1TrailingModalPoliteness`,
  `TestD1ParseIntentEndToEnd`, `TestD1SiblingUniversalTrailingModal`,
  `TestD1RegexSync`, `TestD1Pass3ChainedIntentPolitenessLeak`,
  `TestD2RerankerCounterOnNoResults`,
  `TestD3PossessiveDecompositionRetry`,
  `TestD4FilteredSearchEchoQualifier`, `TestRegressionGuards`).
- Full suite: **2360 passing, 54 skipped, 38 deselected**. mypy
  clean across 52 source files. black + flake8 clean. CI checks
  all green (CodeQL, SonarCloud, bandit, security scanning,
  6 OS × Python matrix, both `[reranker]`-extra suites,
  performance benchmarks).

### Methodology evolution

- **"Narrow-scope sibling" pattern** — now 8 sweeps strong. Both
  pass-2 and pass-3 surfaced a single sibling of pass-1's D1
  fix-family: pass-2 caught the universal-layer mirror (modal
  class missing from `_TRAILING_POLITENESS_RE`); pass-3 caught the
  upstream-chained-guidance mirror (trailing-politeness strip
  missing from `_chained_intent_guidance`). Both are STRUCTURAL
  mirrors of fixes already shipped — pass-2's sibling mirrors the
  post-a20 PD2-1 universal-strip extension, pass-3's sibling
  mirrors the post-a24 P1-D6 param-leak strip placement.
- **"Fix unlocks new paths"** — 9th consecutive sweep. D3 is
  particularly nasty because the failure mode changed from
  explicit `No search results found` (pre-b1 P1-D5) to silent
  wrong answer (post-b1 P1-D5 affix retry → post-b2 D3 retry).
- **New invariants pinned via canonical-source tests** — two
  feature-level guards: (a) leading + trailing politeness regexes
  must share the modal class; (b) the no-results early-return path
  in `_handle_search` must route through `_maybe_rerank_compact`.
  These pin the "added X to one side, forgot the other side"
  drift class that drove both pass-2 and pass-3 defects.

---

## [2.0.0b2] — 2026-05-21 (beta pre-release) — post-b1 beta-test sweep shipped — operational fixes + 8 D-2 user-facing defects + D-1 in-band telemetry

Post-b1 sweep packaged from PR #165 (commits `bbda863` → `261412b`).
The first b-series release shipped sub-D-1 (cross-encoder reranker) +
sub-D-2 (Tier-1 query rewriting); the post-b1 sweep covers three
distinct surfaces across two waves.

### Wave 1 — D-1 operational layer (`bbda863`)

Three defects/gaps surfaced operating the b1 reranker integration
end-to-end against the air-gapped pre-stage workflow advertised in
`docs/extras-reranker.md`.

- **`download-models` CLI ignored env vars.** The CLI built a bare
  `RerankerConfig()` (`BaseModel`, not `BaseSettings`), so
  `OPENZIM_MCP_ML__RERANKER__*` env vars (including `cache_dir`)
  were silently dropped. Pre-staging wrote to the FastEmbed default
  cache instead of the operator-configured directory, so the runtime
  re-downloaded on first call. Fix: the CLI now routes through a small
  staging `BaseSettings` that mirrors `OpenZimMcpConfig`'s env prefix +
  delimiter and reads `ml.reranker` from env.
- **Reranker telemetry only visible in advanced tool mode.** The four
  reranker events (`reranker_engaged` / `reranker_skipped.*`) lived
  only in `self._telemetry` Counter, surfaced via
  `get_server_health`. Simple-mode operators had no way to confirm
  rerank was actually engaging. `_track()` now also emits a one-line
  INFO log for the four reranker events; `synthesize.py` bumps its
  four DEBUG logs to INFO for consistency. Counter behaviour
  unchanged.
- **`first_call_timeout_seconds` default 5.0s too tight.** Typical
  ONNX session creation on a warm cache takes 7-10s on modest
  hardware; the 5s default tripped the kill switch even with
  pre-staged models. Raised default to 15.0s; bounds (0.1-120.0)
  unchanged.
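
The telemetry change in the second item can be pictured as a counter bump plus a log line. This is a sketch under assumptions: the `Telemetry` class, the logger name, and the log message format are hypothetical; only the event-name pattern comes from the text above.

```python
import logging
from collections import Counter

logger = logging.getLogger("openzim_mcp.sketch")  # hypothetical logger name

# Event-name pattern from the text: reranker_engaged / reranker_skipped.*
_RERANKER_EVENT_PREFIXES = ("reranker_engaged", "reranker_skipped.")


class Telemetry:
    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def track(self, event: str) -> None:
        """Bump the counter; reranker events also get a one-line INFO log."""
        self.counts[event] += 1
        if event.startswith(_RERANKER_EVENT_PREFIXES):
            logger.info("telemetry event=%s count=%d", event, self.counts[event])
```

Because the log line is emitted in addition to the counter bump, operators without access to the health tool can still confirm engagement from the server log.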

### Wave 2 — D-2 user-facing defects + D-1 in-band telemetry surface (`54e68af` → `60d8532`)

Three live-MCP probe passes against `wikipedia_en_all_maxi_2026-02.zim`
(118 GB) surfaced 8 user-facing defects in the sub-D-2 wiring plus an
in-band visibility gap for sub-D-1. The "narrow-scope sibling" pattern
now extends at the FEATURE level: every probe-gated rule (Rules 2, 3,
4) shared the same `_build_title_probe(zim_file_path)`-before-auto-
resolve flaw, defeating the suppression for the dominant calling
pattern.

#### Pass-1 defects (6, `54e68af`)

- **P1-D1 — title_probe gated on caller-supplied `zim_file_path`
  BEFORE auto-archive-resolution.** `handle_zim_query` auto-selects
  the single loaded archive downstream (line ~776) when
  `zim_file_path` is omitted (the recommended pattern per the tool's
  own docstring). The probe was built earlier (line ~624) from the
  raw caller argument, so omitted-path callers got a `None` probe —
  Rules 2 (misspellings), 3 (article-strip), and post-fix Rule 4
  (X-of-Y decomposition) silently degraded. Live: `the Beatles` →
  disambig; `An American in Paris` → 🚨 `Niggas_in_Paris` (offensive
  misroute); `A Christmas Carol` → `Christmas_carol` concept. Fix:
  new `_probe_archive_path()` helper auto-resolves before the probe
  is built; mirrored at the synthesize wiring site.
- **P1-D2 — Rule 1's full-query lowercase leaked into user-facing
  chain rejection bullets and soft-connector footer.** Bullets/
  footer echo entities from `params["topic"]` which is extracted
  from the lowercased query. Live: `tell me about Köln, München,
  and Berlin` → bullets read `tell me about köln` / `münchen` /
  `berlin` — diacritics + casing corrupted, breaking the user's
  recovery copy-paste path. Fix: stash `params["_pre_rewrite_query"]`
  (original-case) at the wiring layer; new
  `_recase_from_original()` helper finds each lowercase token in
  the original via case-insensitive substring lookup; threaded
  into both surfaces.
- **P1-D3 — Rule 4 `_decompose_x_of_y` had no title-probe guard at
  all.** Every `<word> of <stuff>` whose attribute word wasn't in
  the structural-intent skip-set decomposed unconditionally,
  including canonical multi-word titles: `lord of the rings` →
  `The_Rings` (1985 Iranian horror film); `the art of war` → `War`
  concept; `wealth of nations` → `Nation`; `origin of species` →
  `Species`; `birth of venus` → Venus disambig; `death of stalin`
  → Stalin disambig; `history of rome` → `Rome` city. Fix: mirror
  Rule 3's probe gate inside Rule 4 — when `title_probe(query)`
  returns True, return `(query, None)` and suppress decomposition.
- **P1-D5 — Rule 2 missed possessives.** Whitespace-only token
  split meant `photosythesis's` didn't match map key
  `photosythesis`. Live: `tell me about Photosythesis's
  reproduction` → `No search results found`.
- **P1-D6 — Rule 2 missed leading/trailing punctuation.** `bilogy.`
  / `"recieve"` / `(photosythesis)` bypassed Rule 2 because the
  lookup key included the punctuation. Combined fix: new
  `_split_misspelling_affixes(token)` helper splits each token into
  `(prefix, core, suffix)` via linear-time `isalnum`/`_` scanning;
  on map miss, retry the core (after peeling trailing `'s`) and
  reattach affixes on hit.
- **D-1 in-band telemetry**: snapshot the four reranker counters at
  the start of `handle_zim_query`; new `_compute_rerank_state()`
  returns per-request engagement state (`engaged` / `skipped:not_installed`
  / `skipped:no_results` / `skipped:passthrough`) from
  the post-call delta; appended as `<!-- reranker=<state> -->` in
  the response envelope (mirrors the existing `<!-- intent=... cert=... -->`
  pattern). Solves the methodology gap where the live
  MCP transport filters out `get_server_health`, leaving reranker
  engagement invisible to simple-tool sweeps.
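
The P1-D3 probe gate can be sketched roughly as below. This is a minimal illustration, not the shipped code: `decompose_x_of_y`, the regex, and the set-backed probe are hypothetical stand-ins. Only the `(query, None)` return on a probe hit and the `entity` / `attribute` hint keys are taken from this changelog.

```python
import re
from typing import Callable, Dict, Optional, Tuple

# Hypothetical pattern: "<attribute> of <entity>" on an already-lowercased query.
_X_OF_Y = re.compile(r"^(?P<attr>\w+) of (?P<entity>.+)$")


def decompose_x_of_y(
    query: str, title_probe: Callable[[str], bool]
) -> Tuple[str, Optional[Dict[str, str]]]:
    """Return (rewritten_query, hint), or (query, None) when no rewrite applies."""
    # Probe gate: a query that is itself a canonical title is left untouched.
    if title_probe(query):
        return query, None
    match = _X_OF_Y.match(query)
    if match is None:
        return query, None
    hint = {"entity": match["entity"], "attribute": match["attr"]}
    return f"{hint['entity']} {hint['attribute']}", hint


# Stand-in for a title-index lookup against the archive (assumption).
known_titles = {"the art of war", "wealth of nations"}
probe = known_titles.__contains__

print(decompose_x_of_y("wealth of nations", probe))
print(decompose_x_of_y("population of berlin", probe))
```

Placing the probe before the pattern match means a canonical title never reaches the decomposition branch, whatever its attribute word is.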

#### Pass-2 sibling defects (2, `e6f320e`)

- **P2-D1 — disambig heading echoed lowercase topic.** Sibling of
  P1-D2 in `_render_disambiguation`. Pre-fix: `tell me about
  Stalin` → `**Multiple articles match "stalin"**`. Fix: new
  optional `original_query` kwarg; `_recase_from_original` recovers
  the caller's casing (covers diacritics too — `tell me about
  München` → `"München"`).
- **P2-D2 — empty-result handler echoed lowercase query.** Sibling
  of P1-D2 in `_handle_search` compact path's `No results for "X"`
  body. Same recase via `_recase_from_original`; falls back to
  `search_query` when the original isn't available.
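
A case-insensitive substring lookup of the kind `_recase_from_original` is described as doing could look like this. The signature and the length guard are assumptions for illustration; the real helper may differ.

```python
def recase_from_original(lowered: str, original: str) -> str:
    """Recover the caller's casing for `lowered` if it occurs in `original`.

    `lowered` is expected to be lowercase already. Falls back to `lowered`
    when it cannot be located.
    """
    haystack = original.lower()
    # Lowercasing can change string length for a few Unicode characters;
    # in that case the indices would not line up, so give up safely.
    if len(haystack) != len(original):
        return lowered
    start = haystack.find(lowered)
    if start < 0:
        return lowered
    return original[start : start + len(lowered)]


original_query = "tell me about Köln, München, and Berlin"
for topic in ("köln", "münchen", "berlin", "paris"):
    print(topic, "->", recase_from_original(topic, original_query))
```

The fallback branch corresponds to the "falls back to `search_query`" behaviour noted above: when the original is missing or does not contain the token, the lowercase form is echoed unchanged.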

#### Pass-3 backend echo-string plumbing (`60d8532`)

- **P3-D1 — search backend echo strings.** Five user-facing echo
  sites in `zim/search.py` read the lowercased query directly:
  `Found N matches for "X"`, `No search results found for "X"`,
  `No filtered matches for "X"`, `Found N matches for "X", but
  offset N exceeds...`, and the filtered-search header. New
  optional `display_query` kwarg threaded through
  `_format_search_text`, `search_zim_file`,
  `_format_filtered_response`, `_perform_filtered_search`,
  `search_with_filters`, and
  `search_with_filters_with_canonical_splice`. Cache key in
  `search_with_filters` includes `display_query` so two calls with
  the same matched query but different display forms don't
  cross-contaminate. Backend matching is unchanged — Xapian is
  case-insensitive, `payload["query"]` keeps the lowercased form
  for cursor/cache stability; only the rendered echoes pick up the
  original case.

### CI cleanup (`261412b`)

- **flake8 E303** in test file (three blank lines before an inline
  `import`).
- **isort + black** drift across the three modified modules.
- **SonarCloud S5852 (ReDoS hotspot)** — pass-1's
  `_MISSPELL_AFFIX_RE = re.compile(r"^(\W*)(.*?)(\W*)$")` tripped
  the polynomial-backtracking detector (lazy `.*?` between two
  greedy `\W*`). Replaced with the linear-time
  `_split_misspelling_affixes(token)` helper — same precedent the
  post-a22 sweep applied to a sibling S5852 hit in
  `_chained_intent_guidance`.
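
For comparison with the flagged regex, a linear-time affix split might look like the sketch below. The bodies are assumptions reconstructed from this changelog (scan with `isalnum`/`_`, peel a trailing `'s`, reattach affixes on a hit); `correct_token` and the one-entry map are hypothetical.

```python
from typing import Dict, Tuple


def split_misspelling_affixes(token: str) -> Tuple[str, str, str]:
    """Split `token` into (prefix, core, suffix) with two linear scans."""

    def is_word_char(ch: str) -> bool:
        return ch.isalnum() or ch == "_"

    start, end = 0, len(token)
    # Walk inward from each end until a word character is found.
    while start < end and not is_word_char(token[start]):
        start += 1
    while end > start and not is_word_char(token[end - 1]):
        end -= 1
    return token[:start], token[start:end], token[end:]


def correct_token(token: str, misspellings: Dict[str, str]) -> str:
    """Look up `token`; on a miss, retry the bare core and reattach affixes."""
    if token in misspellings:
        return misspellings[token]
    prefix, core, suffix = split_misspelling_affixes(token)
    possessive = ""
    if core.endswith("'s"):
        core, possessive = core[:-2], "'s"
    fixed = misspellings.get(core)
    if fixed is None:
        return token
    return f"{prefix}{fixed}{possessive}{suffix}"


starter_map = {"photosythesis": "photosynthesis"}
print(correct_token("(photosythesis)", starter_map))
print(correct_token("photosythesis's", starter_map))
```

Each character is visited at most once per scan, so there is no backtracking for a static analyzer to flag.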

### Tests

- 76 new tests in `tests/test_post_b1_beta_fixes.py`, organised by
  defect class (`TestP1D1ProbeArchiveResolution`,
  `TestP1D3Rule4ProbeGate`, `TestP1D5PossessiveMisspelling`,
  `TestP1D6PunctuationMisspelling`, `TestP1D2RecaseHelper`,
  `TestRerankerStateComment`, `TestPass2SiblingDefects`,
  `TestPass3SearchBackendEchoPlumbing`,
  `TestPass2CrossFeatureIntegration`, `TestRegressionGuards`).
- 6 pre-existing post-a-series test assertions updated to reflect
  the corrected post-P1-D2 original-case echoes.
- Full suite: **2320 passing, 54 skipped**. mypy clean across 53
  source files. All 14 CI checks pass (SonarCloud, CodeQL, bandit,
  security scanning, the 6 OS × Python matrix, both
  `[reranker]`-extra suites, performance benchmarks).

### Methodology evolution

- **"Narrow-scope sibling" pattern at the FEATURE level** —
  reproduced for the 7th sweep but now applies to NEW-FEATURE
  wiring shipped in the same release: every probe-gated Tier-1
  rule (Rules 2, 3, 4) shared the same
  probe-construction-before-auto-resolve flaw. The probe wiring
  shipped narrower than the
  recommended call pattern; the recommended pattern (omit
  `zim_file_path`) defeated it.
- **"Fix unlocks new paths"** — 6th consecutive sweep; D-2 Rule 4
  LANDED (`population of berlin` decomposes cleanly) but exposed
  the canonical-X-of-Y-title decomposition family that had no
  guard at all.
- **Three-wave sibling progression** — pass-1 (6 live-probed) →
  pass-2 (2 source-audited siblings in dispatcher-edge code) →
  pass-3 (5 backend-plumbed siblings of P2-D2) — validates the
  "defer scope-creep items to a follow-on pass" rule. Pass-2's
  explicit deferral let the dispatcher-edge fix ship without
  blocking on the multi-function backend plumbing.

---

## [2.0.0b1] — 2026-05-21 (beta pre-release) — Phase D sub-D-2 — Tier 1 query rewriting

First b-series release. Ships Phase D sub-D-2: four idempotent rule-based
query rewrites that run before the existing intent regex chain. Zero new
dependencies — every user gets the lift on the next install. The four
rules are:

- **Rule 1 — lowercase topic normalization.** `_normalize_topic_case`
  consolidates scattered `.lower()` calls into a single named pass that
  runs first on every query.
- **Rule 2 — misspelling map.** `_apply_misspelling_map` substitutes
  tokens from a bundled `dict[str, str]` (~40 starter entries from
  Wikipedia's "List of common misspellings (for machines)"). An optional
  title-index probe suppresses substitutions where the original token is
  itself a canonical entity name. Operators can override the bundled
  list via `query_rewrite.misspelling_map_path` and pin exceptions in
  the companion exclusions file. Hard-capped at 500 entries.
- **Rule 3 — stopword phrase detection.** `_detect_stopword_phrase`
  strips leading stopwords (`the`, `a`, `an`, `of`) unless the full query
  is itself a canonical title (`The Beatles`, `Of Mice and Men`).
  Title-probe-gated; one probe call per query maximum.
- **Rule 4 — `X of Y` decomposition.** `_decompose_x_of_y` recognises
  `population of berlin` and `berlin's population` shapes, returning
  both a cleaner query string (`berlin population`) and a structured
  `{"entity": ..., "attribute": ...}` hint that rides inside
  `params["decomposition_hint"]`. `_handle_tell_me_about` consumes the
  hint and uses the structured entity directly, skipping its own
  topic-extraction logic.

### Config

```python
class QueryRewriteConfig(BaseModel):
    enabled: bool = True
    misspelling_map_path: Path | None = None
    misspelling_exclusion_path: Path | None = None
```

`enabled = False` short-circuits all four rules — queries reach the
existing regex chain unchanged.
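
How the kill switch and the fixed rule order interact can be sketched as follows. This is an assumption-laden illustration: `QueryRewriteSettings` is a plain-dataclass stand-in for `QueryRewriteConfig`, and `rewrite_query` plus the two toy rules are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

Rule = Callable[[str], str]


@dataclass
class QueryRewriteSettings:
    """Stand-in for QueryRewriteConfig; only the `enabled` flag is modelled."""

    enabled: bool = True


def rewrite_query(query: str, settings: QueryRewriteSettings, rules: List[Rule]) -> str:
    # Master kill switch: skip every rule, not just the surrounding telemetry.
    if not settings.enabled:
        return query
    # Rules run in a fixed order; each one is expected to be idempotent.
    for rule in rules:
        query = rule(query)
    return query


# Toy versions of Rule 1 (lowercase) and Rule 2 (misspelling map).
toy_map = {"bilogy": "biology"}
rules: List[Rule] = [
    str.lower,
    lambda q: " ".join(toy_map.get(tok, tok) for tok in q.split()),
]

print(rewrite_query("Tell Me About Bilogy", QueryRewriteSettings(), rules))
print(rewrite_query("Tell Me About Bilogy", QueryRewriteSettings(enabled=False), rules))
```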

### Telemetry

Three new dot-separated events surface via the existing `_track()`
mechanism and `get_server_health`:

- `query_rewrite.misspelling` — Rule 2 substituted at least one token.
- `query_rewrite.stopword_phrase` — Rule 3 stripped a leading article.
- `query_rewrite.x_of_y` — Rule 4 matched and emitted a hint.

Rule 1 has no event (fires on essentially every query — zero signal).

### Risk mitigations baked in

- **Master kill switch** (`config.query_rewrite.enabled=False`) actually
  skips all four rules, not just the telemetry/probe wrapping.
- **Title-index probe** (when an archive is in scope) suppresses
  false-positive rewrites on real proper nouns.
- **Hard cap** of 500 entries in the misspellings map keeps the lookup
  cheap and the file reviewable.
- **Exclusions file** ships empty and grows reactively when sweeps
  observe a real proper noun getting misrouted (e.g., a surname or
  band name that happens to match a misspelling entry).
- **Idempotent rules** — running any rule twice produces the same
  output as running it once; rule order (1 → 2 → 3 → 4) is fixed.
- **Several ripple-effect compensations** in the existing intent chain
  for code paths that depended on case-preserving inputs (regex
  anchors with `[A-Z]`, `isupper()` checks, length-based bare-topic
  thresholds, namespace-path extraction).

### Tests

- 43 new tests in `tests/test_query_rewrite_tier1.py` (per-rule fix /
  no-op / boundary triads, integration, composition, hint handoff).
- 13 pre-existing test files updated for the lowercase ripple
  (assertions changed from `"Proper Case"` → `"lowercase"` to match
  Rule 1's unconditional normalization).
- Full suite: 2245 passing, 54 skipped — sub-D-1 reranker integration
  untouched.

### Not in Tier 1

- Multi-hop questions (`what year did the inventor of X die`) — deferred
  to a potential sub-D-3 if live evidence warrants.
- HyDE / hypothetical document synthesis — locked-in non-goal.
- Algorithmic spell-correction libraries (`pyspellchecker`, `autocorrect`)
  — wrong precision/recall tradeoff for encyclopedia search.

---

## [2.0.0a25] — 2026-05-20 (alpha pre-release) — post-a24 beta-test sweep — 6 live-Wikipedia defects across two passes

Live-MCP beta sweep against `wikipedia_en_all_maxi_2026-02.zim` on
the freshly-deployed `v2.0.0a24` build. Smoke gates 4/4 green
pre-fix; pass-1 surfaced 6 defects across 4 surfaces, pass-2
source-level audit surfaced zero new defects. **"Narrow-scope sibling"
pattern is now 5 sweeps strong: ALL SIX defects this sweep are
narrower-than-needed shapes on the matching a24 fix** — two on
`_looks_like_slashed_compound` (digit halves, mixed-case short
halves), two on `_TRAILING_POLITENESS_RE` (multi-word multilingual,
third-wave single-word multilingual), one on `_PARAM_LEAK_RE`
(missing `query` token), one on `_chained_intent_guidance`
(param strip not applied before chained-intent detection). The
post-a24 sweep continues to validate the "narrow scope, widen
preemptively" methodology refinement.

### Slashed-compound helper digit-half widening (P1-D1)

The a24-shipped `_looks_like_slashed_compound` accepts letter-only
halves with `min ≤ 2` — tuned for short ALL-CAPS acronyms
(`TCP/IP`, `AC/DC`). Digit halves slipped through:

- `tell me about 9/11 and World War II` → 3-entity chain rejection
  naming `9`, `11`, `World War II`. But `9/11` is a single event.
- `tell me about 24/7 and 9 to 5` → 3-entity chain naming `24`,
  `7`, `9 to 5`. But `24/7` is a single phrase.

Same shape for `5/4` (time signature / fraction), `12/24` (date),
`2024/25` (sports season). All are conceptual single entities with
small-digit halves.

Fix: detect all-digit halves and accept the compound when the shorter
half is ≤ 2 chars. This catches the common date / ratio / sports-season
shapes and rejects `2024/2025` (min=4), which reads more naturally as
two distinct years. Mixed letter+digit halves (`A/4`, `X/12`)
still split.

### Slashed-compound helper short mixed-case widening (P1-D2)

Sibling shape: paired-concept TitleCase compounds like `Yin/Yang`,
`Hot/Cold`, `Wet/Dry`, `Mac/Cheese`, `Salt/Pepper` have letter-only
halves of 3-4 chars. Pre-fix:

- `tell me about Yin/Yang and the Tao` → slash split into
  `["Yin", "Yang", "the Tao"]`; Yin and Yang both failed
  substantive (3-4 char ASCII TitleCase, no digit, no non-Latin);
  `_split_multi_entity` returned None and the chain was abandoned
  silently, returning the Tao article with Yin/Yang dropped.

Same shape for `Hot/Cold and Wet/Dry`, `Light/Dark`, etc. Fix:
widen letter floor from `min ≤ 2` to `min ≤ 4`. Catches the short
paired-concept compounds without affecting `Berlin/Munich`
(min=6), `Tokyo/Kyoto` (min=5), and other longer proper-noun
pairs that genuinely benefit from splitting.
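
Taken together, the P1-D1 and P1-D2 thresholds can be sketched as one predicate. The function below is illustrative only, not the shipped `_looks_like_slashed_compound`. The digit rule is read here as "shorter half ≤ 2 chars", an assumption chosen because it accepts `2024/25` and rejects `2024/2025` (min=4) as described above.

```python
def looks_like_slashed_compound(text: str) -> bool:
    """Return True when `text` is a two-part slashed compound to keep whole."""
    halves = [half.strip() for half in text.split("/")]
    if len(halves) != 2 or not all(halves):
        return False  # no slash, a 3-part slash, or an empty half
    shorter = min(len(half) for half in halves)
    if all(half.isdigit() for half in halves):
        return shorter <= 2  # 9/11, 24/7, 2024/25
    if all(half.isalpha() for half in halves):
        return shorter <= 4  # TCP/IP, Yin/Yang, Hot/Cold
    return False  # mixed letter+digit halves still split


for sample in ("9/11", "2024/25", "2024/2025", "Yin/Yang", "Berlin/Munich", "A/4"):
    print(sample, looks_like_slashed_compound(sample))
```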

### Politeness regex multi-word multilingual extension (P1-D3)

The a24 multi-word politeness additions (`thanks a million`,
`thank you very much`) were English-only. Multi-word counterparts
in other languages either leaked or were only partially peeled:

- `merci beaucoup` (French) — leaked entirely
- `vielen dank` (German) — leaked entirely (`dank` without `e`
  not in the prior single-word token list)
- `muchas gracias` (Spanish) — `gracias` peeled, `muchas` left
- `arigatou gozaimasu` (Japanese formal) — leaked entirely
- `domo arigato` (Mr. Roboto era) — `arigato` peeled, `domo` left
- `terima kasih` (Malay / Indonesian) — leaked entirely

Fix: each multi-word phrase listed as an explicit alternation
entry before the single-word forms (so the maximal phrase wins).
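
A reduced version of that alternation, with the multi-word phrases listed first, might look like this. The token subset, the pattern, and `strip_trailing_politeness` are assumptions for illustration; the real `_TRAILING_POLITENESS_RE` covers far more tokens.

```python
import re

# Multi-word phrases come before their single-word counterparts.
_POLITENESS = re.compile(
    r"(?:^|\s+|[,;.!?]\s*)"
    r"(?:merci beaucoup|vielen dank|muchas gracias|merci|danke|gracias)"
    r"[.!?]*$",
    re.IGNORECASE,
)


def strip_trailing_politeness(query: str) -> str:
    """Peel trailing politeness tokens until the query stops changing."""
    previous = None
    while previous != query:
        previous = query
        query = _POLITENESS.sub("", query).rstrip()
    return query


print(strip_trailing_politeness("search for biology muchas gracias"))
print(strip_trailing_politeness("tell me about Berlin, merci beaucoup!"))
```

The loop makes the strip idempotent and also handles stacked tokens such as `danke merci`.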

### Politeness regex third-wave single-word multilingual (P1-D4)

Sibling shape: more single-word multilingual tokens live-observed:
`mahalo` (Hawaiian), `xie xie` / `xièxie` (Chinese, romanized),
`shukran` (Arabic), `kiitos` (Finnish), `tack` (Swedish; 4 chars, so the
leading word-boundary anchor protects against embedded matches in
`attack` / `thumbtack`), `gomawo` / `kamsahamnida` (Korean, romanized),
`dhanyavad` (Hindi, romanized), `domo` / `gozaimasu` (Japanese
remainder fragments). Defence-in-depth via the canonical
`_TRAILING_POLITENESS_RE` propagates to every site that calls
`IntentParser._strip_trailing_politeness`.

### Param-leak strip `query` token (P1-D5)

The a24-shipped `_strip_param_leaks` covered 13 of the 14 public
`zim_query` arguments. The 14th — `query` itself — was missing.
Live: `tell me about Photosynthesis query=biology` returned the
`Biology` disambiguation page (the `=biology` suffix prevented
title promotion from cleanly resolving Photosynthesis; the
fuzzy-match path then resolved to `Biology` instead). Fix: add
`query` to the strip token set.

### Param-leak strip not applied before chained-intent detection (P1-D6)

The a24-shipped `_strip_param_leaks` runs inside `parse_intent`,
but the dispatcher's `_chained_intent_guidance(query)` call runs
upstream of that on the RAW user query. Live: `tell me about
Berlin limit=5 then list namespaces` surfaced a chained-intent
rejection whose `**First op (left)**: tell me about Berlin
limit=5` carried the leaked param verbatim — a user copying the
suggested left-op back into the tool would re-leak the param at
the same point.

Fix: mirror the existing leading-politeness strip pattern in
`_chained_intent_guidance` with an `IntentParser._strip_param_leaks`
call at the same point. Idempotent with `parse_intent`'s
downstream strip — both produce identical output on a clean query.

### What's in this release

- Slashed-compound helper widening (digit halves + short letter
  halves) lands in
  `openzim_mcp/simple_tools.py:_looks_like_slashed_compound`.
- Politeness regex third-wave extension (multi-word multilingual +
  more single-word multilingual) lands in
  `openzim_mcp/intent_parser.py:_TRAILING_POLITENESS_RE`.
- Param-leak strip `query` token + chained-intent param strip
  defence-in-depth lands in
  `openzim_mcp/intent_parser.py:_PARAM_LEAK_RE` and
  `openzim_mcp/simple_tools.py:_chained_intent_guidance`.
- 90 regression tests in `tests/test_post_a24_beta_fixes.py`
  (6 defect classes × multiple shape variants + 9 cross-feature
  pass-2 integration tests + 3 sibling-audit pins + 11 prior-alpha
  regression guards). One a23 test updated to reflect the new
  digit-compound policy.
- Full suite: **2143 passed, 50 skipped**. mypy clean across all
  45 source files. `make lint` (flake8 + isort + black) clean.
  SonarCloud quality gate passed with 0 open issues post-merge.

### Release process

After this changelog lands on `main`, push the `v2.0.0a25` tag
on `main` to trigger `.github/workflows/release.yml` — PyPI
publish + GitHub release notes auto-extracted from the matching
CHANGELOG section.

## [2.0.0a24] — 2026-05-20 (alpha pre-release) — post-a23 beta-test sweep — 4 live-Wikipedia defects across one pass

Live-MCP beta sweep against `wikipedia_en_all_maxi_2026-02.zim` on
the freshly-deployed `v2.0.0a23` build. Smoke gates 4/4 green
pre-fix; the live MCP session dropped after ~30 probes (same
long-session connection timeout pattern observed in the post-a22 sweep)
so pass-2 ran as a source-level sibling audit per the post-a17
methodology refinement — zero new defects surfaced. **"Narrow-scope
sibling" pattern holds for the 4th sweep running: 3 of 4 defects
this sweep are narrower-than-needed enumerations on the matching
a23 fix shape** (politeness regex missed a second wave of SMS /
multi-word / multilingual tokens, q-emitting drift-guard used
non-recursive glob, substantive filter rejected short ALL-CAPS
acronyms). The fourth defect (P1-D3) is a new defect class — the
title-promotion path silently resolves leaked `<param>=<value>`
suffixes to wildly unrelated articles.

### Multi-entity chain ALL-CAPS / slashed-acronym silent abandonment (P1-D1)

The post-a22 `_split_multi_entity` / `_is_substantive_topic` pair
correctly handles long bare-topic chains (Berlin / Munich / Köln)
and non-Latin shorts (東京 / Köln, post-a19 P1-D3). But two
interacting failures left short ALL-CAPS acronym chains silently
abandoned:

- The slash splitter (`\s*/\s*` in `_SOFT_CHAIN_CONNECTOR_PATS`)
  fragmented slashed acronyms — `TCP/IP` → `["TCP", "IP"]`,
  `AC/DC` → `["AC", "DC"]`, `Either/Or` → `["Either", "Or"]`.
- `_is_substantive_topic` rejected the fragments because they fail
  every existing clause (HTTP=4, TCP=3, IP=2, AC=2, DC=2 — none
  ≥5 chars, no digit, no non-ASCII letter). With every half
  failing substantive, `_split_multi_entity` returned None and
  the chain rejection silently abandoned.

Two live failures observed:

- `tell me about TCP/IP and HTTP and HTTPS` → silently returned
  the `HTTPS` article (matching-tail short-circuit picks the
  longest substantive ASCII tail), dropping TCP/IP and HTTP.
- `tell me about AC/DC and Iron Maiden and Metallica` → silently
  returned `Metallica`. Same path.

Two coordinated fixes:

- New `_looks_like_slashed_compound` helper protects slashed
  compounds whose halves are letter-only with a ≤2-char half
  (TCP/IP, AC/DC, Either/Or, A/B). `Berlin / Munich` (min half 6
  chars) still splits as a genuine 2-entity chain.
- New ALL-CAPS clause in `_is_substantive_topic`:
  `isupper() and len ≥ 2` accepts HTTP, TCP, IP, USA, EU, R&B
  etc. Mirrors the post-a19 P1-D3 non-Latin clause — short tokens
  with a clear proper-noun signal aren't English sentence-words.
  Mixed-case `Now` / `Both` / `Here` / `Then` stay rejected.
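
The clause set described above can be sketched as a single predicate. The thresholds (≥5 chars, digit, non-ASCII letter, ALL-CAPS with len ≥ 2) come from this entry; the function body and clause ordering are assumptions, not the shipped `_is_substantive_topic`.

```python
def is_substantive_topic(token: str) -> bool:
    """Return True when `token` looks like a real topic rather than filler."""
    token = token.strip()
    if len(token) >= 5:
        return True
    if any(ch.isdigit() for ch in token):
        return True
    # Non-ASCII letters (Köln, 東京) carry a proper-noun signal even when short.
    if any(ord(ch) > 127 and ch.isalpha() for ch in token):
        return True
    # ALL-CAPS clause: short acronyms such as IP, EU, R&B.
    return token.isupper() and len(token) >= 2


for sample in ("HTTP", "IP", "R&B", "Now", "Both", "Köln", "A"):
    print(sample, is_substantive_topic(sample))
```

`str.isupper()` ignores uncased characters, which is why `R&B` passes while `Now` and `Both` do not.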

### Politeness regex second-wave family (P1-D2)

The post-a22 P1-D2 SMS extension added `thnx` / `thanx` / `tysm`
/ `kthx` / `kthxbai` but missed a second wave of live-observed
variants:

- 1-2 char compressions: `tx`, `txs`
- longer SMS: `tyvm`, `thnks`, `thxx`, `kthxbye`
- multi-word: `thanks a million`, `thank (you|u) (so|very) much`
- multilingual second tier: `obrigado(a)` (Portuguese),
  `arigato(u)` (Japanese romaji), `spasibo` (Russian)

Same narrow-scope sibling pattern as a22 P1-D2 → a23 P1-D2 — each
sweep so far has shipped narrower than the natural politeness
family. The word-boundary anchor (post-a21) already protects
short tokens from mid-word matches (`manta` / `pasta` / `vista` /
`cantata` stay intact).

### `<param>=<value>` query-suffix silent fragmentation (P1-D3) — NEW defect class

Live: `tell me about Photosynthesis limit=10` returns the article
for the number `10` (Wikipedia's number article). Same shape for
`compact_budget=200` (returns the year 200 article),
`content_offset=100` (returns 100), `offset=5` (returns 5). Root
cause: a small model that doesn't know to pass `limit` as the
typed MCP parameter occasionally leaks `limit=N` INTO the query
text; the title-promotion tokeniser sees `"10"` as a clean ASCII
digit tail and scores it cleanly against the number-article
title index, returning a wildly unrelated body that masks the
model's actual topic.

Distinct from a23 P1-D5 (docstring nudge for atomic intents that
ignore `limit`). The docstring tells the model not to pass
`limit` as text on atomic intents, but it can't prevent a model
that's confused about parameter-passing semantics from typing
`limit=10` as text anyway. Fix: new
`IntentParser._strip_param_leaks` peels `\s+<param>=<value>`
shapes BEFORE the politeness loop runs. Token list covers every
`zim_query` argument (`limit`, `offset`, `content_offset`,
`max_content_length`, `max_words`, `compact_budget`, `compact`,
`synthesize`, `cursor`, `zim_file_path`, `entry_path`,
`namespace`, `partial_query`). Idempotent loop handles multiple
leaks in one call. The `\s+` leading anchor protects prose
mentions (`offset printing`, `cursor algorithms`, `the compact
disc`) from accidental strip.
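
A stripped-down version of this peel is sketched below. The parameter names are copied from the list above; the pattern, the `\S+` value shape, and the function name are assumptions rather than the actual `_PARAM_LEAK_RE`.

```python
import re

# Argument names copied from the token list above.
_PARAM_NAMES = (
    "limit", "offset", "content_offset", "max_content_length", "max_words",
    "compact_budget", "compact", "synthesize", "cursor", "zim_file_path",
    "entry_path", "namespace", "partial_query",
)
# The leading \s+ keeps prose such as "offset printing" intact: a token only
# counts as a leak when it follows whitespace and is followed by "=<value>".
_PARAM_LEAK = re.compile(r"\s+(?:" + "|".join(_PARAM_NAMES) + r")=\S+")


def strip_param_leaks(query: str) -> str:
    """Remove every leaked `<param>=<value>` token from the query text."""
    return _PARAM_LEAK.sub("", query)


print(strip_param_leaks("tell me about Photosynthesis limit=10"))
print(strip_param_leaks("tell me about Berlin limit=5 offset=10"))
print(strip_param_leaks("tell me about offset printing"))
```

A single `sub` call removes all leaks in this sketch, so a second call is a no-op; the entry above describes the shipped helper as an idempotent loop.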

### Q-emitting drift scanner non-recursive glob (P1-D4)

The post-a22 P1-D3 widening from `zim/search.py` to all of
`zim/*.py` used `Path.glob` (direct children only). The current
`openzim_mcp/zim/` tree is flat (no subdirectories) so behaviour
is unchanged today, but a future contributor adding
`openzim_mcp/zim/cursor/encoder.py` or any subdirectory with
q-emitting `Cursor.encode` callsites would have those silently
missed by the scan, breaking the drift guard's promise. Same
narrow-scope sibling shape as the a22 P1-D3 widening from one
file to all direct-child files in the directory — the next
widening is naturally to all files in the tree. Fix: switch to
`rglob`.
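
The difference is easy to reproduce with a throwaway directory. The file names below are hypothetical and only mirror the example path mentioned above.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "search.py").write_text("")
    (root / "cursor").mkdir()
    (root / "cursor" / "encoder.py").write_text("")

    # glob("*.py") sees direct children only; rglob("*.py") walks the tree.
    direct = sorted(p.relative_to(root).as_posix() for p in root.glob("*.py"))
    recursive = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))

print(direct)     # ['search.py']
print(recursive)  # ['cursor/encoder.py', 'search.py']
```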

### Methodology

The recurring **"fix unlocks new paths"** + **"narrow-scope
sibling"** pair held for the 4th sweep running. Three of four
defects this sweep are narrow-scope siblings of a23's own fixes
(P1-D1 narrow substantive filter + narrow slash split, P1-D2
narrow politeness enumeration, P1-D4 narrow scanner glob). The
fourth (P1-D3) is a new defect class — a small-model-leaked
parameter shape that silently fragments to an unrelated article
via title-promotion. The post-a22 lint-leak refinement (`make
lint` locally before push; check SonarCloud findings via API
before merging; avoid `[\s\S]+?` + literal regex shapes) was
followed cleanly — only one SonarCloud finding emerged
(implicit-concat strings from black auto-format, S6571), fixed
in a single follow-up commit before merge. **The methodology is
stabilising**: the structural defect classes the sweep catches
remain consistent across alphas, and the lint discipline now
catches static-analyzer noise pre-merge rather than letting it
leak to CI.

### Testing

- **73 regression tests** in `tests/test_post_a23_beta_fixes.py`:
  `TestP1D1MultiEntityAllCapsAndSlashedAcronyms` (12 cases —
  short ALL-CAPS substantive, R&B with ampersand, mixed-case
  rejected, slashed-compound helper identifies acronyms, rejects
  proper-noun pairs / 3-part slashes / digit halves, end-to-end
  split for TCP/IP, AC/DC, Berlin / Munich, ALL-CAPS chain);
  `TestP1D2PolitenessSecondWave` (~30 parameterized cases — every
  new token + chained + word-boundary safety + case-insensitive +
  regression guards on every post-a22 token);
  `TestP1D3ParamLeakSuffix` (~20 cases — every param-name × value
  shape strips, end-to-end parse_intent, multi-param chains, mix
  with politeness, prose-mention preservation, idempotence);
  `TestP1D4QEmittingScannerRecursive` (3 cases — source-level
  rglob check + scanner returns expected pinned set);
  `TestLiveMcpReproduction` (6 end-to-end probes mirroring the
  live-MCP queries the sweep observed); `TestRegressionGuards`
  (6 cases pinning post-a17 / a18 / a19 / a22 fixes that share
  code with the changed paths).
- Full suite: **2053 passed, 50 skipped**. mypy clean across all
  45 source files. `make lint` (flake8 + isort + black) clean.
  SonarCloud quality gate passed with 0 open issues post-merge.

### Release process

After this changelog lands on `main`, push the `v2.0.0a24` tag
on `main` to trigger `.github/workflows/release.yml` — PyPI
publish + GitHub release notes auto-extracted from the matching
CHANGELOG section.

## [2.0.0a23] — 2026-05-19 (alpha pre-release) — post-a22 beta-test sweep — 7 live-Wikipedia defects across two passes

Live-MCP beta sweep against `wikipedia_en_all_maxi_2026-02.zim` on
the freshly-deployed `v2.0.0a22` build. Smoke gates 4/4 green
pre-fix; the live MCP session dropped mid pass-1 (long-session
connection timeout) so pass-2 ran as a source-level sibling audit
per the post-a17 methodology refinement. **Strong "narrow-scope
sibling" signal this sweep: 4 of 7 defects were narrower-than-needed
enumerations on the matching a22 fix shape** (politeness regex
missed SMS variants, drift-guard scanned one file instead of the
whole zim/ tree, docstring-bait sweep skipped entry_path
placeholders, limit-nudge enumeration missing three atomic intents).

### Multi-entity chain first-word conjunction strip (P1-D1)

The post-a21 `_split_multi_entity` helper applied a defensive
`_CONJUNCTION_PREFIXES` strip to every cleaned half — including
the half that occupies the START of the original topic, where a
leading `And` / `Or` / `&` is real title content. Two live
failures observed:

- `tell me about And Then There Were None and Hercule Poirot and Murder on the Orient Express`
  → rejection bullets read `tell me about Then There Were None`
  (leading "And" mangled).
- `tell me about Or Else and Death and Taxes and Pride and Prejudice`
  → first half stripped from `Or Else` to `Else` (4 chars, fails
  `_is_substantive_topic`) → multi-entity rejection silently
  abandoned → `tell_me_about` resolved Pride and Prejudice,
  dropping 4 of 5 entities with no warning.

Fix: skip leading-conjunction strip on the FIRST non-empty half
(`parts[0]` after iterative `re.split` preserves order). Subsequent
halves retain the defensive strip — they can only get a leading
conjunction prefix from a hypothetical reordered pattern list,
never from the user's typed input.
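
The first-half exemption can be sketched as below. The prefix tuple and both function names are assumptions for illustration; only the rule itself (never strip the first non-empty half) comes from this entry.

```python
from typing import List

# Hypothetical prefix list; the shipped _CONJUNCTION_PREFIXES may differ.
_CONJUNCTION_PREFIXES = ("and ", "or ", "& ")


def strip_leading_conjunction(part: str) -> str:
    lowered = part.lower()
    for prefix in _CONJUNCTION_PREFIXES:
        if lowered.startswith(prefix):
            return part[len(prefix):].lstrip()
    return part


def clean_halves(parts: List[str]) -> List[str]:
    """Clean split halves, leaving the first non-empty half untouched."""
    cleaned: List[str] = []
    for part in parts:
        part = part.strip()
        if not part:
            continue
        if cleaned:  # only later halves get the defensive strip
            part = strip_leading_conjunction(part)
        cleaned.append(part)
    return cleaned


print(clean_halves(["And Then There Were None", "Hercule Poirot", " and Murder on the Orient Express"]))
```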

### Politeness regex SMS variants (P1-D2)

The post-a21 P1-D6/D7 widening (`ta`/`cheers`/`thx`/`ty`/`pls` +
`bitte`/`danke`/`merci`/`gracias`/`por favor`) didn't cover
common chat / SMS spellings: `thnx`, `thanx`, `tysm`, `kthx`,
`kthxbai`. Live: `search for biology thnx` searches for
`"biology thnx"` (3 irrelevant matches). Same shape as the
post-a21 missed-token class. All new tokens are ≥4 chars (`kthx`
included) and word-anchored. No new `\s+` quantifiers, so the
pattern stays ReDoS-safe.

### Q-emitting cursor tools drift guard scope (P1-D3)

The post-a21 P1-D5 regression test scanned ONLY `zim/search.py`
for `Cursor.encode` callsites. But callsites also exist in
`zim/namespace.py` (4 sites) and `zim/structure.py` (1 site) — a
future contributor adding a q-emitting tool there would silently
pass the test while breaking the dispatcher's q-overlap guard.
Widened the scan to all of `openzim_mcp/zim/*.py` with state-dict
introspection (handles literal-dict shape AND variable-reference
shape including PEP 526 type-annotated assignments like
`cursor_state: Dict[str, Any] = { ..., "q": ..., ... }`).

### entry_path docstring placeholder bait (P1-D4 + P2-D1)

The post-a21 P1-D9 widened the PD2-2 path-bait sweep to every
`tools/*.py` file but only scanned for `/path/...\.zim` /
`/data/...\.zim` shapes. The `entry_path` parameter docstrings
used `'A/Some_Article'` and `'C/Some_Article'` as literal-looking
placeholders (6 sites: 1 in `content_tools.py`, 5 in
`structure_tools.py`). Same weak-instruction-follower defect
class — a small model copying `Some_Article` verbatim hits an
entry-not-found error and drops into a retry loop. The PD2-3 /
PD2-4 recovery hints only trigger on ZIM PATH errors, not
entry-path errors, so the bait was unreachable by the existing
safety net. Fix: replace with `<entry_path>` placeholder pointing
at real MCP tool names (`find_entry_by_title` / `browse_namespace`).

Pass-2 sibling audit found one more entry_path bait site (P2-D1):
`get_section` docstring (`structure_tools.py:576`) used
`'A/Berlin'` as the entry_path example. The legacy `A/` namespace
is the pre-2018 single-namespace ZIM convention; modern
multi-namespace ZIMs (Wikipedia 2026-02 and similar) use `C/`. A small
model copying `A/Berlin` verbatim hits entry-not-found on a
modern archive. Active wrong-guidance rather than obvious
placeholder. Fix: replace with `'C/Berlin'` + document the
legacy/modern distinction inline. Regression test scans
`tools/*.py` for `'A/<word>'` examples (allowing the legitimate
`A/B` Wikipedia testing article and lines that explicitly
document the distinction).

### `limit` docstring nudge — missing atomic intents (P1-D5)

The post-a21 T-D1 nudge enumerated 9 atomic intents that ignore
`limit` in the `zim_query` docstring. Three more atomic intents
whose handlers don't reference `options.get("limit", ...)` were
missing from the enumeration: `summary of <name>`, `table of
contents <name>`, `section <X> of <name>`. Same shape as T-D1 —
small models passing `limit=5` on those calls get no doc signal
that the parameter is ignored. Extended the enumeration.

### Dispatcher-edge politeness strip — additional fields (P1-D6)

The post-a21 P1-D1 defence-in-depth strip covered
`{query, topic, title, entry_path, partial_query}` but not
`section_name` (from `section <X> of <Y>` parses) or `entries`
(list of entry paths from batched parses). Same
belt-and-suspenders rationale: idempotent when `parse_intent`'s universal
strip works upstream, defence-in-depth when it doesn't
(in-process module cache, future regression). Added both fields
(scalar strip for `section_name`, per-element list strip for
`entries`).

### Methodology

The recurring **"fix unlocks new paths"** pattern reproduced again,
and the **"narrow-scope sibling"** pattern is now strong enough
to flag preemptively. Four of seven defects this sweep were
sibling shapes of a22's own fixes that landed at
narrower-than-needed scope (P1-D2 missed SMS politeness, P1-D3 narrow
drift-guard scope, P1-D4 + P2-D1 narrow docstring-bait scope, P1-D5
narrow limit-nudge enumeration). Future fixes should preemptively
widen each new guard to every analogue site before merging.

### Testing

- **34 regression tests** in `tests/test_post_a22_beta_fixes.py`:
  `TestP1D1MultiEntityFirstWordConjunction` (5 cases — first-word
  And/Or/Ampersand preserved, Unicode first word, mixed
  substantive halves); `TestP1D2PolitenessSmsVariants`
  (~22 parameterized cases — strip + word-boundary + full parse);
  `TestP1D3QEmittingDriftGuardWiderScope` (2 cases — scanner
  finds known tools + set membership pin across all zim/
  modules); `TestP1D4EntryPathDocstringBaitSweep` (3 cases —
  no `Some_Article` bait, `<entry_path>` placeholder convention,
  no legacy `A/<word>` bait); `TestP1D5LimitNudgeEnumeratesAllAtomicIntents`
  (1 case — docstring enumeration pin); `TestP1D6DispatcherEdgeStripWiderFields`
  (3 cases — `section_name` in field tuple, `entries` list
  strip wired, end-to-end politeness peel).
- Full suite: **1988 passed, 50 skipped**. mypy clean across all
  45 source files.

### Release process

After this changelog lands on `main`, push the `v2.0.0a23` tag
on `main` to trigger `.github/workflows/release.yml` — PyPI
publish + GitHub release notes auto-extracted from the matching
CHANGELOG section.

## [2.0.0a22] — 2026-05-19 (alpha pre-release) — post-a21 beta-test sweep — 11 live-Wikipedia defects across two passes

Live-MCP beta sweep against `wikipedia_en_all_maxi_2026-02.zim` on
the freshly-deployed `v2.0.0a21` build, plus a small-model failure
transcript review. Smoke gates landed 3/4 green pre-fix; the
politeness-strip gate (`search for biology please` →
`Found 5000 matches for "biology please"`) leaked on the live MCP
despite the source-side `parse_intent` strip working correctly
under direct unit test — most likely cause is an in-process
module cache on the live server that loaded only part of PR #152's
diff. The user-visible defect class is the same regardless of
root cause; defence-in-depth dispatcher-edge strip lands here
(P1-D1).

Pass 2 source-level audit found no new sibling defects across the
landed fix sites. The recurring **"fix unlocks new paths"** cycle
reproduced again three times: post-a20 P1-D2 (alias-fallback
widening to 2-entity asymmetric chains) didn't address 3+ entity
chains (this sweep's P1-D2/D3/D4 catch them); post-a20 PD2-1
(`parse_intent` politeness strip) didn't widen the token set
(P1-D6/D7 add British/texting/multilang variants); post-a20
PD2-2 (`zim_query` docstring de-bait) didn't sweep the sibling
advanced-tool docstrings (P1-D9 widens the regression net).
Live-transcript review remains a distinct test surface (T-D1
came from a Qwen3-8B-Q4 transcript; not reachable via adversarial
query probing alone).

### Fixed

- **Multi-entity chain warning for 3+ entity bare-topic chains**
  (P1-D2 / P1-D3 / P1-D4). Three observed shapes that bypass the
  existing 2-entity `_soft_connector_footer` alias-fallback:
  `tell me about Köln, München, and Berlin` returned Berlin + a
  footer suggesting `tell me about Köln, München,` (still-chained
  recursive suggestion — re-running it re-triggers the same
  defect); `tell me about Berlin or 東京 or Tokyo` silently fell
  through to "No search results found"; `tell me about Berlin
  and München and Köln` returned Cologne (Köln alias) with no
  footer about the dropped Berlin / München. Fix: new
  `_multi_entity_chain_guidance` detects 3+ substantive halves
  split by combined soft connectors (`and` / `or` / `,` / `&` /
  `vs` / `/`) AND probes the title index for the whole topic —
  clean single-title hits (`Earth, Wind & Fire` band;
  `Lions, Tigers, and Bears` idiom) suppress the warning; no
  clean hit fires a structured `Multi-Entity Chain Detected`
  rejection naming each entity. Iterative single-pattern splits
  (no combined alternation regex) keep SonarCloud's S5852
  polynomial-backtracking flag quiet; string-prefix/-suffix
  scans (not regex) handle leftover leading/trailing conjunctions
  for the same reason.

- **Trailing politeness regex extended to British/texting and
  multi-language tokens** (P1-D6 / P1-D7). Post-a20 PD2-1
  enumerated only `please` / `kindly` / `thanks` / `thank you|u`.
  Live probes showed `ta` / `cheers` / `thx` / `ty` / `pls`
  (British/texting) and `bitte` / `danke` / `merci` / `gracias` /
  `por favor` (multi-language) all leaked into search query /
  topic / title silently. Several new tokens are short (`ta` /
  `ty` are 2 chars) so the leading anchor tightens from
  `\s*[,;.!?]?\s*` to `(?:^|\s+|[,;.!?]\s*)` — embedded
  substrings in longer words (`cantata` / `feta` / `Dante`) no
  longer get their last two chars eaten.

- **`Search Terms Required` B4 guard now peels politeness from
  the tail before the empty-check** (P1-D8). Pre-fix,
  `_search_query_tail(query)` ran on the ORIGINAL query, so
  trailing politeness wasn't stripped before the empty-tail
  check; `search for please` silently dispatched with
  `query="for"` (the literal verb word) and returned a 200k-hit
  response dominated by stop-word collisions. Same shape for any
  `search for <politeness>` after the P1-D6 extension. Fix:
  apply `IntentParser._strip_trailing_politeness` to the tail
  before the B4 emptiness check.

- **Defence-in-depth dispatcher-edge politeness strip on params**
  (P1-D1). The live-MCP sweep observed
  `Found 5000 matches for "biology please"` for the query
  `search for biology please` despite the post-a20 PD2-1 fix.
  Source-side, the strip works correctly under direct unit test;
  the most likely cause is an in-process module cache on the
  live server that loaded only part of PR #152. Fix: in
  `handle_zim_query`, after the `parse_intent` call, apply
  `IntentParser._strip_trailing_politeness` to each of the
  user-supplied content fields in `params` (`query` / `topic` /
  `title` / `entry_path` / `partial_query`). Idempotent when
  `parse_intent` already cleaned them; belt-and-suspenders catch
  for any future regression that bypasses `parse_intent`.

- **`_Q_EMITTING_CURSOR_TOOLS` drift guard** (P1-D5). The
  post-a20 P1-D1 fix introduced
  `SimpleToolsHandler._Q_EMITTING_CURSOR_TOOLS` as a hand-
  maintained frozenset of tool names whose cursors legitimately
  carry an `s.q` field. If a future contributor adds a new
  q-emitting tool (a new `Cursor.encode(state={..., "q": ...})`
  callsite) but forgets to update the set, the dispatcher's
  q-overlap guard silently degrades to no-op for that tool —
  paginating with the wrong query proceeds silently. New
  regression test scans every `Cursor.encode(tool=...)` callsite
  in `zim/search.py` and pins membership equality with the set;
  encode-callsite comments updated to point at the set so the
  cross-module link is greppable from either side.

- **PD2-2 sibling docstring path-bait sweep** (P1-D9). Post-a20
  PD2-2 only pinned the `zim_query` docstring in `server.py`.
  Sibling literal path examples lived in advanced tool docstrings
  — `structure_tools.get_entry_summary` ("/path/to/wiki.zim"),
  `structure_tools.get_table_of_contents` ("/path/to/wiki.zim"),
  `structure_tools.get_binary_entry` ("/path/file.zim"), and
  `content_tools.get_zim_entries` ("/path/x.zim"). Small models
  copy these verbatim too — the same weak-instruction-follower
  class PD2-2 was designed to break. Fix: replace literal paths
  with `<zim_path>` placeholders that don't validate as
  filesystem paths; widen the regression test to scan every
  `openzim_mcp/tools/*.py` for `/path/...\.zim` or
  `/data/...\.zim` shapes.

- **PD2-4 recovery hint now preserves the original error reason**
  (P1-D10). The PD2-4 detector substring-matched `"access denied"`
  in the exception message and fired on `OpenZimMcpSecurityError`'s
  "Access denied - Path is outside allowed directories" message
  in addition to the intended file-not-found
  `OpenZimMcpValidationError`. The replacement body dropped the
  security-specific reason on the floor; callers saw only the
  generic "doesn't match any loaded archive" hint. Fix: surface
  the original exception message as a new `**Reason**` line
  alongside the recovery hint so the security-specific context
  isn't lost.

- **`limit` docstring nudge for atomic intents** (T-D1). Live
  small-model transcript (Qwen3-8B-Q4) showed the model passing
  `limit=5` on a `tell_me_about` query. The pre-fix docstring
  said "Max search/browse results (default: 3)" — silent about
  whether `limit` applies to atomic intents. Fix: docstring
  nudge explicitly enumerating the atomic intents that ignore
  `limit` (`tell me about` / `get article` / `show structure` /
  `links in` / `articles related to` / `main_page` /
  `list_namespaces` / `metadata for` / `list_files`).
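
The tightened leading anchor from the P1-D6 / P1-D7 entry can be sketched as below. This is a minimal illustration under stated assumptions, not the project's `IntentParser._strip_trailing_politeness`: the helper name `strip_trailing_politeness`, the trailing-punctuation handling, and the peel loop are hypothetical; only the token list and the `(?:^|\s+|[,;.!?]\s*)` anchor are taken from the entries above.

```python
import re

# Tokens named in the P1-D6 / P1-D7 entry; the shipped token set may differ.
_TOKENS = (
    r"please|kindly|thanks|thank\s+(?:you|u)|ta|cheers|thx|ty|pls"
    r"|bitte|danke|merci|gracias|por\s+favor"
)

# The leading anchor demands start-of-string, whitespace, or punctuation, so a
# short token such as "ta" cannot match the tail of a longer word ("cantata").
_TRAILING_POLITENESS_RE = re.compile(
    rf"(?:^|\s+|[,;.!?]\s*)(?:{_TOKENS})\s*[.!?]*\s*$",
    re.IGNORECASE,
)


def strip_trailing_politeness(text: str) -> str:
    """Peel trailing politeness tokens until none remain (hypothetical helper)."""
    while True:
        match = _TRAILING_POLITENESS_RE.search(text)
        if match is None:
            return text
        # Every match contains at least one token, so the text always shrinks.
        text = text[: match.start()].rstrip()
```

Looping rather than matching once lets combinations such as `biology, thanks please` peel one token at a time.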

### Tests

- 54 new regression tests in `tests/test_post_a21_beta_fixes.py`
  covering all eleven defects:
  `TestP1D6P1D7TrailingPolitenessExtensions` (29 parametric strip
  cases + word-boundary safety + full-parse integration);
  `TestP1D8SearchTermsRequiredAfterPolitenessStrip` (8 parametric
  guard-fires); `TestP1D1DispatcherEdgePolitenessStrip` (buggy-
  parse-stub regression); `TestMultiEntityChainGuidance` (8
  cases — 3-entity AND/OR/4-entity chains, 2-entity guard,
  real-title suppression via title-index probe, search-intent
  isolation, leading-conjunction split, Lions/Tigers/Bears idiom
  suppression); `TestP1D5QEmittingCursorToolsDrift` (3 cases —
  set value, search-encode comment hook, parametric
  `Cursor.encode` scan); `TestP1D9DocstringPathBaitSiblings`
  (directory-wide scan of `tools/*.py`);
  `TestP1D10RecoveryHintMarkerDiscriminatesSecurityError` (1
  case — surfaces original `OpenZimMcpSecurityError` reason);
  `TestTD1LimitDocstringClarifiesAtomicIntents` (1 case —
  docstring contract pin).
- Full suite: **1954 passed, 50 skipped**. mypy clean across all
  45 source files.
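
The iterative single-pattern splitting described for `_multi_entity_chain_guidance` under Fixed can be approximated as follows. This is a sketch only: `split_chain`, `is_multi_entity_chain`, and the `title_exists` callback are hypothetical names, the substantiveness filter on each half is omitted, and the real title-index probe is replaced by a caller-supplied predicate.

```python
import re
from typing import Callable, List

# One simple pattern per connector, applied in turn (no combined alternation).
_CONNECTOR_PATTERNS = [
    re.compile(r"\s+and\s+", re.IGNORECASE),
    re.compile(r"\s+or\s+", re.IGNORECASE),
    re.compile(r"\s+vs\s+", re.IGNORECASE),
    re.compile(r","),
    re.compile(r"&"),
    re.compile(r"/"),
]
_CONJUNCTIONS = ("and", "or")


def _trim_conjunctions(part: str) -> str:
    """Drop a leftover leading/trailing conjunction using plain string scans."""
    part = part.strip()
    for word in _CONJUNCTIONS:
        lowered = part.lower()
        if lowered == word:
            return ""
        if lowered.startswith(word + " "):
            part = part[len(word):].strip()
            lowered = part.lower()
        if lowered.endswith(" " + word):
            part = part[: -len(word)].strip()
    return part


def split_chain(topic: str) -> List[str]:
    """Split a topic on each connector in turn and keep the non-empty halves."""
    parts = [topic]
    for pattern in _CONNECTOR_PATTERNS:
        parts = [piece for part in parts for piece in pattern.split(part)]
    trimmed = (_trim_conjunctions(part) for part in parts)
    return [part for part in trimmed if part]


def is_multi_entity_chain(topic: str, title_exists: Callable[[str], bool]) -> bool:
    """True for 3+ halves unless the whole topic is itself a known title."""
    if title_exists(topic):
        return False
    return len(split_chain(topic)) >= 3
```

Checking the whole-topic title first is what lets a title such as `Earth, Wind & Fire` pass without a warning.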

### Release process

After this changelog lands on `main`, push the `v2.0.0a22` tag
on `main` to trigger `.github/workflows/release.yml` — PyPI
publish + GitHub release notes auto-extracted from the matching
CHANGELOG section.

## [2.0.0a21] — 2026-05-19 (alpha pre-release) — post-a20 beta-test sweep — 6 live-Wikipedia defects across three passes

Live-MCP beta sweep against `wikipedia_en_all_maxi_2026-02.zim` on
the freshly-deployed `v2.0.0a20` build, plus a live small-model
failure transcript review. Pass 1 confirmed all nine prior fixes
(post-a17 P1-D1/P1-D2/P1-D3, post-a18 P3-D1/P3-D2/P1-D4, post-a19
P1-D1/P1-D2/P1-D3) still work as designed in production, then
surfaced two new defects. Pass 2 wave 1 widened probe coverage to
politeness wrappers across all simple-mode intents (one defect).
Pass 2 wave 2 reviewed a Qwen3-8B-Q4 failure transcript and
surfaced three more defects — a docstring-bait hallucinated path
that dropped small models into a retry loop. Pass 3 source-level
audit found zero new siblings across all six fix sites.

All six defects follow either the recurring **"fixes unlock
previously-broken code paths"** pattern (P1-D1, P1-D2, PD2-1
landed on surfaces a20's three landed fixes opened up) or the
**"weak-instruction-follower defect class"** pattern (PD2-2/3/4 —
small-model behaviour that adversarial query probes structurally
can't reach). The latter shape is new to the methodology and is
captured in the post-a20 refinement: live-transcript review should
join live-MCP probing as a recurring sweep input.

### Fixed

- **Cross-tool cursor reuse with stuffed `s.q` now reports
  tool-mismatch instead of q-mismatch** (P1-D1). The dispatcher's
  cursor-decode block runs the cursor's `s.q` overlap check before
  any handler-level `_cursor_tool_mismatch` guard fires. When a
  cross-tool cursor carries an `s.q` field (a hand-stuffed
  walk_namespace cursor with `s.q="biology"` passed to `search for
  photosynthesis`, or a real search cursor reused with a different
  tool), the dispatcher previously emitted the misleading "Cursor
  was issued for query X; current request shares no terms" error
  and advised the user to start the search over — even though the
  cursor was from a different tool entirely. Fix: scope the
  dispatcher's q-overlap check to cursors whose `t` claims a
  q-emitting tool (`search_zim_file` / `search_with_filters` — the
  only `Cursor.encode` callsites that put `s.q` in their envelope).
  Cursors claiming `walk_namespace` / `browse_namespace` /
  `extract_article_links` now skip the dispatcher's q-check
  entirely; the handler-edge guard emits the correct
  `Cursor / Tool Mismatch` diagnosis.

- **Soft-connector footer now suppresses for asymmetric alias
  cases** (P1-D2). `_soft_connector_footer`'s post-a18 P3-D2
  alias-fallback was gated on `not left_in and not right_in` — it
  only ran when BOTH halves missed the substring check. The
  asymmetric case (one half matches substring, the other matches
  only via title alias) slipped through:
  `tell me about Köln or Cologne` returned the Cologne article
  with a footer suggesting `tell me about Köln`, but Köln's
  title-index entry redirects back to Cologne — a 2-hop journey
  to the same article. Same shape reproduced for `京都 or Kyoto`,
  `上海 or Shanghai`, `München or Munich`, `Москва or Moscow`,
  `Αθήνα or Athens`, and their reverse-order variants. Fix:
  widen the gate to `not (left_in and right_in)` so the alias
  probe runs whenever either half misses substring. The probe
  still only upgrades a half whose top-scored title-index hit
  equals `top_path`, so genuinely different chain halves
  (`Berlin and 東京`) still surface the footer correctly. The
  irreducible `東京 or Tokyo` case stays unsuppressed — `東京`
  has its own disambig article that doesn't alias to `Tokyo`.

- **Trailing politeness now strips across all simple-mode
  intents** (PD2-1). Pre-fix, `tell_me_about` was the only
  intent that stripped trailing `please` / `kindly` / `thanks` /
  `thank you`. Every other extractor that captured the topic with
  a greedy end-anchored pattern (`_extract_search`,
  `_extract_search_all`, `_extract_find_by_title`,
  `_extract_related`, `_extract_suggestions`,
  `_extract_entry_path_keyworded` — feeding get_article / links /
  structure / toc / summary, plus `_extract_get_zim_entries` /
  `_extract_get_section`) silently swallowed the politeness:
  `search for biology please` searched for `"biology please"`
  (ranking `Thanks Maa` above `Biology`); `find article titled
  Berlin please` looked up `"Berlin please"` (not found);
  `links in Photosynthesis please` and `show structure of
  Photosynthesis please` showed the same shape. Comma forms
  (`"biology, please"`) and combinations
  (`"biology, thanks please"`) reproduced too. Fix: lift the
  trailing-politeness strip into `IntentParser.parse_intent` at
  the entry point — a single end-anchored regex, looped so
  combinations peel cleanly, runs before pattern matching +
  extractor dispatch. Legitimate content uses
  (`search for "Please Understand Me"` — song title) are
  unaffected because the strip is end-anchored and quoted phrases
  enclose the content.

- **`zim_query` tool docstring no longer contains a literal-
  looking path example** (PD2-2). The parameter description for
  `zim_file_path` previously included
  `(e.g. /data/wikipedia_en_all_maxi.zim)` as an illustrative
  path. Small models with weak instruction-following parse "e.g."
  inconsistently and routinely copied the example as the actual
  `zim_file_path` value. Real archives are date-suffixed in
  production (`wikipedia_en_all_maxi_2026-02.zim`) so the
  basename doesn't match either. Live transcript captured
  Qwen3-8B-Q4 doing exactly this and dropping into a
  `File does not exist` retry loop with no recovery signal.
  Fix: rewrote the docstring to lead with
  **Omit entirely (recommended)**, dropped the literal path
  example, added an explicit "do NOT invent a path from this
  docstring" line. A regression test pins the absence of the
  bait string so any future docstring edit reintroducing it
  fails CI.

- **`_normalize_zim_file_path` auto-selects when single archive
  loaded, even for slashed candidates** (PD2-3). The previous
  contract (H14: "explicit paths must reach the backend so it can
  surface a clearer error") only made sense when there was
  genuine ambiguity about which archive the caller wanted —
  single-archive setups have none. Pre-fix, a slashed candidate
  that didn't match anything still fell through to the backend in
  single-archive setups, producing the same `File does not exist`
  error that small models can't act on. Fix: when the candidate
  matches nothing via path-or-basename AND exactly one archive
  is loaded, auto-select regardless of separator. Multi-archive
  setups still preserve the candidate so the backend error
  surfaces and PD2-4 enriches it with the actual listing — H14
  narrowed but intact for the case it was actually defending.

- **"ZIM File Not Found" error now surfaces real archive paths
  and the omit-to-auto-select recovery** (PD2-4). The catch-all
  in `handle_zim_query` previously emitted a generic four-step
  troubleshooting block that gave small models no learning
  signal — they just retried with the same args. Fix: detect the
  `validate_zim_file` exception family (`File does not exist` /
  `Path is not a file` / `is not a zim file` / `Access denied`)
  and replace the template with a `ZIM File Not Found` shape:
  single-archive setups get "omit the parameter — only one
  archive loaded" + the actual path (defence-in-depth alongside
  PD2-3); multi-archive setups get a bulleted listing of real
  archive paths with "pass one verbatim" guidance. The generic
  template's step 1 was also rewritten to suggest
  "omit `zim_file_path`" as the canonical fix.

### Tests

- 65 new regression tests in `tests/test_post_a20_beta_fixes.py`
  covering all six defects plus the edge cases probed live
  (reverse-order alias variants, irreducible Tokyo disambig,
  multi-archive H14 preservation, zero archives edge case,
  defence-in-depth backend-failure paths, quoted-inner-please
  content preservation, etc.).
- 4 existing H14 tests updated to reflect the narrowed-to-multi-
  archive contract (single-archive auto-select + multi-archive
  preserve split into separate cases).
- 4 mock-realism updates in post-a16 / post-a17 test files (the
  widened P1-D2 alias-fallback calls the title backend for
  connector halves, so the blanket `return_value` mocks that
  reported every half resolves to `top_path` now use per-title
  `side_effect`).
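
The narrowed H14 contract that these tests pin (PD2-3) can be illustrated with a small resolver. This is a sketch under assumptions: `normalize_zim_file_path` here is a standalone stand-in for the project's `_normalize_zim_file_path`, its signature is hypothetical, and treating an omitted candidate like an unmatched one is a guess rather than documented behaviour.

```python
import posixpath
from typing import Optional, Sequence


def normalize_zim_file_path(
    candidate: Optional[str], loaded: Sequence[str]
) -> Optional[str]:
    """Resolve a caller-supplied path against the loaded archives (sketch)."""
    if candidate:
        for path in loaded:
            # An exact path or a bare-basename match wins outright.
            if candidate == path or candidate == posixpath.basename(path):
                return path
    if len(loaded) == 1:
        # Single archive: nothing to disambiguate, so auto-select even when
        # the candidate contains a path separator.
        return loaded[0]
    # Multi-archive (or nothing loaded): preserve the candidate so the backend
    # error surfaces and can be enriched with the real archive listing.
    return candidate
```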

Full suite: 1902 passed, 50 skipped.
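
The P1-D1 scoping of the dispatcher's q-overlap check can be sketched as follows. The payload shape (`t` for the issuing tool, `s` for state) follows the decoded cursors quoted elsewhere in this changelog; the function name `q_overlap_mismatch` and the whitespace-token overlap test are assumptions made for illustration, not the dispatcher's actual comparison.

```python
from typing import Any, Dict

# The only tools whose cursors carry `s.q`, per the P1-D1 entry.
Q_EMITTING_CURSOR_TOOLS = frozenset({"search_zim_file", "search_with_filters"})


def q_overlap_mismatch(payload: Dict[str, Any], current_query: str) -> bool:
    """Return True only when a q-emitting cursor shares no terms with the query."""
    if payload.get("t") not in Q_EMITTING_CURSOR_TOOLS:
        # Not a q-emitting tool: leave the diagnosis to the handler-level
        # tool-mismatch guard instead of reporting a misleading q-mismatch.
        return False
    cursor_q = payload.get("s", {}).get("q")
    if not cursor_q:
        return False
    cursor_terms = set(cursor_q.lower().split())
    query_terms = set(current_query.lower().split())
    return not (cursor_terms & query_terms)
```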

### Methodology refinement (post-a20)

Live-transcript review is a distinct test surface from live-MCP
probing. The Qwen3-8B-Q4 transcript captured PD2-2 (a
docstring-bait hallucination source) that adversarial query
probes structurally couldn't reach — the bait was in the TOOL
DESCRIPTION, not in any user query. The transcript also exposed
PD2-4 ("no learning signal on retry" failure mode) that mocked
tests can't easily catch. Future sweeps should incorporate
small-model transcript review when available — the marginal
cost is low and the defect class it catches
(tool-self-described hallucination sources + error-message-
quality issues for weak-instruction-follower models) is
otherwise invisible.

## [2.0.0a20] — 2026-05-19 (alpha pre-release) — post-a19 beta-test sweep — 3 live-Wikipedia defects across one pass

Live-MCP beta sweep against `wikipedia_en_all_maxi_2026-02.zim` on
the freshly-deployed `v2.0.0a19` build. Pass 1 confirmed all six
prior fixes (post-a17 P1-D1/P1-D2/P1-D3 and post-a18 P3-D1/P3-D2/
P1-D4) still work as designed in production, then surfaced three
new user-facing defects. Pass 2 source-level self-audit found
zero new siblings.

All three defects follow the recurring **"fixes unlock previously-
broken code paths"** pattern: the post-a17 Unicode tail-tokenisation
fix made non-Latin topics REACHABLE; the post-a18 soft-connector
alias fallback + table-dominated subject-attribute fix landed on
those paths; THIS sweep found that the substantiveness filter
guarding the soft-connector footer wasn't Unicode-aware (P1-D3) AND
that the post-a18 P1-D4 cross-tool cursor guard hadn't widened to
the search/filtered-search/links siblings (P1-D1, P1-D2: the
deferred follow-up explicitly flagged by post-a18).

### Fixed

- **`search for X` rejects cross-tool cursors** (P1-D1). A
  `walk_namespace` or `browse_namespace` cursor passed to
  `search for Photosynthesis` previously decoded `s.o=3` into
  `options["offset"]` and search returned `showing 4-6 of 4237`
  instead of `showing 1-3`. Simple-tools-layer mirror of the
  post-a18 P1-D4 fix that landed for `_handle_browse` /
  `_handle_walk_namespace`. The advanced `search_zim_file` tool
  already enforces tool-binding via
  `Cursor.decode(expected_tool=...)`; this restores the check at
  the simple-tools handler edge with
  `_cursor_tool_mismatch(options, "search_zim_file")` at the
  top of `_handle_search`. User now sees the structured
  `Cursor / Tool Mismatch` rejection before any backend call.
- **`search X in namespace C` and `links in X` reject cross-tool
  cursors** (P1-D2). Same shape in `_handle_filtered_search`:
  `_cursor_tool_mismatch(options, "search_with_filters")` guard
  added. Defence-in-depth: `_handle_links` hardcodes `offset=0`
  today so the live shape didn't reproduce, but it IS a cursor-
  emitting handler and the guard
  (`_cursor_tool_mismatch(options, "extract_article_links")`)
  keeps the boundary consistent with sibling handlers and
  prevents a future offset-reading change from regressing
  silently. All four `options.get("offset")` sites in
  `simple_tools.py` (`_handle_browse`, `_handle_walk_namespace`,
  `_handle_search`, `_handle_filtered_search`) are now guarded.
- **Soft-connector footer recognises short non-Latin proper nouns
  as substantive** (P1-D3). `tell me about Berlin and 東京`
  resolved correctly to 東京 (right-promote via a18's Unicode
  tail fix), but the soft-connector footer was silently
  suppressed because `_is_substantive_topic("東京")` returned
  False. The ASCII-length-5 heuristic was tuned for English
  particles (`Then` / `Both` / `Here` / `Now`) and didn't
  account for non-Latin scripts where each character carries
  syllable-level lexical weight — `東京` is 2 chars but names
  the capital of Japan; `Köln` is 4 chars but names Germany's
  fourth-largest city. Same shape for `京都` / `北京` / `上海`.
  Fix: keep the original ASCII path (multi-token OR len≥5 OR
  digit-containing), and add a relaxed branch — when the string
  contains a non-ASCII letter, accept at len≥2. ASCII
  abbreviations (`Dr.` / `St.` / `Mt.`) remain rejected because
  they have no non-ASCII characters; single CJK ideograms (`京`)
  remain rejected because of the len≥2 floor. Both the chain
  detector and the soft-connector footer now fire correctly for
  non-Latin halves.
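
The P1-D3 heuristic can be written out as a pure function. This is a minimal sketch of the rules as described above (ASCII path: multi-token OR len≥5 OR digit-containing; relaxed branch: any non-ASCII letter at len≥2); the standalone name `is_substantive_topic` and the exact ordering of checks are assumptions, and the shipped `_is_substantive_topic` may differ in detail.

```python
def is_substantive_topic(text: str) -> bool:
    """Decide whether a chain half looks like a real topic (sketch)."""
    stripped = text.strip()
    if not stripped:
        return False
    # Original ASCII path: multi-token, long enough, or digit-containing.
    if len(stripped.split()) > 1 or len(stripped) >= 5:
        return True
    if any(ch.isdigit() for ch in stripped):
        return True
    # Relaxed branch: a non-ASCII letter lowers the length floor to 2.
    has_non_ascii_letter = any(
        ch.isalpha() and not ch.isascii() for ch in stripped
    )
    return has_non_ascii_letter and len(stripped) >= 2
```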

### Tests

19 regression tests in `tests/test_post_a19_beta_fixes.py`:

- **P1-D1** (4): walk-cursor-to-search rejected; browse-cursor-to-
  search rejected; same-tool search-cursor round-trips cleanly;
  no-cursor passthrough unaffected.
- **P1-D2** (3): walk-cursor-to-filtered-search rejected; walk-
  cursor-to-links rejected; filtered-search no-cursor passthrough
  unaffected.
- **P1-D3** (12): CJK 2-char accept (`東京` / `北京` / `京都` /
  `上海`); umlaut 4-char accept (`Köln`); ASCII particles still
  rejected (`Then` / `Both` / `Here` / `Now` / `This`);
  abbreviations still rejected (`Dr.` / `St.` / `Mt.` / `Jr.`);
  single CJK char still rejected (`京` / `北`); regression guards
  for ASCII long topics, multi-token, digit topics, empty /
  whitespace; Cyrillic short topic via existing 5-char path +
  relaxed branch; end-to-end soft-connector footer fires with
  CJK dropped half + umlaut dropped half.

Full test suite: **1842 passed, 50 skipped** (up from 1823 in a19).
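
The cross-tool rejection exercised by the P1-D1 / P1-D2 tests follows a stash-then-compare shape, sketched below. The option keys `offset`, `_cursor_ns`, and `_cursor_t` come from these changelog entries; the function names, signatures, and the wording of the rejection message are hypothetical.

```python
from typing import Any, Dict, Optional


def stash_cursor_fields(
    decoded_payload: Dict[str, Any], options: Dict[str, Any]
) -> None:
    """Copy the cursor fields the handlers need into `options` at decode time."""
    state = decoded_payload.get("s", {})
    if "o" in state:
        options["offset"] = state["o"]
    if "ns" in state:
        options["_cursor_ns"] = state["ns"]
    # Remember which tool issued the cursor so handlers can verify it.
    options["_cursor_t"] = decoded_payload.get("t")


def cursor_tool_mismatch(
    options: Dict[str, Any], expected_tool: str
) -> Optional[str]:
    """Return a rejection message when the cursor came from another tool."""
    issued = options.get("_cursor_t")
    if issued is None or issued == expected_tool:
        return None  # no cursor, or a same-tool round-trip
    return (
        "**Cursor / Tool Mismatch**: cursor was issued by "
        f"`{issued}` but this request is for `{expected_tool}`."
    )
```

Calling the check at the top of each handler means a mismatched cursor is rejected before its offset is ever read.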

### Pass-2 source-level audit (no siblings)

- **P1-D1 / P1-D2**: all 4 sites in `simple_tools.py` that read
  `options.get("offset", 0)` (`_handle_browse`,
  `_handle_walk_namespace`, `_handle_search`,
  `_handle_filtered_search`) are now guarded by
  `_cursor_tool_mismatch`. `_handle_search_all` and
  `_handle_related` don't read `options["offset"]` at all. No
  siblings remaining.
- **P1-D3**: `_is_substantive_topic` is called from two sites —
  the chain detector right-promote branch
  (`simple_tools.py:983-984`) and `_soft_connector_footer`
  (`simple_tools.py:1156`). Both benefit from the fix. Searched
  for other `len(stripped) >= N` ASCII-length heuristics across
  `simple_tools.py` / `intent_parser.py` / `title_promotion.py` /
  `synthesize.py`; `intent_parser.py:1012` already has explicit
  Unicode handling for the analogous `_looks_like_topic_ask`
  check. No other ASCII-only thresholds on user-provided strings.

Pass-3 live re-probe deferred following the post-a17 methodology:
the three fixes are narrow handler-edge guards + a pure-function
heuristic with no cross-module contract changes, no cursor codec
/ serialization changes. The 19 mock-based regression tests
cover the exact surfaces a live re-probe would.

PR: [#149](https://github.com/cameronrye/openzim-mcp/pull/149).
Commits on the sweep branch: `cc9eb64` (pass-1 fixes + tests),
`8745012` (dedupe cursor encode helpers — Sonar quality gate).

## [2.0.0a19] — 2026-05-19 (alpha pre-release) — post-a18 beta-test sweep — 3 live-Wikipedia defects across two passes

Live-MCP beta sweep against `wikipedia_en_all_maxi_2026-02.zim` on
the freshly-deployed `v2.0.0a18` build. Pass 1 confirmed all three
a18 fixes (P1-D1 soft-connector title-spans, P1-D2 Unicode tail
tokenisation, P1-D3 walk-namespace cursor `ai` preservation) work
as designed in production, then surfaced three new user-facing
defects. Pass 2 source-level self-audit found zero new defects.

Both P3-D1 and P3-D2 are **examples of a recurring pattern**: fixes
that unlock previously-broken code paths surface new defects in
those paths. Neither was reachable from the canonical reproducers
before a18 because the Unicode tokenisation defect intercepted
every non-Latin topic earlier in the pipeline.

### Fixed

- **Subject-attribute section dominated by table placeholders falls
  back to recovery pointer** (P3-D1). `musicians from München`
  resolved correctly via the new Unicode tail probe to Munich,
  then subject-attribute decomposition fired on the Notable people
  section. But that section is two H3 sub-tables
  (`Born in Munich` / `Notable residents`) which compact mode
  renders as `[Table N: M rows x P cols - pass compact=False to
  expand]` placeholders. The LLM got zero substantive content
  from a query that should list musicians — exactly the
  content-less-response shape wave 4's empty-lead fallback was
  designed to prevent. The bundle `get_section_data` reads is
  always built with `compact=True` (`openzim_mcp/bundle.py:307`),
  so the section can't be re-emitted with tables expanded.
  `_maybe_render_subject_section` now detects placeholder
  dominance (≥1 placeholder AND <100 chars of substantive prose
  after stripping them) and substitutes a `compact=False`
  recovery pointer that names the exact call to make. Telemetry
  counter `subject_attribute_table_dominated` for future tuning.
- **Soft-connector footer recognises non-Latin halves via title-
  alias fallback** (P3-D2). `tell me about Berlin and München`
  resolved correctly to Munich (right-promote), but the soft-
  connector footer was silently suppressed. The substring check
  `"berlin" in "munich"` is False; `"münchen" in "munich"` is
  also False because the title-alias index crosses the Unicode +
  language boundary (München → Munich) and substring matching
  can't see through that. So `left_in == right_in == False` hit
  the "neither in title — unclear which was picked" suppression
  branch. User never learned Berlin was dropped. Fix: when both
  halves fail substring, fall back to title-alias probing — probe
  the title index for each half, and if a half's top-scored hit
  resolves to `top_path`, treat that half as "in title"
  semantically. Cheap (in-memory title-index lookup) and only
  fires on the rare both-missed branch. The legacy positional-only
  call signature (without `zim_file_path` / `top_path` kwargs)
  continues to work — alias fallback is gated on those kwargs.
- **Cross-tool cursor reuse rejected at simple-tools handler
  edge** (P1-D4 — deferred from the post-a17 sweep).
  `walk namespace M` emits a cursor; passing that cursor to
  `browse namespace M` previously walked browse silently from
  walk's offset (=3 in the canonical reproducer), returning
  entries 4-6 and emitting a fresh `browse_namespace` cursor as
  if nothing was wrong. The simple-tools dispatcher had decoded
  only `s.o` and `s.ns` from any received cursor, ignoring `s.t`
  (issuing tool). The advanced tools already enforce tool-binding
  via `Cursor.decode(expected_tool=...)`. Fix: stash
  `decoded_payload.get("t")` into `options["_cursor_t"]` at decode
  time; add the `_cursor_tool_mismatch` helper alongside the
  existing `_cursor_ns_mismatch`; fire it at the top of both
  `_handle_browse` and `_handle_walk_namespace` (defence-in-depth
  for the symmetric direction). User now sees a clear
  `Cursor / Tool Mismatch` rejection before any backend call.
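
The placeholder-dominance rule from P3-D1 (≥1 placeholder AND <100 chars of substantive prose after stripping them) can be sketched as below. The regex assumes the placeholder text matches the shape quoted above literally, and `is_table_dominated` with its `min_prose_chars` parameter is a hypothetical stand-in for the check inside `_maybe_render_subject_section`.

```python
import re

# Assumed literal shape of the compact-mode table placeholder.
_TABLE_PLACEHOLDER_RE = re.compile(
    r"\[Table \d+: \d+ rows x \d+ cols - pass compact=False to expand\]"
)


def is_table_dominated(section_text: str, min_prose_chars: int = 100) -> bool:
    """True when a section is mostly table placeholders with little prose."""
    if not _TABLE_PLACEHOLDER_RE.search(section_text):
        return False  # zero tables: leave the section unchanged
    # Strip the placeholders, then collapse whitespace before measuring.
    prose = _TABLE_PLACEHOLDER_RE.sub("", section_text)
    prose = " ".join(prose.split())
    return len(prose) < min_prose_chars
```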

### Tests

9 regression tests in `tests/test_post_a18_beta_fixes.py`:

- **P3-D1** (3): table-dominated falls back to recovery pointer;
  prose + 1 table returns body unchanged; zero tables unchanged.
- **P3-D2** (3): alias-resolved half makes the footer fire;
  neither-half-resolves still suppresses; legacy positional-only
  call signature still works.
- **P1-D4** (3): walk cursor passed to browse → rejected; browse
  cursor passed to walk → rejected; same-tool round-trip preserves
  the post-a17 P1-D3 fix.

Full test suite: **1823 passed, 50 skipped** (up from 1814 in a18).

### Pass-2 source-level audit (no siblings)

- **P3-D1**: the table-placeholder shape is unique to subject-
  attribute decomposition. Other content-fetch paths surface
  tables embedded in larger prose bodies; the defect class is
  the "single section, all-tables" shape.
- **P3-D2**: `_soft_connector_footer` is the only substring-in-
  title site in the codebase. Other footers (disambig twin probe,
  related extends paths) use exact path/title matching from search
  results.
- **P1-D4**: `_handle_search` / `_handle_links` /
  `_handle_filtered_search` also read `options["offset"]` from any
  decoded cursor, but they use search-tool offsets that aren't
  cross-tool meaningful in the same way as walk/browse's shared
  namespace-offset semantics. Filed as a follow-up opportunity to
  widen the tool-mismatch guard later if a live probe ever
  surfaces the issue.

PR: [#147](https://github.com/cameronrye/openzim-mcp/pull/147).
Commit on the sweep branch: `7be575e` (pass-1 fixes + 9 tests).

## [2.0.0a18] — 2026-05-18 (alpha pre-release) — post-a17 beta-test sweep — 3 live-Wikipedia defects across two passes

Pass 1 (live-MCP, against the freshly-shipped `v2.0.0a17` build on
`wikipedia_en_all_maxi_2026-02.zim`) surfaced three user-facing
defects. Pass 2 source-level self-audit (sibling grep for the
landed fix shapes + edge-case unit tests) found zero new defects.

A live-MCP pass-3 reprobe is deferred until this release deploys — the
MCP server in the sweep environment couldn't be restarted mid-session
to load the new build. The recent post-a16 methodology refinement
(live-MCP catches a defect class unit tests structurally cannot)
should still apply for that follow-up pass.

### Fixed

- **`_soft_connector_footer` false-fires on titles that
  structurally span the connector** (P1-D1). Queries like
  `notable people from Big Rapids, Michigan` resolved correctly to
  the `Big_Rapids,_Michigan` article (a single entity whose title
  literally contains the comma) but the footer claimed the article
  for `Michigan` was returned and told the caller to query
  separately for `notable people from Big Rapids`. Same shape for
  `musicians from Romeo and Juliet` → "for Juliet". The existing
  `left_in == right_in` suppression only catches the
  both-halves-in-title case; a subject-attribute prefix
  (`notable people from`, `musicians from`) leaves the left half
  longer than the title and defeats it. Fix adds an earlier
  title-spans-connector suppression: when `top_title` matches the
  same connector regex as the topic, the connector is structural
  to the title and the footer is suppressed. The docstring already
  named `Vienna, Austria` as a case this should fire for; the new
  guard makes it work in the prefixed-topic shape too.
- **Non-Latin topic strings resolved to wrong articles at
  cert=0.85** (P1-D2 — critical). `tell me about München` returned
  the `M` letter article; `tell me about Zürich` returned the
  `Rich` disambig; `tell me about Köln` returned the `LN`
  abbreviation. Root cause: `_TAIL_TOKEN_RE = [a-z0-9]+` in
  `openzim_mcp/title_promotion.py` stripped non-ASCII characters,
  so `iter_query_tails("München")` yielded `["m", "nchen"]` and
  `iter_query_windows` then yielded `"m"`, which
  `find_title_match("m")` cleanly resolved to the `M` letter
  article at score 1.0. The backend `find_entry_by_title_data`
  natively handles Unicode topics (`find article titled München`
  resolves to Munich at score 1.00) — only the tokenisation layer
  destroyed the topic before the backend saw it. Fix: switch
  `_TAIL_TOKEN_RE` to `[^\W_]+` (Unicode-aware `\w` minus
  underscore, so underscore still acts as a token boundary for
  path-form input like `Big_Rapids,_Michigan`).
- **`walk namespace M` cursor round-trip false-failed with
  "missing archive-identity field"** (P1-D3). Paging walk_namespace
  by passing back the `next_cursor` it just emitted produced
  `Error: Cursor for 'walk_namespace' missing archive-identity
  field. Re-issue the request without a cursor.` even though the
  cursor (decoded) carried `{"v":2,"t":"walk_namespace","s":
  {"o":3,"l":3,"ns":"M","ai":"e048666a9e92"}}`. The simple-tools
  cursor dispatcher decoded the cursor and stashed only
  `state["o"]` (as `options["offset"]`) and `state["ns"]` (as
  `options["_cursor_ns"]`), dropping `ai`.
  `_handle_walk_namespace` then rebuilt cursor_state as
  `{scan_at, l}` without `ai`; downstream `walk_namespace_data`
  called `verify_archive_identity` unconditionally and raised
  "missing" because the field was gone. Fix: stash `state["ai"]`
  (and re-stash `state["ns"]`) into options at decode time;
  `_handle_walk_namespace` includes them in the rebuilt
  cursor_state when present. The data-layer guard now has the real
  `ai` to compare against and properly distinguishes "missing"
  from "cross-archive mismatch". Browse_namespace didn't surface
  the same failure because its handler passes `offset` directly
  (no cursor_state envelope) and the browse data layer only
  verifies archive identity when an explicit
  `cursor_archive_identity` kwarg is passed — which the
  simple-tools handler doesn't pass.

### Tests

21 regression tests in `tests/test_post_a17_beta_fixes.py`:

- **P1-D1** (6): comma title with subject-attribute prefix
  suppresses; `and` title with subject-attribute prefix
  suppresses; genuine two-entity query still emits the footer;
  pre-fix both-halves-in-title still suppresses; slash-connector
  title-spans suppression (pass-2); no-connector-in-title still
  fires (pass-2).
- **P1-D2** (11): München / Zürich / Köln tokenise as single
  Unicode tokens; multi-word Unicode topic preserved; ASCII path
  unchanged (regression guard for the original `big rapids
  michigan` example); underscore boundary preserved; digits
  preserved; empty topic (pass-2); mixed Latin + non-Latin
  (pass-2); single non-Latin char (pass-2); punctuation as
  boundary (pass-2).
- **P1-D3** (4): end-to-end cursor round-trip carries `ai`;
  dispatcher stashes `_cursor_ai` into options; no-cursor case
  preserved (cursor_state stays None); cross-archive `ai`
  mismatch propagated correctly (pass-2 — preserving `ai` must
  not weaken the cross-archive enforcement guard).
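
The P1-D3 round-trip can be sketched as follows. The wire encoding (URL-safe base64 over JSON) and every function name here are assumptions made for illustration; only the decoded payload shape and the `offset` / `_cursor_ns` / `_cursor_ai` option keys come from the entry above.

```python
import base64
import json
from typing import Any, Optional


def encode_cursor(payload: dict) -> str:
    """Hypothetical wire format: URL-safe base64 over compact JSON."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(token: str) -> dict:
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))


def stash_cursor_options(token: str) -> dict:
    """Dispatcher step: copy offset, namespace, and archive identity into options."""
    state = decode_cursor(token)["s"]
    options: dict = {"offset": state["o"], "_cursor_ns": state["ns"]}
    if "ai" in state:
        options["_cursor_ai"] = state["ai"]  # the field the pre-fix code dropped
    return options


def rebuild_cursor_state(options: dict, limit: int) -> Optional[dict]:
    """Handler step: rebuild the envelope, carrying `ns` and `ai` when present."""
    if "offset" not in options:
        return None  # no-cursor case: cursor_state stays None
    cursor_state: dict = {"scan_at": options["offset"], "l": limit}
    for wire_key, option_key in (("ns", "_cursor_ns"), ("ai", "_cursor_ai")):
        if option_key in options:
            cursor_state[wire_key] = options[option_key]
    return cursor_state


token = encode_cursor(
    {"v": 2, "t": "walk_namespace",
     "s": {"o": 3, "l": 3, "ns": "M", "ai": "e048666a9e92"}}
)
state = rebuild_cursor_state(stash_cursor_options(token), limit=3)
```

With `ai` preserved in `state`, a data-layer identity check has a real value to compare, so a mismatch can be reported as a mismatch rather than as a missing field.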

Full test suite: **1814 passed, 50 skipped**.

### Deferred

- **P1-D4** (lower priority): `browse_namespace` silently accepts
  cursors emitted by `walk_namespace` (cross-tool reuse at the
  simple-tools dispatcher layer; the advanced tools already
  enforce). Not user-facing critical — simple-tools reads
  `state["o"]` and walks browse from that offset, which for the
  metadata namespace coincidentally produces a continuation page.
  A defence-in-depth follow-up would stash `state["t"]` and add a
  `_cursor_t_mismatch` check alongside the existing
  `_cursor_ns_mismatch`. Filed as follow-up rather than bundled
  here to keep the sweep tight.

### Methodology

Two passes (rather than the recent 3–7) because the three landed
fixes were narrow, well-characterised, and had no live-only
surfaces that source-level self-audit couldn't cover.
`_AFFINITY_TOKEN_RE` in `synthesize.py` and
`_tokenize_for_relevance` in `zim/search.py` use the same ASCII
pattern as `_TAIL_TOKEN_RE` but are **symmetric** tokenisers (same
regex applied to both sides of the comparison) — the P1-D2 shape
is a **unidirectional probe** that destroys the topic before the
backend sees it, which is structurally different. No siblings.
`verify_archive_identity` is also called from
`browse_namespace_data`, `extract_article_links_data`, search
cursor paths, and structure cursors, but all gate on an explicit
`cursor_archive_identity` kwarg that the simple-tools handlers
don't pass; only walk_namespace builds a cursor_state envelope
whose `ai` the data layer unconditionally checks. No siblings.

PR: [#145](https://github.com/cameronrye/openzim-mcp/pull/145).
Commits on the sweep branch: `d42213b` (pass-1 fixes + 14 tests),
`8f8a44e` (pass-2 self-audit + 7 edge-case tests), `e59b953` /
`2f71bba` (CI lint fixes — F401 unused-imports / isort).

## [2.0.0a17] — 2026-05-18 (alpha pre-release) — post-a16 sweep + empty-lead fallback + subject-attribute decomposition (four waves)

Sixteen commits across four sweep waves on top of `v2.0.0a16`. Waves
1–3 fixed 10 + 7 user-facing defects + 2 opportunities surfaced by
unit-mocked adversarial probes and the new live-MCP probing surface;
wave 4 added two behavioural improvements driven by a 2026-05-18 live
transcript where a small Qwen3-8B-Q4 model hallucinated when
`zim_query` returned section-headings-only responses for short
city/biography articles whose infobox got stripped. Wave 4's own work
went through a pass-2 self-audit that surfaced three more real
defects in the wave-4 code itself (the recurring pattern: each pass's
own fixes have leftover defects). One follow-on test-assertion fix
(PR #143) was required after the merge, when the Comprehensive Testing
job tripped on a fixture-drift issue that PR #136 had missed when it
fixed the parallel test.

### Added

- **Subject-attribute decomposition in `_handle_tell_me_about`**
  (wave 4). Queries like `famous musician from big rapids michigan`,
  `notable people from detroit`, `actors from new york` now route
  to the matching section of the resolved entity's article
  (`Notable people`, `Music`, `Film`, etc.) instead of the (often
  empty) lead. Subject hints (`musician`, `actor`, `athlete`,
  `notable people`, etc.) are extracted from the residual after
  entity-name tokens are subtracted from the topic; the candidate
  section is found via whole-word regex match against H2 headings
  (so `film` matches `Film and television` but not `Microfilm`).
  Strong hints win over weak ones (`famous`/`notable` alone don't
  fire). Soft-connector ambiguity footer fires for multi-entity
  variants like `musicians from Berlin and Paris` so the LLM
  knows the other entity was dropped. The original confidence-gate
  approach was removed by the self-audit (over-blocked legitimate
  explicit-phrasing queries like `who is a famous musician from X`
  classified at 0.85) — `_extract_subject_hint` is now the sole
  gate.
- **Empty-lead fallback in `_lead_with_toc`** (wave 4). When the
  pre-H2 lead is empty (after stripping the ZIM preamble +
  duplicated H1), advance the cut to the second non-wrapper H2 so
  the response includes the first real section's prose instead of
  just a TOC list. Motivating case: `Big_Rapids,_Michigan` from
  the 2026-05-18 live transcript — empty lead before
  `## Notable people` caused the LLM to invent facts. Gated to
  bodies in the ZIM-preamble shape; direct-content unit fixtures
  stay unchanged via a preamble-presence check.
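
A simplified sketch of the empty-lead fallback idea follows. It is not the project's `_lead_with_toc`: the function name and regexes are hypothetical, the threshold value is taken from the wave-4 fix-up notes, and the ZIM-preamble gate and the "wrapper H2" check are deliberately left out.

```python
import re

H1_RE = re.compile(r"^# [^\n]*$", re.MULTILINE)
H2_RE = re.compile(r"^## [^\n]*$", re.MULTILINE)
EMPTY_LEAD_THRESHOLD = 5  # characters; assumed meaning of the density threshold


def lead_cut(markdown: str) -> str:
    """Cut at the first H2, or at the second H2 when the lead is effectively empty."""
    h2_starts = [m.start() for m in H2_RE.finditer(markdown)]
    if not h2_starts:
        return markdown.rstrip()
    cut = h2_starts[0]
    # Ignore the duplicated H1 title when measuring how much lead prose exists.
    lead = H1_RE.sub("", markdown[:cut]).strip()
    if len(lead) < EMPTY_LEAD_THRESHOLD:
        # Advance the cut so the first real section's prose is included.
        cut = h2_starts[1] if len(h2_starts) > 1 else len(markdown)
    return markdown[:cut].rstrip()


empty_lead_doc = (
    "# Big Rapids, Michigan\n\n"
    "## Notable people\n\nA placeholder sentence about residents.\n\n"
    "## Geography\n\nA placeholder sentence about terrain.\n"
)
result = lead_cut(empty_lead_doc)
```

In this sketch the result keeps the `## Notable people` section and stops before `## Geography`; a document with a real lead sentence is cut at its first H2 as before.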

### Fixed

- **D1: chained-intent detector false-fires on `and`/`or`/`&`/
  `,`/`/` connectors that are part of legitimate article titles**
  (`Romeo and Juliet`, `TCP/IP`, etc.). Wave 1 added the soft-
  connector ambiguity layer at `_soft_connector_footer` with a
  strict `_is_substantive_topic` filter so single-token English
  sentence-words don't trip the right-promote branch. Caught by
  wave 2 self-audit and refined to the current shape.
- **D2 + D3: `walk_namespace` on empty new-scheme `B`/`X`/`Z`
  namespaces omitted `namespace_entry_count` while
  `walk_namespace M`/`W` included it** — schema inconsistency
  between sibling aggregators (post-a15 D7 family). Now uniform.
- **D4: `find article titled M/Title` silently returned `0_hits`**
  because the title index only stores titles, not paths — no
  signal to the caller. Now returns a clear error pointing at
  `get article M/Title`.
- **D5: politeness modal lead-in (`could you`/`can you`/`would
  you`/`will you`) leaked into the parsed topic** for chained
  intents because the chain detector ran before the modal-strip.
  The D5 modal-strip now lives in a shared scaffold-strip that
  runs at the top of `_chained_intent_guidance` too.
- **D6 + D7: silent default `params.get("namespace", "C")`** in
  `walk_namespace` allowed garbage namespace input (empty, `AB`,
  `1`, `_`) to walk C silently. Added an input-validation guard
  matching sibling tools.
- **D8 + D9 + D10: aggregator-disagreement family** — `browse
  namespace M` reported 13 entries (including the binary
  `Illustration_*` entry) while `list namespaces` / `walk
  namespace M` / `metadata for <file>` all reported 12. Wave 3's
  P3-D3 applies the same `is_human_readable_metadata_key` filter
  to `_enumerate_new_scheme_metadata` so all four aggregators
  agree. Also pins C namespace total to `archive.entry_count`
  (was drifting ±1 due to sampling projection).
- **P3-D2: walk_namespace cursor encoded `s.scan_at` on the wire
  but the universal top-level cursor decoder only accepts `s.o`**.
  The mismatch made walk_namespace cursors round-trip-broken
  (`cursor_decode` error when replaying the tool's own cursor).
  Caught only by live-MCP probing the tool's own cursor advice;
  unit tests of either module in isolation looked correct.
- **P3-D5 + P3-D6: surface crashes on filtered-search**
  (`KeyError: 'namespace'` on certain compact filtered-search
  branches; missing dict key in the response builder). Both
  caught by live-MCP probing the deployed server, not the
  unit-mocked test set.
- **P6-D1 + P6-D2 + P6-D3: leading-politeness probes + source-
  level sibling audits** caught three more defects in the
  `browse_namespace` family that the live-query passes hadn't
  reached. New methodology angle.
- **Self-audit fix-ups for wave 4** (caught by pass-2 self-audit
  of wave 4 itself, BEFORE the PR landed): (a) confidence gate
  over-blocked `tell me about famous musicians from X` (removed
  the gate; `_extract_subject_hint` is the sole filter);
  (b) empty-lead density threshold of 80 false-fired on real
  one-sentence leads (~11 chars after preamble strip) — lowered
  to 5; (c) `_DUPLICATED_H1_RE` required trailing `\n+` but
  callers `rstrip()` `pre_h2` before passing it in, so the
  duplicated `# Title` was never being stripped — the 80
  threshold was masking the bug; fix changed to accept
  `(?:\n+|\Z)`.
- **PR #143 follow-on: M-namespace browse-total fixture drift**
  (post-merge fix). The P3-D3 metadata-key filter dropped the
  fixture's M-namespace count from 10 to 9, tripping a sibling
  exact-count assertion (`assert result["total"] == 10`) that
  PR #136 missed when it fixed the parallel
  `test_metadata_namespace_from_metadata_keys`. Mirrored PR
  #136's pattern: `>= 5` floor with cross-referenced docstring.
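
Self-audit fix-up (c) above is easy to reproduce in isolation. The two patterns below are assumed approximations of the old and new `_DUPLICATED_H1_RE`; the real constants live in `simple_tools.py` and may differ in detail.

```python
import re

# Assumed old shape: the heading must be followed by at least one newline.
OLD_DUPLICATED_H1_RE = re.compile(r"\A# [^\n]*\n+")
# Assumed new shape: newline(s) OR end of string.
NEW_DUPLICATED_H1_RE = re.compile(r"\A# [^\n]*(?:\n+|\Z)")

# Callers rstrip() the pre-H2 text before passing it in.
pre_h2 = "# Big Rapids, Michigan\n\n".rstrip()

old_result = OLD_DUPLICATED_H1_RE.sub("", pre_h2)  # no match: heading survives
new_result = NEW_DUPLICATED_H1_RE.sub("", pre_h2)  # heading stripped
```

The surviving heading is 22 characters long, which sits below the old threshold of 80 and above the new threshold of 5. That is why lowering the threshold exposed the latent strip bug.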

### Refactored

- **Hoisted regex constants out of method bodies** for the
  empty-lead path (`_LEAD_PREAMBLE_RE`, `_DUPLICATED_H1_RE`,
  `_EMPTY_LEAD_DENSITY_THRESHOLD`) matching the project pattern
  for SonarCloud-safe regex declaration. Patterns tightened to
  use literal single space + `[^\n]*` (the ZIM renderer emits
  exactly one space after `#`/`##`) rather than `\s+` adjacent
  to `[^\n]*`, eliminating the polynomial-backtracking
  ambiguity SonarCloud's S5852 detector flagged.
- **Whole-word matching in `_resolve_section_for_subject`**
  (`\bcand\b` instead of substring) prevents false-positives
  like `film` matching `Microfilm` or `science` matching
  `Conscience`.
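
The difference between substring and whole-word matching can be shown with a short sketch. The helper name `heading_matches` is hypothetical, and case-insensitive matching is an assumption; only the `\b...\b` shape comes from the entry above.

```python
import re


def heading_matches(candidate: str, heading: str) -> bool:
    """Whole-word, case-insensitive match of a candidate against a heading."""
    pattern = rf"\b{re.escape(candidate)}\b"
    return re.search(pattern, heading, re.IGNORECASE) is not None


substring_hit = "film" in "Microfilm".lower()                 # substring test: True
whole_word_hit = heading_matches("film", "Microfilm")         # whole-word test: False
real_hit = heading_matches("film", "Film and television")     # True
```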

### Methodology

Pass-4 of the sweep introduced **source-level sibling audits**
as a new angle. After a fix lands for a defect class, grep the
codebase for the same shape and audit every sibling. P6-D1 and
P6-D2 in `browse_namespace` were caught instantly this way;
pass-2 of wave-4 confirmed there are no other regex patterns
in `simple_tools.py` requiring trailing `\n+` that could be
defeated by rstrip, and no other confidence-based gates of the
same shape as the one removed.

Wave 4 also added **adversarial self-audit BEFORE the PR
lands** rather than as a post-merge gate. The original 12-commit
wave-4 push went through a pass-2 self-audit that caught three
real defects in its own work (confidence gate, density
threshold, latent H1-strip bug — the latter unmasked when
fixing the threshold). All landed before the PR was opened for
human review.

## [2.0.0a16] — 2026-05-17 (alpha pre-release) — post-a15 beta-test sweep — 10 live-Wikipedia defects across seven passes

The live sweep of a15 against
`wikipedia_en_all_maxi_2026-02.zim` (~118 GB, ~27.2 M entries) ran
across seven passes. Pass 1 surfaced four user-facing defects (D4 in
the `tell_me_about` disambig-page handling for Mercury-class bare
titles; D5 in the intent parser's politeness-prefix regex; D6 in
`find_by_title`'s response to namespace-prefixed input; D7 a
schema-consistency gap in `walk_namespace`). Pass 2 self-audited
every D-fix in both verbose and compact rendering modes and
exercised the canonical-article paths (Berlin / Apollo 11 / Java)
the disambig-detection logic must not regress. Pass 3 re-tested
across a broader disambig set (Mars, Sun, Moon, Paris, Apollo bare),
walked empty namespaces B / X / Z, and exercised cross-fix
interactions (`could you find article titled M/Title`); both
passes 2 and 3 found zero new defects. Pass 4 then deliberately
stress-tested the four landed D-fixes from angles the earlier
passes hadn't probed (more bare-title disambigs, pathological
politeness combinations, find_by_title edge cases, walk_namespace
malformed args) AND exercised the intent paths the earlier passes
had barely touched (synthesize, browse namespace, show structure
of, links in, suggestions for, search in namespace); it surfaced
three more defects (P4-D1 / P4-D2 / P4-D3). Pass 5 verified those
three fixes; zero new defects. Pass 6 went deeper — a source-level
audit of every intent handler for the silent-default pattern
P4-D3 fixed (`params.get("X", DEFAULT)`) caught the same shape in
`_handle_browse`, and a parallel audit of every intent extractor
for the trigger-word-capture pattern P4-D1 fixed caught a sibling
extractor permissiveness in `_extract_browse`; plus a leading-
politeness probe surfaced a third defect (P6-D3) — `please tell
me about X` leaks the leading politeness into the parsed topic
just like the original D5 did for modal verbs. Pass 7 verified
all ten fixes and audited cumulative regressions across the three
commits; zero new defects.

### Fixed

- **D4: `tell me about Mercury` no longer attaches a misleading
  `_May also refer to: Mercury_Monterey — use tell me about <full
  title>_` footer to the disambiguation-page body.** Two cooperating
  bugs: `SimpleToolsHandler._is_disambig_lead` returned False
  whenever `pre_h2` exceeded 400 chars — Mercury's 628-char pre-H2
  (the "most commonly refers to" preamble, three top-level entries,
  and the "may also refer to" header) blew past the cap, so the
  existing disambig-page detection in `_lead_with_toc` never fired;
  AND the trailing-footer block in `_handle_tell_me_about` had no
  way to suppress the `disambig_twin_path` / `related_extends_paths`
  hints when the resolved body was itself a disambig page. Fixed
  by checking only the trailing 400 characters of `pre_h2` (the
  regex-free `endswith` stays bounded, but long preambles now
  trigger) and by gating both trailing footers on a fresh
  `body_is_disambig_page` check on the fetched body. Canonical
  pages with disambig twins (Berlin) keep their footer; canonical
  pages with extends-topic siblings (Apollo 11 → anniversaries /
  lunar sample display / goodwill messages) keep their footer.
- **D5: `could you tell me about Photosynthesis` now parses
  `topic = "Photosynthesis"` instead of leaking the modal lead-in
  into the topic.** The verb-prefix regex in
  `_extract_tell_me_about` anchored at `^\s*` and never matched
  "could you" / "can you" / "would you" / "will you", so the whole
  query fell through to the `topic = query.strip()` fallback and
  downstream relied on the tail-probe entity rescue to find the
  article anyway. Fixed by stripping the modal scaffold
  (`(?:could|can|would|will)\s+(?:you|we|i)\s+(?:please\s+)?`) before
  the verb regex runs. Leaves non-modal queries unchanged; combines
  cleanly with the existing trailing-politeness strip
  (`could you tell me about X please` → topic=X).
- **D6: `find article titled M/Title` now redirects to `get article
  M/Title` instead of returning a silent `0_hits`.** The title index
  only stores titles (M/Title's title is "Title"), so passing a ZIM
  namespace path through the title-lookup backend was guaranteed to
  return nothing — with no signal to the caller that the wrong tool
  was in use. `_handle_find_by_title` now detects the
  uppercase-letter + slash + non-empty-suffix shape upfront and
  returns a structured **Namespace Path, Not a Title** message that
  points at both `get article <path>` (direct lookup) and `find
  article titled <stripped>` (title-only fallback). Lowercase
  prefixes (`a/b`) and titles without the namespace shape pass
  through to the backend unchanged.
- **D7: `walk namespace A` (and any other empty new-scheme
  namespace) now includes `namespace_entry_count: 0` in the
  response.** The short-circuit at
  `openzim_mcp/zim/namespace.py` for new-scheme non-C/M/W namespaces
  built an empty result without passing `namespace_entry_count` to
  `_build_walk_result`, so the field was omitted entirely while
  walk-M and walk-W (which surface their bounded totals) included
  it. Downstream consumers had to special-case "missing" vs "zero".
  Fixed by passing `namespace_entry_count=0` in the short-circuit.
  Updated the `walk_A_10` golden to reflect the new schema; walk-M
  and walk-W goldens are unchanged (already carried the field).
- **P4-D1: `suggestions for` (no actual prefix) now returns the
  structured "Missing Search Term" error instead of silently
  autocompleting against the literal word "for".** The regex's
  optional `(?:for\s+)?` group failed to match without trailing
  whitespace, so the mandatory capture greedily swallowed "for"
  itself; the handler's existing missing-arg guard then saw a
  non-empty `partial_query` and ran the suggestion fallback (which
  spent ~70 s scanning for "for" — a high-frequency English token).
  Fixed in `_extract_suggestions` by discarding a bare-"for"
  capture so the guard takes over. Legitimate prefixes that happen
  to start with "for" (e.g., `suggestions for forest`) still work.
- **P4-D2: chained-intent detector no longer bypassed by a modal
  lead-in.** `_chained_intent_guidance`'s
  `_CHAINED_OPERATION_PREFIX_RE` is anchored at `^` and only
  recognised operation verbs at position 0, so `could you tell me
  about Photosynthesis then list namespaces` shifted the verb past
  the anchor — `left_is_op` evaluated False, the chain gate failed,
  and the query fell through to normal intent classification where
  the higher-confidence `list_namespaces` won and silently dropped
  the `tell me about` half. The D5 modal-strip lives inside
  `_extract_tell_me_about`; it only runs AFTER the chain detector
  has already decided. Fixed by pre-stripping the same modal
  scaffold (`(?:could|can|would|will)\s+(?:you|we|i)\s+
  (?:please\s+)?`) at the top of `_chained_intent_guidance` so
  detection sees the cleaned query.
- **P4-D3: `walk namespace` with a malformed argument now returns
  a structured "Missing or Invalid Namespace" error instead of
  silently walking C.** Multi-char (`AB`), digit (`1`), special
  (`_`), and missing-argument forms all fell through to
  `params.get("namespace", "C")` in `_handle_walk_namespace` with
  no signal to the caller that the input was rejected. Sibling
  tools (`find_by_title`, `links_in`, `suggestions`,
  `tell_me_about`) already return structured missing-arg errors;
  this one didn't. Fixed by adding an upfront guard that mirrors
  their shape (rule / examples) before the C-default kicks in.
- **P6-D1 + P6-D2: `browse namespace` now reaches input-validation
  parity with `walk namespace`.** Two cooperating gaps — the
  handler `_handle_browse` had the same
  `params.get("namespace", "C")` silent-default that P4-D3 fixed
  for walk; AND the extractor `_extract_browse` accepted multi-char,
  digit, and special-character namespace arguments
  (`browse namespace AB / 1 / _`) without uppercasing lowercase
  input — diverging from the strict
  `_extract_walk_namespace`. The two siblings now agree: regex
  tightened to `namespace\s+['"]?([A-Za-z])\b['"]?` with `.upper()`
  on the captured letter, and the handler returns a structured
  "Missing or Invalid Namespace" error when the extractor produces
  nothing.
- **P6-D3: leading `please` / `kindly` now strip cleanly from the
  parsed topic.** `please tell me about Photosynthesis` and
  `kindly describe Photosynthesis` previously parsed with the
  politeness phrase leaking into the topic — same shape as the
  pass-1 D5 defect but for non-modal politeness words. The article
  still resolved via tail-probe rescue, but the parsed topic was
  wrong. Fix extends the leading-strip in `_extract_tell_me_about`
  to cover `please` / `kindly` AND wraps both the modal-strip and
  the politeness-strip in a loop so composite phrases
  (`please could you tell me about X`, `please please tell me
  about X`) peel cleanly. Same loop also applied to the chain-
  detector's `_chained_intent_guidance` pre-strip so leading
  politeness doesn't bypass chain detection (mirror of P4-D2).
  Leaves the existing trailing-politeness strip alone, so
  `tell me about X please` still works, and the leading-only
  anchor (`^\s*`) prevents stripping mid-query mentions of
  `please` / `kindly` that are legitimately part of the topic.
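
The looped leading-strip described in D5 and P6-D3 can be sketched as below. The modal pattern is the one quoted in the entries above; the politeness pattern, the `re.IGNORECASE` flag, and the function name are assumptions for illustration.

```python
import re

# Modal scaffold quoted in the D5 / P4-D2 entries, anchored at the start.
MODAL_RE = re.compile(
    r"^\s*(?:could|can|would|will)\s+(?:you|we|i)\s+(?:please\s+)?",
    re.IGNORECASE,
)
# Assumed shape of the leading-politeness strip.
POLITE_RE = re.compile(r"^\s*(?:please|kindly)\s+", re.IGNORECASE)


def strip_leading_scaffold(query: str) -> str:
    """Peel modal and politeness lead-ins until the query stops changing."""
    previous = None
    while previous != query:
        previous = query
        query = MODAL_RE.sub("", query, count=1)
        query = POLITE_RE.sub("", query, count=1)
    return query


composite = strip_leading_scaffold("please could you tell me about Photosynthesis")
doubled = strip_leading_scaffold("please please tell me about Photosynthesis")
mid_query = strip_leading_scaffold("tell me about please in linguistics")
```

Because both patterns are anchored at the start of the string, a mid-query `please` is left alone, and a trailing-politeness strip can remain a separate step.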

### Tests

- **`tests/test_post_a15_beta_fixes.py`** — 80 regression tests
  pinning all ten defects. Each defect gets:
  - The fix-case test (Mercury body has no misleading trailer;
    `could you tell me about X` parses topic=X; `find article titled
    M/Title` returns redirect; `_build_walk_result` exposes the
    zero-count field; `suggestions for` triggers the missing-arg
    guard; `could you tell me about X then list namespaces` is
    detected as chained; `walk namespace AB` returns the missing-
    namespace error; `browse namespace AB` returns the same error
    and `browse namespace c` uppercases to "C"; `please tell me
    about X` strips cleanly).
  - Negative self-audit cases (Berlin keeps its disambig-twin
    footer; non-modal queries unchanged; lowercase a/b not
    redirected by find_by_title; `namespace_entry_count` omitted
    when caller passes None; legitimate `suggestions for forest`
    still captures the prefix; non-chained `could you tell me about
    X` not tripped by the chain detector; trailing `please` still
    works; mid-query `please in linguistics` not stripped).
  - Cross-defect probes (Java disambig body suppresses
    `disambig_twin_path` footer too; `please could you tell me
    about X` peels both layers; `please tell me about X then list
    namespaces` trips chain detector).
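
The `browse namespace` cases pinned above can be reproduced with the regex quoted in P6-D1 + P6-D2. The function name `extract_namespace` is hypothetical, and returning `None` stands in for "the extractor produces nothing".

```python
import re
from typing import Optional

# Regex quoted in the P6-D1 + P6-D2 entry.
NAMESPACE_RE = re.compile(r"namespace\s+['\"]?([A-Za-z])\b['\"]?")


def extract_namespace(query: str) -> Optional[str]:
    """Return the single-letter namespace, uppercased, or None if malformed."""
    match = NAMESPACE_RE.search(query)
    return match.group(1).upper() if match else None


lowercase_input = extract_namespace("browse namespace c")    # 'C'
multi_char_input = extract_namespace("browse namespace AB")  # None
```

The `\b` after the captured letter is what rejects `AB`: there is no word boundary between `A` and `B`, so the match fails and the handler's structured error path can take over.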

## [2.0.0a15] — 2026-05-16 (alpha pre-release) — post-a14 beta-test sweep — section-affinity feature now actually works on real Wikipedia content

### Pass 2 self-audit findings (the recurring pattern: each pass's own fixes have leftover defects)

Three real defects found while self-auditing pass 1; all fixed in this
commit set. Three new tests added.

- **D-Audit-1: `find_entry_by_title_data` produces duplicate rows after
  F3's redirect-chain canonicalisation.** When two suggestions
  (`Bilogy` redirect + `Biology` canonical) both resolve to the
  same canonical path, the result list previously emitted two rows
  with the same path. Added a `(zim_file, path)` dedup pass after
  the score sort, keeping the highest-scored occurrence.
- **D-Audit-2: `_follow_redirect_chain` can return `None`.** The
  pre-existing implementation's docstring promised "Returns the
  original entry on any failure" but a redirect whose
  `get_redirect_entry()` returned None resulted in the function
  returning None, which then crashed every downstream
  `entry.path` access. The fix tracks `last_good` so the helper now
  always returns a real entry, matching its contract.
- **D-Audit-3: F5's underscore-replace heuristic misses slash-shaped
  paths.** Archives like IEP set their entries' `title` field to the
  full path (`iep.utm.edu/kantview/`); the F5 humanise heuristic
  only swaps underscores, so these entries surfaced unchanged in
  `considered_articles[].title`. Extended `_build_considered_articles`
  to accept `archive_titles` (already computed by
  `_build_section_lookups`) and prefer the bundle's authoritative
  title when present. Verified in-process against the IEP archive
  where titles like `"Kant, Immanuel | Internet Encyclopedia of
  Philosophy"` now flow through correctly.
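
The D-Audit-2 contract ("always return a real entry") can be sketched with a stand-in entry type. `FakeEntry` and the `max_hops` bound are assumptions for illustration; only the `is_redirect` / `get_redirect_entry()` shape and the `last_good` idea come from the entry above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeEntry:
    """Hypothetical stand-in for an archive entry."""
    path: str
    target: Optional["FakeEntry"] = None
    is_redirect: bool = False

    def get_redirect_entry(self) -> Optional["FakeEntry"]:
        return self.target


def follow_redirect_chain(entry: FakeEntry, max_hops: int = 8) -> FakeEntry:
    """Follow redirects, never returning None: fall back to the last real entry."""
    last_good = entry
    for _ in range(max_hops):
        if not last_good.is_redirect:
            break
        nxt = last_good.get_redirect_entry()
        if nxt is None:
            break  # broken redirect: keep the last entry we actually saw
        last_good = nxt
    return last_good


biology = FakeEntry("Biology")
bilogy = FakeEntry("Bilogy", target=biology, is_redirect=True)
broken = FakeEntry("Broken", target=None, is_redirect=True)
```

Here `follow_redirect_chain(bilogy)` resolves to the canonical entry, while `follow_redirect_chain(broken)` returns `broken` itself rather than `None`, so a later `.path` access stays safe.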

The live beta-test sweep of a14 against
`wikipedia_en_all_maxi_2026-02.zim` found that every `synthesize=True`
response carried `section_id: null` on every citation and an empty
`considered_sections` list — the three coordinated mechanisms a14
shipped were architecturally inert on real archives. Unit-test
goldens passed because they used a fabricated archive where the
entire article body sits inside a single section whose id matches
the article title; real Wikipedia articles have leads outside any
section and natural-bold markup (`**EntityName**`) that breaks the
snippet-to-markdown locate path.

### Fixed

- **F1 (P1): `_locate_passage` now strips bold from BOTH the snippet
  AND the haystack markdown** before searching, with a position-
  remap so the returned offset still indexes into the original
  markdown. Wikipedia's universal `**EntityName**`-opens-the-lead
  pattern previously caused every lead-snippet `md.find` and
  normalized-search to return -1; section attribution then dropped
  every passage to entry-level citation. New helper
  `_strip_bold_with_remap(text) -> (stripped, remap)` in
  `openzim_mcp/synthesize.py`.
- **F2 (P1 cascade): `_build_considered_sections` no longer short-
  circuits to `[]` when the featured passage is article-level.**
  Surfacing the article's sections regardless of whether the
  featured passage itself was section-attributed is a strict
  improvement for the multi-round pivot — the next-turn
  `get_section` call wins either way. The early-exit at
  `synthesize.py:684` was a strict pessimization.
- **F1 cascade (pre-h1 chrome fallback): `_attribute_sections` falls
  back to the FIRST section in the bundle when no section brackets
  the located passage.** Archives that render page chrome (nav,
  breadcrumbs) before the h1 heading otherwise lose every chrome-
  area BM25 snippet to entry-level citation. Verified live against
  the IEP archive's nav-menu prefix where the h1 section starts at
  char 513.
- **F3 (A1): `title_match_hit` and `find_entry_by_title_data` now
  follow the libzim redirect chain via `_follow_redirect_chain`
  before reporting the entry path.** Wikipedia archives carry many
  comma-stripped / case-normalised redirects (`Big_Rapids_Michigan`
  → `Big_Rapids,_Michigan`); without canonicalisation, the same
  article got two different cite_ids depending on which lookup
  variant fired, splitting multi-round-agent state. Applied at all
  three entry-emission sites (fast-path, suggestion-rank, typo-
  fallback).
- **F4 (A2): non-trailing sliding-window probe added as a fallback
  to `_promote_topic_via_title_index` after the strict trailing-tail
  pass.** Queries whose entity sits at the head/middle of the prose
  (`"Big Rapids Michigan tourism"`) now resolve to the entity
  instead of falling through to BM25 noise. New helper
  `iter_query_windows(query, max_len=4, min_len=1)` in
  `openzim_mcp/title_promotion.py`; non-trailing windows only,
  longest-first, so a14's motivating tail-positioned-entity
  behavior is preserved (sliding-window only fires when no
  trailing tail resolved strictly).
- **F5 (A3): `considered_articles[].title` is humanized via the new
  `_humanize_path_title` helper** so path-shaped hit titles
  (`"West_Michigan"`) render with spaces, matching the
  `citations[]` view in the same response. Eliminates the cross-
  view inconsistency where the same article had two different
  display titles depending on which structured field surfaced it.
- **F6 (B1): the lead-with-TOC trailer in `_lead_with_toc` now
  references the canonical (post-redirect) path** for typo-
  fallback resolutions like `tell me about Bilogy` → Biology.
  Carried through F3's canonicalisation in
  `find_entry_by_title_data` — `_promote_topic_via_title_index`
  returns the canonical path, which `_fetch_topic_article_body`
  passes to `_lead_with_toc`. Previously the trailer suggested
  `show structure of Bilogy` (the typo), pushing the next call
  back through typo-fallback.
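
A minimal sketch of the F1 idea follows: strip `**` from both sides, search, then map the hit back to the original markdown. The signature mirrors `_strip_bold_with_remap(text) -> (stripped, remap)` from the entry above, but the body and the `locate` helper are assumptions, not the project's implementation.

```python
def strip_bold_with_remap(text: str) -> tuple:
    """Drop every `**` marker; remap[i] is the original index of stripped[i]."""
    stripped = []
    remap = []
    i = 0
    while i < len(text):
        if text.startswith("**", i):
            i += 2  # skip the marker without recording a position
            continue
        stripped.append(text[i])
        remap.append(i)
        i += 1
    return "".join(stripped), remap


def locate(snippet: str, markdown: str) -> int:
    """Find a snippet in markdown while ignoring bold markers on both sides."""
    plain_snippet, _ = strip_bold_with_remap(snippet)
    plain_md, remap = strip_bold_with_remap(markdown)
    pos = plain_md.find(plain_snippet)
    return remap[pos] if 0 <= pos < len(remap) else -1


markdown = "**Berlin** is the capital of Germany."
snippet = "Berlin is the capital"
naive_offset = markdown.find(snippet)      # -1: the bold markers break the match
remapped_offset = locate(snippet, markdown)  # 2: index of 'B' in the original
```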

### Tests

- Test count: 1581 (up from 1567 in a14). All passing.
- New test files:
  - `tests/test_synthesize_section_attribution_live_shape.py` —
    Wikipedia-shaped HTML fixture exercising the natural-bold
    locate path AND the pre-h1 chrome fallback.
  - `tests/test_title_match_hit_redirect_canonicalization.py` —
    redirect-chain canonicalisation for the fast-path hit.
  - `tests/test_iter_query_windows.py` — sliding-window iterator
    spec.
  - `tests/test_simple_tools_window_probe.py` — three-pass probe
    ordering: trailing-strict → window-strict → trailing-fuzzy.
  - `tests/test_simple_tools_typo_trailer_canonical_path.py` — end-
    to-end confirmation that the lead trailer uses the canonical
    path.
- Renamed `test_build_considered_sections_empty_when_featured_is_article_level`
  → `test_build_considered_sections_surfaces_all_sections_when_featured_is_article_level`
  (semantics changed; the renamed test now covers the new
  behaviour).
- Updated `test_fast_path_exact_match` and
  `test_cross_file_aggregates_and_skips_failures` mock entries to
  set `is_redirect = False` (the production path now calls
  `_follow_redirect_chain`; default MagicMock truthy-ness made the
  chain bounce forever otherwise).
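
The `MagicMock` pitfall mentioned in the last item can be shown in a few lines. The hop counter and its `max_hops` bound are hypothetical and exist only to keep the demonstration finite.

```python
from unittest.mock import MagicMock


def count_redirect_hops(entry, max_hops: int = 5) -> int:
    """Follow get_redirect_entry() while is_redirect is truthy, up to max_hops."""
    hops = 0
    while entry.is_redirect and hops < max_hops:
        entry = entry.get_redirect_entry()
        hops += 1
    return hops


# An auto-created attribute on a MagicMock is itself a MagicMock, which is truthy,
# and every call returns another mock with the same property.
unconfigured = MagicMock()

# The updated tests pin the attribute explicitly.
configured = MagicMock()
configured.is_redirect = False

unconfigured_hops = count_redirect_hops(unconfigured)  # hits the bound: 5
configured_hops = count_redirect_hops(configured)      # 0
```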

### Researched, not fixed (B2)

The live sweep observed that four parallel `zim_query` calls (one
heavy with typo-fallback variants) caused every subsequent call to
time out for ~90 seconds before the server recovered. Hypothesis:
libzim is not thread-safe on a single archive handle, and the
typo-fallback path (~1400 archive probes per call) saturates the
thread-pool. Conservative fix surface (not in this sweep):
per-archive `asyncio.Lock` around typo-fallback, OR a per-request
deadline, OR a libzim archive pool. Needs a reliable reproducer
and instrumentation before landing.
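
As a rough sketch of the first option only (a per-archive `asyncio.Lock`), and not something this release ships: the names `_archive_locks` and `typo_fallback_guarded` are hypothetical, and whether serialising the fallback actually resolves the timeouts is exactly what the missing reproducer would need to confirm.

```python
import asyncio
from collections import defaultdict
from typing import Callable, TypeVar

T = TypeVar("T")

# One lock per archive path, created lazily on first use (hypothetical registry).
_archive_locks: defaultdict = defaultdict(asyncio.Lock)


async def typo_fallback_guarded(archive_path: str, probe: Callable[[], T]) -> T:
    """Run a blocking probe in a worker thread, one at a time per archive."""
    async with _archive_locks[archive_path]:
        # asyncio.to_thread keeps the event loop responsive while the probe runs.
        return await asyncio.to_thread(probe)
```

Calls against different archives would still run concurrently under this design; only calls that share an archive path queue behind each other.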

## [2.0.0a14] — 2026-05-15 (alpha pre-release) — search-engine-style `zim_query`: tail-probe entity resolution + section-affinity boost + considered_* handles

First post-beta-test alpha that ships a feature rather than a sweep:
natural-language prose questions now resolve to canonical entities
and (in `synthesize=True` mode) lead with the most relevant section
of the resolved article. Three coordinated changes:

1. **Greedy length-down tail-probe entity resolution.** A shared
   `iter_query_tails` helper in `title_promotion.py` iterates the
   trailing 4 → 3 → 2 → 1 tokens of a query. Both the default
   `_handle_tell_me_about` path (via `_promote_topic_via_title_index`,
   two-pass strict-then-fuzzy) and the synthesize path (via
   `_promote_title_match`, single-pass strict) now probe each tail.
   This replaces the M26 4-token short-circuit that previously caused
   long prose queries like *"who are some famous people from big
   rapids, michigan"* to fall through to BM25 noise instead of
   resolving the canonical `Big_Rapids,_Michigan` entity.

2. **Section-heading affinity boost in synthesize.** A new
   `_boost_by_section_affinity` pipeline stage runs after
   `_attribute_sections`. For each passage carrying a `#section_id`,
   it computes `|query_tokens ∩ heading_tokens| / |heading_tokens|`.
   When that ratio meets `SynthesizeConfig.section_affinity_threshold`
   (default `0.25`), the passage score is multiplied by
   `section_affinity_boost` (default `1.5`) and the list is
   re-sorted (with `rank` renumbered to match). Archive-agnostic:
   the archive's own section headings supply the matching
   vocabulary, no curated synonym tables.

3. **Multi-round handles on `SynthesizeResponse`.** Two new optional
   fields surface the candidate space:
   `considered_articles` (top-3 article hits not featured) exposes
   `(archive, entry_path, title, score)` so a follow-up turn can pivot
   via `get_zim_entries`. `considered_sections` (top-10 sections of
   the featured article, in document order, minus the featured one)
   exposes `(section_id, title)` so a follow-up turn can pivot via
   `get_section`. `SynthesizeResponse` switches to
   `TypedDict(total=False)` to accommodate the additive shape;
   existing callers populating every field are unaffected. Compact-
   mode markdown rendering of these fields is deferred — the
   structured payload (`structuredContent`) always carries them.
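
The affinity ratio in item 2 can be worked through with a small sketch. The tokeniser and both function names are assumptions for illustration; the ratio, the `0.25` threshold, and the `1.5` multiplier are the defaults stated above.

```python
import re

TOKEN_RE = re.compile(r"[a-z0-9]+")  # assumed tokeniser for this sketch


def affinity(query: str, heading: str) -> float:
    """|query_tokens ∩ heading_tokens| / |heading_tokens|."""
    query_tokens = set(TOKEN_RE.findall(query.lower()))
    heading_tokens = set(TOKEN_RE.findall(heading.lower()))
    if not heading_tokens:
        return 0.0
    return len(query_tokens & heading_tokens) / len(heading_tokens)


def boosted_score(score: float, query: str, heading: str,
                  threshold: float = 0.25, boost: float = 1.5) -> float:
    """Multiply the score by `boost` when the affinity meets the threshold."""
    return score * boost if affinity(query, heading) >= threshold else score


query = "who are some famous people from big rapids, michigan"
ratio = affinity(query, "Notable people")           # 1 shared token of 2: 0.5
boosted = boosted_score(2.0, query, "Notable people")  # 3.0
unboosted = boosted_score(2.0, query, "Geography")     # 2.0
```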

The motivating query *"who are some famous people from big rapids,
michigan"* now traces:

- Default mode: tail probe resolves `Big_Rapids,_Michigan`, returns
  the article body. Better than the pre-a14 BM25-noise outcome, though
  the response is not yet section-targeted in default mode.
- `synthesize=True`: tail probe resolves the entity, affinity boost
  promotes the `#Notable_people` section to the lead passage, and
  the response carries `considered_articles` + `considered_sections`
  handles for the next turn.

### Added

- `iter_query_tails(query, *, max_len=4, min_len=1)` in
  `openzim_mcp/title_promotion.py` — greedy length-down trailing-
  token iterator, lowercased + `[a-z0-9]+` tokenized. Shared by both
  entity-resolution paths. Underscore is treated as a token boundary
  so path-form input like `Big_Rapids,_Michigan` tokenizes correctly.
- `_boost_by_section_affinity` pipeline stage in
  `openzim_mcp/synthesize.py` plus the `_section_titles_for` and
  `_maybe_boost_passage` helpers. Bundle-titles lookup is memoized
  per call; exceptions and `None` bundles are no-ops (score unchanged).
- `SynthesizeConfig.section_affinity_threshold` (default `0.25`,
  bounds `[0.0, 1.0]`) and `section_affinity_boost` (default `1.5`,
  bounds `[1.0, 10.0]`) — Pydantic-validated tunables for the new
  stage.
- `ConsideredArticle` and `ConsideredSection` TypedDicts in
  `openzim_mcp/tool_schemas.py`.
- `_build_considered_articles` and `_build_considered_sections`
  helpers in `openzim_mcp/synthesize.py`. Featured article and
  section are excluded so the lists are alternatives, not
  duplicates of the featured citation.
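
The tail iteration described for `iter_query_tails` can be approximated with the sketch below. It is a reconstruction from the description only: the function name `iter_query_tails_sketch` and the choice to yield token tuples are assumptions, not the shipped signature.

```python
from __future__ import annotations

import re
from typing import Iterator, Tuple

_TOKEN_RE = re.compile(r"[a-z0-9]+")


def iter_query_tails_sketch(
    query: str, *, max_len: int = 4, min_len: int = 1
) -> Iterator[Tuple[str, ...]]:
    """Yield trailing token windows, longest first (greedy length-down)."""
    # Lowercase first; anything outside [a-z0-9] (including "_") splits tokens.
    tokens = _TOKEN_RE.findall(query.lower())
    longest = min(max_len, len(tokens))
    for length in range(longest, max(min_len, 1) - 1, -1):
        yield tuple(tokens[-length:])
```

Because the underscore is not in the token class, path-form input such as `Big_Rapids,_Michigan` produces the same tokens as the prose form `big rapids, michigan`.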

### Changed

- `_promote_title_match` in `synthesize.py`: removed the M26 4-token
  short-circuit. Long prose queries with a clear entity tail now
  resolve canonically instead of falling through to BM25 noise.
- `_promote_topic_via_title_index` in `simple_tools.py`: rewritten
  as a two-pass tail-probe (strict 1.0-score gate across all tails
  first, then 0.8-score typo-tolerant gate across all tails). The
  two-pass ordering prevents a fuzzy 0.8 match on a long noisy tail
  from winning over an exact 1.0 match on a clean shorter tail.
- `SynthesizeResponse` TypedDict is now `total=False` to accommodate
  the new optional fields. Existing callers populating every field
  are unaffected.
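
The two-pass ordering can be illustrated with a minimal sketch. The `lookup` callable, its `(path, score)` return shape, and the function name are hypothetical; only the gate values (`1.0`, then `0.8`) and the all-tails-per-pass ordering come from the entry above.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical lookup shape: tail -> (entry_path, score), or None on a miss.
Lookup = Callable[[str], Optional[Tuple[str, float]]]


def two_pass_tail_probe(
    tails: Iterable[str], lookup: Lookup, strict: float = 1.0, fuzzy: float = 0.8
) -> Optional[str]:
    """Try every tail at the strict gate before any tail at the fuzzy gate."""
    ordered = list(tails)
    for gate in (strict, fuzzy):
        for tail in ordered:
            hit = lookup(tail)
            if hit is not None and hit[1] >= gate:
                return hit[0]
    return None
```

A single-pass loop that accepted `0.8` immediately would stop at the first long, noisy tail with a fuzzy hit; running the strict pass over all tails first lets a shorter exact match win. A real implementation would likely cache lookups, since this sketch probes each tail once per pass.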

### Tests

- 46 new unit tests across `tests/test_iter_query_tails.py`,
  `tests/test_simple_tools_tail_probe.py`,
  `tests/test_synthesize_section_affinity.py`,
  `tests/test_synthesize_considered_handles.py`, and additions to
  `tests/test_synthesize_title_promotion_v2a9.py` and
  `tests/test_tool_schemas.py`. Test count: 1567 → 1566 (one fewer
  because two affinity-boost tests with identical setup blocks were
  merged into one combined assertion after SonarCloud flagged the
  intra-file duplication).
- Three golden snapshots refreshed
  (`synthesize_berlin_geography.json`, `synthesize_munich_history.json`,
  `synthesize_capital_city.json`) — the new `considered_*` fields are
  always emitted, and the score change from `1.0 → 1.5` on
  entity-name section headings reflects the affinity boost firing.
- `test_metadata_namespace_from_metadata_keys` threshold relaxed
  from `>= 10` to `>= 5` after an upstream `zim-testing-suite`
  fixture refresh changed `nons/small.zim`'s metadata-key count
  from 10 to 9 (broke comprehensive-testing on `main` before this
  alpha was cut).

## [2.0.0a13] — 2026-05-14 (alpha pre-release) — post-a12 beta-test sweep (8 defects across three passes)

Three-pass beta-test of `v2.0.0a12` against the same 118 GB Wikipedia
ZIM (Feb 2026 snapshot) the a8 → a12 cuts targeted, via the
simple-mode `zim_query` MCP surface. Defect counts continue to shrink
across the alpha series (a10: 22+6+3, a11: 11+3+1, a12: ~6+2+0, all
in the same three-pass shape): the first pass surfaced six defects,
the second pass two structural gaps, and the third pass none.

The single most user-visible defect was `search for Berlin in
namespace C` rendering `List_of_songs_about_Berlin` at rank #1 with
the canonical `Berlin` article absent. The H2 canonical-splice gate
short-circuited to the legacy `search_with_filters` whenever the top
BM25 hit token-prefix-matched the topic — `is_strong_title_match`
returns True for any candidate that extends the topic
(`Berlin_(disambiguation)` extends `Berlin`), so the splice never
fired for new-scheme archives that have a disambig page for the
topic. Tightening the gate to require exact path equality fixes the
H2/H3 surface end-to-end for every shape, not just the case the a12
third-pass self-audit addressed.

The recurring infobox-cell concatenation bug (`5th in Europe1st in
Germany`) got its final user-visible fix this cycle: the a10/a11
sweep added a space separator between block-level cell children, but
a downstream small LLM still tokenised `5th in Europe 1st in
Germany` as one phrase. The block-cell joiner now emits `"; "`
between block boundaries so each value reads as a distinct item.

Net: 1513 tests pass (+20 over `v2.0.0a12`), 50 skipped, 38
deselected. `black` / `isort` / `flake8` / `mypy` all clean.

### Fixed — High (post-a12 beta sweep)

- **D1: orphan-bullet sub-rows chained the previous row's full label
  as their parent.** `tell me about France` rendered
  `**Government — • President:** Macron` (correct) but then
  `**• President — • Prime Minister:** Lecornu`,
  `**• Prime Minister — • President of the Senate:** Larcher`
  (wrong — the parent kept shifting). Same shape in the USA infobox.
  Berlin's `Government` sub-rows happened to render correctly because
  Wikipedia marked them differently in HTML. Root cause: the
  virtual-parent extractor for orphan-bullet rows used
  `prev_label.split(" — ", 1)[-1]` (trailing segment) instead of
  `[0]` (original parent). Each bullet row's parent inherited the
  PREVIOUS bullet's label rather than the constant section parent.
  Fixed by taking the original parent.
- **D2: `list_namespaces` reports M=13 while `walk namespace M` /
  `metadata for` report 12.** The a12 M1 fix plumbed the shared
  `is_human_readable_metadata_key` predicate to two of three
  reporting surfaces but missed `_add_new_scheme_metadata_namespace`
  in the namespace walker. `list_namespaces` reported the raw libzim
  count (13, including the `Illustration_48x48@1` binary entry)
  while the other two filtered. Added the predicate to the third
  site so all three surfaces agree on 12.
- **D3 / D4: chained-intent splitter missed two recurring-set
  shapes.** `Biology; Chemistry` (bare topics, `;` connector) fell
  through to topic-fetch and resolved to `Computational_Biology_&_
  Chemistry` (a journal). `tell me about Photosynthesis and then
  about DNA` (single-imperative-prefix continuation, right side is
  `about DNA`) fell through to full-text search on the literal
  phrase. The splitter required an operation verb on BOTH sides of
  the connector. D3 adds a bare-topic-chain branch that wraps both
  halves with `tell me about` when the connector is unambiguous
  (`;` / `then` / `and then` / `after that` / `, then`) AND both
  halves are topic-shaped (≤6 tokens, no internal connectors). D4
  adds a continuation-prefix branch that re-prefixes the right half
  with the left's verb when the right starts with
  `about` / `of` / `for` / `with` / `on` / `in` / `into` / `to`. A
  negative-case guard prevents the bare-topic branch from
  over-triggering when a half is JUST an operation verb prefix with
  no topic content (``tell me about then and now`` — the connector
  was inside the topic name, not a chain marker).
- **D5: H2 canonical-splice early-return fired on any token-prefix
  strong match.** The gate at the top of the populated-results
  branch invoked `is_strong_title_match(query, top.path, top.title)`
  to decide whether to short-circuit to the legacy
  `search_with_filters` path (avoiding canonical duplication when
  BM25 already returned a strong hit). But the matcher returns True
  for any candidate that extends the topic via prefix
  (`Berlin_(disambiguation)` extends `Berlin`,
  `Apollo_(disambiguation)` extends `Apollo`,
  `List_of_…_named_after_X` extends `X`). For new-scheme Wikipedia
  archives — where a disambig page nearly always sits next to the
  canonical — the gate fired on the disambig and the splice never
  ran. Tightened to `top_path == canonical_path` so the splice's
  reorder logic handles canonical promotion in every other shape.
  As a side effect this also unblocks H3's list-article demote,
  which lives inside the same splice block.
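
The D1 root cause is easy to reproduce in isolation. The sketch below replays the label chaining with both index choices; the `label_rows` driver is a hypothetical stand-in for the infobox renderer, while the `split(..., 1)[-1]` versus `[0]` difference is taken from the entry above.

```python
SEP = " \u2014 "  # label separator used in rendered rows (space, em dash, space)


def parent_buggy(prev_label: str) -> str:
    # Trailing segment: inherits the PREVIOUS bullet's label.
    return prev_label.split(SEP, 1)[-1]


def parent_fixed(prev_label: str) -> str:
    # Leading segment: always the original section parent.
    return prev_label.split(SEP, 1)[0]


def label_rows(first_parent, bullets, pick_parent):
    """Build row labels, deriving each parent from the previous row's label."""
    prev_label = first_parent
    labels = []
    for bullet in bullets:
        label = f"{pick_parent(prev_label)}{SEP}{bullet}"
        labels.append(label)
        prev_label = label
    return labels


bullets = ["\u2022 President", "\u2022 Prime Minister", "\u2022 President of the Senate"]
buggy = label_rows("Government", bullets, parent_buggy)
fixed = label_rows("Government", bullets, parent_fixed)
```

Under `parent_buggy` the second and third rows chain off the preceding bullet, matching the broken France rendering; under `parent_fixed` every row stays anchored to `Government`.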

### Fixed — Medium (post-a12 beta sweep)

- **D6: L2 trailing-punctuation trim only stripped one category per
  call.** `tell me about DNA, and then tell me about Photosynthesis`
  split on ` then ` to left=`tell me about DNA, and` (after
  trimming) → only the orphan `and` got stripped, the trailing `,`
  stayed. The `for/else` shape entered the punctuation branch only
  when no connector matched. Reworked to loop until stable so the
  trim handles any combination of orphan connector word + trailing
  `;`/`,` in any order.
- **D7: block-level cell separator was a bare space — final fix.**
  The a10/a11 fix turned `5th in Europe1st in Germany` into
  `5th in Europe 1st in Germany` (space separator at block
  boundaries) so cells with `<br>`/`<li>`/`<p>` children no longer
  concatenated without a separator. But downstream LLMs still
  tokenised the space-separated form as a single phrase. Upgraded
  the block-cell joiner to emit `"; "` between block boundaries so a
  population-rank cell like `<td>5th in Europe<br>1st in
  Germany</td>` renders as `5th in Europe; 1st in Germany` — two
  distinct values, same row label. Inline span groups (number
  formatting `3,913,644`, coordinates `52°31′N`) still concatenate
  directly per the a11 second-pass invariant.
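
A minimal sketch of the D7 joining rule, written against the standard-library `html.parser` so it stays self-contained. The project's own joiner is described elsewhere in this changelog in terms of `get_text()` and `NavigableString`, so treat the class and function names here as illustrative assumptions rather than the shipped code.

```python
from html.parser import HTMLParser

BLOCK_TAGS = {"br", "li", "p"}  # block boundaries named in the entry above


class CellTextJoiner(HTMLParser):
    """Collect cell text, starting a new group at each block boundary."""

    def __init__(self):
        super().__init__()
        self._groups = [[]]

    def _boundary(self):
        if self._groups[-1]:  # avoid empty groups from adjacent block tags
            self._groups.append([])

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._boundary()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._boundary()

    def handle_data(self, data):
        self._groups[-1].append(data)  # inline runs concatenate directly

    def text(self):
        parts = ["".join(group).strip() for group in self._groups]
        return "; ".join(part for part in parts if part)


def join_cell_text(cell_html: str) -> str:
    parser = CellTextJoiner()
    parser.feed(cell_html)
    parser.close()  # flush any buffered text
    return parser.text()
```

Inline tags such as `<span>` never trigger a boundary, so digit groups stay intact, and HTML comments are ignored because `HTMLParser` routes them to a separate handler that this sketch leaves as a no-op.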

### Fixed — Low (post-a12 beta sweep)

- **D8: legacy unstructured `**Error Processing Query**` template
  on four not-found surfaces.** `show structure of nonexistent_x`,
  `summary of nonexistent_x`, `get article nonexistent_x`, and
  `links in nonexistent_x` all let their backend exception
  propagate to the top-level `handle_zim_query` `except` block,
  which emitted a generic template with: no intent telemetry
  comment (`<!-- intent=... cert=... -->` was added in a12 L1 but
  only for the structured early-return paths), Python helper-name
  leakage (`Try using search_zim_file()` / `browse_namespace()` —
  none of which are MCP-surface commands), and unhelpful
  troubleshooting refs (`Check server logs` — not accessible from
  the MCP surface). `articles related to nonexistent_x` was
  already modernised in a10 F3. Added a
  `_render_not_found_recovery` helper that returns the modernised
  shape (`**Article not found: \`path\`**` + `suggestions for` /
  `find article titled` / `search for` recovery) and wrapped the
  four handler delegations with `try/except`. The outer
  `handle_zim_query` now layers the intent telemetry on success
  because the handlers return a string instead of raising.

### Wire-format / surface changes (alpha-line clean breaks)

- **`tell me about France` renders consecutive bullet sub-rows
  consistently anchored to the section parent.** Pre-fix every
  Wikipedia country article showed a chained sequence like
  `**• President — • Prime Minister:** ...` /
  `**• Prime Minister — • President of the Senate:** ...`. Post-fix
  each row reads `**Government — • Prime Minister:** ...`.
- **`list_namespaces` reports M=12 (matching `walk namespace M` and
  `metadata for`)** for archives whose only non-human-readable M
  entry is `Illustration_*`. Pre-fix M=13.
- **`Biology; Chemistry` is detected as a chained query.** Pre-fix
  it silently resolved to `Computational_Biology_&_Chemistry`.
  Other bare-topic chains (`DNA then Photosynthesis`, `Berlin and
  then Munich`) likewise.
- **`tell me about X and then about Y` is detected as a chained
  query.** Pre-fix the right half (`about Y`) wasn't recognised as
  an op verb continuation; the query fell through to full-text
  search on the literal phrase.
- **`tell me about then and now` (a topic whose name contains
  `then`) passes through unchanged.** The bare-topic chain branch
  guards against incomplete-verb halves so connector-in-topic
  queries aren't mis-classified.
- **`search for Berlin in namespace C` returns the canonical
  `Berlin` at rank #1.** Pre-fix it returned
  `[List_of_songs_about_Berlin, Berlin_(disambiguation),
  Timeline_of_Berlin]` with the canonical absent. Similar shape on
  every namespace-C archive that has a disambig page for the topic.
- **L2 chained-intent trim handles both orphan connectors and
  trailing punctuation.** `tell me about DNA, and then …` renders
  the left op as `tell me about DNA` (no trailing `,` or `and`).
- **Wikipedia infobox cells with `<br>`-separated values render with
  a `"; "` separator between values.** `**Rank:** 5th in Europe; 1st
  in Germany` instead of `5th in Europe 1st in Germany`. Inline
  span groups (number formatting, coordinates) unchanged.
- **`show structure of` / `summary of` / `get article` / `links in`
  not-found responses are structured guidance with intent telemetry
  and concrete recovery commands.** Same shape `articles related
  to` has carried since a10. Pre-fix these four used a legacy
  template with no telemetry and Python helper-name leakage.
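
The loop-until-stable trim behind the L2 change listed above could look roughly like this. It is a sketch under assumptions: the function name is hypothetical, and the connector set (`and` / `or` / `but`) is borrowed from the a12 orphan-trim pattern rather than confirmed for a13.

```python
CONNECTOR_WORDS = ("and", "or", "but")  # assumed from the a12 orphan-trim pattern


def trim_left_op(text: str) -> str:
    """Strip trailing connector words and ';' / ',' in any order."""
    while True:
        before = text
        text = text.rstrip()
        if text.endswith((";", ",")):
            text = text[:-1]
        else:
            lowered = text.lower()
            for word in CONNECTOR_WORDS:
                n = len(word)
                # Only strip a whole word: it must be preceded by whitespace.
                if len(text) > n and lowered.endswith(word) and text[-n - 1].isspace():
                    text = text[:-n]
                    break
        text = text.rstrip()
        if text == before:  # nothing changed this round: stable
            return text
```

Each round removes at most one trailing item, and the loop repeats until a round makes no change, so `DNA, and` and `DNA and,` both reduce to `DNA`.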

## [2.0.0a12] — 2026-05-13 (alpha pre-release) — post-a11 beta-test sweep (11+3+1 defects across three passes)

Three-pass beta-test of `v2.0.0a11` against the same 118 GB Wikipedia
ZIM (Feb 2026 snapshot) the a8 → a11 cuts targeted, via the simple-
mode `zim_query` MCP surface. The first pass surfaced 11 live defects
+ a handful of opportunities; the second pass self-audited the first-
pass commit and found 3 more; the third pass self-audited the second-
pass commit and found 1 deeper case. The 22 → 6 → 3 a10 → a11 shape
repeats at 11 → 3 → 1.

The single most user-visible defect was `tell me about France`
silently returning `France_national_football_team_results_(2000–
2019)` while Germany / Italy / Spain / Brazil / Mexico all returned
the correct country article — Xapian's top hit was the football
article and the existing H3 canonical-prepend gate explicitly skipped
the `len(strong_matches) == 1` non-twin case. The same root-cause
shape (silent fall-through to a wrong-but-similar article) drove most
of this sweep's catches.

The two structural root causes — `_extract_entry_path_keyworded`
regex character class and the early-return suffix-bypass pattern —
each accounted for multiple defects in different surfaces.

Net: 1493 tests pass (+30 over `v2.0.0a11`), 50 skipped, 38
deselected. `black` / `isort` / `flake8` / `mypy` / CodeQL /
SonarCloud all clean.

### Fixed — Critical (post-a11 beta sweep)

- **C1: `tell me about France` returned the football-team article.**
  Xapian's #1 hit was `France_national_football_team_results_(2000–
  2019)`, which strong-matched topic=`France` via the candidate-
  extends-topic rule, leaving `len(strong_matches) == 1` non-twin —
  the H3 canonical-prepend gate explicitly skipped that case. Gate
  now also fires when the lone strong match's tokens differ from the
  topic's, and a sibling auto-pick `_auto_pick_canonical_over_extends_topic`
  prefers the canonical when the strong-match set is exactly
  `[canonical-with-topic-tokens, ..._extends-topic-only]`. Mercury /
  Apollo / Java / DNA forks unchanged. Apollo 11 and similar hub
  topics now auto-resolve to the canonical with variants surfaced as
  a `_May also refer to: ..._` footer hint.
- **C2: multi-word entry-path extraction silently dropped the second
  word on five operations.** The shared
  `_extract_entry_path_keyworded` regex used `[A-Za-z0-9_/.-]+` for
  the capture, so `show structure of United States` matched
  `of United` and captured `United`. New extractor anchors at the
  LAST keyword and captures everything that follows, so
  `World War II`, `Albert Einstein`, `Quantum mechanics` all flow
  through correctly on `structure` / `summary` / `links` /
  `get_article` / `toc`.
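
The C2 truncation can be reproduced with the character class quoted above. In the sketch below, the old-style pattern and the keyword list are assumptions chosen for illustration (the real extractor's keyword set is not shown in these notes); the capture class `[A-Za-z0-9_/.-]+` is taken from the entry.

```python
import re
from typing import Optional

# Old behaviour: the capture stops at the first character outside the class.
OLD_CAPTURE = re.compile(r"\bof\s+([A-Za-z0-9_/.-]+)")

# Hypothetical keyword phrases, loosely based on the operations named above.
KEYWORDS = ("structure of", "summary of", "links in", "get article",
            "table of contents of")


def extract_entry_path(query: str) -> Optional[str]:
    """Anchor at the last keyword and capture everything that follows."""
    lowered = query.lower()
    anchor = -1
    for keyword in KEYWORDS:
        index = lowered.rfind(keyword)
        if index != -1:
            anchor = max(anchor, index + len(keyword))
    if anchor == -1:
        return None
    tail = query[anchor:].strip()
    return tail or None
```

Capturing the whole tail instead of a restricted character class is what lets spaces, `++`, and `@1` survive extraction.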

### Fixed — High (post-a11 beta sweep)

- **H1: title-index lookups for punctuated topics smeared to drop-
  the-punctuation candidates.** `tell me about C++` resolved past the
  title index to `C` (the letter); paired with the C2 fix that now
  preserves `++` through extraction, the punctuation-count guard
  (`_punctuation_smear_detected`) rejects candidates that drop a
  `+` / `#` count present in the topic. Known limitation: topic →
  candidate pairs that preserve the punctuation count (`C++` →
  `C/C++`) require redirect-target inspection and are deferred.
- **H2: filtered-search dropped the canonical title-match hit.**
  `_handle_filtered_search` was a one-call delegate to
  `search_with_filters` (legacy markdown path), so the splice
  `_handle_search` runs at offset=0 never fired. New
  `search_with_filters_with_canonical_splice` runs the same probe +
  prepend as the basic-search path, gated to canonical hits whose
  path lives in the requested namespace.
- **H3: Opp2 list / discography demote was synthesize-layer-only.**
  `_demote_list_articles` lived inside `synthesize_query`; basic
  `search` left catalog-shape hits in place at their BM25 rank.
  Lifted the predicate `_is_list_article` for cross-call use and
  applied it inside `_splice_title_match_into_search` (basic search)
  and the new H2 filtered-search splice.
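
The H1 punctuation-count guard can be pictured as a per-character count comparison. This is a sketch only: the function name mirrors the entry above, but the signature and the exact set of guarded characters are assumptions.

```python
def punctuation_smear_detected(topic: str, candidate: str, marks: str = "+#") -> bool:
    """True when the candidate has fewer of some guarded mark than the topic."""
    return any(candidate.count(mark) < topic.count(mark) for mark in marks)
```

The known limitation noted in H1 shows up directly: `C/C++` keeps both `+` characters, so a count-based guard cannot reject it.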

### Fixed — Medium (post-a11 beta sweep)

- **M1: `walk namespace M` and `metadata for` disagreed (13 vs 12
  keys).** Shared `is_human_readable_metadata_key` predicate now
  consulted from both sites.
- **M2: `get article M/Illustration_48x48@1` stripped `@1`.** Same
  root cause as C2 — fixed by the C2 extractor change.
- **M3: `walk namespace C` reported "archive total" instead of per-
  namespace count.** L16's `namespace_entry_count` plumbing now
  applies to new-scheme C (the count equals `archive.entry_count`).
- **M4: truncation footer reported remaining-after-offset chars as
  "total".** Added `original_total` kwarg, plumbed from the three
  callers in `zim/content.py`. Mid-article reads now switch to
  `showing chars X–Y of N-char body` so the denominator stays stable
  across pagination.
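
To show why M4 needs the original length, here is a small paging sketch. The function name, the zero-based `X`, and the next-page hint wording are assumptions; the `showing chars X–Y of N-char body` shape and the `content_offset` parameter come from these notes.

```python
from typing import Optional, Tuple


def read_page(body: str, max_len: int, content_offset: int = 0) -> Tuple[str, Optional[str]]:
    """Return one page of the body plus a footer with a stable denominator."""
    original_total = len(body)  # measured once, before slicing
    page = body[content_offset:content_offset + max_len]
    end = content_offset + len(page)
    if end >= original_total:
        return page, None  # last page: nothing left to point at
    footer = (
        f"showing chars {content_offset}\u2013{end} of {original_total}-char body; "
        f"pass content_offset={end} for the next page"
    )
    return page, footer
```

Computing the total from the sliced remainder instead of the full body is what made the reported "total" shrink on every page before the fix.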

### Fixed — Low (post-a11 beta sweep)

- **L1: structured guidance / error responses skipped the Opp6 intent
  telemetry comment.** Three early-return paths (`Topic Required`,
  `Search Terms Required`, `Chained Operations Detected`) now carry
  their own deterministic telemetry comments at `cert=1.00`.
- **L2: chained-intent splitter left the connector word attached to
  the left op.** Strip trailing connectors / orphan punctuation so
  the suggested split-up call is cleanly pasteable.
- **L3: canonical-title-match snippet rendered as snippet text.** Now
  surfaced as a distinct `Match type: canonical title match` badge
  in both `_format_search_text` and `_format_filtered_response`.

### Fixed — Second-pass self-audit (post-a11 sweep)

L1 covered three of the six structured early-return paths in the same
code section but missed the other three:

- **`Query Required`** (empty / whitespace query) →
  `intent=query_required cert=1.00`
- **`_meta_query_guidance`** (meta-only filler queries like `do
  both` / `try again` / `ok`) → `intent=meta_only_guidance cert=1.00`
- **`No ZIM File Specified`** (no archive selectable) →
  `intent=no_zim_file_specified cert=1.00`

### Fixed — Third-pass self-audit (post-a11 sweep)

- H2 splice silently dropped the canonical when
  `search_with_filters_data` returned 0 hits but `find_title_match`
  reported the canonical exists in the requested namespace.
  Symmetric to the bug the first-pass H2 fix addressed (canonical
  missing from a non-empty result page) — same wrong silent-fall-
  through, different shape. Hoisted the synthetic-canonical row
  construction above the populated-vs-empty branch so both paths
  share the same prepend logic. The empty-results path now lands the
  canonical as a single-result page with the post-a11 L3 badge.

### Fixed — Quality gate (post-a11 sweep)

- SonarCloud S5852 ReDoS on the L2 orphan-trim regex
  `\s+(?:and|or|but)\s*$|\s*[;,]\s*$` (multiple unbounded `\s*` /
  `\s+` quantifiers in alternation). Replaced with string ops that
  mirror the original "strip one of: trailing connector word OR
  trailing `;` / `,`" semantics, same approach as the existing
  `_is_disambig_lead` workaround in the same file.

### Wire-format / surface changes (alpha-line clean breaks)

- **`tell me about` auto-resolves to the canonical when the strong-
  match set is `[canonical-with-topic-tokens, ..._extends-topic-only]`**
  (Apollo 11, Pride and Prejudice, hub topics with parenthesized
  siblings). Variants are surfaced as a `_May also refer to: ..._`
  footer hint instead of the prior disambig fork. Genuine multi-
  meaning topics (Apollo / Mercury / Java / DNA) still fork as
  before.
- **`show structure of` (and `summary of` / `links in` /
  `get article` / `table of contents of`) now accept multi-word
  titles.** Pre-fix these silently truncated to the first word and
  rendered the wrong article.
- **Filtered-search responses include canonical title-match hits**
  with a distinct `Match type: canonical title match` badge instead
  of dropping them silently.
- **`tell me about C++` (or any topic with `+` / `#`) no longer
  resolves to a candidate that dropped the punctuation.** Falls
  through to search-fallback where canonical-title-match can find
  the actual `_programming_language`-suffixed article.
- **`get article M/Illustration_48x48@1` (or any path with `@`)
  preserves the suffix through extraction.** Pre-fix the regex
  character class stripped `@1` before the metadata API saw it.
- **Walk-namespace M and metadata-for now agree** on the metadata-
  key set (filtered `Illustration_*` binaries on both sides).
- **Walk-namespace C reports `(of N in namespace C)`** instead of
  `(archive total: ~N entries)` for new-scheme archives.
- **Truncation footer denominator stays stable across pagination.**
  Mid-article reads switch to `showing chars X–Y of N-char body` so
  a caller paging through a 146 KB article doesn't see the "total"
  decrease with every page.
- **Every structured guidance / error response carries an intent
  telemetry comment** so callers branching on
  `<!-- intent=... cert=... -->` see the rejection class.
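
A caller that wants to branch on the telemetry comment could parse it along these lines. The regular expression and function name are assumptions for this example; only the `<!-- intent=... cert=... -->` shape is documented above.

```python
import re
from typing import Optional, Tuple

INTENT_COMMENT_RE = re.compile(
    r"<!--\s*intent=(?P<intent>\S+)\s+cert=(?P<cert>\d+(?:\.\d+)?)\s*-->"
)


def parse_intent_comment(markdown: str) -> Optional[Tuple[str, float]]:
    """Return (intent, certainty) from the last telemetry comment, if any."""
    matches = list(INTENT_COMMENT_RE.finditer(markdown))
    if not matches:
        return None
    last = matches[-1]  # the comment is described as trailing the response
    return last.group("intent"), float(last.group("cert"))
```

For a response ending in `<!-- intent=query_required cert=1.00 -->`, this returns `("query_required", 1.0)`, which a client can use to distinguish a rejection class from a successful article fetch without parsing the body.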

## [2.0.0a11] — 2026-05-13 (alpha pre-release) — post-a10 beta-test sweep (22+6+3 defects + 7 opportunities across three passes)

Three-pass beta-test of `v2.0.0a10` against a 118 GB Wikipedia ZIM
(Feb 2026 snapshot) via the simple-mode `zim_query` MCP surface. The
first pass surfaced 22 defects + 7 opportunities from live use; the
second and third passes were self-audits of the prior commit, each
finding fewer issues than the last (22 → 6 → 3). Every fix here was
first observed live; the existing 1425-test suite covered none of
them.

The single most user-visible regression is silent text concatenation
inside Wikipedia infoboxes — every city / country article had at
least one corrupted number (`5th in Europe1st in Germany`,
`Berliner(s) (English)Berliner (m)`, `0.967very high`,
`TokyoTamaNorthern Izu Islands`). A small LLM reading this would
emit those as single tokens. Plus one **critical**: the a10 DD2 fix
threaded `content_offset` through the article-paging handler, but
the parameter was never exposed on the MCP tool — the truncation
footer told callers to "pass `content_offset=N`" via a channel that
didn't exist.

Net: 1463 tests pass (+38 over `v2.0.0a10`), 50 skipped, 38
deselected. `black` / `isort` / `flake8` / `mypy` / CodeQL /
SonarCloud all clean.

### Fixed — Critical (post-a10 beta sweep)

- **C1: `content_offset` unreachable from `zim_query`.** A10's DD2
  threaded `options["content_offset"]` through
  `_fetch_topic_article_body`, but the `zim_query` MCP signature
  never exposed the parameter. The top-level `offset` arg routes to
  `options["offset"]` (search / browse pagination), not
  `options["content_offset"]` (article-body paging). Result: every
  `tell me about Photosynthesis` truncation footer pointed to a
  paging channel that returned the same page 1. Exposed
  `content_offset` as a top-level `zim_query` parameter, validated
  `>= 0`, threaded through `options`. Truncation footers on
  `truncate_content` now report the correct next-page offset
  (Opp4 implemented inline).

### Fixed — High (post-a10 beta sweep)

- **H2: `tell me about Berlin` non-determinism.** `Berlin` and
  `Berlin (disambiguation)` both strong-matched under the
  candidate-extends-topic rule, so the disambiguation check saw 2+
  strong matches and forked between the city article and the
  disambig page. Auto-pick the canonical
  when the strong-match set is exactly `Foo` + `Foo (disambiguation)`;
  the disambig twin is surfaced as a footer hint on the returned
  body. Genuine multi-meaning topics (Apollo / Mercury / Java) still
  fork as before (Opp1 implemented inline).
- **H3: disambig hides the canonical it should be helping pick.**
  `tell me about Apollo 11` forked between `Apollo_11_anniversaries`,
  `Apollo_11_lunar_sample_display`, `Apollo_11_goodwill_messages` —
  none of which is the canonical `Apollo_11`. Probe the title index
  for the exact-topic canonical BEFORE the disambig check; prepend
  it to the strong-match list when absent.
- **H4: infobox text-extraction silently concatenates adjacent
  block-level children.** `td.get_text()` joined `<br>`, `<li>`,
  `<span>` runs without whitespace. Three-pass evolution:
  first-pass `separator=" "` mangled inline span groups
  (`3,913,644` → `3 , 913 , 644`); second-pass `_join_cell_text`
  helper inserts whitespace at block-tag boundaries only and
  concatenates inline tags directly; third-pass filters
  `Comment` instances (a `NavigableString` subclass) so invisible
  formatnum/microformat comments stop leaking as visible text.
- **H5: intent parser preempts on later-occurring keywords.**
  `tell me about berlin then list namespaces` silently ran only
  `list namespaces` (highest-confidence intent wins). New
  `_chained_intent_guidance` splits on `then` / `;` / `and then`
  connectors; if both halves start with a recognised operation
  prefix, return a "split into separate calls" guidance message.
- **H6: orphan bullet rows lose parent context.** Berlin's
  `**• Summer (DST):** UTC+02:00` rendered without a parent
  because `Time zone:` was a regular KV (not an `infobox-header`).
  When a KV row's label starts with a bullet char AND there's no
  active section, treat the previous KV row's label as a virtual
  parent — applied for that row only, doesn't persist into the
  next non-bullet row.

### Fixed — Medium (post-a10 beta sweep)

- **M7: `show structure of <multi-word title>` doesn't normalize.**
  D2 in a10 added `find_title_match(min_score=0.8)` to
  `_handle_related`; M7 extends the same pattern via the new
  `_resolve_natural_language_path` helper applied to `structure`,
  `table of contents`, `links`, `summary`, `get section`, and
  `get article` (when the path contains spaces and no namespace
  separator — direct-path lookups stay zero-cost).
- **M8: `get section` ignores `max_content_length`.** Section text
  was returned in full regardless of the cap. Honor the cap and
  append a one-line truncation footer reporting the original
  length.
- **M9: malformed cursor silent no-op.** A base64+JSON token that
  decodes but lacks the expected `s` envelope (or whose `s.o` is
  missing/invalid) used to silently degrade to page 1. The contract
  now mirrors the totally-garbled-token case: structured
  `cursor_decode` error.
- **M10: trailing-whitespace `tell me about` produces an empty
  topic.** A query of `tell me about` with a trailing space fell
  through to a topic of `"tell me about"` and disambiguated to
  articles titled "Tell Me About Tomorrow". The
  `_extract_tell_me_about` regex now uses `\b` + `(.*?)` so empty
  topics resolve to empty strings; simple_tools rejects with a
  clear "Topic Required" error.
- **M11: `explain X to me` parses incorrectly.** "explain Berlin
  to me" extracted topic `"Berlin to me"` and returned a memorial
  article. Topic extractor strips `to me` / `for me` / `please`
  politeness tails, loop-until-idempotent so wrapping cases
  (`DNA for me please`) collapse cleanly.
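
The loop-until-idempotent stripping described in M11 can be sketched as below. The function name and the trailing-comma handling are assumptions; the three politeness tails are the ones listed in the entry.

```python
POLITENESS_TAILS = ("to me", "for me", "please")


def strip_politeness_tails(topic: str) -> str:
    """Remove trailing politeness phrases until none remain."""
    topic = topic.strip()
    changed = True
    while changed:
        changed = False
        lowered = topic.lower()
        for tail in POLITENESS_TAILS:
            # Require a preceding space so only whole phrases are removed.
            if lowered.endswith(" " + tail):
                topic = topic[: -len(tail)].rstrip(" ,")
                changed = True
                break
    return topic
```

A single pass would leave `DNA for me` behind after removing `please`; repeating until nothing changes handles stacked tails in any order.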

### Fixed — Low (post-a10 beta sweep)

- **L12: trailing-whitespace `search for` with no terms.** Used to
  fall through to searching for the literal word "for". Validate
  the extracted tail before dispatch; surface "Search Terms
  Required".
- **L13: `limit=0` nonsensical pagination.** `Showing 1-0 of N —
  pass offset=0 for the next page` looped on itself. Reject
  non-positive `limit` and negative `offset` at the MCP boundary.
- **L15: `articles related to <nonexistent>` raw error.** Wrap the
  backend's "Cannot find entry" with a structured guidance message
  pointing to `suggestions for` / `find article titled` /
  `search for`. (Second-pass F3 added the same hint trio to the
  `outbound_error` branch in `render_related` that the live case
  actually surfaces through.)
- **L16: walk namespace denominator misleading.** `walk namespace M`
  with 13 entries used to render `of ~27,199,904 archive-wide
  entries`. Prefer a per-namespace denominator when available; fall
  through to the archive total only when no per-namespace count is
  known. Second-pass F4 plumbed `namespace_entry_count` through
  `_build_walk_result` so the new `of N in namespace X` shape
  actually renders.
- **L17 / L18: list namespaces total mismatch, metadata aggregator
  underreports.** Header now annotates "X archive entries (per-
  namespace sum: Y)" when the two differ. `_extract_zim_metadata`
  enumerates `archive.metadata_keys` on new-scheme archives (filtering
  `Illustration_*` binaries) so `metadata for` and
  `walk namespace M` agree on what counts as metadata. Second-pass
  F6 replaced the first-pass hardcoded probe-list extension with
  the enumeration so future archive additions don't reopen the
  disagreement.

### Added — Opportunities (post-a10 beta sweep)

- **Opp1: auto-fall-through twin.** Implemented inline with H2.
- **Opp2: expanded demote patterns.** `_LIST_ARTICLE_PREFIX_RE`
  picks up `Lists_of_*` (plural); two new patterns demote
  `Listed_*` stems and `*_discography` / `*_filmography` /
  `*_videography` / `*_bibliography` / `*_albums` / `*_singles`
  suffixes. `tell me about cats` returning a Rephlex Records
  discography at rank 2 is the canonical failure this fixes.
- **Opp3: synthesize relevance threshold.** New
  `_drop_low_relevance_tail` cuts hits whose Xapian score is below
  25% of the top hit's. Only applied in `xapian_score` fallback
  (single-archive); multi-archive RRF keeps all hits because RRF
  normalizes scores. Always keeps at least one hit.
- **Opp4: `content_offset` in truncation footers.** Implemented
  inline with C1 — `truncate_content` accepts `current_offset` so
  paginated reads compute the next offset relative to where the
  slice started in the original article. Third-pass F2 added a
  `paginatable: bool = True` kwarg so the three main-page call sites
  switch to operation-accurate guidance (the main-page surface
  doesn't accept `content_offset`).
- **Opp5: canonical-exists hint in disambig auto-pick.** When the
  H2 auto-fall-through fires, append a `_Note: this topic also has
  a disambiguation page — see ``get article <path>`` for alternate
  meanings._` footer so the disambiguation stays discoverable.
- **Opp6: intent telemetry on all responses.** Every markdown
  response now carries a trailing `<!-- intent=foo cert=0.85 -->`
  HTML comment. Invisible to humans (HTML comments aren't rendered)
  but visible in the token stream so calling LLMs can branch on the
  parser's classification certainty without parsing the body.
- **Opp7: link-count rank on related articles.** When the related-
  articles backend supplies a `mention_count`, surface it inline as
  `- **Title** (`path`) · N×` so a small LLM can rank which related
  article is most central to the source. (Second-pass H1 fixed the
  first-pass typo that read the wrong field name — `link_count`
  vs `mention_count`.)
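
The Opp3 relevance cut can be expressed in a few lines. This sketch assumes hits are `(path, score)` pairs and uses a hypothetical function name; the 25% fraction and the keep-at-least-one rule come from the entry above.

```python
from typing import List, Tuple

Hit = Tuple[str, float]  # assumed shape: (entry_path, xapian_score)


def drop_low_relevance_tail(hits: List[Hit], fraction: float = 0.25) -> List[Hit]:
    """Drop hits scoring below `fraction` of the top score; keep at least one."""
    if not hits:
        return []
    top_score = max(score for _, score in hits)
    kept = [hit for hit in hits if hit[1] >= fraction * top_score]
    # Fallback for degenerate score distributions (e.g. all negative).
    return kept or [max(hits, key=lambda hit: hit[1])]
```

Because the cut is relative to the top hit, it only makes sense where scores are comparable, which fits the note that multi-archive RRF results are left untouched.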

### Fixed — Second-pass self-audit findings

A self-audit of the first-pass commit surfaced six defects in the
fixes themselves:

- **D1 second-pass (folded into H4 above).** `get_text(separator=" ")`
  mangled inline-span numeric groups.
- **F3 second-pass (folded into L15 above).** Wrapped the wrong
  error path — backend serialises rather than re-raising.
- **F4 second-pass (folded into L16 above).** `namespace_entry_count`
  was renderer-only and never plumbed through the data payload.
- **F6 second-pass (folded into L17/L18 above).** Hardcoded probe-
  list extension still drifts; replaced with `metadata_keys`
  enumeration on new-scheme archives.
- **H1 second-pass (folded into Opp7 above).** First-pass read
  `link_count` from the related-articles result; the backend stores
  the frequency-rank signal as `mention_count`.
- **C2 perf: title-index probe ran twice on the weak-top-hit path.**
  Gated the H3 canonical-probe behind `len(strong_matches) >= 2`
  (the only condition under which the disambig page would otherwise
  render). Strong-top-hit and weak-then-promoted paths skip the
  second probe entirely. Third-pass extended the gate to also fire
  when the single strong match is itself the disambig twin.

### Fixed — Third-pass self-audit findings

A second self-audit found three more defects in the second-pass
commit:

- **D1 third-pass (folded into H4 above).** `Comment` is a
  `NavigableString` subclass; second-pass `_join_cell_text` caught
  comments and rendered their bodies as visible text.
- **C2 third-pass.** Lone disambig-twin search case bypassed the
  second-pass `>= 2` gate; extended the gate to also fire when the
  one strong match is `Foo (disambiguation)` itself.
- **F2 third-pass (folded into Opp4 above).** The second-pass
  truncation hint pointed at a `content_offset` parameter the
  main-page operation doesn't accept; added a `paginatable=False`
  kwarg on the three main-page call sites and routed them to
  operation-accurate guidance.

### Fixed — Quality gate (PR CI cleanup)

- **CodeQL: `full_len` may be uninitialized.** In the M8 truncation-
  footer code path, `full_len = len(text)` was assigned only inside
  the truncation `if` block but referenced in a different (correlated)
  `if truncated:` block. The correlation was opaque to CodeQL.
  Lifted the assignment above the branch so the variable is always
  defined. No behaviour change.
- **SonarCloud python:S5852 (ReDoS hotspot).** The
  `_search_query_tail` regex had adjacent `\s*` quantifiers that
  the heuristic flagged as polynomial-backtracking. Split into
  three single-token regexes (verb, optional `up` for `look up`,
  optional `for` connector) with plain-Python tail slicing between
  matches. Each individual pattern has at most one whitespace
  quantifier so the heuristic has nothing to flag. Behaviour
  verified identical across all 1463 tests.
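
The S5852 fix above replaces one regex with three anchored single-token patterns and plain slicing. A rough sketch of that shape, assuming a hypothetical verb list (the parser's real vocabulary is not shown in this changelog) and a hypothetical function name:

```python
import re
from typing import Optional

# Hypothetical vocabulary; each pattern matches one token, with no
# whitespace quantifier at all, so there is nothing to backtrack over.
_VERB = re.compile(r"(?:search|find|look)\b", re.IGNORECASE)
_UP = re.compile(r"up\b", re.IGNORECASE)
_FOR = re.compile(r"for\b", re.IGNORECASE)


def search_query_tail(query: str) -> Optional[str]:
    """Return the text after `<verb> [up] [for]`, or None if no verb leads."""
    text = query.strip()
    match = _VERB.match(text)
    if match is None:
        return None
    tail = text[match.end():].lstrip()  # plain-Python whitespace handling
    for optional in (_UP, _FOR):
        match = optional.match(tail)
        if match:
            tail = tail[match.end():].lstrip()
    return tail or None


print(search_query_tail("look up photosynthesis"))
print(search_query_tail("search for cats"))
```

Whitespace is consumed by `str.lstrip()` between matches, which keeps every regex linear in the input length.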

### Wire-format / surface changes (alpha-line clean breaks)

- **`zim_query` accepts a top-level `content_offset` parameter.**
  Existing callers passing only the previous parameters are
  unaffected; new callers paginating long article bodies should
  use `content_offset` instead of the legacy `offset` (the latter
  remains the search / browse pagination knob).
- **Every markdown response now carries a trailing
  `<!-- intent=... cert=... -->` HTML comment** (Opp6). Invisible
  to humans; callers that token-count or post-process the trailing
  bytes will see two extra tokens per response.
- **Intent-parser chained-query guard returns guidance instead of
  silently dispatching the rightmost intent.** Callers sending
  `X then Y` queries that previously got Y's result silently now
  receive a structured "split into separate calls" message.
- **`get section` honors `max_content_length` and appends a
  truncation footer.** Callers that previously got full section
  bodies now receive at most `max_content_length` bytes plus a
  one-line footer reporting the original length.
- **Cursor with missing/invalid `s` envelope now errors
  (`cursor_decode`).** Callers that previously got silent page-1
  fall-through now receive a structured error.
- **Infobox cells render with intra-cell whitespace at block-tag
  boundaries only.** Most callers see strictly better text (no
  `5th in Europe1st in Germany`-style concatenation); inline
  numeric / unit / coordinate microformats remain intact.
- **Synthesize ranking demotes `Lists_of_*` and `*_discography` /
  `*_filmography` / `*_albums` / `*_singles` suffixes.** Citation
  order for queries like `cats` no longer surfaces a Rephlex
  Records discography in the top half.
- **`walk namespace M` and `metadata for <file>` agree on what
  counts as metadata** (new-scheme archives enumerate
  `metadata_keys` directly). Old-scheme archives keep the
  hardcoded probe list as a fallback.
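
For callers that want to branch on the Opp6 trailer, a client-side reader might look like the sketch below. The exact trailer grammar is an assumption inferred from the `<!-- intent=foo cert=0.85 -->` example above; the function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed grammar: one intent token, one decimal certainty, at the very end.
_TRAILER = re.compile(r"<!-- intent=(\S+) cert=([0-9]+(?:\.[0-9]+)?) -->\s*$")


def read_intent_trailer(markdown: str) -> Optional[Tuple[str, float]]:
    """Return (intent, certainty) from the trailing comment, or None."""
    match = _TRAILER.search(markdown)
    if match is None:
        return None
    return match.group(1), float(match.group(2))


response = "# Cat\n\nThe cat is a small carnivore.\n\n<!-- intent=foo cert=0.85 -->\n"
print(read_intent_trailer(response))
```

Returning `None` when the trailer is absent lets a caller treat older responses and non-markdown payloads the same way.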

---

## [2.0.0a10] — 2026-05-12 (alpha pre-release) — post-a9 beta-test sweep (16 defects + 6 opportunities)

Two-pass beta-test of `v2.0.0a9` against a 118 GB Wikipedia ZIM (Feb
2026 snapshot), plus a self-review code-reviewer audit and a
SonarCloud Quality Gate cleanup. The first pass exercised the
markdown surface; the second pass audited the first-pass fixes and
extended live testing to surfaces not covered the first time. Several
recently-shipped backend features turned out to be unreachable from
the natural-language surface, several handlers had silent fall-through
bugs on common phrasings, and one libzim quirk (silent namespace-prefix
stripping) was masking the entire metadata API.

Net: 1425 tests pass (+5 over `v2.0.0a9`), 50 skipped, 38 deselected.
Live-verified key fixes against the real Wikipedia archive via
in-process `ZimOperations` calls.

### Fixed — Critical (post-a9 beta sweep)

- **D1: infobox section-context leakage on every Wikipedia city /
  country.** Berlin and Tokyo (and the broad city-template family)
  produced trailing rows labelled `**GDP — Time zone:**`,
  `**GDP — Vehicle registration:**`, `**GDP — Website:**`,
  `**GDP — HDI (2022):**` — clearly wrong. The post-a8 #2/Op5
  parent-context fix correctly tracked `current_section` from
  `<th class="infobox-header">` rows but never reset it; trailing
  free-floating rows (which Wikipedia marks `<tr class="mergedtoprow">`)
  inherited the last header. Reset `current_section` on KV rows whose
  `<tr>` carries `mergedtoprow` AND only after at least one row has
  been emitted under the current section — the second guard is the
  third-pass fix, without which the reset stripped section context
  from the *first* KV row inside a section header (Wikipedia uses
  `mergedtoprow` on those too as the visual group lead). Both edges
  covered by new regression tests.
- **D7: `M/<key>` paths silently aliased to C-namespace articles.**
  libzim's `archive.get_entry_by_path("M/Title")` strips the `M/`
  prefix and resolves to the C-namespace article with that name;
  `get article M/Title` against a Wikipedia ZIM returned the 172 KB
  disambiguation article on "Title" instead of the metadata entry.
  Route `M/<key>` paths to `archive.get_metadata_item` on new-scheme
  archives so the proper metadata API serves these requests. Verified:
  `M/Title` now returns `"Wikipedia"`, `M/Date` returns `"2026-02-15"`.
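
The two-guard reset described under D1 can be sketched with plain dictionaries standing in for parsed table rows. The row shape (`kind`, `text`, `label`, `mergedtoprow`) and the function name are hypothetical; only the reset rule itself comes from the entry above.

```python
from typing import Dict, List, Optional

SEP = " \u2014 "  # separator used in the "Section, then label" strings quoted above


def label_rows(rows: List[Dict]) -> List[str]:
    """Prefix each key/value label with its section, resetting on trailing rows."""
    current_section: Optional[str] = None
    emitted_in_section = 0
    labels: List[str] = []
    for row in rows:
        if row["kind"] == "header":
            current_section = row["text"]
            emitted_in_section = 0
            continue
        # Reset only when the row is marked `mergedtoprow` AND the section
        # already has at least one row; the first row under a header keeps it.
        if row.get("mergedtoprow") and emitted_in_section > 0:
            current_section = None
        if current_section is not None:
            labels.append(current_section + SEP + row["label"])
            emitted_in_section += 1
        else:
            labels.append(row["label"])
    return labels


rows = [
    {"kind": "header", "text": "GDP"},
    {"kind": "kv", "label": "Total", "mergedtoprow": True},
    {"kind": "kv", "label": "Per capita"},
    {"kind": "kv", "label": "Time zone", "mergedtoprow": True},
    {"kind": "kv", "label": "Website", "mergedtoprow": True},
]
print(label_rows(rows))
```

Without the `emitted_in_section > 0` guard, the `Total` row would lose its `GDP` prefix, which is the regression the second guard prevents.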

### Fixed — High (post-a9 beta sweep)

- **D2: `articles related to <topic>` failed on natural phrasings.**
  The intent parser hands the topic verbatim from the user's query
  (`articles related to United States` → `United States`), but the
  underlying entry path stores spaces as underscores
  (`United_States`). The handler called `get_related_articles_data`
  with the unresolved string and surfaced "Cannot find entry". Now
  probes the title index via `find_title_match(min_score=0.8)` first and
  falls through to the literal path only when no canonical title resolves.
- **D3: `tell me about <typo>` skipped the typo-tolerant title
  fallback.** The first-pass title promotion required score 1.0;
  single-edit typos resolve at score 0.85 via `_find_entry_typo_fallback`.
  `tell me about Photosythesis` (missing `n`) fell through to Xapian
  search and returned `International Year of Chemistry` —
  actively misleading. Retry `find_title_match(min_score=0.8)` after
  the strict gate fails; same conservative typo chain
  (length-gated at ≥ 5 chars, ≤ 700 variants).
- **DD1: `metadata for <file>` aggregator returned 172 KB article
  bodies for new-scheme archives.** D7 fixed the per-entry
  `get article M/Title` surface but `_extract_zim_metadata`
  (a separate code path used by the `metadata for` aggregator) was
  still calling `get_entry_by_path("M/Title")` and getting the same
  silently-aliased C-namespace article. Now uses `get_metadata_item`
  for new-scheme archives, with old-scheme `get_entry_by_path`
  fallback. Verified: `Title` returns `"Wikipedia"`, `Description`
  returns `"The free encyclopedia"`, `Language` returns `"eng"` (was
  172 K / 60 K / 364 K-char garbage respectively).
- **DD2: `tell me about` ignored `content_offset`.** The handler
  hard-coded offset = 0 in the body fetch, so callers paginating a
  148 KB Photosynthesis article through `zim_query` couldn't reach
  the tail without dropping to a separate `get article <path>` call.
  Threaded `options.get("content_offset", 0)` through; suppress the
  compact-mode lead-with-TOC step when reading mid-article.

### Fixed — Medium (post-a9 beta sweep)

- **D4: `get section X of Y` natural-language error path dropped the
  `closest_match` hint.** The structured `get_section` operation
  computes a `difflib`-based closest-match (Op5 from a8) but the
  natural-language handler reimplemented section lookup against the
  headings list and never queried that operation. Compute the same
  hint locally so `get section Goegraphy of Berlin` now suggests
  "Did you mean Geography?".
- **D5: `articles related to <hub>` markdown dropped the
  `scan_truncated` signal.** The a9 #A5 backend addition surfaced
  `scan_truncated` / `scan_total_internal` / `_meta.reason` for hub
  articles whose 500-link scan cap fired, but `compact_renderers.render_related`
  ignored all of it. Append a footer when the signal is set.
- **D6: `suggestions for X` missed the canonical bare-title article.**
  `suggestions for Photosyn` returned 15 results, none of which was
  bare `Photosynthesis` — both libzim's `SuggestionSearcher` and
  Xapian rank disambiguator-bearing variants
  (`Photosynthesis (song)`, `Photosynthetic_efficiency`) above the
  short canonical title. Probe `SuggestionSearcher` for parenthesised
  siblings (`foo_(suffix)`) and prepend the un-suffixed root path
  when the archive resolves it. The third-pass refactor restructured
  this to share a single `SuggestionSearcher.suggest()` round trip
  with Strategy 2, so the cold path stays at one title-index probe.
- **D8: `walk namespace W` returned zero entries while
  `list namespaces` claimed W had two.** The two operations
  contradicted each other on the same archive. The W-namespace
  well-known entries (`mainPage`, `favicon`) live on the
  `archive.main_entry` / `has_illustration` API, not the iterable
  surface that `walk_namespace_data` falls back to. Mirror the same
  probe pair `_add_new_scheme_well_known_namespace` already uses
  for the namespace listing. Also fix the `entries 1-0` off-by-one
  in the empty-walk header rendering.
- **D9: cursor `s.q` field silently ignored — wrong-query
  pagination.** Cursor reused across queries silently paginated the
  new query at the old offset. Reject with a `cursor_decode` error
  when `s.q` shares no meaningful (≥ 3-char) tokens with the current
  query. Falls back to a bidirectional substring check for cursors
  whose stored query has only short tokens. Three regression tests
  cover the unrelated-query reject, the shortened-query accept, and
  the overlapping-tokens accept.
- **DD4: `_splice_title_match_into_search` returned `limit + 1`
  results.** Prepending the canonical synthetic result didn't trim
  back to the requested limit; `limit=3` produced 4 results with
  header `"showing 1-4"`. Trim to `page_info.limit` and update
  `page_info.returned_count` so the header matches the row count.
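
The D9 cursor check above can be approximated as below. This is a sketch under assumptions: the tokenizer (lowercased ASCII alphanumerics) and both function names are hypothetical, and the real check may normalise queries differently.

```python
import re
from typing import Set


def _meaningful_tokens(query: str) -> Set[str]:
    """Lowercased alphanumeric tokens of at least 3 characters."""
    return {tok for tok in re.findall(r"[a-z0-9]+", query.lower()) if len(tok) >= 3}


def cursor_matches_query(stored_query: str, current_query: str) -> bool:
    """True when a cursor minted for `stored_query` may page `current_query`."""
    stored_tokens = _meaningful_tokens(stored_query)
    if stored_tokens:
        return bool(stored_tokens & _meaningful_tokens(current_query))
    # Stored query has only short tokens: bidirectional substring fallback.
    stored = stored_query.strip().lower()
    current = current_query.strip().lower()
    return bool(stored) and bool(current) and (stored in current or current in stored)


print(cursor_matches_query("photosynthesis in plants", "berlin history"))
print(cursor_matches_query("photosynthesis in plants", "photosynthesis"))
```

A caller would turn a `False` result into the `cursor_decode` error rather than paginating the new query at the old offset.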

### Added — Opportunities (post-a9 beta sweep)

- **O2: stopword-saturation footer on search.** Queries that match
  ≥ 1 M results (the stopword-only `search for the and a is in to`
  saturates at ~5 M) now carry a footer noting that top hits are
  ranked by general document importance, not topic relevance — so
  the model doesn't trust the "Found N matches" signal as
  meaningful.
- **O3: truncation hint no longer self-references.** The previous
  hint suggested `show structure of <path>` as the recovery —
  silly when the truncated response IS the show-structure (or
  table-of-contents) output. Replaced with operation-agnostic
  guidance (page via cursor / tighten query / `compact=False`).
- **O4: disambiguation page leads preserve their inline list.**
  `tell me about Martin` previously truncated to `**Martin** may
  refer to:` with no list, forcing a `show structure` round-trip.
  Detect "X may refer to:" leads and skip the H2 cut so the
  disambig list stays inline.
- **O5: synthesize demotes `List_of_*` / `Index_of_*` /
  `Outline_of_*` / `Timeline_of_*` etc.** These articles ranked
  surprisingly high in synthesize because their bodies match many
  query tokens but the actual content is just an enumeration stub.
  Demote to the back of `top_n` AFTER title promotion runs (demoting
  before regressed the promotion's strong-match guard, which would
  treat `Berlin_(disambiguation)` as a match for `Berlin`).
- **O6: docstring notes distinguish `show structure` (flat heading
  list) from `table of contents` (nested children tree).**
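
The O5 demotion is a stable partition applied after title promotion has already run. A minimal sketch, assuming a hypothetical function name and only the four prefixes named above (the entry's "etc." implies the real list is longer):

```python
from typing import List

# Only the prefixes spelled out in the entry above; the real list is longer.
_ENUMERATION_PREFIXES = ("List_of_", "Index_of_", "Outline_of_", "Timeline_of_")


def demote_enumerations(ranked_paths: List[str]) -> List[str]:
    """Move enumeration-style articles to the back, preserving relative order."""
    regular = [p for p in ranked_paths if not p.startswith(_ENUMERATION_PREFIXES)]
    enumerations = [p for p in ranked_paths if p.startswith(_ENUMERATION_PREFIXES)]
    return regular + enumerations


# Hypothetical ranking, already title-promoted.
ranked = ["Quantum_mechanics", "List_of_physicists", "Wave_function"]
print(demote_enumerations(ranked))
```

Because both halves keep their original order, the promoted canonical article stays at rank 1 and only the enumeration pages move.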

### Fixed — Code-reviewer audit findings (post-first-pass)

A `feature-dev:code-reviewer` agent audited the first-pass commit and
surfaced three real defects in the original fixes:

- **A1 (the second guard on D1, listed under Critical above).**
- **A2: D6 ran `SuggestionSearcher` twice on the cold path.** When
  Strategy 1 returned empty, both the canonical probe AND Strategy 2
  opened independent `SuggestionSearcher` instances against the same
  archive. The first-pass "skip canonical probe when Strategy 1
  empty" fix regressed the empty-Strategy-1 case (the canonical
  probe IS needed when Xapian misses). Restructured to share a
  single `SuggestionSearcher.suggest()` round trip via an optional
  `result_paths=` parameter on `_find_canonical_prefix_match`.
- **A3 (the token-overlap rewrite of D9, listed under Medium above).**

### Fixed — Quality gate (SonarCloud third-pass cleanup)

- **5 cognitive-complexity reductions (S3776).** Five functions added
  by the beta-test commits crossed SonarCloud's complexity-15 limit.
  Each was split into self-contained helpers without behaviour
  change: `_find_canonical_prefix_match` (53 → split into 5 helpers
  for path probing, root extraction, entry resolution, and the two
  ranking strategies), `_handle_tell_me_about` (19 → 17 → ~14 over
  two passes via `_promote_topic_via_title_index` and
  `_fetch_topic_article_body`), `render_related` (17 → ~10 via
  `_render_related_link_line` + `_scan_truncated_footer`),
  `render_walk_namespace` (19 → ~12 via `_walk_namespace_header`),
  and `_get_metadata_entry` (18 → ~13 via `_decode_metadata_content`).
- **4 duplicate-literal extractions (S1192).** The "text/" MIME prefix
  had three call sites in `zim/content.py`; "File:" / "Category:" /
  "Template:" each had three call sites in `zim/search.py`. Extracted
  to `_TEXT_MIME_PREFIX` and a `_PSEUDO_NAMESPACE_*` constant trio
  with a shared `_is_pseudo_namespace_entry(extended=)` helper.
- **1 ReDoS hotspot (S5852).** The O4 disambig-lead-detection regex
  `\bmay\s+(?:also\s+)?refer\s+to\s*:?\s*$` was flagged for nested
  unbounded quantifiers. Not actually catastrophic on Python's `re`
  engine, but replaced anyway with a normalised
  `str.endswith(("may refer to", "may also refer to"))` check — same
  behaviour, no regex engine, and the phrase list is easier to
  extend.
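
The regex-free disambiguation check from the S5852 item can be sketched as below. The normalisation steps (lowercasing, whitespace collapsing, stripping a trailing colon) and the function name are assumptions. Unlike the `\b` in the original regex, a bare `endswith` does not check the word boundary before `may`.

```python
_DISAMBIG_PHRASES = ("may refer to", "may also refer to")


def is_disambig_lead(lead: str) -> bool:
    """True when a lead paragraph ends in an 'X may refer to:' phrase."""
    # Collapse runs of whitespace, then drop an optional trailing colon.
    normalised = " ".join(lead.lower().split()).rstrip(":").rstrip()
    return normalised.endswith(_DISAMBIG_PHRASES)


print(is_disambig_lead("**Martin** may refer to:"))
print(is_disambig_lead("Berlin is the capital of Germany."))
```

Adding a new phrasing is a one-line change to `_DISAMBIG_PHRASES`, which is the extensibility the entry mentions.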

### Wire-format / surface changes (alpha-line clean breaks)

- **Infobox extraction labels for trailing rows change.** Berlin /
  Tokyo terminal rows that previously emitted as `GDP — Time zone`
  now emit as `Time zone`. Callers parsing the bullet-prefix
  structure see different label strings.
- **`metadata for <file>`** now returns short metadata strings
  instead of 172 KB article-body excerpts. Wire-format compatible
  (same keys); content is the actual ZIM metadata (`Title` =
  `"Wikipedia"`, `Date` = `"2026-02-15"`, etc.).
- **`get article M/<key>`** now returns the ZIM metadata entry
  instead of the silently-aliased C-namespace article body.
  Wire-format compatible (same response envelope); content differs.
- **`_splice_title_match_into_search`** trims to the requested
  limit. Callers receiving `limit + 1` results will now get exactly
  `limit`.
- **Cursor with mismatched `s.q` now errors.** Callers that
  previously got silent wrong-query results now receive a
  `cursor_decode` `ToolErrorPayload`.
- **Synthesize ranking demotes list articles.** Citation order for a
  query like `Quantum mechanics` no longer includes
  `List_of_textbooks_…` in the top half.
- **Truncation hint footer text changed (O3).** Callers parsing the
  trailing prose see different wording.

### Investigated and deferred

- **Pseudo-namespace pollution in default search results
  (`Portal:` / `User:` / `Help:`).** Filtering pseudo-namespace
  articles from default search is too opinionated; some callers
  legitimately want them. The canonical-promotion already pushes
  the real article to rank 1 in the common case (live-verified:
  `search for biology` → `Biology` at #1 via `(canonical title
  match)`). Revisit if the canonical-promotion fallback proves
  insufficient.

---

## [2.0.0a9] — 2026-05-12 (alpha pre-release) — post-a9 review wave (5 defects + 4 deferred items)

Follow-up review wave after the post-a8 batch (commit d3e310e). 4
parallel code-reviewer agents covered Phases A/B/C plus cross-cutting
concerns; of the 13 findings checked, 8 were withdrawn after a closer
read (the suspected bug was either already correct or by-design) and 5
were real defects. The 4 items the post-a8 batch explicitly deferred
("bigger than this batch") are now closed.

Net: 1420 tests pass (11 new red-green-verified regression tests),
50 skipped. One module + its test suite deleted as dead code — alpha
clean break per v2 plan.

### Fixed — Critical (post-a9)

- **A1: cache `_restore_entry` skipped `_total_bytes` accounting.**
  After a warm-start with persistence enabled, `max_bytes` eviction
  read zero for `_total_bytes` — the `while self._total_bytes > max_bytes`
  loop in `set()` never fired even on a snapshot that already
  exceeded the configured cap. The byte budget was silently
  inoperative across every restart until enough new sets accumulated
  to cross the threshold *from zero*. Now `_restore_entry` updates
  `_total_bytes += entry.size_bytes` symmetrically with `set()` and
  `_remove()`.
- **A2: cache `_load_from_disk` did not enforce `max_size` or
  `max_bytes` against the loaded snapshot.** Operators tightening
  caps between restarts saw the prior caps until eviction was
  triggered by new sets. Added a post-load eviction pass using the
  same LRU heap `set()` maintains.
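
The A1 and A2 fixes both come down to keeping `_total_bytes` in step with the stored entries and evicting after a restore. The toy cache below illustrates that invariant only. It is a hypothetical class that uses an `OrderedDict` for recency rather than the LRU heap the real cache maintains.

```python
from collections import OrderedDict
from typing import Any, Iterable, Tuple


class ByteBudgetCache:
    """Toy LRU cache with entry-count and byte caps (illustrative only)."""

    def __init__(self, max_size: int, max_bytes: int) -> None:
        self.max_size = max_size
        self.max_bytes = max_bytes
        self._entries: "OrderedDict[str, Tuple[Any, int]]" = OrderedDict()
        self._total_bytes = 0

    def _remove(self, key: str) -> None:
        _, size = self._entries.pop(key)
        self._total_bytes -= size

    def _evict(self) -> None:
        # Drop least-recently-inserted entries until both caps hold.
        while self._entries and (
            len(self._entries) > self.max_size or self._total_bytes > self.max_bytes
        ):
            self._remove(next(iter(self._entries)))

    def set(self, key: str, value: Any, size_bytes: int) -> None:
        if key in self._entries:
            self._remove(key)
        self._entries[key] = (value, size_bytes)
        self._total_bytes += size_bytes
        self._evict()

    def restore(self, snapshot: Iterable[Tuple[str, Any, int]]) -> None:
        for key, value, size_bytes in snapshot:
            if key in self._entries:
                self._remove(key)
            self._entries[key] = (value, size_bytes)
            self._total_bytes += size_bytes  # the accounting A1 was missing
        self._evict()  # the post-load pass A2 added


cache = ByteBudgetCache(max_size=10, max_bytes=100)
cache.restore([("a", "x", 60), ("b", "y", 60)])
print(list(cache._entries), cache._total_bytes)
```

Restoring 120 bytes into a 100-byte budget evicts the oldest entry immediately instead of waiting for later `set()` calls.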

### Fixed — Medium (post-a9)

- **A3: `create_snippet` collapsed to bare `"..."` on leading-highlight
  truncation.** When the post-highlight slice began with `**` at
  position 0 (an unpaired marker landing inside the first highlighted
  term), `sliced[:0]` produced `""` and the caller saw a content-free
  ellipsis. Now drops the orphan `**` marker and keeps the term text.
- **A4: `render_search_all` blamed the query when every archive
  errored.** `files_with_hits == 0` emitted "Try `suggestions for X`"
  prose for both "no matches" and "all archives failed" cases, sending
  the model to chase a query-correction fix for a server-side problem.
  Now branches on `files_failed >= files_searched` and emits a
  targeted "all archives errored" hint.

### Added — Opportunity (post-a9)

- **A5: `get_related_articles` surfaces scan-truncation signal.** Hub
  articles ("List of …", "Index of …") routinely carry 1000–5000
  internal links; the underlying `extract_article_links_data` was
  called with `limit=500` and the frequency rank was operating on a
  document-head-biased sample with no signal to callers. Response now
  carries optional `scan_truncated` / `scan_total_internal` /
  `scan_limit` and `_meta.reason="scan_truncated"` when the cap fired.
  Added to the `RelatedArticlesResponse` TypedDict in `tool_schemas.py`.

### Deferred items resolved (post-a9)

- **D1 (cross-cutting H1): HTTP rate-limiter `client_id` always
  `"default"`.** Every `check_rate_limit` call across `tools/*.py`
  passed no `client_id`, so the per-(client_id, operation) bucket
  infrastructure was dead in HTTP mode — one aggressive caller could
  exhaust the global bucket for everyone. Added
  `openzim_mcp/request_context.py` with a `ContextVar[str]`;
  `BearerTokenAuthMiddleware` derives client_id from the presented
  token (`"bearer:<sha256-8>"`) or remote IP (`"ip:<host>"`) and sets
  the context var on every request; `check_rate_limit` reads the var
  when `client_id=None` (the default at every tool call site). Stdio
  transport has no middleware so the ContextVar reads its `"default"`
  fallback — single-bucket behavior preserved. No tool call sites
  changed.
- **D2 (cross-cutting H3): `_load_from_disk` JSON parse moved inside
  the `_lock` critical section.** The prior window (file open +
  `json.load` outside the lock, restore inside) was narrow — only
  `__init__`-time threads could race — but a foreign-thread regression
  probe now verifies the lock is held during `open()`. Single brief
  startup blocking window, no contention in production.
- **D3 (Phase B HIGH-4): `openzim_mcp/types.py` + `tests/test_types.py`
  deleted.** The module last shipped in v1.0.0; its TypedDicts
  (`SearchResponse` with `total_results` / `has_more`, `NamespaceInfo`
  with `entry_count` / `has_more` / `offset` / `limit`) contradicted
  the live Phase B contract in `tool_schemas.py`. Only the test file
  imported from it (32 tests pinning dead code). Removed both —
  v2 alpha allows clean breaks per the v2 plan.
- **D4: 3 pre-existing mypy errors fixed.** `content_processor.
  _cell_belongs_to_infobox` narrowed via intermediate `node_bound: Tag`
  local so the closure default carries the post-guard type;
  `simple_tools._splice_title_match_into_search` call site added
  explicit `cast(SearchResponse, ...)` / `cast(Dict[str, Any], ...)`
  bridges between the TypedDict and the splice helper signature.
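
The D1 rate-limiter plumbing follows a common `ContextVar` pattern: middleware sets a per-request value, and deep call sites read it without any signature change. A minimal sketch, with hypothetical function names and the assumption that `<sha256-8>` means the first 8 hex characters of the token's SHA-256 digest:

```python
import hashlib
from contextvars import ContextVar
from typing import Optional

# Falls back to "default" when no middleware has set it (e.g. stdio transport).
current_client_id: ContextVar[str] = ContextVar("current_client_id", default="default")


def derive_client_id(token: Optional[str], remote_host: Optional[str]) -> str:
    """Build a bucket key from the bearer token, else the remote host."""
    if token:
        digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
        return "bearer:" + digest[:8]  # assumed meaning of "<sha256-8>"
    if remote_host:
        return "ip:" + remote_host
    return "default"


def resolve_client_id(client_id: Optional[str] = None) -> str:
    """What a rate-limit check would use when the caller passes no client_id."""
    return client_id if client_id is not None else current_client_id.get()


# Simulate one request: middleware sets the var, tool code reads it.
reset_token = current_client_id.set(derive_client_id("secret-token", "10.0.0.1"))
print(resolve_client_id())
current_client_id.reset(reset_token)
print(resolve_client_id())
```

Hashing the token keeps the raw secret out of bucket keys and logs while still giving each credential its own bucket.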

### Withdrawn findings (post-a9, 8)

After verification each was either correct as-written or by-design:

- browse_namespace sampled-cache poisoning — the underlying
  per-namespace listing is cached separately by archive_stat_token,
  so per-page responses are deterministic after the first call.
- bundle parent_stack not popped for dropped sections — the pop loop
  is level-relative, correctly handles dropped sections.
- synthesize outer / `_meta` total_chars divergence — by intentional
  design (outer = answer length, `_meta` = pre-cap chars).
- heading regex mandatory space — html2text always emits the space.
- `_find_entry_typo_fallback` extra_probes cap overshoot — the cap
  holds; initial analysis was wrong.
- cursor `ns` field bypasses `sanitize_input` — `sanitize_input` IS
  called on the post-cursor namespace value at the tool layer.
- `_walk_new_scheme_metadata` missing `ai` field — only fires when
  `validated_path=None`, which does not happen in production.
- `synthesize.fallback_used` semantics on empty hits — the TypedDict's
  `Literal` constraint precludes a more accurate value.

### Wire-format / surface changes

- **`openzim_mcp.types` module removed.** Any external consumer
  importing from `openzim_mcp.types` must move to
  `openzim_mcp.tool_schemas`. The v1 shapes (`total_results` /
  `has_more`) are gone; the v2 Phase B shapes (`total` / `done` /
  `next_cursor` / `page_info`) are authoritative.
- **`get_related_articles` response gains optional keys.**
  `scan_truncated`, `scan_total_internal`, `scan_limit` plus
  `_meta.reason="scan_truncated"` when the 500-link scan cap fired.
  Existing callers that ignore the new keys see no behavior change.

---

## [Unreleased] — post-a8 review batch (33 defects + 5 opportunities)

Multi-agent review wave after v2.0.0a8: 33 defects and 5 strategic
opportunities found across Phases A/B/C and cross-cutting concerns
(security, concurrency, hot-path perf, public-API stability). Every
finding was either fixed or explicitly deferred with rationale. No tests
removed; existing wire-format breaks (C2, H14) are documented in the
"Wire-format breaks" section below.

### Fixed — Critical

- **C1: path-traversal guard extended to every entry-path tool.** D12's
  guard lived only in `get_zim_entry`; sibling tools
  (`get_article_structure`, `extract_article_links`,
  `get_table_of_contents`, `get_section`, `get_entry_summary`,
  `get_binary_entry`, `get_related_articles`, batch `get_entries`) all
  accepted unsanitized entry paths. Extracted `reject_path_traversal`
  in `zim/content.py` and call it at every entry-path tool boundary.
- **C2: `browse_namespace` no longer lies with `done=True` after a
  sample.** When discovery is sampling-based, the contract field
  `done` previously flipped to True once the sample was consumed —
  clients stopped paging even though most entries remained. Now keeps
  emitting `next_cursor` and flags the response with
  `_meta.reason="sample_only"`. Wire-format compatible (existing
  fields preserved; `done` semantics tightened).
- **C4: `_scan_filtered_search` no longer cuts pagination at the scan
  cap.** When the 10K-entry scan cap fired, `total_filtered_is_lower_bound`
  was masked to False, which made `done=True` even though filtered
  hits remained past the cap. Removed the `not scan_cap_hit` guard.
- **C5: `run_with_timeout` runs in a bounded ThreadPoolExecutor.**
  The previous per-call `threading.Thread(daemon=True)` couldn't bound
  thread accumulation under sustained timeouts — a 118 GB ZIM with
  slow libzim decompression and a high timeout rate could pile up
  orphaned threads holding open archives. Default cap of 16 workers;
  override via `OPENZIM_MCP_TIMEOUT_MAX_WORKERS`.
- **C6: `_locate_passage` lockstep walk fix.** The `norm_cursor > 0`
  guard suppressed counting the first whitespace run; passages that
  began with whitespace landed in the *next* section. Dropped the
  guard (`_normalize_ws` already strips leading whitespace).
- **C7: bundle section invariant `char_start < char_end` enforced.**
  A heading at the very end of an article with no body content
  previously produced a degenerate `SectionMeta` with `char_start ==
  char_end`. Now dropped from the bundle's `sections` list.
- **C8: typo probe is single-sweep.** `find_entry_by_title` on a cold
  miss used to iterate the ~700-variant set TWICE (once for the
  fallback, once for the verified-suggestion pool). Merged into
  `_find_entry_typo_fallback_with_suggestions` returning
  `(best_entry, verified_titles)` from one pass. Halves worst-case
  latency on the spec's 30 ms budget.
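
The C5 change swaps one thread per call for a shared bounded pool. The sketch below shows the general shape under assumptions: the module-level pool and the helper's signature are illustrative, and only the environment variable name and the default of 16 come from the entry above.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_MAX_WORKERS = int(os.environ.get("OPENZIM_MCP_TIMEOUT_MAX_WORKERS", "16"))
_POOL = ThreadPoolExecutor(max_workers=_MAX_WORKERS, thread_name_prefix="zim-timeout")


def run_with_timeout(fn, timeout_seconds, *args, **kwargs):
    """Run `fn` on the shared pool; raise TimeoutError if it overruns."""
    future = _POOL.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        future.cancel()  # only helps if the task has not started yet
        raise TimeoutError(f"operation exceeded {timeout_seconds}s") from None


def slow_operation():
    time.sleep(0.2)
    return "late"


print(run_with_timeout(lambda: "fast", 1.0))
try:
    run_with_timeout(slow_operation, 0.05)
except TimeoutError as exc:
    print("timed out:", exc)
```

A timed-out task still runs to completion on its worker. The pool does not kill it, but it does cap how many such stragglers can exist at once, and further submissions queue instead of spawning new threads.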

### Fixed — High

- **H9: `_meta.reason="low_relevance"`** is now emitted when Xapian
  returned hits but none of them token-match the query (a hit matches
  when its path or title carries any query token of ≥3 chars). Same
  suggestion pool as `0_hits`. Spec §4 defined the
  enum but no code emitted it until now.
- **H10: `_handle_search_all` routes compact-mode through
  `search_all_data`** so the aggregate `_meta.reason` /
  `_meta.suggestions` surface in the footer (legacy path bypassed
  `search_all_data` entirely). New `compact_renderers.render_search_all`.
- **H11: `_highlight_terms` joins paragraphs with `\n\n`** (not a
  single space). The single-space join silently broke any second
  paragraph that opened with a markdown heading.
- **H12: `tokens_est` no longer collapses to 0 for non-empty payloads
  that tokenize to 0 tokens.** Rare BPE edge case; the previous
  `if raw_tokens else 0` clause emitted a misleading zero instead of
  the +5% padded estimate.
- **H13: `_extract_entry_summary_data` no longer bypasses the bundle
  when `compact=True`.** Stale comment claimed the bundle stored
  non-compact markdown; the bundle has rendered with `compact=True`
  since v2.0.0a3. Now the same article produces identical markdown
  from `get_entry_summary` and `get_section`.
- **H14: `SearchAllResponse.results[].result` is shape-stable.**
  Previously `Union[SearchResponse, ToolErrorPayload]` (callers had
  to type-sniff). Now `result` is `Optional[SearchResponse]` and
  errors ride sibling keys (`error: bool`, `error_operation`,
  `error_message`). Wire-format break — callers branching on
  `result.get("error")` move to `entry.get("error")`.
- **H15: Phase B spec updated to cursor v=2.** Spec previously said
  v=1; implementation has been on v=2 since v2.0.0a4 (the cursor
  version that added the `s.ai` archive-identity field).
- **H16: `walk_namespace` archive-identity check is unconditional.**
  The previous `if "ai" in cursor_state:` guard let a hand-crafted
  cursor without `ai` skip the cross-archive verification. The
  underlying `verify_archive_identity` already raises on absent `ai`.
- **H17: bundle relaxed heading regex no longer over-matches.** The
  previous `^#{level} [^\n]*{text}[^\n]*$` accepted any heading
  *containing* the extracted text. Tightened to only match decorated-
  text variants (`*`, `_`, `` ` ``, `\`, whitespace before/after).
- **H18: `get_section` D5 widen scope tightened.** When the narrow
  slice would be empty, the previous fix widened to the first
  following section's `char_end`, which included that section's
  whole sub-tree. Now widens to that section's *first descendant's
  start* so the response covers only the child's lead prose.
- **H19: RRF fuse tie-breaks deterministically.** Equal-score paths
  now sort by `(-score, path)` so repeated multi-archive synthesize
  calls return citations in the same order.
- **H20: `is_strong_title_match` no longer false-positives on
  bare-first-name candidates.** Removed the reverse-direction prefix
  match (topic-extends-candidate) that let `"Martin"` promote past
  the canonical article for query `"Martin Luther King"`. Kept the
  forward direction (`"Berlin"` still promotes to `"Berlin (city)"`).
- **H21: `_get_encoder` uses `functools.lru_cache(maxsize=1)`.**
  Replaces the unguarded `_EncoderCache` check-then-set with a C-level
  lock so two concurrent `asyncio.to_thread` workers no longer race
  to write the tokenizer on first-use.
- **H22: `search_all` honors an aggregate wall-clock timeout.** New
  `OPENZIM_MCP_SEARCH__SEARCH_ALL_TOTAL_TIMEOUT_SECONDS` (default 20s).
  When the budget fires, the fan-out stops, partial results return
  with `done=False`, `budget_exceeded=True`, and
  `_meta.reason="search_all_budget_exceeded"`.
- **H23: `attach_meta` accepts a pre-rendered string.** Callers with a
  ready serialization (e.g. a markdown body about to ship) can pass
  `rendered=` to skip the per-call full-payload JSON serialization.
  Hot-path optimization for ~50 KB search responses.
- **H24: simple-mode cursor decode errors travel as `ToolErrorPayload`.**
  Previously a markdown string; now `tool_error(operation="cursor_decode",
  ...)` so callers can branch on `result.error`.
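
The H19 tie-break is a one-line change to the sort key. A minimal reciprocal rank fusion sketch showing it, where the function name and the smoothing constant `k=60` (a conventional choice) are assumptions rather than the project's values:

```python
from typing import Dict, List, Tuple


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Fuse per-archive rankings; ties resolve by path for stable output."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, path in enumerate(ranking, start=1):
            scores[path] = scores.get(path, 0.0) + 1.0 / (k + rank)
    # Sort by descending score, then ascending path, so equal-score entries
    # never depend on dict insertion order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


fused = rrf_fuse([["B", "A"], ["A", "B"]])
print([path for path, _ in fused])
```

Here `A` and `B` each hold one first place and one second place, so their scores tie and the path decides the order regardless of which archive was searched first.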

### Fixed — Medium

- **M25: `available_section_ids` capped at 50.** Long Wikipedia articles
  (United States, World War II) carry 80-150 section IDs; the prior
  unbounded error payload burned 4-6 KB of context for nothing.
  `available_section_ids_truncated` and `available_section_ids_total`
  surface the truncation.
- **M26: `_promote_title_match` skips multi-word content queries.**
  Queries with 5+ alphanumeric tokens (`"effects of climate change on
  arctic biodiversity"`) are recognized as prose, not entity lookups —
  skip the per-archive title-index probe to save a redundant fast-path
  walk.
- **M27: cache persistence default uses XDG.** Default
  `~/.cache/openzim-mcp/cache.json` (honors `XDG_CACHE_HOME`).
  Previously `.openzim_mcp_cache` in CWD, which silently failed inside
  read-only Docker images. Existing configured paths unchanged.
- **M28: Bearer challenge includes `realm`.** RFC 6750 §3 allows it as
  an optional attribute, and some MCP SDK clients inspect the full
  challenge to decide whether to auto-inject a token. Now emits
  `WWW-Authenticate: Bearer realm="openzim-mcp"`.
- **M29: `process_mime_content(snippet_mode=True)` exposed (hook only).**
  Adds a snippet-only rendering mode that skips infobox/table
  rewrites. `_get_entry_snippet` keeps the full compact pipeline
  because a Wikipedia article's leading infobox dominates the
  snippet's first paragraph without extraction — skipping the
  rewrite produced pipe-soup snippets in golden testing. The hook
  stays available for future callers that want raw rendering.
- **M30: dependency upper bounds.** `mcp[cli]<2.0`, `pydantic<3.0`,
  `libzim<4.0`, `tiktoken<1.0` etc. Caps the next major so a fresh
  `pip install` can't land on a wheel-incompatible upstream.
- **M31: synthesize errors return `ToolErrorPayload`.** If an inner
  exception escapes `_handle_synthesize_query` past its own
  try-except, the outer `handle_zim_query` except previously
  swallowed the shape and emitted markdown. Now detects the
  synthesize branch and emits `tool_error("synthesize_pipeline_error")`.
- **M32: suggestion titles humanized.** `Photosynthesis_(biology)` →
  `Photosynthesis (biology)` so the footer hint reads as a query a
  model can copy verbatim.
- **M33: `_cell_belongs_to_infobox` binds `node` via default arg.**
  Python closures resolve names at call time, so the previous version
  was correct only because the function happened to be called inside
  the same loop iteration that defined `node`. Future-proofs against
  restructuring.
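
The late-binding behaviour behind M33 can be reproduced in a few lines. This is a generic sketch, not the project's `_cell_belongs_to_infobox`; the helper names are made up for illustration.

```python
def make_checkers_late(nodes):
    # Each lambda looks up `node` when it is called, so after the loop
    # finishes every lambda sees the final value.
    return [lambda cell: cell == node for node in nodes]


def make_checkers_bound(nodes):
    # `node=node` is evaluated at definition time, which freezes the
    # value that was current for that iteration.
    return [lambda cell, node=node: cell == node for node in nodes]


late = make_checkers_late(["a", "b"])
bound = make_checkers_bound(["a", "b"])

print(late[0]("a"))   # False: the closure now sees node == "b"
print(bound[0]("a"))  # True: the default argument captured "a"
```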

### Added — Opportunities

- **Op1: live-archive smoke skeletons.** New
  `tests/live/test_live_phase_c_primitives.py` covers `get_section`,
  `synthesize`, `get_related_articles`, and `walk_namespace` against
  real Wikipedia ZIMs. Auto-skips when `ZIM_TEST_DATA_DIR` doesn't
  point at a ZIM directory.
- **Op2: `compact` parameter on natural-shape advanced tools.**
  `get_zim_entry`, `get_zim_entries`, `get_entry_summary` now accept
  `compact: bool = False` and thread it through. Phase F decides
  whether to propagate further.
- **Op3: `browse_namespace` sampling semantics documented in the tool
  docstring.** Explicitly says `done=True` in sampling mode means
  "end of sample, not end of namespace" and recommends `walk_namespace`
  for exhaustive iteration.
- **Op4: `_meta.reason` taxonomy expanded.** Added `sample_only`,
  `archive_unavailable`, `search_all_budget_exceeded` reasons + footer
  recovery prose for each.
- **Op5: `section_not_found` carries a `closest_match` hint.**
  `difflib`-based suggestion so a fat-fingered ID (`Goegraphy`) hands
  the model the right ID (`Geography`) without a full TOC scan.
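
A `difflib`-based hint like the one in Op5 might look roughly like this. The function name, the `0.6` cutoff, and the sample section IDs are assumptions for the sketch, not values taken from the codebase.

```python
import difflib


def closest_section_id(requested, available, cutoff=0.6):
    """Return the most similar known section ID, or None if nothing is close."""
    matches = difflib.get_close_matches(requested, available, n=1, cutoff=cutoff)
    return matches[0] if matches else None


hint = closest_section_id("Goegraphy", ["History", "Geography", "Economy"])
print(hint)  # Geography
```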

### Wire-format breaks (alpha-line clean breaks)

- **`SearchAllResponse.results[].result`** changes from
  `Union[SearchResponse, ToolErrorPayload]` to `Optional[SearchResponse]`.
  Errors now ride sibling keys `error: bool` / `error_message` /
  `error_operation`. Callers branching on `result["error"]` move to
  the per-file `entry["error"]`.
- **`browse_namespace` `done` semantics** change in sampling mode.
  Clients depending on `done == True` to mean "end of the namespace"
  should consult `sampling_based` and `_meta.reason="sample_only"`.

## [2.0.0a8] — 2026-05-11 (alpha pre-release)

Re-cut of v2.0.0a7 — the v2.0.0a7 tag exists but its GitHub Release
failed to publish because `pip-audit` surfaced two upstream urllib3
CVEs (CVE-2026-44431 / 44432) that landed in the audit database
between the v2.0.0a6 and v2.0.0a7 builds. v2.0.0a8 carries the same
v2.0.0a7 content plus the urllib3 → 2.7.0 bump that closes the CVEs.
Also adjusts `make security` to pass `--skip-editable` so pip-audit
doesn't fail looking for the local package on PyPI mid-release.

Defect + opportunity batch on top of v2.0.0a6, found by end-to-end
testing against a real Wikipedia ZIM (118 GB, 27.2M entries,
Feb 2026 snapshot). 14 defects fixed, 8 opportunities added.
1388 tests pass (+13 from new test modules); no regressions.

### Fixed — Phase A (snippets, infobox, typo fallback)

- **#14: `_typo_variants` now reaches `"Photosythesis"` → `"Photosynthesis"`.**
  v2.0.0a4 shipped only transposition + deletion edits — mathematically
  unable to recover the missing `'n'` (insertion). Added insertion +
  substitution against the full a-z alphabet, length-gated at ≥ 5 chars
  to bound cost (~700 variants for a 13-char input; ≤ 10 ms/call).
- **#1: snippet highlighter no longer produces malformed markdown.**
  `_highlight_terms` previously wrapped query terms verbatim, producing
  `**Artificial **photosynthesis****`, `_****Berlin****_`, and
  `[**Photosynthesis**](**Photosynthesis** "**Photosynthesis**")` when
  the match landed inside existing bold / italic / link constructs.
  Added a skip regex covering paired emphasis runs and full
  `[text](href "tooltip")` link constructs (deliberately not bare
  parens, so prose like `(also called assimilation)` keeps its
  highlighting).
- **#1: snippet fallback to stem-prefix substring match.** When no
  whole-word match existed, the snippet used to drop to the lead
  paragraph. Now it falls back to a stem-prefix substring (first ⅔ of
  the query term) so `"photosynthesis"` catches paragraphs mentioning
  `"photosynthetic"` instead of returning the article's unrelated lead.
- **Op1: snippets drop the duplicate `# <Title>` H1.** `create_snippet`
  accepts an optional `title=`; `_get_entry_snippet` forwards the
  entry title so the heading that already appears in the result row
  doesn't burn 5–15 tokens per result.
- **#2 / Op5: infobox extraction tracks parent-section context.**
  `extract_infobox` now prefixes labels with their parent
  `<th colspan>` heading row, so a Berlin infobox renders
  `Area — City/State` / `Population — City/State` instead of three
  identical `City/State` rows. Also skips rows whose nearest table
  ancestor isn't the infobox (handles nested chronology / coords
  microformats) and rejects `<th>` / `<td>` candidates borrowed from
  inside nested tables.
- **Op6: strip image-caption / hatnote / sidebar / navbox / inline
  citation noise.** `UNWANTED_HTML_SELECTORS` now drops `figure`,
  `figcaption`, `.thumb`, `.thumbcaption`, `.gallery`, `.hatnote`,
  `.sidebar`, `.navbox`, `.metadata.mbox-small`, `sup.reference`,
  `.reference`, `.mw-collapsible-toggle`, and the `.geo-*` coordinate
  microformats. Article leads now start with the actual prose, not
  `Schematic of … For other uses, see X (disambiguation). Part of a
  series on … 52°31'07"N 13°24'16"E …`.
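
For the #14 typo fallback, a minimal single-edit generator over the a-z alphabet could be sketched as follows. It assumes lowercase input and a hypothetical function name; the project's `_typo_variants` may differ in ordering, casing, and deduplication.

```python
import string


def typo_variants(word, min_len=5):
    """Return every string one edit away from `word` (lowercase a-z only)."""
    if len(word) < min_len:
        return set()  # length gate keeps short inputs from exploding
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    substitutes = {a + c + b[1:] for a, b in splits if b for c in letters}
    inserts = {a + c + b for a, b in splits for c in letters}
    variants = deletes | transposes | substitutes | inserts
    variants.discard(word)  # the input itself is not a correction
    return variants


variants = typo_variants("photosythesis")
print("photosynthesis" in variants)  # True: reached by inserting "n"
```

Deletion and transposition alone cannot produce a longer string, which is why the insertion step is what recovers the missing letter.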

### Fixed — Phase B (response contract)

- **#3 / Op8: `zim_query` accepts a `cursor` parameter.** Tools advertised
  opaque base64 cursors in their responses, but the simple-mode
  `zim_query` tool only took an integer `offset` — the cursors were
  decorative. Now decoded; `s.o` populates `options["offset"]` and the
  per-tool state is preserved. Length-capped at 2 KB as
  defense-in-depth.

### Fixed — Phase C (primitives)

- **#9 / #7: `get_section` table rendering now matches `get_zim_entry`.**
  The bundle's `rendered_markdown` was built with `compact=False` while
  `get_zim_entry` rendered with `compact=True`. Result: `get_section
  "Geography"` returned pipe-soup tables while the surrounding article
  fetch path showed `[Table N: M rows x P cols - pass compact=False to
  expand]` placeholders. Bundle and search-snippet rendering paths now
  both apply `compact=True`, so the markdown is consistent everywhere.
- **#10 / D8: synthesize attribution carries the `#section_id` suffix.**
  `_locate_passage` couldn't find passages containing `**bold**`
  highlight markers inside the bundle's plain markdown — every citation
  fell back to entry-level (`section_id: null`). Now strips `**`
  markers before locating so attribution resolves correctly.
- **#10 / D5: synthesize strips natural-language interrogative prefix.**
  `synthesize=True` with `"tell me about Berlin"` previously fed the
  entire phrase to BM25 — returning Irving Berlin songs, Nat King Cole
  albums, and a graffiti article instead of the canonical Berlin
  entry. Intent-parses first, hands only the topic to the search
  stage; preserves the original query for response echo.
- **#10 / D8 / Op4: response dedupe + link-strip in compact mode.**
  `passages[].text_markdown` previously duplicated `answer_markdown`
  verbatim (~50% token bloat on every synthesize call). In compact
  mode, passages now omit the body text. Wikipedia link-soup
  (`[text](href "tooltip")`) is also stripped from passages — small
  models can't follow inline links from inside tool responses anyway.
- **Op3: `get_section` supports narrow scoping.** New
  `include_subsections=False` parameter on `get_section_data` (and the
  `narrow section X of Y` / `just section X of Y` query syntax in
  simple mode) ends the slice at the next heading of any level, so a
  caller can fetch just the H2 lead paragraphs without the cascading
  H3 sub-tree.
- **Op2: compact structure response carries per-heading summaries.**
  The 80-char `summary` field is derived from each section's body
  preview so a small model can choose which section to drill into,
  not just see which exist.

### Fixed — namespace / metadata / `tell me about`

- **D2: `browse namespace C` no longer crashes on new-scheme archives.**
  Legacy code built a full 27 M-entry list before slicing 50 rows out
  of it — slow, memory-hostile, and triggered "session expired" errors
  on real Wikipedia archives. New `_browse_new_scheme_c_paginated`
  pages directly through the entry-id range.
- **D3: `browse namespace W` returns the actual W entries.** New-scheme
  archives keep W off libzim's iterable surface, but the well-known
  paths (`W/mainPage`, `W/favicon`, ...) are reachable via
  `has_entry_by_path`. New `_browse_new_scheme_w_paginated` probes
  them so the response matches `list_namespaces`' count.
- **D11: metadata previews cap at 800 chars.** Wikipedia ZIMs store
  `M/Title` as a full HTML document (~1 MB) rather than the bare title
  string. The `metadata for <archive>` call previously returned 980 KB,
  starving every other metadata field. Each entry is now capped with
  a `[truncated, N chars total]` marker.
- **D6 / Op7: `tell me about <topic>` auto-fetches on title-index hit.**
  When the top BM25 result wasn't a strong-title match (Xapian ranked
  `List of songs about Berlin` above the canonical `Berlin` article),
  the response used to render the search list. Now falls back to
  `find_entry_by_title_data`; promotes any score-1.0 result past the
  BM25 ranking and inlines the article body.

### CI / quality

- **3 new test modules, 47 additional assertions** covering each fix:
  `test_typo_variants_v2a7.py`, `test_content_processor_fixes_v2a7.py`,
  `test_v2a7_fixes_helpers.py`. End-to-end proof that `"Photosythesis"`
  resolves through the full call path (mock archive + suggester); perf
  guard against quadratic regressions in `_typo_variants`; cursor
  garbage-rejection; metadata cap on both long and short values.
- **Goldens regenerated** (all strict improvements): pipe-soup infobox
  snippet → clean lead-paragraph snippet for Einstein; H1 dedup +
  section attribution on the Berlin / Munich synthesize fixtures.
- **Test infra**: explicit `encoding="utf-8"` on golden read/write so
  non-ASCII characters in goldens survive Windows runners.
- **SonarCloud quality gate**: factored shared test setup
  (`_make_simple_handler`, `_build_metadata_mock_archive`,
  `_wire_typo_fallback_archive`) and namespace browse-payload shape
  (`_new_scheme_browse_payload`, `_materialise_paths`) so new-code
  duplication stays under 3%.

## [2.0.0a6] — 2026-05-11 (alpha pre-release)

Bugfix-only follow-up to v2.0.0a4 after a two-pass review of the
shipped Phase A/B/C surface. 13 defects fixed; no new tools, no
wire-format breaks. Existing tests stay green (1344 passed) and a few
new tests pin down the corrected behaviour.

### Fixed — Phase A (`_meta` envelope, snippets, fuzzy fallback)

- **`format_footer` recovery branch now covers every empty-result `reason`.**
  Responses carrying `reason="no_xapian_index"` or `"bad_namespace"`
  previously fell through to the success branch and emitted a useless
  "~0 tokens" footer. They now render archive-shaped recovery hints
  ("No full-text index on this archive. Try `find_entry_by_title` or
  `browse_namespace`." / "Unknown namespace. Try `list_namespaces`…").
- **`create_snippet` no longer slices `**term**` mid-tag.** When the
  highlighted snippet exceeded `snippet_length` and the second
  truncation landed inside a bold marker, the result was a runaway-bold
  fragment (`…**ter`). The truncation now detects an unpaired trailing
  `**` and strips the dangling segment before appending `...`.
- **Spec §14.4: surface `alt_spelling` suggestions when a fuzzy hit
  is returned.** `find_entry_by_title_data` now adds the resolved
  typo-corrected title to `_meta.suggestions[]` so callers can see
  *which* correction the server applied and decide whether to accept
  it. Previously suggestions only appeared when zero results were
  found.
- **`tokens_est` is now omitted from `_meta` when the tokenizer is
  unavailable** (spec §5). Callers can distinguish "tokenizer
  unavailable" from "zero-token response". `format_footer` falls back
  to char-count when the field is absent.
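
The `create_snippet` truncation fix can be pictured with a small helper that refuses to leave an unpaired `**` behind. This is a simplified sketch with a hypothetical name; it only handles bold markers and ignores other markdown constructs.

```python
def truncate_snippet(text, limit):
    """Cut `text` to `limit` chars without leaving an unpaired ** marker."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # The cut may have split a marker in half, leaving a single "*".
    if cut.endswith("*") and not cut.endswith("**"):
        cut = cut[:-1]
    # An odd number of "**" markers means the last one was never closed.
    if cut.count("**") % 2 == 1:
        cut = cut[: cut.rfind("**")]
    return cut.rstrip() + "..."


print(truncate_snippet("the **term** appears here", 9))   # the...
print(truncate_snippet("the **term** appears here", 15))  # the **term** ap...
```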

### Fixed — Phase B (cursor)

- **`CursorPayload.v` comment refreshed.** The inline comment said
  "currently 1" while `CURRENT_VERSION = 2`; replaced with a stable
  reference to the module constant so it can't drift again.

### Fixed — Phase C (bundle, get_section, synthesize)

- **`_locate_passage` whitespace-run offset mapping.** The
  normalized→original-offset walk could exit pointing inside a
  whitespace run that `md_norm` had collapsed to one space, attributing
  citations to the prior section when two section boundaries sat
  either side of the run. The mapper now advances past any remaining
  whitespace so the returned offset always lands on the first
  non-space character of the match.
- **`get_section` truncation surfaces `_meta.total_chars`.** Truncation
  set `_meta.truncated=True` but omitted the source-length context.
  Callers now see how much of the section was elided. `more_at_offset`
  remains absent because `get_section` truncation is not resumable.
- **`archive_by_name` collision on duplicate `.stem`.** Two ZIMs with
  the same filename in different directories (`en/wiki.zim` and
  `fr/wiki.zim`) silently overwrote each other in synthesize's
  archive-lookup dict, causing the bundle for the first archive's hit
  to be fetched from the second archive — i.e., citation poisoning.
  Duplicate stems are now detected at archive enumeration and
  disambiguated with a `~N` suffix (`wiki`, `wiki~2`) with a warning
  log. Single-archive synthesize is unaffected.
- **`get_entry_summary(compact=True)` no longer ignores the `compact`
  flag.** The Phase C bundle migration silently routed compact
  callers through `bundle["rendered_markdown"]`, which is rendered
  with `compact=False` (pipe-soup tables, no oversized-table
  placeholders). Compact requests now bypass the bundle and re-render
  through `_extract_html_summary(compact=True)` so Phase A #2's table
  trimming applies. `compact=False` (the default) still benefits from
  the bundle's shared HTML parse.
- **Cache keys for `path_mapping:`, `binary_meta:`, and `ns_entries:`
  now include an `<mtime_ns>:<size>` invalidation token.** Atomic ZIM
  file replacement (the typical monthly Wikipedia refresh) previously
  left stale resolved paths, binary metadata, and namespace listings
  in cache until LRU eviction. The bundle cache already did this; the
  helper has been extracted to `bundle.archive_stat_token()` and
  applied across all four.
- **`synthesize._extract_passages` no longer double-renders the
  snippet through `html_to_plain_text`.** `search_top_k` already
  returns plain-markdown snippets via `create_snippet`; the
  BeautifulSoup→html2text round-trip risked mangling `**bold**`
  highlight markers. Trust the upstream pipeline; a regression test
  pins the behaviour.
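
The duplicate-stem disambiguation described above can be sketched as a small mapping step. The function name is hypothetical, and this version does not guard against an archive whose real stem already looks like `wiki~2`.

```python
from pathlib import Path


def unique_archive_names(paths):
    """Map each archive path to a unique name, suffixing repeats with ~N."""
    seen = {}
    names = {}
    for path in paths:
        stem = Path(path).stem
        seen[stem] = seen.get(stem, 0) + 1
        count = seen[stem]
        # The first occurrence keeps the bare stem; later ones get ~2, ~3, ...
        names[path] = stem if count == 1 else f"{stem}~{count}"
    return names


names = unique_archive_names(["en/wiki.zim", "fr/wiki.zim", "de/books.zim"])
print(names)
```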

### Test suite

- 1344 passed, 50 skipped (one more than the v2.0.0a4 baseline; added
  `test_extract_passages_preserves_bold_highlight_markers`).
- Updated `test_zim_operations.py` and `test_content_tools.py` cache-key
  assertions for the new stat-token format.
- Updated `test_synthesize.py` to drop the `content_processor` arg
  from `_extract_passages` and use plain-markdown snippet fixtures.

## [2.0.0a4] — 2026-05-10 (alpha pre-release)

v2 Phase C, part 2: completes the retrieval-primitives phase. Adds the
`get_section` tool (#7) and the `zim_query(synthesize=True)` mode (#10)
on top of the EntryBundle infrastructure that shipped in v2.0.0a3.
**No wire-format breaks** — both new surfaces are additive.

### #7 — New tool `get_section`

```
get_section(zim_file_path, entry_path, section_id, *, max_chars=None)
  → Union[GetSectionResponse, ToolErrorPayload]
```

Returns a single section's body (~500-1500 tokens — small-model sweet
spot per parent-document-retrieval research) plus full metadata.
`section_id` values come from `get_table_of_contents`
(`TocHeading.section_id`). On miss, returns
`tool_error("section_not_found", extras={"available_section_ids": [...]
})` so the model can self-correct.

The data layer slices `EntryBundle.rendered_markdown[char_start:char_end]`
where the bundle's section ranges include subsections (a parent heading's
`char_end` extends to the next heading at the same or higher level).
Parent sections therefore return the full subtree body. `max_chars`
truncates the body and sets `truncated=True` plus `_meta.truncated=True`
in the envelope for budget-aware clients.
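
To make the range rule concrete, here is a sketch of how a section's `char_end` could be derived and how `max_chars` truncation would apply. The tuple layout and helper names are assumptions; the real `EntryBundle` structures are richer than this.

```python
def section_ranges(headings, total_chars):
    """headings: list of (section_id, level, char_start) in document order."""
    ranges = {}
    for i, (section_id, level, start) in enumerate(headings):
        end = total_chars
        for _, next_level, next_start in headings[i + 1:]:
            if next_level <= level:  # same or higher-level heading closes it
                end = next_start
                break
        ranges[section_id] = (start, end)
    return ranges


def get_section_body(markdown, ranges, section_id, max_chars=None):
    start, end = ranges[section_id]
    body = markdown[start:end]
    truncated = max_chars is not None and len(body) > max_chars
    return (body[:max_chars] if truncated else body), truncated


md = "## Geography\nLead.\n### Climate\nMild.\n## History\nOld.\n"
headings = [
    ("Geography", 2, md.index("## Geography")),
    ("Climate", 3, md.index("### Climate")),
    ("History", 2, md.index("## History")),
]
ranges = section_ranges(headings, len(md))
body, truncated = get_section_body(md, ranges, "Geography")
print(body)  # includes the Climate subsection, stops before History
```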

### #10 — New `zim_query(synthesize=True)` mode

```python
{
    "query": str,
    "answer_markdown": str,        # passages + inline [cite: ...] markers
    "passages": list[SynthesizePassage],
    "citations": list[Citation],
    "archives_searched": list[str],
    "fallback_used": Literal["xapian_score", "rrf_fusion", "reranker"],
    "total_chars": int,
    "total_words": int,
    "_meta": MetaEnvelope,
}
```

Pure retrieval + concatenation; no LLM generation. The seven-stage
pipeline (in `openzim_mcp/synthesize.py`):

1. **Per-archive search** — Xapian top-K hits (`search_top_k` helper
   on `ZimOperations`).
2. **RRF fusion** — Reciprocal Rank Fusion (k=60) when multiple archives
   are searched; identity passthrough for single-archive
   (`fallback_used="xapian_score"` vs `"rrf_fusion"`).
3. **Identity rerank** — placeholder for Phase D's cross-encoder.
4. **Passage extraction** — libzim snippets rendered to markdown.
5. **Section attribution** — best-effort lookup via `EntryBundle`;
   passages get `cite_id = "{archive}/{entry_path}#{section_id}"`
   when the snippet text is found in a section's char range. Bundle
   build failures keep the cite_id at entry level.
6. **Budget enforcement** — `output_char_budget` truncates the last
   passage; subsequent passages are dropped.
7. **Render + citations** — passages joined with `\n\n` and inline
   `[cite: ...]` markers; structured `Citation` list deduplicated by
   `cite_id`.

Zero hits returns an empty response with `_meta.reason="0_hits"`.
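
Stage 2 uses Reciprocal Rank Fusion, where each document scores the sum of `1 / (k + rank)` over the rankings it appears in. The sketch below shows that formula with `k=60`; keying on plain title strings and breaking ties alphabetically are assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists (best first) into one list using RRF."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Descending fused score, then id, so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


fused = rrf_fuse([
    ["Berlin", "Munich", "Hamburg"],   # hits from one archive
    ["Munich", "Hamburg", "Berlin"],   # hits from another archive
])
print(fused)  # ['Munich', 'Berlin', 'Hamburg']
```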

### Other

- Extended `tool_error()` with an `extras: Optional[Dict[str, Any]]`
  kwarg so error payloads can carry self-correction hints (e.g. the
  `available_section_ids` list above) without `# type: ignore` at
  call sites.
- New tests: `tests/test_get_section.py` (4),
  `tests/test_synthesize.py` (~20 unit + 3 end-to-end),
  `tests/test_golden_v2_phase_c.py` (3 `get_section` + 3 `synthesize`
  snapshots, deterministic via the new `v2_phase_c_zim` heading-rich
  fixture). `test_response_contract` exempts both new tools from the
  list-pagination contract while still asserting `_meta` is present.
- The Phase A `_meta` envelope continues to attach on every response.
  `_meta.truncated` is now correctly forwarded by `get_section_data`
  on truncation (was a hidden gap in earlier scaffolding).

## [2.0.0a3] — 2026-05-10 (alpha pre-release)

v2 Phase C, part 1: EntryBundle infrastructure and the four-tool collapse.
**One wire-format break** (TOC heading field rename). Phase C's other two
items — #7 `get_section` and #10 `synthesize` mode — are deferred to a
later alpha; their TypedDicts ship in this release as forward-declared
contract surface.

### #11 EntryBundle (internal — collapses four tools)

First touch of an entry runs ONE HTML parse → produces a single
`EntryBundle` value cached at `bundle:v2c:{validated_path}:{entry_path}`.
The four content-shape tools `get_entry_summary`, `get_table_of_contents`,
`get_article_structure`, and `extract_article_links` collapse from
independent HTML re-parsers to thin slicers over the bundle. First touch
parses HTML once; subsequent calls (across all four tools, in any order)
hit the bundle cache.

Removed legacy per-tool cache prefixes: `summary_data:`, `toc_data:`,
`structure_data:`, `links_full:v2b:`. Wire formats unchanged for
`get_entry_summary`, `get_article_structure`, `extract_article_links`.

### Breaking — `get_table_of_contents`

| Field | Before | After |
|---|---|---|
| TOC heading identifier | `heading["id"]` | `heading["section_id"]` |
| TOC list element type | `dict[str, Any]` | `TocHeading` TypedDict |

The value is unchanged (still `resolve_heading_id()`'s output with slug
fallback). The new field name is what `get_section(section_id=...)` will
consume in the next alpha. The old `id_source` field is dropped
(debugging-only, not a contract surface).

### Forward-declared TypedDicts (no behavior yet)

The following TypedDicts ship in `openzim_mcp/tool_schemas.py` as part of
this release so a4's implementation tasks don't have to re-litigate the
contract surface. They aren't returned by any tool yet.

- `GetSectionResponse` — for #7 `get_section` (a4)
- `SynthesizeResponse`, `Citation`, `SynthesizePassage` — for #10
  synthesize mode on `zim_query` (a4)
- `EntryBundle`, `SectionMeta`, `InfoboxField`, `InfoboxData`,
  `LinkBuckets` — bundle internals (already used by the four-tool collapse)
- `TocHeading` — already used (wire format)

### Configuration

`OpenZimMcpConfig.synthesize` block added with defaults `top_n=5`,
`per_archive_k=10`, `output_char_budget=4800`. Inert until `synthesize`
mode ships.

### Other

- New module: `openzim_mcp/bundle.py` — `extract_entry_bundle`,
  `get_or_build_bundle`.
- New tests: `tests/test_bundle.py` (15 tests covering bundle
  determinism, parent/child range nesting, eviction-rebuild identity).
- Cross-tool shared-bundle assertion in `tests/test_structured_tool_output.py`
  guards the "one parse per entry across all four tools" invariant.
- Housekeeping: removed stale `[[tool.mypy.overrides]] module = ['libzim']`
  from `pyproject.toml`. Added GitHub labels `v2`, `v2-phase-a/b/c`;
  applied `v2`/`v2-phase-b` retroactively to PR #111.

### Deferred to a later alpha

- **#7 `get_section`** — section-level retrieval by `section_id`. The
  `GetSectionResponse` TypedDict ships now; the data layer and tool
  registration land in a4.
- **#10 `synthesize`** — `zim_query(synthesize=True)` mode. The
  `SynthesizeResponse`/`Citation`/`SynthesizePassage` TypedDicts and
  `SynthesizeConfig` ship now; the pipeline (`openzim_mcp/synthesize.py`,
  per-archive search, RRF fusion, passage extraction, section attribution,
  citation rendering, budget enforcement) lands in a4.

## [2.0.0a2] — 2026-05-09 (alpha pre-release)

v2 Phase B: response-contract migration. **Wire-format break** for every
list-returning tool. v1.x users upgrading must update response-shape parsing.

### Breaking — pagination contract

Every list-returning tool now returns the same five contract keys:
`results`, `next_cursor`, `total`, `done`, `page_info`.

| Tool | What changed |
|---|---|
| `search_zim_file` | `total_results` → `total`; `pagination` block flattened; `next_cursor` at top level; `cursor` accepted as input |
| `search_all` | `per_file` → `results`; per-archive blocks each carry the new contract via `result` field; cursor lives only at the per-archive level |
| `search_with_filters` | now returns structured dict (was markdown); same shape as `search_zim_file` plus `namespace_filter`/`content_type_filter` |
| `find_entry_by_title` | gets the contract; `done=true`, `total=len(results)` always |
| `get_search_suggestions` | `suggestions` → `results`; `count` removed |
| `browse_namespace` | `entries` → `results`; `total_in_namespace` → `total`; `total_in_namespace_is_lower_bound` → `page_info.total_is_lower_bound`; `has_more` removed |
| `walk_namespace` | `cursor` input/output type changes from `int` to opaque `str`; `entries` → `results`; `total` is always `null`; `total_entries` deprecated alias removed |
| `list_namespaces` | `namespaces[<letter>].entry_count` → `total`; payload at top level (no `result` wrapper) |
| `extract_article_links` | **`kind` is now required (default `"internal"`)**; single category per call; `internal_links`/`external_links`/`media_links` parallel arrays removed; `category_totals: {internal, external, media}` added |
| `list_zim_files` | `files` → `results`; `count` → `total` |
| `get_related_articles` | `outbound_results` → `results`; anticipates Phase E inbound-link feature |
| `get_zim_entries` (batch) | gets the contract; `done=true`, `total=len(results)` |
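
A client consuming the shared contract can page with one loop regardless of tool. The sketch below uses a hypothetical `fake_search` in place of a real tool call; only the five contract keys come from this changelog.

```python
def fake_search(cursor=None):
    """Hypothetical stand-in for a list-returning tool call."""
    pages = {
        None: {"results": ["A", "B"], "next_cursor": "page2",
               "total": 3, "done": False, "page_info": {}},
        "page2": {"results": ["C"], "next_cursor": None,
                  "total": 3, "done": True, "page_info": {}},
    }
    return pages[cursor]


def collect_all(fetch):
    """Follow next_cursor until the tool reports done."""
    items, cursor = [], None
    while True:
        page = fetch(cursor)
        items.extend(page["results"])
        if page["done"] or page["next_cursor"] is None:
            return items
        cursor = page["next_cursor"]


all_items = collect_all(fake_search)
print(all_items)  # ['A', 'B', 'C']
```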

### Breaking — TypedDict everywhere

Every dict-returning tool migrated from `Dict[str, Any]` to a per-tool
TypedDict. FastMCP now emits payloads at the top level of `structuredContent`
with real schemas. FastMCP continues to wrap
`Union[<Response>, ToolErrorPayload]` returns in a `{"result": ...}` envelope
at the wire level (this is FastMCP's behavior for any `Union` return type
with multiple non-None members). Clients that previously parsed
`structuredContent.result` keep working. The TypedDict change ensures the
inner payload now carries a real schema instead of `Dict[str, Any]`.

Migrated tools (TypedDict-only, no contract): `get_zim_metadata`,
`get_zim_entry`, `get_main_page`, `get_entry_summary`,
`get_table_of_contents`, `get_article_structure`, `get_binary_entry`.

### Cursor format

Cursors are URL-safe base64 JSON: `{v: 2, t: <tool_name>, s: <state>}`.
**Tool-bound** — a search cursor passed to browse raises a clear error.
**Versioned** — adding new fields later doesn't break the wire format.
**Archive-bound** — cursors for archive-specific tools carry `s.ai` (a
short SHA-256 token of the validated archive path); resubmitting a
cursor against a different archive is rejected. v=1 cursors are
rejected so callers re-fetch rather than silently follow stale state.
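
The cursor shape can be illustrated with a small encode/decode pair. This is a sketch under assumptions: the function names and `CursorMismatch` class are hypothetical (the project's own API is `Cursor.encode/decode` and `CursorMismatchError`), the 2 KB cap is borrowed from the later simple-mode fix, and the `s.ai` archive binding is left out.

```python
import base64
import json


class CursorMismatch(ValueError):
    """Hypothetical error type for this sketch."""


def encode_cursor(tool, state, version=2):
    raw = json.dumps({"v": version, "t": tool, "s": state}, separators=(",", ":"))
    return base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")


def decode_cursor(token, expected_tool, max_len=2048):
    if len(token) > max_len:
        raise CursorMismatch("cursor too long")
    try:
        payload = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    except ValueError as exc:  # bad base64, bad UTF-8, or bad JSON
        raise CursorMismatch("cursor is not valid base64 JSON") from exc
    if not isinstance(payload, dict) or payload.get("v") != 2:
        raise CursorMismatch("unsupported cursor version")
    if payload.get("t") != expected_tool:
        raise CursorMismatch("cursor belongs to a different tool")
    return payload.get("s", {})


token = encode_cursor("search_zim_file", {"o": 20})
print(decode_cursor(token, "search_zim_file"))  # {'o': 20}
```

Passing the same token with `expected_tool="browse_namespace"` raises `CursorMismatch`, which mirrors the tool-bound rule described above.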

### Compat shim removed

`openzim_mcp.zim.archive.PaginationCursor` (the v1 cursor class) is removed.
Use `openzim_mcp.pagination.Cursor` instead.

### Other

- New module: `openzim_mcp/pagination.py` — `Cursor.encode/decode`, `CursorMismatchError`.
- New module: `openzim_mcp/tool_schemas.py` — every per-tool response TypedDict.
- New tests: `tests/test_pagination_cursor.py`, `tests/test_response_contract.py`, `tests/test_golden_v2_phase_b.py`.
- The Phase A `_meta` envelope is unchanged in shape and still populates on every response.

## [2.0.0a1] — 2026-05-08

> First v2 pre-release. Phase A of the multi-phase v2 effort. All changes additive at the tool-signature layer; small compact-mode prose change for empty search results (see Changed below).

### Added

* **meta:** every dict-returning tool now includes a `_meta` envelope (`tokens_est`, `chars`, `truncated`, `more_at_offset`, `total_chars`, `suggestions`, `reason`). `tokens_est` uses tiktoken `cl100k_base` with a 5% pad. (#5)
* **simple:** compact-mode responses gain a one-line markdown blockquote footer (`> ~4.2K tokens · ...`). Set `OPENZIM_MCP_META__FOOTER_ENABLED=false` to suppress. (#5)
* **content:** in compact mode, `.infobox` / `.vcard` tables emit a Markdown KV list prepended to the body. (#2)
* **content:** in compact mode, tables exceeding row or character thresholds are replaced with `[Table N: ...]` placeholders. (#2)
* **search:** every search response is query-aware — snippets contain the actual matched passage (with `**bold**` highlights, capped at 5 hits) rather than the article lead. (#1)
* **search:** `_meta.suggestions[]` surfaces typo variants (`alt_spelling`) and other-archive candidates (`alt_archive`) for empty / low-confidence searches. (#4)
* **search:** `find_entry_by_title` fuzzy fallback now triggers whenever no result clears 0.7 (previously only on zero hits). Score and length-gate are configurable via `OPENZIM_MCP_SEARCH__FUZZY_TITLE_*`. (#14)

### Changed

* **simple:** compact-mode empty-result prose now renders via the new footer + structured suggestions instead of the v1.2.0 paragraph. The information is one-for-one; the format is more model-readable. `compact=False` paths retain byte-identical v1.2.0 behavior. (#4)
* **search:** `find_entry_by_title` typo-corrected hits now score `0.85` (was hardcoded `0.7`) by default. (#14)

### Dependencies

* Added `tiktoken>=0.7.0` to core dependencies.

## [1.3.0](https://github.com/cameronrye/openzim-mcp/compare/v1.2.0...v1.3.0) (2026-05-08)


### Features

* v1.2.0 follow-up — refinements + production-readiness improvements ([#106](https://github.com/cameronrye/openzim-mcp/issues/106)) ([e9396ec](https://github.com/cameronrye/openzim-mcp/commit/e9396ec699a62d4b0ec990d1a155b0ae7ddb73dd))

## [1.2.0](https://github.com/cameronrye/openzim-mcp/compare/v1.1.2...v1.2.0) (2026-05-06)


### Features

* **http:** operator-acknowledged auth bypass + rate-limit env-var docs ([#104](https://github.com/cameronrye/openzim-mcp/issues/104)) ([7294b1d](https://github.com/cameronrye/openzim-mcp/commit/7294b1d33dfa40a8e07c7ba67c062cc2d3c741c7))
* v1.2.0 simple-mode tool ergonomics — tell_me_about, bigger snippets, compact pagination ([#103](https://github.com/cameronrye/openzim-mcp/issues/103)) ([212a60a](https://github.com/cameronrye/openzim-mcp/commit/212a60afd7c1e66bd573605925ccbb06261ed27c))

## [1.1.2](https://github.com/cameronrye/openzim-mcp/compare/v1.1.1...v1.1.2) (2026-05-05)


### Bug Fixes

* **server:** mirror cors_origins into SDK transport allowed_origins ([#100](https://github.com/cameronrye/openzim-mcp/issues/100)) ([96001d1](https://github.com/cameronrye/openzim-mcp/commit/96001d1365933cd948027e6291c34d26234792fe))

## [1.1.1](https://github.com/cameronrye/openzim-mcp/compare/v1.1.0...v1.1.1) (2026-05-05)


### Bug Fixes

* walk_namespace, related-articles, and confidence beta-refinement fixes ([#98](https://github.com/cameronrye/openzim-mcp/issues/98)) ([912d346](https://github.com/cameronrye/openzim-mcp/commit/912d34607220d2a3f7b61d0f39cff918d23c3f99))

## [1.1.0](https://github.com/cameronrye/openzim-mcp/compare/v1.0.1...v1.1.0) (2026-05-05)


### Features

* tool responses use MCP structured content (no more double-stringified JSON) ([#96](https://github.com/cameronrye/openzim-mcp/issues/96)) ([5b541ec](https://github.com/cameronrye/openzim-mcp/commit/5b541ec616386128e6a9d105f07c27bc94676265))


### Bug Fixes

* **http:** allow MCP-Protocol-Version header and DELETE method in CORS ([#93](https://github.com/cameronrye/openzim-mcp/issues/93)) ([dbb791e](https://github.com/cameronrye/openzim-mcp/commit/dbb791e5a6b90a229283ee0a6a283615deae6e42))
* namespace, pagination, resources, and find-by-title beta-test fixes ([#92](https://github.com/cameronrye/openzim-mcp/issues/92)) ([4b572ef](https://github.com/cameronrye/openzim-mcp/commit/4b572efa9acdec84551163e0170c3e6569c28151))
* **server:** make simple mode actually expose only zim_query ([#94](https://github.com/cameronrye/openzim-mcp/issues/94)) ([92c725f](https://github.com/cameronrye/openzim-mcp/commit/92c725f0cffd43f592185c0f2173f4c6ca0ef1e4))

## [1.0.1](https://github.com/cameronrye/openzim-mcp/compare/v1.0.0...v1.0.1) (2026-05-04)


### Bug Fixes

* **http:** allow operator-configured Host allowlist ([#90](https://github.com/cameronrye/openzim-mcp/issues/90)) ([c4dad8a](https://github.com/cameronrye/openzim-mcp/commit/c4dad8a4eb0147178ca268403d85f90530290fe4))

## [1.0.0](https://github.com/cameronrye/openzim-mcp/compare/v0.9.0...v1.0.0) (2026-05-03)

Includes an end-to-end review pass before tagging — security hardening, correctness fixes, performance work, and a refactor that splits `zim_operations.py` into a `zim/` package via mixin classes. See sections below.

### Features

* **http:** streamable HTTP transport with bearer-token auth, CORS allow-list, and `/healthz`/`/readyz` endpoints
* **http:** safe-default startup check refuses to bind a non-localhost host without an auth token
* **transport:** legacy SSE transport (`--transport sse`) for clients that haven't migrated to streamable-HTTP; bound to localhost only, no auth/CORS middleware
* **docker:** multi-stage, multi-arch (`linux/amd64`, `linux/arm64`) image published to `ghcr.io/cameronrye/openzim-mcp`, runs as non-root with a built-in health check
* **content:** `get_zim_entries` batch tool — fetch up to 50 entries in one call, with per-entry success/error reporting
* **resources:** per-entry `zim://{name}/entry/{path}` resource serves entries with their native MIME type (clients must URL-encode `/` as `%2F` in the path segment)
* **subscriptions:** clients can subscribe to `zim://files` and `zim://{name}`; mtime-polling watcher emits `notifications/resources/updated` when allowed directories or `.zim` files change
* **search:** opaque `cursor` parameter on `search_zim_file` for resumable pagination
* **simple:** intent pattern routes batch retrieval queries to `get_zim_entries`

### Improvements

* **content:** `get_related_articles` resolves relative hrefs against the source entry's directory and detects the content namespace correctly on domain-scheme archives (previously returned nothing)
* **content:** suggestion fallback uses `SuggestionSearcher(archive).suggest(text)` (the prior `archive.suggest()` call did not exist)
* **tools:** `list_zim_files` accepts a case-insensitive `name_filter` substring argument; one shared cache slot regardless of filter value
* **content:** `get_zim_entries` accepts bare entry-path strings paired with a `zim_file_path` default (dicts still work for multi-archive batches)
* **content:** heading-id resolution falls through `id` → mw-headline anchor → preceding `<a name="">` → slug, returning `(id, source)` so consumers can distinguish real anchors from synthetic slugs
* **content:** summary extraction skips USWDS banners and skip-nav blocks above the first `<h1>` (MedlinePlus / NIH / NIST style sites)
* **content:** link extraction drops non-navigable schemes (`javascript:`, `mailto:`, `tel:`, `data:`, `blob:`, `vbscript:`)
* **server:** `__version__` reads from `importlib.metadata`; `serverInfo.version` reports openzim-mcp's actual version (no longer the FastMCP SDK default)
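
As a rough illustration of the link-extraction change, the sketch below filters hrefs by scheme. The scheme list is taken from the entry above; the function name `is_navigable` and the use of `urllib.parse.urlsplit` are assumptions, not necessarily how the project does it.

```python
from urllib.parse import urlsplit

# Schemes listed in the changelog entry as non-navigable.
NON_NAVIGABLE_SCHEMES = frozenset(
    {"javascript", "mailto", "tel", "data", "blob", "vbscript"}
)


def is_navigable(href: str) -> bool:
    """Return False for links whose scheme cannot be followed inside an archive."""
    # urlsplit() lowercases the scheme; relative links have an empty scheme.
    scheme = urlsplit(href.strip()).scheme
    return scheme not in NON_NAVIGABLE_SCHEMES


links = ["../A/Photosynthesis", "JavaScript:void(0)", "mailto:someone@example.org"]
navigable = [href for href in links if is_navigable(href)]
```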

### Removed

* **tools:** advanced-mode tool surface drops 27 → 21. Removed: `warm_cache`, `cache_stats`, `cache_clear`, `get_random_entry`, `diagnose_server_state`, `resolve_server_conflicts`. The cache itself remains; the explicit management tools were dropped.
* **instance:** multi-instance conflict tracking removed; `instance_tracker.py` deleted. HTTP server instances coexist freely.

### Bug Fixes

* **content:** sanitize per-entry paths in `get_zim_entries` and expand test coverage
* **resources:** per-entry `zim://` returns libzim's native MIME type
* **http:** start subscription watcher via wrapped lifespan
* **instance:** relax conflict logic for HTTP transport so multiple HTTP server instances can coexist

### Security

* **errors:** redact absolute paths from MCP error responses (rejected traversals previously leaked the canonical allowed-directory layout)
* **errors:** regex-based path redaction with cross-platform separator handling and tightened lookbehind so wrapped/quoted paths (`(/opt/foo)`, `"/opt/bar"`) no longer slip through
* **diagnostics:** redact filesystem paths and PIDs in `get_server_health` / `get_server_configuration` responses (no longer transport-gated; always redacted)
* **resources:** sanitize URI-decoded entry paths before passing to libzim
* **search:** always sanitize `zim_file_path` in `find_entry_by_title` (previously skipped when `cross_file=True`)
* **prompts:** strip control characters and cap user-supplied arguments before interpolating into MCP prompt bodies; re-check empty after sanitization to avoid empty `('', ...)` tool calls
* **http:** require auth on `OPTIONS /mcp` (the unconditional preflight bypass let unauthenticated callers probe the endpoint)
* **http:** resolve `localhost` before classifying as loopback; warn and fall through to the public-host path when `/etc/hosts` maps it elsewhere
* **rate-limit:** make global + per-operation acquire atomic; concurrent callers no longer transiently over-consume the global bucket
* **rate-limit:** per-client buckets with LRU eviction (10k cap) — infrastructure ready for HTTP context wiring
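
The per-client buckets with LRU eviction might look roughly like the following. This is a sketch under assumptions: the class names, the token-bucket parameters, and the `OrderedDict`-based eviction are illustrative; only the 10k cap and the LRU policy come from the entry above.

```python
import time
from collections import OrderedDict


class TokenBucket:
    """Simple token bucket; refills continuously up to `capacity`."""

    def __init__(self, capacity: float, refill_per_second: float) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = capacity
        self.updated = time.monotonic()

    def try_acquire(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class ClientBuckets:
    """Per-client buckets; the least recently used client is evicted at the cap."""

    def __init__(self, max_clients: int = 10_000, capacity: float = 10.0,
                 refill_per_second: float = 1.0) -> None:
        self.max_clients = max_clients
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._buckets = OrderedDict()  # client_id -> TokenBucket, oldest first

    def get(self, client_id: str) -> TokenBucket:
        bucket = self._buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_per_second)
            self._buckets[client_id] = bucket
            if len(self._buckets) > self.max_clients:
                self._buckets.popitem(last=False)  # drop the LRU client
        else:
            self._buckets.move_to_end(client_id)  # mark as most recently used
        return bucket

    def client_ids(self) -> list:
        return list(self._buckets)
```

A real server would also need a lock around `get` and `try_acquire` when called from multiple threads; that is omitted here for brevity.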

### Correctness

* **search:** reject mismatched `cursor` and `query` arguments instead of silently applying the cursor's offsets to a different query
* **cache:** stop caching error sentinels and zero-result responses (previously a transient libzim error or index warmup poisoned the cache for the full TTL); audit follow-up extends the gate to `get_search_suggestions`, `get_entry_summary`, `get_table_of_contents`
* **cache:** treat empty-string cache values as hits, not misses
* **content:** resolve redirects to their target before rendering; cache the resolved path so subsequent lookups skip the chain; reject redirect cycles and chains deeper than `MAX_REDIRECT_DEPTH = 10`
* **content:** instantiate `html2text.HTML2Text` per call to eliminate a shared-state race that corrupted concurrent conversions
* **content:** preserve Unicode in heading slugs (Arabic, Chinese, Cyrillic, Japanese ZIMs no longer produce empty TOC anchors); disambiguate duplicate heading slugs with `_2`, `_3` suffixes
* **content:** drop trailing punctuation from path tokens extracted by the simple-tools `get_zim_entries` parser
* **simple-tools:** dispatch the `get_zim_entries` intent (was silently falling through to `search_zim_file`); honor explicit `zim_file_path` for `walk_namespace`, `find_by_title`, and `related` intents
* **subscriptions:** detect same-size ZIM replacements via mtime change (size-only detection silently missed identical-size replacements)
* **validation:** `browse_namespace` and `walk_namespace` parameter checks now raise `OpenZimMcpValidationError` instead of `OpenZimMcpArchiveError` or markdown error strings; bound `walk_namespace` `limit` to `[1, 500]` per the documented contract
* **validation:** validate `get_zim_entries` batch size before charging rate-limit so an oversized batch doesn't increment the limiter
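
The heading-slug behaviour described above (Unicode preserved, duplicates suffixed `_2`, `_3`) can be sketched as follows. The exact slug format is an assumption (lowercased, hyphen-separated); only the Unicode preservation and the suffix scheme come from the changelog.

```python
import re


def slugify(heading: str) -> str:
    """Lowercase, keep Unicode word characters, join words with hyphens."""
    # For str patterns, \w matches Unicode letters and digits, so
    # non-Latin headings are not reduced to an empty string.
    cleaned = re.sub(r"[^\w\s-]", "", heading).strip().lower()
    return re.sub(r"[\s_-]+", "-", cleaned)


def unique_slugs(headings: list) -> list:
    """Append _2, _3, ... to repeated slugs so every anchor is distinct."""
    counts = {}
    slugs = []
    for heading in headings:
        base = slugify(heading)
        counts[base] = counts.get(base, 0) + 1
        n = counts[base]
        slugs.append(base if n == 1 else f"{base}_{n}")
    return slugs


print(unique_slugs(["History", "История", "History", "History"]))
# ['history', 'история', 'history_2', 'history_3']
```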

### Performance

* **search:** skip-counter pagination in `_perform_filtered_search` (offset=900, limit=10 went from ~1000 backend calls to ~10)
* **content:** `get_entries` groups by ZIM file and opens each archive once
* **navigation:** cache namespace listings per `(archive, namespace)`; pagination now slices from cache instead of re-scanning
* **search:** hoist `Searcher` construction in `_find_entry_by_search` (up to 5 Xapian opens collapse to 1)
* **suggestions:** Strategy 2 uses libzim's `SuggestionSearcher` instead of a strided ID scan that skipped 95% of entries on large archives
* **subscriptions:** `SubscriberRegistry` is set-backed for O(1) subscribe/unsubscribe/clear; broadcast fans out concurrently with per-call `wait_for` timeout so one slow subscriber doesn't stall the watcher
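
The concurrent fan-out with a per-call timeout can be sketched with `asyncio.gather` and `asyncio.wait_for`. The function name `broadcast_updated`, the callable-based subscriber shape, and the timeout value are assumptions for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Iterable

Send = Callable[[str], Awaitable[None]]


async def broadcast_updated(
    subscribers: Iterable[Send], uri: str, timeout: float = 1.0
) -> int:
    """Notify every subscriber concurrently; return how many sends finished in time."""

    async def send_one(send: Send) -> bool:
        try:
            # Each send gets its own deadline, so one slow subscriber
            # cannot delay the others.
            await asyncio.wait_for(send(uri), timeout=timeout)
            return True
        except asyncio.TimeoutError:
            return False

    results = await asyncio.gather(*(send_one(s) for s in subscribers))
    return sum(results)
```

Because each send is wrapped individually, the whole broadcast takes roughly as long as the slowest send or the timeout, whichever is shorter, rather than the sum of all sends.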

### Refactoring

* **zim:** split `zim_operations.py` (3557 → 39 lines, pure shim) into a `zim/` package with `_SearchMixin`, `_ContentMixin`, `_StructureMixin`, `_NamespaceMixin`. Public API preserved via re-exports
* **simple-tools:** extract `IntentParser` into `intent_parser.py` (parsing logic now unit-testable without `ZimOperations` mocks)
* **config:** unify `RateLimitConfig` into a single Pydantic `BaseModel`; `per_operation_limits` is now reachable from environment variables and JSON config
* **defaults:** default cache `persistence_path` to `~/.cache/openzim-mcp` (absolute) rather than `.openzim_mcp_cache` (relative to CWD)
* **defaults:** relocate `MAX_REDIRECT_DEPTH` and `SUBSCRIPTION_SEND_SECONDS` to `defaults.py` (matches existing project pattern)
* **resources:** offload blocking `list_zim_files_data` directory scan via `asyncio.to_thread`
* **resources:** extract `_resolve_zim_name` helper, replacing duplicated inline ZIM-name match loops
* **simple-tools:** intent confidence boost capped (low-priority intents with extracted params can no longer overtake higher-priority param-less intents)
* **prompts:** dedupe ask-for-args message into a `_ask_for_args(prompt_name)` helper

### Hardening (other)

* **cache:** validate values are JSON-serializable at write time when persistence is enabled (previously `default=str` silently coerced non-JSON types)
* **security:** add an unconditional `..` pattern to path normalization so embedded `foo..bar` traversal candidates trigger the regex layer
* **exceptions:** drop `details` from `Exception.args` so it no longer leaks into `repr()` and tracebacks
* **main:** route startup banner through the logger (now respects `OPENZIM_MCP_LOGGING__LEVEL`)
* **simple-tools:** consistently append low-confidence note across all intents (was missing on `search_all`, `walk_namespace`, `find_by_title`, `related`)

### Pre-release fix-up

Final bug-sweep passes after the main review work above. Categorised by area for easier scanning.

* **content/structure:** `_resolve_entry_with_fallback` and `get_binary_entry` now follow the redirect chain (bounded by the shared `MAX_REDIRECT_DEPTH = 10` cap with cycle detection) before calling `entry.get_item()`. Without this the structure, links, TOC, summary, and binary-entry tools all crashed with `RuntimeError` from libzim whenever the requested path was a redirect to the canonical article (the common case for Kiwix-generated ZIMs)
* **content:** `_get_main_page_content` resolves `archive.main_entry` and the fallback `main_page_paths` entries before calling `get_item()`. Most ZIMs point `W/mainPage` at the real article via a redirect; previously this raised on every such archive
* **content:** `get_zim_metadata` resolves redirect entries before reading metadata content
* **content:** `get_related_articles` preserves trailing slash in path resolution and resolves relative links against the post-redirect path
* **zim:** `_resolve_link_to_entry_path` rejects self-referential refs that previously fed back into the resolver
* **search:** `_perform_filtered_search` canonicalises lowercase / long-form namespace input so filters stop silently dropping every result; suggestion cache now skips zero-result responses
* **search:** `search_all` validates that `effective_limit` falls within the documented 1-50 range
* **simple-tools:** `get_article` intent forwards `options[content_offset]` so simple-mode pagination works (previously always returned page 1); passthrough intents forward `options[limit]` / `options[offset]`
* **subscriptions:** `broadcast_resource_updated` re-raises `CancelledError` that `gather(return_exceptions=True)` had silently collected, so `stop()` no longer hangs until the next sleep tick
* **subscriptions:** `MtimeWatcher.start()` offloads initial `_scan` via `asyncio.to_thread` to match `_tick`, no longer blocking the ASGI lifespan on slow filesystems
* **subscriptions:** mtime scan offloaded to thread; fan-out cleanup guarded against late exceptions
* **prompts:** switch user-input interpolation delimiter to backticks and preserve quotes in user input
* **rate-limit:** add missing `RATE_LIMIT_COSTS` keys for `find_entry_by_title`, `get_zim_entries`, `get_related_articles` (were silently using the cost=1 default)
* **http:** add `Mcp-Session-Id` to CORS `allow_headers` and `expose_headers` so browser MCP clients can resume sessions
* **main:** catch `pydantic.ValidationError` from `OpenZimMcpConfig` construction and re-surface as `OpenZimMcpConfigurationError` so operators see a clean message instead of a pydantic validation dump
* **cache:** suppress shutdown logging spam; tolerate malformed persisted entries
* **security:** symlink-tighten archive scan; harden error context; sanitise `name_filter`; reject whitespace-only CORS wildcard
* **tools:** `get_binary_entry` docstring example uses keyword `include_data=False` (positional `False` was landing in `max_size_bytes`)
* **packaging:** `Development Status :: 5 - Production/Stable` classifier for the 1.0.0 release
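
Several fixes in this release follow a redirect chain before calling `entry.get_item()`. A minimal sketch of bounded resolution with cycle detection is shown below. It assumes entries expose `path`, `is_redirect`, and `get_redirect_entry()` (as python-libzim entries appear to); `FakeEntry` and `RedirectError` are stand-ins so the example runs without libzim, whereas the changelog states the project raises `OpenZimMcpArchiveError`.

```python
MAX_REDIRECT_DEPTH = 10  # cap stated in the changelog


class RedirectError(Exception):
    """Stand-in for the project's archive error type."""


def resolve_redirects(entry, max_depth: int = MAX_REDIRECT_DEPTH):
    """Follow redirects to the final entry, rejecting cycles and deep chains."""
    seen = {entry.path}
    for _ in range(max_depth):
        if not entry.is_redirect:
            return entry
        entry = entry.get_redirect_entry()
        if entry.path in seen:
            raise RedirectError(f"redirect cycle at {entry.path!r}")
        seen.add(entry.path)
    if entry.is_redirect:
        raise RedirectError(f"redirect chain deeper than {max_depth}")
    return entry


class FakeEntry:
    """In-memory entry; `table` maps a path to its redirect target or None."""

    def __init__(self, path: str, table: dict) -> None:
        self.path = path
        self._table = table

    @property
    def is_redirect(self) -> bool:
        return self._table[self.path] is not None

    def get_redirect_entry(self) -> "FakeEntry":
        return FakeEntry(self._table[self.path], self._table)


table = {"A/USA": "A/United_States", "A/United_States": None}
print(resolve_redirects(FakeEntry("A/USA", table)).path)  # A/United_States
```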

### Final pre-release sweep

* **resource:** `ZimEntryResource.read` and the `zim://files` / `zim://{name}` resource handlers now offload archive opens via `asyncio.to_thread`; previously a single read stalled the HTTP/SSE event loop for every other concurrent client
* **resource:** `ZimEntryResource.read` resolves redirect chains (with cycle detection and the shared `MAX_REDIRECT_DEPTH = 10` cap) before `entry.get_item()`; previously every redirect-stub path crashed with `RuntimeError` from libzim
* **content:** `get_zim_entries` (batch) replaces manual `__enter__`/`__exit__` with a regular `with` block — cleaner cleanup on `BaseException`, no silent swallowing of `__exit__` errors
* **content:** drop `_get_main_page_content`'s `archive._get_entry_by_id(0)` fallback (libzim private API; entry-zero is not the spec's main-page pointer); the inline redirect helper now uses `MAX_REDIRECT_DEPTH` and raises `OpenZimMcpArchiveError` on cycles or chain exhaustion to match the rest of the redirect helpers
* **server:** `OpenZimMcpServer.run()` defaults to `self.config.transport` (translating the short name `'http'` to FastMCP's `'streamable-http'`) and rejects an explicit `transport=` argument that contradicts the configured value — closes the gap where HTTP-mode subscriptions could be wired while a stdio transport was actually started
* **search/structure:** `find_entry_by_title`, `search_all`, and `get_related_articles` raise `OpenZimMcpValidationError` on out-of-range `limit` / `limit_per_file` instead of returning a hand-formatted markdown string, so the tool layer sees a consistent exception shape
* **http:** `_is_loopback_host` adds a 1-second timeout around `socket.gethostbyname("localhost")` so a slow resolver can't hang server startup
* **ci:** drop `pull_request_target` trigger from `test.yml` / `codeql.yml` / `performance.yml` (closes the pwn-request gap where untrusted PR code could exfiltrate secrets); release-please prerelease detection reads the resolved tag name (works for `workflow_dispatch`); release-please bootstrap-sha placeholders removed; Dockerfile uv image pinned to `0.11`
* **make:** `make benchmark` selects via `-k benchmark` (the previously referenced `tests/test_benchmarks.py` does not exist); `make security` no longer swallows bandit / pip-audit non-zero exits, so `make check` (used by `release.yml`) actually fails on findings
* **docs:** `OPENZIM_MCP_TOOL_MODE`, `_TRANSPORT`, `_HOST`, `_PORT`, `_AUTH_TOKEN`, `_CORS_ORIGINS`, `_WATCH_INTERVAL_SECONDS`, `_SUBSCRIPTIONS_ENABLED` documented in the README configuration table; install commands aligned across README / `website/llms.txt` / `website/index.html` (lead with `uv tool install openzim-mcp`); `website/llm.txt` renamed to `website/llms.txt` (matches the [llmstxt.org](https://llmstxt.org) convention) and advertised in the sitemap

## [0.9.0](https://github.com/cameronrye/openzim-mcp/compare/v0.8.3...v0.9.0) (2026-04-30)

### Features

* **search:** `search_all` queries every ZIM file in allowed directories at once and merges results
* **search:** `find_entry_by_title` resolves a title (or partial title) to entry paths, case-insensitive, optionally cross-file
* **prompts:** MCP prompts (`/research`, `/summarize`, `/explore`) for multi-step ZIM workflows
* **resources:** MCP resources `zim://files` (index of all ZIM files) and `zim://{name}` (per-archive overview)
* **navigation:** `walk_namespace` for deterministic cursor-paginated namespace iteration (vs. `browse_namespace` which samples)
* **content:** `get_random_entry` to sample a random article
* **content:** `get_related_articles` returns link-graph nearest neighbours (outbound, inbound, or both)
* **server:** `warm_cache`, `cache_stats`, and `cache_clear` for inspecting and managing the in-memory cache

### Bug Fixes

* namespace listing deterministically surfaces minority namespaces (M, W, X, I) that random sampling could miss
* search filtering uses streaming scan instead of a hard 1000-hit cap, so rare-mime-type filters return matches that were previously hidden
* error messages route by failure mode first (no more "check disk space" for "entry not found")
* phantom server-instance conflicts are no longer reported (TOCTOU re-check before raising)

## [0.8.3](https://github.com/cameronrye/openzim-mcp/compare/v0.8.2...v0.8.3) (2026-01-30)

### Bug Fixes

* fix logo URL in README.md to use absolute GitHub raw URL for PyPI display ([README.md](README.md))
* resolve GitHub code scanning alert #133 - variable defined multiple times in security.py ([security.py](openzim_mcp/security.py))
* resolve GitHub code scanning alert #134 - mixed import styles in test_main.py ([test_main.py](tests/test_main.py))
* remove unused `contextlib` import from security.py (flake8 fix)

## [0.8.2](https://github.com/cameronrye/openzim-mcp/compare/v0.8.1...v0.8.2) (2026-01-29)

### Bug Fixes

* fix search pagination when offset exceeds total results ([zim_operations.py](openzim_mcp/zim_operations.py))
* improve exception handling in instance tracker for Python 3 compatibility ([instance_tracker.py](openzim_mcp/instance_tracker.py))
* add fallback to stderr for logging during shutdown ([instance_tracker.py](openzim_mcp/instance_tracker.py))
* improve Windows process checking with debug logging ([instance_tracker.py](openzim_mcp/instance_tracker.py))
* fix release workflow to skip automatic GitHub release creation ([release-please.yml](.github/workflows/release-please.yml))
* resolve linting issues in simple_tools.py and content_tools.py

## [0.8.1](https://github.com/cameronrye/openzim-mcp/compare/v0.7.1...v0.8.1) (2026-01-29)

### Features

* add article summaries, table of contents, and pagination cursors ([bf5d18f](https://github.com/cameronrye/openzim-mcp/commit/bf5d18fcfecb2e6b03c667565640439b145a4e30))

### Bug Fixes

* remove unused imports in test files for CI linting ([0ddb250](https://github.com/cameronrye/openzim-mcp/commit/0ddb250d49fb627ee7adb41cf3fa52a8caf69172))
* resolve GitHub code scanning alerts ([2ad2c56](https://github.com/cameronrye/openzim-mcp/commit/2ad2c56a6e7a958ed63d6bd23ad975dd80e1e1f0))

### Details

* **Article Summaries** (`get_entry_summary`): Extract concise article summaries from opening paragraphs
  * Removes infoboxes, navigation, and sidebars for clean summaries
  * Configurable `max_words` parameter (10-1000, default: 200)
  * Returns structured JSON with title, summary, word count, and truncation status
  * Useful for quick content preview without loading full articles

* **Table of Contents Extraction** (`get_table_of_contents`): Build hierarchical TOC from article headings
  * Hierarchical tree structure with nested children based on heading levels (h1-h6)
  * Includes heading text, level, and anchor IDs for navigation
  * Provides heading count and maximum depth statistics
  * Enables LLMs to navigate directly to specific article sections

* **Pagination Cursors**: Token-based pagination for easier result navigation
  * Base64-encoded cursor tokens encode offset, limit, and optional query
  * `next_cursor` field in search and browse results for continuation
  * Eliminates need for clients to track pagination state manually
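
A cursor token of this kind could be built as shown below. The JSON payload, its key names, and the URL-safe Base64 alphabet are assumptions; the changelog only states that tokens are Base64-encoded and carry offset, limit, and an optional query.

```python
import base64
import json
from typing import Optional


def encode_cursor(offset: int, limit: int, query: Optional[str] = None) -> str:
    """Pack pagination state into an opaque, URL-safe token."""
    payload = {"offset": offset, "limit": limit}
    if query is not None:
        payload["query"] = query
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(token: str) -> dict:
    """Recover the pagination state from a token produced by encode_cursor()."""
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))


# The client passes next_cursor back unchanged instead of tracking offsets.
next_cursor = encode_cursor(offset=20, limit=10, query="photosynthesis")
state = decode_cursor(next_cursor)
```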

### Enhanced

* **Intent Parsing**: Improved multi-match resolution with weighted scoring
  * Collects all matching patterns before selecting best match
  * Weighted scoring: 70% confidence + 30% specificity
  * Prevents earlier patterns from incorrectly shadowing more specific ones
  * New intent patterns for "toc" and "summary" queries in Simple mode

* **Simple Mode**: Added natural language support for new features
  * "summary of Biology" or "summarize Evolution" for article summaries
  * "table of contents for Biology" or "toc of Evolution" for TOC extraction
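
The weighted multi-match resolution can be illustrated with a short sketch. The 70/30 weights come from the entry above; the `IntentMatch` type, the 0 to 1 scale for both inputs, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntentMatch:
    intent: str
    confidence: float   # assumed to be in [0, 1]
    specificity: float  # assumed to be in [0, 1]


def score(match: IntentMatch) -> float:
    """Weighted score: 70% confidence plus 30% specificity."""
    return 0.7 * match.confidence + 0.3 * match.specificity


def best_match(matches: list) -> IntentMatch:
    """Collect all matches first, then pick the highest-scoring one."""
    return max(matches, key=score)


candidates = [
    IntentMatch("search", confidence=0.80, specificity=0.20),  # generic pattern
    IntentMatch("toc", confidence=0.75, specificity=0.90),     # specific pattern
]
winner = best_match(candidates)
```

Here the more specific `toc` pattern wins even though the generic `search` pattern has slightly higher confidence, which is the shadowing problem the weighted score is meant to avoid.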

## [0.7.1](https://github.com/cameronrye/openzim-mcp/compare/v0.7.0...v0.7.1) (2026-01-28)

### Bug Fixes

* **ci:** handle existing GitHub releases in release workflow ([#54](https://github.com/cameronrye/openzim-mcp/issues/54)) ([63afa3d](https://github.com/cameronrye/openzim-mcp/commit/63afa3d9150a60716b7fa25524beedb806ded84d))

## [0.7.0](https://github.com/cameronrye/openzim-mcp/compare/v0.6.3...v0.7.0) (2026-01-28)

### Features

* add binary content retrieval for PDFs, images, and media files ([#52](https://github.com/cameronrye/openzim-mcp/issues/52)) ([95611c9](https://github.com/cameronrye/openzim-mcp/commit/95611c9135836202d1fc97181d98307c199e3888))

## [0.6.3](https://github.com/cameronrye/openzim-mcp/compare/v0.6.2...v0.6.3) (2025-11-14)

### Bug Fixes

* configure release-please to skip GitHub release creation and handle existing PyPI packages ([b865454](https://github.com/cameronrye/openzim-mcp/commit/b8654546c1a8ea3a90eb3dedfb95c671beaaca98))

## [0.6.2](https://github.com/cameronrye/openzim-mcp/compare/v0.6.1...v0.6.2) (2025-11-14)

### Bug Fixes

* add tag_name parameter to GitHub Release action ([74d393c](https://github.com/cameronrye/openzim-mcp/commit/74d393c600155b303a26d6f066130cb26351cb49))

## [0.6.1](https://github.com/cameronrye/openzim-mcp/compare/v0.6.0...v0.6.1) (2025-11-14)

### Bug Fixes

* resolve CI workflow issues ([4bd6c33](https://github.com/cameronrye/openzim-mcp/commit/4bd6c332548a444c58390889052ebcc417d65094))

## [0.6.0](https://github.com/cameronrye/openzim-mcp/compare/v0.5.1...v0.6.0) (2025-11-14)

### Features

* add dual-mode support with intelligent natural language tool ([#31](https://github.com/cameronrye/openzim-mcp/issues/31)) ([6d97993](https://github.com/cameronrye/openzim-mcp/commit/6d97993a8bda3f20cc65abfeef459f9487b94406))
* enhance GitHub Pages website with dark mode, dynamic versioning, and improved UX ([#22](https://github.com/cameronrye/openzim-mcp/issues/22)) ([977d46a](https://github.com/cameronrye/openzim-mcp/commit/977d46abf61efbafca2bd24142176c3857cc32b8))

## [0.5.1](https://github.com/cameronrye/openzim-mcp/compare/v0.5.0...v0.5.1) (2025-09-16)

### Bug Fixes

* resolve CI/CD status reporting issue for bot commits ([#20](https://github.com/cameronrye/openzim-mcp/issues/20)) ([af23589](https://github.com/cameronrye/openzim-mcp/commit/af235896b4a1afd96269d08d97362ff903e093d5))
* resolve GitHub Actions workflow errors ([#17](https://github.com/cameronrye/openzim-mcp/issues/17)) ([dcda274](https://github.com/cameronrye/openzim-mcp/commit/dcda2749a394a599e3f77a4b64412fa21e65a29d))

## [0.5.0](https://github.com/cameronrye/openzim-mcp/compare/v0.4.0...v0.5.0) (2025-09-15)

### Features

* enhance GitHub Pages site with comprehensive feature showcase ([#14](https://github.com/cameronrye/openzim-mcp/issues/14)) ([c50c69b](https://github.com/cameronrye/openzim-mcp/commit/c50c69b73bc4ec142a2080146644ed9c84da63c4))
* enhance GitHub Pages site with comprehensive feature showcase and uv-first installation ([#15](https://github.com/cameronrye/openzim-mcp/issues/15)) ([f988c5a](https://github.com/cameronrye/openzim-mcp/commit/f988c5a9c7af4acbfe08922a68e11a288f06da70))

### Bug Fixes

* correct CodeQL badge URL to match workflow name ([#13](https://github.com/cameronrye/openzim-mcp/issues/13)) ([7446f74](https://github.com/cameronrye/openzim-mcp/commit/7446f7491d1c0a028a7ba55071b46c73424b58e4))

### Documentation

* comprehensive documentation update for v0.4.0+ features ([#16](https://github.com/cameronrye/openzim-mcp/issues/16)) ([e1bce58](https://github.com/cameronrye/openzim-mcp/commit/e1bce5816e95beca7adeca92c03dbd551808151f))
* improve installation instructions with PyPI as primary method ([d6f758b](https://github.com/cameronrye/openzim-mcp/commit/d6f758b30836e916933e87a316754cd757cec833))

## [0.4.0](https://github.com/cameronrye/openzim-mcp/compare/v0.3.3...v0.4.0) (2025-09-15)

### Features

* overhaul release system for reliability and enterprise-grade automation ([#9](https://github.com/cameronrye/openzim-mcp/issues/9)) ([ef0f1b8](https://github.com/cameronrye/openzim-mcp/commit/ef0f1b8f2eaac99a1850672088ddc29d28f0bcde))

## [0.3.1](https://github.com/cameronrye/openzim-mcp/compare/v0.3.0...v0.3.1) (2025-09-15)

### Bug Fixes

* add manual trigger support to Release workflow ([b968cf6](https://github.com/cameronrye/openzim-mcp/commit/b968cf661f536183f4ef5fd6374e75a847a0123f))
* ensure Release workflow checks out correct tag for all jobs ([b4a61ca](https://github.com/cameronrye/openzim-mcp/commit/b4a61ca7a034f9eefae2606c4eb9769ef4f79379))

## [0.3.0](https://github.com/cameronrye/openzim-mcp/compare/v0.2.0...v0.3.0) (2025-09-15)

### Features

* add automated version bumping with release-please ([6b4e27c](https://github.com/cameronrye/openzim-mcp/commit/6b4e27c0382bb4cfa16a7e101f012e8355f7c827))

### Bug Fixes

* resolve release-please workflow issues ([68b47ea](https://github.com/cameronrye/openzim-mcp/commit/68b47ea711525e126ec3ed8297808f7779edd87e))

## [0.2.0] - 2025-01-15

### Added

* **Complete Architecture Refactoring**: Modular design with dependency injection
* **Enhanced Security**:
  * Fixed path traversal vulnerability using secure path validation
  * Comprehensive input sanitization and validation
  * Protection against directory traversal attacks
* **Comprehensive Testing**: 80%+ test coverage with pytest
  * Unit tests for all components
  * Integration tests for end-to-end functionality
  * Security tests for vulnerability prevention
* **Intelligent Caching**: LRU cache with TTL support for improved performance
* **Modern Configuration Management**: Pydantic-based configuration with validation
* **Structured Logging**: Configurable logging with proper error handling
* **Type Safety**: Complete type annotations throughout the codebase
* **Resource Management**: Proper cleanup with context managers
* **Health Monitoring**: Built-in health check endpoint
* **Development Tools**:
  * Makefile for common development tasks
  * Black, flake8, mypy, isort for code quality
  * Comprehensive development dependencies

### Changed

* **Project Name**: Changed from "zim-mcp-server" to "openzim-mcp" for consistency
* **Entry Point**: New `python -m openzim_mcp` interface (backwards compatible)
* **Error Handling**: Consistent custom exception hierarchy
* **Content Processing**: Improved HTML to text conversion
* **API**: Enhanced tool interfaces with better validation

### Security

* **CRITICAL**: Fixed path traversal vulnerability in PathManager
* **HIGH**: Added comprehensive input validation
* **MEDIUM**: Sanitized error messages to prevent information disclosure

### Performance

* **Caching**: Intelligent caching reduces ZIM file access overhead
* **Resource Management**: Proper cleanup prevents memory leaks
* **Optimized Processing**: Improved content processing performance

## [0.1.0] - 2024-XX-XX

### Added

* Initial release of ZIM MCP Server
* Basic ZIM file operations (list, search, get entry)
* Simple path management
* HTML to text conversion
* MCP server implementation

### Known Issues (Fixed in 0.2.0)

* Path traversal security vulnerability
* No input validation
* Missing error handling
* No testing framework
* Resource management issues
* Global state management problems