# Changelog All notable changes to jcodemunch-mcp are documented here. ## [Unreleased] ## [1.22.6] - 2026-04-06 ### Fixed - **`_merge_hooks()` idempotency** — per-rule dedup instead of per-event. Previously, once any jcodemunch PreToolUse hook was installed, no additional PreToolUse rules could be added by `init --hooks`. Now each rule's command is checked individually, allowing incremental hook installation. Cherry-picked from @DrHayt's PR #214. - **Worktree hook-event derivation** — Claude Code sends `{cwd, name}` in WorktreeCreate/WorktreeRemove payloads, not `worktreePath`. Derive path as `{cwd}/.claude/worktrees/{name}`. Legacy fields still accepted. Also outputs resolved path on stdout as Claude Code expects. Cherry-picked from @DrHayt's PR #214. - **`config --check` hook validation** — now verifies Python hooks in `~/.claude/settings.json` instead of scanning for shell scripts in `~/.claude/hooks/`. Warns about legacy shell scripts if found. Cherry-picked from @DrHayt's PR #214. ## [1.22.5] - 2026-04-06 ### Added - **`TWEAKCC.md`** — guide for system prompt routing via [tweakcc](https://github.com/Piebald-AI/tweakcc) as an alternative to hook-based enforcement. Includes 8 prompt rewrites that embed jCodemunch preferences into Claude's core tool descriptions. Cross-referenced from AGENT_HOOKS.md. Credit: [@vadash](https://github.com/vadash). Closes #173. ### Fixed - **PreToolUse hook no longer blocks Read** — changed from hard `deny` to a stderr warning. The deny broke the Edit workflow because Claude Code requires Read before Edit, forcing workarounds or env var overrides. Targeted reads (with `offset` or `limit`) are now silently allowed; full-file reads on large code files produce a stderr hint nudging toward `get_file_outline` + `get_symbol_source`. Aligns the Python CLI hook with the documented shell hook design in AGENT_HOOKS.md which explicitly notes "Read is intentionally NOT blocked". ## [1.22.4] - 2026-04-06 ### Added - **`get_session_snapshot` MCP tool** — compact ~200 token markdown summary of session state (focus files by read count, edited files, key searches, negative evidence). Designed for context injection after compaction to restore session orientation. Contributed by @MariusAdrian88. Closes #211. - **PreCompact CLI hook** (`jcodemunch-mcp hook-precompact`) — automatically generates and injects a session snapshot before Claude Code context compaction via the `systemMessage` hook output field. Registered by `jcodemunch-mcp init`. - **`sort_by` parameter for `get_context()`** — session journal now supports `sort_by="frequency"` (by read/edit/query count) in addition to the default `sort_by="timestamp"`. - **`max_edits` parameter for `get_context()`** — limits the number of edited files returned, consistent with `max_files` and `max_queries`. ## [1.22.3] - 2026-04-06 ### Added - **`exclude_skip_directories` config** — remove entries from the built-in skip directory list at runtime. Mirrors the existing `exclude_secret_patterns` pattern. Example: set `["proto"]` to index protobuf directories that are skipped by default. Contributed by @DrHayt. Closes #209. ## [1.22.2] - 2026-04-05 ### Fixed - **CLI init indexing broken** — `run_index()` passed `folder_path=` to `index_folder()` which expects `path=`, causing `unexpected keyword argument` error on `jcodemunch-mcp init`. Closes #208. ## [1.22.1] - 2026-04-05 ### Fixed - **streamable-http session persistence** — `run_streamable_http_server` previously created a new `StreamableHTTPServerTransport` (and a new `server.run()` coroutine) for every incoming HTTP request, leaving follow-up calls like `tools/list` hitting an uninitialised session and failing with `-32602 INVALID_PARAMS`. The handler now maintains a session map keyed by `mcp-session-id`: on the first request a background `asyncio.Task` runs `transport.connect()` + `server.run()` for the lifetime of the session, and all subsequent requests from the same client are routed to the existing transport. Terminated sessions (e.g. after DELETE) are cleaned up automatically. Includes a 10-second setup timeout with graceful error response. Closes #204. - 9 new tests in `test_streamable_http_sessions.py`. ## [1.22.0] - 2026-04-05 ### Added - **`plan_turn` tool** — opening-move router for any task. Runs BM25 + PageRank against the query, returns confidence level (high/medium/low/none), recommended symbols/files, insertion point suggestions for missing features, prior negative evidence detection, and a budget advisor when turn budget exceeds 60%. - **`get_session_context` tool** — returns session history: files read, searches run, edits made, tool call counts. Use to avoid re-reading the same files. - **`register_edit` tool** — post-edit cache invalidation. Clears BM25 token cache and search result cache for edited files; optionally reindexes. - **Session journal** (`session_journal.py`) — process-lifetime singleton tracking reads, searches, edits, and negative evidence. Bounded at 5000 entries per category with LRU eviction. Thread-safe. - **Turn budget** (`turn_budget.py`) — cross-call token accumulator. Injects `budget_warning` + `auto_compacted` into `_meta` when budget runs low. Configurable via `turn_budget_tokens` and `turn_gap_seconds`. - **Session state persistence** (`session_state.py`) — save/restore session across restarts. Writes only on clean shutdown via `atexit`. Staleness validated against `indexed_at` on restore. Opt-in via `session_resume: true`. - **Negative evidence in `search_symbols`** — when results are empty or below threshold, response includes structured `negative_evidence` with `verdict` (no_implementation_found / low_confidence_matches), `scanned_symbols`, `scanned_files`, `related_existing`. - **LRU result cache in `search_symbols`** — 128-entry default, cache key includes `indexed_at` for automatic invalidation on reindex. - **10 new config keys**: `negative_evidence_threshold`, `search_result_cache_max`, `session_journal`, `plan_turn_high_threshold`, `plan_turn_medium_threshold`, `turn_budget_tokens`, `turn_gap_seconds`, `session_resume`, `session_max_age_minutes`, `session_max_queries`. - **CLAUDE.md policy updates** — routing rules for `plan_turn`, negative evidence handling, budget warning response, and a Read exception note (harness requires Read before Edit/Write). - 75 new tests across 10 test files (2191 total, 0 regressions). ## [1.21.27] - 2026-04-04 ### Added - **PreToolUse enforcement hook** (`hook-pretooluse` subcommand) — intercepts `Read` calls on large code files (>=4KB, configurable via `JCODEMUNCH_HOOK_MIN_SIZE`) and returns a `deny` decision directing Claude to use `get_file_outline` + `get_symbol_source` instead. Non-code files and small files pass through silently. Addresses the "0% jcodemunch efficiency" problem where CLAUDE.md rules are ignored under cognitive load. - **PostToolUse auto-reindex hook** (`hook-posttooluse` subcommand) — fires after `Edit` or `Write` on code files and spawns `jcodemunch-mcp index-file` in the background to keep the index fresh. Eliminates "index staleness anxiety" that caused users to bypass jcodemunch and fall back to `Read`. - **Enforcement hooks in `init`** — `jcodemunch-mcp init` now offers to install both hooks into `~/.claude/settings.json` (PreToolUse matcher: `Read`, PostToolUse matcher: `Edit|Write`). Enabled by `--hooks` flag or interactive prompt. Idempotent, backup-aware, and respects `--dry-run`/`--demo`. - New `_merge_hooks()` helper in `cli/init.py` — shared logic for merging hook definitions into settings.json, used by both worktree and enforcement hook installers. - 25 new tests in `test_hooks.py` covering PreToolUse deny/allow logic, PostToolUse indexing, idempotent install, and edge cases (missing files, invalid JSON, Windows creation flags). ## [1.21.26] - 2026-04-04 ### Added - **Cursor rules injection in `init`** — when Cursor is detected, `jcodemunch-mcp init` now offers to write `.cursor/rules/jcodemunch.mdc` with `alwaysApply: true`. This ensures the code-exploration policy is in context for every Cursor agent turn, including subagents, fixing the unreliable tool-fallback behaviour reported by Cursor users. - **Windsurf rules injection in `init`** — when Windsurf is detected, `init` now offers to append the code-exploration policy to `.windsurfrules`. Both files are idempotent, backup-aware, and respect `--dry-run`. - **`--demo` flag for `init`** — `jcodemunch-mcp init --demo` walks through the full setup process without making any changes, then prints "Had this NOT been a demo, I would have:" followed by each action and its benefit. 10 new tests in `test_init.py`. ## [1.21.25] - 2026-04-03 ### Added - **`audit_agent_config` tool** — scans agent config files (CLAUDE.md, .cursorrules, copilot-instructions.md, .windsurfrules, settings.json, etc.) for token waste. Reports per-file token cost, stale symbol references (cross-referenced against the jcodemunch index), dead file paths, redundancy between global and project configs, bloat patterns, and scope leaks. 34 new tests. ## [1.21.24] - 2026-04-03 ### Added - **`jcodemunch-mcp init` subcommand** — one-command onboarding that auto-detects installed MCP clients (Claude Code, Claude Desktop, Cursor, Windsurf, Continue), writes their config entries, injects the Code Exploration Policy into CLAUDE.md, installs worktree lifecycle hooks, and optionally indexes the current directory. Supports `--dry-run`, `--yes` (non-interactive), `--client`, `--claude-md`, `--hooks`, `--index`, and `--no-backup` flags. 27 new tests in `test_init.py`. - Updated QUICKSTART.md, README.md, and AGENT_HOOKS.md to lead with `init` as the recommended setup path. - **`audit_agent_config` tool** — scans agent config files (CLAUDE.md, .cursorrules, copilot-instructions.md, .windsurfrules, settings.json, etc.) for token waste. Reports per-file token cost, stale symbol references (cross-referenced against the jcodemunch index), dead file paths, redundancy between global and project configs, bloat patterns, and scope leaks. 34 new tests in `test_audit_agent_config.py`. ## [1.21.23] - 2026-04-02 ### Fixed - **`@includeFirst` Blade directive now parsed** (`laravel.py`) — `_BLADE_INCLUDE_FIRST` regex captures the first (highest-priority) candidate from `@includeFirst(['primary.view', 'fallback.view'])` array arguments and injects it as an import edge. Previously the directive was silently dropped. The fallback candidates are intentionally omitted — only the preferred view is tracked. Closes #203. ## [1.21.22] - 2026-04-02 ### Added - **Stage 4: Cross-language dependency graph (Laravel)** — `laravel.py` now injects extra import edges via the new `get_extra_imports()` provider hook: Blade `@extends`/`@include`/`@includeWhen`/`@includeUnless`/`@component`/`view()` templates, `` components, Eloquent relationship edges (`hasMany`/`belongsTo`/etc.), 40 built-in Laravel facades mapped to underlying `Illuminate\*` classes, route→controller file edges, and Inertia.js `Inertia::render`/`inertia()` → Vue/React page component resolution. Frontend `fetch`/`axios`/`useFetch` API calls are matched to Laravel route→controller files via wildcard URI matching. - **Stage 5: Nuxt/Next.js context providers** — `parser/context/nuxt.py` (`NuxtContextProvider`) parses `pages/` for file-based routing, `server/api/` for API handlers with HTTP method extraction, and scans `composables/`/`utils/` for auto-import edges. `parser/context/nextjs.py` (`NextjsContextProvider`) handles App Router `page`/`layout`/`loading`/`error`/`route` files, route group `(auth)` segment collapsing, middleware detection, and HTTP method extraction from route handlers. - **Stage 6: FQN ↔ symbol_id translation** — new `parser/fqn.py` with `symbol_to_fqn()` and `fqn_to_symbol()` for bidirectional PSR-4 translation. Optional `fqn` parameter added to `get_symbol_source`, `get_blast_radius`, `search_symbols`, and `get_context_bundle`. `_utils.py` gains `resolve_fqn()` helper. Detailed error messages for missing PSR-4 config, unindexed files, or namespace mismatch. - **`collect_extra_imports()` merger in `context/base.py`** — called in all 4 indexing pipeline paths; deduplicates by specifier, swallows per-provider failures with a warning. - **125 new tests** across `test_laravel_provider.py`, `test_nuxt_provider.py`, `test_nextjs_provider.py`, `test_fqn.py`, and `test_find_importers.py`; 2008 total passing. ## [1.21.21] - 2026-04-02 ### Changed - **`files_to_remove` kept as `set` in `incremental_save` (T8)** — `sqlite_store.py` no longer converts the union of `deleted_files` and `changed_files` to a list. The set is preserved through the function and passed to `_patch_index_from_delta`, making membership tests in the hot path (`in files_to_remove`) O(1) instead of O(n). sqlite3 calls receive `tuple(files_to_remove)`. - **Defer `stat()` until after LRU key check in `load_index` (T9)** — `stat()` is now only called when the cache key is already present; cold-start loads skip the pre-load `stat()` syscall entirely. `_CACHE_MAX_SIZE` raised from 16 → 32. - **Cap `_REPO_PATH_CACHE` at 512 entries (T23)** — `config.py` trims the oldest entries after each `update()` so the cache cannot grow unbounded in long-running server sessions. - **`expanduser()` on startup storage path log (T24)** — all three transport startup log lines (`stdio`, `sse`, `streamable-http`) now call `os.path.expanduser()` on the `CODE_INDEX_PATH` value so the logged path shows the real expanded path on Windows instead of `~/.code-index/`. ## [1.21.20] - 2026-04-02 ### Added - **Dart import extractor (T19)** — `imports.py` now includes `_extract_dart_imports` (regex on `import`/`export` statements) registered as `"dart"` in `_LANGUAGE_EXTRACTORS`. Dart files no longer appear in `missing_extractors` after indexing. 9 new tests in `tests/test_dart_imports.py`; `test_parse_warnings.py` updated to use Elixir as the canonical missing-extractor example. - **LANGUAGE_SUPPORT.md expanded (T20)** — added full extraction rows for CSS, SCSS, SASS, YAML, Ansible, OpenAPI, and JSON; fixed C# entry to list `constant (property/field/event)` symbol types (were incorrectly documented as "not indexed"); corrected CSS row previously listed only under "text search indexing"; SASS entry now documents the CSS-parser fallback. - **Hypothesis property-based tests (T22)** — `tests/test_property_based.py` with 4 tests across 3 invariant classes: **ID uniqueness** (`TestIdUniqueness` — all symbol IDs in a freshly indexed folder are unique); **Incremental idempotency** (`TestIncrementalIdempotency` — indexing the same files twice yields the same symbol IDs and counts); **No self-imports** (`TestNoSelfImports` — no file in the import graph lists itself as an importer). `hypothesis>=6.0.0` added to dev dependency group. 4 new tests, 90 Hypothesis examples per run. ### Changed - **`JCODEMUNCH_EXTRA_EXTENSIONS` valid language names** (T21) — added `scss`, `sass`, `less`, `styl`, `yaml`, `ansible`, `json`, `openapi`, `luau` to the documented list in LANGUAGE_SUPPORT.md. ## [1.21.19] - 2026-04-02 ### Added - **Methodology disclosure on all 6 analytical tools (T15)** — every analytical tool response now includes `_meta.methodology` and `_meta.confidence_level`. Values: `get_call_hierarchy` + `get_impact_preview` → `methodology: "text_heuristic"`, `confidence_level: "low"`; `get_symbol_complexity` → `methodology: "stored_metrics"`, `confidence_level: "medium"`; `get_churn_rate` → `methodology: "git_log"`, `confidence_level: "high"`; `get_hotspots` → `methodology: "complexity_x_churn"`, `confidence_level: "medium"`; `get_repo_health` → `methodology: "aggregate"`, `confidence_level: "medium"`; `get_dead_code_v2` → `methodology: "multi_signal"`, `confidence_level: "medium"`. 18 new tests in `tests/test_meta_disclosure.py`. - **Import-gap signal in `index_folder` (T17)** — `index_folder` now reports `missing_extractors` (sorted list of languages that have symbol extraction but no import extractor) and `parse_warnings` when import graph coverage is incomplete. Example: indexing a folder with `.dart` files yields `missing_extractors: ["dart"]` and a human-readable `parse_warnings` entry. 4 new tests in `tests/test_parse_warnings.py`. - **`framework_warning` in `get_dead_code_v2` (T18)** — when BFS finds zero standard entry points (`main.py`, `app.py`, etc.), all files are unreachable from entry points and Signal 1 fires for every symbol, inflating dead code counts. `get_dead_code_v2` now includes `framework_warning` in that case, advising callers to pass `entry_point_patterns`. 5 new tests in `tests/test_parse_warnings.py`. ### Fixed - **Parameter count off-by-one for C-style zero-param functions (T16)** — `_count_params` in `parser/complexity.py` treated `void foo(void)` as a one-parameter function because `"void"` was a non-empty `params_str` with no commas, yielding `commas + 1 = 1`. Added a special case: `params_str == "void"` → return 0, matching the C/C++ convention that `(void)` declares zero parameters. `void*` and multi-param signatures containing `void` are unaffected. 3 new tests in `tests/test_complexity.py`. ## [1.21.18] - 2026-04-02 ### Added - **Correctness fixture library (T12)** — `tests/conftest.py` now exports three shared pytest fixtures (`small_index`, `medium_index`, `hierarchy_index`) that build deterministic synthetic Python repos with documented ground-truth expected outputs. Used across multiple test modules as the canonical in-process test corpus. - **Tests for `get_class_hierarchy` (T13)** — 22 new tests in `tests/test_class_hierarchy.py` covering: `_parse_bases` unit tests (Python single/multi base, Java extends/implements, combined, lowercase filter, empty); hierarchy BFS error cases (repo not indexed, class not found); ancestor direction (no ancestors for root, direct parent, transitive chain, BFS nearest-first order); descendant direction (all descendants of root, direct children, leaf has none); meta fields (case-insensitive lookup, timing, class info, external base recorded as `"(external)"`). - **Tests for `get_related_symbols` (T13)** — 14 new tests in `tests/test_related_symbols.py` covering: `_tokenize_name` unit tests (snake_case, camelCase, single word, short-token filter, lowercase); error cases (repo not indexed, symbol not found); same-file grouping (co-located symbols are related, scores positive); name-token overlap scoring; `max_results` cap; meta fields (timing, target symbol in response, required entry fields). - **Tests for `get_symbol_diff` (T13)** — 15 new tests in `tests/test_symbol_diff.py` covering: error cases (repo A not indexed, repo B not indexed); added symbols (detected, count matches list); removed symbols (detected, count matches list); unchanged symbols (not in added/removed, identical repo → all unchanged); changed symbols (signature change detected, both signatures present); meta fields (timing, symbol counts, repo identifiers). - **Tests for `suggest_queries` (T13)** — 11 new tests in `tests/test_suggest_queries.py` covering: error cases (repo not indexed, empty index); small repo stats (symbol count, file count, kind distribution, language distribution, example queries non-empty, required query fields); medium repo stats (file count, most_imported file structure, class+function kinds, repo field, timing meta). - **Tests for rate-limit middleware (T13)** — 10 new tests in `tests/test_rate_limit.py` covering: factory returns `None` when `JCODEMUNCH_RATE_LIMIT` is 0, unset, invalid, or negative; returns non-`None` `Middleware` when limit is positive; sliding-window bucket logic: under-limit all allowed, over-limit rejected, expired entries evicted, limit=1 allows first denies second. - **In-process perf benchmarks with latency budgets (T14)** — `tests/test_search_perf.py` rewritten from an external-index-dependent skip-if-not-indexed pattern to a fully self-contained suite. Builds a 5-file, 20+ symbol synthetic repo at module scope. New latency assertions: cold search < 2000 ms, warm search < 500 ms (BM25 cache benefit). Correctness assertions: result order stable across two consecutive calls, scores stable with `debug=True`, relevant symbol appears in top-5 for known query, all queries return non-empty results. Zero `pytest.skip` in the file. ### Changed - **`tests/test_search_perf.py`** — removed `_require_index()` / `pytest.skip` pattern that caused CI-skip when `jcodemunch-mcp` was not indexed locally. Tests now run unconditionally against the synthetic in-process index. ## [1.21.17] - 2026-04-02 ### Fixed - **BM25 `avgdl` inflation corrected (T10)** — `_sym_tokens` computed `_dl` (document length) as `len(tokens)` where `tokens` is the weighted repeated bag (field-repetition multipliers make the name appear 3× in the bag, signature 2×, etc.). This inflated `_dl` and therefore `avgdl`, distorting the BM25 length-normalisation term `K`. Fixed by using `len(set(tokens))` — the unique-token count — consistent with how document-frequency (`df`) is already computed via `for t in set(toks)`. Symbols with overlap across name/signature/summary fields (the common case) were previously penalised as "long documents" when they are not. - **BM25 rebuild canonical `_dl` enforcement (T11)** — `_compute_bm25` now overwrites `sym["_dl"]` with `len(unique_toks)` on every corpus rebuild. Previously the function used the cached `_dl` from `_sym_tokens`, meaning retained symbols carrying a pre-T10 `_dl` value (the inflated bag length) would make `avgdl` inconsistent with the new formula. The forced rewrite ensures the corpus and all scoring are internally consistent even when the BM25 cache is rebuilt over a mix of freshly computed and carried-forward symbols (e.g., after deferred AI summarisation). 11 new correctness tests added (`tests/test_bm25_correctness.py`). ## [1.21.16] - 2026-04-02 ### Fixed - **Watcher hash-cache double-read race eliminated (T6)** — after each incremental reindex the watcher previously re-read each changed file to compute the new content hash for its in-memory cache. If the file changed again between `index_folder`'s internal read and the watcher's post-reindex re-read, the cache recorded the wrong (newer) hash while the index held the older content. The *next* watchfiles event would then deliver `old_hash=`, `index_folder` would hash the file, see no difference, and silently skip re-parsing a stale index entry. Fixed by replacing per-file re-reads with a single `_build_hash_cache()` call that reads hashes from the store `index_folder` just wrote — the single authoritative source of truth. Removed the now-dead `_update_hash_cache` / `_remove_from_hash_cache` helpers and the unused `_file_hash` import. ## [1.21.15] - 2026-04-02 ### Fixed - **Deferred-summarize write-lock race eliminated (T7)** — a narrow but real race existed between the deferred summarization thread's generation check ("check 2") and its `incremental_save` call. A concurrent `mark_reindex_start` could bump `deferred_generation` and write a fresh index between those two points; the deferred thread would then overwrite it with stale AI summaries from the previous parse generation. Fixed by introducing a per-repo `threading.Lock` (`_repo_deferred_save_locks` in `reindex_state.py`). The deferred thread holds this lock across check 2 + save; `mark_reindex_start` holds it while bumping `deferred_generation`. This makes check-and-save atomic with respect to generation bumps: either the deferred thread saves before the new generation is written, or it sees the new generation and self-aborts. Added `gen=N` to deferred-summarize log messages so abandoned and completed saves are distinguishable in debug output (pre-T7 instrumentation). ## [1.21.14] - 2026-04-02 ### Fixed - **Threading locks added to all in-process caches (T5)** — four module-level caches were missing `threading.Lock` guards, leaving them vulnerable to data races under concurrent MCP requests (HTTP transport, multi-client stdio). Now protected: - `_bare_name_cache` (`tools/_utils.py`) — new `_BARE_NAME_LOCK`; check and write are each under the lock; expensive `list_repos()` I/O happens between the two lock acquisitions so the lock is never held during I/O. - `_REPO_PATH_CACHE` (`config.py`) — now protected by the existing `_CONFIG_LOCK`; reads (check) and bulk writes (`update`) are each atomic under the lock; store I/O happens outside. - `_alias_map_cache` (`parser/imports.py`) — new `_ALIAS_MAP_LOCK`; same check-then-build-then-write pattern. - `_sql_stem_cache` (`parser/imports.py`) — new `_SQL_STEM_LOCK`; same pattern. - **`invalidate_cache` now clears all 5 in-process caches under their locks (T4.5)** — previously `_sql_stem_cache` was not cleared on `invalidate_cache`, leaving stale SQL stem mappings across re-indexes. Also, `_REPO_PATH_CACHE.clear()` and `_PROJECT_CONFIGS.pop()` were called outside `_CONFIG_LOCK`. All five caches (`_REPO_PATH_CACHE`, `_PROJECT_CONFIGS`, `_PROJECT_CONFIG_HASHES`, `_bare_name_cache`, `_sql_stem_cache`, `_alias_map_cache`) are now cleared under their respective locks. ## [1.21.13] - 2026-04-02 ### Fixed - **`truncated` flag now correct when `token_budget` packing drops results** — in the BM25 search path, the flag was computed using `candidates_scored > len(scored_results)` after the fuzzy augmentation pass, meaning fuzzy results appended after budget packing could mask dropped BM25 results and produce `truncated=False` incorrectly. Now tracked as a separate `budget_truncated` boolean computed immediately after packing; the final flag is `candidates_scored > heap_count or budget_truncated`. The semantic search path was already correct. - **Call graph `"source"` label corrected** — `get_call_hierarchy` and `get_impact_preview` both returned `"source": "ast"` in `_meta`, implying type-resolved AST analysis. The implementation is word-token regex matching on raw file text. Label changed to `"source": "text_heuristic"` with updated tip text to accurately describe the approach and its limitations (false positives for common names, no dynamic dispatch). ## [1.21.12] - 2026-04-02 ### Added - **PSR-4 namespace resolution for PHP projects (Stage 1 of jgravelle/jcodemunch-mcp#201)** — `find_importers`, `get_blast_radius`, `get_dependency_graph`, `find_dead_code`, and all other import-graph tools now correctly resolve PHP `use App\Models\User` statements to `app/Models/User.php` via `composer.json` PSR-4 autoload mappings. Previously these tools returned zero results for PHP projects using Composer autoloading (effectively every modern PHP project). `build_psr4_map()` and `resolve_php_namespace()` are new public helpers; `CodeIndex` auto-loads the PSR-4 map at load time when PHP files are present and `source_root` is set. 61 new tests added. - **PHP `property_declaration` symbol indexing** — PHP class properties (`protected $fillable`, `public string $name`, etc.) are now indexed as `property`-kind symbols, fixing a gap in PHP symbol coverage. - **Laravel context provider** — new `LaravelContextProvider` detects Laravel projects (via `artisan` + `laravel/framework` in `composer.json`) and enriches symbols with: routes parsed from `routes/*.php`, Eloquent relationship/fillable/scope metadata from `app/Models/*.php`, controller-to-route mapping, and event→listener mappings from `EventServiceProvider`. Migration column definitions (from `database/migrations/*.php`) are exposed via `search_columns` under the `laravel_columns` key. - **Framework profile auto-detection** — `detect_framework()` checks for Laravel, Nuxt, Next.js, Vue SPA, and React SPA at index time and applies framework-specific `ignore_patterns` (e.g. `vendor/`, `.nuxt/`, `.next/`) automatically. Profile `entry_point_patterns` and `layer_definitions` are stored in `context_metadata` for downstream use by `find_dead_code` and `get_layer_violations`. The `index_folder` result now includes `framework_profile` when a profile is active. Zero overhead for non-matching projects. ## [1.21.11] - 2026-04-02 ### Added - **`config --check` now detects CLAUDE.md and hook-script drift (issue #200)** — the existing check command gains two new sections. *CLAUDE.md check* reads `~/.claude/CLAUDE.md` and reports any canonical tool names absent from the file, pointing to `jcodemunch-mcp claude-md --generate` to fix them. *Hook scripts check* scans `~/.claude/hooks/jcodemunch_read_guard.*` and lists any tool names missing from the guard's feedback message. - **`jcodemunch-mcp claude-md --generate`** — new subcommand that prints a ready-to-paste CLAUDE.md prompt-policy snippet listing all 45 tools in logical categories. `--format=append` outputs only the tools not yet mentioned in the existing `~/.claude/CLAUDE.md`, making it easy to diff-and-merge without rewriting the whole file. - **`_CANONICAL_TOOL_NAMES` module-level tuple** — authoritative ordered list of every registered tool name, used by both the drift-detection checks and the snippet generator. Validated by a test that asserts no tool produced by `_build_tools_list()` is absent from the tuple. ## [1.21.10] - 2026-04-02 ### Fixed - **`index_folder` full re-index no longer crashes with `'dict' object has no attribute 'summary'` when an existing index is present (issue #198)** — `CodeIndex.symbols` is `list[dict]` (serialized symbol dicts), but the summary-preservation dict comprehension at the top of the full-index path used dot notation (`s.file`, `s.name`, `s.kind`, `s.summary`) instead of bracket notation (`s["file"]`, etc.). Any second full index (or first index when an in-memory stale cache remained after `invalidate_cache` on pre-1.21.8) would immediately fail with this `AttributeError`. Fixed by using `s["key"]` / `s.get("key")` throughout that comprehension. Regression test added. ## [1.21.9] - 2026-04-02 ### Added - **`workflow` MCP prompt** — Claude Code surfaces this as `/mcp__jcodemunch-mcp__workflow`, a slash command that injects step-by-step usage guidance (list_repos → search_symbols → get_symbol_source) directly into context. Provides reliable workflow instructions even when CLAUDE.md is absent or not loaded by the model. - **`discovery_hint` config flag** (default `true`) — when enabled, the `list_repos` tool description includes a short note reminding Claude to prefer jcodemunch tools over native Grep/Read and to call `ToolSearch` if schemas appear deferred. Set `"discovery_hint": false` in `config.jsonc` to suppress this. Addresses jgravelle/jcodemunch-mcp#199. ## [1.21.8] - 2026-04-02 ### Fixed - **`invalidate_cache` now clears all four in-process caches (X1 / C4-B)** — previously `_REPO_PATH_CACHE`, `_PROJECT_CONFIGS`/`_PROJECT_CONFIG_HASHES`, `_alias_map_cache`, and `_bare_name_cache` were never evicted on `invalidate_cache`, leaving stale import graphs, wrong project config, and unresolvable repo names for the process lifetime. `invalidate_cache` now resolves `source_root` before deletion and clears all four caches in addition to the SQLite/JSON index. - **`_alias_map_cache` evicted at the start of every `index_folder` run (C6-A)** — tsconfig/jsconfig path alias edits were permanently invisible to re-indexing because `_load_tsconfig_aliases` cached by `source_root` with no invalidation hook. `index_folder` now pops the stale entry before parsing begins, so alias-dependent import edges (`find_importers`, `find_references`, `get_dependency_graph`) are always computed against the current tsconfig. - **`_sql_stem_cache` keyed by frozenset instead of `id()` (C7-A)** — the single-entry tuple cache used `id(source_files)` as its key. After the previous `source_files` set was GC'd, a new set allocated at the same address received the same `id`, causing a false cache hit and returning SQL stem mappings for the wrong file set. Replaced with a bounded frozenset-keyed dict (max 4 entries) for correct content-based identity. ## [1.21.7] - 2026-04-02 ### Fixed - **`search_symbols` no longer returns centrality-only results for out-of-corpus queries (C3)** — `_bm25_score` now guards the centrality bonus with `score > 0`, so it is only applied when at least one query term contributed BM25 relevance (or an exact name match fired). Previously, queries whose terms appeared in no indexed symbol produced BM25 score 0 for every symbol, but the unconditional `centrality` add-on gave structurally popular files scores > 0, causing them to pass the `score <= 0` filter and surface as apparent results with no indication they were purely import-graph artifacts. ## [1.21.6] - 2026-04-02 ### Fixed - **`_REPO_PATH_CACHE` negative entries no longer permanently suppress project config (C2)** — `_resolve_repo_key` previously wrote `None` into `_REPO_PATH_CACHE` for any identifier that couldn't be resolved at call time (e.g. during watcher startup before the first index completes). That entry was never invalidated, so all subsequent calls — including those after successful indexing — silently fell through to the global config, ignoring `.jcodemunch.jsonc` for the process lifetime. Removed the negative cache write; unknown identifiers now re-scan `list_repos()` on each call (cheap read) so project configs are picked up as soon as the repo is indexed. ## [1.21.5] - 2026-04-02 ### Fixed - **Deferred summarization no longer doubles `CodeIndex.symbols` in memory (C1)** — `_patch_index_from_delta` now builds a set of symbol IDs present in `new_sym_dicts` and skips any retained symbol whose ID is already being replaced. Previously, when `_run_deferred_summarize` called `incremental_save` with `changed_files=[]` and `deleted_files=[]`, every symbol was retained *and* appended again as a summarized copy, doubling the in-memory symbol list. This caused BM25 scores to be computed over a 2× corpus (wrong IDF, wrong `avgdl`) and `search_symbols` to return duplicate hits for the same symbol ID until the next cold cache load. ## [1.21.4] - 2026-04-02 ### Added - **JSON indexing and symbol extraction** — `.json` files are now indexed and text-searchable. Top-level object keys are extracted as `constant` symbols (e.g. `name`, `dependencies`, `scripts` in `package.json`; compiler options keys in `tsconfig.json`). Compound extensions `.openapi.json` / `.swagger.json` and well-known basenames (`openapi.json`, `swagger.json`) continue to resolve to `openapi` as before. Closes reported gap in issue #197 follow-up comment (nikolai-vysotskyi). - **15 new tests** covering extension detection, compound-extension precedence, top-level key extraction, symbol kind/metadata, array-at-root edge case, and `parse_file()` dispatch. ## [1.21.3] - 2026-04-02 ### Added - **CSS preprocessor support (SCSS, SASS, Less, Stylus)** — `.scss`, `.sass`, `.less`, and `.styl` files are now indexed and text-searchable. SCSS additionally gets full symbol extraction: `$variables` → `constant`, `@mixin` → `function`, `@function` → `function`, rule-set selectors (including `%placeholders`) → `class`, `@media`/`@supports` → `type`. SASS, Less, and Stylus have no tree-sitter grammar in the pack so they index for text search only — `search_text` with `file_pattern: "**/*.scss"` now returns results. Closes issue #197 (reported by nikolai-vysotskyi). - **24 new tests** covering SCSS extension detection, variable/mixin/function/selector/at-rule extraction, symbol ID uniqueness, byte metadata, `parse_file()` dispatch, empty-file edge case, and text-only confirmation for Less/SASS/Stylus. ## [1.21.2] - 2026-04-02 ### Added - **Summary preservation during full reindex** — when a full reindex runs over a repo that already has an index (e.g. after a schema bump or explicit `incremental=False` call), symbols whose file content hash is unchanged now reuse their existing AI-generated summaries instead of triggering new AI calls. Symbols in changed or new files are summarized normally. This is automatic and requires no parameter changes — the optimization fires whenever a prior index is present and has stored file hashes. Addresses issue #192 (reported by rknighton). ## [1.21.1] - 2026-04-02 ### Fixed - **`summarizer_concurrency` now respected by OpenAI-compatible provider** — `OpenAIBatchSummarizer` was reading concurrency from `OPENAI_CONCURRENCY` env var with a hardcoded default of 1, ignoring the `summarizer_concurrency` config key entirely. The default is now `_config.get("summarizer_concurrency", 4)`, so the config file (and `JCODEMUNCH_SUMMARIZER_CONCURRENCY` env var) correctly controls concurrency for all providers. `OPENAI_CONCURRENCY` env var still overrides when set. The `config` diagnostic display now shows the effective fallback value from config rather than the stale hardcoded 1. Reported by nikolai-vysotskyi (issue #194). ## [1.21.0] - 2026-04-02 ### Added - **CSS symbol extraction** — CSS files now produce real symbols: rule-set selectors (`.container`, `#header`, `body`, `:root`, compound selectors like `.navbar .item`) are extracted as `kind: class`; `@keyframes` as `kind: function`; `@media` and `@supports` blocks as `kind: type`. Previously CSS was indexed (text-searchable) but `get_file_outline` always returned 0 symbols. Fixes reported issue where users believed CSS was not supported at all. - **17 new tests** (1641 total, 7 skipped): full coverage of CSS selector extraction, @-rule extraction, edge cases (empty file, comment-only file), symbol ID uniqueness, and `parse_file()` dispatch. ## [1.20.0] - 2026-04-02 ### Changed - **Lazy tool imports** — all 45 tool module imports in `server.py` are now deferred to the first `call_tool()` dispatch for each tool. Previously, importing `server.py` loaded every tool module (and their transitive dependencies: tree-sitter, httpx, pathspec, subprocess wrappers) regardless of which tools the session actually uses. Now only 7 tool modules load at startup (via the watcher's `index_folder` chain). Tools not called in a session are never imported. This reduces cold-start overhead for query-only sessions that never trigger indexing. - **`_build_tools_list()` helper** — `list_tools()` now delegates to a named `_build_tools_list()` function, making the tool list construction easier to test and reason about independently of the MCP decorator. - **Test patch targets updated** — tests that previously patched `jcodemunch_mcp.server.xxx` (where `xxx` is a tool function) now correctly patch `jcodemunch_mcp.tools.xxx_module.xxx_func`, which is where the name is looked up during dispatch. This follows Python's `unittest.mock.patch` best practice: patch where the name is looked up, not where it is defined. - **No API or output schema changes.** Zero new tools, zero removed tools, zero field changes. ## [1.19.0] - 2026-04-01 ### Added - **`assessment` field on `get_hotspots` entries** — each hotspot now includes `assessment: "low" | "medium" | "high"` based on `hotspot_score` thresholds (low ≤ 3, medium ≤ 10, high > 10). Allows an LLM to relay findings directly without interpreting the raw score. - **`architecture.layers` documented in README** — the `.jcodemunch.jsonc` reference now includes the full `architecture` block schema with a worked example for a typical layered Python project (api → service → repo → db). Used by `get_layer_violations`. - **2 new tests** (1624 total, 7 skipped): `test_assessment_field_present`, `test_high_complexity_no_churn_is_low`. ## [1.18.0] - 2026-04-01 ### Added - **Session-level LRU result cache** — `get_blast_radius` and `find_references` (single-identifier mode) now cache their results for the duration of the MCP session. Repeated calls with the same arguments return instantly from the in-process cache with `_meta.cache_hit: true` instead of re-running the expensive BFS traversal and file-content scans. Cache is a 256-entry LRU (OrderedDict); oldest entries are evicted first. Thread-safe via the existing `_State` lock. - **Automatic cache invalidation** — the result cache is cleared after any `index_repo`, `index_folder`, `index_file`, or `invalidate_cache` call so stale results are never served after re-indexing. - **`get_session_stats` — `result_cache` field** — the existing `get_session_stats` tool now includes a `result_cache` section: `{total_hits, total_misses, hit_rate, cached_entries}`. Useful for tuning and for verifying that the cache is working in real sessions. - **18 new tests** (1622 total, 7 skipped): `test_result_cache.py` covers get/put, hit/miss counters, by-tool breakdown, invalidation (all-repos and repo-specific), LRU eviction at maxsize, and the `result_cache` field in `get_session_stats`. ## [1.17.0] - 2026-04-01 ### Added - **`get_symbol_complexity(symbol_id)`** — returns cyclomatic complexity, max nesting depth, parameter count, line count, and a human-readable `assessment` ("low" / "medium" / "high") for any indexed function or method. Data is read directly from the index (no re-parsing); requires INDEX_VERSION 7 (jcodemunch-mcp >= 1.16). - **`get_churn_rate(target, days=90)`** — returns git commit count, unique authors, first-seen date, last-modified date, and `churn_per_week` for a file or symbol over a configurable look-back window. `assessment` field: "stable" (≤1/week), "active" (≤3/week), "volatile" (>3/week). Accepts a relative file path or a symbol ID. Requires a locally indexed repo. - **`get_hotspots(top_n=20, days=90, min_complexity=2)`** — ranks functions and methods by `hotspot_score = cyclomatic × log(1 + commits_last_N_days)`. Surfaces code that is both complex and frequently changed — the highest bug-introduction risk in the repo. Identical methodology to Adam Tornhill's CodeScene hotspot analysis. Falls back gracefully when git is unavailable (complexity-only scoring). - **`get_repo_health(days=90)`** — one-call triage snapshot: total files/symbols, dead-code %, average cyclomatic complexity, top-5 hotspots, dependency cycle count, and unstable module count. Produces a `summary` string suitable for immediate relay. Designed to be the first tool called in any new session. Thin aggregator — delegates to individual tools, no duplicated logic. - **Bug fix: complexity data now correctly persisted through `save_index`** — the symbol serialization dict in `save_index` was missing `cyclomatic`, `max_nesting`, and `param_count` fields (they were computed by the parser but silently dropped before DB write). Fixed by including these fields in the serialized dict. All tools depending on complexity data (`get_extraction_candidates`, `get_symbol_complexity`, `get_hotspots`) now return accurate values after a fresh `index_folder`. - **36 new tests** (1604 total, 7 skipped): `test_symbol_complexity.py`, `test_churn_rate.py`, `test_hotspots.py`, `test_repo_health.py`. ## [1.16.0] - 2026-04-01 ### Added - **`check_rename_safe(symbol_id, new_name)`** — new tool that detects name collisions before renaming a symbol. Scans the symbol's defining file and every file that imports it, checking for an existing symbol already using the proposed new name. Returns `{safe, conflicts, checked_files}`. Use before any rename/refactor to avoid silent breakage. - **`get_dead_code_v2()`** — enhanced dead-code detection with three independent evidence signals per function/method: (1) the symbol's file is not reachable from any entry point via the import graph, (2) no indexed symbol calls this symbol in the call graph, (3) the symbol name is not re-exported from any `__init__` or barrel file. Each result includes a `confidence` score (0.33 = 1 signal, 0.67 = 2 signals, 1.0 = all 3). More reliable than single-signal detection. Accepts `min_confidence` (default 0.5) and `include_tests` parameters. - **`get_extraction_candidates(file_path, min_complexity, min_callers)`** — new tool that identifies functions worth extracting to a shared module. A candidate must have high cyclomatic complexity (doing a lot) AND be called from multiple other files (already implicitly shared). Results ranked by `score = cyclomatic × caller_file_count`. - **Complexity metrics stored at index time** — `INDEX_VERSION` bumped from 6 to 7. Three new fields per symbol (functions and methods only): `cyclomatic` (McCabe complexity), `max_nesting` (bracket-nesting depth), `param_count`. Computed from symbol body text at index time via `parser/complexity.py`. Existing indexes are automatically migrated (columns added as NULL; re-index to populate). Consumed by `get_extraction_candidates`. - **37 new tests** (1568 total, 7 skipped): `test_complexity.py`, `test_check_rename_safe.py`, `test_dead_code_v2.py`, `test_extraction_candidates.py`. ### Changed - `INDEX_VERSION` is now 7 (was 6). Re-index required to populate complexity fields; existing indexes load and operate correctly with complexity = 0. ## [1.15.3] - 2026-04-01 ### Added - **`config --upgrade`** — new CLI flag that adds missing keys from the current version's template into an existing `config.jsonc`, preserving all user-set values. Useful after upgrading jcodemunch-mcp to a newer version that introduces new config keys. Updates the `"version"` field automatically and reports which keys were injected. Addresses the gap implied by the `"version"` field / "additive migrations" comment in `config.jsonc`. Requested by nikolai-vysotskyi in issue #191. ## [1.15.2] - 2026-04-01 ### Added - **`summarize_repo(repo, force)`** — new MCP tool that re-runs AI summarization on all symbols in an existing index. Useful when `index_folder` completed without AI summaries (deferred background thread was interrupted, AI was disabled at index time, or the provider wasn't configured). With `force=true`, clears all existing summaries and re-runs the full 3-tier pipeline (docstring → AI → signature fallback). Returns `{success, symbol_count, updated, skipped, duration_seconds}`. Reported by nikolai-vysotskyi in issue #190. - **AI summarization progress logging** — `summarize_batch` (both `BaseSummarizer` and `OpenAIBatchSummarizer`) now logs progress at INFO level every ~10% of batches: `"AI summarization: N/M symbols (P%)"`. Start and completion are also logged. Previously there was zero feedback during 10–30 minute summarization runs on large codebases. - **`summarization_deferred` field in `index_folder` response** — when the watcher-driven fast path fires a background summarization thread, the response now includes `"summarization_deferred": true` and a note suggesting `summarize_repo` as a synchronous fallback. ### Changed - **Deferred summarization thread logging promoted to INFO** — thread start (`"Deferred AI summarization started for owner/repo (N symbols)"`) and completion (`"Deferred AI summarization saved N symbols for owner/repo"`) are now logged at INFO instead of DEBUG, making them visible in default logging configurations. ## [1.15.1] - 2026-04-01 ### Fixed - **Empty-array false positive in singular/batch mode detection** — `get_symbol_source`, `find_references`, `check_references`, `find_importers`, and `get_file_outline` each support a singular param (e.g. `symbol_id`) and a batch param (e.g. `symbol_ids`). Some MCP clients (observed with OpenCode + GPT codex) pass the batch param as an empty array `[]` even when invoking singular mode. Since `[] is not None` is `True`, the mutual-exclusivity guard fired and returned `"Provide symbol_id or symbol_ids, not both."` / `"Internal error processing find_references"`. Fixed by normalizing empty lists to `None` before the guard check in all five tools. Reported by razorree in issue #189. ## [1.15.0] - 2026-04-01 ### Added - **`get_dependency_cycles()`** — new tool detecting circular import chains in the repository. Uses Kosaraju's algorithm (iterative, no recursion limit) on the file-level import graph. Returns each strongly-connected component (set of files mutually reachable via imports) as a cycle. Useful for finding architectural problems and test-isolation blockers. - **`get_coupling_metrics(module_path)`** — new tool returning afferent coupling (Ca, how many files import this module), efferent coupling (Ce, how many files this module imports), instability score I = Ce/(Ca+Ce), and a human-readable `assessment` ("stable" | "neutral" | "unstable" | "isolated"). Identifies fragile modules and guides refactoring priorities. - **`get_layer_violations(rules?)`** — new tool validating inter-module imports against declared architectural layer boundaries. Reports every import that crosses a forbidden boundary. Rules can be passed directly or defined in `.jcodemunch.jsonc` under `architecture.layers`. Output includes `file`, `file_layer`, `import_target`, `target_layer`, `rule_violated` per violation. - **`architecture` config key** — new `.jcodemunch.jsonc` / global config key (type: dict) for per-project layer definitions. Structure: `{"layers": [{"name": str, "paths": [str], "may_not_import": [str]}]}`. Consumed by `get_layer_violations` when no inline `rules` are provided. - **36 new tests** (1527 total, 9 skipped) in `tests/test_architecture_tools.py`. ## [1.14.0] - 2026-04-01 ### Added - **`get_call_hierarchy(symbol_id, direction, depth)`** — new tool returning incoming callers and outgoing callees for any indexed symbol, N levels deep (default 3). Uses AST-derived detection: callers = symbols in importing files whose bodies mention the name; callees = imported symbols mentioned in the symbol's source body. No LSP required. Results include `{id, name, kind, file, line, depth}` per entry and `source: "ast"` in `_meta`. - **`get_impact_preview(symbol_id)`** — new tool answering "what breaks if I delete or rename this?". DFS over the call graph transitively, returns all affected symbols grouped by file (`affected_by_file`) with call-chain paths (`call_chains`) showing how each symbol is reached from the target. - **`_call_graph.py`** — shared internal module with `find_direct_callers`, `find_direct_callees`, `bfs_callers`, `bfs_callees` used by all call-graph tools. ### Changed - **`get_blast_radius`** — new optional `call_depth` param (default 0, disabled). When `call_depth > 0`, adds `callers` list of symbols that actually call the target symbol (call-level analysis) alongside the existing import-level `confirmed`/`potential` lists. All existing fields unchanged; fully backwards-compatible. - **`find_references`** — new optional `include_call_chain` param (default false, singular mode only). When true, each reference entry gains `calling_symbols`: symbols in that file whose source bodies mention the identifier. Batch mode ignores this flag. ## [1.13.2] - 2026-03-31 ### Fixed - **Per-project language config ignored during parsing** — `parse_file()` was calling `is_language_enabled(language)` without forwarding the `repo` path, so it always consulted the global config and never the per-project `.jcodemunch.jsonc`. Projects that declared their own `"languages"` list got `symbol_count: 0` when the global config had `"languages": []` (the recommended default). Fixed by threading `repo` from every `parse_file` call site (`index_folder`, `index_file`, `get_changed_symbols`, and all three pipeline functions in `_indexing_pipeline`) down to the language-gate check. `index_repo` is unaffected (remote repos have no local project config). Reported and root-caused by AmaralVini in issue #187. ## [1.13.1] - 2026-03-30 ### Changed - **`get_repo_outline` 2-level directory grouping for large repos** — when a repository has more than 500 indexed files, `directories` now groups by two path components (e.g., `src/api/`, `src/models/`) instead of only the top-level directory. Results are capped at 40 entries (highest file-count dirs first). Small repos (≤ 500 files) retain the existing 1-level behavior. Agents navigating large monorepos get actionable directory hints rather than a single coarse bucket. ## [1.13.0] - 2026-03-30 ### Added - **Cross-repository dependency tracking** — import graph tools (`find_importers`, `get_blast_radius`, `get_dependency_graph`, `get_changed_symbols`) now accept an opt-in `cross_repo: bool` parameter (default `false`). When enabled, the tools traverse repo boundaries using a package registry built from manifest files (`pyproject.toml`, `package.json`, `go.mod`, `Cargo.toml`, `*.csproj`). Cross-repo results are annotated with `"cross_repo": true` and `"source_repo"`. Zero behavior change when `cross_repo` is omitted. - **`get_cross_repo_map` tool** — new tool that returns the full cross-repository dependency map at the package level, or filtered to a single repo. Shows `depends_on` and `depended_on_by` for each indexed repo, plus a flat `cross_repo_edges` list. - **`package_names` field on `CodeIndex`** — package names are extracted from manifest files at index time (both `index_folder` and `index_repo`) and stored in the SQLite meta table. Old indexes load cleanly with `package_names = []`. - **`package_registry.py`** — new module providing `extract_package_names()` (5 ecosystems: Python, JS/TS, Go, Rust, C#), `extract_root_package_from_specifier()` (language-aware root extraction), `build_package_registry()` (in-memory registry with mtime-based cache), and `resolve_cross_repo_file()`. - **`cross_repo_default` config key** — boolean default for the `cross_repo` parameter across all import graph tools. Env var: `JCODEMUNCH_CROSS_REPO_DEFAULT`. Default: `false`. - **53 new tests** (1431 total, 9 skipped). ## [1.12.9] — docs patch 2026-03-30 ### Changed - **QUICKSTART.md Step 3** — upgraded AGENT_HOOKS.md footnote to an `[!IMPORTANT]` callout naming the "pressure bypass" failure mode (agent sees CLAUDE.md rule, ignores it under load) and explaining why hooks are needed for hard enforcement. - **QUICKSTART.md Troubleshooting** — added entry for "Claude uses jCodeMunch in simple tasks but falls back to Read/Grep in complex ones" pointing to AGENT_HOOKS.md. - **AGENT_HOOKS.md intro** — sharpened to explicitly name the failure mode: the agent sees the rule and skips it anyway because native tools feel faster under pressure or in long sessions. ## [1.12.9] - 2026-03-29 ### Added - **Tri-state `use_ai_summaries`** (PR #186 — contributed by MariusAdrian88) — Config key and `JCODEMUNCH_USE_AI_SUMMARIES` env var now accept three values: `"auto"` (new default; auto-detect provider from API keys, identical to previous `true`), `true` (use explicit `summarizer_provider` + `summarizer_model` from config), `false` (disable AI summarization entirely). Existing boolean `true`/`false` configs are fully backward-compatible. - **`summarizer_model` config key** — Override the default model for any provider via config or `JCODEMUNCH_SUMMARIZER_MODEL` env var. Priority: config key > provider-specific env var (`ANTHROPIC_MODEL`, `GOOGLE_MODEL`, etc.) > hardcoded default. Applies to all providers. - **`summarizer_max_failures` config key** — Circuit breaker threshold (default 3). After this many consecutive batch failures the summarizer stops calling the API and falls back to signature summaries for all remaining symbols. A successful batch resets the counter. Set 0 to disable. Thread-safe (`threading.Lock`). Configurable via `JCODEMUNCH_SUMMARIZER_MAX_FAILURES`. - **OpenRouter provider** — New provider via `OPENROUTER_API_KEY` using the OpenAI-compatible API at `openrouter.ai/api/v1`. Default model: `meta-llama/llama-3.3-70b-instruct:free` (zero cost). Auto-detect priority: last in chain (after GLM-5). Explicit selection: `summarizer_provider: "openrouter"` or `JCODEMUNCH_SUMMARIZER_PROVIDER=openrouter`. `jcodemunch-mcp config` now shows active OpenRouter section. - **`test_summarizer` diagnostic tool** — Sends a probe request to the configured AI summarizer and reports status: `ok`, `disabled`, `no_provider`, `misconfigured`, `fallback`, `timeout`, or `error`. Disabled by default (remove from `disabled_tools` in config to enable). Optional `timeout_ms` parameter (default 15000). - **`strict_timeout_ms` config key** — Configures the maximum milliseconds to block in `freshness_mode: strict` before proceeding with a stale index (previously hardcoded at 500ms). Default: 500. - **`embed_model` config key** — Promotes `JCODEMUNCH_EMBED_MODEL` env var to a config file setting. Configures the sentence-transformers model for local semantic embeddings. Config key takes priority over env var. - **`summarizer_provider` config key** — Promotes `JCODEMUNCH_SUMMARIZER_PROVIDER` env var to a config file setting. Takes priority over env var. - **60+ new tests** (1397 total, 7 skipped). ### Fixed - **`languages_adaptive` config key** (PR #185 — contributed by MariusAdrian88) — New boolean config key that enables automatic language detection based on files actually found in the indexed folder, overriding the `languages` allowlist for that run. Useful when indexing polyglot repos without maintaining an explicit language list. - **`meta_fields` default changed to `[]`** (PR #185) — Previously defaulted to `null` (all meta fields included); now defaults to `[]` (no `_meta` block) for token-efficient responses. Set to `null` in config to restore all meta fields. ## [1.12.7] - 2026-03-29 ### Added - **MiniMax and GLM-5 summarizer providers** (PR #184 — contributed by SkaldeStefan) — `MINIMAX_API_KEY` auto-detects MiniMax M2.7 (`api.minimax.io/v1`) and `ZHIPUAI_API_KEY` auto-detects GLM-5 (`api.z.ai`), both via the existing OpenAI-compatible summarizer path. `JCODEMUNCH_SUMMARIZER_PROVIDER` env var added for explicit selection (`anthropic`, `gemini`, `openai`, `minimax`, `glm`, `none`). Auto-detect priority: Anthropic → Gemini → OpenAI-compatible → MiniMax → GLM-5. Remote providers (including MiniMax/GLM) still require `allow_remote_summarizer: true` in `config.jsonc`. `get_provider_name()` exported from `jcodemunch_mcp.summarizer`. `jcodemunch-mcp config` now shows active provider and new MiniMax/GLM sections. 10 new tests (1332 total). ### Fixed - **`test_get_provider_name_unknown_falls_back_to_auto` test isolation** — test did not clear higher-priority env vars before auto-detecting MiniMax, causing false `anthropic` result in environments where `ANTHROPIC_API_KEY` is set. ## [1.12.6] - 2026-03-29 ### Fixed - **Gemini `CODE_RETRIEVAL_QUERY` KeyError on legacy SDK** (follow-up to #181) — The legacy `google-generativeai` package does not include `CODE_RETRIEVAL_QUERY` in its `TaskType` proto enum (it was introduced in the newer `google-genai` SDK). Passing that string to `genai.embed_content` caused a `KeyError` during semantic search. A new `_normalise_gemini_task_type` helper probes the installed SDK's `TaskType` enum at runtime and falls back `CODE_RETRIEVAL_QUERY` → `RETRIEVAL_QUERY` on legacy installs, producing equivalent retrieval quality. New SDK installs with `CODE_RETRIEVAL_QUERY` are unaffected. 5 new tests (1322 total). ## [1.12.5] - 2026-03-29 ### Added - **YAML and Ansible parser support** (PR #183 — contributed by SkaldeStefan) — `.yaml` and `.yml` files are now indexed as first-class symbols. A path-heuristic layer (`_looks_like_ansible_path`) automatically promotes Ansible-structured files (playbooks, roles, group_vars, host_vars, tasks, handlers, defaults) to the `ansible` language so they receive Ansible-aware symbol extraction: plays as `class`, tasks as `function`, roles and handlers as `type`, and variables as `constant`. Generic YAML falls back to a structural walker that emits container keys as `type` and scalar keys as `constant`. Multi-document YAML (multiple `---` sections) is handled correctly. pyyaml is already a base dependency — no extra install step. 8 new tests (1317 total). ## [1.12.4] - 2026-03-29 ### Added - **Task-aware embedding for Gemini** (closes #181) — When `GOOGLE_EMBED_MODEL` is configured, `embed_repo` now passes `task_type="RETRIEVAL_DOCUMENT"` to `genai.embed_content` for document indexing, and `search_symbols` passes `task_type="CODE_RETRIEVAL_QUERY"` when embedding the search query. Models that support task types (e.g. `text-embedding-004`, Gemini Embedding 2) produce measurably better code retrieval results; models that do not simply ignore the parameter. Other providers (sentence-transformers, OpenAI) are unaffected. - **`GEMINI_EMBED_TASK_AWARE` env var** — Set to `0` / `false` / `no` / `off` to opt out of task-type routing (default: on). Useful if your Gemini model predates task-type support. - **`embed_task_type` stored in meta** — The task type used when building the embedding index is now persisted. If you toggle `GEMINI_EMBED_TASK_AWARE`, `embed_repo` detects the mismatch and automatically forces a re-embed so query and document embeddings always come from the same task-type space. - **`task_type` field in `embed_repo` response** — Present when a task type was applied; absent for providers that do not use one. - 7 new tests (1309 total): `_gemini_task_aware` default/opt-out, Gemini document task type in `embed_repo`, `CODE_RETRIEVAL_QUERY` routing in `search_symbols`, opt-out disables task types, task-type change triggers re-embed, `EmbeddingStore` task type round-trip. ## [1.12.3] - 2026-03-29 ### Fixed - **Cross-process LRU cache invalidation** — SQLite WAL mode does not always update the `.db` file's mtime on commit. The watcher (a separate process) was writing new index data that the MCP server's in-memory cache never detected, causing agents to see stale results. New `_db_mtime_ns()` helper checks `max(db_mtime, db-wal_mtime)` so WAL writes are detected without an explicit cache eviction call. `os.utime()` added after `save_index()` and `incremental_save()` as a belt-and-suspenders measure; `os.utime()` runs *before* `_cache_put()` so the cached mtime matches what cross-process readers compute. - **`get_file_tree` silently ignored `max_files`** — the parameter was present in the MCP schema but was never passed through `call_tool` dispatch. - **Config template stale entries** — `wait_for_fresh` (removed v1.12.0) was still listed in `disabled_tools` template; staleness `_meta` fields (`index_stale`, `reindex_in_progress`, `stale_since_ms`) were still listed in `meta_fields` template. ### Added - **`file_tree_max_files` config key** — configures the `get_file_tree` result cap via `config.jsonc` or `JCODEMUNCH_FILE_TREE_MAX_FILES` env var (default 500). Per-call `max_files` param still overrides. - **`gitignore_warn_threshold` config key** — configures the missing-`.gitignore` warning threshold in `index_folder` via `config.jsonc` or `JCODEMUNCH_GITIGNORE_WARN_THRESHOLD` env var (default 500). Set `0` to disable entirely. - **Config template overhaul** — all keys now have inline documentation; tools and meta_fields lists sorted alphabetically; all missing keys added (`trusted_folders_whitelist_mode`, `exclude_secret_patterns`, `path_map`, watcher params, transport docs); `version` field added for future migration tooling. Note: the template now defaults to `"meta_fields": []` (no `_meta` in responses) rather than `null` (all fields) — better for token efficiency; users who want `_meta` should uncomment the desired fields. - 5 new tests covering `_db_mtime_ns` (no-WAL, WAL-newer, WAL-older) and the full cross-process cache invalidation scenario (1302 total). Contributed by MariusAdrian88 (PR #180). ## [1.12.2] - 2026-03-29 ### Added - **`.razor` (Blazor component) file support** — `.razor` extension now mapped to the `razor` language spec alongside `.cshtml`. `_parse_razor_symbols` extended to emit `@page` route directives and `@inject` dependency injection bindings as constant symbols, making Blazor component routes and injected services first-class navigable symbols. Includes `Counter.razor` test fixture and 8 new tests (1298 total). Contributed by drax1222 (PR #182). ## [1.12.1] - 2026-03-28 ### Fixed - **`get_file_tree` token overflow on large indexes** (closes #178) — results are now capped at `max_files` (default 500). When truncated, the response includes `truncated: true`, `total_file_count`, and a `hint` suggesting `path_prefix` to scope the query. `max_files` is exposed as a tool parameter so callers can raise it explicitly if needed. - **`index_folder` silent over-inclusion** (closes #178) — when no `.gitignore` is present in the repo root and ≥ 500 files are indexed, a warning is now included in the result advising the user to add a `.gitignore` and re-index. - 10 new tests (1288 total). ## [1.12.0] - 2026-03-28 ### Removed - **`check_freshness` and `wait_for_fresh` MCP tools** — no client ever consumed these; removing them saves ~400 schema tokens per call. Server-side freshness management via `freshness_mode` config key (`relaxed`/`strict`) remains fully functional. - **Staleness `_meta` fields** (`index_stale`, `reindex_in_progress`, `stale_since_ms`) — ~30-50 tokens of annotated noise per response. The watcher still manages freshness internally; strict mode blocks silently in `call_tool` before returning clean results. - `powered_by` removed from `_meta` common fields. ### Fixed - **Watcher config layering** — `_get_watcher_enabled()` previously bypassed `config_module.get()` and read `JCODEMUNCH_WATCH` env var directly, silently ignoring the `"watch"` key in `config.jsonc`. Precedence is now: CLI flag > config file (with env var as fallback only when key absent). - **Hash-cache miss reindex skip** — when the watcher's in-memory hash cache missed, the fallback read the file from disk. By the time `watchfiles` delivers the event the file already has new content, making `old_hash == new_hash` and silently skipping the change. Fixed with a `"__cache_miss__"` sentinel that guarantees re-parse on any cache miss. - **Flaky Windows tests from SQLite WAL cache contamination** — tests that modified the DB directly didn't invalidate the in-memory LRU cache; WAL mode on Windows doesn't always update file mtime on write, so the cache key matched stale data. Fixed via `tests/conftest.py` autouse fixtures for cache clear and config reset, plus targeted `_cache_evict()` calls after direct DB writes. - `test_openai_summarizer_timeout_config` now correctly flows `allow_remote_summarizer` through `load_config()` instead of reading from `config.get()` directly. ### Added - **Config-driven watcher parameters** — all watcher options are now configurable via `config.jsonc` (CLI flags remain as overrides). New keys: - `watch_debounce_ms` (int, default 2000) — was wired in config.py but not forwarded to watcher kwargs - `watch_paths` (list, default `[]` → CWD) — folders to watch - `watch_extra_ignore` (list, default `[]`) — additional gitignore-style patterns - `watch_follow_symlinks` (bool, default `false`) - `watch_idle_timeout` (int or null, default `null`) — auto-stop after N minutes idle - `watch_log` (str or null, default `null`) — log watcher output to file; `"auto"` = temp file - 25 new tests (1285 total). ## [1.11.17] - 2026-03-27 ### Added - **Optional semantic / embedding search (Feature 8)** — hybrid BM25 + vector search, opt-in only, zero mandatory new dependencies. - `search_symbols` gains three new params: `semantic` (bool, default `false`), `semantic_weight` (float 0–1, default 0.5), `semantic_only` (bool, default `false`). When `semantic=false` (default) there is zero performance impact and zero new imports. - **New `embed_repo` tool** — precomputes and caches all symbol embeddings in one pass (`batch_size`, `force` params). Optional warm-up; `search_symbols` lazily embeds missing symbols on first semantic query. - **New `EmbeddingStore`** — thin SQLite CRUD layer (`symbol_embeddings` table) in the existing per-repo `.db` file. Embeddings serialised as float32 BLOBs via stdlib `array` module. Persists across restarts; invalidatable per-symbol for incremental reindex. - **Three embedding providers** (priority order): local `sentence-transformers` (`JCODEMUNCH_EMBED_MODEL` env var), Gemini (`GOOGLE_API_KEY` + `GOOGLE_EMBED_MODEL`), OpenAI (`OPENAI_API_KEY` + `OPENAI_EMBED_MODEL`). `OPENAI_API_KEY` alone does **not** activate embeddings (prevents conflation with local-LLM summariser use). - **Hybrid ranking**: `combined = (1−w) × bm25_normalised + w × cosine_similarity`. BM25 normalised by max score over the candidate set. `semantic_weight=0.0` produces identical results to pure BM25. - **Pure Python cosine similarity** — `math.sqrt` + `sum()`, no numpy required. - `semantic=true` with no provider configured returns `{"error": "no_embedding_provider", "message": "..."}` (structured error, not a crash). - New optional dep: `pip install jcodemunch-mcp[semantic]` installs `sentence-transformers>=2.2.0`. - 22 new tests. ## [1.11.16] - 2026-03-27 ### Added - **Token-budgeted context assembly (Feature 5)** — two new capabilities: - `get_context_bundle` gains `token_budget`, `budget_strategy`, and `include_budget_report` params. When `token_budget` is set, symbols are ranked and trimmed to fit. `budget_strategy` controls how: `most_relevant` (default) ranks by file import in-degree, `core_first` keeps the primary symbol first then ranks the rest by centrality, `compact` strips all source bodies and returns signatures only. `include_budget_report=true` adds a `budget_report` field showing `budget_tokens`, `used_tokens`, `included_symbols`, `excluded_symbols`, and `strategy`. Fully backward-compatible: all new params default to existing behavior. - **New `get_ranked_context` tool** — standalone token-budgeted context assembler. Takes a `query` + `token_budget` (default 4000) and returns the best-fit symbols with their full source, greedy-packed by combined score. `strategy` controls ranking: `combined` (BM25 + PageRank weighted sum, default), `bm25` (pure text relevance), `centrality` (PageRank only). Optional `include_kinds` and `scope` params restrict the candidate set. Response includes per-item `relevance_score`, `centrality_score`, `combined_score`, `tokens`, and `source`. Token counting uses `len(text) // 4` heuristic with optional `tiktoken` upgrade (no hard dep). No new dependencies. 19 new tests. ## [1.11.15] - 2026-03-27 ### Added - **`get_changed_symbols` tool** — maps a git diff to affected symbols. Given two commits (`since_sha` / `until_sha`, defaulting to index-time SHA vs HEAD), returns `added_symbols`, `removed_symbols`, and `changed_symbols` (with `change_type`: "added", "removed", "modified", or "renamed"). `renamed` detection fires when body hash is identical but name differs. Set `include_blast_radius=true` to also return downstream importers (with `max_blast_depth` hop limit). Requires a locally indexed repo (`index_folder`); GitHub-indexed repos return a clear error. Requires `git` on PATH; graceful error if not available. Filters index-storage files (e.g. `.index/`) from the diff when the storage dir is inside the repo. No new dependencies. 12 new tests. ## [1.11.14] - 2026-03-27 ### Added - **`find_dead_code` tool** — finds files and symbols unreachable from any entry point using the import graph. Entry points auto-detected by filename (`main.py`, `__main__.py`, `conftest.py`, `manage.py`, etc.), `__init__.py` package roots, and `if __name__ == "__main__"` guards (Python only). Returns `dead_files` and `dead_symbols` with confidence scores: `1.0` = zero importers, no framework decoration; `0.9` = zero importers in a test file; `0.7` = all importers are themselves dead (cascading). Parameters: `granularity` ("symbol"/"file"), `min_confidence` (default 0.8), `include_tests` (bool), `entry_point_patterns` (additional glob roots). No new dependencies. 13 new tests. ## [1.11.13] - 2026-03-27 ### Fixed - **Manifest watcher reliability** — replaced `watchfiles.awatch()` in `_manifest_watcher` with a simple 0.5s polling loop. `watchfiles` was unreliable on Windows (especially in temp directories used by tests and agent hooks), causing the manifest watcher to silently miss create/remove events. Polling the manifest file's size every 500ms is sufficient for this append-only JSONL file and works reliably on all platforms. ## [1.11.12] - 2026-03-27 ### Added - **PageRank / centrality ranking** — new `get_symbol_importance` tool returns the most architecturally important symbols in a repo, ranked by full PageRank or simple in-degree on the import graph. Parameters: `top_n` (default 20), `algorithm` ("pagerank" or "degree"), `scope` (subdirectory filter). Response includes `symbol_id`, `rank`, `score`, `in_degree`, `out_degree`, `kind`, `iterations_to_converge`. New `sort_by` parameter on `search_symbols` ("relevance" | "centrality" | "combined") — "centrality" filters by BM25 query match but ranks by PageRank; "combined" adds PageRank as weighted boost to BM25 score; "relevance" (default) is unchanged (backward compatible). `get_repo_outline` now includes `most_central_symbols` (top 10 symbols by PageRank score, one representative per file, alongside the existing `most_imported_files`). PageRank implementation: damping=0.85, convergence threshold=1e-6, max 100 iterations, dangling-node correction, cached in `_bm25_cache` per `CodeIndex` load. 23 new tests. ## [1.11.11] - 2026-03-27 ### Added - **Fuzzy symbol search** — `search_symbols` gains three new parameters: `fuzzy` (bool, default `false`), `fuzzy_threshold` (float, default `0.4`), and `max_edit_distance` (int, default `2`). When enabled, a trigram Jaccard + Levenshtein pass runs as fallback when BM25 confidence is low (top score < 0.1) or when explicitly requested. Fuzzy results carry `match_type="fuzzy"`, `fuzzy_similarity`, and `edit_distance` fields; BM25 results carry `match_type="exact"`. Zero behavioral change when `fuzzy=false` (default). No new dependencies — pure stdlib (`frozenset` trigrams + Wagner-Fischer edit distance). 21 new tests. ## [1.11.10] - 2026-03-27 ### Added - **Blast radius depth scoring** — `get_blast_radius` now always returns `direct_dependents_count` (depth-1 count) and `overall_risk_score` (0.0–1.0, weighted by hop distance using `1/depth^0.7`). New `include_depth_scores=true` parameter adds `impact_by_depth` (files grouped by BFS layer, each with a `risk_score`). Flat `confirmed`/`potential` lists are preserved unchanged (backward compatible). 14 new tests. ## [1.11.9] - 2026-03-27 ### Fixed - **Windows CI: trusted_folders tests** — `_platform_path_str` was using `str(Path(...))` which on Windows returns backslash paths (`C:\work`). When embedded raw into f-string JSON literals in tests, the backslash produced invalid `\escape` sequences, causing `config.jsonc` parse failures across all 4 Windows matrix legs (6 tests failing). Fixed by switching to `.as_posix()`, which returns forward-slash paths (`C:/work`) that are valid in both JSON and Windows pathlib. ## [1.11.8] - 2026-03-27 ### Added - **`trusted_folders` allowlist for `index_folder`** (PR #175, credit: @tmeckel) — new `trusted_folders` config key (plus `trusted_folders_whitelist_mode`) restricts or blocks indexing by path. Whitelist mode (default) allows only explicitly named roots; blacklist mode blocks specific paths while trusting all others. Path-aware matching (not string-prefix). Project config supports `.`, `./subdir`, and bare relative paths. Escape-attempt paths are rejected. Empty list preserves existing behavior (backward compatible). Env var fallback via `JCODEMUNCH_TRUSTED_FOLDERS`. ## [1.11.7] - 2026-03-27 ### Added - **`check_freshness` tool** — compares the git HEAD SHA recorded at index time against the current HEAD for locally indexed repos. Returns `fresh` (bool), `indexed_sha`, `current_sha`, and `commits_behind`. GitHub repos return `is_local: false` with an explanatory message. `get_repo_outline` staleness check upgraded to SHA-based comparison (accurate) with time-based fallback for GitHub/no-git repos; `is_stale` added to `_meta`. 8 new tests. ## [1.11.6] - 2026-03-27 ### Added - **Structured file-cap warnings** — `index_folder` and `index_repo` now surface `files_discovered`, `files_indexed`, and `files_skipped_cap` fields plus a human-readable `warning` when the file cap is hit. Previously a silent "note". - **`_meta` hint on single-symbol responses** — `search_symbols` and `get_symbol_source` single-symbol responses now include a `_meta` hint pointing to `get_context_bundle`. ### Changed - **Benchmark docs** — `METHODOLOGY.md` expanded with a "Common Misreadings" section; reproducible results table added to README. ## [1.11.5] - 2026-03-26 ### Fixed - **`tsconfig.json`/`jsconfig.json` parsed as JSONC** — previously `json.loads()` silently failed on commented tsconfigs (TypeScript projects commonly use `//` comments in tsconfig.json), leaving `alias_map` empty and causing `find_importers`/`get_blast_radius` to return 0 alias-based results. Now parsed with the same JSONC stripper used for `config.jsonc`. Also adds a test for nested layouts with specific `@/lib/*` overrides. Closes #170. 5 new tests. ## [1.11.4] - 2026-03-25 ### Fixed - **TypeScript/SvelteKit path alias resolution** — `find_importers`, `get_blast_radius`, `get_dependency_graph`, and 5 other import-graph tools now resolve `@/*`, `$lib/*`, and other configured aliases by reading `compilerOptions.paths` from `tsconfig.json`/`jsconfig.json` at the project root. Also resolves TypeScript's ESM `.js`→`.ts` extension convention. `alias_map` is auto-loaded from `source_root` and cached at module level. Closes #169. 10 new tests. ## [1.11.3] - 2026-03-25 ### Added - **Debug logging for silent skip paths** — all three skip paths (`skip_dir`, `skip_file`, `secret`) now emit debug-level log lines. `skip_dir` and `skip_file` counters added to the discovery summary. `exclude_secret_patterns` config option suppresses specific `SECRET_PATTERNS` entries (workaround for `*secret*` glob false-positives on full relative paths in Go monorepos). (PR #168, credit: @DrHayt) 6 new tests. ## [1.11.2] - 2026-03-25 ### Fixed - **`resolve_repo` hang on Windows** — added `stdin=subprocess.DEVNULL` to the git subprocess call in `_git_toplevel()`. Without it, the git child process inherits the MCP stdio pipe and blocks indefinitely. Same pattern fixed in v1.1.7 for `index_folder`. Closes #166. - **`parse_git_worktrees` hang on Windows** (watcher) — same missing `stdin=subprocess.DEVNULL` fix, preventative. ## [1.8.3] - 2026-03-18 ### Added - **`find_importers`: `has_importers` flag** — each result now includes `has_importers: bool`. When `false`, the importer itself has no importers, revealing transitive dead code chains without requiring recursive calls. Implemented as one additional O(n) pass over the import graph; no re-indexing required. Closes #132. Identified via 50-iteration dead code A/B test (#130). ## [1.8.2] - 2026-03-18 ### Changed - **`get_file_outline` tool description** — now explicitly states "full signatures (including parameter names)" and adds "Use signatures to review naming at parameter granularity without reading the full file." Parameter names were always present in the `signature` field; the description now makes this discoverable. Closes #131. ## [1.8.1] - 2026-03-18 ### Fixed - **Dynamic `import()` detection in JS/TS/Vue** — `find_importers` now detects Vue Router lazy routes and other code-splitting patterns using `import('specifier')` call syntax. Previously these files appeared to have zero importers and were misclassified as dead. Identified via 50-iteration dead code A/B test (#130, @Mharbulous); 4 Vue view files affected. ## [1.8.0] - 2026-03-18 ### Security - **Supply-chain integrity check** — `verify_package_integrity()` added to `security.py` and called at startup. Uses `importlib.metadata.packages_distributions()` to identify the distribution that actually owns the running code. If it differs from the canonical `jcodemunch-mcp`, a `SECURITY WARNING` is printed to stderr. Catches the fork-republishing attack class described at https://news.ycombinator.com/item?id=47428217. Silent for source/editable installs. ### Added - **`authors` and `[project.urls]`** in `pyproject.toml` — PyPI pages now display official provenance metadata (author, homepage, issue tracker). ## [1.7.9] - 2026-03-18 ### Added - **JS/TS const extraction** — top-level `const` and `export const` declarations in JavaScript, TypeScript, and TSX are now indexed as `constant` symbols. Arrow functions and function expressions assigned to consts are correctly skipped (handled by existing function extraction). Accepts all identifier naming conventions for JS/TS. - **`index_file` tool** (PR #126, credit: @thellMa) — re-index a single file instantly after editing. Locates the correct index by scanning `source_root` of all indexed repos (picks most specific match), validates security, computes hash + mtime, and exits early if the file is unchanged. Parses with tree-sitter, runs context providers, and calls `incremental_save()` for a surgical single-file update. Registered as a new MCP tool with `path`, `use_ai_summaries`, and `context_providers` parameters. - **mtime optimization** (PR #126, credit: @thellMa) — `index_folder` and `index_repo` now check file modification time (`st_mtime_ns`) before reading or hashing. Files with unchanged mtimes are skipped entirely; hashes are computed lazily only for files whose mtime changed. Indexes store a `file_mtimes` dict; old indexes without mtime data fall back to hash-all for backward compatibility. - **`watch-claude` CLI subcommand** — auto-discover and watch Claude Code worktrees via two complementary modes: - **Hook-driven mode** (recommended): install `WorktreeCreate`/`WorktreeRemove` hooks that call `jcodemunch-mcp hook-event create|remove`. Events are written to `~/.claude/jcodemunch-worktrees.jsonl` and `watch-claude` reacts instantly via filesystem watch. - **`--repos` mode**: `jcodemunch-mcp watch-claude --repos ~/project1 ~/project2` polls `git worktree list --porcelain` and filters for Claude-created worktrees (branches matching `claude/*` or `worktree-*`). - Both modes can run simultaneously. When a worktree is removed, the watcher stops and the index is invalidated. - **`hook-event` CLI subcommand** — `jcodemunch-mcp hook-event create|remove` reads Claude Code's hook JSON from stdin and appends to the JSONL manifest. Designed to be called from Claude Code's `WorktreeCreate`/`WorktreeRemove` hooks. ### Changed - **Shared indexing pipeline** (PR #126, credit: @thellMa) — new `_indexing_pipeline.py` consolidates logic previously duplicated across `index_folder`, `index_repo`, and the new `index_file`: `file_languages_for_paths()`, `language_counts()`, `complete_file_summaries()`, `parse_and_prepare_incremental()`, and `parse_and_prepare_full()`. All three tools now call the shared pipeline functions. - `main()` subcommand set expanded to include `hook-event` and `watch-claude`. ## [1.7.2] - 2026-03-17 ### Fixed - **Stale `context_metadata` on incremental save** — `{}` from active providers was treated as falsy, silently preserving old metadata instead of clearing it. Changed to `is not None` check. - **`_resolve_description` discarding surrounding text** — `"Prefix {{ doc('name') }} suffix"` now preserves both prefix and suffix instead of returning only the doc block content. - **dbt tags only extracted from `config.tags`** — top-level `model.tags` (valid in dbt schema.yml) are now merged with `config.tags`, deduplicated. - **Redundant `posixpath.sep` check** in `resolve_specifier` — removed duplicate of adjacent `"/" not in` check. - **Inaccurate docstring** on `_detect_dbt_project` — said "max 2 levels deep" but only checks root + immediate children. ### Changed - **Concurrent AI summarization** — `BaseSummarizer.summarize_batch()` now uses `ThreadPoolExecutor` (default 4 workers) for Anthropic and Gemini providers. Configurable via `JCODEMUNCH_SUMMARIZER_CONCURRENCY` env var. Matches the pattern already used by `OpenAIBatchSummarizer`. ~4x faster on large projects. - **O(1) stem resolution** — `resolve_specifier` stem-matching fallback now uses a cached dict lookup instead of O(n) linear scan. Significant perf improvement for dbt projects with thousands of files, called in tight loops across 7 tools. - **`collect_metadata` collision warning** — logs a warning when two providers emit the same metadata key, instead of silently overwriting via `dict.update()`. - **`find_importers`/`find_references` tool descriptions** — now note that `{{ source() }}` edges are extracted but not resolvable since sources are external. - **`search_columns` cleanup** — moved `import fnmatch` to top-level; documented empty-query + `model_pattern` behavior (acts as "list all columns for matching models"). ## [1.7.0] - 2026-03-17 ### Added - **Centrality ranking** — `search_symbols` BM25 scores now include a log-scaled bonus for symbols in frequently-imported files, surfacing core utilities as tiebreakers when relevance scores are otherwise equal. - **`get_symbol_diff`** — diff two indexed snapshots by `(name, kind)`. Reports added, removed, and changed symbols using `content_hash` for change detection. Index the same repo under two names to compare branches. - **`get_class_hierarchy`** — traverse inheritance chains upward (ancestors via `extends`/`implements`/Python parentheses) and downward (subclasses/implementors) from any class. Handles external bases not in the index. - **`get_related_symbols`** — find symbols related to a given one via three heuristics: same-file co-location (weight 3.0), shared importers (1.5), name-token overlap (0.5/token). - **Git blame context provider** — `GitBlameProvider` auto-activates during `index_folder` when a `.git` directory is present. Runs a single `git log` at index time and attaches `last_author` + `last_modified` to every file via the existing context provider plugin system. - **`suggest_queries`** — scan the index and get top keywords, most-imported files, kind/language distribution, and ready-to-run example queries. Ideal first call when exploring an unfamiliar repository. - **Markdown export** — `get_context_bundle` now accepts `output_format="markdown"`, returning a paste-ready document with import blocks, docstrings, and fenced source code. ## [1.6.1] - 2026-03-17 ### Added - **`watch` CLI subcommand** (PR #113, credit: @DrHayt) — `jcodemunch-mcp watch ...` monitors one or more directories for filesystem changes and triggers incremental re-indexing automatically. Uses `watchfiles` (Rust-based, async) for OS-native notifications with configurable debounce. Install with `pip install jcodemunch-mcp[watch]`. - `watchfiles>=1.0.0` optional dependency under `[watch]` and `[all]` extras. ### Changed - `main()` refactored to use argparse subcommands (`serve`, `watch`). Full backwards compatibility preserved — bare `jcodemunch-mcp` and legacy flags like `--transport` continue to work unchanged. ## [1.6.0] - 2026-03-17 ### Added - **`get_context_bundle` multi-symbol bundles** — new `symbol_ids` (list) parameter fetches multiple symbols in one call. Import statements are deduplicated when symbols share a file. New `include_callers=true` flag appends the list of files that directly import each symbol's defining file. ### Changed - Single `symbol_id` (string) remains fully backward-compatible. ## [1.5.9] - 2026-03-17 ### Added - **`get_blast_radius` tool** — find every file affected by changing a symbol. Given a symbol name or ID, traverses the reverse import graph (up to 3 hops) and text-scans each importing file. Returns `confirmed` (imports the file + references the symbol name) and `potential` (imports the file only — wildcard/namespace imports). Handles ambiguous names by listing all candidate IDs. ## [1.5.8] - 2026-03-17 ### Changed - **BM25 search** — replaced hand-tuned substring scoring in `search_symbols` with proper BM25 + IDF. IDF is computed over all indexed symbols at query time (no re-indexing required). CamelCase/snake_case tokenization splits `getUserById` into `get`, `user`, `by`, `id` for natural language queries. Per-field repetition weights: name 3×, keywords 2×, signature 2×, summary 1×, docstring 1×. Exact name match retains a +50 bonus. `debug=true` now returns per-field BM25 score breakdowns. ## [1.5.7] - 2026-03-17 ### Added - **`get_dependency_graph` tool** — file-level import graph with BFS traversal up to 3 hops. `direction` parameter: `imports` (what this file depends on), `importers` (what depends on this file), or `both`. Returns nodes, edges, and per-node neighbor map. Built from existing index data — no re-indexing required. ## [1.5.6] - 2026-03-17 ### Added - **`get_session_stats` tool** — process-lifetime token savings dashboard. Reports tokens saved and cost avoided (current session + all-time cumulative), per-tool breakdown, session duration, and call counts. ## [1.5.5] - 2026-03-17 ### Added - **Tiered loading** (`detail_level` on `search_symbols`) — `compact` returns id/name/kind/file/line only (~15 tokens/result, ideal for discovery); `standard` is unchanged (default); `full` inlines source, docstring, and end_line. - `byte_length` field added to all `search_symbols` result entries regardless of detail level. ## [1.5.4] - 2026-03-17 ### Added - **Token budget search** (`token_budget=N` on `search_symbols`) — greedily packs results by byte length until the budget is exhausted. Overrides `max_results`. Reports `tokens_used` and `tokens_remaining` in `_meta`. ## [1.5.3] - 2026-03-17 ### Added - **Microsoft Dynamics 365 Business Central AL language support** (PR #110, credit: @DrHayt) — `.al` files are now indexed. Extracts procedures, triggers, codeunits, tables, pages, reports, and XML ports. ## [1.5.2] - 2026-03-17 ### Fixed - `tokens_saved` always reporting 0 in `get_file_outline` and `get_repo_outline`. ## [1.5.1] - 2026-03-16 ### Added - **Benchmark reproducibility** — `benchmarks/METHODOLOGY.md` with full reproduction details. - **HTTP bearer token auth** — `JCODEMUNCH_HTTP_TOKEN` env var secures HTTP transport endpoints. - **`JCODEMUNCH_REDACT_SOURCE_ROOT`** env var redacts absolute local paths from responses. - **Schema validation on index load** — rejects indexes missing required fields. - **SHA-256 checksum sidecars** — index integrity verification on load. - **GitHub rate limit retry** — exponential backoff in `fetch_repo_tree`. - **`TROUBLESHOOTING.md`** with 11 common failure scenarios and solutions. - CI matrix extended to Windows and Python 3.13. ### Changed - Token savings labeled as estimates; `estimate_method` field added to all `_meta` envelopes. - `search_text` raw byte count now only includes files with actual matches. - `VALID_KINDS` moved to a `frozenset` in `symbols.py`; server-side validation rejects unknown kinds. ## [1.5.0] - 2026-03-16 ### Added - **Cross-process file locking** via `filelock` — prevents index corruption under concurrent access. - **LRU index cache with mtime invalidation** — re-reads index JSON only when the file changes on disk. - **Metadata sidecars** — `list_repos` reads lightweight sidecar files instead of loading full index JSON. - **Streaming file indexing** — peak memory reduced from ~1 GB to ~500 KB during large repo indexing. - **Bounded heap search** — `O(n log k)` instead of `O(n log n)` for bounded result sets. - **`BaseSummarizer` base class** — deduplicates `_build_prompt`/`_parse_response` across AI summarizers. - +13 new tests covering `search_columns`, `get_context_bundle`, and ReDoS hardening. ### Fixed - **ReDoS protection** in `search_text` — pathological regex patterns are rejected before execution. - **Symlink-safe temp files** — atomic index writes use `tempfile` rather than direct overwrite. - **SSRF prevention** — API base URL validation rejects non-HTTP(S) schemes. ## [1.4.4] - 2026-03-16 ### Added - **Assembly language support** (PR #105, credit: @astrobleem) — WLA-DX, NASM, GAS, and CA65 dialects. `.asm`, `.s`, `.wla` files indexed. Extracts labels, macros, sections, and directives as symbols. - `"asm"` added to `search_symbols` language filter enum. ## [1.4.3] - 2026-03-15 ### Fixed - Cross-process token savings loss — `token_tracker` now uses additive flush so savings accumulated in one process are not overwritten by a concurrent flush from another. ## [1.4.2] - 2026-03-15 ### Added - XML `name` and `key` attribute extraction — elements with `name=` or `key=` attributes are now indexed as `constant` symbols (closes #102). ## [1.4.1] - 2026-03-14 ### Added - **Minimal CLI** (`cli/cli.py`) — 47-line command-line interface over the shared `~/.code-index/` store covering all jMRI ops: `list`, `index`, `outline`, `search`, `get`, `text`, `file`, `invalidate`. - `cli/README.md` — explains MCP as the preferred interface and documents CLI usage. ### Changed - README onboarding improved: added "Step 3: Tell Claude to actually use it" with copy-pasteable `CLAUDE.md` snippets. ## [1.4.0] - 2026-03-13 ### Added - **AutoHotkey hotkey indexing** — all three hotkey syntax forms are now extracted as `kind: "constant"` symbols: bare triggers (`F1::`), modifier combos (`#n::`), and single-line actions (`#n::Run "notepad"`). Only indexed at top level (not inside class bodies). - **`#HotIf` directive indexing** — both opening expressions (`#HotIf WinActive(...)`) and bare reset (`#HotIf`) are indexed, searchable by window name or expression string. - **Public benchmark corpus** — `benchmarks/tasks.json` defines the 5-task × 3-repo canonical task set in a tool-agnostic format. Any code retrieval tool can be evaluated against the same queries and repos. - **`benchmarks/README.md`** — full methodology documentation: baseline definition, jMunch workflow, how to reproduce, how to benchmark other tools. - **`benchmarks/results.md`** — canonical tiktoken-measured results (95.0% avg reduction, 20.2x ratio, 15 task-runs). Replaces the obsolete v0.2.22 proxy-based benchmark files. - Benchmark harness now loads tasks from `tasks.json` when present, falling back to hardcoded values. ## [1.3.9] - 2026-03-13 ### Added - **OpenAPI / Swagger support** — `.openapi.yaml`, `.openapi.yml`, `.openapi.json`, `.swagger.yaml`, `.swagger.yml`, `.swagger.json` files are now indexed. Well-known basenames (`openapi.yaml`, `swagger.json`, etc.) are auto-detected regardless of directory. Extracts: API info block, paths as `function` symbols, schema definitions as `class` symbols, and reusable component schemas. - `get_language_for_path` now checks well-known OpenAPI basenames before compound-extension matching. - `"openapi"` added to `search_symbols` language filter enum. ## [1.3.8] - 2026-03-13 ### Added - **`get_context_bundle` tool** — returns a self-contained context bundle for a symbol: its definition source, all direct imports, and optionally its callers/implementers. Replaces the common `get_symbol` + `find_importers` + `find_references` round-trip with a single call. Scoped to definition + imports in this release. ## [1.3.7] - 2026-03-13 ### Added - **C# properties, events, and destructors** (PR #100) — `get { set {` property accessors, `event EventHandler Name`, and `~ClassName()` destructors are now extracted as symbols alongside existing C# method/class support. ## [1.3.6] - 2026-03-13 ### Added - **XML / XUL language support** (PR #99) — `.xml` and `.xul` files are now indexed. Extracts: document root element as a `type` symbol, elements with `id` attributes as `constant` symbols, and `