# SigMap: Complete LLM Reference

> The deterministic, verifiable grounding layer for AI code work.
> A reproducible signature-and-evidence map that agents, CI, and reviewers can trust and audit. No embeddings, no vector DB, fully offline.

SigMap is the deterministic, verifiable grounding layer for AI code work. It extracts function and class signatures from a codebase and builds a byte-stable signature-and-evidence map that agents, CI, and reviewers can trust and audit, proving which files and symbols are real before acting. Deterministic TF-IDF ranking keeps the relevant context in scope (cutting tokens ~97% as a side effect), with no LLM calls, embeddings, or vector database. Works with Claude, Cursor, GitHub Copilot, Aider, Windsurf, local LLMs, and MCP.

# Version: 7.30.0 | Benchmark: sigmap-v7.30-main (2026-06-23)
# Source: auto-generated from package.json, version.json, benchmarks/latest.json, src/mcp/tools.js, src/config/defaults.js
# Regenerate: npm run generate:llms | Validate: npm run validate:llms

---

## Core metrics (benchmark: sigmap-v7.30-main, 2026-06-23)

| Metric | Without SigMap | With SigMap |
|--------|----------------|-------------|
| Retrieval hit@5 | 13.6% (random) | 75.6% (5.6× lift) |
| Token reduction | n/a | 97.0% average |
| Task success proxy | 10% | 52.2% |
| Prompts per task | 2.84 | 1.72 (39.4% fewer) |
| Supported languages | n/a | 33 |
| MCP tools | n/a | 17 |
| npm runtime dependencies | n/a | 0 |

---

## Installation

```bash
# Run immediately without installing
npx sigmap

# Install globally
npm install -g sigmap

# Auto-wire MCP + editor config + git hook + watcher
npx sigmap --setup
```

---

## CLI commands: complete reference

Every command and flag (`sigmap --help`):

```
sigmap                                    Generate context once and exit
sigmap --monorepo                         Generate per-package context (monorepo)
sigmap --each                             Run for every repo in the current directory
sigmap --routing                          Include model routing hints in output
sigmap --format cache                     Also write Anthropic prompt-cache JSON
sigmap --track                            Append run metrics to .context/usage.ndjson
sigmap --watch                            Generate + watch for file changes
sigmap --setup                            Generate + install git hook + watch
sigmap --mcp                              Start MCP server on stdio
sigmap --report                           Token reduction stats to stdout
sigmap --report --json                    Token report as JSON (for CI; exits 1 if over budget)
sigmap --report --history                 Print usage log summary from .context/usage.ndjson
sigmap --report --history --chart         Include inline SVG charts + Unicode sparklines
sigmap --dashboard                        Write benchmarks/reports/dashboard.html
sigmap --suggest-tool ""                  Recommend model tier for a task description
sigmap --suggest-tool "" --json           Machine-readable tier recommendation
sigmap --health                           Print composite health score
sigmap --health --json                    Machine-readable health score
sigmap gain                               Token-savings dashboard (totals + by-operation)
sigmap gain --all                         Add daily / weekly / monthly trend tables
sigmap gain --json                        Aggregate savings as JSON
sigmap gain --since 7d                    Window filter (7d, 30d, 12h, or ISO date)
sigmap gain --top | --model               Limit rows / set $ pricing model
sigmap gain --reset                       Clear the local savings log (.context/gain.ndjson)
sigmap ... --no-track                     Disable gain savings capture for this run
sigmap --diff                             Generate context for git-changed files only
sigmap --diff                             Generate context + structural diff vs base ref (e.g. main)
sigmap --diff --staged                    Generate context for staged files only
sigmap --benchmark                        Run retrieval benchmark (benchmarks/tasks/retrieval.jsonl)
sigmap --adapter                          Generate for a specific adapter only (v3.0+)
sigmap --adapter --json                   Show adapter output path as JSON
sigmap --benchmark --json                 Benchmark results as JSON
sigmap --eval                             Alias for --benchmark
sigmap --analyze                          Per-file breakdown: sigs, tokens, extractor, coverage
sigmap --analyze --json                   Breakdown as JSON
sigmap --analyze --slow                   Re-time each extractor; flag files >50ms
sigmap --diagnose-extractors              Run all 21 extractors vs fixtures; show pass/fail + diff
sigmap --query ""                         Rank files by relevance to a query
sigmap --query "" --json                  Ranked results as JSON
sigmap --query "" --top                   Limit results to top N files (default 10)
sigmap learn --good                       Boost files in .context/weights.json
sigmap learn --bad                        Penalize files in .context/weights.json
sigmap learn --reset                      Delete learned file weights
sigmap weights                            Show learned file multipliers
sigmap weights --json                     Learned weights as JSON
sigmap --impact                           Show every file impacted by changing
sigmap --impact --json                    Impact as JSON {changed, direct, transitive, tests, routes}
sigmap --impact --depth                   BFS depth limit (default 3, 0=unlimited)
sigmap verify-ai-output                   Flag fake files/tests/imports/symbols/npm-scripts in an AI answer
sigmap verify-ai-output --json            Hallucination report as JSON (exits 1 if issues)
sigmap verify-ai-output --report          Write a standalone HTML report (red/amber/green)
sigmap conventions                        Extract repo file-naming/export/test conventions (--conflicts, --inject, --report, --fix)
sigmap scaffold ""                        Propose a convention-matched file/dir scaffold (--ext, --threshold, --force, --json)
sigmap verify-plan                        Check a plan vs the live index — files/symbols exist, blast radius, scope (--json)
sigmap review-pr                          Audit a diff — scope drift, god-node edits, missing tests, security files (--staged, --json)
sigmap create ""                          Grounded-creation pipeline: scaffold → verify-plan → verify-ai-output → review-pr (--staged)
sigmap squeeze                            Minimize a pasted stacktrace/CI-log/JSON blob (--json for stats)
sigmap ask "" --squeeze                   Auto-accept input minimization (no prompt; for scripts/CI)
sigmap ask "" --no-squeeze                Disable input minimization entirely
sigmap ask "" --squeeze-threshold N       Min reduction % to prompt (default 30)
sigmap evidence ""                        Build a deterministic Evidence Pack (JSON) → .context/evidence-pack.json
sigmap evidence "" --markdown             Emit the Markdown handoff rendering to stdout
sigmap evidence "" --top --budget --out   Tune ranked files / token budget / write rendered output
sigmap note ""                            Append a note to the cross-session decision log
sigmap note                               List recent notes (also: note --list )
sigmap status                             Show repo state — branch, dirty files, index freshness, notes
sigmap doctor                             Diagnose config, index, freshness, coverage, MCP wiring — with fixes (--json; exits 1 on hard failure)
sigmap mcp list                           List MCP clients and their config paths (--json)
sigmap mcp install                        Wire MCP for one client (claude|cursor|windsurf|vscode|zed|codex|gemini|opencode|mcp); --global for user-level
sigmap --init                             Write example config + .contextignore scaffold
sigmap --help                             Show this message
sigmap --version                          Show version
```

---

## MCP server: 17 tools

Start with `sigmap --mcp` (stdio JSON-RPC). Configure once:

```json
{
  "mcpServers": {
    "sigmap": {
      "command": "npx",
      "args": ["sigmap", "--mcp"]
    }
  }
}
```

### read_context

Read extracted code signatures for the project or a specific module path. Returns the full copilot-instructions.md content (~500–4K tokens) or a filtered subset when a module path is provided (~50–500 tokens).

```
Input: { module?: string }
```

### search_signatures

Search extracted code signatures for a keyword, function name, or class name. Returns matching signature lines with their file paths.

```
Input: { query: string }
```

### get_map

Read a section from PROJECT_MAP.md: import graph, class hierarchy, or route table.
Requires gen-project-map.js to have been run first.

```
Input: { type: string }
```

### create_checkpoint

Create a session checkpoint summarising current project state. Returns recent git commits, active branch, token count, and a compact snapshot of the codebase context. Ideal for session handoffs or periodic saves during long coding sessions.

```
Input: { note?: string }
```

### get_routing

Get model routing hints for this project: which files belong to which complexity tier (fast/balanced/powerful) and which AI model to use for each type of task. Helps reduce API costs by 40–80% by routing simple tasks to cheaper models.

```
Input: { } (no arguments)
```

### explain_file

Explain a specific file: returns its extracted signatures, direct imports (files it depends on), and callers (files that import it). Ideal for understanding a file in isolation without reading raw source. Requires the context file to have been generated first.

```
Input: { path: string }
```

### list_modules

List all top-level modules (srcDirs) present in the context file, sorted by token count descending. Use this to decide which module to pass to read_context before querying a specific area of the codebase.

```
Input: { } (no arguments)
```

### query_context

Rank and return the most relevant files for a specific task or question. Uses keyword + symbol + path scoring to surface only the top-K files relevant to the query, which is much cheaper than reading all context. Returns a ranked file list with signatures and relevance scores.

```
Input: { query: string, topK?: number }
```

### get_impact

Show every file that is impacted when a given file changes: direct importers, transitive importers, affected tests, and affected routes/controllers. Gives agents instant blast-radius awareness before making a change. Handles circular dependencies safely (no infinite loops).
```
Input: { file: string, depth?: number }
```

### get_lines

Fetch an exact line range from a source file on demand. This is the Surgical Context workhorse. Signatures carry `path:start-end` anchors; call this to read just those lines instead of re-opening the whole file. Lines are clamped to the file bounds and secret-scanned (redacted) before return. Path is sandboxed to the project root.

```
Input: { file: string, start: number, end: number }
```

### read_memory

Recall the project decision log: recent notes left by humans or agents across sessions (via `sigmap note`), plus the last ranking-session focus. Call this at the start of a task to kill cold-start: it answers "what were we doing and why" without re-reading the whole codebase.

```
Input: { limit?: number }
```

### get_callee_signatures

Return the EXACT current signature(s) of named symbols (functions, classes, methods) from the index, so an agent never guesses a callee's parameter types from training memory. Call this before writing code that uses a symbol. Unknown names get a closest-match suggestion.

```
Input: { symbols: array }
```

### sigmap_notify_file_created

Tell SigMap a file was created or modified so its signatures are indexed live for the rest of the session. Call this after writing a file; the new symbols become resolvable by search_signatures / get_callee_signatures.

```
Input: { path: string, content?: string }
```

### sigmap_notify_symbol_added

Fast path: register a single new symbol signature directly in the live index without re-reading the whole file.

```
Input: { signature: string, file: string, line?: number }
```

### sigmap_notify_file_deleted

Tell SigMap a file was deleted so its symbols are dropped from the live index.

```
Input: { path: string }
```

### get_diff_context

For every changed file in the working tree (or staged, or vs a base ref), return its current signatures plus blast radius (direct importers, transitive count, and affected tests/routes) with a risk label.
One call gives an agent everything a code review or a safe edit needs. Lists changed files shell-free (git binary, never a shell).

```
Input: { base?: string, staged?: boolean, depth?: number }
```

### get_architecture_overview

A high-level map of the codebase in one call: module breakdown (files/tokens), the most depended-on "hub" files, the dependency-cycle count, and route totals. Extends get_map; use it to orient in an unfamiliar repo before drilling in with read_context / query_context.

```
Input: { } (no arguments)
```

---

## Configuration (gen-context.config.json)

Every config key and its default:

```
output = .github/copilot-instructions.md
outputs = ["copilot"]
adapters = null
srcDirs = ["src","app","lib","packages","services","api","server","client","web","frontend","backend","desktop","mobile","shared","common","core","workers","functions","lambda","cmd","pages","components","hooks","routes","controllers","models","views","resources","config","db","projects","apps","libs","instance","blueprints","src/main/java","src/main/kotlin","src/main/scala","app/src/main/java","app/src/main/kotlin","src/test/java","src/test/kotlin"]
exclude = ["node_modules",".git","dist","build","out","__pycache__",".next","coverage","target","vendor",".context","playwright-tmp","playwright-report","test-results",".turbo","storybook-static",".docusaurus"]
maxDepth = 6
maxSigsPerFile = 25
maxTokens = 6000
autoMaxTokens = true
coverageTarget = 0.8
modelContextLimit = 128000
maxTokensHeadroom = 0.2
secretScan = true
monorepo = false
diffPriority = true
strategy = full
hotCommits = 10
watchDebounce = 300
routing = false
format = default
tracking = false
mcp = {"autoRegister":true}
depMap = true
todos = true
changes = true
changesCommits = 10
testCoverage = false
testDirs = ["tests","test","__tests__","spec"]
sigCache = false
impactRadius = false
retrieval = {"topK":10,"recencyBoost":1.5}
impact = {"depth":3,"includeSigs":true}
```

---

## Supported languages (33 extractors)

cpp,
csharp, css, dart, dockerfile, gdscript, go, graphql, html, java, javascript, kotlin, markdown, php, properties, protobuf, python, r, ruby, rust, scala, shell, sql, svelte, swift, terraform, toml, typescript, typescript_react, vue, vue_sfc, xml, yaml

---

## Integrations

Generates native context files for: claude, codex, copilot, cursor, gemini, openai, willow, windsurf, plus an MCP server for any agent (Claude Code, Cursor, Cline, Windsurf, OpenCode, Gemini CLI, Aider). One `sigmap --setup` wires the lot.

---

## Compliance evidence support

SigMap can surface repository facts that *support* technical-evidence narratives (e.g. DORA Art. 8–11, NIS2 Art. 21, ISO 27001 A.8). It is a **technical evidence pack**, never a certification or a "compliance report". Signed evidence packs are planned for a later release. Any compliance-adjacent wording is reviewed against the relevant regulation before publication; SigMap makes no legal claims.

---

## Project information

- Author: Manoj Mallick
- License: MIT
- Repository: https://github.com/manojmallick/sigmap
- Documentation: https://sigmap.io/
- npm: https://www.npmjs.com/package/sigmap
- Benchmark dataset: https://doi.org/10.5281/zenodo.19898842
- Issues: https://github.com/manojmallick/sigmap/issues

---

## What SigMap does not do

- **No embeddings / vector database.** Ranking is deterministic TF-IDF over extracted signatures: reproducible and offline, not a semantic vector search.
- **No code execution.** SigMap reads source statically; it never runs your code.
- **No network calls** on the core generate/ask/verify paths. Nothing is uploaded; generation works fully offline.
- **Not a linter or type checker.** It maps and ranks code structure; it does not judge correctness (use `verify-ai-output` only to flag *fabricated* references).
- **Not a full file reader.** It emits signatures + line anchors; an agent fetches exact bodies on demand via the `get_lines` MCP tool.
- **No telemetry.** Usage tracking (`--track`, `.context/usage.ndjson`) is local and opt-in; nothing leaves your machine.
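
The "deterministic TF-IDF over extracted signatures" idea in the first bullet can be illustrated with a small sketch. This is not SigMap's implementation: the tokenizer, the smoothed IDF variant, the tie-breaking rule, and every name here (`tokenize`, `rankFiles`, the sample paths and signatures) are assumptions made for the example. It only shows why such a ranking needs no model calls and returns the same order on every run.

```javascript
// Hypothetical sketch of deterministic TF-IDF ranking over signature text.
// Not SigMap's code; all names and formulas are illustrative assumptions.

// Split signature text into lowercase terms, breaking camelCase boundaries.
function tokenize(text) {
  return text
    .replace(/([a-z0-9])([A-Z])/g, '$1 $2')
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter(Boolean);
}

// files: object mapping a file path to an array of signature lines.
function rankFiles(files, query, topK = 10) {
  const paths = Object.keys(files).sort(); // fixed iteration order
  const termCounts = new Map(); // path -> Map(term -> count)
  const docFreq = new Map(); // term -> number of files containing it

  for (const path of paths) {
    const counts = new Map();
    for (const term of tokenize(files[path].join(' '))) {
      counts.set(term, (counts.get(term) || 0) + 1);
    }
    termCounts.set(path, counts);
    for (const term of counts.keys()) {
      docFreq.set(term, (docFreq.get(term) || 0) + 1);
    }
  }

  const queryTerms = [...new Set(tokenize(query))];
  const scored = paths.map((path) => {
    const counts = termCounts.get(path);
    let total = 0;
    for (const c of counts.values()) total += c;
    let score = 0;
    for (const term of queryTerms) {
      const tf = total ? (counts.get(term) || 0) / total : 0;
      // Smoothed IDF (an assumed variant) stays finite for unseen terms.
      const idf = Math.log((1 + paths.length) / (1 + (docFreq.get(term) || 0))) + 1;
      score += tf * idf;
    }
    return { path, score };
  });

  // Order by score, then by path, so ties always resolve the same way.
  scored.sort(
    (a, b) => b.score - a.score || (a.path < b.path ? -1 : a.path > b.path ? 1 : 0)
  );
  return scored.filter((r) => r.score > 0).slice(0, topK);
}

// Made-up signatures standing in for an extracted index.
const sample = {
  'src/auth/login.js': ['function loginUser(email, password)', 'function hashPassword(plain)'],
  'src/cart/cart.js': ['class Cart', 'function addItem(cart, item)'],
  'src/auth/token.js': ['function issueToken(user)', 'function verifyToken(token)'],
};

const ranked = rankFiles(sample, 'verify login token', 2);
console.log(ranked.map((r) => r.path)); // token.js first, then login.js
```

Because the score depends only on term counts and a fixed sort order, repeated runs over the same signatures yield an identical ranking, which is the property the bullet contrasts with embedding-based search.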