# Context Mode — Benchmark Results

> Benchmarked against **real outputs** from popular Claude Code MCP servers, Skills, and dev tools.
> All fixtures captured from actual tool invocations — not synthetic data.

## Overview

| Metric | Value |
|--------|-------|
| Total scenarios | 21 |
| Tools benchmarked | `ctx_execute_file` (summarize) + `ctx_index`/`ctx_search` (knowledge retrieval) |
| Smart truncation | Head + tail preservation (60/40 split) |
| Total raw data processed | 376 KB |
| Total context consumed | 16.5 KB |
| Overall context savings | **96%** |
| Code examples preserved | **100%** (exact, not summarized) |

## Tool Decision Matrix

| Data Type | Best Tool | Why |
|-----------|-----------|-----|
| Documentation, API refs | `ctx_index` + `ctx_search` | Need exact code examples — not summaries |
| Skills prompts | `ctx_index` + `ctx_search` | Large prompts eat context; search on-demand |
| MCP tool signatures | `ctx_index` + `ctx_search` | Need exact tool names and parameters |
| Log files, test output | `ctx_execute_file` | Need aggregate stats, not raw lines |
| CSV data, analytics | `ctx_execute_file` | Need computed metrics |
| Build output | `ctx_execute_file` | Need error counts, not full logs |
| Browser snapshots | `ctx_execute_file` | Need page structure summary |

## Part 1: `ctx_execute_file` — Structured Data Processing

*Best for: logs, test output, CSV, build output — data where summaries are more useful than raw content.*

| Scenario | Source | Raw Size | Context | Savings | Time |
|----------|--------|----------|---------|---------|------|
| React useEffect docs | Context7 | 5.9 KB | 261 B | 96% | 18ms |
| Next.js App Router docs | Context7 | 6.5 KB | 249 B | 96% | 18ms |
| Tailwind CSS docs | Context7 | 4.0 KB | 186 B | 95% | 18ms |
| Page snapshot (Hacker News) | Playwright | 56.2 KB | 299 B | 99% | 16ms |
| Network requests | Playwright | 0.4 KB | 349 B | 13% | 16ms |
| PR list (vercel/next.js) | GitHub | 6.4 KB | 719 B | 89% | 16ms |
| Issues (facebook/react) | GitHub | 58.9 KB | 1,139 B | 98% | 16ms |
| Test output (30 suites) | vitest | 6.0 KB | 337 B | 95% | 16ms |
| TypeScript errors (50) | tsc | 4.9 KB | 347 B | 93% | 16ms |
| Build output (100+ lines) | next build | 6.4 KB | 405 B | 94% | 16ms |
| MCP tools (40 tools) | MCP tools/list | 17.0 KB | 742 B | 96% | 15ms |
| Access log (500 requests) | nginx | 45.1 KB | 155 B | 100% | 17ms |
| Git log (150+ commits) | git | 11.6 KB | 107 B | 99% | 16ms |
| Analytics CSV (500 rows) | analytics | 85.5 KB | 222 B | 100% | 32ms |

**Subtotal: 315 KB raw → 5.5 KB context (98% savings)**

## Part 2: `ctx_index` + `ctx_search` — Knowledge Retrieval (FTS5 BM25)

*Best for: documentation, code examples, API references, Skills — content where you need EXACT text, not summaries.*

| Scenario | Source | Raw Size | Search Result (3 queries) | Savings | Chunks | Code Blocks |
|----------|--------|----------|---------------------------|---------|--------|-------------|
| Supabase Edge Functions | Context7 | 3.9 KB | 2,246 B | 44% | 5 | 4 |
| React useEffect docs | Context7 | 5.9 KB | 1,494 B | 75% | 16 | 4 |
| Next.js App Router docs | Context7 | 6.5 KB | 3,311 B | 50% | 5 | 5 |
| Tailwind CSS docs | Context7 | 4.0 KB | 620 B | 85% | 5 | 5 |
| Skill prompt (main) | context-mode | 4.4 KB | 932 B | 79% | 15 | 6 |
| Skill references (4 files) | context-mode | 33.2 KB | 2,412 B | 93% | 51 | 32 |

**Subtotal: 60.3 KB raw → 11.0 KB context (82% savings)**

**Key difference from `ctx_execute_file`:** Code examples are returned **exactly as written**, not summarized. A `useEffect` cleanup pattern comes back with the full code block intact.

### Why `ctx_index + ctx_search` savings are lower

`ctx_execute_file` achieves 89-100% savings on every fixture except the smallest (the 0.4 KB network-requests capture saves only 13%) because it compresses data into 1-2 line summaries. `ctx_index + ctx_search` achieves 44-93% savings because it returns **complete, exact chunks**: the actual code examples, not descriptions of them.
This is by design:

- `ctx_execute_file` on React docs: `"5 code blocks, 3 sections about cleanup"` → **useless for coding**
- `ctx_index + ctx_search` on React docs: returns the full `useEffect(() => { ... }, [deps])` block → **actually useful**

## Part 3: Smart Truncation

*When output exceeds the limit, context-mode keeps the first 60% + last 40% of lines — preserving both initial context and final error messages.*

| Before (v0.2) | After (v0.3) |
|---|---|
| Blindly keeps first N bytes | Keeps head (60%) + tail (40%) |
| Cuts mid-line, corrupts UTF-8 | Snaps to line boundaries |
| Error messages at end: **LOST** | Error messages at end: **PRESERVED** |
| `"... [output truncated]"` | `"[47 lines / 3.2KB truncated — showing first 12 + last 8 lines]"` |

### Example

```
line 0: data initialization
line 1: loading config
line 2: starting server
...

... [47 lines / 3.2KB truncated — showing first 12 + last 8 lines] ...

line 92: connection timeout
line 93: retry attempt 3 failed
line 94: FATAL: database unreachable
line 95: Stack trace: Error at connect()
line 96: exit code: 1
```

The LLM can now see **both** the setup context (head) and the actual error (tail).
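
The head-plus-tail behaviour described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not context-mode's actual implementation: the names `smartTruncate` and `utf8ByteLength`, the `maxLines` parameter, and the choice of a line-count limit are all hypothetical. Only the 60/40 split, the snapping to line boundaries, and the marker format are taken from this document.

```typescript
// Illustrative sketch only. smartTruncate, utf8ByteLength and maxLines are
// hypothetical names, not context-mode's real API.

/** Count UTF-8 bytes of a string without relying on Node-specific globals. */
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0) ?? 0;
    bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
  }
  return bytes;
}

/** Keep the first ~60% and last ~40% of a line budget, dropping the middle. */
function smartTruncate(output: string, maxLines: number, headRatio = 0.6): string {
  const lines = output.split("\n");
  if (lines.length <= maxLines) return output; // under the limit: unchanged

  // Split the line budget between head and tail (60/40 by default).
  const headCount = Math.min(maxLines, Math.max(0, Math.round(maxLines * headRatio)));
  const tailCount = maxLines - headCount;

  // Slicing whole lines means we never cut mid-line or mid-character.
  const head = lines.slice(0, headCount);
  const tail = lines.slice(lines.length - tailCount);
  const omitted = lines.slice(headCount, lines.length - tailCount);

  const omittedKb = (utf8ByteLength(omitted.join("\n")) / 1024).toFixed(1);
  const marker =
    `... [${omitted.length} lines / ${omittedKb}KB truncated — ` +
    `showing first ${headCount} + last ${tailCount} lines] ...`;

  return [...head, marker, ...tail].join("\n");
}

// Usage: 100 log lines squeezed into a 20-line budget (12 head + 8 tail).
const sampleLog = Array.from({ length: 100 }, (_, i) => `line ${i}: message`).join("\n");
console.log(smartTruncate(sampleLog, 20));
```

Working on whole lines rather than raw bytes is what keeps the marker accurate and avoids splitting a multi-byte character. A byte-based limit could be layered on top by choosing `maxLines` from the byte budget.
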

## Context Window Impact

Claude's context window: **200,000 tokens**

### Scenario: Full debugging session

| Tool Calls | Without context-mode | With context-mode |
|---|---|---|
| Context7 docs (3 queries) | 16.4 KB | 5.6 KB |
| Playwright snapshot | 56.2 KB | 299 B |
| GitHub issues | 58.9 KB | 1,139 B |
| Test output | 6.0 KB | 337 B |
| Build output | 6.4 KB | 405 B |
| Skill prompt | 33.2 KB | 2.4 KB |
| **Total** | **177.1 KB** | **10.2 KB** |
| **Tokens** | **~45,300** | **~2,600** |
| **Context used** | **22.7%** | **1.3%** |

**Result: 94% less context consumed (177.1 KB → 10.2 KB), leaving more of the window for actual problem solving.**

## Test Suite

| Suite | Tests | Status |
|-------|-------|--------|
| Executor (10 languages + edge cases) | 55 | All pass |
| ContentStore (FTS5 BM25) | 34 | All pass |
| MCP Integration (JSON-RPC) | 22 | All pass |
| Ecosystem Benchmark (14 scenarios) | 14 | All pass |
| **Total** | **125** | **All pass** |

## How to Reproduce

```bash
# Run individual test suites
npm run test              # Executor tests
npm run test:store        # FTS5 BM25 store tests
npm run test:ecosystem    # Ecosystem benchmark

# Run all tests
npm run test:all

# Live benchmark (requires Context7 fixture)
npx tsx tests/live-benchmark.ts
```

## Fixtures

All fixtures in `tests/fixtures/` are captured from real tool invocations:

| Fixture | Source | Size |
|---------|--------|------|
| `context7-react-docs.md` | Context7 MCP — React useEffect | 5.9 KB |
| `context7-nextjs-docs.md` | Context7 MCP — Next.js App Router | 6.5 KB |
| `context7-tailwind-docs.md` | Context7 MCP — Tailwind CSS | 4.0 KB |
| `context7-supabase-edge.md` | Context7 MCP — Supabase Edge Functions | 3.9 KB |
| `playwright-snapshot.txt` | Playwright MCP — page snapshot | 56.2 KB |
| `playwright-network.txt` | Playwright MCP — network requests | 0.4 KB |
| `github-prs.json` | `gh pr list --repo vercel/next.js` | 6.4 KB |
| `github-issues.json` | `gh issue list --repo facebook/react` | 58.9 KB |
| `test-output.txt` | vitest run (30 suites) | 6.0 KB |
| `tsc-errors.txt` | tsc --noEmit (50 errors) | 4.9 KB |
| `build-output.txt` | next build output | 6.4 KB |
| `mcp-tools.json` | MCP tools/list (40 tools) | 17.0 KB |
| `access.log` | nginx access log (500 requests) | 45.1 KB |
| `git-log.txt` | git log --oneline (153 commits) | 11.6 KB |
| `analytics.csv` | Event analytics (500 rows) | 85.5 KB |
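
For readers who want to re-derive the savings, token, and context-share figures reported earlier from the raw sizes, the arithmetic can be sketched as below. The helper names are hypothetical, and the 4-bytes-per-token ratio is an assumption inferred from the "Full debugging session" totals (177.1 KB ≈ 45,300 tokens), not a documented tokenizer property.

```typescript
// Illustrative arithmetic only. Helper names are hypothetical and the
// bytes-per-token ratio is an assumption inferred from the reported totals.
const BYTES_PER_KB = 1024;
const ASSUMED_BYTES_PER_TOKEN = 4;
const CONTEXT_WINDOW_TOKENS = 200000;

/** Percentage of raw bytes that never reach the context window. */
function savingsPercent(rawBytes: number, contextBytes: number): number {
  return Math.round((1 - contextBytes / rawBytes) * 100);
}

/** Rough token estimate under the assumed bytes-per-token ratio. */
function estimateTokens(bytes: number): number {
  return Math.round(bytes / ASSUMED_BYTES_PER_TOKEN);
}

/** Share of the context window used, as a percentage with one decimal. */
function windowSharePercent(tokens: number): number {
  return Math.round((tokens / CONTEXT_WINDOW_TOKENS) * 1000) / 10;
}

// Totals from the "Full debugging session" table.
const withoutBytes = 177.1 * BYTES_PER_KB;
const withBytes = 10.2 * BYTES_PER_KB;

console.log(savingsPercent(withoutBytes, withBytes));          // 94
console.log(windowSharePercent(estimateTokens(withoutBytes))); // 22.7
console.log(windowSharePercent(estimateTokens(withBytes)));    // 1.3
```

The same `savingsPercent` helper reproduces the per-row figures. For example, the Supabase Edge Functions row (3.9 KB raw, 2,246 B returned) comes out at 44%.
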