# GCF Integration: Native Token-Optimized Output for agent-lsp ## Goal Add GCF (Graph Compact Format) as a native output format for agent-lsp's MCP tool responses. Tool responses currently serialized as JSON can optionally be emitted as GCF tabular format, saving 34-44% tokens on structured data. ## Quick Start GCF is enabled by default. To revert to JSON: ```bash export AGENT_LSP_OUTPUT_FORMAT=json ``` ## Measured Token Savings Benchmarked on representative tool responses (`go run scripts/gcf-benchmark.go`): | Tool Response | JSON | GCF | Savings | |---|---|---|---| | `list_symbols` (10 symbols) | ~334 tokens | ~165 tokens | **50.6%** | | `list_symbols` (30 symbols) | ~1,082 tokens | ~548 tokens | **49.4%** | | `find_references` (12 locations) | ~220 tokens | ~125 tokens | **43.2%** | | `find_references` (50 locations) | ~858 tokens | ~437 tokens | **49.1%** | | `get_diagnostics` (5 diagnostics) | ~213 tokens | ~133 tokens | **37.6%** | | `blast_radius` (5 symbols, 6 callers) | ~526 tokens | ~365 tokens | **30.6%** | Token estimates use byte length / 3.5 (validated: 0.97 correlation with o200k_base for ASCII-dominant payloads). Savings grow with record count because the eliminated overhead (field names, delimiters, quotes) is per-record. At 50+ records, savings converge to ~49%. ## Why 1. **34-44% fewer tokens on structured data.** Most agent-lsp tools return arrays of objects (symbols, references, diagnostics, callers). GCF tabular encoding eliminates field name repetition and structural delimiters. 2. **Session dedup compounds.** Multi-turn code exploration reuses symbols. By the 5th tool call, 92.7% savings vs JSON. 3. **Competitive moat.** No other MCP code intelligence server has token-optimized output. 4. **We own both projects.** No external approval needed. Ship when ready. ## Profile Fit Analysis GCF has two encoding profiles. agent-lsp's data model determines which one applies. ### Graph profile (Sections 3-6): Does NOT fit natively The graph profile has a **fixed schema**: `@id kind qualified_name score provenance` agent-lsp's tools **do not compute** scores or provenance: | GCF graph field | knowing (has it) | agent-lsp (has it) | |----------------|-----------------|-------------------| | `qualified_name` | Yes | Partial (has name + file, not always qualified) | | `kind` | Yes | Yes (from LSP SymbolKind) | | `score` | Yes (relevance ranking) | **No** (LSP doesn't score results) | | `provenance` | Yes (lsp_resolved, ast_inferred) | **No** (all results come from LSP) | | `distance` | Yes (graph distance from query) | **No** (no graph distance model) | **Bottom line:** The graph profile was designed for knowing's data model (scored, ranked, distance-grouped symbols). Forcing it onto agent-lsp would require either fake values or adding a scoring layer, both of which are wrong for this phase. ### Tabular profile (Section 6a): Fits naturally The tabular profile encodes **arbitrary arrays of objects** with a field declaration header: ``` ## symbols [12]{name,kind,file,line,test_callers,non_test_callers} AuthMiddleware|fn|auth.go|45|2|5 NewServer|fn|server.go|12|1|8 HandleRequest|method|handler.go|78|3|12 ``` This maps directly to every agent-lsp tool response: | Tool | Current JSON shape | Tabular fit | |------|-------------------|-------------| | blast_radius | `{changed_symbols: [{name, file, line, callers}]}` | Direct: array of uniform objects | | find_references | `[{file, line, character}]` | Direct: flat array | | list_symbols | `[{name, kind, range, detail}]` | Direct: flat array | | find_callers | `{items, incoming, outgoing}` | Direct: arrays + nested | | get_diagnostics | `[{file, range, severity, message}]` | Direct: flat array | | explore_symbol | `{hover, impls, calls, refs}` | Object with nested arrays | | get_completions | `[{label, kind, detail}]` | Direct: flat array | ### Future: Graph profile as opt-in upgrade If agent-lsp later adds relevance scoring (e.g., ranking blast_radius results by caller count, or scoring symbols by distance from query center), the graph profile becomes a natural fit. This is a separate feature decision, not a serialization concern. ## Current State ### Serialization Layer All 66 tool handlers follow the same pattern: ```go // internal/tools/*.go (53 json.Marshal call sites) response := map[string]any{ ... } data, err := json.Marshal(response) return types.TextResult(string(data)), nil ``` The response type is always: ```go // internal/types/types.go type ToolResult struct { Content []ContentItem `json:"content"` IsError bool `json:"isError,omitempty"` } type ContentItem struct { Type string `json:"type"` // always "text" Text string `json:"text"` // <-- JSON string lives here } ``` MCP protocol requires `Content` to be a list of text/image items. GCF output replaces the JSON string inside `Text` with a GCF string. The MCP protocol envelope stays JSON; only the tool response payload changes format. ### Tool Categories by Expected Savings | Category | Tools | Current output | Expected savings | |----------|-------|---------------|-----------------| | **High-value arrays** | blast_radius, find_references, list_symbols, find_callers, call_hierarchy, cross_repo | arrays of uniform objects with 4-8 fields | **34-44%** | | **Medium-value arrays** | get_diagnostics, suggest_fixes, get_completions, semantic_tokens | arrays of objects | **30-40%** | | **Mixed structures** | explore_symbol, detect_changes, simulate_chain | objects with nested arrays | **25-35%** | | **Simulation results** | simulate_edit, preview_edit, evaluate_session | diff results, diagnostic deltas | **20-30%** | | **Low-value / skip** | hover, get_signature_help, format_document, apply_edit, rename_symbol | mostly text or small objects | **Skip** | ### Priority: High-value array tools first blast_radius, find_references, list_symbols, and find_callers produce the largest payloads with the most uniform structure. These get the biggest savings and prove the concept. ## Architecture ### Phase 1: Format negotiation Add format preference to the MCP session: ```go // Option A: Per-tool parameter {"tool": "blast_radius", "args": {"file": "auth.go", "format": "gcf"}} // Option B: Session-level capability (preferred) // Client sends during initialize: {"capabilities": {"experimental": {"output_format": "gcf"}}} ``` Option B is cleaner: set once, applies to all tool responses. Agent-lsp reads the capability during session init and stores it. ### Phase 2: Encoding layer New package: `internal/encoding/gcf/` ```go // internal/encoding/gcf/encode.go package gcf import gcfgo "github.com/blackwell-systems/gcf-go" // Encode converts a tool response to GCF tabular format. // Falls back to JSON if the data structure is not suitable for tabular encoding. func Encode(tool string, data any) (string, error) { return gcfgo.EncodeGeneric(data) } ``` ### Phase 3: Migration helper Replace the 53 `json.Marshal` calls with a format-aware helper: ```go // internal/tools/helpers.go func encodeResult(ctx context.Context, tool string, data any) (types.ToolResult, error) { format := session.GetOutputFormat(ctx) // "json" or "gcf" switch format { case "gcf": encoded, err := gcf.Encode(tool, data) if err != nil { // Fall back to JSON if GCF encoding fails raw, _ := json.Marshal(data) return types.TextResult(string(raw)), nil } return types.TextResult(encoded), nil default: raw, err := json.Marshal(data) if err != nil { return types.ErrorResult(err.Error()), nil } return types.TextResult(string(raw)), nil } } ``` ### Phase 4: Session deduplication agent-lsp already has session state (`internal/session/`). Add GCF session tracking so that symbols sent in prior responses become bare references: ```go // internal/session/gcf_state.go type GCFState struct { mu sync.Mutex session *gcfgo.Session } func (s *GCFState) Encode(data any) string { s.mu.Lock() defer s.mu.Unlock() return s.session.EncodeGeneric(data) } ``` Session dedup is where the compounding savings happen. Multi-turn exploration (blast_radius then explore then find_callers then inspect) reuses records automatically. By the 5th call, 92.7% cumulative savings vs JSON. ### Phase 5: Delta encoding (future) When a user re-queries after an edit, delta encoding sends only what changed. This depends on the tabular profile gaining delta support in GCF (currently only the graph profile has it). **Decision:** Defer until tabular delta encoding is specified and implemented in gcf-go. ## Example: blast_radius Before and After ### Current JSON output (estimated ~2,400 tokens for 12 symbols) ```json { "changed_symbols": [ {"name": "AuthMiddleware", "file": "auth.go", "line": 45}, {"name": "NewServer", "file": "server.go", "line": 12} ], "affected_symbols": [ {"name": "AuthMiddleware", "file": "auth.go", "line": 45, "test_callers": [{"name": "TestAuth", "file": "auth_test.go", "line": 10}], "non_test_callers": [{"name": "main", "file": "main.go", "line": 5}]}, {"name": "NewServer", "file": "server.go", "line": 12, "test_callers": [{"name": "TestServer", "file": "server_test.go", "line": 8}], "non_test_callers": [{"name": "main", "file": "main.go", "line": 3}, {"name": "Setup", "file": "setup.go", "line": 20}]} ], "test_files": ["auth_test.go", "server_test.go"], "test_functions": ["TestAuth", "TestServer"], "non_test_callers": [ {"name": "main", "file": "main.go", "line": 5}, {"name": "main", "file": "main.go", "line": 3}, {"name": "Setup", "file": "setup.go", "line": 20} ], "summary": "Found 2 changed symbols with 2 test references across 2 test files." } ``` ### GCF tabular output (estimated ~1,500 tokens, 37% savings) ``` summary=Found 2 changed symbols with 2 test references across 2 test files. ## changed_symbols [2]{name,file,line} AuthMiddleware|auth.go|45 NewServer|server.go|12 ## affected_symbols [2]{name,file,line,sync_guarded} @0 AuthMiddleware|auth.go|45|false .test_callers ## [1]{name,file,line} TestAuth|auth_test.go|10 .non_test_callers ## [1]{name,file,line} main|main.go|5 @1 NewServer|server.go|12|false .test_callers ## [1]{name,file,line} TestServer|server_test.go|8 .non_test_callers ## [2]{name,file,line} main|main.go|3 Setup|setup.go|20 ## test_files [2] auth_test.go server_test.go ## test_functions [2] TestAuth TestServer ## non_test_callers [3]{name,file,line} main|main.go|5 main|main.go|3 Setup|setup.go|20 ``` The field declarations (`{name,file,line}`) appear once per section. Each record is just pipe-separated values. For a blast_radius response with 50 symbols, the savings grow linearly because the eliminated waste is per-record. ## Rollout Plan ### Milestone 1: Format negotiation (2 days) - [ ] Add `output_format` field to session config - [ ] Read format preference from MCP experimental capabilities - [ ] Thread format through context to tool handlers - [ ] Add `--format gcf` flag to CLI for testing ### Milestone 2: Tabular encoding for high-value tools (3 days) - [ ] Add `internal/encoding/gcf/` package - [ ] Add `encodeResult` helper to `internal/tools/helpers.go` - [ ] Migrate blast_radius (change_impact.go) - [ ] Migrate find_references, list_symbols, find_symbol (navigation.go, analysis.go) - [ ] Migrate find_callers, call_hierarchy (callhierarchy.go) - [ ] Migrate cross_repo (cross_repo.go) - [ ] Integration tests: verify GCF output decodes to equivalent data ### Milestone 3: Tabular encoding for remaining tools (2 days) - [ ] Migrate get_diagnostics, suggest_fixes - [ ] Migrate explore_symbol, detect_changes - [ ] Migrate simulation tools (simulate_edit, preview_edit, etc.) - [ ] Migrate remaining tools (get_completions, semantic_tokens, etc.) - [ ] Skip scalar/text tools (hover, format_document, etc.) ### Milestone 4: Session deduplication (2 days) - [ ] Add GCFState to session manager - [ ] Wire session encoding through tool handlers - [ ] Test: multi-turn exploration shows dedup in action - [ ] Benchmark: measure actual token savings across 5-call session ### Milestone 5: Benchmarks and documentation (1 day) - [ ] Benchmark: JSON vs GCF token counts on real tool responses (all tool categories) - [ ] Update agent-lsp README with measured savings ("Token-Optimized Output" section) - [ ] Update GCF README to list agent-lsp as production consumer - [ ] Publish results ## Dependencies | Dependency | Version | Purpose | |-----------|---------|---------| | `github.com/blackwell-systems/gcf-go` | v0.1.0 | Tabular encoding via `EncodeGeneric`, session management | Zero additional runtime dependencies (gcf-go is zero-dep itself). ## Risks ### 1. MCP clients that can't handle non-JSON text **Risk:** Some MCP clients might parse the `text` field as JSON and break on GCF output. **Mitigation:** GCF is opt-in via capabilities. Default remains JSON. Clients that don't negotiate get JSON. ### 2. Comprehension regression on tabular format **Risk:** LLMs might understand JSON better than GCF tabular for small payloads. **Mitigation:** GCF tabular has 100% comprehension parity with JSON for structured extraction. For payloads under 10 records, savings are minimal, so keep JSON for scalar/text tools. Test with real agent-lsp payloads before expanding. ### 3. Debugging difficulty **Risk:** GCF output is less familiar than JSON for debugging. **Mitigation:** GCF is human-readable by design. Keep JSON as default. Add `gcf decode` CLI command for debugging. Consider a `--verbose` mode that emits both formats. ### 4. Tabular profile maturity **Risk:** GCF tabular profile is v1.1 (released same day as v1.0). May have edge cases with deeply nested data. **Mitigation:** agent-lsp's nested data is shallow (1-2 levels max: affected_symbols with test_callers/non_test_callers). Test all tool outputs through encode/decode round-trip before shipping. ### 5. EncodeGeneric performance **Risk:** Reflection-based generic encoding may be slower than direct json.Marshal. **Mitigation:** Profile on real payloads. If slow, add type-specific encoders for high-frequency tools (blast_radius, find_references). Encoding time is negligible compared to LSP round-trip latency. ## Success Metrics | Metric | Target | How to measure | |--------|--------|---------------| | Token savings (high-value tools) | >30% vs JSON | Benchmark blast_radius, find_references, list_symbols | | Token savings (medium-value tools) | >20% vs JSON | Benchmark get_diagnostics, explore_symbol | | Session dedup (5-call session) | >80% cumulative | Benchmark multi-turn exploration | | Comprehension accuracy | 100% on extraction tasks | Run extraction eval with real agent-lsp GCF output | | Zero regressions | All existing tests pass | CI green after migration | ## Timeline **Total: ~10 days of focused work** | Week | Milestones | Deliverable | |------|-----------|-------------| | Week 1 | M1 + M2 | Format negotiation, high-value tool encoding | | Week 2 | M3 + M4 + M5 | Remaining tools, session dedup, benchmarks | ## Future: Graph Profile Upgrade Path If agent-lsp later adds relevance scoring (ranking symbols by caller count, blast radius, recency), the graph profile becomes viable: ``` GCF tool=blast_radius budget=5000 tokens=847 symbols=12 ## targets @0 fn auth.go:AuthMiddleware 0.95 lsp @1 fn server.go:NewServer 0.72 lsp ## related @2 fn main.go:main 0.40 lsp @3 fn setup.go:Setup 0.35 lsp ## edges @0<@2 called_by @0<@3 called_by @1<@2 called_by ``` This would unlock 79-84% savings on graph tools. The scoring model is a separate feature decision tracked in the roadmap; the encoding infrastructure built in this plan will support it when it arrives. ## Open Questions 1. **Should GCF be the default?** No. Opt-in via capabilities. Make default only after comprehension validation across Claude, GPT, Gemini. 2. **MCP spec alignment?** Is there an emerging standard for output format negotiation in MCP? Check with spec contributors (Sam Morrow is one). 3. **Per-tool opt-in?** Scalar/text tools don't benefit. Format negotiation should be global with per-tool fallback to JSON when GCF adds no value. 4. **Tabular delta encoding?** Currently only the graph profile supports delta. Should GCF spec add tabular deltas? If so, agent-lsp would benefit from re-query efficiency.