# StakGraph

Your AI agent wastes thousands of tokens reading files over and over.
Give it a code graph instead.


---

StakGraph parses source code into a graph of **functions, classes, endpoints, data models, tests**, and their relationships -- using [tree-sitter](https://tree-sitter.github.io/tree-sitter/), instantly, with zero config.

1. **CLI** -- install in 10 seconds, point at any file or directory
2. **MCP server** -- plug into Cursor, Claude Code, Windsurf, OpenCode
3. **Graph server** -- Neo4j-backed querying, embedding, and visualization

## Install

```bash
curl -fsSL https://raw.githubusercontent.com/stakwork/stakgraph/refs/heads/main/install.sh | bash
```

Pre-built binaries for **Linux** (x86_64, aarch64), **macOS** (Intel, Apple Silicon), and **Windows**.

## CLI

Point `stakgraph` at a file, directory, or git scope to inspect code structure instead of raw text.

### Most useful commands

Parse a file and print extracted nodes:

```bash
stakgraph mcp/src/index.ts
stakgraph cli/src/main.rs --stats
```

_(Image: stakgraph file output)_

Get a compact repo overview:

```bash
stakgraph overview .
```

Search for endpoints, functions, models, or tests:

```bash
stakgraph search GET --type Endpoint mcp/src
stakgraph search batch_process --context ast/src
```

Inspect structural git changes:

```bash
stakgraph changes diff --last 5 mcp/src/
stakgraph changes list cli/src
```

_(Image: stakgraph changes diff output)_

Trace dependencies forward or backward:

```bash
stakgraph deps batch_process ast/src
stakgraph impact --name cn cli/
```

Useful flags:

```bash
stakgraph --json           # machine-readable output
stakgraph --type Class     # filter node types
stakgraph --name main      # print a single named node
stakgraph completions zsh  # shell completions
```

---

## What it extracts

StakGraph understands the semantic structure of code, not just syntax:

| Node Type     | Examples                                              |
| ------------- | ----------------------------------------------------- |
| **Function**  | Functions, methods, handlers, callbacks               |
| **Endpoint**  | HTTP routes (`GET /users`, `POST /api/v1/login`)      |
| **Request**   | HTTP client calls to external services                |
| **DataModel** | Structs, interfaces, types, enums, schemas            |
| **Class**     | Classes with method ownership                         |
| **Trait**     | Interfaces, abstract classes, protocols               |
| **Test**      | Unit tests, integration tests, E2E tests (classified) |
| **Import**    | Module imports with resolution                        |

And the relationships between them:

| Edge Type      | Meaning                             |
| -------------- | ----------------------------------- |
| **Calls**      | Function A calls Function B         |
| **Handler**    | Endpoint handled by Function        |
| **Contains**   | File/Module contains Function/Class |
| **Operand**    | Class owns Method                   |
| **Implements** | Class implements Trait              |
| **ParentOf**   | Class inheritance                   |

---

## Languages

16 languages with framework-aware parsing:

| Language       | Frameworks                |
| -------------- | ------------------------- |
| **TypeScript** | React, Express, Nest.js   |
| **JavaScript** | React, Express            |
| **Python**     | FastAPI, Django, Flask    |
| **Go**         | Gin, Echo, net/http       |
| **Rust**       | Axum, Actix, Rocket       |
| **Ruby**       | Rails (routes, ERB, HAML) |
| **Java**       | Spring Boot               |
| **Kotlin**     | Spring, Ktor              |
| **Swift**      | Vapor                     |
| **C#**         | ASP.NET                   |
| **PHP**        | Laravel                   |
| **C / C++**    |                           |
| **Angular**    | Components, services      |
| **Svelte**     | Components                |
| **Bash**       |                           |
| **TOML**       | Config parsing            |

---

## MCP Server

The MCP server exposes StakGraph's graph intelligence to AI agents running in Cursor, Claude Code, Windsurf, OpenCode, or any MCP-compatible editor.

### Tools

| Tool                      | What it does                                                                    |
| ------------------------- | ------------------------------------------------------------------------------- |
| `stakgraph_search`        | Fulltext or vector (semantic) search across the codebase graph                  |
| `stakgraph_map`           | Visual map of code relationships from any node (configurable depth & direction) |
| `stakgraph_code`          | Retrieve actual code from a subtree                                             |
| `stakgraph_shortest_path` | Find shortest path between two nodes in the graph                               |
| `stakgraph_rules_files`   | Fetch rules/instructions files (.cursorrules, AGENTS.md, etc.)                  |

### Built-in agents

- **Explore Agent** -- AI-driven codebase exploration using the "zoom pattern" (Overview → Files → Functions → Dependencies). Configurable LLM provider (Anthropic, OpenAI, Google, OpenRouter).
- **Describe Agent** -- Generates descriptions for undocumented nodes and stores embeddings for semantic search.
- **Docs Agent** -- Summarizes documentation and rules files across the repo.
- **Mocks Agent** -- Scans for 3rd-party service integrations and records mock coverage.

### Gitree: Feature knowledge from git history

Gitree extracts feature-level knowledge from PR and commit history using LLM analysis:

```bash
yarn gitree process                   # extract features from PR history
yarn gitree summarize-all             # generate docs for all features
yarn gitree search-clues "auth flow"  # semantic search across architectural clues
```

Gitree builds a knowledge base of **Features**, **PRs**, **Commits**, and **Clues** (architectural insights like patterns, conventions, gotchas, data flows) and links them to code entities in the graph.

---

## Graph Server

For full-scale codebase indexing, StakGraph runs as an HTTP server backed by **Neo4j**:

```bash
docker-compose up  # starts Neo4j + StakGraph server on port 7799
```

### Ingest repositories

```bash
# Parse one or more repos into the graph
export REPO_URL="https://github.com/org/backend.git,https://github.com/org/frontend.git"
cargo run --bin index
```

Endpoints and requests are linked across repos -- a `POST /api/users` endpoint in the backend connects to the `fetch("/api/users")` request in the frontend.

### Query the graph

_(Image: Neo4j Graph)_

The graph stores 21 node types and 13 edge types. Query with Cypher, search with fulltext or vector similarity, or use the MCP tools.

### Vector search

Code is embedded using **BGE-Small-EN-v1.5** (384 dimensions) via fastembed. Weighted pooling prioritizes function signatures. Search semantically across the entire codebase:

```
POST /search
{
  "query": "user authentication middleware",
  "limit": 10
}
```

### API endpoints

| Endpoint              | Description                             |
| --------------------- | --------------------------------------- |
| `POST /process`       | Parse and index a repository            |
| `POST /embed_code`    | Generate embeddings for code            |
| `GET /search`         | Fulltext or vector search               |
| `GET /map`            | Relationship map from a node            |
| `GET /shortest_path`  | Path between two nodes                  |
| `GET /tests/coverage` | Test coverage analysis                  |
| `POST /ingest_async`  | Background repo ingestion with webhooks |

---

## Architecture

```
stakgraph/
├── ast/         # Core Rust library: tree-sitter parsing → graph of nodes & edges
├── cli/         # CLI binary: parse, summarize, diff
├── lsp/         # LSP integration for precise symbol resolution
├── standalone/  # Axum HTTP server wrapping the ast library
├── mcp/         # TypeScript MCP server with agents, gitree, vector search
└── shared/      # Shared types
```

The `ast` crate is the engine. It takes source files, runs tree-sitter queries to extract nodes, resolves cross-file calls (optionally via LSP), and produces a graph.
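
To make the idea of a graph of typed nodes and edges concrete, here is a minimal Python sketch of a toy in-memory graph with forward and backward traversal, roughly the kind of question the `deps` and `impact` commands answer. It is an illustration only, not StakGraph's Rust implementation: the `Node` and `CodeGraph` classes, the `reachable` method, and the sample file and function names are all hypothetical. Only the `Handler` and `Calls` edge names come from the tables above.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """Hypothetical node record; frozen so it can be used as a dict key."""
    node_type: str  # e.g. "Function", "Endpoint"
    name: str
    file: str


@dataclass
class CodeGraph:
    """Toy in-memory graph (an assumption, not StakGraph's ArrayGraph)."""
    outgoing: dict = field(default_factory=lambda: defaultdict(list))
    incoming: dict = field(default_factory=lambda: defaultdict(list))

    def add_edge(self, source: Node, edge_type: str, target: Node) -> None:
        # Store each edge in both directions so either traversal is cheap.
        self.outgoing[source].append((edge_type, target))
        self.incoming[target].append((edge_type, source))

    def reachable(self, start: Node, backward: bool = False) -> list:
        """Breadth-first walk: forward lists dependencies, backward lists impact."""
        adjacency = self.incoming if backward else self.outgoing
        seen, order, queue = {start}, [], deque([start])
        while queue:
            current = queue.popleft()
            for _edge_type, neighbor in adjacency.get(current, []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    order.append(neighbor)
                    queue.append(neighbor)
        return order


# Hypothetical sample data: one endpoint, its handler, and two callees.
endpoint = Node("Endpoint", "POST /api/users", "routes.py")
handler = Node("Function", "create_user", "handlers.py")
validate = Node("Function", "validate_email", "validators.py")
save = Node("Function", "save_user", "db.py")

graph = CodeGraph()
graph.add_edge(endpoint, "Handler", handler)
graph.add_edge(handler, "Calls", validate)
graph.add_edge(handler, "Calls", save)

# Forward: everything the endpoint ultimately depends on.
deps = [node.name for node in graph.reachable(endpoint)]
# Backward: everything affected if save_user changes.
impact = [node.name for node in graph.reachable(save, backward=True)]

print(deps)    # ['create_user', 'validate_email', 'save_user']
print(impact)  # ['create_user', 'POST /api/users']
```

Storing each edge in both an outgoing and an incoming index is one simple way to make "what does this call?" and "what breaks if this changes?" equally cheap. A real implementation may organize its storage quite differently.
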

The graph can be:

- **In-memory** (`ArrayGraph`) -- used by the CLI, fast, no dependencies
- **Neo4j** (`Neo4jGraph`) -- persistent, queryable, used by the server

---

## Contributing

```bash
cargo test            # run tests
USE_LSP=1 cargo test  # run tests with LSP resolution
```

You may need to install LSPs:

```bash
# TypeScript
npm install -g typescript typescript-language-server

# Go
go install golang.org/x/tools/gopls@latest

# Rust
rustup component add rust-analyzer

# Python
pip install python-lsp-server
```

---

github.com/stakwork/stakgraph