# GitNexus

**⚠️ Important Notice:** GitNexus has NO official cryptocurrency, token, or coin. Any token/coin using the GitNexus name on Pump.fun or any other platform is **not affiliated with, endorsed by, or created by** this project or its maintainers. Do not purchase any cryptocurrency claiming association with GitNexus.
Enterprise (SaaS & Self-hosted) - akonlabs.com
# Search execution flows across all repos in a group
gitnexus group query "<query>"

# Check staleness of repos in a group
gitnexus group status
```

If `analyze` reports a worker parse timeout on a large or unusual repository, indexing keeps running and falls back safely. To give slow worker jobs more time, use `gitnexus analyze --worker-timeout 60` or set `GITNEXUS_WORKER_SUB_BATCH_TIMEOUT_MS=60000`. For very large files, `GITNEXUS_WORKER_SUB_BATCH_MAX_BYTES` controls the worker job byte budget.

### What Your AI Agent Gets

**16 tools** exposed via MCP (11 per-repo + 5 group):

| Tool | What It Does | `repo` Param |
| ---- | ------------ | ------------ |
| `list_repos` | Discover all indexed repositories | — |
| `query` | Process-grouped hybrid search (BM25 + semantic + RRF) | Optional |
| `context` | 360-degree symbol view — categorized refs, process participation | Optional |
| `impact` | Blast radius analysis with depth grouping and confidence | Optional |
| `detect_changes` | Git-diff impact — maps changed lines to affected processes | Optional |
| `rename` | Multi-file coordinated rename with graph + text search | Optional |
| `cypher` | Raw Cypher graph queries | Optional |
| `group_list` | List configured repository groups | — |
| `group_sync` | Extract contracts and match across repos/services | — |
| `group_contracts` | Inspect extracted contracts and cross-links | — |
| `group_query` | Search execution flows across all repos in a group | — |
| `group_status` | Check staleness of repos in a group | — |

> When only one repo is indexed, the `repo` parameter is optional. With multiple repos, specify which one: `query({query: "auth", repo: "my-app"})`.
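The rank fusion behind `query` — BM25 and semantic results merged with reciprocal rank fusion (RRF) — can be sketched in a few lines. This is an illustrative TypeScript sketch, not GitNexus's actual implementation; the `rrfFuse` name and the conventional `k = 60` constant are assumptions:

```typescript
// Minimal reciprocal-rank-fusion sketch (illustrative, not GitNexus's code).
// Each input list is one ranker's results, best first.
function rrfFuse(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      // Reciprocal rank: earlier positions contribute more; k damps the tail.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}

// A symbol ranked highly by both BM25 and the semantic index wins overall:
const bm25 = ["validateUser", "checkPassword", "AuthConfig"];
const semantic = ["validateUser", "AuthConfig"];
console.log(rrfFuse([bm25, semantic])[0]); // "validateUser"
```

Because RRF works on ranks rather than raw scores, the BM25 and embedding scores never need to be calibrated against each other.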
**Resources** for instant context:

| Resource | Purpose |
| -------- | ------- |
| `gitnexus://repos` | List all indexed repositories (read this first) |
| `gitnexus://repo/{name}/context` | Codebase stats, staleness check, and available tools |
| `gitnexus://repo/{name}/clusters` | All functional clusters with cohesion scores |
| `gitnexus://repo/{name}/cluster/{name}` | Cluster members and details |
| `gitnexus://repo/{name}/processes` | All execution flows |
| `gitnexus://repo/{name}/process/{name}` | Full process trace with steps |
| `gitnexus://repo/{name}/schema` | Graph schema for Cypher queries |

**2 MCP prompts** for guided workflows:

| Prompt | What It Does |
| ------ | ------------ |
| `detect_impact` | Pre-commit change analysis — scope, affected processes, risk level |
| `generate_map` | Architecture documentation from the knowledge graph with mermaid diagrams |

**4 agent skills** installed to `.claude/skills/` automatically:

- **Exploring** — Navigate unfamiliar code using the knowledge graph
- **Debugging** — Trace bugs through call chains
- **Impact Analysis** — Analyze blast radius before changes
- **Refactoring** — Plan safe refactors using dependency mapping

**Repo-specific skills** generated with `--skills`:

When you run `gitnexus analyze --skills`, GitNexus detects the functional areas of your codebase (via Leiden community detection) and generates a `SKILL.md` file for each one under `.claude/skills/generated/`. Each skill describes a module's key files, entry points, execution flows, and cross-area connections — so your AI agent gets targeted context for the exact area of code you're working in. Skills are regenerated on each `--skills` run to stay current with the codebase.

---

## Multi-Repo MCP Architecture

GitNexus uses a **global registry** so one MCP server can serve multiple indexed repos.
No per-project MCP config needed — set it up once and it works everywhere.

```mermaid
flowchart TD
    subgraph CLI [CLI Commands]
        Setup["gitnexus setup"]
        Analyze["gitnexus analyze"]
        Clean["gitnexus clean"]
        List["gitnexus list"]
    end
    subgraph Registry ["~/.gitnexus/"]
        RegFile["registry.json"]
    end
    subgraph Repos [Project Repos]
        RepoA[".gitnexus/ in repo A"]
        RepoB[".gitnexus/ in repo B"]
    end
    subgraph MCP [MCP Server]
        Server["server.ts"]
        Backend["LocalBackend"]
        Pool["Connection Pool"]
        ConnA["LadybugDB conn A"]
        ConnB["LadybugDB conn B"]
    end

    Setup -->|"writes global MCP config"| CursorConfig["~/.cursor/mcp.json"]
    Analyze -->|"registers repo"| RegFile
    Analyze -->|"stores index"| RepoA
    Clean -->|"unregisters repo"| RegFile
    List -->|"reads"| RegFile
    Server -->|"reads registry"| RegFile
    Server --> Backend
    Backend --> Pool
    Pool -->|"lazy open"| ConnA
    Pool -->|"lazy open"| ConnB
    ConnA -->|"queries"| RepoA
    ConnB -->|"queries"| RepoB
```

**How it works:** Each `gitnexus analyze` stores the index in `.gitnexus/` inside the repo (portable, gitignored) and registers a pointer in `~/.gitnexus/registry.json`. When an AI agent starts, the MCP server reads the registry and can serve any indexed repo. LadybugDB connections are opened lazily on first query and evicted after 5 minutes of inactivity (max 5 concurrent). If only one repo is indexed, the `repo` parameter is optional on all tools — agents don't need to change anything.

---

## Web UI (browser-based)

A client-side graph explorer and AI chat — your code never leaves your machine.

**Try it now:** [gitnexus.vercel.app](https://gitnexus.vercel.app) — run `npx gitnexus@latest serve` locally and the page auto-connects to your local backend.
Or run the frontend locally:

```bash
git clone https://github.com/abhigyanpatwari/gitnexus.git
cd gitnexus/gitnexus-shared && npm install && npm run build
cd ../gitnexus-web && npm install
npm run dev

# Then in another terminal, start the backend the frontend connects to:
npx gitnexus@latest serve
```

## Docker

The official Docker setup ships **two signed images** orchestrated by `docker-compose.yaml`. Each image is published to both **GitHub Container Registry** (GHCR) and **Docker Hub** — same build, same digest, same Cosign signature — so pick whichever registry you prefer:

| Purpose | GHCR (default in `docker-compose.yaml`) | Docker Hub mirror |
| ------- | --------------------------------------- | ----------------- |
| CLI / `gitnexus serve` backend (HTTP API on port `4747`, MCP, indexer) | `ghcr.io/abhigyanpatwari/gitnexus:latest` | `akonlabs/gitnexus:latest` |
| Static web UI (port `4173`) | `ghcr.io/abhigyanpatwari/gitnexus-web:latest` | `akonlabs/gitnexus-web:latest` |

> **Heads-up — image rename.** Earlier releases published the web UI under
> `ghcr.io/abhigyanpatwari/gitnexus`. Starting with the introduction of the
> bundled backend, that slug now hosts the CLI/server image and the UI moved
> to `ghcr.io/abhigyanpatwari/gitnexus-web`. The previous tags remain
> available for pulling, but new versions are only published under the new
> slugs. Update your `docker run` / compose files accordingly (or just adopt
> the bundled compose).

### One-command setup

```bash
docker compose up -d
```

This starts the server on `http://localhost:4747` and the web UI on `http://localhost:4173`. The UI auto-detects the server because the browser runs on the host and reaches the container via the mapped port. A named volume (`gitnexus-data`) persists the global registry, indexes, and cloned repos at `/data/gitnexus` inside the server container.
To make repos on your host machine indexable, set `WORKSPACE_DIR` before bringing the stack up:

```bash
WORKSPACE_DIR=$HOME/code docker compose up -d

# Inside the server container the directory is mounted read-only at /workspace.
docker compose exec gitnexus-server gitnexus index /workspace/my-repo
```

### Direct `docker run`

```bash
# Server
docker run --rm -d \
  --name gitnexus-server \
  -p 4747:4747 \
  -v gitnexus-data:/data/gitnexus \
  ghcr.io/abhigyanpatwari/gitnexus:latest

# Web UI
docker run --rm -d \
  --name gitnexus-web \
  -p 4173:4173 \
  ghcr.io/abhigyanpatwari/gitnexus-web:latest
```

Optional env file (override image tags, container names, ports, workspace dir):

```bash
cp .env.example .env
docker compose --env-file .env up -d
```

### Versioning & supply-chain protection

The Docker images are version-locked to the npm package:

- Stable images are **only published from `vX.Y.Z` git tags** (via `docker.yml` triggered directly by the tag push), and the workflow refuses to build unless the tag exactly matches `gitnexus/package.json`'s version. So `ghcr.io/abhigyanpatwari/gitnexus:1.6.2` (and its Docker Hub mirror `akonlabs/gitnexus:1.6.2`) is byte-for-byte the same release as `npm install gitnexus@1.6.2` — no drift, no floating builds from `main`. Both registries receive the same digest from a single build step, so you can pull from either and the signature verifies identically.
- Release-candidate images (e.g. `:1.7.0-rc.1`) are published alongside each RC npm release. They are built by `release-candidate.yml` calling `docker.yml` as a reusable workflow after the RC tag is created and pushed.
- `:latest` is auto-promoted only from non-prerelease tags by the Docker metadata action, so it always points at a real, npm-published version.

Both images are signed with [Cosign keyless signing][cosign-keyless] using the workflow's GitHub OIDC identity, and shipped with build provenance and SBOM attestations.
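The tag/version lockstep described above can be sketched as a small check — illustrative TypeScript only; the real guard lives in the `docker.yml` workflow, and `tagMatchesPackage` is a hypothetical name:

```typescript
// Hypothetical sketch of the "tag must match package.json version" guard.
// The authoritative check is implemented in the docker.yml workflow itself.
function tagMatchesPackage(tagRef: string, pkgVersion: string): boolean {
  // Accept refs/tags/vX.Y.Z, optionally with a prerelease suffix like -rc.1.
  const m = /^refs\/tags\/v(\d+\.\d+\.\d+(?:-[0-9A-Za-z.]+)?)$/.exec(tagRef);
  return m !== null && m[1] === pkgVersion;
}

tagMatchesPackage("refs/tags/v1.6.2", "1.6.2"); // true  — build proceeds
tagMatchesPackage("refs/tags/v1.6.2", "1.6.3"); // false — version drift, refuse to build
tagMatchesPackage("refs/heads/main", "1.6.2");  // false — not a tag push, no stable image
```

This is why a version tag on the image can be trusted to mean the same bytes as the matching npm release: a mismatched tag never builds at all.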
**This is your protection against supply-chain attacks**: even if an attacker republishes a same-named image elsewhere (or somehow pushes to a typo-squatted registry), they cannot forge a Cosign signature tied to `abhigyanpatwari/GitNexus`'s `docker.yml`. Always verify before pulling into sensitive environments:

**Stable releases** — signed from the `v*` tag ref:

```bash
cosign verify ghcr.io/abhigyanpatwari/gitnexus:1.6.2 \
  --certificate-identity-regexp '^https://github\.com/abhigyanpatwari/GitNexus/\.github/workflows/docker\.yml@refs/tags/v[0-9]+\.[0-9]+\.[0-9]+(-[a-zA-Z0-9.]+)?$' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com

# Same signature verifies the Docker Hub mirror (identical digest):
cosign verify docker.io/akonlabs/gitnexus:1.6.2 \
  --certificate-identity-regexp '^https://github\.com/abhigyanpatwari/GitNexus/\.github/workflows/docker\.yml@refs/tags/v[0-9]+\.[0-9]+\.[0-9]+(-[a-zA-Z0-9.]+)?$' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com
```

The regex pins the certificate identity to this repo's `docker.yml` workflow **run from a `v*` tag** — rejecting unsigned images, images signed by other workflows, and images signed from unprotected refs. It is identical for both registries because both sets of tags were signed at the same digest in one workflow run.
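To build intuition for what that identity pin accepts and rejects, the same pattern can be exercised directly as a regex — a quick sketch only; `release.yml` below is a made-up counter-example, and the authoritative check is the `cosign verify` call itself:

```typescript
// The certificate-identity pattern from the cosign command above, as a JS regex.
const identity =
  /^https:\/\/github\.com\/abhigyanpatwari\/GitNexus\/\.github\/workflows\/docker\.yml@refs\/tags\/v[0-9]+\.[0-9]+\.[0-9]+(-[a-zA-Z0-9.]+)?$/;

// Signed by this repo's docker.yml from a v* tag — accepted:
identity.test(
  "https://github.com/abhigyanpatwari/GitNexus/.github/workflows/docker.yml@refs/tags/v1.6.2"
); // true

// Signed from an unprotected branch ref — rejected:
identity.test(
  "https://github.com/abhigyanpatwari/GitNexus/.github/workflows/docker.yml@refs/heads/main"
); // false

// A different (hypothetical) workflow in the same repo — rejected:
identity.test(
  "https://github.com/abhigyanpatwari/GitNexus/.github/workflows/release.yml@refs/tags/v1.6.2"
); // false
```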
**Release candidates** — signed from `refs/heads/main` (the caller's ref when `release-candidate.yml` invokes `docker.yml` as a reusable workflow):

```bash
cosign verify ghcr.io/abhigyanpatwari/gitnexus:1.7.0-rc.1 \
  --certificate-identity 'https://github.com/abhigyanpatwari/GitNexus/.github/workflows/docker.yml@refs/heads/main' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com
```

You can also inspect the build provenance and SBOM:

```bash
cosign download attestation ghcr.io/abhigyanpatwari/gitnexus:1.6.2 \
  --predicate-type https://slsa.dev/provenance/v1
```

#### Kubernetes: enforce signatures at admission

For Kubernetes deployments, ship the bundled [`ClusterImagePolicy`](deploy/kubernetes/cluster-image-policy.yaml) so the [Sigstore policy-controller][policy-controller] rejects any GitNexus pod whose image is not signed by this repo's `docker.yml` running from a `vX.Y.Z` tag — the same identity the `cosign verify` snippet above pins.

```bash
# 1. Install the controller (one-time, cluster-wide)
helm repo add sigstore https://sigstore.github.io/helm-charts && helm repo update
helm install policy-controller -n cosign-system --create-namespace \
  sigstore/policy-controller

# 2. Opt your namespace in
kubectl label namespace <your-namespace> \
  policy.sigstore.dev/include=true

# 3. Apply the policy
kubectl apply -f deploy/kubernetes/cluster-image-policy.yaml
```

After this, attempting to deploy an unsigned image — or one signed by anything other than `abhigyanpatwari/GitNexus`'s `docker.yml` at a `v*` tag — fails the admission webhook before a pod is ever created. This turns the verifiable signature into an enforced policy, which is the supply-chain control most clusters actually need.

[cosign-keyless]: https://docs.sigstore.dev/cosign/signing/overview/
[policy-controller]: https://docs.sigstore.dev/policy-controller/overview/

### Files

- [Dockerfile.web](Dockerfile.web) — builds `gitnexus-shared` and `gitnexus-web`, then serves the production frontend.
- [Dockerfile.cli](Dockerfile.cli) — builds the CLI/server (with its native deps) and runs `gitnexus serve --host 0.0.0.0`.
- [docker-compose.yaml](docker-compose.yaml) — starts both signed images side by side.
- [.env.example](.env.example) — overrides for image names, container names, ports, and the workspace mount.

The web UI uses the same indexing pipeline as the CLI but runs entirely in WebAssembly (Tree-sitter WASM, LadybugDB WASM, in-browser embeddings). It's great for quick exploration but limited by browser memory for larger repos.

**Local Backend Mode:** Run `gitnexus serve` and open the web UI locally — it auto-detects the server and shows all your indexed repos, with full AI chat support. No need to re-upload or re-index. The agent's tools (Cypher queries, search, code navigation) route through the backend HTTP API automatically.

---

## The Problem GitNexus Solves

Tools like **Cursor**, **Claude Code**, **Codex**, **Cline**, **Roo Code**, and **Windsurf** are powerful — but they don't truly know your codebase structure.

**What happens:**

1. AI edits `UserService.validate()`
2. Doesn't know 47 functions depend on its return type
3.
   **Breaking changes ship**

### Traditional Graph RAG vs GitNexus

Traditional approaches give the LLM raw graph edges and hope it explores enough. GitNexus **precomputes structure at index time** — clustering, tracing, scoring — so tools return complete context in one call:

```mermaid
flowchart TB
    subgraph Traditional["Traditional Graph RAG"]
        direction TB
        U1["User: What depends on UserService?"]
        U1 --> LLM1["LLM receives raw graph"]
        LLM1 --> Q1["Query 1: Find callers"]
        Q1 --> Q2["Query 2: What files?"]
        Q2 --> Q3["Query 3: Filter tests?"]
        Q3 --> Q4["Query 4: High-risk?"]
        Q4 --> OUT1["Answer after 4+ queries"]
    end
    subgraph GN["GitNexus Smart Tools"]
        direction TB
        U2["User: What depends on UserService?"]
        U2 --> TOOL["impact UserService upstream"]
        TOOL --> PRECOMP["Pre-structured response: 8 callers, 3 clusters, all 90%+ confidence"]
        PRECOMP --> OUT2["Complete answer, 1 query"]
    end
```

**Core innovation: Precomputed Relational Intelligence**

- **Reliability** — LLM can't miss context, it's already in the tool response
- **Token efficiency** — No 10-query chains to understand one function
- **Model democratization** — Smaller LLMs work because tools do the heavy lifting

---

## How It Works

GitNexus builds a complete knowledge graph of your codebase through a multi-phase indexing pipeline:

1. **Structure** — Walks the file tree and maps folder/file relationships
2. **Parsing** — Extracts functions, classes, methods, and interfaces using Tree-sitter ASTs
3. **Resolution** — Resolves imports, function calls, heritage, constructor inference, and `self`/`this` receiver types across files with language-aware logic
4. **Clustering** — Groups related symbols into functional communities
5. **Processes** — Traces execution flows from entry points through call chains
6.
   **Search** — Builds hybrid search indexes for fast retrieval

### Supported Languages

| Language | Imports | Named Bindings | Exports | Heritage | Type Annotations | Constructor Inference | Config | Frameworks | Entry Points |
| -------- | ------- | -------------- | ------- | -------- | ---------------- | --------------------- | ------ | ---------- | ------------ |
| TypeScript | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| JavaScript | ✓ | ✓ | ✓ | ✓ | — | ✓ | ✓ | ✓ | ✓ |
| Python | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Java | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — | ✓ | ✓ |
| Kotlin | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — | ✓ | ✓ |
| C# | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Go | ✓ | — | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Rust | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — | ✓ | ✓ |
| PHP | ✓ | ✓ | ✓ | — | ✓ | ✓ | ✓ | ✓ | ✓ |
| Ruby | ✓ | — | ✓ | ✓ | — | ✓ | — | ✓ | ✓ |
| Swift | — | — | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| C | — | — | ✓ | — | ✓ | ✓ | — | ✓ | ✓ |
| C++ | — | — | ✓ | ✓ | ✓ | ✓ | — | ✓ | ✓ |
| Dart | ✓ | — | ✓ | ✓ | ✓ | ✓ | — | ✓ | ✓ |

**Imports** — cross-file import resolution · **Named Bindings** — `import { X as Y }` / re-export tracking · **Exports** — public/exported symbol detection · **Heritage** — class inheritance, interfaces, mixins · **Type Annotations** — explicit type extraction for receiver resolution · **Constructor Inference** — infer receiver type from constructor calls (`self`/`this` resolution included for all languages) · **Config** — language toolchain config parsing (tsconfig, go.mod, etc.)
· **Frameworks** — AST-based framework pattern detection · **Entry Points** — entry point scoring heuristics

---

## Tool Examples

### Impact Analysis

```
impact({target: "UserService", direction: "upstream", minConfidence: 0.8})

TARGET: Class UserService (src/services/user.ts)

UPSTREAM (what depends on this):
Depth 1 (WILL BREAK):
  handleLogin [CALLS 90%] -> src/api/auth.ts:45
  handleRegister [CALLS 90%] -> src/api/auth.ts:78
  UserController [CALLS 85%] -> src/controllers/user.ts:12
Depth 2 (LIKELY AFFECTED):
  authRouter [IMPORTS] -> src/routes/auth.ts
```

Options: `maxDepth`, `minConfidence`, `relationTypes` (`CALLS`, `IMPORTS`, `EXTENDS`, `IMPLEMENTS`), `includeTests`

### Process-Grouped Search

```
query({query: "authentication middleware"})

processes:
  - summary: "LoginFlow"
    priority: 0.042
    symbol_count: 4
    process_type: cross_community
    step_count: 7
    process_symbols:
      - name: validateUser
        type: Function
        filePath: src/auth/validate.ts
        process_id: proc_login
        step_index: 2
definitions:
  - name: AuthConfig
    type: Interface
    filePath: src/types/auth.ts
```

### Context (360-degree Symbol View)

```
context({name: "validateUser"})

symbol:
  uid: "Function:validateUser"
  kind: Function
  filePath: src/auth/validate.ts
  startLine: 15
incoming:
  calls: [handleLogin, handleRegister, UserController]
  imports: [authRouter]
outgoing:
  calls: [checkPassword, createSession]
processes:
  - name: LoginFlow (step 2/7)
  - name: RegistrationFlow (step 3/5)
```

### Detect Changes (Pre-Commit)

```
detect_changes({scope: "all"})

summary:
  changed_count: 12
  affected_count: 3
  changed_files: 4
  risk_level: medium
changed_symbols: [validateUser, AuthService, ...]
affected_processes: [LoginFlow, RegistrationFlow, ...]
```

### Rename (Multi-File)

```
rename({symbol_name: "validateUser", new_name: "verifyUser", dry_run: true})

status: success
files_affected: 5
total_edits: 8
graph_edits: 6 (high confidence)
text_search_edits: 2 (review carefully)
changes: [...]
```

### Cypher Queries

```cypher
// Find what calls auth functions with high confidence
MATCH (c:Community {heuristicLabel: 'Authentication'})<-[:CodeRelation {type: 'MEMBER_OF'}]-(fn)
MATCH (caller)-[r:CodeRelation {type: 'CALLS'}]->(fn)
WHERE r.confidence > 0.8
RETURN caller.name, fn.name, r.confidence
ORDER BY r.confidence DESC
```

---

## Wiki Generation

Generate LLM-powered documentation from your knowledge graph:

```bash
# Requires an LLM API key (OPENAI_API_KEY, etc.)
gitnexus wiki

# Use a custom model or provider
gitnexus wiki --model gpt-4o
gitnexus wiki --base-url https://api.anthropic.com/v1

# Force full regeneration
gitnexus wiki --force
```

The wiki generator reads the indexed graph structure, groups files into modules via LLM, generates per-module documentation pages, and creates an overview page — all with cross-references to the knowledge graph.

---

## Tech Stack

| Layer | CLI | Web |
| ----- | --- | --- |
| **Runtime** | Node.js (native) | Browser (WASM) |
| **Parsing** | Tree-sitter native bindings | Tree-sitter WASM |
| **Database** | LadybugDB native | LadybugDB WASM |
| **Embeddings** | HuggingFace transformers.js (GPU/CPU) | transformers.js (WebGPU/WASM) |
| **Search** | BM25 + semantic + RRF | BM25 + semantic + RRF |
| **Agent Interface** | MCP (stdio) | LangChain ReAct agent |
| **Visualization** | — | Sigma.js + Graphology (WebGL) |
| **Frontend** | — | React 18, TypeScript, Vite, Tailwind v4 |
| **Clustering** | Graphology | Graphology |
| **Concurrency** | Worker threads + async | Web Workers + Comlink |

---

## Roadmap

### Actively Building

- [ ] **LLM Cluster Enrichment** — Semantic cluster names via LLM API
- [ ] **AST Decorator Detection** — Parse `@Controller`, `@Get`, etc.
- [ ] **Incremental Indexing** — Only re-index changed files

### Recently Completed

- [x] Constructor-Inferred Type Resolution, `self`/`this` Receiver Mapping
- [x] Wiki Generation, Multi-File Rename, Git-Diff Impact Analysis
- [x] Process-Grouped Search, 360-Degree Context, Claude Code Hooks
- [x] Multi-Repo MCP, Zero-Config Setup, 14 Language Support
- [x] Community Detection, Process Detection, Confidence Scoring
- [x] Hybrid Search, Vector Index

---

## Security & Privacy

- **CLI**: Everything runs locally on your machine. No network calls. Index stored in `.gitnexus/` (gitignored). Global registry at `~/.gitnexus/` stores only paths and metadata.
- **Web**: Everything runs in your browser. No code uploaded to any server. API keys stored in localStorage only.
- Open source — audit the code yourself.

---

## Acknowledgments

- [Tree-sitter](https://tree-sitter.github.io/) — AST parsing
- [LadybugDB](https://ladybugdb.com/) — Embedded graph database with vector support (formerly KuzuDB)
- [Sigma.js](https://www.sigmajs.org/) — WebGL graph rendering
- [transformers.js](https://huggingface.co/docs/transformers.js) — Browser ML
- [Graphology](https://graphology.github.io/) — Graph data structures
- [MCP](https://modelcontextprotocol.io/) — Model Context Protocol