# Semfora Engine — Quick Start / Install Guide

Install the **Semfora MCP server** in any project and make your coding agent (Claude
Code, Cursor, …) prefer Semfora's semantic tools over raw `Read`/`Grep`.

> **Reading this as an agent?** This file is written so you can follow it top to
> bottom. Each step has a **check** (decide if you need to act) and the **exact
> commands**. Do the steps in order. Stop conditions are called out. The end state
> you are driving toward: `claude mcp list` shows `semfora … ✔ Connected`, and a
> `get_context` MCP call returns real data for the current repo.

You'll build the **current** `semfora-engine` from source (there's no published binary —
see Step 1) and end up with the full set of Semfora MCP tools listed at the end of this
guide.

---

## Step 1 — Ensure the `semfora-engine` binary is installed and current

**Check:**

```bash
which semfora-engine && semfora-engine --version
```

- Prints a path and a version → you have a build; **skip to Step 2.** (The word printed
  is `semfora`, not `semfora-engine` — that's expected.) If the MCP server later exposes
  an unexpectedly small or different tool set, the build is old — rebuild from source
  (§1A).
- Command not found → install below.

> If a build is already on `PATH` and you want to refresh it, note its location
> (`which semfora-engine`) — you'll overwrite that path in §1A.

> **There is no published binary to download** — building from source (§1A) is the
> only way to get the current engine. The GitHub releases are either asset-less or
> predate the current tool set, and it's not on crates.io / Homebrew yet. So: **build
> from source.**

### 1A. Install — build from source (the supported path; works on any platform / fresh container)

This path is fully self-contained — it assumes **nothing is installed yet**. Run the
blocks for your OS in order. (You need network access to `github.com` and `crates.io`;
steps i and ii also reach your OS package mirror and `sh.rustup.rs` when they have
something to install.)

**i. Install OS build dependencies.** The only hard requirement is a **C compiler**
(SQLite is compiled from bundled C source; TLS is pure-Rust `rustls`, so **no OpenSSL
is needed**). Plus `git` + `curl` to fetch the source and toolchain.

```bash
# Debian/Ubuntu containers
command -v cc >/dev/null || { sudo apt-get update && sudo apt-get install -y build-essential git curl; }
# Fedora/RHEL
# sudo dnf install -y gcc gcc-c++ make git curl
# Alpine
# sudo apk add build-base git curl
# macOS (installs the Command Line Tools if absent)
# xcode-select -p >/dev/null 2>&1 || xcode-select --install
# Note: if you are not root and `sudo` is unavailable (some containers), drop `sudo`
# and run as root, or bake build-essential + git + curl into the image.
```

**ii. Install the Rust toolchain if missing** (`rustc`/`cargo`, 1.70+):

```bash
command -v cargo >/dev/null || {
  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
  . "$HOME/.cargo/env"   # add cargo to THIS shell now
}
cargo --version          # confirm it's available
```

**iii. Clone and build + install:**

```bash
git clone https://github.com/Semfora-AI/semfora-engine.git
cd semfora-engine
cargo install --path .   # builds (~2–5 min cold) and installs to ~/.cargo/bin
# a freshly linked binary is already ad-hoc signed on Apple Silicon, so no codesign step is needed
```

**iv. Make sure `~/.cargo/bin` is on `PATH`, then verify:**

```bash
case ":$PATH:" in
  *":$HOME/.cargo/bin:"*) ;;
  *) export PATH="$HOME/.cargo/bin:$PATH" ;;
esac
# persist for future shells:
grep -qs '.cargo/bin' ~/.bashrc || echo 'export PATH="$HOME/.cargo/bin:$PATH"' >> ~/.bashrc
hash -r
semfora-engine --version   # prints "semfora <version>" (the word is "semfora")
```

> **Upgrading an existing install that lives elsewhere** (e.g. `~/.local/bin`): build
> and copy over it, then re-sign on macOS:
> ```bash
> cargo build --release --bin semfora-engine
> cp target/release/semfora-engine "$(which semfora-engine)"
> codesign --force --sign - "$(which semfora-engine)"   # macOS Apple Silicon only
> semfora-engine --version
> ```

Other build artifacts (not needed for MCP): `semfora-daemon` (real-time index daemon),
`semfora-benchmark-builder`, `semfora-security-compiler`. The MCP server is built into
`semfora-engine` as the **`serve`** subcommand — there is no separate server binary.

> The clone above pulls the default branch, which already contains the current
> `semfora-engine` source — you do **not** need to check out any other branch to get a
> working binary.

---

## Step 2 — Register the MCP server for the project

`cd` into the target project first — everything below is per-project.

```bash
cd /path/to/your/project
```

### 2A. Build the index (optional — the server auto-builds it)

```bash
semfora-engine index generate .   # if it prints "Index is fresh", that's fine
```

### 2B. Register with Claude Code

The MCP server command is **`semfora-engine serve`** (stdio; it auto-maintains a fresh
index while running). **Choose a scope explicitly** — the default is *not* what you
usually want:

```bash
# user scope — available in EVERY project, trusted immediately (simplest for solo dev)
claude mcp add --scope user semfora -- semfora-engine serve

# project scope — writes ./.mcp.json, committable & shared with the team
# (requires a one-time approval the first time someone runs `claude` here — see 2D)
claude mcp add --scope project semfora -- semfora-engine serve
```

> **Default-scope warning:** `claude mcp add` with **no** `--scope` uses `local`
> scope and writes to `~/.claude.json` (private to you, this project only) — *not*
> `./.mcp.json`. Always pass `--scope user` or `--scope project`.

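
The scope rules above determine where a registration is written, so a rough file check can show whether a project-scope registration actually landed in `./.mcp.json`. The following is a minimal sketch, not part of `semfora-engine` or the `claude` CLI: `has_project_scope_semfora` is a hypothetical helper name, and it does a plain text match for `"semfora"` rather than parsing the JSON.

```bash
# Hypothetical helper, not part of semfora-engine or the claude CLI.
# Succeeds when DIR/.mcp.json exists and mentions a "semfora" entry.
# This is a plain text match, not a JSON parse.
has_project_scope_semfora() {
  _dir="${1:-.}"
  [ -f "$_dir/.mcp.json" ] && grep -q '"semfora"' "$_dir/.mcp.json"
}

if has_project_scope_semfora .; then
  echo "project scope: ./.mcp.json has a semfora entry"
else
  echo "no semfora entry in ./.mcp.json (user or local scope, or not registered)"
fi
```

Run it from the repo root. A negative result after `claude mcp add` usually just means the server was registered with `user` or `local` scope, which do not write `./.mcp.json`.
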
If `semfora-engine` may not be on the `PATH` the client sees, register it by absolute
path after `--` (`which` resolves it from your current shell):

```bash
claude mcp add --scope user semfora -- "$(which semfora-engine)" serve
```

**Verify the registration:**

```bash
claude mcp list
# expect a line: semfora: semfora-engine serve - ✔ Connected
```

### 2C. (Alternative) Write `.mcp.json` by hand

For a committable, team-shared setup, create `./.mcp.json` at the repo root:

```json
{
  "mcpServers": {
    "semfora": {
      "type": "stdio",
      "command": "semfora-engine",
      "args": ["serve"],
      "env": {}
    }
  }
}
```

Pin a specific repo or tune the watcher via `args`, e.g.
`["serve", "--repo", "/abs/path", "--no-git-poll"]`. Flags: `--repo <path>`,
`--no-watch`, `--no-git-poll`.

> **Claude Desktop / Cursor / VS Code** use an `mcpServers` block too (Claude Desktop:
> `~/Library/Application Support/Claude/claude_desktop_config.json`). Or run the
> bundled wizard: `semfora-engine setup`.

### 2D. Auto-approve a project-scope server (so tools actually load)

A **project-scope** (`.mcp.json`) server shows up as `⏸ Pending approval` and its tools
won't load in a session until trusted. Two ways to trust it:

- **Interactive:** run `claude` in the project once and accept the prompt to trust the
  project's MCP servers.
- **Headless / pre-approved (recommended for agents & CI):** add the server to
  `.claude/settings.local.json` so it's enabled without a prompt:

  ```json
  { "enabledMcpjsonServers": ["semfora"] }
  ```

With this present, `claude mcp list` shows `✔ Connected` and an agent session loads the
`mcp__semfora__*` tools immediately. (A `user`-scope server from 2B is trusted
automatically and skips this entirely.)

---

## Step 3 — Make the agent prefer Semfora tools (`CLAUDE.md`)

Adding the server isn't enough — agents default to `Read`/`Grep`/`Bash` unless told
otherwise. Paste this block at the **top** of the project's `CLAUDE.md` (create it if
missing):

```markdown
## Code navigation: ALWAYS prefer Semfora MCP tools

This project has the **Semfora MCP server** (`semfora`) connected. Semfora gives you a
semantic view of the codebase for a fraction of the tokens raw file access costs.
**Use Semfora tools first.** Only fall back to `Read`/`Grep`/`Glob`/`Bash` for what
Semfora genuinely can't do (small config files, non-code assets, running commands).

### Hard rules

1. **Start every session** with `get_context` to orient (git state, index freshness).
2. **To find code**, use `search` — never `Grep`/`Glob` for symbols or logic.
3. **To read a function**, use `get_source` (by symbol) — never `Read` the whole file.
4. **Before editing existing code**, use `get_callers` to understand impact radius.
5. **To review changes / a PR**, use `analyze_diff` — never read each changed file.
6. **To understand a new codebase**, use `get_overview` then `analyze`.
7. Reach for `Read` only for files <~100 lines of config or non-code assets.

### Tool map — use the LEFT, not the RIGHT

| Goal                         | Use this Semfora tool    | Instead of          |
|------------------------------|--------------------------|---------------------|
| Orient / git + index state   | `get_context`            | `git status`, `ls`  |
| Architecture of the repo     | `get_overview`           | reading many files  |
| Find a symbol / concept      | `search`                 | `Grep`, `Glob`      |
| Analyze a file/dir/module    | `analyze`                | `Read` whole file   |
| Review a diff / PR / staged  | `analyze_diff`           | `git diff` + `Read` |
| Read a specific function     | `get_source`             | `Read` + scroll     |
| List symbols in a file       | `get_file`               | `Read` + skim       |
| Details about one symbol     | `get_symbol`             | `Read`              |
| Who calls this? (impact)     | `get_callers`            | `Grep` for name     |
| Trace data/control flow      | `trace`                  | manual reading      |
| Call graph / dependencies    | `get_callgraph`          | manual reading      |
| Module summary               | `get_module`             | `Read` dir          |
| Complexity / quality audit   | `validate`               | eyeballing code     |
| Find duplicated code         | `find_duplicates`        | `Grep`              |
| Find likely dead code        | `dead_code_audit`        | manual reading      |
| Run / discover tests         | `test`                   | `Bash` test cmd     |
| Run linters                  | `lint`                   | `Bash` lint cmd     |
| Prepare a commit message     | `prep_commit`            | `git diff` + guess  |
| Refresh / check the index    | `index` / `index_status` | n/a                 |

### Token discipline

- Prefer `summary_only` / `output_mode: "summary"` / `limit` on big calls.
- For large files use `analyze` with `start_line`/`end_line` instead of reading all.
- `search` and `analyze` auto-refresh the index — you rarely need `index` manually.
```

---

## Step 4 — Verify it works

Check two layers: the **CLI** and the **MCP transport**. The CLI working does not by
itself prove the MCP server answers tool calls (different code path).

### 4A. CLI smoke test

```bash
cd /path/to/your/project
semfora-engine query overview   # prints a real overview of THIS repo
```

> `semfora-engine search "<query>"` returning **0 results is not a failure** — it means
> nothing matched. Search a name you know exists to see hits.

### 4B. MCP transport smoke test (proves the server answers tool calls)

Drive `serve` over stdio with a real JSON-RPC handshake.
**The `sleep`s matter** — the server answers messages as they arrive, so a single burst
with an immediate EOF can cause it to reply only to `initialize` and exit:

```bash
cd /path/to/your/project
{
  echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"t","version":"1"}}}'
  echo '{"jsonrpc":"2.0","method":"notifications/initialized"}'
  sleep 1
  echo '{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}'
  sleep 1
  echo '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"get_context","arguments":{}}}'
  sleep 2
} | semfora-engine serve --no-watch --no-git-poll 2>/dev/null
```

Expect: `initialize` → `serverInfo: semfora-engine`; `tools/list` → the full Semfora
tool set (the tools listed below); `get_context` → real data for your repo
(`repo_name`, `branch`, `index_status`).

### 4C. In the agent session

> **A session loads its MCP tools at startup.** If you registered the server from
> *inside* a running Claude Code session (or a `claude -p` run), the `mcp__semfora__*`
> tools will **not** appear in that same session — you must **start a new session**
> (restart Claude Code, or spawn a fresh `claude` / `claude -p`). Servers registered
> before launch are available immediately.

In a session started *after* registration, run `/mcp` (or `claude mcp list`) —
`semfora` should be **Connected**. Ask the agent: *"use semfora to give me an overview
of this repo"* — it should call `get_context` / `get_overview` (tool names appear as
`mcp__semfora__*`) instead of reading files.

If you set `enabledMcpjsonServers` (Step 2D), a fresh headless `claude -p` picks the
tools up with no prompt — verify with:

```bash
claude -p "Call the semfora get_context tool and report repo_name and branch." \
  --allowedTools mcp__semfora__get_context --output-format stream-json --verbose \
  | grep -o '"name":"mcp__semfora__get_context"'   # non-empty => the tool really ran
```

---

## MCP tool reference

Auto-discovered by the client once connected — you don't list them anywhere.

| Tool | What it does |
|------|--------------|
| `get_context` | Quick git + project context (~200 tokens). Use first. |
| `get_overview` | Repo overview from the index: frameworks, modules, risk, entry points. |
| `search` | Hybrid symbol + semantic search. Auto-refreshes the index. |
| `analyze` | Auto-detects file / directory / module and returns semantic analysis. |
| `analyze_diff` | Semantic diff between branches/commits/working tree. Use for reviews. |
| `get_file` | List symbols in a file or module. |
| `get_symbol` | Detailed semantic info for one or more symbols (by hash). |
| `get_source` | Source for a symbol (or line range) without reading the whole file. |
| `get_callers` | Reverse call graph — who calls this. Use before modifying code. |
| `get_callgraph` | Call flow and dependencies between functions. |
| `trace` | Trace incoming/outgoing relationships from a symbol/name/file. |
| `get_module` | Module summary or module symbol listing. |
| `validate` | Quality audit: complexity, duplicates, impact radius. |
| `find_duplicates` | Detect duplicated/similar code clusters. |
| `dead_code_audit` | Find likely dead code via call-graph + framework/export/test heuristics. |
| `test` | Run or discover tests (auto-detects framework). |
| `lint` | Auto-detect and run linters across many languages. |
| `prep_commit` | Analyze staged changes and prepare a commit message. |
| `index` | Smart index management (refresh only if stale). |
| `index_status` | Index health/freshness without rebuilding. |
| `get_languages` | Languages supported for semantic analysis. |
| `server_status` | Server mode, features, and layer status. |

---

## Troubleshooting

| Problem | Fix |
|---------|-----|
| `semfora-engine` not found | Install (Step 1); ensure its dir (`~/.cargo/bin`, `~/.local/bin`, `/usr/local/bin`) is on `PATH`. |
| MCP server exposes a small or unexpected tool set | The build is old — rebuild from source (Step 1A). There is no prebuilt binary to download. |
| Binary killed instantly (`exit 137`) on macOS | Re-sign a copied binary: `codesign --force --sign - "$(which semfora-engine)"`. |
| `semfora` not in `/mcp` / `claude mcp list` | Restart the client; confirm the `command` is correct and on `PATH`. |
| `semfora` shows `⏸ Pending approval` | Project-scope server awaiting trust — set `enabledMcpjsonServers: ["semfora"]` (Step 2D) or register with `--scope user`. |
| "No index found" | Run `semfora-engine index generate .` in the repo root (the server also auto-builds it). |
| Index stale after edits | The live watcher handles it; otherwise `semfora-engine index generate . --incremental` or call the `index` tool. |
| Wrong project indexed | Run from inside the repo, or add `--repo /abs/path` to the `serve` args. |
| MCP smoke test answers only `initialize` | Add the `sleep`s between messages (Step 4B) — stdin must stay open. |
| Agent still reads files raw | Ensure the Step 3 `CLAUDE.md` block is present and near the top. |

## More

- [`docs/cli.md`](docs/cli.md) — Full CLI command reference
- [`docs/mcp-tools-reference.md`](docs/mcp-tools-reference.md) — Per-tool parameters
- [`docs/mcp-workflows.md`](docs/mcp-workflows.md) — Recommended multi-tool workflows
- [`docs/websocket-daemon.md`](docs/websocket-daemon.md) — Real-time daemon for multi-client setups