# Kun Kun is the local HTTP/SSE agent runtime for DeepSeek-GUI. It exposes a TypeScript-typed agent loop with a stable, GUI-friendly contract: - `kun serve` starts a local HTTP server with `/v1/*` routes. - Threads, turns, events, approvals, and usage are persisted as append-only JSONL logs with atomic index updates. - The loop is cache-first by construction: immutable prompt prefix, bounded TTL/LRU caches, inflight tracking, and explicit context compaction. The name Kun is inspired by the great fish in Zhuangzi's line, "In the northern sea there is a fish; its name is Kun." In DeepSeek-GUI, it means a deeper local runtime rather than a thin model UI: one agent loop that can carry project context, call tools reliably, resume sessions, and serve desktop chat, writing, phone connections, and scheduled tasks. Kun's core goal is to improve the ROI of every token. Tokens should be spent on user requirements, code, decisions, and results, not repeated tool schemas, runaway tool output, malformed history, useless retries, or stable prefixes that could have been reused from cache. ## Layout ``` kun/ src/ cli/ Command-line entrypoints (serve, run, chat, exec) contracts/ Zod schemas and inferred types for the HTTP/SSE contract domain/ Thread, Turn, Item, Event, Approval, Usage entities ports/ ModelClient, ToolHost, stores, EventBus, ApprovalGate, ... adapters/ DeepSeek-compatible model client, local tool host, in-memory and file-backed stores, workspace inspector services/ Thread and turn orchestration services loop/ The cache-first agent loop and inflight helpers cache/ LRU / TTL caches and immutable prefix utilities telemetry/ Usage, cache, and cost counters server/ HTTP routing, auth, SSE, response helpers tests/ Cross-cutting contract tests dist/ Build output (gitignored) ``` ## Scripts Run from the `kun/` directory. - `npm run typecheck` – run the package typecheck (no emit). - `npm run test` – run Vitest unit and contract tests. - `npm run build` – emit ESM JavaScript and type declarations into `dist/`. - `npm run serve` – start the runtime after a build. - `npm run dev` – rebuild in watch mode. ## CLI `kun serve` accepts the following flags: | Flag | Description | Default | | --- | --- | --- | | `--config` | JSON config file. If omitted, Kun reads `{--data-dir}/config.json` when present | optional | | `--host` | Bind address | `127.0.0.1` | | `--port` | HTTP port | `8899` | | `--data-dir` | Root directory for threads, events, and usage | required | | `--runtime-token` | Bearer token for `/v1/*` requests | empty | | `--api-key` | DeepSeek-compatible API key | empty | | `--base-url` | DeepSeek-compatible model API base URL | `https://api.deepseek.com/beta` | | `--model` | Default model id | `deepseek-v4-pro` | | `--approval-policy` | `on-request` \| `untrusted` \| `never` \| `auto` \| `suggest` | `auto` | | `--sandbox-mode` | `read-only` \| `workspace-write` \| `danger-full-access` \| `external-sandbox` | `workspace-write` | | `--insecure` | Disable bearer token check (local dev only) | off | Example: ```bash kun serve \ --config ~/.deepseekgui/kun/config.json \ --host 127.0.0.1 \ --port 8899 \ --data-dir ~/.deepseekgui/kun \ --runtime-token dev-token \ --api-key "$DEEPSEEK_API_KEY" \ --model deepseek-v4-pro ``` Kun can also run as a standalone agent without the GUI: ```bash kun run --data-dir ~/.deepseekgui/kun --workspace "$PWD" "summarize this repo" kun chat --data-dir ~/.deepseekgui/kun --workspace "$PWD" kun exec --data-dir ~/.deepseekgui/kun --workspace "$PWD" --list-tools kun exec --data-dir ~/.deepseekgui/kun --workspace "$PWD" read --args '{"path":"README.md"}' ``` - `kun run` creates a thread, runs one turn, streams assistant text, and exits. - `kun chat` starts a line-oriented REPL. Use `/exit`, `/quit`, or an empty line to stop. - `kun exec --list-tools` prints the effective dynamic tool registry for the chosen config/workspace. - `kun exec --args ` invokes one tool directly. Use `--json` on `run` or `exec` for machine-readable output. ## Environment variables The runtime reads these from `process.env` when not set via CLI flags. - `KUN_CONFIG` – explicit JSON config file - `KUN_HOST` – bind host (overrides `--host` if set) - `KUN_PORT` – bind port (overrides `--port` if set) - `KUN_DATA_DIR` – root data directory (overrides `--data-dir` if set) - `KUN_RUNTIME_TOKEN` – bearer token (overrides `--runtime-token` if set) - `KUN_BASE_URL` – model API base URL (overrides `--base-url` if set) - `DEEPSEEK_BASE_URL` – fallback model API base URL - `KUN_MODEL` – default model id (overrides `--model` if set) - `DEEPSEEK_API_KEY` – the DeepSeek API key the adapter forwards to the upstream model. Required at runtime for the default model client. ## Config file Kun supports a JSON config file so runtime behavior can be managed without rebuilding or hard-coding loop thresholds. Config resolution order is: 1. Built-in defaults. 2. JSON config file. 3. Environment variables. 4. CLI flags. Use `--config ` or `KUN_CONFIG=` for an explicit file. If no explicit config is provided and `--data-dir` / `KUN_DATA_DIR` is set, Kun also reads `{data-dir}/config.json` when it exists. In the GUI's default setup this is: ```text ~/.deepseekgui/kun/config.json ``` Shape: ```json { "serve": { "host": "127.0.0.1", "port": 8899, "dataDir": "~/.deepseekgui/kun", "runtimeToken": "", "apiKey": "", "baseUrl": "https://api.deepseek.com/beta", "model": "deepseek-v4-pro", "approvalPolicy": "auto", "sandboxMode": "workspace-write", "storage": { "backend": "hybrid" }, "insecure": false }, "contextCompaction": { "defaultSoftThreshold": 16000, "defaultHardThreshold": 24000, "summaryMode": "heuristic", "summaryTimeoutMs": 15000, "summaryMaxTokens": 1200, "summaryInputMaxBytes": 98304 }, "models": { "profiles": { "deepseek-v4-pro": { "contextWindowTokens": 1000000, "contextCompaction": { "softThreshold": 980000, "hardThreshold": 990000 } }, "deepseek-v4-flash": { "aliases": ["deepseek-chat", "deepseek-reasoner"], "contextWindowTokens": 1000000, "contextCompaction": { "softThreshold": 980000, "hardThreshold": 990000 } } } }, "capabilities": { "mcp": { "enabled": false, "servers": { "github": { "enabled": true, "transport": "stdio", "command": "npx", "args": ["-y", "@modelcontextprotocol/server-github"], "env": { "GITHUB_TOKEN": "" }, "trustScope": "workspace", "trustedWorkspaceRoots": ["/path/to/workspace"], "timeoutMs": 30000 }, "remote-docs": { "enabled": false, "transport": "streamable-http", "url": "https://mcp.example.com/mcp", "headers": { "authorization": "Bearer " }, "trustScope": "user", "timeoutMs": 30000 } } }, "web": { "enabled": false, "fetchEnabled": false, "searchEnabled": false, "provider": "fetch", "allowDomains": [], "denyDomains": ["localhost", "127.0.0.1"] }, "skills": { "enabled": false, "roots": ["~/.agents/skills", "./.agents/skills"], "legacySkillMd": true }, "subagents": { "enabled": false, "maxParallel": 2, "maxChildRuns": 4 }, "attachments": { "enabled": false, "maxImageBytes": 5242880, "maxImageDimension": 4096, "allowedMimeTypes": ["image/png", "image/jpeg", "image/webp"], "textFallbackMaxBase64Bytes": 524288, "textFallbackMaxImageDimension": 1280, "textFallbackPreferredMimeType": "image/webp" }, "memory": { "enabled": false, "scopes": ["user", "workspace", "project"], "maxInjectedRecords": 8 } } } ``` Kun defaults to hybrid session storage: `threads/{threadId}/messages.jsonl` and `events.jsonl` remain the canonical transcript/replay logs, while `index.sqlite3` stores only rebuildable thread metadata for fast lists and search. Set `serve.storage.backend` to `"file"` to use the legacy JSON index backend, or set `serve.storage.sqlitePath` to override the default `{dataDir}/index.sqlite3` path. Model-specific context windows, capabilities, and compaction thresholds belong in `models.profiles`. Built-in profiles already cover `deepseek-v4-pro`, `deepseek-v4-flash`, and the compatibility aliases `deepseek-chat` / `deepseek-reasoner`; DeepSeek V4 defaults to a 1M context window and starts compaction around 980k input tokens. The legacy `contextCompaction.modelProfiles` location is still read for backward compatibility, but new configs should use `models.profiles`. See `../docs/KUN_CONFIG.md` for the detailed file layout and examples. Feature flags are intentionally explicit: - `capabilities.mcp` starts configured MCP clients and imports their tools into the dynamic registry. Workspace-scoped servers require `trustedWorkspaceRoots`. - `serve.mcpSearch` can collapse a large MCP catalog into four entry points: `mcp_search`, `mcp_describe`, `mcp_call`, and `mcp_refresh_catalog`. When the catalog is too large, the model searches for relevant tools first, then describes and calls the exact tool instead of carrying every MCP schema on every turn. - `serve.tokenEconomy` / `tokenEconomyMode` compresses tool descriptions, tool results, and history context while preserving code, paths, commands, URLs, errors, and other high-value signals. - `contextCompaction` controls fallback long-thread compaction thresholds and summary behavior. Per-model thresholds live in `models.profiles`. Compaction preserves goals, constraints, decisions, touched files, tool outcomes, and unresolved next steps. - `serve.runtimeTuning.toolStorm` suppresses repeated identical tool calls within a turn so useless tool loops do not keep spending tokens. - `capabilities.web` exposes `web_fetch` and/or `web_search`. The built-in provider can fetch HTTP(S) pages; search requires a provider implementation and may report unavailable. - `capabilities.skills` scans configured roots for `skill.json` manifests and, when `legacySkillMd` is true, older `SKILL.md` directories. - `capabilities.attachments` stores image bytes outside thread logs and allows turns to reference `attachmentIds`. Vision-capable models receive image parts; text-only models receive a bounded compressed base64 text fallback. - `capabilities.memory` stores long-term records under the data dir, retrieves scoped matches before turns, and exposes `memory_create`, `memory_update`, and `memory_delete` tools. - `capabilities.subagents` exposes `delegate_task` with `maxParallel` and `maxChildRuns` concurrency budgets. Use `GET /v1/runtime/info` for the runtime capability manifest and `GET /v1/runtime/tools` for redacted provider diagnostics. The GUI Settings page reads both routes. ## Data directory layout `--data-dir` is the on-disk root for everything the runtime owns: ``` {--data-dir}/ config.json # Optional Kun runtime config attachments/ # Image metadata + content blobs when enabled memory/ # Long-term memory records and tombstones when enabled child-runs/ # Delegated child run records when subagents are enabled threads/ index.json {threadId}/ thread.json # ThreadRecord messages.jsonl # TurnItem append-only events.jsonl # RuntimeEvent append-only session.json # Latest AgentSession projection usage.json # Per-thread usage snapshot ``` Atomic JSON writes are used for `index.json`, `thread.json`, and `session.json`. JSONL streams are append-only and tolerate malformed lines (the next replay skips them). The renderer can re-read a thread by listing `index.json` and replaying the per-thread JSONL. ## HTTP API The HTTP server exposes the following routes under `/v1/*`: | Method | Path | Description | | --- | --- | --- | | GET | `/health` | unauthenticated health probe | | GET | `/v1/runtime/info` | runtime metadata and capability manifest | | GET | `/v1/runtime/tools` | redacted dynamic tool/provider diagnostics | | GET | `/v1/workspace/status?path=...` | workspace git/branch status | | GET | `/v1/threads?include=side` | list threads (most recently updated first); side threads are hidden unless `include=side` is passed | | POST | `/v1/threads` | create a thread | | GET | `/v1/threads/{id}` | read a thread with its turns | | PATCH | `/v1/threads/{id}` | update title/status/approval/sandbox/relation (promote a side thread by setting `relation: "primary"`) | | DELETE | `/v1/threads/{id}` | delete a thread | | POST | `/v1/threads/{id}/fork` | fork the thread. Optional JSON body: `{ "relation": "fork" \| "side", "title"?: string }` (defaults to `fork` when omitted). `relation: "side"` marks the result as a side conversation and tags `parentThreadId`. | | POST | `/v1/threads/{id}/turns` | start a turn | | GET | `/v1/threads/{id}/turns/{turnId}` | read a single turn | | POST | `/v1/threads/{id}/turns/{turnId}/steer` | queue steering text | | POST | `/v1/threads/{id}/turns/{turnId}/interrupt` | abort a turn | | POST | `/v1/threads/{id}/compact` | fold old history | | GET | `/v1/threads/{id}/events?since_seq=N` | SSE backlog + live | | POST | `/v1/approvals/{approvalId}` | allow/deny | | POST | `/v1/attachments` | upload an image attachment as base64 | | GET | `/v1/attachments/diagnostics` | attachment store status | | GET | `/v1/attachments/{id}` | attachment metadata | | GET | `/v1/attachments/{id}/content?thread_id=...&workspace=...` | authorized attachment bytes as base64 | | GET | `/v1/memory?workspace=...&include_deleted=false` | list memory records in scope | | POST | `/v1/memory` | create a memory record | | GET | `/v1/memory/diagnostics` | memory store status | | PATCH | `/v1/memory/{id}` | update, disable, or retag a memory record | | DELETE | `/v1/memory/{id}` | tombstone a memory record | | GET | `/v1/usage` | cumulative token/cache/turn counters | SSE events use `id: `, `event: `, and JSON `data:`. A late-joining client passes `since_seq` to receive the backlog before live events flow. `POST /v1/threads/{id}/turns` accepts `attachmentIds` alongside `prompt`, `model`, `mode`, and `guiPlan`. Attachments are resolved against the turn thread/workspace and are never embedded into thread JSONL logs. Runtime events may include optional child-agent metadata, web citations/sources, attachment ids, active Skill ids, and injected memory ids; older clients can ignore these fields. ## Thread record Each thread persisted under `{data-dir}/threads/{id}/thread.json` is a `ThreadRecord` with the following relation metadata: - `relation`: discriminator describing how the thread relates to its origin. One of `primary` (default), `fork` (a manual fork that switches you away), or `side` (a "by-the-way" side conversation inherited from a parent snapshot). - `parentThreadId`: live parent link for `fork` and `side` threads; absent for primary threads. Cleared automatically when promoting a side thread back to `primary` via `PATCH /v1/threads/{id}`. - `forkedFromThreadId` / `forkedFromTitle` / `forkedAt` / `forkedFromMessageCount` / `forkedFromTurnCount`: lineage metadata copied from the parent for forks and side conversations. The default `GET /v1/threads` listing excludes `relation: "side"` threads to keep the main thread list uncluttered. Pass `?include=side` to opt in. ## Migration notes Legacy Skill folders that only contain `SKILL.md` continue to work when `capabilities.skills.legacySkillMd` is true. New Skills should prefer a `skill.json` manifest with explicit `id`, `description`, trigger metadata, instruction file, and allowed tool list; this makes activation and diagnostics deterministic. A safe migration path is: 1. Keep the existing `SKILL.md`. 2. Add a `skill.json` next to it that points at the same instructions. 3. Restart Kun or refresh diagnostics. 4. Once `/v1/runtime/tools` reports the Skill without validation errors, decide whether to keep legacy compatibility enabled. Existing thread-level `pinnedConstraints` are not converted into long-term memory automatically. They remain part of compaction items and replay exactly as before. If a constraint should become cross-thread recall, create an explicit memory record through the GUI memory review surface or the `memory_create` tool. If it should stay local to one thread, leave it as a pinned constraint. ## Troubleshooting - MCP server does not appear: check `capabilities.mcp.enabled`, the server `enabled` flag, transport-specific fields (`command` for `stdio`, `url` for HTTP/SSE), `trustedWorkspaceRoots` for workspace-scoped servers, and `/v1/runtime/tools` for redacted `lastError` diagnostics. - Web tools are missing: `capabilities.web.enabled` must be true and at least one of `fetchEnabled` / `searchEnabled` must be true. Built-in fetch handles HTTP(S) pages; search may still be unavailable when no provider implementation is configured. - Image upload succeeds but the turn fails: check `maxImageBytes`, `maxImageDimension`, `allowedMimeTypes`, and the text fallback limits. Text-only models need a compressed fallback small enough to fit `textFallbackMaxBase64Bytes`. - Memory is not injected: enable `capabilities.memory`, confirm `/v1/memory/diagnostics.enabled`, make sure records are in the selected workspace scope and not disabled/deleted, then inspect `lastInjectedIds`. - `kun run`, `kun chat`, or `kun exec` cannot authenticate or load config: pass the same `--config`, `--data-dir`, `--api-key`, `--base-url`, and `--runtime-token` values used by `kun serve`. `kun exec --list-tools --json` is the quickest way to verify the effective tool registry for a CLI environment. - A capability reports `disabled`: that normally means the config flag is false. A capability reports `unavailable`: the flag is true, but the backing provider/store/model is absent or failed initialization. ## GUI integration After the legacy provider retirement, the DeepSeek-GUI main process starts Kun through `kun-process.ts` and routes all `runtimeRequest` calls to the active base URL with a bearer token. The renderer uses the same `AgentProvider` interface as the legacy CodeWhale provider because Kun speaks the same HTTP/SSE contract. Settings live under `agents.kun` in `AppSettingsV1` and include `binaryPath`, `port`, `autoStart`, `apiKey`, `baseUrl`, `runtimeToken`, `dataDir`, `model`, `approvalPolicy`, `sandboxMode`, and `insecure`. The renderer also consumes the extension routes added for the larger agent surface: `/v1/runtime/info`, `/v1/runtime/tools`, `/v1/attachments/*`, and `/v1/memory/*`. Composer image controls are enabled only when both the runtime attachment capability and model image modality are available. Settings diagnostics display MCP servers, Skill roots, web provider state, attachment store state, memory records, and the live capability manifest. Legacy persisted settings (`agentProvider: "codewhale"` or `"reasonix"`) are migrated by `migrateLegacyAppSettings`. Legacy credentials, base URLs, ports, and model selections seed `agents.kun`; the saved settings file no longer keeps live CodeWhale or Reasonix agent entries.