--- title: "API Reference" version: 3.8.40 lastUpdated: 2026-06-28 --- # API Reference 🌐 **Languages:** 🇺🇸 [English](./API_REFERENCE.md) | 🇧🇷 [Português (Brasil)](../i18n/pt-BR/docs/reference/API_REFERENCE.md) | 🇪🇸 [Español](../i18n/es/docs/reference/API_REFERENCE.md) | 🇫🇷 [Français](../i18n/fr/docs/reference/API_REFERENCE.md) | 🇮🇹 [Italiano](../i18n/it/docs/reference/API_REFERENCE.md) | 🇷🇺 [Русский](../i18n/ru/docs/reference/API_REFERENCE.md) | 🇨🇳 [中文 (简体)](../i18n/zh-CN/docs/reference/API_REFERENCE.md) | 🇩🇪 [Deutsch](../i18n/de/docs/reference/API_REFERENCE.md) | 🇮🇳 [हिन्दी](../i18n/in/docs/reference/API_REFERENCE.md) | 🇹🇭 [ไทย](../i18n/th/docs/reference/API_REFERENCE.md) | 🇺🇦 [Українська](../i18n/uk-UA/docs/reference/API_REFERENCE.md) | 🇸🇦 [العربية](../i18n/ar/docs/reference/API_REFERENCE.md) | 🇯🇵 [日本語](../i18n/ja/docs/reference/API_REFERENCE.md) | 🇻🇳 [Tiếng Việt](../i18n/vi/docs/reference/API_REFERENCE.md) | 🇧🇬 [Български](../i18n/bg/docs/reference/API_REFERENCE.md) | 🇩🇰 [Dansk](../i18n/da/docs/reference/API_REFERENCE.md) | 🇫🇮 [Suomi](../i18n/fi/docs/reference/API_REFERENCE.md) | 🇮🇱 [עברית](../i18n/he/docs/reference/API_REFERENCE.md) | 🇭🇺 [Magyar](../i18n/hu/docs/reference/API_REFERENCE.md) | 🇮🇩 [Bahasa Indonesia](../i18n/id/docs/reference/API_REFERENCE.md) | 🇰🇷 [한국어](../i18n/ko/docs/reference/API_REFERENCE.md) | 🇲🇾 [Bahasa Melayu](../i18n/ms/docs/reference/API_REFERENCE.md) | 🇳🇱 [Nederlands](../i18n/nl/docs/reference/API_REFERENCE.md) | 🇳🇴 [Norsk](../i18n/no/docs/reference/API_REFERENCE.md) | 🇵🇹 [Português (Portugal)](../i18n/pt/docs/reference/API_REFERENCE.md) | 🇷🇴 [Română](../i18n/ro/docs/reference/API_REFERENCE.md) | 🇵🇱 [Polski](../i18n/pl/docs/reference/API_REFERENCE.md) | 🇸🇰 [Slovenčina](../i18n/sk/docs/reference/API_REFERENCE.md) | 🇸🇪 [Svenska](../i18n/sv/docs/reference/API_REFERENCE.md) | 🇵🇭 [Filipino](../i18n/phi/docs/reference/API_REFERENCE.md) | 🇨🇿 [Čeština](../i18n/cs/docs/reference/API_REFERENCE.md) Complete reference for all OmniRoute API endpoints. --- ## Table of Contents - [Chat Completions](#chat-completions) - [Embeddings](#embeddings) - [Image Generation](#image-generation) - [List Models](#list-models) - [Compatibility Endpoints](#compatibility-endpoints) - [Files API](#files-api) - [Batches API](#batches-api) - [Search API](#search-api) - [WebSocket Streaming](#websocket-streaming) - [Quotas & Issues Reporting](#quotas--issues-reporting) - [Semantic Cache](#semantic-cache) - [Dashboard & Management](#dashboard--management) - [Combo Management](#combo-management) - [Webhooks](#webhooks) - [Registered Keys (Auto-Management)](#registered-keys-auto-management) - [Agents Protocol](#agents-protocol) - [Management Proxies](#management-proxies) - [Resilience (extended)](#resilience-extended) - [Skills](#skills) - [Memory](#memory) - [MCP Server](#mcp-server) - [A2A Server](#a2a-server) - [Cloud, Evals & Assess](#cloud-evals--assess) - [Request Processing](#request-processing) - [Authentication](#authentication) --- ## Chat Completions ```bash POST /v1/chat/completions Authorization: Bearer your-api-key Content-Type: application/json { "model": "cc/claude-opus-4-6", "messages": [ {"role": "user", "content": "Write a function to..."} ], "stream": true } ``` ### Custom Headers | Header | Direction | Description | | ------------------------ | --------- | ---------------------------------------------------------------------------------------------------------------------------- | | `X-OmniRoute-No-Cache` | Request | Set to `true` to bypass cache | | `x-omniroute-no-memory` | Request | Set to `true` to skip memory + skills injection for this request (mirrors no-cache; avoids the per-call token/cost overhead) | | `X-OmniRoute-Progress` | Request | Set to `true` for progress events | | `X-Session-Id` | Request | Sticky session key for external session affinity | | `x_session_id` | Request | Underscore variant also accepted (direct HTTP) | | `Idempotency-Key` | Request | Dedup key (5s window) | | `X-Request-Id` | Request | Alternative dedup key | | `X-OmniRoute-Cache` | Response | `HIT` or `MISS` (non-streaming) | | `X-OmniRoute-Idempotent` | Response | `true` if deduplicated | | `X-OmniRoute-Progress` | Response | `enabled` if progress tracking on | | `X-OmniRoute-Session-Id` | Response | Effective session ID used by OmniRoute | | `X-OmniRoute-Request-Id` | Response | Request correlation id (when known) | | `X-OmniRoute-Version` | Response | OmniRoute build version (always present) | | `X-OmniRoute-Cost-Saved` | Response | USD the cache avoided on a HIT (cache hits only) | > Nginx note: if you rely on underscore headers (for example `x_session_id`), enable `underscores_in_headers on;`. > **Cost telemetry headers:** non-streaming success responses also carry the `X-OmniRoute-*` cost-telemetry set — `X-OmniRoute-Response-Cost` (USD, fixed 10 decimals; `0.0000000000` for free/unpriced), `X-OmniRoute-Tokens-In` / `X-OmniRoute-Tokens-Out`, `X-OmniRoute-Model`, `X-OmniRoute-Provider`, `X-OmniRoute-Latency-Ms`, `X-OmniRoute-Cache-Hit`, and `X-OmniRoute-Fallback-Attempts` (only when > 0), plus `X-OmniRoute-Request-Id` and `X-OmniRoute-Version`. These are emitted by chat completions, `/v1/responses`, `/v1/messages`, **and the media endpoints** — `/v1/embeddings`, `/v1/images/generations`, `/v1/audio/speech`, `/v1/audio/transcriptions`, `/v1/rerank`, `/v1/videos/generations`, `/v1/music/generations`, and `/v1/moderations` (always cost `0`). Media cost is computed per modality (per-image, per-second, per-character, per search-unit) when pricing is available, otherwise `0` (fail-open). > **Cache-hit cost semantics:** on a semantic-cache HIT (`X-OmniRoute-Cache-Hit: true`) no upstream call is made, so `X-OmniRoute-Response-Cost` is `0.0000000000` (the **incremental** cost of serving the hit). The original/would-have-been cost is reported separately in `X-OmniRoute-Cost-Saved`. Billing consumers should sum `X-OmniRoute-Response-Cost` (hits cost nothing); cache analytics can aggregate `X-OmniRoute-Cost-Saved`. ### `x-omniroute-compression` Per-request override of the compression plan. Highest precedence — beats the routing-combo override, the active profile, auto-trigger, and the panel Default. Values: | Value | Effect | | ------------- | -------------------------------------------------------------------- | | `off` | No compression for this request. | | `default` | The panel-derived Default profile (ignores the active profile). | | `engine:` | A single engine when enabled, e.g. `engine:rtk`. | | `` | A named combo, matched by name (case-insensitive) first, then by id. | Notes: - Unknown values are ignored (the request is never rejected); resolution falls through to the normal operator precedence. - If multiple combos share a name, pass the combo **id** for a deterministic match. - A combo whose name is `off` or `default` cannot be selected by name (those keywords are interpreted first); reference such a combo by its id. - The master compression switch is a hard gate: when compression is disabled globally, this header cannot enable it. The applied plan is echoed back in the response header: ``` X-OmniRoute-Compression: ; source= ``` where `` is one of `request-header`, `routing-override`, `active-profile`, `auto-trigger`, `default`, or `off`. --- ## Embeddings ```bash POST /v1/embeddings Authorization: Bearer your-api-key Content-Type: application/json { "model": "nebius/Qwen/Qwen3-Embedding-8B", "input": "The food was delicious" } ``` Available providers: Nebius, OpenAI, Mistral, Together AI, Fireworks, NVIDIA, **OpenRouter**, **GitHub Models**. ```bash # List all embedding models GET /v1/embeddings ``` --- ## Image Generation ```bash POST /v1/images/generations Authorization: Bearer your-api-key Content-Type: application/json { "model": "openai/gpt-image-2", "prompt": "A beautiful sunset over mountains", "size": "1024x1024" } ``` Available providers: OpenAI (GPT Image 2), xAI (Grok Image), Together AI (FLUX), Fireworks AI, Nebius (FLUX), Hyperbolic, NanoBanana, **OpenRouter**, SD WebUI (local), ComfyUI (local). ```bash # List all image models GET /v1/images/generations ``` --- ## List Models ```bash GET /v1/models Authorization: Bearer your-api-key → Returns all chat, embedding, and image models + combos in OpenAI format ``` ### No-thinking model variants For thinking-capable Claude models, `/v1/models` also advertises a **no-thinking** variant whose id is prefixed with `claude-3-omniroute-no-thinking/`: ``` claude-3-omniroute-no-thinking// ``` Selecting this id (e.g. in a Claude Code config that always attaches a `thinking` block) resolves back to the real `/` with reasoning suppressed — `thinking:{type:"disabled"}` on the `/v1/messages` path, or the `reasoning`/`reasoning_effort` fields dropped on the `/v1/chat/completions` path. The variant is only listed for Claude-family models that support thinking **and** honor `disabled` (so e.g. adaptive-only models that reject `disabled` are excluded). Operators can force the variant on or off per model via `ModelSpec.noThinkingAlias`. --- ## Compatibility Endpoints | Method | Path | Format | | ------ | ----------------------------------------- | -------------------------------- | | POST | `/v1/chat/completions` | OpenAI | | POST | `/v1/messages` | Anthropic | | POST | `/v1/responses` | OpenAI Responses | | POST | `/v1/embeddings` | OpenAI | | POST | `/v1/images/generations` | OpenAI Images | | POST | `/v1/images/edits` | OpenAI Images (edit/inpaint) | | POST | `/v1/videos/generations` | OpenAI-style video generation | | POST | `/v1/music/generations` | OpenAI-style music generation | | POST | `/v1/audio/transcriptions` | OpenAI Audio (STT) | | POST | `/v1/audio/speech` | OpenAI TTS (returns audio body) | | POST | `/v1/rerank` | Cohere/Voyage-style rerank | | POST | `/v1/moderations` | OpenAI Moderations | | GET | `/v1/models` | OpenAI | | POST | `/v1/messages/count_tokens` | Anthropic | | GET | `/v1beta/models` | Gemini | | POST | `/v1beta/models/{...path}` | Gemini generateContent | | POST | `/v1/api/chat` | Ollama | | GET | `/api/v1/vscode/{token}/` | OpenAI catalog alias | | GET | `/api/v1/vscode/{token}/models` | OpenAI models alias | | POST | `/api/v1/vscode/{token}/chat/completions` | OpenAI tokenized alias | | POST | `/api/v1/vscode/{token}/responses` | OpenAI Responses tokenized alias | | POST | `/api/v1/vscode/{token}/api/chat` | Ollama tokenized alias | | GET | `/api/v1/vscode/{token}/api/tags` | Ollama tags tokenized alias | All POST routes follow the same shape: `Bearer your-api-key` + Zod-validated JSON body (`v1RerankSchema`, `v1ModerationSchema`, `v1AudioSpeechSchema`, etc., see `src/shared/validation/schemas.ts`). 4xx is returned on schema failure. For clients that cannot attach `Authorization: Bearer ...`, OmniRoute also accepts API keys in the URL via either query-string compatibility (`?token=...`, `?apiKey=...`, `?api_key=...`, `?key=...`) or the dedicated `/api/v1/vscode/{token}/...` endpoints documented below. ```bash # Rerank POST /v1/rerank { "model": "cohere/rerank-3", "query": "...", "documents": ["..."] } # Moderations POST /v1/moderations { "model": "omni-moderation-latest", "input": "..." } # TTS — returns audio/mpeg (or requested format) body POST /v1/audio/speech { "model": "openai/tts-1", "input": "Hello", "voice": "alloy" } # Image edit (multipart) POST /v1/images/edits -F image=@input.png -F prompt="..." -F mask=@mask.png # Video / music generation (provider-prefixed model id) POST /v1/videos/generations { "model": "runway/gen-3", "prompt": "..." } POST /v1/music/generations { "model": "suno/v3.5", "prompt": "..." } ``` ### Dedicated Provider Routes ```bash POST /v1/providers/{provider}/chat/completions POST /v1/providers/{provider}/embeddings POST /v1/providers/{provider}/images/generations ``` The provider prefix is auto-added if missing. Mismatched models return `400`. --- ## Files API OpenAI-compatible files endpoint for batch input/output and file-purpose uploads. | Method | Path | Description | | ------ | ------------------------ | ------------------------------------------------------------------------------------------------------------- | | POST | `/v1/files` | Upload a file (multipart: `file`, `purpose`, `expires_after[anchor]`, `expires_after[seconds]`) — 512 MiB max | | GET | `/v1/files` | List files for the authenticated API key | | GET | `/v1/files/[id]` | Retrieve a file's metadata | | DELETE | `/v1/files/[id]` | Delete a file | | GET | `/v1/files/[id]/content` | Stream the raw file body back | **Auth:** Bearer API key — files are scoped per-API-key via `getApiKeyRequestScope`. --- ## Batches API OpenAI-compatible batch processing. | Method | Path | Description | | ------ | ------------------------- | --------------------------------------------------------------------------------------------------------- | | POST | `/v1/batches` | Create batch — body validated by `v1BatchCreateSchema` (`input_file_id`, `endpoint`, `completion_window`) | | GET | `/v1/batches` | List batches | | GET | `/v1/batches/[id]` | Retrieve batch status + `request_counts` | | DELETE | `/v1/batches/[id]` | Delete a finished/failed batch | | POST | `/v1/batches/[id]/cancel` | Cancel an in-progress batch | **Auth:** Bearer API key. Batches are scoped per-API-key. --- ## Search API Web/search provider abstraction (Tavily, Brave, Exa, Serper, etc.). | Method | Path | Description | | ------ | ---------------------- | ------------------------------------------------------------------------------------ | | GET | `/v1/search` | List configured search providers + capabilities | | POST | `/v1/search` | Run a search query — body validated by `v1SearchSchema`, supports caching/coalescing | | GET | `/v1/search/analytics` | Per-provider hit/latency/cache stats | **Auth:** Bearer API key (`extractApiKey` + `isValidApiKey`). Search policy enforced via `enforceApiKeyPolicy`. --- ## WebSocket Streaming ```bash GET /v1/ws?handshake=1 ``` Validates a WebSocket upgrade handshake and returns the wire protocol example messages (`request`, `cancel`). Actual WS frames are handled by the bundled WS server outside the Next.js route table. **Auth:** Bearer API key during handshake. ### Responses API over WebSocket (codex only) ```bash # Same host:port as the HTTP API (default 20128); upgrade the connection: wscat -c "ws://localhost:20128/v1/responses?api_key=" # (or: -H "Authorization: Bearer ") # First frame MUST be response.create: { "type": "response.create", "model": "gpt-5.5", "input": [ { "role": "user", "content": "hi" } ] } ``` A Responses-API-over-WebSocket proxy is wired **exclusively to `codex`** (ChatGPT backend). It listens on the same port as the API/dashboard at paths `/v1/responses`, `/responses`, and `/api/v1/responses`. On the first `response.create` frame it authenticates + prepares via the internal `codex-responses-ws` bridge, selects a codex OAuth connection, and tunnels to `wss://chatgpt.com/backend-api/codex/responses` via the `wreq-js` transport. **Non-codex models are rejected** (`codex_ws_provider_required`). For quota-share routing use `model: "qtSd//codex/"`. Implemented in `app/server-ws.mjs` + `scripts/dev/responses-ws-proxy.mjs` + `src/app/api/internal/codex-responses-ws/route.ts`. **Auth:** Bearer API key during handshake. The bundled HTTP server (`server-ws.mjs`) must be the active entrypoint (it is, by default, when `app/server-ws.mjs` exists). #### Model id: use the bare ChatGPT id (no `codex/` prefix) The OpenAI **Codex CLI** validates the model name client-side when `supports_websockets = true` and **rejects provider-prefixed ids** like `codex/gpt-5.5` (`The 'codex/gpt-5.5' model is not supported when using Codex with a ChatGPT account`). Send the **bare** id (e.g. `gpt-5.5`). OmniRoute's bridge is codex-only, so it re-resolves a bare id as a codex model (`resolveCodexWsModelInfo`) before tunneling upstream — even though a bare `gpt-5.5` would otherwise route to another provider over HTTP. #### Configuring the OpenAI Codex CLI Point the Codex CLI at OmniRoute by adding a custom provider with WebSocket support to `~/.codex/config.toml` (use a separate `CODEX_HOME` to avoid touching an existing config): ```toml model = "gpt-5.5" # bare id — NOT "codex/gpt-5.5" model_provider = "omniroute" [model_providers.omniroute] name = "OmniRoute (WS)" base_url = "http://localhost:20128/v1" # no trailing slash; the WS URL is derived (use https/wss in production) wire_api = "responses" # only supported value since Feb 2026 supports_websockets = true # enables the Responses-over-WS transport env_key = "OMNIROUTE_API_KEY" # holds the OmniRoute API key (Bearer) ``` ```bash export OMNIROUTE_API_KEY=sk-... # an OmniRoute API key (any key if REQUIRE_API_KEY=false) codex exec "Responda apenas: PONG" ``` The CLI upgrades `base_url + /responses` to a WebSocket and OmniRoute tunnels it to the selected codex OAuth connection. Validated end-to-end against the local server: ChatGPT returns `codex.rate_limits` + `response.created` and streams the completion. --- ## Quotas & Issues Reporting | Method | Path | Description | | ------ | ------------------- | ------------------------------------------------------------------------------------- | | GET | `/v1/quotas/check` | Pre-validate quota for a `provider` + `accountId` before issuing a registered key | | POST | `/v1/issues/report` | Report a quota/key issuance failure to GitHub (requires `GITHUB_ISSUES_REPO` + token) | **Auth:** Bearer API key (`isAuthenticated`). --- ## Semantic Cache ```bash # Get cache stats GET /api/cache/stats # Clear all caches DELETE /api/cache/stats ``` Response example: ```json { "semanticCache": { "memorySize": 42, "memoryMaxSize": 500, "dbSize": 128, "hitRate": 0.65 }, "idempotency": { "activeKeys": 3, "windowMs": 5000 } } ``` --- ## Dashboard & Management ### Authentication | Endpoint | Method | Description | | ----------------------------- | ------- | --------------------- | | `/api/auth/login` | POST | Login | | `/api/auth/logout` | POST | Logout | | `/api/settings/require-login` | GET/PUT | Toggle login required | ### Provider Management | Endpoint | Method | Description | | ---------------------------- | --------------------- | ---------------------------------------------- | | `/api/providers` | GET/POST | List / create providers | | `/api/providers/[id]` | GET/PUT/DELETE | Manage a provider | | `/api/providers/[id]/test` | POST | Test provider connection | | `/api/providers/[id]/models` | GET | List provider models | | `/api/providers/validate` | POST | Validate provider config | | `/api/provider-nodes*` | Various | Provider node management | | `/api/provider-models` | GET/POST/PATCH/DELETE | Custom models (add, update, hide/show, delete) | ### OAuth Flows | Endpoint | Method | Description | | -------------------------------- | ------- | ----------------------- | | `/api/oauth/[provider]/[action]` | Various | Provider-specific OAuth | ### Routing & Config | Endpoint | Method | Description | | --------------------- | -------- | ----------------------------- | | `/api/models/alias` | GET/POST | Model aliases | | `/api/models/catalog` | GET | All models by provider + type | | `/api/combos*` | Various | Combo management | | `/api/keys*` | Various | API key management | | `/api/pricing` | GET | Model pricing | ### Usage & Analytics | Endpoint | Method | Description | | --------------------------- | --------------- | ------------------------------- | | `/api/usage/history` | GET | Usage history | | `/api/usage/logs` | GET | Usage logs | | `/api/usage/request-logs` | GET | Request-level logs | | `/api/usage/[connectionId]` | GET | Per-connection usage | | `/api/usage/token-limits` | GET/POST/DELETE | Per-API-key token-limit budgets | ### Settings | Endpoint | Method | Description | | ------------------------------------- | ------------- | --------------------------------------------------- | | `/api/settings` | GET/PUT/PATCH | General settings | | `/api/settings/proxy` | GET/PUT | Network proxy config | | `/api/settings/proxy/test` | POST | Test proxy connection | | `/api/settings/ip-filter` | GET/PUT | IP allowlist/blocklist | | `/api/settings/thinking-budget` | GET/PUT | Reasoning token budget | | `/api/settings/system-prompt` | GET/PUT | Global system prompt | | `/api/settings/compression` | GET/PUT | Global compression config | | `/api/settings/purge-request-history` | POST | Clear request log rows and local call-log artifacts | ### Context & Compression | Endpoint | Method | Description | | -------------------------------------- | -------------- | ------------------------------------------------------------------------ | | `/api/compression/preview` | POST | Preview off/lite/standard/aggressive/ultra/RTK/stacked compression | | `/api/compression/language-packs` | GET | List available Caveman language packs | | `/api/compression/rules` | GET | List Caveman rule metadata | | `/api/context/caveman/config` | GET/PUT | Caveman-specific settings alias | | `/api/context/rtk/config` | GET/PUT | RTK-specific settings, including custom filters and raw-output retention | | `/api/context/rtk/filters` | GET | RTK filter catalog and custom-filter diagnostics | | `/api/context/rtk/test` | POST | Run RTK preview/test against a text payload | | `/api/context/rtk/raw-output/[id]` | GET | Read retained redacted raw output by pointer id | | `/api/context/combos` | GET/POST | Compression combo list/create | | `/api/context/combos/[id]` | GET/PUT/DELETE | Compression combo detail/update/delete | | `/api/context/combos/[id]/assignments` | GET/PUT | Assign compression combos to routing combos | | `/api/context/analytics` | GET | Compression analytics alias | ### Monitoring | Endpoint | Method | Description | | ------------------------ | ---------- | ---------------------------------------------------------------------------------------------------- | | `/api/sessions` | GET | Active session tracking | | `/api/rate-limits` | GET | Per-account rate limits | | `/api/monitoring/health` | GET | Health check + provider summary (`catalogCount`, `configuredCount`, `activeCount`, `monitoredCount`) | | `/api/cache/stats` | GET/DELETE | Cache stats / clear | ### Backup & Export/Import | Endpoint | Method | Description | | --------------------------- | ------ | --------------------------------------- | | `/api/db-backups` | GET | List available backups | | `/api/db-backups` | PUT | Create a manual backup | | `/api/db-backups` | POST | Restore from a specific backup | | `/api/db-backups/export` | GET | Download database as .sqlite file | | `/api/db-backups/import` | POST | Upload .sqlite file to replace database | | `/api/db-backups/exportAll` | GET | Download full backup as .tar.gz archive | ### Cloud Sync | Endpoint | Method | Description | | ---------------------- | ------- | --------------------- | | `/api/sync/cloud` | Various | Cloud sync operations | | `/api/sync/initialize` | POST | Initialize sync | | `/api/cloud/*` | Various | Cloud management | ### Tunnels | Endpoint | Method | Description | | -------------------------- | ------ | ----------------------------------------------------------------------- | | `/api/tunnels/cloudflared` | GET | Read Cloudflare Quick Tunnel install/runtime status for the dashboard | | `/api/tunnels/cloudflared` | POST | Enable or disable the Cloudflare Quick Tunnel (`action=enable/disable`) | | `/api/tunnels/ngrok` | GET | Read ngrok Tunnel runtime status for the dashboard | | `/api/tunnels/ngrok` | POST | Enable or disable the ngrok Tunnel (`action=enable/disable`) | ### CLI Tools | Endpoint | Method | Description | | ---------------------------------- | ------ | ------------------- | | `/api/cli-tools/claude-settings` | GET | Claude CLI status | | `/api/cli-tools/codex-settings` | GET | Codex CLI status | | `/api/cli-tools/droid-settings` | GET | Droid CLI status | | `/api/cli-tools/openclaw-settings` | GET | OpenClaw CLI status | | `/api/cli-tools/runtime/[toolId]` | GET | Generic CLI runtime | CLI responses include: `installed`, `runnable`, `command`, `commandPath`, `runtimeMode`, `reason`. ### ACP Agents | Endpoint | Method | Description | | ----------------- | ------ | -------------------------------------------------------- | | `/api/acp/agents` | GET | List all detected agents (built-in + custom) with status | | `/api/acp/agents` | POST | Add custom agent or refresh detection cache | | `/api/acp/agents` | DELETE | Remove a custom agent by `id` query param | GET response includes `agents[]` (id, name, binary, version, installed, protocol, isCustom) and `summary` (total, installed, notFound, builtIn, custom). ### Resilience & Rate Limits | Endpoint | Method | Description | | --------------------------------- | --------- | ------------------------------------------------------------------------------------ | | `/api/resilience` | GET/PATCH | Get/update request queue, connection cooldown, provider breaker, and wait settings | | `/api/resilience/reset` | POST | Reset provider circuit breakers | | `/api/resilience/model-cooldowns` | GET | List active per-(provider, connection, model) lockouts, sorted by remaining time | | `/api/resilience/model-cooldowns` | DELETE | Clear a model lockout — body `{provider, model}` or `{all: true}` to wipe everything | | `/api/rate-limits` | GET | Per-account rate limit status | | `/api/rate-limit` | GET | Global rate limit configuration | > All four `/api/resilience/*` routes require **management auth** (`requireManagementAuth`). See [Resilience (extended)](#resilience-extended) for a full breakdown of provider breaker vs connection cooldown vs model lockout. ### Evals | Endpoint | Method | Description | | ------------ | -------- | --------------------------------- | | `/api/evals` | GET/POST | List eval suites / run evaluation | ### Policies | Endpoint | Method | Description | | --------------- | --------------- | ----------------------- | | `/api/policies` | GET/POST/DELETE | Manage routing policies | ### Compliance | Endpoint | Method | Description | | --------------------------- | ------ | ----------------------------- | | `/api/compliance/audit-log` | GET | Compliance audit log (last N) | ### v1beta (Gemini-Compatible) | Endpoint | Method | Description | | -------------------------- | ------ | --------------------------------- | | `/v1beta/models` | GET | List models in Gemini format | | `/v1beta/models/{...path}` | POST | Gemini `generateContent` endpoint | These endpoints mirror Gemini's API format for clients that expect native Gemini SDK compatibility. ### Internal / System APIs | Endpoint | Method | Description | | ------------------------ | ------ | ---------------------------------------------------- | | `/api/init` | GET | Application initialization check (used on first run) | | `/api/tags` | GET | Ollama-compatible model tags (for Ollama clients) | | `/api/restart` | POST | Trigger graceful server restart | | `/api/shutdown` | POST | Trigger graceful server shutdown | | `/api/system/env/repair` | POST | Repair OAuth provider environment variables | > **Note:** These endpoints are used internally by the system or for Ollama client compatibility. They are not typically called by end users. ### OAuth Environment Repair _(v3.6.1+)_ ```bash POST /api/system/env/repair Content-Type: application/json { "provider": "claude-code" } ``` Repairs missing or corrupted OAuth environment variables for a specific provider. Returns: ```json { "success": true, "repaired": ["CLAUDE_CODE_OAUTH_CLIENT_ID", "CLAUDE_CODE_OAUTH_CLIENT_SECRET"], "backupPath": "/home/user/.omniroute/backups/env-repair-2026-04-11.bak" } ``` --- ## Audio Transcription ```bash POST /v1/audio/transcriptions Authorization: Bearer your-api-key Content-Type: multipart/form-data ``` Transcribe audio files using Deepgram or AssemblyAI. **Request:** ```bash curl -X POST http://localhost:20128/v1/audio/transcriptions \ -H "Authorization: Bearer your-api-key" \ -F "file=@recording.mp3" \ -F "model=deepgram/nova-3" ``` **Response:** ```json { "text": "Hello, this is the transcribed audio content.", "task": "transcribe", "language": "en", "duration": 12.5 } ``` **Supported providers:** `deepgram/nova-3`, `assemblyai/best`. **Supported formats:** `mp3`, `wav`, `m4a`, `flac`, `ogg`, `webm`. --- ## Ollama Compatibility For clients that use Ollama's API format: ```bash # Chat endpoint (Ollama format) POST /v1/api/chat # Model listing (Ollama format) GET /api/tags ``` Requests are automatically translated between Ollama and internal formats. ## Tokenized VS Code / Headerless Aliases Use these aliases when an integration cannot inject an `Authorization` header and needs the API key embedded in the base URL. ```bash # OpenAI-style catalog alias GET /api/v1/vscode/{token}/ GET /api/v1/vscode/{token}/models # OpenAI-style chat aliases POST /api/v1/vscode/{token}/chat/completions POST /api/v1/vscode/{token}/responses # Ollama-style aliases POST /api/v1/vscode/{token}/api/chat GET /api/v1/vscode/{token}/api/tags ``` Example: ```bash curl https://your-host.example/api/v1/vscode/YOUR_API_KEY/models curl -X POST https://your-host.example/api/v1/vscode/YOUR_API_KEY/chat/completions \ -H "Content-Type: application/json" \ -d '{"model":"auto","messages":[{"role":"user","content":"hello"}]}' ``` Notes: - The tokenized aliases reuse the same handlers as `/v1/*` and `/api/tags`; response shapes stay identical. - Prefer `Authorization: Bearer ...` whenever the client supports custom headers. - URL-based tokens may appear in reverse-proxy logs, browser history, and telemetry outside OmniRoute. Treat them as a compatibility option, not the default authentication mode. --- ## Telemetry ```bash # Get latency telemetry summary (p50/p95/p99 per provider) GET /api/telemetry/summary ``` **Response:** ```json { "providers": { "claudeCode": { "p50": 245, "p95": 890, "p99": 1200, "count": 150 }, "github": { "p50": 180, "p95": 620, "p99": 950, "count": 320 } } } ``` --- ## Budget ```bash # Get budget status for all API keys GET /api/usage/budget # Set or update a budget POST /api/usage/budget Content-Type: application/json { "apiKeyId": "key-123", "dailyLimitUsd": 5.00, "weeklyLimitUsd": 30.00, "monthlyLimitUsd": 100.00, "warningThreshold": 0.8, "resetInterval": "monthly" } ``` > **Schema notes** (`setBudgetSchema`): `apiKeyId` is required; at least one of `dailyLimitUsd`, `weeklyLimitUsd`, or `monthlyLimitUsd` must be greater than zero. Optional fields: `warningThreshold` (0–1), `resetInterval` (`daily` | `weekly` | `monthly`), `resetTime` (`HH:MM`). The legacy `{keyId, limit, period}` shape returns `400 Bad Request`. ## Token Limits Per-API-key **token** budgets (distinct from the USD-based Budget above). Enforced inline on the request path: when a key's current window usage reaches its limit, requests are rejected with `429 Too Many Requests`. Limits can be scoped to a specific `model`, a `provider`, or applied `global`ly across the key; when several limits match a request, the most restrictive one wins. ```bash # List a key's token limits (includes live window usage) GET /api/usage/token-limits?apiKeyId=key-123 # Create or update a token limit POST /api/usage/token-limits Content-Type: application/json { "apiKeyId": "key-123", "scopeType": "model", "scopeValue": "openai/gpt-4o", "tokenLimit": 1000000, "resetInterval": "monthly", "enabled": true } # Delete a token limit by id DELETE /api/usage/token-limits?id=tl-abc ``` > **Schema notes** (`setTokenLimitSchema`): `apiKeyId` and `scopeType` (`model` | `provider` | `global`) are required. `scopeValue` is required unless `scopeType` is `global` (e.g. a model id for `model` scope, a provider id for `provider` scope). `tokenLimit` must be a positive integer (coerced from string). Optional: `id` (omit to create, supply to update), `resetInterval` (`daily` | `weekly` | `monthly`, default `monthly`), `resetTime` (`HH:MM`), `enabled` (default `true`). `GET` responses enrich each limit with `tokensUsed`, `remaining`, `windowStart`, `periodStartAt`, and `nextResetAt`. This is a management-class endpoint (auth enforced centrally by the authz pipeline). ## Request Processing 1. Client sends request to `/v1/*` 2. Route handler calls `handleChat`, `handleEmbedding`, `handleAudioTranscription`, or `handleImageGeneration` 3. Model is resolved (direct provider/model or alias/combo) 4. Credentials selected from local DB with account availability filtering 5. For chat: `handleChatCore` checks semantic/signature cache and resolves combo compression settings 6. Proactive compression runs before provider translation when enabled (`lite`, Caveman, RTK, or stacked) 7. Provider executor sends upstream request 8. Response translated back to client format (chat) or returned as-is (embeddings/images/audio) 9. Usage, compression analytics, and request logs are recorded 10. Fallback applies on errors according to combo rules Full architecture reference: [`ARCHITECTURE.md`](../architecture/ARCHITECTURE.md) --- ## Combo Management Higher-level routing combos (already summarized under `/api/combos*`) can also be mapped 1:1 from a model id pattern, allowing transparent redirection of an OpenAI-style model id to a combo. | Method | Path | Description | | ------ | -------------------------------- | ------------------------------------------------------------------------------ | | GET | `/api/model-combo-mappings` | List all model→combo mappings | | POST | `/api/model-combo-mappings` | Create mapping — body: `{pattern, comboId, priority?, enabled?, description?}` | | GET | `/api/model-combo-mappings/[id]` | Retrieve a single mapping | | PUT | `/api/model-combo-mappings/[id]` | Update fields of an existing mapping | | DELETE | `/api/model-combo-mappings/[id]` | Remove a mapping | **Auth:** management session/API key (`requireManagementAuth`). --- ## Webhooks Outbound webhook subscriptions for OmniRoute events (request completion, quota exhaustion, key rotation, etc.). | Method | Path | Description | | ------ | ------------------------- | --------------------------------------------------------------------- | | GET | `/api/webhooks` | List webhooks (secrets are masked to `...`) | | POST | `/api/webhooks` | Create webhook — body: `{url, events?: ["*"], secret?, description?}` | | GET | `/api/webhooks/[id]` | Retrieve a webhook | | PUT | `/api/webhooks/[id]` | Update url/events/secret/description | | DELETE | `/api/webhooks/[id]` | Remove a webhook | | POST | `/api/webhooks/[id]/test` | Send a test payload to the webhook URL and return delivery status | **Auth:** management session/API key (`requireManagementAuth`). --- ## Registered Keys (Auto-Management) Used by the auto-key management subsystem to issue and rotate API keys against a backing provider/account, with daily/hourly quotas. | Method | Path | Description | | ------ | ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | GET | `/api/v1/registered-keys` | List registered keys (masked prefix only) | | POST | `/api/v1/registered-keys` | Issue a new registered key — body: `{name, provider?, accountId?, idempotencyKey?, expiresAt?, dailyBudget?, hourlyBudget?}`. Returns the raw key **once**. Returns `429` on quota refusal. | | GET | `/api/v1/registered-keys/[id]` | Retrieve a registered key's metadata (no raw material) | | DELETE | `/api/v1/registered-keys/[id]` | Revoke a registered key | | POST | `/api/v1/registered-keys/[id]/revoke` | Explicit revoke endpoint (same effect as DELETE) | **Auth:** Bearer API key (`isAuthenticated`). See also `/v1/quotas/check` and `/v1/issues/report`. --- ## Agents Protocol Cloud agent tasks (Claude Code, Codex Cloud, OpenHands, etc.) executed remotely on behalf of OmniRoute users. | Method | Path | Description | | ------ | ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------- | | GET | `/api/v1/agents/tasks` | List tasks — optional `?provider=`, `?status=`, `?limit=` (1–500, default 50) | | POST | `/api/v1/agents/tasks` | Create task — body validated by `CreateCloudAgentTaskSchema` (`providerId`, `prompt`, `source`, `options?`). Returns `201` with task envelope | | DELETE | `/api/v1/agents/tasks?id=...` | Delete a task | | GET | `/api/v1/agents/tasks/[id]` | Read task — synchronously refreshes status from the upstream cloud agent when an `external_id` is set | | POST | `/api/v1/agents/tasks/[id]` | Discriminated action: `{action: "approve"}`, `{action: "message", message}`, or `{action: "cancel"}` | | DELETE | `/api/v1/agents/tasks/[id]` | Delete a specific task by id | > **Auth:** management auth required on every method (`requireCloudAgentManagementAuth`). Prior to v3.8.0 these were unauthenticated — see commit `588a0333` for the breaking change. ```bash # Create a Claude Code cloud task curl -X POST http://localhost:20128/api/v1/agents/tasks \ -H "Authorization: Bearer your-management-key" \ -H "Content-Type: application/json" \ -d '{"providerId":"claude-code-cloud","prompt":"Fix the failing test","source":{"repo":"...","branch":"..."}}' ``` --- ## Management Proxies Outbound HTTP(S)/SOCKS proxies that can be assigned to providers, accounts, or globally. | Method | Path | Description | | ------ | -------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ | | GET | `/api/v1/management/proxies` | List proxies (with `?id=` returns one; with `?id=&where_used=1` returns the assignment graph) | | POST | `/api/v1/management/proxies` | Create proxy — body validated by `createProxyRegistrySchema` | | PATCH | `/api/v1/management/proxies` | Update proxy — body validated by `updateProxyRegistrySchema` (requires `id`) | | DELETE | `/api/v1/management/proxies?id=...&force=1` | Delete proxy (use `force=1` to detach assignments) | | GET | `/api/v1/management/proxies/assignments` | List assignments — filterable by `proxy_id`, `scope`, `scope_id`; pass `resolve_connection_id=` to resolve the active proxy for a connection | | PUT | `/api/v1/management/proxies/assignments` | Assign — body validated by `proxyAssignmentSchema` (`{scope, scopeId?, proxyId?}`). Clears dispatcher cache | | PUT | `/api/v1/management/proxies/bulk-assign` | Bulk-assign — body validated by `bulkProxyAssignmentSchema` (`{scope, scopeIds[], proxyId?}`) | | GET | `/api/v1/management/proxies/health?hours=24` | Aggregate proxy health (success/fail counts, latency) over a window | **Auth:** management session/API key on every route (`requireManagementAuth`). > The task description's `POST /api/v1/management/proxies/[id]/assignments` and `POST /api/v1/management/proxies/[id]/health` are served by the flat `/assignments` and `/health` routes shown above — there are no per-id subroutes in the codebase. --- ## Resilience (extended) OmniRoute exposes three independent temporary-failure mechanisms; the management endpoints below let operators read and override them: | Scope | State storage | Read | Reset / clear | | ------------------- | ------------------------------------------ | ----------------------------------------- | ------------------------------------------- | | Provider breaker | `domain_circuit_breakers` + in-memory | `/api/monitoring/health` | `POST /api/resilience/reset` | | Connection cooldown | `rateLimitedUntil` on provider connections | `/api/rate-limits`, `/api/providers/[id]` | (re-enables lazily; clear via provider PUT) | | Model lockout | In-memory model-availability registry | `GET /api/resilience/model-cooldowns` | `DELETE /api/resilience/model-cooldowns` | `PATCH /api/resilience` accepts provider breaker overrides under `providerBreaker.oauth` and `providerBreaker.apikey`. Each profile supports `degradationThreshold`, `failureThreshold`, and `resetTimeoutMs`; the same fields are exposed in Dashboard → Settings → Resilience. ```bash # Clear a single model lockout curl -X DELETE http://localhost:20128/api/resilience/model-cooldowns \ -H "Cookie: auth_token=..." \ -H "Content-Type: application/json" \ -d '{"provider":"openai","model":"gpt-4o-mini"}' # Wipe every lockout curl -X DELETE http://localhost:20128/api/resilience/model-cooldowns \ -H "Cookie: auth_token=..." \ -d '{"all":true}' ``` Full conceptual reference and breaker defaults: see [`CLAUDE.md`](../../CLAUDE.md) → "Resilience Runtime State". --- ## Skills Skill framework for extending OmniRoute with custom executable handlers, plus marketplace integrations. | Method | Path | Description | | ------ | --------------------------------- | -------------------------------------------------------------------------------------------------------------------------- | | GET | `/api/skills` | List installed skills — filterable by `?q=`, `?mode=on\|off\|auto`, `?source=skillsmp\|skillssh\|local`, paginated | | GET | `/api/skills/[id]` | Retrieve one skill | | PUT | `/api/skills/[id]` | Update skill (name, description, mode, schema, handler, tags) | | DELETE | `/api/skills/[id]` | Uninstall a skill | | POST | `/api/skills/install` | Install a skill from a raw manifest — body: `{name, version, description, schema:{input, output}, handlerCode, apiKeyId?}` | | GET | `/api/skills/executions` | List recent skill executions (audit trail with inputs/outputs/duration) | | GET | `/api/skills/marketplace?q=...` | Search/popular list from the SkillsMP marketplace (requires `skillsmpApiKey` setting) | | POST | `/api/skills/marketplace/install` | Install a skill by id from SkillsMP | | GET | `/api/skills/skillssh?q=&limit=` | Search the skills.sh registry | | POST | `/api/skills/skillssh/install` | Install a skill by id from skills.sh | **Auth:** management session/API key. Marketplace search routes accept either management auth or a Bearer API key (`isAuthenticated`). --- ## Memory Persistent conversational/factual memory store, scoped per API key / session. | Method | Path | Description | | ------ | -------------------- | ------------------------------------------------------------------------------------------------------------ | | GET | `/api/memory` | List memories — `?apiKeyId=`, `?type=`, `?sessionId=`, `?q=`, with `offset/limit` or `page/limit` pagination | | POST | `/api/memory` | Create memory — body validated by Zod: `{content, key, type?, sessionId?, apiKeyId?, metadata?, expiresAt?}` | | GET | `/api/memory/[id]` | Retrieve one memory | | DELETE | `/api/memory/[id]` | Delete a memory | | GET | `/api/memory/health` | Memory subsystem health (DB connectivity, embeddings backend, vector index status) | **Auth:** management session/API key (`requireManagementAuth`). `type` enum: `FACTUAL`, `EPISODIC`, `SEMANTIC`, `PROCEDURAL` (see `MemoryType` in `src/lib/memory/types.ts`). --- ## MCP Server OmniRoute ships an embedded Model Context Protocol server with 3 transports (stdio, SSE, streamable-http) and scoped tools. The dashboard endpoints below read status/audit data and proxy the HTTP transports. | Method | Path | Description | | ------ | ---------------------- | ------------------------------------------------------------------------------------------------ | -------------------- | | GET | `/api/mcp/status` | Heartbeat, transport, online state, last call, top tools, 24h success rate | | GET | `/api/mcp/tools` | List of MCP tools with `name`, `description`, `scopes`, `phase`, `auditLevel`, `sourceEndpoints` | | GET | `/api/mcp/sse` | Open SSE stream for the SSE transport (returns `503` if MCP disabled or transport mismatch) | | POST | `/api/mcp/sse` | Send JSON-RPC frame on the SSE transport | | GET | `/api/mcp/stream` | Open SSE side of the Streamable HTTP transport (server-initiated messages) | | POST | `/api/mcp/stream` | Send JSON-RPC frame on the Streamable HTTP transport | | DELETE | `/api/mcp/stream` | End a Streamable HTTP session | | GET | `/api/mcp/audit` | Query audit log — `?limit=`, `?offset=`, `?tool=`, `?success=true | false`, `?apiKeyId=` | | GET | `/api/mcp/audit/stats` | Aggregate audit stats (totals, success rate, avg duration, top tools) | **Auth:** the `sse`/`stream` transports honor the MCP-specific auth surface (Bearer API key with `mcp` scope); the `status`/`tools`/`audit*` routes are readable from the dashboard (no extra auth required beyond reaching the dashboard host). > Both HTTP transports are gated by `settings.mcpEnabled` and `settings.mcpTransport` — a transport mismatch returns `400`, an MCP disabled state returns `503`. --- ## A2A Server OmniRoute exposes an A2A (Agent-to-Agent) JSON-RPC 2.0 endpoint plus a REST wrapper for inspection/dashboard use. ### JSON-RPC ```bash POST /a2a Authorization: Bearer your-api-key # optional unless OMNIROUTE_API_KEY is set Content-Type: application/json { "jsonrpc": "2.0", "id": 1, "method": "message/send", "params": { "skill": "smart-routing", "messages": [{"role": "user", "content": "Route this coding task"}] } } ``` Supported methods (all gated on `settings.a2aEnabled`): | Method | Description | | ---------------- | ------------------------------------------------------------------ | | `message/send` | Synchronous skill execution; returns `{task, artifacts, metadata}` | | `message/stream` | Streaming SSE execution of the same skill set | | `tasks/get` | Fetch a task by `taskId` | | `tasks/cancel` | Cancel a task by `taskId` | Built-in skills: `smart-routing`, `quota-management`, `provider-discovery`, `cost-analysis`, `health-report`. ### Agent Card ```bash GET /.well-known/agent.json ``` Returns the public A2A agent card (name, description, capabilities, skill catalog, auth scheme) — cached publicly for 1h. No auth required. ### REST helpers | Method | Path | Description | | ------ | ---------------------------- | --------------------------------------------------------------------------------------------------------------- | | GET | `/api/a2a/status` | A2A enabled + task stats + cached agent card summary | | GET | `/api/a2a/tasks` | List tasks — `?state=submitted\|working\|completed\|failed\|cancelled`, `?skill=`, `?limit=` (≤200), `?offset=` | | POST | `/api/a2a/tasks` | (Not implemented as a REST helper — create via JSON-RPC `message/send`) | | GET | `/api/a2a/tasks/[id]` | Retrieve one task | | POST | `/api/a2a/tasks/[id]/cancel` | Cancel a task | **Auth:** the REST helpers run without management auth (dashboard-readable); the JSON-RPC `/a2a` route uses Bearer `OMNIROUTE_API_KEY` if configured. --- ## Cloud, Evals & Assess | Method | Path | Description | | ------ | ------------------------------- | ------------------------------------------------------------------------------------------------- | ----------------------------- | ----------------------------------- | | POST | `/api/cloud/auth` | Verify a Bearer key and return masked provider connections + model aliases for cloud sync clients | | POST | `/api/cloud/credentials/update` | Update encrypted credentials for a cloud-synced provider | | POST | `/api/cloud/model/resolve` | Resolve a logical model id to a concrete provider/model using the local routing table | | GET | `/api/cloud/models/alias` | List model aliases as exposed to cloud sync | | GET | `/api/assess` | Read latest assessment categorizations (per-provider/model) | | POST | `/api/assess` | Run an assessment — body: `{scope: {type:"all"} | {type:"provider", providerId} | {type:"model", modelId}, trigger?}` | | GET | `/api/evals` | List built-in eval suites + most recent runs | | POST | `/api/evals` | Trigger an eval run | | POST | `/api/evals/suites` | Create a custom eval suite — body validated by `evalSuiteSaveSchema` | | GET | `/api/evals/suites/[id]` | Retrieve a custom eval suite | **Auth:** `/api/cloud/auth` validates a Bearer key directly; the other `/api/cloud/*`, `/api/evals/*`, and `/api/assess` routes require management session/API key. `/api/assess` POST uses `validateBody` with a discriminated-union scope schema. --- ## ACP (Agent Client Protocol) Management as child processes. These endpoints manage ACP agent detection and custom agent registration. | Method | Path | Description | | ------ | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | | GET | `/api/acp/agents` | List all known CLI agents (built-in + custom) with installation status, version, binary | | POST | `/api/acp/agents` | Register a custom ACP agent or refresh cache — body: `{id, name, binary, versionCommand, providerAlias, spawnArgs, protocol}` or `{action: "refresh"}` | | DELETE | `/api/acp/agents` | Remove a custom ACP agent — query param: `?id=` | **Response example** (`GET /api/acp/agents`): ```json { "agents": [ { "id": "claude", "name": "Claude Code CLI", "binary": "claude", "version": "1.0.45", "installed": true, "protocol": "stdio", "providerAlias": "claude", "isCustom": false }, { "id": "my-custom-cli", "name": "My Custom CLI", "installed": false, "protocol": "stdio", "providerAlias": "my-provider", "isCustom": true } ], "cacheTtlMs": 60000, "cacheAge": 1234 } ``` **Auth:** Requires management session (dashboard `auth_token` cookie) or a management-scoped API key. See [ACP Framework](../frameworks/ACP.md) for full details. --- ## Analytics & Observability Real-time analytics endpoints for monitoring routing, compression, and provider diversity. These power the `/dashboard/analytics/*` pages. ### Auto-routing analytics | Method | Path | Description | | ------ | ------------------------------------ | -------------------------------------------------------------------------------------------------- | | GET | `/api/analytics/auto-routing` | Aggregate auto-routing stats: total calls, strategy distribution, tier distribution, top providers | | GET | `/api/analytics/auto-routing?days=7` | Time-windowed stats (default 24h) | **Response example**: ```json { "window": "24h", "totalCalls": 1234, "strategyBreakdown": { "rules": 800, "cost": 200, "latency": 150, "sla-aware": 50, "lkgp": 34 }, "tierBreakdown": { "ultra": 100, "pro": 500, "standard": 400, "free": 234 }, "topProviders": [ { "provider": "openai", "calls": 500, "avgLatencyMs": 850 }, { "provider": "anthropic", "calls": 300, "avgLatencyMs": 1200 } ] } ``` ### Compression analytics | Method | Path | Description | | ------ | ---------------------------- | ------------------------------------------------------------------------------------- | | GET | `/api/analytics/compression` | Aggregate compression stats: tokens saved, savings %, mode distribution, engine usage | **Response example**: ```json { "window": "24h", "totalOriginalTokens": 5000000, "totalCompressedTokens": 3500000, "totalSavings": 1500000, "savingsPct": 30.0, "modeBreakdown": { "lite": 400, "standard": 600, "aggressive": 100, "ultra": 50, "rtk": 84 }, "engineBreakdown": { "caveman": 800, "rtk": 434 } } ``` ### Provider diversity tracking | Method | Path | Description | | ------ | -------------------------- | -------------------------------------------------------------------------------------------------------- | | GET | `/api/analytics/diversity` | Shannon entropy-based diversity tracking: prevents single points of failure by measuring provider spread | **Response example**: ```json { "window": "24h", "shannonEntropy": 2.45, "maxEntropy": 3.17, "diversityRatio": 0.77, "providerUsage": { "openai": 0.4, "anthropic": 0.25, "google": 0.2, "kiro": 0.15 }, "warnings": ["OpenAI accounts for 40% of traffic — consider diversifying"] } ``` **Auth:** Requires management session or management-scoped API key. --- ## Admin Operations Admin-only endpoints for operational management. | Method | Path | Description | | ------ | ------------------------ | ------------------------------------------------------------------------------------------- | | GET | `/api/admin/concurrency` | Read current concurrency limits (global + per-provider) | | POST | `/api/admin/concurrency` | Update concurrency limits — body: `{global?: number, perProvider?: Record}` | **Auth:** Requires management session with admin scope. --- ## CLI Tools Management Manage CLI tools that integrate with OmniRoute (antigravity, chipotle, commandCode, devin-cli, etc.). See [Provider Reference](./PROVIDER_REFERENCE.md) for the full list. | Method | Path | Description | | ------ | --------------------------------------- | ---------------------------------------------------------------------------------------------- | | GET | `/api/cli-tools/all-statuses` | Status of all CLI tools (installed, version, last seen) | | GET | `/api/cli-tools/[id]/status` | Status of a specific CLI tool (id can be: antigravity, chipotle, commandCode, devin-cli, etc.) | | POST | `/api/cli-tools/apply` | Apply a CLI tool configuration to a provider connection | | GET | `/api/cli-tools/backups` | List CLI tool configuration backups | | POST | `/api/cli-tools/backups` | Create a backup of all CLI tool configurations | | POST | `/api/cli-tools/[id]/restore` | Restore a CLI tool from a backup | | GET | `/api/cli-tools/antigravity-mitm` | Antigravity MITM proxy status (the "antigravity-mitm" CLI tool) | | POST | `/api/cli-tools/antigravity-mitm/alias` | Configure antigravity-mitm aliases | **Auth:** Requires management session. --- ## Agent Skills Manage AI agent skills (similar to OpenAI's custom GPTs but for agents). | Method | Path | Description | | ------ | ---------------------------- | --------------------------------------------------------------------------------------- | | GET | `/api/agent-skills` | List all agent skills (built-in + custom) | | GET | `/api/agent-skills/[id]` | Get a specific agent skill | | POST | `/api/agent-skills` | Create a custom agent skill — body: `{name, description, prompt, model?, temperature?}` | | PUT | `/api/agent-skills/[id]` | Update a custom agent skill | | DELETE | `/api/agent-skills/[id]` | Delete a custom agent skill | | GET | `/api/agent-skills/[id]/raw` | Get raw prompt + metadata (no execution) | | POST | `/api/agent-skills/generate` | AI-generate a new skill from a natural language description | **Auth:** Requires management session or management-scoped API key. --- ## Cache Management Manage the semantic cache and reasoning cache. | Method | Path | Description | | ------ | ---------------------- | ------------------------------------------------------------------------------------------------------- | | GET | `/api/cache` | Cache overview: total entries, hit rate, size on disk | | GET | `/api/cache/entries` | List cached entries (with pagination) | | DELETE | `/api/cache/entries` | Delete cache entries (filter by query parameters) | | GET | `/api/cache/stats` | Detailed cache statistics (per-provider, per-model) | | GET | `/api/cache/reasoning` | Reasoning cache status (for reasoning replay) | | DELETE | `/api/cache/reasoning` | Clear reasoning cache — query params: `?toolCallId=` (single) or `?provider=

` or no params (all) | **Auth:** Requires management session. --- ## Memory System Manage persistent memory (FTS5 + vector embeddings). | Method | Path | Description | | ------ | -------------------- | --------------------------------------------------------------------- | | GET | `/api/memory` | List memory entries (filter by scope, type, search query) | | POST | `/api/memory` | Create a new memory entry — body: `{scope, type, content, metadata?}` | | GET | `/api/memory/[id]` | Get a specific memory entry | | PUT | `/api/memory/[id]` | Update a memory entry | | DELETE | `/api/memory/[id]` | Delete a memory entry | | GET | `/api/memory/search` | Search memory (FTS5 + vector) | | POST | `/api/memory/clear` | Clear memory entries (with filters) | | GET | `/api/memory/stats` | Memory statistics (total entries, embedding coverage, etc.) | **Auth:** Requires management session or management-scoped API key. --- ## Webhooks Manage webhook subscriptions for events. | Method | Path | Description | | ------ | ------------------------------- | ------------------------------------------------------------------------- | | GET | `/api/webhooks` | List all webhook subscriptions | | POST | `/api/webhooks` | Create a webhook subscription — body: `{url, events[], secret?, active?}` | | GET | `/api/webhooks/[id]` | Get a specific webhook subscription | | PUT | `/api/webhooks/[id]` | Update a webhook subscription | | DELETE | `/api/webhooks/[id]` | Delete a webhook subscription | | GET | `/api/webhooks/events` | List all available webhook event types | | GET | `/api/webhooks/[id]/deliveries` | List delivery history for a webhook (success/failure log) | | POST | `/api/webhooks/[id]/test` | Send a test event to a webhook | **Auth:** Requires management session. See [Webhooks Framework](../frameworks/WEBHOOKS.md) for full event types. --- ## Skills Framework Manage Skills (the agentic extensions framework). | Method | Path | Description | | ------ | ------------------------ | --------------------------------------------------------------------------------------- | | GET | `/api/skills` | List all installed skills (built-in + custom) | | POST | `/api/skills/install` | Install a skill from a local path or URL | | DELETE | `/api/skills/[id]` | Uninstall a skill | | PUT | `/api/skills/[id]` | Enable or disable a skill — body: `{enabled?: boolean, mode?: "on" \| "off" \| "auto"}` | | POST | `/api/skills/executions` | Execute a skill — body: `{skillName, apiKeyId, input?, sessionId?}` | | GET | `/api/skills/executions` | List execution history for all skills (filter by `?apiKeyId=`) | **Auth:** Requires management session or management-scoped API key. See [Skills Framework](../frameworks/SKILLS.md) for full details. --- ## Plugins Manage OmniRoute plugins (third-party extensions). | Method | Path | Description | | ------ | -------------------------------- | ----------------------------------------- | | GET | `/api/plugins` | List installed plugins | | POST | `/api/plugins/install` | Install a plugin from a local path or URL | | DELETE | `/api/plugins/[name]` | Uninstall a plugin | | POST | `/api/plugins/[name]/activate` | Activate a plugin | | POST | `/api/plugins/[name]/deactivate` | Deactivate a plugin | | GET | `/api/plugins/[name]/config` | Get plugin configuration | | PUT | `/api/plugins/[name]/config` | Update plugin configuration | **Auth:** Requires management session. See [Plugins Framework](../frameworks/PLUGIN_SDK.md) for full details. --- ## Shadow Routing Shadow / A-B comparison of providers is **not a standalone REST surface** — it is configured through combo routing (see [Auto-Combo](../routing/AUTO-COMBO.md)). Per-combo comparison metrics are served by `GET /api/combos/metrics`. --- ## Guardrails Inspect the runtime guardrails (PII detection, prompt injection detection, vision bridging). Guardrails run on every request; per-call opt-out is via the `x-omniroute-disabled-guardrails` request header — there is no persisted enable/disable surface. | Method | Path | Description | | ------ | ---------------------- | ---------------------------------------------------------------------------------------- | | GET | `/api/guardrails` | List the registered guardrails and their status (name / enabled / priority) | | POST | `/api/guardrails/test` | Dry-run the pre-call pipeline over a sample input — body: `{input, disabledGuardrails?}` | **Auth:** Requires management session. See [Security > Guardrails](../security/GUARDRAILS.md) for full details. --- --- ## Authentication - Dashboard routes (`/dashboard/*`) use `auth_token` cookie - Login uses saved password hash; fallback to `INITIAL_PASSWORD` - `requireLogin` toggleable via `/api/settings/require-login` - `/v1/*` routes optionally require Bearer API key when `REQUIRE_API_KEY=true` > **Breaking change (v3.8.0)** — `/api/v1/agents/tasks/*` and the cooldown management endpoints now require **management auth** (dashboard `auth_token` cookie or a management-scoped API key). Clients that previously called these routes unauthenticated will receive `401 Unauthorized`. See commit `588a0333` (`fix(auth): require management auth for agent and cooldown APIs`).