---
title: "API Reference"
version: 3.8.40
lastUpdated: 2026-06-28
---

# API Reference

🌐 **Languages:** 🇺🇸 [English](./API_REFERENCE.md) | 🇧🇷 [Português (Brasil)](../i18n/pt-BR/docs/reference/API_REFERENCE.md) | 🇪🇸 [Español](../i18n/es/docs/reference/API_REFERENCE.md) | 🇫🇷 [Français](../i18n/fr/docs/reference/API_REFERENCE.md) | 🇮🇹 [Italiano](../i18n/it/docs/reference/API_REFERENCE.md) | 🇷🇺 [Русский](../i18n/ru/docs/reference/API_REFERENCE.md) | 🇨🇳 [中文 (简体)](../i18n/zh-CN/docs/reference/API_REFERENCE.md) | 🇩🇪 [Deutsch](../i18n/de/docs/reference/API_REFERENCE.md) | 🇮🇳 [हिन्दी](../i18n/in/docs/reference/API_REFERENCE.md) | 🇹🇭 [ไทย](../i18n/th/docs/reference/API_REFERENCE.md) | 🇺🇦 [Українська](../i18n/uk-UA/docs/reference/API_REFERENCE.md) | 🇸🇦 [العربية](../i18n/ar/docs/reference/API_REFERENCE.md) | 🇯🇵 [日本語](../i18n/ja/docs/reference/API_REFERENCE.md) | 🇻🇳 [Tiếng Việt](../i18n/vi/docs/reference/API_REFERENCE.md) | 🇧🇬 [Български](../i18n/bg/docs/reference/API_REFERENCE.md) | 🇩🇰 [Dansk](../i18n/da/docs/reference/API_REFERENCE.md) | 🇫🇮 [Suomi](../i18n/fi/docs/reference/API_REFERENCE.md) | 🇮🇱 [עברית](../i18n/he/docs/reference/API_REFERENCE.md) | 🇭🇺 [Magyar](../i18n/hu/docs/reference/API_REFERENCE.md) | 🇮🇩 [Bahasa Indonesia](../i18n/id/docs/reference/API_REFERENCE.md) | 🇰🇷 [한국어](../i18n/ko/docs/reference/API_REFERENCE.md) | 🇲🇾 [Bahasa Melayu](../i18n/ms/docs/reference/API_REFERENCE.md) | 🇳🇱 [Nederlands](../i18n/nl/docs/reference/API_REFERENCE.md) | 🇳🇴 [Norsk](../i18n/no/docs/reference/API_REFERENCE.md) | 🇵🇹 [Português (Portugal)](../i18n/pt/docs/reference/API_REFERENCE.md) | 🇷🇴 [Română](../i18n/ro/docs/reference/API_REFERENCE.md) | 🇵🇱 [Polski](../i18n/pl/docs/reference/API_REFERENCE.md) | 🇸🇰 [Slovenčina](../i18n/sk/docs/reference/API_REFERENCE.md) | 🇸🇪 [Svenska](../i18n/sv/docs/reference/API_REFERENCE.md) | 🇵🇭 [Filipino](../i18n/phi/docs/reference/API_REFERENCE.md) | 🇨🇿 [Čeština](../i18n/cs/docs/reference/API_REFERENCE.md)

Complete reference for all OmniRoute API endpoints.

---

## Table of Contents

- [Chat Completions](#chat-completions)
- [Embeddings](#embeddings)
- [Image Generation](#image-generation)
- [List Models](#list-models)
- [Compatibility Endpoints](#compatibility-endpoints)
- [Files API](#files-api)
- [Batches API](#batches-api)
- [Search API](#search-api)
- [WebSocket Streaming](#websocket-streaming)
- [Quotas & Issues Reporting](#quotas--issues-reporting)
- [Semantic Cache](#semantic-cache)
- [Dashboard & Management](#dashboard--management)
- [Combo Management](#combo-management)
- [Webhooks](#webhooks)
- [Registered Keys (Auto-Management)](#registered-keys-auto-management)
- [Agents Protocol](#agents-protocol)
- [Management Proxies](#management-proxies)
- [Resilience (extended)](#resilience-extended)
- [Skills](#skills)
- [Memory](#memory)
- [MCP Server](#mcp-server)
- [A2A Server](#a2a-server)
- [Cloud, Evals & Assess](#cloud-evals--assess)
- [Request Processing](#request-processing)
- [Authentication](#authentication)

---

## Chat Completions

```bash
POST /v1/chat/completions
Authorization: Bearer your-api-key
Content-Type: application/json

{
  "model": "cc/claude-opus-4-6",
  "messages": [
    {"role": "user", "content": "Write a function to..."}
  ],
  "stream": true
}
```

### Custom Headers

| Header                   | Direction | Description                                                                                                                  |
| ------------------------ | --------- | ---------------------------------------------------------------------------------------------------------------------------- |
| `X-OmniRoute-No-Cache`   | Request   | Set to `true` to bypass cache                                                                                                |
| `x-omniroute-no-memory`  | Request   | Set to `true` to skip memory + skills injection for this request (mirrors no-cache; avoids the per-call token/cost overhead) |
| `X-OmniRoute-Progress`   | Request   | Set to `true` for progress events                                                                                            |
| `X-Session-Id`           | Request   | Sticky session key for external session affinity                                                                             |
| `x_session_id`           | Request   | Underscore variant also accepted (direct HTTP)                                                                               |
| `Idempotency-Key`        | Request   | Dedup key (5s window)                                                                                                        |
| `X-Request-Id`           | Request   | Alternative dedup key                                                                                                        |
| `X-OmniRoute-Cache`      | Response  | `HIT` or `MISS` (non-streaming)                                                                                              |
| `X-OmniRoute-Idempotent` | Response  | `true` if deduplicated                                                                                                       |
| `X-OmniRoute-Progress`   | Response  | `enabled` if progress tracking on                                                                                            |
| `X-OmniRoute-Session-Id` | Response  | Effective session ID used by OmniRoute                                                                                       |
| `X-OmniRoute-Request-Id` | Response  | Request correlation id (when known)                                                                                          |
| `X-OmniRoute-Version`    | Response  | OmniRoute build version (always present)                                                                                     |
| `X-OmniRoute-Cost-Saved` | Response  | USD the cache avoided on a HIT (cache hits only)                                                                             |

> Nginx note: if you rely on underscore headers (for example `x_session_id`), enable `underscores_in_headers on;`.

> **Cost telemetry headers:** non-streaming success responses also carry the `X-OmniRoute-*` cost-telemetry set — `X-OmniRoute-Response-Cost` (USD, fixed 10 decimals; `0.0000000000` for free/unpriced), `X-OmniRoute-Tokens-In` / `X-OmniRoute-Tokens-Out`, `X-OmniRoute-Model`, `X-OmniRoute-Provider`, `X-OmniRoute-Latency-Ms`, `X-OmniRoute-Cache-Hit`, and `X-OmniRoute-Fallback-Attempts` (only when > 0), plus `X-OmniRoute-Request-Id` and `X-OmniRoute-Version`. These are emitted by chat completions, `/v1/responses`, `/v1/messages`, **and the media endpoints** — `/v1/embeddings`, `/v1/images/generations`, `/v1/audio/speech`, `/v1/audio/transcriptions`, `/v1/rerank`, `/v1/videos/generations`, `/v1/music/generations`, and `/v1/moderations` (always cost `0`). Media cost is computed per modality (per-image, per-second, per-character, per search-unit) when pricing is available, otherwise `0` (fail-open).

> **Cache-hit cost semantics:** on a semantic-cache HIT (`X-OmniRoute-Cache-Hit: true`) no upstream call is made, so `X-OmniRoute-Response-Cost` is `0.0000000000` (the **incremental** cost of serving the hit). The original/would-have-been cost is reported separately in `X-OmniRoute-Cost-Saved`. Billing consumers should sum `X-OmniRoute-Response-Cost` (hits cost nothing); cache analytics can aggregate `X-OmniRoute-Cost-Saved`.

### `x-omniroute-compression`

Per-request override of the compression plan. Highest precedence — beats the routing-combo
override, the active profile, auto-trigger, and the panel Default. Values:

| Value         | Effect                                                               |
| ------------- | -------------------------------------------------------------------- |
| `off`         | No compression for this request.                                     |
| `default`     | The panel-derived Default profile (ignores the active profile).      |
| `engine:<id>` | A single engine when enabled, e.g. `engine:rtk`.                     |
| `<combo>`     | A named combo, matched by name (case-insensitive) first, then by id. |

Notes:

- Unknown values are ignored (the request is never rejected); resolution falls through to the normal operator precedence.
- If multiple combos share a name, pass the combo **id** for a deterministic match.
- A combo whose name is `off` or `default` cannot be selected by name (those keywords are interpreted first); reference such a combo by its id.
- The master compression switch is a hard gate: when compression is disabled globally, this header cannot enable it.

The applied plan is echoed back in the response header:

```
X-OmniRoute-Compression: <mode>; source=<source>
```

where `<source>` is one of `request-header`, `routing-override`, `active-profile`, `auto-trigger`, `default`, or `off`.

---

## Embeddings

```bash
POST /v1/embeddings
Authorization: Bearer your-api-key
Content-Type: application/json

{
  "model": "nebius/Qwen/Qwen3-Embedding-8B",
  "input": "The food was delicious"
}
```

Available providers: Nebius, OpenAI, Mistral, Together AI, Fireworks, NVIDIA, **OpenRouter**, **GitHub Models**.

```bash
# List all embedding models
GET /v1/embeddings
```

---

## Image Generation

```bash
POST /v1/images/generations
Authorization: Bearer your-api-key
Content-Type: application/json

{
  "model": "openai/gpt-image-2",
  "prompt": "A beautiful sunset over mountains",
  "size": "1024x1024"
}
```

Available providers: OpenAI (GPT Image 2), xAI (Grok Image), Together AI (FLUX), Fireworks AI, Nebius (FLUX), Hyperbolic, NanoBanana, **OpenRouter**, SD WebUI (local), ComfyUI (local).

```bash
# List all image models
GET /v1/images/generations
```

---

## List Models

```bash
GET /v1/models
Authorization: Bearer your-api-key

→ Returns all chat, embedding, and image models + combos in OpenAI format
```

### No-thinking model variants

For thinking-capable Claude models, `/v1/models` also advertises a **no-thinking** variant whose id is prefixed with `claude-3-omniroute-no-thinking/`:

```
claude-3-omniroute-no-thinking/<provider>/<model>
```

Selecting this id (e.g. in a Claude Code config that always attaches a `thinking` block) resolves back to the real `<provider>/<model>` with reasoning suppressed — `thinking:{type:"disabled"}` on the `/v1/messages` path, or the `reasoning`/`reasoning_effort` fields dropped on the `/v1/chat/completions` path. The variant is only listed for Claude-family models that support thinking **and** honor `disabled` (so e.g. adaptive-only models that reject `disabled` are excluded). Operators can force the variant on or off per model via `ModelSpec.noThinkingAlias`.

---

## Compatibility Endpoints

| Method | Path                                      | Format                           |
| ------ | ----------------------------------------- | -------------------------------- |
| POST   | `/v1/chat/completions`                    | OpenAI                           |
| POST   | `/v1/messages`                            | Anthropic                        |
| POST   | `/v1/responses`                           | OpenAI Responses                 |
| POST   | `/v1/embeddings`                          | OpenAI                           |
| POST   | `/v1/images/generations`                  | OpenAI Images                    |
| POST   | `/v1/images/edits`                        | OpenAI Images (edit/inpaint)     |
| POST   | `/v1/videos/generations`                  | OpenAI-style video generation    |
| POST   | `/v1/music/generations`                   | OpenAI-style music generation    |
| POST   | `/v1/audio/transcriptions`                | OpenAI Audio (STT)               |
| POST   | `/v1/audio/speech`                        | OpenAI TTS (returns audio body)  |
| POST   | `/v1/rerank`                              | Cohere/Voyage-style rerank       |
| POST   | `/v1/moderations`                         | OpenAI Moderations               |
| GET    | `/v1/models`                              | OpenAI                           |
| POST   | `/v1/messages/count_tokens`               | Anthropic                        |
| GET    | `/v1beta/models`                          | Gemini                           |
| POST   | `/v1beta/models/{...path}`                | Gemini generateContent           |
| POST   | `/v1/api/chat`                            | Ollama                           |
| GET    | `/api/v1/vscode/{token}/`                 | OpenAI catalog alias             |
| GET    | `/api/v1/vscode/{token}/models`           | OpenAI models alias              |
| POST   | `/api/v1/vscode/{token}/chat/completions` | OpenAI tokenized alias           |
| POST   | `/api/v1/vscode/{token}/responses`        | OpenAI Responses tokenized alias |
| POST   | `/api/v1/vscode/{token}/api/chat`         | Ollama tokenized alias           |
| GET    | `/api/v1/vscode/{token}/api/tags`         | Ollama tags tokenized alias      |

All POST routes follow the same shape: `Bearer your-api-key` + Zod-validated JSON body (`v1RerankSchema`, `v1ModerationSchema`, `v1AudioSpeechSchema`, etc., see `src/shared/validation/schemas.ts`). 4xx is returned on schema failure.

For clients that cannot attach `Authorization: Bearer ...`, OmniRoute also accepts API keys in the URL via either query-string compatibility (`?token=...`, `?apiKey=...`, `?api_key=...`, `?key=...`) or the dedicated `/api/v1/vscode/{token}/...` endpoints documented below.

```bash
# Rerank
POST /v1/rerank      { "model": "cohere/rerank-3", "query": "...", "documents": ["..."] }

# Moderations
POST /v1/moderations { "model": "omni-moderation-latest", "input": "..." }

# TTS — returns audio/mpeg (or requested format) body
POST /v1/audio/speech { "model": "openai/tts-1", "input": "Hello", "voice": "alloy" }

# Image edit (multipart)
POST /v1/images/edits  -F image=@input.png -F prompt="..." -F mask=@mask.png

# Video / music generation (provider-prefixed model id)
POST /v1/videos/generations { "model": "runway/gen-3", "prompt": "..." }
POST /v1/music/generations  { "model": "suno/v3.5",   "prompt": "..." }
```

### Dedicated Provider Routes

```bash
POST /v1/providers/{provider}/chat/completions
POST /v1/providers/{provider}/embeddings
POST /v1/providers/{provider}/images/generations
```

The provider prefix is auto-added if missing. Mismatched models return `400`.

---

## Files API

OpenAI-compatible files endpoint for batch input/output and file-purpose uploads.

| Method | Path                     | Description                                                                                                   |
| ------ | ------------------------ | ------------------------------------------------------------------------------------------------------------- |
| POST   | `/v1/files`              | Upload a file (multipart: `file`, `purpose`, `expires_after[anchor]`, `expires_after[seconds]`) — 512 MiB max |
| GET    | `/v1/files`              | List files for the authenticated API key                                                                      |
| GET    | `/v1/files/[id]`         | Retrieve a file's metadata                                                                                    |
| DELETE | `/v1/files/[id]`         | Delete a file                                                                                                 |
| GET    | `/v1/files/[id]/content` | Stream the raw file body back                                                                                 |

**Auth:** Bearer API key — files are scoped per-API-key via `getApiKeyRequestScope`.

---

## Batches API

OpenAI-compatible batch processing.

| Method | Path                      | Description                                                                                               |
| ------ | ------------------------- | --------------------------------------------------------------------------------------------------------- |
| POST   | `/v1/batches`             | Create batch — body validated by `v1BatchCreateSchema` (`input_file_id`, `endpoint`, `completion_window`) |
| GET    | `/v1/batches`             | List batches                                                                                              |
| GET    | `/v1/batches/[id]`        | Retrieve batch status + `request_counts`                                                                  |
| DELETE | `/v1/batches/[id]`        | Delete a finished/failed batch                                                                            |
| POST   | `/v1/batches/[id]/cancel` | Cancel an in-progress batch                                                                               |

**Auth:** Bearer API key. Batches are scoped per-API-key.

---

## Search API

Web/search provider abstraction (Tavily, Brave, Exa, Serper, etc.).

| Method | Path                   | Description                                                                          |
| ------ | ---------------------- | ------------------------------------------------------------------------------------ |
| GET    | `/v1/search`           | List configured search providers + capabilities                                      |
| POST   | `/v1/search`           | Run a search query — body validated by `v1SearchSchema`, supports caching/coalescing |
| GET    | `/v1/search/analytics` | Per-provider hit/latency/cache stats                                                 |

**Auth:** Bearer API key (`extractApiKey` + `isValidApiKey`). Search policy enforced via `enforceApiKeyPolicy`.

---

## WebSocket Streaming

```bash
GET /v1/ws?handshake=1
```

Validates a WebSocket upgrade handshake and returns the wire protocol example messages (`request`, `cancel`). Actual WS frames are handled by the bundled WS server outside the Next.js route table.

**Auth:** Bearer API key during handshake.

### Responses API over WebSocket (codex only)

```bash
# Same host:port as the HTTP API (default 20128); upgrade the connection:
wscat -c "ws://localhost:20128/v1/responses?api_key=<OMNIROUTE_API_KEY>"
# (or: -H "Authorization: Bearer <OMNIROUTE_API_KEY>")

# First frame MUST be response.create:
{ "type": "response.create", "model": "gpt-5.5", "input": [ { "role": "user", "content": "hi" } ] }
```

A Responses-API-over-WebSocket proxy is wired **exclusively to `codex`** (ChatGPT
backend). It listens on the same port as the API/dashboard at paths `/v1/responses`,
`/responses`, and `/api/v1/responses`. On the first `response.create` frame it
authenticates + prepares via the internal `codex-responses-ws` bridge, selects a
codex OAuth connection, and tunnels to `wss://chatgpt.com/backend-api/codex/responses`
via the `wreq-js` transport. **Non-codex models are rejected** (`codex_ws_provider_required`).
For quota-share routing use `model: "qtSd/<group>/codex/<model>"`. Implemented in
`app/server-ws.mjs` + `scripts/dev/responses-ws-proxy.mjs` + `src/app/api/internal/codex-responses-ws/route.ts`.

**Auth:** Bearer API key during handshake. The bundled HTTP server (`server-ws.mjs`)
must be the active entrypoint (it is, by default, when `app/server-ws.mjs` exists).

#### Model id: use the bare ChatGPT id (no `codex/` prefix)

The OpenAI **Codex CLI** validates the model name client-side when
`supports_websockets = true` and **rejects provider-prefixed ids** like
`codex/gpt-5.5` (`The 'codex/gpt-5.5' model is not supported when using Codex with
a ChatGPT account`). Send the **bare** id (e.g. `gpt-5.5`). OmniRoute's bridge is
codex-only, so it re-resolves a bare id as a codex model
(`resolveCodexWsModelInfo`) before tunneling upstream — even though a bare
`gpt-5.5` would otherwise route to another provider over HTTP.

#### Configuring the OpenAI Codex CLI

Point the Codex CLI at OmniRoute by adding a custom provider with WebSocket
support to `~/.codex/config.toml` (use a separate `CODEX_HOME` to avoid touching
an existing config):

```toml
model = "gpt-5.5"                 # bare id — NOT "codex/gpt-5.5"
model_provider = "omniroute"

[model_providers.omniroute]
name = "OmniRoute (WS)"
base_url = "http://localhost:20128/v1"   # no trailing slash; the WS URL is derived (use https/wss in production)
wire_api = "responses"                    # only supported value since Feb 2026
supports_websockets = true                # enables the Responses-over-WS transport
env_key = "OMNIROUTE_API_KEY"             # holds the OmniRoute API key (Bearer)
```

```bash
export OMNIROUTE_API_KEY=sk-...           # an OmniRoute API key (any key if REQUIRE_API_KEY=false)
codex exec "Responda apenas: PONG"
```

The CLI upgrades `base_url + /responses` to a WebSocket and OmniRoute tunnels it
to the selected codex OAuth connection. Validated end-to-end against the local
server: ChatGPT returns `codex.rate_limits` + `response.created` and streams the
completion.

---

## Quotas & Issues Reporting

| Method | Path                | Description                                                                           |
| ------ | ------------------- | ------------------------------------------------------------------------------------- |
| GET    | `/v1/quotas/check`  | Pre-validate quota for a `provider` + `accountId` before issuing a registered key     |
| POST   | `/v1/issues/report` | Report a quota/key issuance failure to GitHub (requires `GITHUB_ISSUES_REPO` + token) |

**Auth:** Bearer API key (`isAuthenticated`).

---

## Semantic Cache

```bash
# Get cache stats
GET /api/cache/stats

# Clear all caches
DELETE /api/cache/stats
```

Response example:

```json
{
  "semanticCache": {
    "memorySize": 42,
    "memoryMaxSize": 500,
    "dbSize": 128,
    "hitRate": 0.65
  },
  "idempotency": {
    "activeKeys": 3,
    "windowMs": 5000
  }
}
```

---

## Dashboard & Management

### Authentication

| Endpoint                      | Method  | Description           |
| ----------------------------- | ------- | --------------------- |
| `/api/auth/login`             | POST    | Login                 |
| `/api/auth/logout`            | POST    | Logout                |
| `/api/settings/require-login` | GET/PUT | Toggle login required |

### Provider Management

| Endpoint                     | Method                | Description                                    |
| ---------------------------- | --------------------- | ---------------------------------------------- |
| `/api/providers`             | GET/POST              | List / create providers                        |
| `/api/providers/[id]`        | GET/PUT/DELETE        | Manage a provider                              |
| `/api/providers/[id]/test`   | POST                  | Test provider connection                       |
| `/api/providers/[id]/models` | GET                   | List provider models                           |
| `/api/providers/validate`    | POST                  | Validate provider config                       |
| `/api/provider-nodes*`       | Various               | Provider node management                       |
| `/api/provider-models`       | GET/POST/PATCH/DELETE | Custom models (add, update, hide/show, delete) |

### OAuth Flows

| Endpoint                         | Method  | Description             |
| -------------------------------- | ------- | ----------------------- |
| `/api/oauth/[provider]/[action]` | Various | Provider-specific OAuth |

### Routing & Config

| Endpoint              | Method   | Description                   |
| --------------------- | -------- | ----------------------------- |
| `/api/models/alias`   | GET/POST | Model aliases                 |
| `/api/models/catalog` | GET      | All models by provider + type |
| `/api/combos*`        | Various  | Combo management              |
| `/api/keys*`          | Various  | API key management            |
| `/api/pricing`        | GET      | Model pricing                 |

### Usage & Analytics

| Endpoint                    | Method          | Description                     |
| --------------------------- | --------------- | ------------------------------- |
| `/api/usage/history`        | GET             | Usage history                   |
| `/api/usage/logs`           | GET             | Usage logs                      |
| `/api/usage/request-logs`   | GET             | Request-level logs              |
| `/api/usage/[connectionId]` | GET             | Per-connection usage            |
| `/api/usage/token-limits`   | GET/POST/DELETE | Per-API-key token-limit budgets |

### Settings

| Endpoint                              | Method        | Description                                         |
| ------------------------------------- | ------------- | --------------------------------------------------- |
| `/api/settings`                       | GET/PUT/PATCH | General settings                                    |
| `/api/settings/proxy`                 | GET/PUT       | Network proxy config                                |
| `/api/settings/proxy/test`            | POST          | Test proxy connection                               |
| `/api/settings/ip-filter`             | GET/PUT       | IP allowlist/blocklist                              |
| `/api/settings/thinking-budget`       | GET/PUT       | Reasoning token budget                              |
| `/api/settings/system-prompt`         | GET/PUT       | Global system prompt                                |
| `/api/settings/compression`           | GET/PUT       | Global compression config                           |
| `/api/settings/purge-request-history` | POST          | Clear request log rows and local call-log artifacts |

### Context & Compression

| Endpoint                               | Method         | Description                                                              |
| -------------------------------------- | -------------- | ------------------------------------------------------------------------ |
| `/api/compression/preview`             | POST           | Preview off/lite/standard/aggressive/ultra/RTK/stacked compression       |
| `/api/compression/language-packs`      | GET            | List available Caveman language packs                                    |
| `/api/compression/rules`               | GET            | List Caveman rule metadata                                               |
| `/api/context/caveman/config`          | GET/PUT        | Caveman-specific settings alias                                          |
| `/api/context/rtk/config`              | GET/PUT        | RTK-specific settings, including custom filters and raw-output retention |
| `/api/context/rtk/filters`             | GET            | RTK filter catalog and custom-filter diagnostics                         |
| `/api/context/rtk/test`                | POST           | Run RTK preview/test against a text payload                              |
| `/api/context/rtk/raw-output/[id]`     | GET            | Read retained redacted raw output by pointer id                          |
| `/api/context/combos`                  | GET/POST       | Compression combo list/create                                            |
| `/api/context/combos/[id]`             | GET/PUT/DELETE | Compression combo detail/update/delete                                   |
| `/api/context/combos/[id]/assignments` | GET/PUT        | Assign compression combos to routing combos                              |
| `/api/context/analytics`               | GET            | Compression analytics alias                                              |

### Monitoring

| Endpoint                 | Method     | Description                                                                                          |
| ------------------------ | ---------- | ---------------------------------------------------------------------------------------------------- |
| `/api/sessions`          | GET        | Active session tracking                                                                              |
| `/api/rate-limits`       | GET        | Per-account rate limits                                                                              |
| `/api/monitoring/health` | GET        | Health check + provider summary (`catalogCount`, `configuredCount`, `activeCount`, `monitoredCount`) |
| `/api/cache/stats`       | GET/DELETE | Cache stats / clear                                                                                  |

### Backup & Export/Import

| Endpoint                    | Method | Description                             |
| --------------------------- | ------ | --------------------------------------- |
| `/api/db-backups`           | GET    | List available backups                  |
| `/api/db-backups`           | PUT    | Create a manual backup                  |
| `/api/db-backups`           | POST   | Restore from a specific backup          |
| `/api/db-backups/export`    | GET    | Download database as .sqlite file       |
| `/api/db-backups/import`    | POST   | Upload .sqlite file to replace database |
| `/api/db-backups/exportAll` | GET    | Download full backup as .tar.gz archive |

### Cloud Sync

| Endpoint               | Method  | Description           |
| ---------------------- | ------- | --------------------- |
| `/api/sync/cloud`      | Various | Cloud sync operations |
| `/api/sync/initialize` | POST    | Initialize sync       |
| `/api/cloud/*`         | Various | Cloud management      |

### Tunnels

| Endpoint                   | Method | Description                                                             |
| -------------------------- | ------ | ----------------------------------------------------------------------- |
| `/api/tunnels/cloudflared` | GET    | Read Cloudflare Quick Tunnel install/runtime status for the dashboard   |
| `/api/tunnels/cloudflared` | POST   | Enable or disable the Cloudflare Quick Tunnel (`action=enable/disable`) |
| `/api/tunnels/ngrok`       | GET    | Read ngrok Tunnel runtime status for the dashboard                      |
| `/api/tunnels/ngrok`       | POST   | Enable or disable the ngrok Tunnel (`action=enable/disable`)            |

### CLI Tools

| Endpoint                           | Method | Description         |
| ---------------------------------- | ------ | ------------------- |
| `/api/cli-tools/claude-settings`   | GET    | Claude CLI status   |
| `/api/cli-tools/codex-settings`    | GET    | Codex CLI status    |
| `/api/cli-tools/droid-settings`    | GET    | Droid CLI status    |
| `/api/cli-tools/openclaw-settings` | GET    | OpenClaw CLI status |
| `/api/cli-tools/runtime/[toolId]`  | GET    | Generic CLI runtime |

CLI responses include: `installed`, `runnable`, `command`, `commandPath`, `runtimeMode`, `reason`.

### ACP Agents

| Endpoint          | Method | Description                                              |
| ----------------- | ------ | -------------------------------------------------------- |
| `/api/acp/agents` | GET    | List all detected agents (built-in + custom) with status |
| `/api/acp/agents` | POST   | Add custom agent or refresh detection cache              |
| `/api/acp/agents` | DELETE | Remove a custom agent by `id` query param                |

GET response includes `agents[]` (id, name, binary, version, installed, protocol, isCustom) and `summary` (total, installed, notFound, builtIn, custom).

### Resilience & Rate Limits

| Endpoint                          | Method    | Description                                                                          |
| --------------------------------- | --------- | ------------------------------------------------------------------------------------ |
| `/api/resilience`                 | GET/PATCH | Get/update request queue, connection cooldown, provider breaker, and wait settings   |
| `/api/resilience/reset`           | POST      | Reset provider circuit breakers                                                      |
| `/api/resilience/model-cooldowns` | GET       | List active per-(provider, connection, model) lockouts, sorted by remaining time     |
| `/api/resilience/model-cooldowns` | DELETE    | Clear a model lockout — body `{provider, model}` or `{all: true}` to wipe everything |
| `/api/rate-limits`                | GET       | Per-account rate limit status                                                        |
| `/api/rate-limit`                 | GET       | Global rate limit configuration                                                      |

> All four `/api/resilience/*` routes require **management auth** (`requireManagementAuth`). See [Resilience (extended)](#resilience-extended) for a full breakdown of provider breaker vs connection cooldown vs model lockout.

### Evals

| Endpoint     | Method   | Description                       |
| ------------ | -------- | --------------------------------- |
| `/api/evals` | GET/POST | List eval suites / run evaluation |

### Policies

| Endpoint        | Method          | Description             |
| --------------- | --------------- | ----------------------- |
| `/api/policies` | GET/POST/DELETE | Manage routing policies |

### Compliance

| Endpoint                    | Method | Description                   |
| --------------------------- | ------ | ----------------------------- |
| `/api/compliance/audit-log` | GET    | Compliance audit log (last N) |

### v1beta (Gemini-Compatible)

| Endpoint                   | Method | Description                       |
| -------------------------- | ------ | --------------------------------- |
| `/v1beta/models`           | GET    | List models in Gemini format      |
| `/v1beta/models/{...path}` | POST   | Gemini `generateContent` endpoint |

These endpoints mirror Gemini's API format for clients that expect native Gemini SDK compatibility.

### Internal / System APIs

| Endpoint                 | Method | Description                                          |
| ------------------------ | ------ | ---------------------------------------------------- |
| `/api/init`              | GET    | Application initialization check (used on first run) |
| `/api/tags`              | GET    | Ollama-compatible model tags (for Ollama clients)    |
| `/api/restart`           | POST   | Trigger graceful server restart                      |
| `/api/shutdown`          | POST   | Trigger graceful server shutdown                     |
| `/api/system/env/repair` | POST   | Repair OAuth provider environment variables          |

> **Note:** These endpoints are used internally by the system or for Ollama client compatibility. They are not typically called by end users.

### OAuth Environment Repair _(v3.6.1+)_

```bash
POST /api/system/env/repair
Content-Type: application/json

{
  "provider": "claude-code"
}
```

Repairs missing or corrupted OAuth environment variables for a specific provider. Returns:

```json
{
  "success": true,
  "repaired": ["CLAUDE_CODE_OAUTH_CLIENT_ID", "CLAUDE_CODE_OAUTH_CLIENT_SECRET"],
  "backupPath": "/home/user/.omniroute/backups/env-repair-2026-04-11.bak"
}
```

---

## Audio Transcription

```bash
POST /v1/audio/transcriptions
Authorization: Bearer your-api-key
Content-Type: multipart/form-data
```

Transcribe audio files using Deepgram or AssemblyAI.

**Request:**

```bash
curl -X POST http://localhost:20128/v1/audio/transcriptions \
  -H "Authorization: Bearer your-api-key" \
  -F "file=@recording.mp3" \
  -F "model=deepgram/nova-3"
```

**Response:**

```json
{
  "text": "Hello, this is the transcribed audio content.",
  "task": "transcribe",
  "language": "en",
  "duration": 12.5
}
```

**Supported providers:** `deepgram/nova-3`, `assemblyai/best`.

**Supported formats:** `mp3`, `wav`, `m4a`, `flac`, `ogg`, `webm`.

---

## Ollama Compatibility

For clients that use Ollama's API format:

```bash
# Chat endpoint (Ollama format)
POST /v1/api/chat

# Model listing (Ollama format)
GET /api/tags
```

Requests are automatically translated between Ollama and internal formats.

## Tokenized VS Code / Headerless Aliases

Use these aliases when an integration cannot inject an `Authorization` header and needs the API key embedded in the base URL.

```bash
# OpenAI-style catalog alias
GET /api/v1/vscode/{token}/
GET /api/v1/vscode/{token}/models

# OpenAI-style chat aliases
POST /api/v1/vscode/{token}/chat/completions
POST /api/v1/vscode/{token}/responses

# Ollama-style aliases
POST /api/v1/vscode/{token}/api/chat
GET /api/v1/vscode/{token}/api/tags
```

Example:

```bash
curl https://your-host.example/api/v1/vscode/YOUR_API_KEY/models
curl -X POST https://your-host.example/api/v1/vscode/YOUR_API_KEY/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"auto","messages":[{"role":"user","content":"hello"}]}'
```

Notes:

- The tokenized aliases reuse the same handlers as `/v1/*` and `/api/tags`; response shapes stay identical.
- Prefer `Authorization: Bearer ...` whenever the client supports custom headers.
- URL-based tokens may appear in reverse-proxy logs, browser history, and telemetry outside OmniRoute. Treat them as a compatibility option, not the default authentication mode.

---

## Telemetry

```bash
# Get latency telemetry summary (p50/p95/p99 per provider)
GET /api/telemetry/summary
```

**Response:**

```json
{
  "providers": {
    "claudeCode": { "p50": 245, "p95": 890, "p99": 1200, "count": 150 },
    "github": { "p50": 180, "p95": 620, "p99": 950, "count": 320 }
  }
}
```

---

## Budget

```bash
# Get budget status for all API keys
GET /api/usage/budget

# Set or update a budget
POST /api/usage/budget
Content-Type: application/json

{
  "apiKeyId": "key-123",
  "dailyLimitUsd": 5.00,
  "weeklyLimitUsd": 30.00,
  "monthlyLimitUsd": 100.00,
  "warningThreshold": 0.8,
  "resetInterval": "monthly"
}
```

> **Schema notes** (`setBudgetSchema`): `apiKeyId` is required; at least one of `dailyLimitUsd`, `weeklyLimitUsd`, or `monthlyLimitUsd` must be greater than zero. Optional fields: `warningThreshold` (0–1), `resetInterval` (`daily` | `weekly` | `monthly`), `resetTime` (`HH:MM`). The legacy `{keyId, limit, period}` shape returns `400 Bad Request`.

## Token Limits

Per-API-key **token** budgets (distinct from the USD-based Budget above). Enforced inline on the request path: when a key's current window usage reaches its limit, requests are rejected with `429 Too Many Requests`. Limits can be scoped to a specific `model`, a `provider`, or applied `global`ly across the key; when several limits match a request, the most restrictive one wins.

```bash
# List a key's token limits (includes live window usage)
GET /api/usage/token-limits?apiKeyId=key-123

# Create or update a token limit
POST /api/usage/token-limits
Content-Type: application/json

{
  "apiKeyId": "key-123",
  "scopeType": "model",
  "scopeValue": "openai/gpt-4o",
  "tokenLimit": 1000000,
  "resetInterval": "monthly",
  "enabled": true
}

# Delete a token limit by id
DELETE /api/usage/token-limits?id=tl-abc
```

> **Schema notes** (`setTokenLimitSchema`): `apiKeyId` and `scopeType` (`model` | `provider` | `global`) are required. `scopeValue` is required unless `scopeType` is `global` (e.g. a model id for `model` scope, a provider id for `provider` scope). `tokenLimit` must be a positive integer (coerced from string). Optional: `id` (omit to create, supply to update), `resetInterval` (`daily` | `weekly` | `monthly`, default `monthly`), `resetTime` (`HH:MM`), `enabled` (default `true`). `GET` responses enrich each limit with `tokensUsed`, `remaining`, `windowStart`, `periodStartAt`, and `nextResetAt`. This is a management-class endpoint (auth enforced centrally by the authz pipeline).

## Request Processing

1. Client sends request to `/v1/*`
2. Route handler calls `handleChat`, `handleEmbedding`, `handleAudioTranscription`, or `handleImageGeneration`
3. Model is resolved (direct provider/model or alias/combo)
4. Credentials selected from local DB with account availability filtering
5. For chat: `handleChatCore` checks semantic/signature cache and resolves combo compression settings
6. Proactive compression runs before provider translation when enabled (`lite`, Caveman, RTK, or stacked)
7. Provider executor sends upstream request
8. Response translated back to client format (chat) or returned as-is (embeddings/images/audio)
9. Usage, compression analytics, and request logs are recorded
10. Fallback applies on errors according to combo rules

Full architecture reference: [`ARCHITECTURE.md`](../architecture/ARCHITECTURE.md)

---

## Combo Management

Higher-level routing combos (already summarized under `/api/combos*`) can also be mapped 1:1 from a model id pattern, allowing transparent redirection of an OpenAI-style model id to a combo.

| Method | Path                             | Description                                                                    |
| ------ | -------------------------------- | ------------------------------------------------------------------------------ |
| GET    | `/api/model-combo-mappings`      | List all model→combo mappings                                                  |
| POST   | `/api/model-combo-mappings`      | Create mapping — body: `{pattern, comboId, priority?, enabled?, description?}` |
| GET    | `/api/model-combo-mappings/[id]` | Retrieve a single mapping                                                      |
| PUT    | `/api/model-combo-mappings/[id]` | Update fields of an existing mapping                                           |
| DELETE | `/api/model-combo-mappings/[id]` | Remove a mapping                                                               |

**Auth:** management session/API key (`requireManagementAuth`).

---

## Webhooks

Outbound webhook subscriptions for OmniRoute events (request completion, quota exhaustion, key rotation, etc.).

| Method | Path                      | Description                                                           |
| ------ | ------------------------- | --------------------------------------------------------------------- |
| GET    | `/api/webhooks`           | List webhooks (secrets are masked to `<prefix>...`)                   |
| POST   | `/api/webhooks`           | Create webhook — body: `{url, events?: ["*"], secret?, description?}` |
| GET    | `/api/webhooks/[id]`      | Retrieve a webhook                                                    |
| PUT    | `/api/webhooks/[id]`      | Update url/events/secret/description                                  |
| DELETE | `/api/webhooks/[id]`      | Remove a webhook                                                      |
| POST   | `/api/webhooks/[id]/test` | Send a test payload to the webhook URL and return delivery status     |

**Auth:** management session/API key (`requireManagementAuth`).

---

## Registered Keys (Auto-Management)

Used by the auto-key management subsystem to issue and rotate API keys against a backing provider/account, with daily/hourly quotas.

| Method | Path                                  | Description                                                                                                                                                                                 |
| ------ | ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| GET    | `/api/v1/registered-keys`             | List registered keys (masked prefix only)                                                                                                                                                   |
| POST   | `/api/v1/registered-keys`             | Issue a new registered key — body: `{name, provider?, accountId?, idempotencyKey?, expiresAt?, dailyBudget?, hourlyBudget?}`. Returns the raw key **once**. Returns `429` on quota refusal. |
| GET    | `/api/v1/registered-keys/[id]`        | Retrieve a registered key's metadata (no raw material)                                                                                                                                      |
| DELETE | `/api/v1/registered-keys/[id]`        | Revoke a registered key                                                                                                                                                                     |
| POST   | `/api/v1/registered-keys/[id]/revoke` | Explicit revoke endpoint (same effect as DELETE)                                                                                                                                            |

**Auth:** Bearer API key (`isAuthenticated`). See also `/v1/quotas/check` and `/v1/issues/report`.

---

## Agents Protocol

Cloud agent tasks (Claude Code, Codex Cloud, OpenHands, etc.) executed remotely on behalf of OmniRoute users.

| Method | Path                          | Description                                                                                                                                   |
| ------ | ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
| GET    | `/api/v1/agents/tasks`        | List tasks — optional `?provider=`, `?status=`, `?limit=` (1–500, default 50)                                                                 |
| POST   | `/api/v1/agents/tasks`        | Create task — body validated by `CreateCloudAgentTaskSchema` (`providerId`, `prompt`, `source`, `options?`). Returns `201` with task envelope |
| DELETE | `/api/v1/agents/tasks?id=...` | Delete a task                                                                                                                                 |
| GET    | `/api/v1/agents/tasks/[id]`   | Read task — synchronously refreshes status from the upstream cloud agent when an `external_id` is set                                         |
| POST   | `/api/v1/agents/tasks/[id]`   | Discriminated action: `{action: "approve"}`, `{action: "message", message}`, or `{action: "cancel"}`                                          |
| DELETE | `/api/v1/agents/tasks/[id]`   | Delete a specific task by id                                                                                                                  |

> **Auth:** management auth required on every method (`requireCloudAgentManagementAuth`). Prior to v3.8.0 these were unauthenticated — see commit `588a0333` for the breaking change.

```bash
# Create a Claude Code cloud task
curl -X POST http://localhost:20128/api/v1/agents/tasks \
  -H "Authorization: Bearer your-management-key" \
  -H "Content-Type: application/json" \
  -d '{"providerId":"claude-code-cloud","prompt":"Fix the failing test","source":{"repo":"...","branch":"..."}}'
```

---

## Management Proxies

Outbound HTTP(S)/SOCKS proxies that can be assigned to providers, accounts, or globally.

| Method | Path                                         | Description                                                                                                                                      |
| ------ | -------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| GET    | `/api/v1/management/proxies`                 | List proxies (with `?id=` returns one; with `?id=&where_used=1` returns the assignment graph)                                                    |
| POST   | `/api/v1/management/proxies`                 | Create proxy — body validated by `createProxyRegistrySchema`                                                                                     |
| PATCH  | `/api/v1/management/proxies`                 | Update proxy — body validated by `updateProxyRegistrySchema` (requires `id`)                                                                     |
| DELETE | `/api/v1/management/proxies?id=...&force=1`  | Delete proxy (use `force=1` to detach assignments)                                                                                               |
| GET    | `/api/v1/management/proxies/assignments`     | List assignments — filterable by `proxy_id`, `scope`, `scope_id`; pass `resolve_connection_id=<id>` to resolve the active proxy for a connection |
| PUT    | `/api/v1/management/proxies/assignments`     | Assign — body validated by `proxyAssignmentSchema` (`{scope, scopeId?, proxyId?}`). Clears dispatcher cache                                      |
| PUT    | `/api/v1/management/proxies/bulk-assign`     | Bulk-assign — body validated by `bulkProxyAssignmentSchema` (`{scope, scopeIds[], proxyId?}`)                                                    |
| GET    | `/api/v1/management/proxies/health?hours=24` | Aggregate proxy health (success/fail counts, latency) over a window                                                                              |

**Auth:** management session/API key on every route (`requireManagementAuth`).

> The task description's `POST /api/v1/management/proxies/[id]/assignments` and `POST /api/v1/management/proxies/[id]/health` are served by the flat `/assignments` and `/health` routes shown above — there are no per-id subroutes in the codebase.

---

## Resilience (extended)

OmniRoute exposes three independent temporary-failure mechanisms; the management endpoints below let operators read and override them:

| Scope               | State storage                              | Read                                      | Reset / clear                               |
| ------------------- | ------------------------------------------ | ----------------------------------------- | ------------------------------------------- |
| Provider breaker    | `domain_circuit_breakers` + in-memory      | `/api/monitoring/health`                  | `POST /api/resilience/reset`                |
| Connection cooldown | `rateLimitedUntil` on provider connections | `/api/rate-limits`, `/api/providers/[id]` | (re-enables lazily; clear via provider PUT) |
| Model lockout       | In-memory model-availability registry      | `GET /api/resilience/model-cooldowns`     | `DELETE /api/resilience/model-cooldowns`    |

`PATCH /api/resilience` accepts provider breaker overrides under `providerBreaker.oauth` and `providerBreaker.apikey`. Each profile supports `degradationThreshold`, `failureThreshold`, and `resetTimeoutMs`; the same fields are exposed in Dashboard → Settings → Resilience.

```bash
# Clear a single model lockout
curl -X DELETE http://localhost:20128/api/resilience/model-cooldowns \
  -H "Cookie: auth_token=..." \
  -H "Content-Type: application/json" \
  -d '{"provider":"openai","model":"gpt-4o-mini"}'

# Wipe every lockout
curl -X DELETE http://localhost:20128/api/resilience/model-cooldowns \
  -H "Cookie: auth_token=..." \
  -d '{"all":true}'
```

Full conceptual reference and breaker defaults: see [`CLAUDE.md`](../../CLAUDE.md) → "Resilience Runtime State".

---

## Skills

Skill framework for extending OmniRoute with custom executable handlers, plus marketplace integrations.

| Method | Path                              | Description                                                                                                                |
| ------ | --------------------------------- | -------------------------------------------------------------------------------------------------------------------------- |
| GET    | `/api/skills`                     | List installed skills — filterable by `?q=`, `?mode=on\|off\|auto`, `?source=skillsmp\|skillssh\|local`, paginated         |
| GET    | `/api/skills/[id]`                | Retrieve one skill                                                                                                         |
| PUT    | `/api/skills/[id]`                | Update skill (name, description, mode, schema, handler, tags)                                                              |
| DELETE | `/api/skills/[id]`                | Uninstall a skill                                                                                                          |
| POST   | `/api/skills/install`             | Install a skill from a raw manifest — body: `{name, version, description, schema:{input, output}, handlerCode, apiKeyId?}` |
| GET    | `/api/skills/executions`          | List recent skill executions (audit trail with inputs/outputs/duration)                                                    |
| GET    | `/api/skills/marketplace?q=...`   | Search/popular list from the SkillsMP marketplace (requires `skillsmpApiKey` setting)                                      |
| POST   | `/api/skills/marketplace/install` | Install a skill by id from SkillsMP                                                                                        |
| GET    | `/api/skills/skillssh?q=&limit=`  | Search the skills.sh registry                                                                                              |
| POST   | `/api/skills/skillssh/install`    | Install a skill by id from skills.sh                                                                                       |

**Auth:** management session/API key. Marketplace search routes accept either management auth or a Bearer API key (`isAuthenticated`).

---

## Memory

Persistent conversational/factual memory store, scoped per API key / session.

| Method | Path                 | Description                                                                                                  |
| ------ | -------------------- | ------------------------------------------------------------------------------------------------------------ |
| GET    | `/api/memory`        | List memories — `?apiKeyId=`, `?type=`, `?sessionId=`, `?q=`, with `offset/limit` or `page/limit` pagination |
| POST   | `/api/memory`        | Create memory — body validated by Zod: `{content, key, type?, sessionId?, apiKeyId?, metadata?, expiresAt?}` |
| GET    | `/api/memory/[id]`   | Retrieve one memory                                                                                          |
| DELETE | `/api/memory/[id]`   | Delete a memory                                                                                              |
| GET    | `/api/memory/health` | Memory subsystem health (DB connectivity, embeddings backend, vector index status)                           |

**Auth:** management session/API key (`requireManagementAuth`). `type` enum: `FACTUAL`, `EPISODIC`, `SEMANTIC`, `PROCEDURAL` (see `MemoryType` in `src/lib/memory/types.ts`).

---

## MCP Server

OmniRoute ships an embedded Model Context Protocol server with 3 transports (stdio, SSE, streamable-http) and scoped tools. The dashboard endpoints below read status/audit data and proxy the HTTP transports.

| Method | Path                   | Description                                                                                      |
| ------ | ---------------------- | ------------------------------------------------------------------------------------------------ | -------------------- |
| GET    | `/api/mcp/status`      | Heartbeat, transport, online state, last call, top tools, 24h success rate                       |
| GET    | `/api/mcp/tools`       | List of MCP tools with `name`, `description`, `scopes`, `phase`, `auditLevel`, `sourceEndpoints` |
| GET    | `/api/mcp/sse`         | Open SSE stream for the SSE transport (returns `503` if MCP disabled or transport mismatch)      |
| POST   | `/api/mcp/sse`         | Send JSON-RPC frame on the SSE transport                                                         |
| GET    | `/api/mcp/stream`      | Open SSE side of the Streamable HTTP transport (server-initiated messages)                       |
| POST   | `/api/mcp/stream`      | Send JSON-RPC frame on the Streamable HTTP transport                                             |
| DELETE | `/api/mcp/stream`      | End a Streamable HTTP session                                                                    |
| GET    | `/api/mcp/audit`       | Query audit log — `?limit=`, `?offset=`, `?tool=`, `?success=true                                | false`, `?apiKeyId=` |
| GET    | `/api/mcp/audit/stats` | Aggregate audit stats (totals, success rate, avg duration, top tools)                            |

**Auth:** the `sse`/`stream` transports honor the MCP-specific auth surface (Bearer API key with `mcp` scope); the `status`/`tools`/`audit*` routes are readable from the dashboard (no extra auth required beyond reaching the dashboard host).

> Both HTTP transports are gated by `settings.mcpEnabled` and `settings.mcpTransport` — a transport mismatch returns `400`, an MCP disabled state returns `503`.

---

## A2A Server

OmniRoute exposes an A2A (Agent-to-Agent) JSON-RPC 2.0 endpoint plus a REST wrapper for inspection/dashboard use.

### JSON-RPC

```bash
POST /a2a
Authorization: Bearer your-api-key   # optional unless OMNIROUTE_API_KEY is set
Content-Type: application/json

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "message/send",
  "params": {
    "skill": "smart-routing",
    "messages": [{"role": "user", "content": "Route this coding task"}]
  }
}
```

Supported methods (all gated on `settings.a2aEnabled`):

| Method           | Description                                                        |
| ---------------- | ------------------------------------------------------------------ |
| `message/send`   | Synchronous skill execution; returns `{task, artifacts, metadata}` |
| `message/stream` | Streaming SSE execution of the same skill set                      |
| `tasks/get`      | Fetch a task by `taskId`                                           |
| `tasks/cancel`   | Cancel a task by `taskId`                                          |

Built-in skills: `smart-routing`, `quota-management`, `provider-discovery`, `cost-analysis`, `health-report`.

### Agent Card

```bash
GET /.well-known/agent.json
```

Returns the public A2A agent card (name, description, capabilities, skill catalog, auth scheme) — cached publicly for 1h. No auth required.

### REST helpers

| Method | Path                         | Description                                                                                                     |
| ------ | ---------------------------- | --------------------------------------------------------------------------------------------------------------- |
| GET    | `/api/a2a/status`            | A2A enabled + task stats + cached agent card summary                                                            |
| GET    | `/api/a2a/tasks`             | List tasks — `?state=submitted\|working\|completed\|failed\|cancelled`, `?skill=`, `?limit=` (≤200), `?offset=` |
| POST   | `/api/a2a/tasks`             | (Not implemented as a REST helper — create via JSON-RPC `message/send`)                                         |
| GET    | `/api/a2a/tasks/[id]`        | Retrieve one task                                                                                               |
| POST   | `/api/a2a/tasks/[id]/cancel` | Cancel a task                                                                                                   |

**Auth:** the REST helpers run without management auth (dashboard-readable); the JSON-RPC `/a2a` route uses Bearer `OMNIROUTE_API_KEY` if configured.

---

## Cloud, Evals & Assess

| Method | Path                            | Description                                                                                       |
| ------ | ------------------------------- | ------------------------------------------------------------------------------------------------- | ----------------------------- | ----------------------------------- |
| POST   | `/api/cloud/auth`               | Verify a Bearer key and return masked provider connections + model aliases for cloud sync clients |
| POST   | `/api/cloud/credentials/update` | Update encrypted credentials for a cloud-synced provider                                          |
| POST   | `/api/cloud/model/resolve`      | Resolve a logical model id to a concrete provider/model using the local routing table             |
| GET    | `/api/cloud/models/alias`       | List model aliases as exposed to cloud sync                                                       |
| GET    | `/api/assess`                   | Read latest assessment categorizations (per-provider/model)                                       |
| POST   | `/api/assess`                   | Run an assessment — body: `{scope: {type:"all"}                                                   | {type:"provider", providerId} | {type:"model", modelId}, trigger?}` |
| GET    | `/api/evals`                    | List built-in eval suites + most recent runs                                                      |
| POST   | `/api/evals`                    | Trigger an eval run                                                                               |
| POST   | `/api/evals/suites`             | Create a custom eval suite — body validated by `evalSuiteSaveSchema`                              |
| GET    | `/api/evals/suites/[id]`        | Retrieve a custom eval suite                                                                      |

**Auth:** `/api/cloud/auth` validates a Bearer key directly; the other `/api/cloud/*`, `/api/evals/*`, and `/api/assess` routes require management session/API key. `/api/assess` POST uses `validateBody` with a discriminated-union scope schema.

---

## ACP (Agent Client Protocol) Management

as child processes. These endpoints manage ACP agent detection and custom agent
registration.

| Method | Path              | Description                                                                                                                                            |
| ------ | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| GET    | `/api/acp/agents` | List all known CLI agents (built-in + custom) with installation status, version, binary                                                                |
| POST   | `/api/acp/agents` | Register a custom ACP agent or refresh cache — body: `{id, name, binary, versionCommand, providerAlias, spawnArgs, protocol}` or `{action: "refresh"}` |
| DELETE | `/api/acp/agents` | Remove a custom ACP agent — query param: `?id=<agentId>`                                                                                               |

**Response example** (`GET /api/acp/agents`):

```json
{
  "agents": [
    {
      "id": "claude",
      "name": "Claude Code CLI",
      "binary": "claude",
      "version": "1.0.45",
      "installed": true,
      "protocol": "stdio",
      "providerAlias": "claude",
      "isCustom": false
    },
    {
      "id": "my-custom-cli",
      "name": "My Custom CLI",
      "installed": false,
      "protocol": "stdio",
      "providerAlias": "my-provider",
      "isCustom": true
    }
  ],
  "cacheTtlMs": 60000,
  "cacheAge": 1234
}
```

**Auth:** Requires management session (dashboard `auth_token` cookie) or a
management-scoped API key.

See [ACP Framework](../frameworks/ACP.md) for full details.

---

## Analytics & Observability

Real-time analytics endpoints for monitoring routing, compression, and provider
diversity. These power the `/dashboard/analytics/*` pages.

### Auto-routing analytics

| Method | Path                                 | Description                                                                                        |
| ------ | ------------------------------------ | -------------------------------------------------------------------------------------------------- |
| GET    | `/api/analytics/auto-routing`        | Aggregate auto-routing stats: total calls, strategy distribution, tier distribution, top providers |
| GET    | `/api/analytics/auto-routing?days=7` | Time-windowed stats (default 24h)                                                                  |

**Response example**:

```json
{
  "window": "24h",
  "totalCalls": 1234,
  "strategyBreakdown": {
    "rules": 800,
    "cost": 200,
    "latency": 150,
    "sla-aware": 50,
    "lkgp": 34
  },
  "tierBreakdown": {
    "ultra": 100,
    "pro": 500,
    "standard": 400,
    "free": 234
  },
  "topProviders": [
    { "provider": "openai", "calls": 500, "avgLatencyMs": 850 },
    { "provider": "anthropic", "calls": 300, "avgLatencyMs": 1200 }
  ]
}
```

### Compression analytics

| Method | Path                         | Description                                                                           |
| ------ | ---------------------------- | ------------------------------------------------------------------------------------- |
| GET    | `/api/analytics/compression` | Aggregate compression stats: tokens saved, savings %, mode distribution, engine usage |

**Response example**:

```json
{
  "window": "24h",
  "totalOriginalTokens": 5000000,
  "totalCompressedTokens": 3500000,
  "totalSavings": 1500000,
  "savingsPct": 30.0,
  "modeBreakdown": {
    "lite": 400,
    "standard": 600,
    "aggressive": 100,
    "ultra": 50,
    "rtk": 84
  },
  "engineBreakdown": {
    "caveman": 800,
    "rtk": 434
  }
}
```

### Provider diversity tracking

| Method | Path                       | Description                                                                                              |
| ------ | -------------------------- | -------------------------------------------------------------------------------------------------------- |
| GET    | `/api/analytics/diversity` | Shannon entropy-based diversity tracking: prevents single points of failure by measuring provider spread |

**Response example**:

```json
{
  "window": "24h",
  "shannonEntropy": 2.45,
  "maxEntropy": 3.17,
  "diversityRatio": 0.77,
  "providerUsage": {
    "openai": 0.4,
    "anthropic": 0.25,
    "google": 0.2,
    "kiro": 0.15
  },
  "warnings": ["OpenAI accounts for 40% of traffic — consider diversifying"]
}
```

**Auth:** Requires management session or management-scoped API key.

---

## Admin Operations

Admin-only endpoints for operational management.

| Method | Path                     | Description                                                                                 |
| ------ | ------------------------ | ------------------------------------------------------------------------------------------- |
| GET    | `/api/admin/concurrency` | Read current concurrency limits (global + per-provider)                                     |
| POST   | `/api/admin/concurrency` | Update concurrency limits — body: `{global?: number, perProvider?: Record<string, number>}` |

**Auth:** Requires management session with admin scope.

---

## CLI Tools Management

Manage CLI tools that integrate with OmniRoute (antigravity, chipotle, commandCode,
devin-cli, etc.). See [Provider Reference](./PROVIDER_REFERENCE.md) for the full list.

| Method | Path                                    | Description                                                                                    |
| ------ | --------------------------------------- | ---------------------------------------------------------------------------------------------- |
| GET    | `/api/cli-tools/all-statuses`           | Status of all CLI tools (installed, version, last seen)                                        |
| GET    | `/api/cli-tools/[id]/status`            | Status of a specific CLI tool (id can be: antigravity, chipotle, commandCode, devin-cli, etc.) |
| POST   | `/api/cli-tools/apply`                  | Apply a CLI tool configuration to a provider connection                                        |
| GET    | `/api/cli-tools/backups`                | List CLI tool configuration backups                                                            |
| POST   | `/api/cli-tools/backups`                | Create a backup of all CLI tool configurations                                                 |
| POST   | `/api/cli-tools/[id]/restore`           | Restore a CLI tool from a backup                                                               |
| GET    | `/api/cli-tools/antigravity-mitm`       | Antigravity MITM proxy status (the "antigravity-mitm" CLI tool)                                |
| POST   | `/api/cli-tools/antigravity-mitm/alias` | Configure antigravity-mitm aliases                                                             |

**Auth:** Requires management session.

---

## Agent Skills

Manage AI agent skills (similar to OpenAI's custom GPTs but for agents).

| Method | Path                         | Description                                                                             |
| ------ | ---------------------------- | --------------------------------------------------------------------------------------- |
| GET    | `/api/agent-skills`          | List all agent skills (built-in + custom)                                               |
| GET    | `/api/agent-skills/[id]`     | Get a specific agent skill                                                              |
| POST   | `/api/agent-skills`          | Create a custom agent skill — body: `{name, description, prompt, model?, temperature?}` |
| PUT    | `/api/agent-skills/[id]`     | Update a custom agent skill                                                             |
| DELETE | `/api/agent-skills/[id]`     | Delete a custom agent skill                                                             |
| GET    | `/api/agent-skills/[id]/raw` | Get raw prompt + metadata (no execution)                                                |
| POST   | `/api/agent-skills/generate` | AI-generate a new skill from a natural language description                             |

**Auth:** Requires management session or management-scoped API key.

---

## Cache Management

Manage the semantic cache and reasoning cache.

| Method | Path                   | Description                                                                                             |
| ------ | ---------------------- | ------------------------------------------------------------------------------------------------------- |
| GET    | `/api/cache`           | Cache overview: total entries, hit rate, size on disk                                                   |
| GET    | `/api/cache/entries`   | List cached entries (with pagination)                                                                   |
| DELETE | `/api/cache/entries`   | Delete cache entries (filter by query parameters)                                                       |
| GET    | `/api/cache/stats`     | Detailed cache statistics (per-provider, per-model)                                                     |
| GET    | `/api/cache/reasoning` | Reasoning cache status (for reasoning replay)                                                           |
| DELETE | `/api/cache/reasoning` | Clear reasoning cache — query params: `?toolCallId=<id>` (single) or `?provider=<p>` or no params (all) |

**Auth:** Requires management session.

---

## Memory System

Manage persistent memory (FTS5 + vector embeddings).

| Method | Path                 | Description                                                           |
| ------ | -------------------- | --------------------------------------------------------------------- |
| GET    | `/api/memory`        | List memory entries (filter by scope, type, search query)             |
| POST   | `/api/memory`        | Create a new memory entry — body: `{scope, type, content, metadata?}` |
| GET    | `/api/memory/[id]`   | Get a specific memory entry                                           |
| PUT    | `/api/memory/[id]`   | Update a memory entry                                                 |
| DELETE | `/api/memory/[id]`   | Delete a memory entry                                                 |
| GET    | `/api/memory/search` | Search memory (FTS5 + vector)                                         |
| POST   | `/api/memory/clear`  | Clear memory entries (with filters)                                   |
| GET    | `/api/memory/stats`  | Memory statistics (total entries, embedding coverage, etc.)           |

**Auth:** Requires management session or management-scoped API key.

---

## Webhooks

Manage webhook subscriptions for events.

| Method | Path                            | Description                                                               |
| ------ | ------------------------------- | ------------------------------------------------------------------------- |
| GET    | `/api/webhooks`                 | List all webhook subscriptions                                            |
| POST   | `/api/webhooks`                 | Create a webhook subscription — body: `{url, events[], secret?, active?}` |
| GET    | `/api/webhooks/[id]`            | Get a specific webhook subscription                                       |
| PUT    | `/api/webhooks/[id]`            | Update a webhook subscription                                             |
| DELETE | `/api/webhooks/[id]`            | Delete a webhook subscription                                             |
| GET    | `/api/webhooks/events`          | List all available webhook event types                                    |
| GET    | `/api/webhooks/[id]/deliveries` | List delivery history for a webhook (success/failure log)                 |
| POST   | `/api/webhooks/[id]/test`       | Send a test event to a webhook                                            |

**Auth:** Requires management session.

See [Webhooks Framework](../frameworks/WEBHOOKS.md) for full event types.

---

## Skills Framework

Manage Skills (the agentic extensions framework).

| Method | Path                     | Description                                                                             |
| ------ | ------------------------ | --------------------------------------------------------------------------------------- |
| GET    | `/api/skills`            | List all installed skills (built-in + custom)                                           |
| POST   | `/api/skills/install`    | Install a skill from a local path or URL                                                |
| DELETE | `/api/skills/[id]`       | Uninstall a skill                                                                       |
| PUT    | `/api/skills/[id]`       | Enable or disable a skill — body: `{enabled?: boolean, mode?: "on" \| "off" \| "auto"}` |
| POST   | `/api/skills/executions` | Execute a skill — body: `{skillName, apiKeyId, input?, sessionId?}`                     |
| GET    | `/api/skills/executions` | List execution history for all skills (filter by `?apiKeyId=`)                          |

**Auth:** Requires management session or management-scoped API key.

See [Skills Framework](../frameworks/SKILLS.md) for full details.

---

## Plugins

Manage OmniRoute plugins (third-party extensions).

| Method | Path                             | Description                               |
| ------ | -------------------------------- | ----------------------------------------- |
| GET    | `/api/plugins`                   | List installed plugins                    |
| POST   | `/api/plugins/install`           | Install a plugin from a local path or URL |
| DELETE | `/api/plugins/[name]`            | Uninstall a plugin                        |
| POST   | `/api/plugins/[name]/activate`   | Activate a plugin                         |
| POST   | `/api/plugins/[name]/deactivate` | Deactivate a plugin                       |
| GET    | `/api/plugins/[name]/config`     | Get plugin configuration                  |
| PUT    | `/api/plugins/[name]/config`     | Update plugin configuration               |

**Auth:** Requires management session.

See [Plugins Framework](../frameworks/PLUGIN_SDK.md) for full details.

---

## Shadow Routing

Shadow / A-B comparison of providers is **not a standalone REST surface** — it is configured through combo routing (see [Auto-Combo](../routing/AUTO-COMBO.md)). Per-combo comparison metrics are served by `GET /api/combos/metrics`.

---

## Guardrails

Inspect the runtime guardrails (PII detection, prompt injection detection, vision bridging). Guardrails run on every request; per-call opt-out is via the `x-omniroute-disabled-guardrails` request header — there is no persisted enable/disable surface.

| Method | Path                   | Description                                                                              |
| ------ | ---------------------- | ---------------------------------------------------------------------------------------- |
| GET    | `/api/guardrails`      | List the registered guardrails and their status (name / enabled / priority)              |
| POST   | `/api/guardrails/test` | Dry-run the pre-call pipeline over a sample input — body: `{input, disabledGuardrails?}` |

**Auth:** Requires management session.

See [Security > Guardrails](../security/GUARDRAILS.md) for full details.

---

---

## Authentication

- Dashboard routes (`/dashboard/*`) use `auth_token` cookie
- Login uses saved password hash; fallback to `INITIAL_PASSWORD`
- `requireLogin` toggleable via `/api/settings/require-login`
- `/v1/*` routes optionally require Bearer API key when `REQUIRE_API_KEY=true`

> **Breaking change (v3.8.0)** — `/api/v1/agents/tasks/*` and the cooldown management endpoints now require **management auth** (dashboard `auth_token` cookie or a management-scoped API key). Clients that previously called these routes unauthenticated will receive `401 Unauthorized`. See commit `588a0333` (`fix(auth): require management auth for agent and cooldown APIs`).