# Changelog ## [Unreleased] --- ## [3.8.40] — TBD _In development — bullets added per PR; finalized at release._ ### ✨ New Features - **feat(compression): relevance extractive engine** — a new opt-in compression engine that scores each sentence by term-overlap (Jaccard) with the user's last query minus a length/boilerplate penalty, greedily keeps the most relevant within a budget, and reconstructs the original order. Pure-string, deterministic, ReDoS-safe (char-code tokenization, no `RegExp` over user input), fail-open, default off. Ideal for trimming long pasted RAG context / tool output to what's relevant. Sentences carrying real signal (digits/URLs/errors/code/paths) are never dropped; `overlapThreshold`/`budgetPercent`/`boilerplateWeight` are configurable. Tier-2 item of the compression feature-extraction roadmap (#7). ([#5289](https://github.com/diegosouzapw/OmniRoute/pull/5289)) - **feat(compression): hard-budget mode — compress to ≤ N tokens** — a deterministic post-pass (`targetTokens` / `targetRatio`, default unset → no-op) that trims a body to a token budget. It ranks sentences/lines by average `scoreToken` ascending and drops the lowest-saliency ones until the body fits (measured by the exact cl100k `countTextTokens`), preserving original order. Lines carrying real signal (digits, URLs, `Error:`-family, code fences, stack `at`-frames, multi-segment paths, `key=value`) are never dropped; the budget is distributed proportionally across messages so the total stays ≤ target; an unreachable target (all-preserved) surfaces a `validationWarnings` note instead of failing silently. Does NOT touch the `estimateCompressionTokens` budget-gate estimator. Tier-3 item of the compression feature-extraction roadmap (#17). ([#5288](https://github.com/diegosouzapw/OmniRoute/pull/5288), follow-up [#5291](https://github.com/diegosouzapw/OmniRoute/pull/5291)) - **feat(compression): result memoization for deterministic engines (opt-in)** — caches `(input, config) → result` for provably pure, stateless modes (`lite`/`standard`/`rtk` and stacked pipelines of `{lite,caveman,rtk}`) to skip recompute on the hot path. Opt-in via `memoizeCompressionResults` (default off → zero behavior change). Conservative opt-in whitelist (stateful `ccr`/`session-dedup` — which write the cross-request CCR store — and model-backed `ultra`/`aggressive`/`llmlingua` are never cached), principal-scoped (skipped without a principal, so no cross-principal body leak), and clone-on-store + clone-on-read. Tier-3 item of the compression feature-extraction roadmap (#21). ([#5286](https://github.com/diegosouzapw/OmniRoute/pull/5286)) - **feat(compression): inline transparency annotation** — surfaces `tokens=847→312; rules: filler×8, dedup×2` derived from existing compression stats. The `X-OmniRoute-Compression` response header is extended **append-only** (the `mode; source=X` prefix stays byte-identical, so existing header parsers don't break) and the compression studio cockpit shows a matching badge. Zero new computation — it aggregates the `rulesApplied`/`techniquesUsed` already on the stats. Tier-3 item of the compression feature-extraction roadmap (#18). ([#5284](https://github.com/diegosouzapw/OmniRoute/pull/5284)) - **feat(compression): saliency heatmap in the compression studio** — the preview studio can now color each token by saliency: `ultra` per-token `scoreToken` (0–1, green→red gradient) or universal kept/removed from the existing diff. A dry-run visualization behind a toggle (no cost on a normal preview; backward-compatible when off). Completes the visualization half of roadmap item #13 (the A/B comparison shipped in [#5080](https://github.com/diegosouzapw/OmniRoute/pull/5080)). ([#5285](https://github.com/diegosouzapw/OmniRoute/pull/5285)) - **feat(compression): composite-command splitter for RTK detection** — `cd /x && git status` now detects as `git-status` (previously the whole string was treated as one command and matched no filter). A quote-aware top-level tokenizer splits on `&&`/`||`/`;` (never inside quotes or `$(…)`/backtick subshells) and feeds the **last** segment to RTK command detection, so every RTK filter/renderer fires on commands wrapped in `cd … &&`/`||`/`;` chains. O(n), no RegExp over the command (ReDoS-safe). Tier-3 item of the compression feature-extraction roadmap (#16). ([#5283](https://github.com/diegosouzapw/OmniRoute/pull/5283)) - **feat(mcp): `omniroute_tool_search` tool + one-line TS signatures** — new MCP tool that does lexical keyword search over every MCP tool's name/description and returns the top matches as compact one-line TypeScript signatures (~half the JSON-schema token cost), so agents discover tools on demand instead of carrying all ~88 schemas every turn. Search is ReDoS-safe (substring scoring, never `new RegExp` on the query) and deterministic; `tools/list` stays complete (no hidden tools). Adds the `read:tools` scope. Tier-1 item of the compression feature-extraction roadmap. ([#5269](https://github.com/diegosouzapw/OmniRoute/pull/5269)) - **feat(compression): RTK semantic command-output renderers (opt-in)** — adds a second, opt-in compaction layer to the RTK engine that rewrites structured command output into a far more compact semantic form: `git diff` → file headers + `@@` hunks + changed lines only; an all-green `pytest`/`jest`/`vitest`/`eslint` run → its one-line summary; `terraform`/`tofu plan` → `Plan: +N ~M -K` plus the resource list; `kubectl`/`aws` JSON arrays → a minimal table. Each renderer is conservative (no-op when the shape doesn't match) and the integration is fail-open; the test-green renderer never collapses output that carries any failure signal. Gated by `RtkConfig.enableRenderers` (default off → zero behavioral change). Eighth item of the compression feature-extraction roadmap. ([#5268](https://github.com/diegosouzapw/OmniRoute/pull/5268)) - **feat(compression): QuantumLock cache-prefix stabilization (opt-in, default off)** — recovers upstream prompt-cache hits that a volatile fragment in the system prompt would otherwise bust. When a caller injects a session UUID, unix timestamp, request-id, JWT, API-key shape, or long hex digest into the `role:system` message every turn, the longest common prefix across turns ends at that changing byte → the whole system prompt after it is re-billed and re-processed each turn. QuantumLock replaces each non-semantic volatile fragment with a **positional, value-independent** placeholder `⟦Q{i}⟧` and appends the real values in a delimited `⟦QUANTUMLOCK⟧` tail. The rewrite is **sent to the model** (lossless — not restored), so the system-prompt body becomes **byte-identical across turns** and the provider caches the long stable prefix while only the small tail differs. Opt-in, default off, applied only for caching providers (`isCachingProvider && config.quantumLock.enabled`); bounded ReDoS-safe patterns; idempotent; **no date/time patterns** (semantically meaningful — explicit non-goal). Studio gets a toggle + a "🔒 N volatile fragment(s) stabilized" dry-run badge. Seventh item of the compression feature-extraction roadmap (bench: [#5080](https://github.com/diegosouzapw/OmniRoute/pull/5080), gate: [#5127](https://github.com/diegosouzapw/OmniRoute/pull/5127), fuzzy: [#5143](https://github.com/diegosouzapw/OmniRoute/pull/5143), ionizer: [#5148](https://github.com/diegosouzapw/OmniRoute/pull/5148), TOON: [#5163](https://github.com/diegosouzapw/OmniRoute/pull/5163), CCR ranged: [#5187](https://github.com/diegosouzapw/OmniRoute/pull/5187), risk-gate: [#5243](https://github.com/diegosouzapw/OmniRoute/pull/5243)). ([#5260](https://github.com/diegosouzapw/OmniRoute/pull/5260)) - **kilocode:** anonymous (no-auth) access to Kilo Code's free models, mirroring the `opencode`/`mimocode` pattern. With no Kilo account connected, requests now fall back to the gateway's anonymous tier (`Authorization: Bearer anonymous` on `api.kilo.ai/api/openrouter`) so the free models work without signup; a connected OAuth account is still used unchanged for the paid tier ([#5259](https://github.com/diegosouzapw/OmniRoute/pull/5259), #4019 — thanks @Theadd for the reference implementation) - **feat(logging): call-log correlation ID (end-to-end)** — every request now gets a unique correlation id, returned in the `X-Correlation-Id` response header, persisted in `call_logs` (migration **109**), filterable via `/api/usage/call-logs`, and surfaced in the dashboard request logger (per-chunk stream timestamps + active-requests-first sort). This is the safe, cohesive core subset of the larger #5275 — landed on its own so the low-risk value isn't blocked by the parts of that PR still under review. ([#5279](https://github.com/diegosouzapw/OmniRoute/pull/5279) — thanks @hartmark) - **feat(providers): Microsoft 365 Copilot individual provider** — adds the `copilot-m365-web` provider (the 237th), wiring the M365 BizChat framing/connection helpers into a selectable web-session provider backed by `m365.cloud.microsoft/chat` for individual Microsoft 365 plans. Builds on the M365 pure-framing groundwork from #4696. Regression guard: `tests/unit/copilot-m365-web-executor.test.ts`. ([#5302](https://github.com/diegosouzapw/OmniRoute/pull/5302) — thanks @skyzea1) ### 🔧 Bug Fixes - **ci(docker):** re-point the Docker Hub / GHCR `:latest` (and `:latest-web`) tags to the just-published release. On a `release: released` event the freshly-created git tag is often not yet visible to `git fetch --tags` when `docker-publish` runs, so the `:latest`-promotion gate built its candidate set purely from `git tag -l` and resolved the highest semver to the **previous** version — leaving `latest` one release behind (3.8.39 published, `latest` still 3.8.38). The decision now lives in `scripts/ci/should-promote-latest.sh`, which folds the current `VERSION` into the candidate set before picking the highest stable semver, making promotion independent of tag-sync timing (a patch published after a higher minor still won't grab `latest`). Regression guard: `tests/unit/build/should-promote-latest-5301.test.ts` ([#5301](https://github.com/diegosouzapw/OmniRoute/issues/5301)) - **command-code:** treat a non-positive `max_tokens`/`max_completion_tokens` (e.g. Zoo Code's `-1` "let the server choose") as "no limit" — omit the field instead of forcing it to `1`. `clampMaxTokens` previously did `Math.max(1, …)`, so a client `-1` was sent upstream as `max_tokens: 1`, truncating the response to a single token (the observed `completion_tokens: 1`, `content: null`, `reasoning_content: "The"` with `finish_reason: stop`). Now any value `≤ 0` is dropped so Command Code applies the model's native default; positive values are still floored and clamped to the 200k ceiling. Regression guard: `tests/unit/command-code-maxtokens-negative-5166.test.ts` ([#5166](https://github.com/diegosouzapw/OmniRoute/issues/5166) — thanks @Stazyu) - **fix(auth): compare-and-swap guard on the OAuth refresh persist** — under multi-agent load, the per-connection refresh mutex makes `[network refresh + DB write]` atomic for **one** connection, but it does not protect against a **third** writer (a sibling request, a concurrent HealthCheck, or a replica) landing a fresher `refresh_token` rotation on the same `connection_id` between the staleness read and the persist. Overwriting that fresher row reverts the sibling's rotation; the next caller then loads the now-consumed token, Auth0/Anthropic flag it as `refresh_token_reused`, and the whole token family gets revoked (the 1352× claude/`aa5dd5cf` invalidation storm). `getAccessToken` now re-reads the row's current `refresh_token` immediately before persisting (inside the mutex) and **skips the write** when it has rotated past the token the caller presented — the caller still receives the freshly-issued access token, only the DB overwrite is skipped. Opt-in via `runWithCasGuard` (no active guard ⇒ byte-identical behavior); skip/persist counters exposed via `getCasGuardStats()`. Regression guard: `tests/unit/token-refresh-cas-guard-4038.test.ts`. ([#4038](https://github.com/diegosouzapw/OmniRoute/issues/4038) — thanks @KooshaPari for the root-cause diagnosis) - **mcp:** break the `schemas/tools.ts ↔ schemas/toolSearch.ts` import cycle introduced when the `tool_search` defs (#5269) were extracted into their own module — `toolSearch.ts` imported `McpToolDefinition` from `tools.ts` while `tools.ts` imported `toolSearchTool` from `toolSearch.ts`, failing `check:cycles` on `release/v3.8.40`. The shared `AuditLevel` + `McpToolDefinition` types now live in a leaf `schemas/toolDefinition.ts` that both import; `tools.ts` re-exports them for backward compatibility. - **compression (analytics):** record attempted-but-no-op compression runs so Stacked is no longer invisible when it saves nothing. Previously a `compression_analytics` row was written only on a net-positive saving, so a Stacked (RTK→Caveman) pipeline that ran on already-compact context produced no row — indistinguishable from "never dispatched" (`byMode.stacked.count` stayed flat while Ultra climbed). Such runs are now recorded with `skip_reason` and surfaced as a per-mode `skipped` count plus `totalSkipped`/`bySkipReason` in the analytics summary and the Mode Breakdown; the existing net-saving totals/averages are unchanged (skip rows are excluded from them) (#4268 — thanks @abdulkadirozyurt, @androw) - **cli (tray):** fix `omniroute server --tray` showing no tray on macOS/Linux with no error printed. The wired Unix tray path loaded `systray2` through an inline loader that called `require("module")` inside an ESM `.mjs` file (`"type":"module"`) → `ReferenceError: require is not defined`, silently swallowed (regressed in v3.8.34); even if it had loaded, `systray2` isn't in `node_modules` (it's lazily installed into `~/.omniroute/runtime`). The loader now delegates to the runtime loader, the icon path (`icon.png`) is corrected, `isTemplateIcon` is `false` (the full-color icon rendered as a white square under macOS template mode), and tray start failures are surfaced to stderr instead of being swallowed (#4605 — thanks @ProgMEM-CC) - **agent-bridge (antigravity):** unwrap the cloudcode-pa `.request` envelope when converting Antigravity IDE requests. The real IDE sends `cloudcode-pa.googleapis.com/v1internal:generateContent` with the Gemini request nested under `.request` (`{ project, model, request: { contents, systemInstruction, generationConfig } }`), but the bridge read those fields at the top level — yielding an empty conversation, so prompts hung mid-execution. The legacy `/v1beta/models/:generateContent` top-level shape still works (#4294 — thanks @shabeer) - **dashboard:** add a GitHub releases fallback to the "Update Available" lookup. After the v3.8.28 fix added an npm-registry HTTP fallback, the banner could still stay hidden on networks that reach GitHub (where the news feed already loads) but not `registry.npmjs.org`. `resolveLatestVersion()` now tries npm CLI → npm registry → GitHub releases (`/repos/diegosouzapw/OmniRoute/releases/latest`) before giving up, and logs a warning only when all three fail (#4100) - **command-code:** omit `max_tokens` when the client omits it so the upstream applies the model's native default, fixing `400 "expected <=200000"` on `/alpha/generate` for high-cap models; an explicit oversized client value is clamped to the 200k endpoint ceiling (#5221 — thanks @adivekar-utexas) - **combo:** wire session stickiness into the round-robin dispatch path. Multi-turn conversations from clients that send no session id (Codex CLI, Claude Code, most OpenAI-compatible tools) were rotated to a different connection on every turn by round-robin combos, busting the upstream prompt-cache → cold high-reasoning starts, intermittent `504`s and throughput collapse under concurrency. The weighted/priority paths already honored per-conversation stickiness; the round-robin handler returned before reaching it. Round-robin now starts the rotation at the conversation's sticky connection (failover to the other targets is preserved), and different conversations still spread across connections — only intra-conversation rotation is removed ([#5248](https://github.com/diegosouzapw/OmniRoute/pull/5248), #3825 — thanks @bypanghu, @jpsn123, @xz-dev) - **kiro:** replace the synthesized trailing `"Continue"` turn with a neutral filler (`"..."`) — when an OpenAI→Kiro request ends on an assistant/tool turn, the translator synthesizes the protocol-required trailing user turn, and the literal word `"Continue"` could be read by Kiro/CodeWhisperer as a real user instruction and trigger unintended agent action. A trailing tool-result turn is still promoted as-is (it already collapses to a real user turn); only the assistant-text-ending case is affected. Regression guards: `tests/unit/kiro-continue-filler-5231.test.ts`. ([#5231](https://github.com/diegosouzapw/OmniRoute/issues/5231)) - **combo:** advance to the next combo target on a `400 "requested model is not supported"` instead of hard-failing. The 400 guard in the priority strategy treated `MODEL_CAPACITY` as a block-fallback reason, so a combo that hit a provider lacking a specific model returned a hard `400` even when other targets (different providers) supported it. Such 400s now fall through to the next target. ([#5249](https://github.com/diegosouzapw/OmniRoute/pull/5249) — thanks @Chewji9875) - **dashboard:** disabled no-auth providers no longer vanish from the All Providers page. Disabling a no-auth provider (the "No authentication required" toggle, which adds it to `blockedProviders`) silently removed its card because the page _dropped_ blocked no-auth entries from its render list — the only way back was buried under Settings → Security → Blocked Providers. The page now **partitions** no-auth entries: visible providers render as before, blocked ones appear in a "Disabled" sub-group with an **Enable** button that un-blocks them in place. Aggregates, counts and `/v1/models` still consume the visible-only list (blocked providers stay out of routing). Regression guard: `tests/unit/noauth-blocked-partition-5183.test.ts`. ([#5183](https://github.com/diegosouzapw/OmniRoute/issues/5183), follow-up from [#5166](https://github.com/diegosouzapw/OmniRoute/issues/5166) — thanks @WslzGmzs) - **dashboard:** add a parent `/dashboard/context` page so RSC prefetches of the compression-context hub no longer 404. The route only had sub-routes (`settings`, `combos`, `ultra`, …) and no parent page, so the App Router returned 404 for the bare segment. The parent now redirects to its canonical sub-route (`/dashboard/context/settings`), honoring a legacy `?tab=` query for deep links. Regression guard: `tests/unit/dashboard/context-parent-redirect-5298.test.ts` ([#5298](https://github.com/diegosouzapw/OmniRoute/issues/5298) — thanks @KooshaPari) - **i18n:** add the missing `sidebar.gamificationGroup` message across all 42 locales — the Gamification sidebar group referenced a `titleKey` that existed in no locale, logging `MISSING_MESSAGE: sidebar.gamificationGroup (en)` at runtime (the group still rendered via its `titleFallback`). The key is now present everywhere so the warning is gone and locale coverage is unaffected ([#5298](https://github.com/diegosouzapw/OmniRoute/issues/5298) — thanks @KooshaPari) - **api(stream):** `/v1/chat/completions` no longer returns SSE for a non-stream OpenAI-compatible request when `stream` is omitted and the client sends `Accept: application/json, text/event-stream` — the Vercel AI SDK / OpenAI SDK non-stream signature (`doGenerate()`/`generateText()`), which then failed with `Invalid JSON response` (`Unexpected token 'd', "data: {"id"...`). The route-level Accept override (#302) and `resolveStreamFlag` now treat an Accept header that explicitly lists `application/json` as a JSON opt-in even when it also lists `text/event-stream`; only a pure `Accept: text/event-stream` (no `application/json`) still opts an omitted-`stream` request into SSE, and an explicit body `stream` value always wins. The shared decision now lives in `acceptHeaderForcesStream`. Regression guard: `tests/unit/sse-nonstream-accept-5305.test.ts`. ([#5305](https://github.com/diegosouzapw/OmniRoute/issues/5305) — thanks @md-riaz) - **providers:** drop the retired GPT‑5.2 / GPT‑4.5 models from the direct **ChatGPT‑web** and **Codex** surfaces (OpenAI removed them there), so OmniRoute stops advertising/routing models that no longer exist. Scoped on purpose to those two providers — third‑party proxies that still expose the ids are untouched. ([#5280](https://github.com/diegosouzapw/OmniRoute/pull/5280) — thanks @backryun) - **codex:** drop the deprecated `local_shell` hosted tool type before forwarding to OpenAI's Responses API, resolving the omni-combo `400 "The local_shell tool is no longer supported."` spike. Inbound Responses `local_shell` is still accepted and mapped to a caller-side Chat `shell` function for compatibility. ([#5250](https://github.com/diegosouzapw/OmniRoute/pull/5250), [#5256](https://github.com/diegosouzapw/OmniRoute/pull/5256) — thanks @KooshaPari) - **antigravity:** retry excluded accounts via the fallback LRU. The combo same-model retry loop accumulates excluded Antigravity connection ids after account-level failures, but auth selection only treated a single `excludeConnectionId` as a fallback scenario — once exclusions accumulated through `excludedConnectionIds`, selection could fall back to normal sticky/priority behavior instead of LRU-selecting the next eligible account for the same model/family. Any non-empty accumulated exclude set is now treated as fallback mode. Builds on the family-scoped lockout work in #5180 (v3.8.39). ([#5222](https://github.com/diegosouzapw/OmniRoute/pull/5222) — thanks @Ardem2025) - **grok-cli:** strip unsupported sampling params (`presencePenalty`, `frequencyPenalty`, `logprobs`, `topLogprobs`) before sending to the Grok Build API, fixing `400 'Model does not support parameter presencePenalty'` when clients (MiMoCode, Cursor, etc.) send OpenAI-style params. ([#5273](https://github.com/diegosouzapw/OmniRoute/pull/5273) — thanks @fulorgnas) - **grok-cli:** accept the full `~/.grok/auth.json` object in the dashboard import-token endpoint. The `oauthImportTokenSchema` only accepted a bare string token while the UI sends the whole auth.json object → `400 Bad Request`; the schema now accepts the object and stores the original under `providerSpecificData.rawAuthJson` for diagnostics and token refresh. ([#5258](https://github.com/diegosouzapw/OmniRoute/pull/5258) — thanks @fulorgnas) - **qoder:** coalesce concurrent PAT→job-token exchanges per PAT so high-concurrency / multi-agent bursts no longer stampede `openapi.qoder.sh/api/v1/jobToken/exchange` before the first exchange populates the completed-token cache; the shared exchange is also decoupled from any single caller's `AbortSignal` so one aborted waiter can't cancel it for the others. ([#5254](https://github.com/diegosouzapw/OmniRoute/pull/5254), [#5265](https://github.com/diegosouzapw/OmniRoute/pull/5265) — thanks @KooshaPari) - **proxy:** scope the fallback reachability cache by normalized target **URL** instead of by hostname, so a failed probe for one endpoint on a shared API host no longer suppresses a later probe for a different endpoint on that same host for the full TTL (host fallback is preserved for malformed URLs). ([#5261](https://github.com/diegosouzapw/OmniRoute/pull/5261) — thanks @KooshaPari) - **proxy:** cache failed fast-fail health probes with a short negative TTL instead of the full positive health TTL, so a single transient timeout/load blip no longer marks a working residential SOCKS5 proxy unreachable for the whole window (#5109 regression coverage added). ([#5255](https://github.com/diegosouzapw/OmniRoute/pull/5255) — thanks @KooshaPari) - **mcp:** forward HTTP auth to internal tool fetches so MCP tools that call back into the local API surface carry the caller's authorization. ([#5218](https://github.com/diegosouzapw/OmniRoute/pull/5218) — thanks @KooshaPari) - **logging:** preserve the **outbound provider request** headers in the detailed call-log Provider Request payload (previously the upstream **response** headers were shown there). chatCore now keeps executor-returned request headers when wrapping streaming and non-streaming responses; response headers stay scoped to the `Response`. ([#5257](https://github.com/diegosouzapw/OmniRoute/pull/5257) — thanks @rdself) - **sse:** scope textual ``/`` tag extraction so generic OpenAI-compatible paths don't rewrite prompt-format content into `reasoning_content`; an explicit opt-in keeps tag-native families (DeepSeek-R1, QwQ) working while Antigravity/Agy stay excluded by provider/model prefix. ([#5224](https://github.com/diegosouzapw/OmniRoute/pull/5224) — thanks @rdself) - **mcp:** break the `schemas/tools.ts ↔ schemas/toolSearch.ts` import cycle introduced when the `tool_search` defs (#5269) were extracted into their own module — `toolSearch.ts` imported `McpToolDefinition` from `tools.ts` while `tools.ts` imported `toolSearchTool` from `toolSearch.ts`, failing `check:cycles` on `release/v3.8.40`. The shared `AuditLevel` + `McpToolDefinition` types now live in a leaf `schemas/toolDefinition.ts` that both import; `tools.ts` re-exports them for backward compatibility. ([#5282](https://github.com/diegosouzapw/OmniRoute/pull/5282)) - **compression (analytics):** record attempted-but-no-op compression runs so Stacked is no longer invisible when it saves nothing. A `compression_analytics` row was previously written only on a net-positive saving, so a Stacked pipeline that ran on already-compact context produced no row — indistinguishable from "never dispatched". Such runs are now recorded with `skip_reason` and surfaced as a per-mode `skipped` count plus `totalSkipped`/`bySkipReason`; net-saving totals/averages are unchanged (skip rows excluded). ([#5277](https://github.com/diegosouzapw/OmniRoute/pull/5277), #4268 — thanks @abdulkadirozyurt, @androw) - **cli (tray):** fix `omniroute server --tray` showing no tray on macOS/Linux with no error printed. The Unix tray path loaded `systray2` through an inline loader that called `require("module")` inside an ESM `.mjs` file → `ReferenceError: require is not defined`, silently swallowed (regressed in v3.8.34); even if loaded, `systray2` is lazily installed into `~/.omniroute/runtime`, not `node_modules`. The loader now delegates to the runtime loader, the icon path is corrected, `isTemplateIcon` is `false` (the full-color icon rendered as a white square under macOS template mode), and tray start failures surface to stderr. ([#5276](https://github.com/diegosouzapw/OmniRoute/pull/5276), #4605 — thanks @ProgMEM-CC) - **agent-bridge (antigravity):** unwrap the cloudcode-pa `.request` envelope when converting Antigravity IDE requests. The real IDE sends `cloudcode-pa.googleapis.com/v1internal:generateContent` with the Gemini request nested under `.request`, but the bridge read those fields at the top level — yielding an empty conversation, so prompts hung mid-execution. The legacy `/v1beta/models/:generateContent` top-level shape still works. ([#5267](https://github.com/diegosouzapw/OmniRoute/pull/5267), #4294 — thanks @shabeer) - **dashboard:** add a GitHub releases fallback to the "Update Available" lookup. After the v3.8.28 npm-registry fallback, the banner could still stay hidden on networks that reach GitHub but not `registry.npmjs.org`. `resolveLatestVersion()` now tries npm CLI → npm registry → GitHub releases before giving up. ([#5266](https://github.com/diegosouzapw/OmniRoute/pull/5266), #4100) - **command-code:** omit `max_tokens` when the client omits it so the upstream applies the model's native default, fixing `400 "expected <=200000"` on `/alpha/generate` for high-cap models; an explicit oversized client value is clamped to the 200k endpoint ceiling. ([#5221](https://github.com/diegosouzapw/OmniRoute/pull/5221) — thanks @adivekar-utexas) ### 🔒 Security - **authz:** require auth for the `/v1beta/*` Gemini-compatible client API. `next.config.mjs` rewrote `/v1beta/:path*` → `/api/v1beta/:path*`, but `src/proxy.ts` didn't match `/v1beta` before the rewrite and `classifyRoute()` didn't classify `/api/v1beta/*` as client API — so unauthenticated `/v1beta/models/...:generateContent` traffic could reach the model-serving route without the central client-API auth policy. Both alias and rewritten forms are now classified `CLIENT_API`, enforcing Bearer auth when `REQUIRE_API_KEY` is enabled. ([#5274](https://github.com/diegosouzapw/OmniRoute/pull/5274) — thanks @rdself) - **sentinel:** security hardening pass across request handling. ([#5241](https://github.com/diegosouzapw/OmniRoute/pull/5241) — thanks @iamedwardngo) - **providers:** refresh impersonation User-Agents + TLS fingerprint profiles to current real-client versions; several had drifted or were inconsistent across files, a bot-detection/blocking risk. ([#5237](https://github.com/diegosouzapw/OmniRoute/pull/5237) — thanks @backryun) - **authz (public origin):** centralize browser-mutation origin validation into `src/server/origin/publicOrigin.ts` and wire it through the authz pipeline, replacing the per-route same-origin-only check that `403`'d dashboard mutations when served behind a reverse proxy on a different public origin. The module resolves the allowed public origin from configured base-URL env vars or trusted forwarded headers (only when `OMNIROUTE_TRUST_PROXY` is set **and** the peer is loopback/LAN via peer-stamp), validates `Sec-Fetch-Site` metadata, and sanitizes `Host`/`Forwarded` inputs (rejects control chars, userinfo, path/query in Host). Regression guards: `tests/unit/authz/public-origin.test.ts` + `tests/unit/authz/pipeline.test.ts`. ([#5278](https://github.com/diegosouzapw/OmniRoute/pull/5278) — thanks @Thinkscape / @abodera) ### 📝 Maintenance - **docs:** reorganize `docs/`, run an accuracy audit, and drop Node 20 from the supported matrix (rebased onto the current release tip). ([#5262](https://github.com/diegosouzapw/OmniRoute/pull/5262)) - **providers:** remove the discontinued Gemini CLI channel — Google shut it down on 2026-06-18; the supported migration path is Antigravity. ([#5246](https://github.com/diegosouzapw/OmniRoute/pull/5246) — thanks @rdself) - **dashboard:** add more provider icons from Lobehub. ([#5220](https://github.com/diegosouzapw/OmniRoute/pull/5220) — thanks @backryun) - **deps:** resolve `npm install` warnings, fix a runtime `Cannot find module` crash on `omniroute serve` (the runtime-env module was missing from the npm `files` allow-list), and clear the 4 moderate `npm audit` advisories. ([#5252](https://github.com/diegosouzapw/OmniRoute/pull/5252) — thanks @yunaamelia), ([#5230](https://github.com/diegosouzapw/OmniRoute/pull/5230), #5227) - **docker:** harden the base image against container-scan CVEs and skip comment lines in the #4076 builder heap-ordering check. ([#5229](https://github.com/diegosouzapw/OmniRoute/pull/5229), [#5233](https://github.com/diegosouzapw/OmniRoute/pull/5233)) - **ci:** Trivy advisory scan now ignores unfixed CVEs to cut Security-tab noise. ([#5235](https://github.com/diegosouzapw/OmniRoute/pull/5235)) - **test(combo):** reconcile the #4279 stop-guard test with the #5249 advance policy. ([#5300](https://github.com/diegosouzapw/OmniRoute/pull/5300)) - **test(targets):** add 14 unit tests for the shared `combo/targetExhaustion.ts` handler (provider-exhausted / connection-error / transient rate-limited classification), registered in the vitest discovery list. ([#5296](https://github.com/diegosouzapw/OmniRoute/pull/5296) — thanks @KooshaPari) --- ## [3.8.39] — 2026-06-28 ### ✨ New Features - **feat(oauth): remote Antigravity login via local helper + paste-credentials** — Antigravity (and other Google "native/desktop" OAuth providers) use Google's `firstparty/nativeapp` consent, which only releases the auth code when the loopback redirect (`127.0.0.1:`) is reachable from the approving browser. On a remote VPS install that loopback lives on the server, so the consent hangs forever and never emits a code — the "paste the callback URL" fallback has nothing to paste (a Google-side constraint, identical in upstream 9router). A new `omniroute login antigravity` CLI helper runs the OAuth on the user's **own** machine (where 127.0.0.1 works), exchanges the code, and prints a single-line `omniroute-cred-v1.…` credential blob; the dashboard's Antigravity Connect → Step 2 field now accepts that blob (alongside callback URLs) and persists the connection via a new `paste-credentials` action (server-side onboarding, provider-allowlisted, with the blob's embedded provider required to match the route). The SSH local-forward tunnel is documented as a zero-tooling alternative. See [`docs/guides/REMOTE-MODE.md`](docs/guides/REMOTE-MODE.md). ([#5203](https://github.com/diegosouzapw/OmniRoute/pull/5203)) - **feat(agent-bridge): graceful cert-install fallback for containers / headless** — when the MITM root CA can't be installed into the system trust store automatically (Docker / headless / no sudo / read-only trust store), the Agent Bridge no longer hard-fails on start with a generic "Certificate install failed". It now starts in skip mode and the dashboard surfaces a platform-specific **manual-install guide** (plus a CA download link) so the operator can trust the certificate by hand. The trust-cert endpoints return a structured `{ skippable, manualGuide }` response (HTTP 200) for environment failures instead of a 500; an explicit user cancellation is still reported distinctly. ([#4546](https://github.com/diegosouzapw/OmniRoute/issues/4546) — thanks @phuchptty) - **feat(compression): CCR ranged/grep/stats retrieval (ReDoS-safe, backward-compat)** — extends the `omniroute_ccr_retrieve` MCP tool and `/api/compression/retrieve` endpoint with optional `range` (byte/line slice), `grep` (ReDoS-safe literal or bounded-pattern match against stored lines), and `stats` (byte/line/word counts) parameters so agents pull exactly the slice or summary they need instead of re-expanding the entire stored block. All parameters are optional — no parameters returns the full block byte-identical to the existing behavior; the CCR store written by ionizer/fuzzy/headroom is fully compatible. Sixth item of the compression roadmap. ([#5187](https://github.com/diegosouzapw/OmniRoute/pull/5187)) - **feat(compression): TOON best-of-N candidate encoder + encoder A/B table** — adds `@toon-format/toon` as a candidate encoder in the headroom compression engine via a best-of-N scheme: both GCF and TOON run per prompt and the shorter result is kept, rather than hard-swapping encoders (GCF already encodes the headroom block and TOON is not a lossless universal win). An encoder A/B comparison table (GCF vs TOON vs JSON — bytes and cl100k tokens) is now surfaced in the compression studio. Fifth item of the compression feature-extraction roadmap (bench: [#5080](https://github.com/diegosouzapw/OmniRoute/pull/5080), gate: [#5127](https://github.com/diegosouzapw/OmniRoute/pull/5127), fuzzy/gate: [#5143](https://github.com/diegosouzapw/OmniRoute/pull/5143), ionizer: [#5148](https://github.com/diegosouzapw/OmniRoute/pull/5148)). ([#5163](https://github.com/diegosouzapw/OmniRoute/pull/5163)) ### 🔧 Bug Fixes - **fix(cli): global npm install no longer fails with "Cannot find module scripts/build/runtime-env.mjs"** — the v3.8.39 heap auto-calibration fix made `bin/cli/commands/serve.mjs` import `scripts/build/runtime-env.mjs`, but that file was never added to the `files` whitelist in `package.json`, so the published npm tarball shipped the importing CLI without the imported module — breaking **every** `npm install -g omniroute` at startup (regression, all platforms). The file is now whitelisted (`npm pack` confirms it ships) and added to the pack-artifact policy allow/required lists, and a new regression test scans all `bin/**` entrypoints for runtime imports resolving under `scripts/` and asserts each is covered by the package `files` whitelist, so any future unpackaged CLI import fails CI instead of users' installs. ([#5227](https://github.com/diegosouzapw/OmniRoute/issues/5227) — thanks @PriyomSaha, @m-Yaghoubi, @jonlwheat2-gif) - **fix(oauth): Antigravity refresh no longer nulls the stored refresh_token on an empty upstream response** — Google's OAuth token endpoint uses non-rotating refresh tokens: a refresh response normally OMITS `refresh_token` and occasionally returns it as an empty string. The Antigravity executor's `refreshCredentials` used `typeof tokens.refresh_token === "string" ? tokens.refresh_token : credentials.refreshToken`, and because `typeof "" === "string"` is true, an empty-string response overwrote the good token with `""` — nulling it on first refresh. The check now treats a non-string **or empty** value as absent and preserves the stored token, matching the canonical `refreshGoogleToken` (`tokens.refresh_token || refreshToken`) semantics. ([#3850](https://github.com/diegosouzapw/OmniRoute/issues/3850) — thanks @3xa228148) - **fix(api): LAN/Tailscale dashboard access — `ws:` CSP scheme, GET-exempt version route, surface combo field errors** — three failures when opening the dashboard from a non-loopback host: (1) CSP `connect-src` allowed the `ws:` scheme only for loopback origins, blocking the dashboard's `ws://:*` Live WebSocket from LAN/Tailscale clients; the bare `ws:` scheme is now permitted (symmetric with the bare `wss:` already allowed), kept declarative in `next.config.mjs` with no global middleware (the project has none by design); (2) `GET /api/system/version` was blocked by `LOCAL_ONLY_API_PREFIXES` for all methods despite only `POST` spawning child processes (git/npm/pm2) — a new `LOCAL_ONLY_API_GET_EXEMPTIONS` set exempts safe read methods for this path while keeping `POST`/`PUT`/`PATCH`/`DELETE` strictly loopback-only; (3) `COMBO_002` validation errors only surfaced the generic message — `firstField`/`firstMessage` are now extracted from the first Zod issue and included in the response body. ([#5083](https://github.com/diegosouzapw/OmniRoute/issues/5083) — thanks @KooshaPari for the diagnosis and original PR #5084) - **fix(sse): defer `` close so it never leaks before `tool_calls` in Claude→OpenAI streaming** — when a Claude thinking block was followed by a tool_use block, the translator unconditionally emitted a `content: ""` chunk at `content_block_stop`, injecting a spurious assistant text chunk immediately before the `tool_calls` delta and corrupting OpenAI-compatible clients (e.g. Kimi Coding). The close marker is now deferred: it is flushed at the first `text_delta` that follows the thinking block (preserving the #4633 / decolua/9router#454 behavior for Claude Code / Cursor) or at stream finish when no tool_calls were collected. Tool-use streams never get a `text_delta` after the thinking block, so `` is never emitted into content before `tool_calls`. ([#5123](https://github.com/diegosouzapw/OmniRoute/issues/5123)) - **fix(sse): normalize array user-message content in the Command Code executor to prevent upstream 400** — when a client sends a user turn whose `content` is an array of content parts (e.g. `[{type:"text",text:"…"}, …]`), the raw array was forwarded verbatim to the Command Code upstream, which requires `messages[N].content` for the `user` role to be a plain string — resulting in `expected string, received array` / HTTP 400 on DeepSeek V4-Pro and other Command Code models. The user branch of `convertMessages` now calls `normalizeContentText()` (already used by system, assistant, and tool branches) so multi-part user content is joined to a string before dispatch. Partially addresses ([#5166](https://github.com/diegosouzapw/OmniRoute/issues/5166)); the 0-output-token symptom on reasoning-only models is tracked separately. - **fix(mcp): return HTTP 404 (not 400) for an unknown/expired Streamable HTTP session id** — when an MCP session is terminated or idles out and the client reuses the stale `Mcp-Session-Id` header, the Streamable HTTP transport replied with HTTP 400. The MCP spec (2025-03-26 and 2025-11-25, Session Management) mandates HTTP 404 Not Found in that case, and spec-compliant clients only re-initialize a session on 404 — so the 400 was non-recoverable. The handler now returns 404 for a present-but-unknown session id, while a _missing_ session id on a non-initialize request correctly stays 400. ([#5169](https://github.com/diegosouzapw/OmniRoute/issues/5169) — thanks @czer323) - **fix(api): blocking "Auto (Zero-Config)" in Security settings now removes `auto/*` from `/v1/models`** — the built-in `auto/*` combo advertiser (#4164 / #4235) at the top of the models catalog ignored `settings.blockedProviders`, so checking **Auto (Zero-Config)** under Security → Blocked Providers had no effect and the model picker kept listing every `auto/*` entry. The injection loop now skips the entire `auto/*` block when the system provider `auto` (its id and alias are both `auto`) is blocked, consistent with how every other provider is filtered from the catalog. ([#5192](https://github.com/diegosouzapw/OmniRoute/issues/5192) — thanks @WslzGmzs) - **fix(cli): auto-calibrate the server V8 heap from physical RAM instead of a fixed 512MB default** — the server was spawned with a hard-coded `--max-old-space-size=512` (`omniroute serve`) or with no heap flag at all (Electron desktop, which then inherited the runtime's low ~512MB default), so RAM-rich machines still OOM-crashed under load (`FATAL ERROR: Ineffective mark-compacts near heap limit … ~500MB` at code=134) with many providers/accounts and large model catalogs (one report: 16GB RAM, 65 providers, ~100 accounts, ~2600 models). A new `calibrateHeapFallbackMb(os.totalmem())` helper derives the default heap as ~35% of physical RAM, clamped to `[512, 4096]`, and is wired into both `bin/cli/commands/serve.mjs` and `electron/main.js`. An explicit `OMNIROUTE_MEMORY_MB` (or a pre-set `--max-old-space-size`) still wins, so the #2939 override contract is unchanged. ([#5172](https://github.com/diegosouzapw/OmniRoute/issues/5172), [#5160](https://github.com/diegosouzapw/OmniRoute/issues/5160), [#5152](https://github.com/diegosouzapw/OmniRoute/issues/5152) — thanks @manchairwang, @Xyzjesus) - **fix(oauth): Antigravity login no longer hangs — fire-and-forget onboarding + bounded post-exchange** — the dashboard's Antigravity OAuth login spun indefinitely because `postExchange` awaited the `onboardUser` retry loop inline (up to 10 × 5 s per attempt, each fetch with no timeout), blocking the `/exchange` response forever. Matching the upstream 9router web flow: `onboardUser` now runs fire-and-forget in a background task; the `/exchange` endpoint is bounded by a 10 s hard timeout so it always returns; a progress endpoint lets the dashboard poll onboarding completion state. ([#5193](https://github.com/diegosouzapw/OmniRoute/pull/5193)) - **fix(antigravity): retry Antigravity accounts by quota family before escalating the combo** — when one Antigravity account returns a quota or rate-limit `429` for a Gemini model (e.g. `gemini-3.5-flash-medium`), combo orchestration could prematurely advance to the next combo model instead of trying other eligible Antigravity accounts for the same quota family. Antigravity quota-family awareness is now added to the fallback path so a `429` on one account triggers a bounded same-model retry across other Antigravity accounts sharing that quota bucket before the combo degrades to a lower-tier model. ([#5180](https://github.com/diegosouzapw/OmniRoute/pull/5180) — thanks @Ardem2025) - **fix(translator): accept Claude Messages shape in the non-stream malformed-200 guard** — when a Claude client (e.g. Claude Code) is routed to a non-Claude provider, the translated non-streaming response body is in Claude Messages shape (`type: "message", content[]`) produced by `convertOpenAINonStreamingToClaude`. `detectMalformedNonStream` only recognized OpenAI `choices[].message` and Responses API `output[]`, so this shape fell through to `empty_choices` → 502. The guard now recognizes the Claude Messages shape: text, tool_use, and thinking blocks carrying a `signature` count as valid output, while a genuinely empty `content: []` is still flagged. ([#5156](https://github.com/diegosouzapw/OmniRoute/pull/5156) — thanks @NomenAK) - **fix(sse): resolve nameless deepseek-web `` blocks via parameter-schema match** — when `chat.deepseek.com` emits a `` block with no `` child, no JSON body `name`/`type` key, and no tag suffix, every name-resolution path in `extractCall` returned `null` and the raw XML leaked to the client as plain text. A conservative schema-based fallback now compares the block's extracted parameter names against each declared tool's schema keys; if exactly one tool matches, its name is used. Zero or ambiguous (>1) matches still return `null` so no calls are misattributed. ([#5154](https://github.com/diegosouzapw/OmniRoute/issues/5154), [#5173](https://github.com/diegosouzapw/OmniRoute/pull/5173)) - **fix(stream): normalize provider safety finish reasons to `content_filter`** — Gemini and Antigravity can return safety/prohibited terminal reasons (`SAFETY`, `RECITATION`, `BLOCKLIST`, `PROHIBITED_CONTENT`) that OpenAI-compatible downstream clients do not recognize. A shared finish-reason normalization helper now maps these to the standard `content_filter` value, applied in both the streaming and JSON collection paths for both providers. ([#5197](https://github.com/diegosouzapw/OmniRoute/pull/5197) — thanks @rdself) - **fix(responses): normalize non-array Responses API `input` before routing** — the OpenAI Responses API accepts `input` as a string, object, or list, but OmniRoute only handled list-shaped payloads; a string or object `input` was silently dropped on the Responses→Chat Completions path. The translator now normalizes `input` to a list before dispatch; the Codex-native Responses path also normalizes before forwarding (preventing upstream `400 Input must be a list`); and the prompt-injection and PII sanitizer extraction paths are guarded against object-valued `input` so security checks do not throw. ([#5204](https://github.com/diegosouzapw/OmniRoute/pull/5204) — thanks @wilsonicdev) - **fix(zenmux): normalize vendor-prefixed GLM system roles for Z.AI models** — ZenMux exposes Z.AI GLM via vendor-prefixed OpenAI-compatible IDs such as `z-ai/glm-5.2`. The existing GLM detection only matched bare `glm-*`/`glm` ids, so `zenmux/z-ai/glm-5.2` kept system messages in place; Z.AI rejects compressed histories ending with a system turn before `assistant(tool_calls) → tool` sequences. The fix extends GLM detection to cover `z-ai/glm-*` prefixes and routes them through the existing `normalizeSystemRole` path. ([#5158](https://github.com/diegosouzapw/OmniRoute/pull/5158) — thanks @Thinkscape) - **fix(xai): add OAuth connection test probe + normalize xAI reasoning effort aliases** — xAI rejects unsupported reasoning effort values (`max`, `xhigh`) with HTTP 400 after a provider update; the xAI translator now maps `max` and `xhigh` to `high` before forwarding. Additionally, xAI OAuth connections had no dashboard test configuration, so provider tests returned `"Provider test not supported"`; a dedicated OAuth test probe is now wired for xAI accounts with regression coverage for the effort normalization. ([#5157](https://github.com/diegosouzapw/OmniRoute/pull/5157) — thanks @nguyenxvotanminh3) - **fix(serve): honour `HOSTNAME` from `.env` instead of hardcoding `0.0.0.0`** — `bin/cli/commands/serve.mjs` spread `process.env` into the child-process environment but immediately overwrote `HOSTNAME` with a literal `"0.0.0.0"`, silently discarding any user-configured bind address even though `HOSTNAME` is documented in `.env.example` and `docs/reference/ENVIRONMENT.md`. `dist/server.js` already read `process.env.HOSTNAME` correctly; only the CLI wrapper was overriding it. The fix applies `process.env.HOSTNAME || "0.0.0.0"` so the env value takes effect. ([#5134](https://github.com/diegosouzapw/OmniRoute/issues/5134), [#5170](https://github.com/diegosouzapw/OmniRoute/pull/5170) — thanks @anki1kr / @Angelo90810) - **fix(cli): force `NODE_ENV` to match dev/start run mode in the custom Next server** — when `.env.example` ships `NODE_ENV=production`, starting `npm run dev` via `scripts/dev/run-next.mjs` forwarded that value to the programmatic `next()` entry, which — unlike the `next` CLI — does not normalize it to match the run mode. The resulting production flag caused PostCSS to skip Tailwind's CSS transform, surfacing as `Module parse failed: Unexpected character '@'` on `globals.css`. The custom server now explicitly forces `NODE_ENV=development` for the `dev` path and `NODE_ENV=production` for the `start` path regardless of `.env`. ([#5189](https://github.com/diegosouzapw/OmniRoute/pull/5189) — thanks @backryun) - **fix(cli): raise dev server Node heap limit to 8 GB to prevent OOM** — `npm run dev` crashed with `FATAL ERROR: Ineffective mark-compacts near heap limit — Allocation failed - JavaScript heap out of memory` while compiling heavy dashboard routes because `node scripts/dev/run-next.mjs` ran on V8's ~4 GB default with no `--max-old-space-size` flag. The `dev` npm script now passes `--max-old-space-size=8192` at invocation time (the only point where this flag can be set for that process). ([#5198](https://github.com/diegosouzapw/OmniRoute/pull/5198) — thanks @backryun) - **fix(cli): re-enable Turbopack as the default `npm run dev` bundler** — PR #4092 forced webpack because an earlier Turbopack 16.2.x panic (`internal error: entered unreachable code: there must be a path to a root` in `turbopack-core/module_graph`) blocked the OmniRoute module graph. That panic no longer reproduces on the pinned Next 16.2.9, so `OMNIROUTE_USE_TURBOPACK` is flipped from `0` to `1` in `.env.example`, aligning it with `docs/reference/ENVIRONMENT.md` which had already documented the default as `1`. ([#5206](https://github.com/diegosouzapw/OmniRoute/pull/5206) — thanks @backryun) - **fix(auth): allow synthetic no-auth fallback for mimocode** — mimocode connections without explicit credentials were blocked before reaching the executor. The auth layer now permits a synthetic no-auth fallback for the mimocode provider so credential-free access patterns work as intended. ([#5205](https://github.com/diegosouzapw/OmniRoute/pull/5205) — thanks @KooshaPari) - **fix(combo): reject empty Responses API `output: []` as a fail-over trigger** — a non-streaming Responses API body with `object: "response"` and `output: []` was accepted as a valid HTTP 200 by the combo response-quality validator, allowing a combo target to stop rather than fail over to the next leg. The non-stream validator now inspects Responses-API-shaped bodies before the generic `output` shortcut and rejects an empty `output: []` as `empty_choices`; structural non-empty output (e.g. `function_call`) remains valid. ([#5207](https://github.com/diegosouzapw/OmniRoute/pull/5207) — thanks @KooshaPari) - **fix(proxy): close cached dispatchers when clearing the proxy cache** — cached proxy and direct-retry dispatchers were not closed on cache clear, leaking open connection handles. The cache-clear path now calls `close()` on all evicted dispatchers; dispatcher cache and lifecycle helpers have been extracted from the oversized proxy-dispatcher module into a dedicated helper for reuse. ([#5202](https://github.com/diegosouzapw/OmniRoute/pull/5202) — thanks @KooshaPari) - **fix(proxy): coalesce concurrent fast-fail health probes per proxy URL** — under high concurrency each simultaneous request opened its own TCP health probe for the same proxy URL, creating a thundering-herd burst. Concurrent proxy fast-fail checks are now coalesced so only one TCP probe runs per proxy URL at a time; the completed-result health cache is preserved so subsequent same-URL checks return immediately. ([#5109](https://github.com/diegosouzapw/OmniRoute/issues/5109), [#5208](https://github.com/diegosouzapw/OmniRoute/pull/5208) — thanks @KooshaPari) - **fix(pwa): prefer cached navigation before showing the offline page** — the service worker was too eager to display `/offline` on transient navigation failures. It now caches successful navigation responses and consults the cached route or app shell before falling back to `/offline`; `/offline` remains the final fallback when no cached navigation or app shell exists. ([#5165](https://github.com/diegosouzapw/OmniRoute/issues/5165), [#5209](https://github.com/diegosouzapw/OmniRoute/pull/5209) — thanks @KooshaPari) - **fix(request-logger): never render a negative percentage in the compression badge** — when every prompt token was compressed (`totalIn = 0, compressed > 0`), the compression pill displayed `(-100%)` because the badge format hard-coded a leading `-` before the percentage value. The badge now omits the negative sign in this case, correctly representing the saving as a positive ratio. ([#5201](https://github.com/diegosouzapw/OmniRoute/pull/5201) — thanks @KooshaPari) - **fix(dashboard): use amber for home update-step warning icon** — the warning-state icon in the home update steps (`HomePageClient.tsx`) used `text-yellow-500` (Tailwind `#eab308`), which has poor contrast on light backgrounds (~1.9:1, below WCAG AA) and is inconsistent with the `amber` warning convention used by every sibling element in the same component. Switched to `text-amber-500` — a one-line `className` change with no behavior change. ([#5176](https://github.com/diegosouzapw/OmniRoute/pull/5176)) ### 📝 Maintenance - **test(docker): de-brittle the #4076 builder-stage heap-ordering test (false-failing on a comment)** — `dockerfile-build-heap-4076.test.ts` located the `npm run build` step via a raw `findIndex(/npm run build\b/)`, which matched a **comment line** in the builder stage (`# … npm run build fails with "Module not…"`, added with the v3.8.40 workspace-deps Docker fix) sitting before the `NODE_OPTIONS` heap line — making the test report the heap ceiling as set _after_ the build and fail, even though the real `RUN … npm run build` correctly follows the `ENV NODE_OPTIONS` line. Instruction matching now skips Dockerfile comment lines (`# …`), so the ordering guard checks real `ENV`/`RUN` instructions only. The Dockerfile itself was already correct; this was a base-red blocking every PR into `release/v3.8.40`. ([#4076](https://github.com/diegosouzapw/OmniRoute/issues/4076)) - **test(combo): deterministic context-relay universal-handoff coverage** — covers the universal (provider-agnostic) session-handoff path in `context-relay` (`combo.ts:2099–2139`), which previously had only a definition-order assertion and a `TODO(phase-2)`. The test drives the real pipeline via session seams (`x-session-id` → `relayOptions.sessionId` → `maybeGenerateUniversalHandoff`) without live infrastructure. ([#5168](https://github.com/diegosouzapw/OmniRoute/pull/5168)) - **test(combo): end-to-end quota-share DRR routing-decision coverage (matrix parity)** — adds the missing E2E test for the `quota-share` strategy, driving the real `handleChat` → chatCore → `selectQuotaShareTarget` → executor pipeline via in-process seams and asserting which connection is dispatched. The DRR selector already had 29 unit tests; this closes the E2E gap and brings quota-share to parity with the 17-strategy public matrix. ([#5179](https://github.com/diegosouzapw/OmniRoute/pull/5179)) - **test(combo): deterministic context-relay codex quota-handoff coverage (closes last gap)** — covers the codex-specific handoff block of `context-relay` (`combo.ts:2143–2183`), which #5168 left documented-but-untested because it requires a `codex` connection. All seams (`fetchCodexQuota`, handoff generation, session relay) are mocked deterministically without live infra. ([#5195](https://github.com/diegosouzapw/OmniRoute/pull/5195)) - **test(ci): wire antigravity-quota-family test under `test:vitest` (fix test-discovery orphan)** — `open-sse/services/__tests__/antigravity-quota-family.test.ts` (introduced by #5180) was not collected by any active runner, causing `check:test-discovery` to report a new orphan and gate every subsequent PR on the release branch. The file is now added to `vitest.mcp.config.ts` `include` and the corresponding orphan-allowlist entry is removed. ([#5196](https://github.com/diegosouzapw/OmniRoute/pull/5196)) - **test(security): regression guard — PII redaction stays opt-in (default off) + Hard Rule #20** — adds a test asserting both `PII_REDACTION_ENABLED` and `PII_RESPONSE_SANITIZATION` feature-flag `defaultValue` fields are `"false"` and that data passes through all three application points (`piiMasker`, `piiSanitizer`, `streamingPiiTransform`) untouched when both flags are off, encoding Hard Rule #20 as a CI-enforced contract and fixing a misleading doc implication that PII masking was on by default. ([#5159](https://github.com/diegosouzapw/OmniRoute/pull/5159)) - **docs(i18n): add Traditional Chinese (zh-TW) README + update zh-CN** — adds a new Traditional Chinese translation (`docs/i18n/zh-TW/README.md`) and updates the Simplified Chinese README to the current English baseline; the language index (`docs/i18n/README.md`) and root `README.md` badge row are updated accordingly. ([#5162](https://github.com/diegosouzapw/OmniRoute/pull/5162) — thanks @lunkerchen) - **docs(i18n): full sync of zh-TW and zh-CN README to canonical English v3.8.39** — brings both translations to full parity, adding the complete What's New section, compression real-token examples, and all sections updated in the v3.8.38/39 English README. ([#5171](https://github.com/diegosouzapw/OmniRoute/pull/5171) — thanks @lunkerchen) - **docs(combo): sync combo/routing-strategy docs to current state + document test coverage** — removes a stale ordinal from the Fusion bullet in `README.md`; adds a new **Testing & Coverage** section to `docs/routing/AUTO-COMBO.md` documenting the deterministic strategy matrix (`npm run test:combo:matrix`), quota-share DRR E2E coverage, and context-relay handoff tests delivered across the v3.8.39 cycle. ([#5185](https://github.com/diegosouzapw/OmniRoute/pull/5185)) - **fix(docker):** copy the `open-sse` workspace manifest before `npm ci` so workspace-only deps install — the Dockerfile copied only the root `package*.json`, so `npm ci` skipped `safe-regex` and `@toon-format/toon` (declared in `open-sse/package.json`, not hoisted to root), breaking the multi-arch image build with `Module not found` during `npm run build`. (thanks @diegosouzapw) --- ## [3.8.38] — 2026-06-27 ### ✨ New Features - **feat(sidebar): colored menu icons** — sidebar menu icons now render with a per-item accent color: curated colors for known items (`SIDEBAR_ICON_ACCENTS`) plus a deterministic hash-based fallback (`getSidebarIconAccent`) so every item gets a stable, distinct color across sessions. ([#3812](https://github.com/diegosouzapw/OmniRoute/pull/3812) — thanks @rafacpti23) - **feat(providers): add Factory (factory.ai) as a subscription gateway provider** — `factory` (Factory Droids' hosted gateway) is now a first-class routing provider on the OpenAI-compatible `https://api.factory.ai/v1` endpoint with Bearer apikey auth; the key is supplied from the Dashboard connection (not env). ([#5065](https://github.com/diegosouzapw/OmniRoute/pull/5065) — thanks @KooshaPari) - **feat(providers): add Grok Build (xAI) provider with OAuth import-token flow** — `grok-cli` (alias `gc`) routes through Grok's CLI chat proxy; users paste their `~/.grok/auth.json` (or the JWT), with automatic `refresh_token` rotation. The public xAI client_id is embedded via `resolvePublicCred("grok_id")` (Hard Rule #11), never a literal. ([#5020](https://github.com/diegosouzapw/OmniRoute/pull/5020) — thanks @fulorgnas) - **feat(dashboard): click-to-edit model alias in the provider page** — click an alias to edit it inline (Enter/blur saves, Escape cancels), instead of only being able to delete and re-add it. ([#5119](https://github.com/diegosouzapw/OmniRoute/pull/5119) — thanks @waguriagentic) - **feat(providers): add ZenMux Free (session-cookie free-tier) provider** — `zenmux-free` (alias `zmf`) with a dedicated executor translating ZenMux's Anthropic-style SSE to OpenAI format; ships 12 free-tier models (DeepSeek V3.2, GLM 4.7 Flash Free, etc.). ([#5105](https://github.com/diegosouzapw/OmniRoute/pull/5105) — thanks @mrnasil) - **feat(providers): allow local/private provider URLs by default (`Allow Local Provider URLs` flag)** — adding/validating an OpenAI-compatible provider on a loopback/LAN address (e.g. `http://127.0.0.1:3264/api`) was rejected by the SSRF guard with "Blocked private or local provider URL", even though OmniRoute is local-first. A new `OMNIROUTE_ALLOW_LOCAL_PROVIDER_URLS` feature flag (default **ON**, toggle in Settings → Feature Flags) now scopes the provider-validation guard to allow local/private hosts while still blocking cloud-metadata endpoints (169.254.169.254, metadata.google.internal). Disable it to restore strict public-only blocking. Webhook/remote-image SSRF defaults are unchanged. ([#5066](https://github.com/diegosouzapw/OmniRoute/issues/5066), thanks @daniij) - **feat(blackbox):** refresh provider model catalog with latest models. (thanks @ptkelanatechsolutions) - **kiro**: inline `` stream splitter — when `enabled` is present, `assistantResponseEvent` content is now split into separate `delta.content` / `delta.reasoning_content` SSE chunks (new `open-sse/executors/kiroThinking.ts` module wired into `KiroExecutor.transformEventStreamToSSE`). - **feat(cursor):** parse Cursor Composer DeepSeek-style inline tool calls — Composer `cu/composer-2.5*` models embed tool invocations in their visible text using `<|tool▁calls▁begin|>…<|tool▁calls▁end|>` markers instead of structured protobuf frames; a new streaming parser (`composerToolCalls.ts`) intercepts these in both streaming and non-streaming paths, suppresses the markers from the client-visible content, and emits proper OpenAI `tool_calls` deltas so downstream clients handle them natively. (thanks @noestelar) - **feat(proxy):** support auth-less `host:port` batch import and surface proxy-test failures. (thanks @dimaslanjaka) - **feat(video): Alibaba DashScope video provider (`wan2.7-t2v`)** — adds the `alibaba` video provider (DashScope async task → poll → MP4) wired through the standard apikey credential path, so text-to-video requests can route to Alibaba's `wan2.7-t2v` model. (thanks @josevictorferreira) - **feat(cc): per-connection "summarized thinking display" toggle for Claude-Code-compatible providers** — exposes a connection-level toggle that drives the existing Copilot summarized-thinking marker, so operators can opt a CC-compatible connection into summarized reasoning display from the UI (schema + request defaults + provider modals, with i18n). (thanks @rdself) - **feat(compression): compression playground in the studio (Play + Compare tabs)** — `/dashboard/compression/studio` gains a synthetic playground: paste text → per-engine **lanes** (each deterministic engine run alone via `/api/compression/preview`) plus a **combined waterfall** ordered by `stackPriority`, and a free **A/B Compare** grid with on-demand, **USD-capped** fidelity verdicts (`/api/compression/compare` + `compare/verify`). The preview route now uses the real cl100k tokenizer, returns `engineBreakdown`, and accepts an ordered `pipeline[]`; new `compare` / `compare/verify` / `retrieve` routes; the live WS feed moved to `/dashboard/compression/live`. Management-only. ([#5080](https://github.com/diegosouzapw/OmniRoute/pull/5080)) - **feat(dashboard): expose Fusion `judgeModel` + `fusionTuning` in the combo editor** — the Fusion strategy editor now surfaces the judge model (synthesizes the panel answers; defaults to the first panel model) plus the quorum-grace tuning fields (`minPanel`, `stragglerGraceMs`, `panelHardTimeoutMs`) that `open-sse/services/fusion.ts` already reads. Schema-validated + bounded; empty tuning is never persisted. ([#5074](https://github.com/diegosouzapw/OmniRoute/pull/5074)) - **feat(compression): opt-in per-step fidelity gate for the stacked pipeline** — each compression step can now be guarded by a pure fidelity checker (4 invariants, fail-open) so a lossy engine that would degrade the prompt past a threshold is rejected and its lane skipped instead of silently shipping. Configurable via `fidelityGate` (advanced thresholds intentionally API-omitted), with a per-lane rejection breakdown surfaced in the studio playground toggle. ([#5143](https://github.com/diegosouzapw/OmniRoute/pull/5143)) - **feat(compression): fuzzy near-duplicate dedup (session-dedup 2nd pass)** — the session-dedup engine gains a second fuzzy pass that collapses near-duplicate (not just byte-identical) segments, with a playground toggle to compare on/off. ([#5143](https://github.com/diegosouzapw/OmniRoute/pull/5143)) - **feat(quota): opt-in Codex/Claude auto-ping keepalive** — an opt-in background keepalive can periodically ping Codex/Claude connections to keep their session/quota state warm, reducing cold-start failures on the first real request. ([#5102](https://github.com/diegosouzapw/OmniRoute/pull/5102)) - **feat(ops): SRE playbooks + ops helper scripts** — salvaged from a closed stale PR; adds operator runbooks and ops helper scripts. ([#5138](https://github.com/diegosouzapw/OmniRoute/pull/5138) — thanks @KooshaPari / @diegosouzapw) - **feat(mcp): web-session robustness — cookie dedup + browser-pool observability** — the MCP web-session path now de-duplicates cookies when (re)hydrating a session (avoiding conflicting duplicate `Cookie` headers) and exposes browser-pool observability (pool size / in-use / acquisition metrics) for the headless web providers. ([#5121](https://github.com/diegosouzapw/OmniRoute/pull/5121), builds on [#3368](https://github.com/diegosouzapw/OmniRoute/issues/3368)) - **feat(compression): Ionizer engine — lossy JSON-array sampling reversible via CCR** — a new compression engine that down-samples large JSON arrays to a representative subset and records a Compact Change Representation (CCR) so the omitted rows can be reconstructed, trading exactness for a large token reduction on tabular/array-heavy payloads. ([#5148](https://github.com/diegosouzapw/OmniRoute/pull/5148)) ### 🔧 Bug Fixes - **fix(sse): resolve nameless deepseek-web `` blocks via parameter-schema match** — when `chat.deepseek.com` emits a `` block with no tag suffix, no `` child, and no JSON body (only `` children), every existing name-resolution path returned `null` and the raw XML leaked to the client instead of being converted to a `tool_call`. `extractCall` now falls back to a conservative schema-based match: if the extracted parameter names are a subset of exactly one requested tool's declared schema keys, that tool name is used; zero or ambiguous (>1) schema matches still return `null` so no calls are misattributed. ([#5154](https://github.com/diegosouzapw/OmniRoute/issues/5154)) - **fix(proxy): make the SOCKS5 handshake timeout operator-tunable (`SOCKS_HANDSHAKE_TIMEOUT_MS`)** — under high concurrency against a single residential gateway host, the SOCKS5 connect handshake could exceed the hardcoded 10s even though the proxy was reachable, surfacing as a false `[Proxy Fast-Fail] Proxy unreachable` (the pool size is already tunable via `OMNIROUTE_PROXY_DISPATCHER_CONNECTIONS`). The handshake timeout now reads `SOCKS_HANDSHAKE_TIMEOUT_MS` (default unchanged at `10000`, capped at `120000`) so a concurrency-heavy deployment can raise it without a code change. Mitigation for #5109 (the full concurrency-100 collapse still needs the reporter's live load-test confirmation). ([#5109](https://github.com/diegosouzapw/OmniRoute/issues/5109)) - **fix(api): resolve `GET /v1/models/{id}` case-insensitively** — clients that normalise the model id (e.g. OpenCode requesting `minimax/minimax-m3` for the canonical catalog entry `minimax/MiniMax-M3`) missed the single-model lookup, which is case-sensitive, and fell back to advertising `context_length: 0`. `findModelById` now prefers an exact-case match and falls back to a case-insensitive match, so the real entry (and its context window) is returned regardless of casing. ([#5082](https://github.com/diegosouzapw/OmniRoute/issues/5082)) - **fix(services): embed WS proxy honours `LIVE_WS_HOST`; reject empty `messages` early** — two headless/Docker deployment fixes (#5110). The embed WebSocket proxy (`:20131`) only read `EMBED_WS_PROXY_HOST`, so behind a reverse proxy/tunnel it stayed bound to `127.0.0.1` even with `LIVE_WS_HOST=0.0.0.0` set and the Live dashboard showed "WebSocket disconnected"; it now falls back to `LIVE_WS_HOST` (default still loopback). Separately, a request with an explicitly empty `messages: []` array was forwarded upstream and bounced back as a confusing raw `400/502`; `handleChat` now rejects it up front with a clear `messages: at least one message is required` (Responses-API `input` requests are unaffected). ([#5110](https://github.com/diegosouzapw/OmniRoute/issues/5110)) - **fix(proxy): repair one-click Deno & Cloudflare relay deployments** — the `/api/settings/proxy/test` endpoint only recognized the `vercel` relay type, so testing a deployed Deno or Cloudflare relay returned `proxy.type must be http, https, or socks5` and never reached the relay; it now routes all relay types through `isRelayType()`. On installs with `STORAGE_ENCRYPTION_KEY` the relay-auth token is read via `extractRelayAuth` (encrypted `relayAuthEnc` form), fixing the silent `401` that left `publicIp` null. The Cloudflare Worker upload now sends the script part as `application/javascript` (the API rejects `application/javascript+module`; ES-module semantics come from `main_module`), and the proxy-registry schema accepts the `deno`/`cloudflare` types + `deno-relay`/`cloudflare-relay` sources so editing a deployed relay no longer 400s. ([#5128](https://github.com/diegosouzapw/OmniRoute/issues/5128)) - **fix(kiro): retire `claude-sonnet-4.5` from the Kiro catalog + pin the exact Kiro 400 error** — `claude-sonnet-4.5` left the Kiro free-tier lineup (current active models: Opus 4.8/4.7/4.6, Sonnet 4.6, Haiku 4.5), so it is removed from the Kiro registry entry and the free-model catalog. A regression test now pins Kiro's verbatim `[400] Invalid model. Please select a different model to continue.` to the `isModelUnavailableError` model-unavailable classification. A 400 on every model (including current ones) points to a server-side Kiro tier/region gate, not an OmniRoute catalog bug. ([#5140](https://github.com/diegosouzapw/OmniRoute/pull/5140), closes [#4484](https://github.com/diegosouzapw/OmniRoute/issues/4484)) - **fix(dashboard): preserve every rendered field when loading/saving Resilience settings** — `ResilienceTab` renders `comboCooldownWait` and `quotaShareConcurrencyLimit`, but both the initial-load and save paths rewrote component state without those fields, so after a successful `/api/resilience` response the cards received `undefined` and the page fell back to the generic "failed to load" state. A shared `toResilienceResponse()` mapper now keeps all rendered fields, and `PATCH /api/resilience` returns `quotaShareConcurrencyLimit` to match GET and the UI contract. ([#5139](https://github.com/diegosouzapw/OmniRoute/pull/5139) — thanks @rdself) - **fix(quota): hydrate the in-memory quota cache from snapshots + scope auto-combo candidates** — after a restart the quota cache was empty, so a known-exhausted connection looked healthy until re-queried; `isAccountQuotaExhausted` now lazily hydrates from persisted `quota_snapshots`. Auto-combo candidate expansion is also scoped to the connections each combo target actually allows, instead of pulling in every connection for the provider. ([#5015](https://github.com/diegosouzapw/OmniRoute/pull/5015) — thanks @JxnLexn) - **fix(resilience): harden quota cutoff, Gemini audio MIME, and model-lockout cooldown** — stored quota hard-cutoff values are no longer coerced to `enabled=true` from arbitrary strings; Gemini audio input parts have their MIME type validated/normalized before forwarding; and model lockout now honours the configured `maxCooldownMs` ceiling. ([#5093](https://github.com/diegosouzapw/OmniRoute/pull/5093) — thanks @KooshaPari) - **fix(streaming): harden long OpenAI-compatible SSE streams** — a late pipeline-wind-down error can no longer overwrite an already-recorded successful stream (`streamCompletionRecorded` guard), client disconnects finalize as `499 client_disconnected` instead of poisoning provider/account failure state, JSON bodies that are actually SSE (wrong `application/json` content-type) are sniffed and re-streamed, and reasoning fields (`reasoning`/`reasoning_content` + OpenRouter/Gemini encrypted `reasoning_details`) are preserved through the JSON-as-SSE fallback. ([#5124](https://github.com/diegosouzapw/OmniRoute/pull/5124) — thanks @rdself) - **fix(usage): dedupe request-usage logging and debounce stats events** — `saveRequestUsage` now guards against duplicate inserts (natural key: timestamp + provider + model + connection + api-key + token counts), back-fills a missing `endpoint`, and only emits `usageRecorded` when a row was actually inserted; stats `update`/`pending` event bursts are collapsed into a single debounced notification to reduce churn. ([#4940](https://github.com/diegosouzapw/OmniRoute/pull/4940) — thanks @nguyenxvotanminh3) - **fix(sse): convert the native Gemini request body to OpenAI format in the Antigravity MITM handler** — `contents` / `systemInstruction` / `generationConfig` / `thinkingConfig` are now translated to OpenAI chat-completions format before forwarding to `/v1/chat/completions`, so thinking-capable models (e.g. `ag/claude-opus-4-6-thinking`) no longer fail with provider-side 400 "invalid argument" errors. ([#4845](https://github.com/diegosouzapw/OmniRoute/pull/4845) — thanks @anuragg-saxenaa) - **fix(db): translate the two pt-BR SQLite driver-fallback log lines to English** — `[DB] Pré-inicializando sql.js WASM…` and `[DB] Drivers síncronos indisponíveis…` were the only non-English server log strings, mixing languages in the logs. Now `[DB] Pre-initializing sql.js WASM (synchronous drivers unavailable)…` / `[DB] Synchronous drivers unavailable — falling back to sql.js (WASM)`, guarded by a test that scans the driver path for accented log strings. ([#5103](https://github.com/diegosouzapw/OmniRoute/issues/5103)) - **fix(diagnostics): non-streaming Claude responses no longer false-502 as `empty_choices`** — the v3.8.37 malformed-200 detector (#4942) only understood OpenAI `choices` and Responses-API `output` shapes, so a `/v1/messages` response that stays in Claude shape (`{type:"message", content:[…]}`) fell through to `empty_choices` → 502 (cascading to "All models failed" in a combo). Most visibly, an extended-thinking turn whose buffered body is a single **empty thinking block with a valid `signature`** (Claude Code's non-streaming Bash classifier) 502'd on every call. `detectMalformedNonStream` now understands the Claude shape: text/tool_use blocks and thinking blocks carrying a signature count as valid output, while a genuinely empty `content:[]` is still flagged. ([#5108](https://github.com/diegosouzapw/OmniRoute/issues/5108), thanks @insoln) - **fix(combo): empty-content 502 now fails over within the same request instead of exhausting the provider** — a leg that answers HTTP 200 with no usable completion is rewritten to `502 "Provider returned empty content"`, but the combo exhaustion classifier treated that synthetic 502 as a connection-level failure (`#1731v2`) and marked the whole provider/connection exhausted, skipping every remaining **same-provider** leg in that request. The connection is actually healthy (it just returned an empty body), so empty-content 502s are now classified as model-level transient failures: the request advances to the next leg and the rest of that provider's legs stay eligible. Genuine gateway 502s still trip connection exhaustion. ([#5085](https://github.com/diegosouzapw/OmniRoute/issues/5085), thanks @andrea-kingautomation) - **fix(dashboard): surface the detailed credential-validation error instead of a bare "invalid" badge** — the inline "Check" in the Add-Connection modal discarded the `error` message returned by `/api/providers/validate` and showed only an `invalid` badge. For web providers (claude-web / chatgpt-web) the real cause is often an environment error the backend already reports (e.g. `TLS impersonation client failed to start: EACCES … mkdir tls-client-node/bin`), so users were left guessing. The modal now renders the full reason next to the badge. ([#5088](https://github.com/diegosouzapw/OmniRoute/issues/5088), thanks @tkhs101) - **fix(executors): strip `client_metadata` from forwarded body for Cerebras and Mistral** — Cerebras returns 400 (`wrong_api_format`) and Mistral returns 422 (`extra_forbidden`) when the passthrough body carries `client_metadata` (an OpenAI Codex / Claude CLI field with no equivalent on these upstreams). The default executor now drops it for these two providers before sending downstream; other providers (notably `openai`/`codex`) keep it. (thanks @saurabh321gupta) - **fix(codebuddy):** only send reasoning params when the client requests reasoning. (thanks @anki1kr) - **fix(sse):** keep streaming for forceStream providers when a JSON client requests it. Providers marked `forceStream:true` reject `stream:false` upstream (HTTP 400); `resolveStreamFlag` now guards against this so stream-only providers keep streaming even when the client sends `Accept: application/json` or `stream:false`. (thanks @anki1kr) - **fix(sse):** prevent non-JSON SSE lines and duplicate `[DONE]` from breaking clients. (thanks @qianze0628) - **fix(sse):** dedupe case-variant Anthropic headers in the executor `buildHeaders` path — Node/undici's `fetch` merges `anthropic-version` and `Anthropic-Version` into a single `"v, v"` value that the Anthropic API rejects, so both case variants are now collapsed to one canonical lowercase header (same for `anthropic-beta`). (thanks @Delcado19) - **oauth(kiro):** support Kiro IDC (organization) token import — when the `~/.aws/sso/cache` token carries a `clientIdHash`, auto-import now reads the linked client registration file to obtain `clientId`/`clientSecret`, probes the Kiro IDE `profile.json` for `profileArn` (ARN region normalized to `us-east-1` for the runtime gateway), and refreshes via the regional AWS OIDC endpoint instead of the social path; the import schema and modal forward these credentials so manual imports also work for IDC tokens. (thanks @enjoyer-hub) - **fix(translator):** preserve client `cache_control` breakpoints when routing Claude-format requests (e.g. Claude Code) to Alibaba DashScope's OpenAI-compatible providers (`alibaba` / `alibaba-cn`). The Claude→OpenAI translation previously stripped the markers from the system and message text blocks, so DashScope's explicit caching never engaged and every request was a cache miss. Cache hints now survive when preservation is requested for caching-capable OpenAI-format providers. (thanks @sacrtap) - **fix(tts):** resolve Gemini TTS models from catalog and add `gemini-3.1-flash-tts-preview` as the new default Vertex TTS model. (thanks @nguyenha935) - **fix(sse): don't cool down a healthy connection on a self-inflicted upstream timeout (504)** — when OmniRoute's own deadline elapses (surfaced as `TimeoutError`/`BodyTimeoutError` → 504), the connection is no longer disabled/failed-over, so a slow-but-healthy provider isn't penalised for our timeout. Genuine upstream 5xx/429 still trigger cooldown; antigravity keeps its own policy. (thanks @costaeder) - **fix(translator):** forward image `tool_result` blocks as `image_url` instead of stringifying base64. (thanks @alican532) - **fix(sse): robust Anthropic `/v1/messages` streaming — real ping keepalive + client-disconnect guard** — slow first tokens on reasoning models could trip strict clients' idle-read watchdog; the route now keeps the stream warm with a real `event: ping` (Anthropic clients ignore SSE comments) from the very first frame, and a client disconnect (AbortError / controller-closed) no longer counts as a provider failure (no failover/cooldown). (thanks @costaeder) - **fix: preserve model hidden flags (`isHidden`) across model sync** — `replaceCustomModels` pruned the compat-override list to the new custom-model ids, silently wiping the `isHidden` flag of eye-hidden SYNCED models on every periodic sync / import (all hidden models turned back on). The redundant cleanup is removed (per-model removal already handles its own compat cleanup), so eye-hidden models stay hidden across re-sync. ([#5086](https://github.com/diegosouzapw/OmniRoute/pull/5086) — thanks @herjarsa) - **fix(models): derive model-discovery config from the registry `modelsUrl`** — providers absent from the hardcoded `PROVIDER_MODELS_CONFIG` but carrying a registry `modelsUrl` (e.g. MiniMax) now get an auto-derived Bearer `/v1/models` discovery config, so "discover models" works instead of returning nothing. (thanks @herjarsa) - **fix(compression): resolve worker + rule/filter assets via runtime anchors (standalone bundle)** — the LLMLingua worker and the RTK rule/filter loaders relied on `fileURLToPath(import.meta.url)`, which the standalone bundle freezes to the build-machine path, so the worker never spawned and rule/filter packs failed to resolve. They now anchor on `process.cwd()`/`argv[1]` (with `pathToFileURL` for the worker URL). (thanks @fulorgnas) - **fix(api): sanitize error responses on seven management routes (Rule #12 hardening)** — `cli-tools/backups`, `cli-tools/guide-settings/[toolId]`, `logs/export`, `models/catalog`, `providers/test-batch`, `settings/import-json` and `usage/proxy-logs` no longer return raw `error.message`; they wrap caught errors in `sanitizeErrorMessage(...)`, and the routes are removed from the `check-error-helper` allowlist. (thanks @JxnLexn) - **fix(sse): keep `output_text`-only Responses bodies from being dropped/false-502'd** — some upstreams return a shorthand Responses body whose answer is only in `output_text` with an empty `output[]`. `sanitizeResponsesApiResponse` discarded the text, so the response then tripped the malformed-200 guard. The sanitizer now synthesizes an `output[]` message item from a non-empty `output_text` (complements the Claude-native fix in #5108; both stem from #4942). - **fix(executors): preserve a lone caller-supplied `Anthropic-Version` header casing** — the case-variant dedupe (#4846) unconditionally rewrote `Anthropic-Version`/`Anthropic-Beta` to lowercase even when only one variant was present, clobbering the caller's header. Dedupe now runs only when both case variants coexist (the actual undici-merge collision it was meant to fix). - **fix(responses):** default `text.format` to `{ type: "text" }` for openai-compatible **responses** providers — some Responses-compatible upstreams (e.g. LM Studio) reject a `text` object missing `text.format` with a 400 `missing_required_parameter`; the default executor now fills the Responses-API default before forwarding (guarded to `openai-compatible-*responses*`, never overwriting an existing format). (thanks @StevanusPangau) - **fix(translator): stop stripping client-provided `reasoning_content` for reasoning-replay providers** — the #4849 agentic-context strip (which drops `reasoning_content` from tool-call assistant turns to avoid O(n²) token growth) ran unconditionally, so replay providers (DeepSeek V4, Kimi K2, Qwen-Thinking, etc.) lost the client's reasoning and the reasoning-replay cache then overwrote it with a stale cached value (and such upstreams 400 without the original reasoning). The strip now skips reasoning-replay targets while non-reasoning providers keep the O(n²) protection. ([#5122](https://github.com/diegosouzapw/OmniRoute/pull/5122)) - **fix(providers): add MiniMax M3 & Nemotron 3 Ultra to the Cline catalog** — the two models were missing from Cline's provider catalog and could not be selected; both are now registered. ([#5136](https://github.com/diegosouzapw/OmniRoute/pull/5136), closes [#3321](https://github.com/diegosouzapw/OmniRoute/issues/3321)) - **fix(dashboard): key model-visibility toggle on the canonical `providerId`** — the per-model visibility toggle keyed off a display id, so toggling a model on one provider alias could mis-target another; it now keys on the canonical `providerId`. ([#5091](https://github.com/diegosouzapw/OmniRoute/pull/5091) — thanks @Theadd) - **fix(diagnostics): recognize the Claude API format in `detectMalformedNonStream`** — salvaged null-guard so a Claude-shaped non-streaming body is no longer misclassified. ([#5141](https://github.com/diegosouzapw/OmniRoute/pull/5141) — thanks @herjarsa / @diegosouzapw) - **fix(logging): track the final connection IDs in failover logs** — failover log lines now record the connection that actually served (or last failed) the request, instead of only the first attempt. ([#5016](https://github.com/diegosouzapw/OmniRoute/pull/5016) — thanks @JxnLexn) - **fix(sse): ignore disconnect races during in-band stream error handling** — a client disconnect that races with in-band upstream error handling no longer surfaces as a spurious provider failure. ([#5007](https://github.com/diegosouzapw/OmniRoute/pull/5007) — thanks @JxnLexn) - **fix(dashboard): surface the server error on `handleToggleCombo` failure** — a failed combo toggle now shows the backend error instead of silently no-op'ing. ([#5138](https://github.com/diegosouzapw/OmniRoute/pull/5138) — thanks @KooshaPari / @diegosouzapw) - **fix(quota): track provider quota reset windows + enrich the Codex playground** — observed quota reset windows are tracked and surfaced, and the Codex playground gains the enriched quota metadata. ([#5141](https://github.com/diegosouzapw/OmniRoute/pull/5141) — thanks @Witroch4 / @diegosouzapw) - **fix(sidebar): drop the orphan `settings` accent color** — removed a dangling accent-color entry that broke `typecheck:core`. ([#5142](https://github.com/diegosouzapw/OmniRoute/pull/5142)) - **fix(sse): preserve non-stream reasoning fields for compatible clients** — non-streaming responses now keep the upstream reasoning fields (`reasoning` / `reasoning_content` and OpenRouter/Gemini `reasoning_details`) instead of stripping them in `responseSanitizer`, so clients that render reasoning on buffered responses no longer lose it. ([#5155](https://github.com/diegosouzapw/OmniRoute/pull/5155) — thanks @rdself) - **fix(i18n): add missing English UI labels** — fills in untranslated English strings that were surfacing as raw keys in the dashboard. ([#5153](https://github.com/diegosouzapw/OmniRoute/pull/5153) — thanks @rdself) ### 🔒 Security - **fix(security): exact-host Anthropic `baseUrl` check** — the Anthropic base-URL guard used a substring match that a crafted host could partially satisfy; it now requires an exact host match (resolves CodeQL `js/incomplete-url-substring-sanitization` alert #674). ([#5130](https://github.com/diegosouzapw/OmniRoute/pull/5130)) ### 📝 Maintenance - **refactor(store): remove dead legacy store modules** — salvaged cleanup of unused legacy store code. ([#5138](https://github.com/diegosouzapw/OmniRoute/pull/5138) — thanks @JxnLexn / @diegosouzapw) - **test(combo): deterministic routing-decision matrix for all 17 strategies** — a deterministic E2E matrix pins the routing decision of every combo strategy. ([#5146](https://github.com/diegosouzapw/OmniRoute/pull/5146)) - **chore:** baseline reconciliations (complexity / file-size / cognitive), golden-snapshot + apikey-count alignment for new providers, orphan-test relocation, release base-red repairs, CHANGELOG i18n mirror sync, and an `actions/cache` 5→6 bump. ([#5145](https://github.com/diegosouzapw/OmniRoute/pull/5145), [#5144](https://github.com/diegosouzapw/OmniRoute/pull/5144), [#5125](https://github.com/diegosouzapw/OmniRoute/pull/5125), [#5126](https://github.com/diegosouzapw/OmniRoute/pull/5126), [#5120](https://github.com/diegosouzapw/OmniRoute/pull/5120), [#5117](https://github.com/diegosouzapw/OmniRoute/pull/5117), [#5112](https://github.com/diegosouzapw/OmniRoute/pull/5112)) - **test:** gated live smoke for combo strategies (in-process + VPS HTTP) and refreshed release expectations to match current code. ([#5151](https://github.com/diegosouzapw/OmniRoute/pull/5151), [#5150](https://github.com/diegosouzapw/OmniRoute/pull/5150) — thanks @KooshaPari / @diegosouzapw) --- ## [3.8.37] — 2026-06-26 ### ✨ New Features - **feat(providers):** add DGrid AI gateway provider — OpenAI-compatible gateway at `api.dgrid.ai/v1` (alias `dgrid`, API-key auth, passthrough models). Free router tier (10 RPM / 100 RPD); a $5 lifetime top-up raises limits to 20 RPM / 1,000 RPD. ([#4931](https://github.com/diegosouzapw/OmniRoute/pull/4931) — thanks @dgridOP) - **feat(providers):** add Pioneer AI (Fastino Labs) provider — OpenAI-compatible chat completions at `api.pioneer.ai/v1`. Registered with alias `pn`, `X-API-Key` auth, and a catalog of 10 open-tier serverless models (Qwen3, Llama 3.1/3.2, Gemma 3, SmolLM3). Free $75 credits, no credit card required. Gated enterprise models (Claude/GPT/Gemini) require prior fine-tuning on the Pioneer platform and are intentionally excluded from the catalog. ([#4909](https://github.com/diegosouzapw/OmniRoute/pull/4909) — thanks @HikiNarou) - **feat(providers):** add xAI Grok inbound translators and a thinking patcher — Grok requests are now translated on the inbound path and reasoning is normalized so Grok modes behave consistently across clients. ([#4910](https://github.com/diegosouzapw/OmniRoute/pull/4910) — thanks @mugnimaestra) - **feat(oauth):** Codex bulk-import endpoint — `POST /api/oauth/codex/import` accepts multiple Codex OAuth credentials in one call for fast multi-account onboarding. ([#4914](https://github.com/diegosouzapw/OmniRoute/pull/4914) — thanks @beaaan) - **feat(embeddings):** add a `dimensions` override field to embedding combos so an embedding combo can pin the output vector size per target. ([#4913](https://github.com/diegosouzapw/OmniRoute/pull/4913) — thanks @wenzetan) - **feat(sse):** auto-promote successful combo model — a new opt-in `comboAutoPromoteEnabled` setting reorders a combo's persisted model list so that, when a combo model responds successfully, it is moved to position #1 for future requests. ([#4852](https://github.com/diegosouzapw/OmniRoute/pull/4852) — thanks @arssnndr) - **feat(sse):** add toggleable tool-source diagnostics — an opt-in switch surfaces where each tool definition originated when debugging tool-routing issues. ([#4856](https://github.com/diegosouzapw/OmniRoute/pull/4856) — thanks @DuyPrX) - **feat(headroom):** proxy lifecycle management + dashboard UI — start/stop/monitor a Headroom compression proxy from the dashboard, with Docker sidecar support. ([#4649](https://github.com/diegosouzapw/OmniRoute/pull/4649) — thanks @diegosouzapw / @carmelogunsroses) - **feat(sse):** `x-omniroute-strip-reasoning` request header to drop `reasoning_content` from upstream responses (opt-in, preserving reasoning-aware clients). ([#4678](https://github.com/diegosouzapw/OmniRoute/pull/4678) — thanks @anuragg-saxenaa / @diegosouzapw) - **feat(cli):** multi-model support for the Factory Droid CLI integration. ([#4682](https://github.com/diegosouzapw/OmniRoute/pull/4682) — thanks @anuragg-saxenaa / @diegosouzapw) - **feat(sse):** parse Gemini CLI 429 `retryDelay` from the structured `RetryInfo` payload so cooldowns honor the upstream-provided backoff. ([#4738](https://github.com/diegosouzapw/OmniRoute/pull/4738) — thanks @NoxzRCW) - **feat(sse):** add GPT-4 and GPT-4o mini to the GitHub Copilot provider catalog. ([#4798](https://github.com/diegosouzapw/OmniRoute/pull/4798), [#4797](https://github.com/diegosouzapw/OmniRoute/pull/4797) — thanks @decolua) - **feat(api):** add the `MiniMax-M3` pricing row (canonical + lowercase alias) so the new MiniMax default model gets accurate per-request cost accounting instead of falling back to a zero/default rate. ([#4814](https://github.com/diegosouzapw/OmniRoute/pull/4814) — thanks @octo-patch) ### 🔧 Bug Fixes - **fix(sse):** dense, deterministic `response.output` ordering in `response.completed` — items are now sorted by their actual `output_index` (via a recorded-as-emitted accumulator + stable sort) instead of being rebuilt from unordered state dicts; `normalizeOutputIndex` replaces fragile `parseInt` calls for robust index coercion; superseded tool calls (replaced at the same index mid-stream) are excluded from the final output array. ([#4906](https://github.com/diegosouzapw/OmniRoute/pull/4906) — thanks @Marco9113) - **fix(sse):** normalize Codex custom/freeform tools (`apply_patch`, `type:"custom"` with no `parameters`) to a `{ input: string }` function schema instead of an empty schema — the empty schema made models invoke `apply_patch` with `{}`, breaking the Codex runtime which expects `{ input: string }`. Also maps `custom_tool_call` / `custom_tool_call_output` input items and streams `apply_patch` tool calls via `custom_tool_call_input.delta`/`.done` events. ([#4862](https://github.com/diegosouzapw/OmniRoute/pull/4862) — thanks @nstung463) - **fix(sse):** preserve the `required` array when translating Draft 2020-12 antigravity tool schemas (e.g. from OpenCode), stripping unsupported JSON Schema meta keywords while keeping mandatory arguments required so the model no longer calls tools without them. ([#4843](https://github.com/diegosouzapw/OmniRoute/pull/4843) — thanks @anuragg-saxenaa) - **fix(sse):** Kiro tool-schema sanitizer — strip unsupported JSON-Schema keywords (`anyOf`/`$ref`/`if`-`then`, etc.) and hash-truncate tool names >64 chars before dispatch, mapping the streamed tool-call name back for the client, so Kiro no longer rejects tool calls with `400 "Improperly formed request"`. ([#4847](https://github.com/diegosouzapw/OmniRoute/pull/4847) — thanks @smarthomeblack) - **fix(sse):** make the `anthropic-version` default-guard case-insensitive for `anthropic-compatible-*` providers, so a caller/operator-supplied `Anthropic-Version` (any casing) is no longer clobbered by a second lowercase `anthropic-version: 2023-06-01` header. ([#4823](https://github.com/diegosouzapw/OmniRoute/pull/4823) — thanks @zakirkun) - **fix(db):** validate HuggingFace API tokens via the `whoami-v2` endpoint as a pure auth probe so fine-grained Inference-Provider tokens (valid even when model/task endpoints reject them) are no longer falsely marked invalid; only 401/403 means an invalid key, other non-OK statuses surface as transient upstream errors. ([#4819](https://github.com/diegosouzapw/OmniRoute/pull/4819) — thanks @Delcado19) - **fix(sse):** reject the Anthropic-only `[1m]` context-1m suffix in `buildKiroPayload` before it reaches AWS Bedrock — Kiro is Bedrock-backed and cannot honor the beta, so a forwarded `kr/*[1m]` model id was malformed upstream; callers now get a clear error pointing them at a direct-Anthropic provider for 1M-context routing. ([#4816](https://github.com/diegosouzapw/OmniRoute/pull/4816) — thanks @Delcado19) - **fix(dashboard):** align the Engine Combos editor engines with the API schema — the named-combos pipeline dropdown offered four engines (`headroom`, `session-dedup`, `ccr`, `llmlingua`) that `PUT /api/context/combos/[id]` rejects, so selecting one made the save return 400 while the UI swallowed the error. The dropdown is now sourced from a single canonical engine map shared with `stackedPipelineStepSchema` (parity guarded by a unit test), and the editor surfaces save errors plus empty-name/empty-pipeline validation instead of failing quietly. ([#5062](https://github.com/diegosouzapw/OmniRoute/pull/5062) — closes #4955) - **fix(sse):** surface malformed HTTP-200 upstream responses instead of treating them as success, so combo fallback can trigger. ([#4942](https://github.com/diegosouzapw/OmniRoute/pull/4942) — thanks @haipham22) - **fix(antigravity):** retry transient upstream failures rather than failing the request outright. ([#4941](https://github.com/diegosouzapw/OmniRoute/pull/4941) — thanks @Jordannst) - **fix(sse):** exclude WS-bridge controller-closed errors from the provider circuit breaker so a client disconnect no longer trips the whole provider. ([#4870](https://github.com/diegosouzapw/OmniRoute/pull/4870) — closes #4602, thanks @huohua-dev) - **fix(sse):** resolve custom combos by id and case-insensitive name. ([#4869](https://github.com/diegosouzapw/OmniRoute/pull/4869) — closes #4446, thanks @herjarsa) - **fix(sse):** forward AI SDK image parts in the Responses translator. ([#4859](https://github.com/diegosouzapw/OmniRoute/pull/4859) — thanks @mugnimaestra) - **fix(sse):** emit valid concatenable Kiro `tool_calls.arguments` deltas. ([#4855](https://github.com/diegosouzapw/OmniRoute/pull/4855) — thanks @wahyuzero) - **fix(sse):** strip `temperature` for Claude models with extended thinking enabled (the upstream rejects it). ([#4853](https://github.com/diegosouzapw/OmniRoute/pull/4853) — thanks @noestelar) - **fix(sse):** unwrap the Qoder HTTP-200 SSE error envelope so combo fallback can trigger. ([#4850](https://github.com/diegosouzapw/OmniRoute/pull/4850) — thanks @vianlearns) - **fix(sse):** strip reasoning blobs from agentic context to prevent O(n²) token growth across multi-turn agent loops. ([#4849](https://github.com/diegosouzapw/OmniRoute/pull/4849) — thanks @GodrezJr2) - **fix(sse):** close the reasoning block before message content in the Responses stream so clients render reasoning and answer in the right order. ([#4848](https://github.com/diegosouzapw/OmniRoute/pull/4848) — thanks @kwanLeeFrmVi) - **fix(config):** sync the full SiliconFlow model list into the registry. ([#4844](https://github.com/diegosouzapw/OmniRoute/pull/4844) — thanks @letanphuc) - **fix(sse):** strip Composer `<|final|>` sentinel markers that leaked after Composer reasoning. ([#4842](https://github.com/diegosouzapw/OmniRoute/pull/4842) — thanks @noestelar) - **fix(build):** trace-include `sql.js`'s `sql-wasm.wasm` in the standalone bundle so SQLite-WASM works in the packaged build. ([#4839](https://github.com/diegosouzapw/OmniRoute/pull/4839) — thanks @Delcado19) - **fix(cli):** persist lazily-installed native runtime deps (`better-sqlite3`, `systray2`) to the shared runtime `package.json` with `--save-exact` instead of `--no-save`, so installing one no longer prunes the other as "extraneous" — fixing a "No SQLite driver available" failure after a `--tray` install. ([#4841](https://github.com/diegosouzapw/OmniRoute/pull/4841) — thanks @omartuhintvs) - **fix(sse):** resolve bare model names to a connection's `defaultModel` before upstream calls. ([#4825](https://github.com/diegosouzapw/OmniRoute/pull/4825) — thanks @anuragg-saxenaa) - **fix(api):** surface a Docker-localhost hint on provider-node validation connection errors. ([#4822](https://github.com/diegosouzapw/OmniRoute/pull/4822) — thanks @anuragg-saxenaa) - **fix(sse):** strip Gemini built-in tools when `functionDeclarations` are present in the Antigravity envelope (the two are mutually exclusive upstream). ([#4821](https://github.com/diegosouzapw/OmniRoute/pull/4821) — thanks @vanszs) - **fix(sse):** strip `X-Stainless-*` headers and normalize the SDK `User-Agent` for OpenAI-compatible endpoints. ([#4820](https://github.com/diegosouzapw/OmniRoute/pull/4820) — thanks @anuragg-saxenaa) - **fix(oauth):** allow a per-connection refresh lead-time override via `providerSpecificData.refreshLeadMs`. ([#4818](https://github.com/diegosouzapw/OmniRoute/pull/4818) — thanks @anuragg-saxenaa) - **fix(dashboard):** resolve passthrough model aliases by `providerId` in `ModelSelectModal`. ([#4815](https://github.com/diegosouzapw/OmniRoute/pull/4815) — thanks @anuragg-saxenaa) - **fix(sse):** strip `enumDescriptions` from Antigravity tool schemas. ([#4813](https://github.com/diegosouzapw/OmniRoute/pull/4813), [#4740](https://github.com/diegosouzapw/OmniRoute/pull/4740) — thanks @anuragg-saxenaa) - **fix(dashboard):** keep the desktop sidebar visible via an explicit CSS class. ([#4812](https://github.com/diegosouzapw/OmniRoute/pull/4812) — thanks @Delcado19) - **fix(sse):** filter nameless hosted tools when converting Responses API to Chat format. ([#4789](https://github.com/diegosouzapw/OmniRoute/pull/4789) — upstream, thanks Владимир Акимов) - **fix(sse):** stream-writer mock `abort()` now returns a Promise (test-stability fix). ([#4788](https://github.com/diegosouzapw/OmniRoute/pull/4788) — thanks @decolua) - **fix(sse):** use the WorkOS auth-token shape for Cline. ([#4787](https://github.com/diegosouzapw/OmniRoute/pull/4787) — thanks @apeltekci) - **fix(api):** fall back to the existing access token for any OAuth provider when a refresh fails. ([#4786](https://github.com/diegosouzapw/OmniRoute/pull/4786) — thanks @decolua) - **fix(sse):** read Antigravity usage from the `response.usageMetadata` envelope. ([#4785](https://github.com/diegosouzapw/OmniRoute/pull/4785) — thanks @decolua) - **fix(oauth):** verify Cursor installation on Linux before auto-import. ([#4770](https://github.com/diegosouzapw/OmniRoute/pull/4770) — upstream, thanks Ibrahim Ryan) - **fix(cli):** fall back to the default data dir when `DATA_DIR` is not writable. ([#4767](https://github.com/diegosouzapw/OmniRoute/pull/4767) — upstream, thanks Thiên Toán) - **fix(sse):** `json_schema` fallback for OpenAI-compatible providers that don't support structured outputs. ([#4766](https://github.com/diegosouzapw/OmniRoute/pull/4766) — thanks @mustafabozkaya) - **fix(cli):** verify launchd registration and skip self-SIGTERM in macOS autostart. ([#4765](https://github.com/diegosouzapw/OmniRoute/pull/4765) — thanks @ntdung6868) - **fix(sse):** finalize the `tool_calls` `finish_reason` on early stream end in the OpenAI Responses translator. ([#4764](https://github.com/diegosouzapw/OmniRoute/pull/4764) — thanks @decolua) - **fix(sse):** gate Kiro image attachments behind a Claude-capability check. ([#4763](https://github.com/diegosouzapw/OmniRoute/pull/4763) — thanks @decolua) - **fix(sse):** track Ollama streaming usage from raw NDJSON chunks. ([#4754](https://github.com/diegosouzapw/OmniRoute/pull/4754) — thanks @fresent) - **fix(sse):** include low-level cause details in `formatProviderError`. ([#4741](https://github.com/diegosouzapw/OmniRoute/pull/4741) — thanks @decolua) - **fix(executors):** `anthropic-compatible-*` gateways now get a `Bearer` token alongside `x-api-key`. ([#4729](https://github.com/diegosouzapw/OmniRoute/pull/4729) — thanks @hodtien) - **fix(translator):** strip the `x-anthropic-billing-header` in the claude-to-openai path. ([#4728](https://github.com/diegosouzapw/OmniRoute/pull/4728) — thanks @weimaozhen) - **fix(translator):** preserve `reasoning_effort` for non-Copilot Responses clients. ([#4688](https://github.com/diegosouzapw/OmniRoute/pull/4688) — thanks @ryanngit / @diegosouzapw) - **fix(codex):** treat an OAuth 401 as an unrecoverable refresh failure (stop retrying a dead token). ([#4686](https://github.com/diegosouzapw/OmniRoute/pull/4686) — thanks @sacwooky / @diegosouzapw) - **fix(translator):** coerce tool descriptions to strings in OpenAI normalization. ([#4675](https://github.com/diegosouzapw/OmniRoute/pull/4675) — thanks @East-rayyy / @diegosouzapw) - **fix(dashboard):** stop double-masking an already-masked API key in the list view (E2E 3/9 regression). ([#4671](https://github.com/diegosouzapw/OmniRoute/pull/4671) — thanks @diegosouzapw) - **fix(combo):** flatten Anthropic tool messages + tool history to prevent an upstream 503. ([#4648](https://github.com/diegosouzapw/OmniRoute/pull/4648) — thanks @warelik / @diegosouzapw) - **fix(providers):** require a Default Model in the compatible-provider API-key setup flow. ([#4641](https://github.com/diegosouzapw/OmniRoute/pull/4641) — thanks @arden1601) ### 🔒 Security - **fix(auth):** only trust forwarding headers (`X-Forwarded-For` / `X-Real-IP`) from loopback TCP peers, so a non-loopback client can't spoof its origin to bypass local-only route guards. ([#4689](https://github.com/diegosouzapw/OmniRoute/pull/4689) — thanks @Jordannst / @diegosouzapw) - **fix(sse):** redact the API key from the AUTH debug log in the chat handler. ([#4858](https://github.com/diegosouzapw/OmniRoute/pull/4858) — thanks @sacwooky) - **fix(oauth):** classify `/api/oauth/cursor/auto-import` as a local-only route in the route guard, so the loopback-enforced process-spawning endpoint can't be reached through a tunneled/leaked JWT (Hard Rule #17). ([#5070](https://github.com/diegosouzapw/OmniRoute/pull/5070) — thanks @diegosouzapw) ### 📝 Maintenance - **chore(ci):** harden the release flow — decouple the Quality Ratchet from coverage-shard flakes (`if: !cancelled()` + `--allow-missing`), add fast-path drift gates (`check:complexity`, `check:cognitive-complexity`, `check:pack-policy`, `check:build-scope`), and raise the default build heap to 8 GB. ([#5054](https://github.com/diegosouzapw/OmniRoute/pull/5054) — thanks @diegosouzapw) - **docs(routing):** sync the combo strategy docs for Fusion (17 strategies). ([#5067](https://github.com/diegosouzapw/OmniRoute/pull/5067) — thanks @diegosouzapw) - **test(sse):** golden-lock the `provider.ts` translate-path across all providers. ([#4734](https://github.com/diegosouzapw/OmniRoute/pull/4734) — thanks @diegosouzapw / @decolua) - **docs(env):** document `HEADROOM_URL` in `.env.example` + `ENVIRONMENT.md`. (thanks @diegosouzapw) - **chore(quality):** rebaseline the file-size ratchet across the rc17 PR-batch levas (leva2/leva3/leva4) to absorb cycle drift. (thanks @diegosouzapw) --- ## [3.8.36] — 2026-06-25 ### ✨ New Features **Quota-Share system** - **feat(quota):** introduce a dedicated `quota-share` combo strategy (Fase 3 #9) — Deficit Round Robin scheduling with per-model in-flight gating (P2C), automatic DB migration that promotes existing `qtSd/*` combos, and per-policy gating so invalid allocations cannot bleed `allow` to unintended connections. ([#4939](https://github.com/diegosouzapw/OmniRoute/pull/4939), [#4901](https://github.com/diegosouzapw/OmniRoute/pull/4901)) - **feat(quota):** multi-window usage buckets, per-(key,model) caps, and session stickiness — connections now track consumption across 5 h, 7 d, and per-model windows; `quota_allocation_model_caps` enforces per-key/model limits; session stickiness preserves prompt-cache integrity across turns. ([#4928](https://github.com/diegosouzapw/OmniRoute/pull/4928), [#4927](https://github.com/diegosouzapw/OmniRoute/pull/4927), [#4929](https://github.com/diegosouzapw/OmniRoute/pull/4929)) - **feat(quota):** headroom strategy + proactive saturation — new `headroom` combo strategy selects connections by available quota margin; universal proactive saturation via upstream token-usage response headers; real Claude quota saturation sourced from `/api/oauth/usage`. ([#4908](https://github.com/diegosouzapw/OmniRoute/pull/4908), [#4907](https://github.com/diegosouzapw/OmniRoute/pull/4907), [#4885](https://github.com/diegosouzapw/OmniRoute/pull/4885)) - **feat(quota):** concurrency control + cooldown-wait (Fase 2.1) — `max_concurrent` is enforced at dispatch time; quota-share combos queue concurrent requests with a short cooldown-wait and re-dispatch on slot availability (Variant A); a cron heal proactively restores connections after their window resets. ([#4965](https://github.com/diegosouzapw/OmniRoute/pull/4965), [#4970](https://github.com/diegosouzapw/OmniRoute/pull/4970), [#4967](https://github.com/diegosouzapw/OmniRoute/pull/4967), [#4900](https://github.com/diegosouzapw/OmniRoute/pull/4900)) **Combo routing** - **feat(combo):** task-aware routing strategy — routes requests to the best-fit connection based on task-type metadata, enabling per-task provider specialization within a combo. ([#4945](https://github.com/diegosouzapw/OmniRoute/pull/4945)) - **feat(combo):** Fusion strategy (16th strategy) — fan out to a configurable panel of models in parallel, then synthesize results through a judge model. ([#4652](https://github.com/diegosouzapw/OmniRoute/pull/4652)) - **feat(combos):** add an editable per-combo `description` field. The routing-combo form now has a Description input, persisted in the combo `data` blob via `/api/combos` (POST/PUT) and round-tripped through GET — no new DB column. ([#5005](https://github.com/diegosouzapw/OmniRoute/issues/5005)) - **feat(routing):** honor `X-Route-Model` request header to override `body.model`, enabling per-request model switching without modifying the request body. ([#4863](https://github.com/diegosouzapw/OmniRoute/pull/4863) — thanks @costaeder) **Providers & models** - **feat(providers):** update volcengine-ark model list, adding DeepSeek-V4-Flash and DeepSeek-V4-Pro. ([#4905](https://github.com/diegosouzapw/OmniRoute/pull/4905) — thanks @kenlin8827) - **feat(provider):** add CodeBuddy CN (`copilot.tencent.com`) — full OAuth + executor + model catalog stack. ([#4664](https://github.com/diegosouzapw/OmniRoute/pull/4664)) - **feat(opencode-go):** advertise `glm-5.2` and `kimi-k2.7-code` to align with official Go endpoints. ([#4711](https://github.com/diegosouzapw/OmniRoute/pull/4711)) - **feat(sse):** add Google Flow video-generation provider. ([#4769](https://github.com/diegosouzapw/OmniRoute/pull/4769)) - **feat(api/v1):** include alias-backed models in the `/v1/models` listing. ([#4630](https://github.com/diegosouzapw/OmniRoute/pull/4630)) **Proxy pool** - **feat(proxy-pool):** Cloudflare Workers proxy deployer + pool integration — deploy Cloudflare Workers relays directly from the dashboard and register them in the proxy pool. ([#4640](https://github.com/diegosouzapw/OmniRoute/pull/4640)) - **feat(proxy-pool):** Deno Deploy relays + group action buttons — deploy Deno Deploy relay workers and manage proxy groups with new bulk-action controls. ([#4643](https://github.com/diegosouzapw/OmniRoute/pull/4643)) **Compression & infrastructure** - **feat(compression):** Kiro/CodeWhisperer tool-result compression engine — dedicated compressor for Kiro/CodeWhisperer tool outputs integrated into the streaming pipeline. ([#4635](https://github.com/diegosouzapw/OmniRoute/pull/4635)) - **feat(endpoint):** per-endpoint custom system prompt injection. A toggle + text field in the Endpoint settings card lets users inject a custom system prompt into every model request, applied via the existing system-prompt engine. Stored in settings DB. ([#5022](https://github.com/diegosouzapw/OmniRoute/pull/5022) — thanks @whale9820) - **feat(live-ws):** allow non-loopback clients via `LIVE_WS_ALLOWED_HOSTS` env var, enabling multi-host setups to access the live WebSocket API. ([#4877](https://github.com/diegosouzapw/OmniRoute/pull/4877) — thanks @KooshaPari) - **feat(db):** track API endpoint dimension on `usage_history` for per-endpoint cost and usage analytics. ([#4676](https://github.com/diegosouzapw/OmniRoute/pull/4676)) --- ### 🔧 Bug Fixes **Translator** - **fix(translator):** regroup parallel tool results to be adjacent to their originating assistant turn, fixing tool-message ordering for providers that require strict interleaving. ([#4882](https://github.com/diegosouzapw/OmniRoute/pull/4882)) - **fix(translator):** preserve literal empty-string tool arguments in OpenAI-to-Claude streaming — they were previously dropped, causing tool calls to arrive with missing parameters. ([#4959](https://github.com/diegosouzapw/OmniRoute/pull/4959)) - **fix(translator):** normalize tools to Anthropic-native shape for non-Anthropic providers, ensuring tool definitions pass validation regardless of the format at the call site. ([#4650](https://github.com/diegosouzapw/OmniRoute/pull/4650)) - **fix(translator):** provider thinking compatibility — correct thinking-block serialization for DeepSeek and Gemini providers. ([#4946](https://github.com/diegosouzapw/OmniRoute/pull/4946)) - **fix(translator):** emit `` close marker for Anthropic thinking blocks, fixing truncated reasoning output in streamed responses. ([#4633](https://github.com/diegosouzapw/OmniRoute/pull/4633)) - **fix(translator):** normalize `developer` role to `system` for OpenAI-format providers. ([#4625](https://github.com/diegosouzapw/OmniRoute/pull/4625)) - **fix(translator):** strip top-level `client_metadata` on the OpenAI passthrough path (port from 9router#1157). ([#4624](https://github.com/diegosouzapw/OmniRoute/pull/4624)) - **fix(translator):** replay `reasoning_content` on plain Xiaomi MiMo turns (port from 9router#1321). ([#4639](https://github.com/diegosouzapw/OmniRoute/pull/4639)) **Copilot / GitHub executor** - **fix(copilot):** never route Gemini/Claude model variants to the `/responses` endpoint — these models require the chat-completions path only. ([#4627](https://github.com/diegosouzapw/OmniRoute/pull/4627)) - **fix(github):** route Copilot Codex models to `/responses` (port from 9router#102). ([#4626](https://github.com/diegosouzapw/OmniRoute/pull/4626)) - **fix(copilot,antigravity):** cap `maxOutputTokens` at 16384 to stop "Invalid Argument" 400 errors on high-token requests. ([#4636](https://github.com/diegosouzapw/OmniRoute/pull/4636)) - **fix(codex):** drop non-standard `codex.*` streaming events that break `responses.stream` consumers. ([#4715](https://github.com/diegosouzapw/OmniRoute/pull/4715) — thanks @jeffer1312) **Claude / Anthropic** - **fix(claude):** omit `adaptive_thinking` and `output_config.effort` for Haiku model variants, which reject those parameters. ([#4661](https://github.com/diegosouzapw/OmniRoute/pull/4661)) - **fix(claude):** skip `mcp__` tool-name cloaking and guard against missing `connectionId` to prevent crashes on Claude-native MCP tool calls. ([#4861](https://github.com/diegosouzapw/OmniRoute/pull/4861) — thanks @costaeder) - **fix(claude-oauth):** respect `429` backoff headers on the Claude OAuth usage endpoint to reduce polling spam during quota checks. ([#4655](https://github.com/diegosouzapw/OmniRoute/pull/4655)) **Routing & SSE** - **fix(sse):** fail over on `400` responses that carry rate-limit text in the body, not just on canonical `429` status codes. ([#4986](https://github.com/diegosouzapw/OmniRoute/pull/4986)) - **fix(sse):** honor per-account proxy and fingerprint-rotation settings in the opencode executor. ([#4989](https://github.com/diegosouzapw/OmniRoute/pull/4989)) - **fix(sse):** soft-penalize exhausted providers in auto-combo scoring instead of hard-excluding them, improving fallback resilience. ([#4990](https://github.com/diegosouzapw/OmniRoute/pull/4990)) - **fix(sse):** drop the CCP pin when the pinned provider is durably unhealthy, with anti-flap logic to prevent oscillation. ([#4864](https://github.com/diegosouzapw/OmniRoute/pull/4864) — thanks @costaeder) - **fix(combo):** fetch models dynamically from custom provider endpoints instead of relying on a static list. ([#4860](https://github.com/diegosouzapw/OmniRoute/pull/4860)) - **fix(combo):** propagate the selected connection ID to fallback error responses so model lockout applies to the correct connection rather than the wrong fallback target. ([#4809](https://github.com/diegosouzapw/OmniRoute/pull/4809) — thanks @Chewji9875) - **fix(sse):** skip third-party tool-name cloaking for Anthropic-native server tools to prevent naming conflicts. ([#4808](https://github.com/diegosouzapw/OmniRoute/pull/4808) — thanks @NomenAK) **Quota** - **fix(quota):** quota-exclusive `qtSd/*` connections now appear in `/v1/models` listings; EPSILON-threshold check no longer falsely blocks under-budget allocations. ([#4830](https://github.com/diegosouzapw/OmniRoute/pull/4830)) - **fix(quota):** migration 107 correctly activates the `quota-share` strategy on existing `qtSd/*` combos. ([#4962](https://github.com/diegosouzapw/OmniRoute/pull/4962)) **API / responses** - **fix(api):** parse the `/v1/responses` request body once instead of 3–4 times on the hot path, reducing per-request overhead. ([#4958](https://github.com/diegosouzapw/OmniRoute/pull/4958)) - **fix(api):** evict stale in-memory rate-limit windows to stop a slow heap leak on long-running instances. ([#4957](https://github.com/diegosouzapw/OmniRoute/pull/4957)) - **fix(api):** require authentication on the compression `run-telemetry` endpoint; document `OMNIROUTE_EVAL_CREDENTIALS` env var. ([#4796](https://github.com/diegosouzapw/OmniRoute/pull/4796)) - **fix(api):** stop `GET /api/system/env/repair` returning HTTP `500` on packaged installs (it broke the onboarding wizard). `createRequire(import.meta.url)` ran at module top-level; once webpack bundles the route into the standalone build, `import.meta.url` is frozen to the build-machine path and `createRequire` throws during evaluation, so the whole route failed to load. `createRequire` is now resolved lazily inside the guarded `better-sqlite3` block, root-dir resolution falls back to `process.cwd()`, and the route passes an explicit `rootDir`. ([#5028](https://github.com/diegosouzapw/OmniRoute/pull/5028)) **Dashboard** - **fix(dashboard):** show custom provider given-name instead of internal id across dashboard pages — cache, combo health, compression analytics, cost overview, health/autopilot, provider stats, route explainability, provider utilization, runtime. Adds shared `resolveProviderName` resolver and `useProviderNodeMap` hook. (#4603) - **fix(dashboard):** on OAuth providers (e.g. GLM Coding), "Test all models" with auto-hide-failed now switches the model list to the "visible" filter after the run, so just-hidden failed models actually disappear on-screen — parity with the passthrough-provider path (#3610). Previously they were hidden in the DB but stayed visible under the "All" filter, so it looked like nothing was hidden. (#4887) - **fix(dashboard):** restore the home-page provider topology card that was hidden by a default state change in #4596. ([#4963](https://github.com/diegosouzapw/OmniRoute/pull/4963)) - **fix(dashboard):** proxy-pool success gating, sync-timestamp persistence, and opt-in Redis backend. ([#4988](https://github.com/diegosouzapw/OmniRoute/pull/4988)) - **fix(dashboard):** show custom vision models in the LLM selector dropdown. ([#4653](https://github.com/diegosouzapw/OmniRoute/pull/4653)) **Providers** - **fix(pollinations):** stop forcing `jsonMode` on every request. Pollinations treats `jsonMode=true` as "the model MUST return JSON" and rejects (HTTP 400 "messages must contain the word 'json'") any normal chat request whose messages don't mention "json", so all non-JSON chat was broken. `jsonMode` is now only enabled when the caller actually requests JSON output (`response_format.type` of `json_object` or `json_schema`). (#3981) - **fix(antigravity):** default `safetySettings` to all-OFF for parity with the native Gemini paths. The Antigravity (Google Cloud Code) request builder set `safetySettings: undefined`, which `JSON.stringify` drops — so no safety settings reached Google and its server-side defaults false-flagged benign technical prompts as `prohibited_content` (HTTP 200 + blocked body, which combo failover treats as terminal). Now honors a caller-supplied value and otherwise defaults to `DEFAULT_SAFETY_SETTINGS`, matching the claude-to-gemini / openai-to-gemini paths. (#5003) - **fix(antigravity):** exclude the standard Gemini rate-limit message from quota-exhaustion keyword matching to prevent false-positive saturation flags. ([#4810](https://github.com/diegosouzapw/OmniRoute/pull/4810) — thanks @Chewji9875) - **fix(chatgpt-web):** map the advertised `gpt-5.5`, `gpt-5.5-pro`, `gpt-5.4-pro` and `gpt-5.2-pro` catalog ids to their dash-form ChatGPT backend slugs. They were missing from `MODEL_MAP`, so the executor sent the dot-form id verbatim, which the ChatGPT backend silently ignored and served the default Plus model instead of the requested one. Adds a drift guard asserting no advertised dot-form id reaches the backend verbatim. (#4665) - **fix(gemini):** preserve the `pattern` field in the Antigravity tool schema sanitizer to avoid stripping valid regex constraints from tool definitions. ([#4651](https://github.com/diegosouzapw/OmniRoute/pull/4651)) - **fix(opencode):** preserve DeepSeek reasoning content in streamed responses. ([#4631](https://github.com/diegosouzapw/OmniRoute/pull/4631)) - **fix(perplexity):** validate API keys via the `/v1/models` endpoint instead of issuing a full chat request. ([#4654](https://github.com/diegosouzapw/OmniRoute/pull/4654)) - **fix(qoder):** exchange PAT for `jt-*` job token before initiating Cosy chat, fixing auth failures after the Qoder credential format change. ([#4884](https://github.com/diegosouzapw/OmniRoute/pull/4884)) - **fix(executors):** strip parameters unsupported by the target provider/model to prevent `400 Invalid parameter` errors on strict endpoints. ([#4658](https://github.com/diegosouzapw/OmniRoute/pull/4658)) - **fix(executors):** preserve literal `reasoning_effort: "max"` for Ollama Cloud instead of normalizing to `xhigh`. Ollama Cloud accepts `high|medium|low|max|none` and rejects `xhigh` (`invalid reasoning value: 'xhigh'`); OpenRouter DeepSeek `max→xhigh` normalization is unaffected. ([#4993](https://github.com/diegosouzapw/OmniRoute/pull/4993) — thanks @Thinkscape) - **fix(headroom):** translate openai-responses input through OpenAI for external compression. `adaptBodyForCompression` now serialises `function_call_output` items whose `output` field is a JSON object (not a string) so compression engines can process the content — previously those items were excluded from compression because `hasTextContent()` returned false for object values. ([#5023](https://github.com/diegosouzapw/OmniRoute/pull/5023) — thanks @anki1kr) - **fix(proxy):** fan out direct dispatcher streams to all registered proxy endpoints. ([#4803](https://github.com/diegosouzapw/OmniRoute/pull/4803) — thanks @makcimbx) **Compression** - **fix(compression):** eliminate ReDoS in the `math_inline` preservation regex — the previous pattern could catastrophically backtrack on untrusted input. ([#4838](https://github.com/diegosouzapw/OmniRoute/pull/4838)) - **fix(compression):** stop RTK over-truncating file-read tool results — RTK now respects the full content length for file-read outputs. ([#4987](https://github.com/diegosouzapw/OmniRoute/pull/4987)) **Build / CLI / infrastructure** - **fix(build):** drop `@omniroute/open-sse` from `optimizePackageImports` to fix the Next.js build OOM crash. ([#4968](https://github.com/diegosouzapw/OmniRoute/pull/4968)) - **fix(cli):** SIGKILL the systray child PID before closing the IPC channel to prevent macOS NSStatusItem orphan processes. ([#4732](https://github.com/diegosouzapw/OmniRoute/pull/4732)) - **fix(cli):** bump `better-sqlite3` runtime pin to 12.10.1 for Node 26 compatibility. ([#4685](https://github.com/diegosouzapw/OmniRoute/pull/4685)) - **fix(cli):** harden the systray2 tray runtime (port from 9router#1080). ([#4628](https://github.com/diegosouzapw/OmniRoute/pull/4628)) - **fix(cli-tools):** tolerate JSONC (comments and trailing commas) in tool settings files. ([#4659](https://github.com/diegosouzapw/OmniRoute/pull/4659)) - **fix(install):** make the `transformers` dependency optional so CUDA-host installs that lack Python bindings succeed. ([#4807](https://github.com/diegosouzapw/OmniRoute/pull/4807) — thanks @megamen32) - **fix(db):** correct storage tuning settings to prevent WAL runaway on high-write workloads. ([#4834](https://github.com/diegosouzapw/OmniRoute/pull/4834) — thanks @rdself) - **fix(image):** prevent compatible nodes from shadowing provider aliases in the image routing table. ([#4656](https://github.com/diegosouzapw/OmniRoute/pull/4656)) **Plugin** - **fix(plugin):** opencode `auth.json` dual-key fallback for the auto-prefix migration. The config hook now looks up both the prefixed (`opencode-omniroute`) and bare (`omniroute`) keys, so users who authenticated before the `opencode-` prefix landed no longer need to re-auth. ([#5027](https://github.com/diegosouzapw/OmniRoute/pull/5027) — thanks @herjarsa) --- ### 🔒 Security - **fix(security):** block SSRF allowlist bypass via `x-relay-path` header manipulation on Deno/Vercel relays. ([#4899](https://github.com/diegosouzapw/OmniRoute/pull/4899)) - **fix(security):** pin image-fetch DNS resolution to prevent SSRF DNS-rebinding attacks (GHSA-cmhj-wh2f-9cgx). ([#4634](https://github.com/diegosouzapw/OmniRoute/pull/4634)) - **fix(security):** do not trust the loopback socket as local-only when the server is behind a reverse proxy, closing a potential auth bypass path. ([#4632](https://github.com/diegosouzapw/OmniRoute/pull/4632)) - **fix(security):** validate the Kiro region parameter to prevent SSRF via crafted region strings (GHSA-6mwv-4mrm-5p3m). ([#4629](https://github.com/diegosouzapw/OmniRoute/pull/4629)) - **fix(copilot):** replace `execSync` shell interpolation with `execFile` in the `runOmniRouteCli` tool to prevent command injection. The user-supplied command is now split into an argv array and passed to `execFile` (no shell), so shell metacharacters are treated as literal text; error output is routed through `sanitizeErrorMessage()`. ([#5024](https://github.com/diegosouzapw/OmniRoute/pull/5024) — thanks @hamsa0x7) - **fix(db):** replace `Math.random` with `crypto.randomUUID` for database ID generation, removing a non-cryptographic source of collision/predictability in generated identifiers. ([#5026](https://github.com/diegosouzapw/OmniRoute/pull/5026) — thanks @hamsa0x7) --- ### 📝 Maintenance **God-file decomposition (continued, #3501)** - **refactor(chatCore):** extracted 12 focused helpers from `chatCore.ts` covering the streaming pipeline (`assembleStreamingPipeline`), cache-store logic (`storeStreamingSemanticCacheResponse`, `storeSemanticCacheResponse`), response headers (`assembleStreamingResponseHeaders`, `buildNonStreamingResponseHeaders`), JSON→SSE bridge (`maybeConvertJsonBodyToSse`), guardrail context (`buildPostCallGuardrailContext`), usage buffer (`applyClientUsageBuffer`), plugin hook (`runPluginOnRequestHook`), analytics (`writeCompressionAnalytics`, `emitOutputStyleTelemetry`), and compression predicates/settings (`resolveCompressionSettings`, et al.). ([#4811](https://github.com/diegosouzapw/OmniRoute/pull/4811)–[#4837](https://github.com/diegosouzapw/OmniRoute/pull/4837)) - **refactor(sse/db/api):** continued decomposition of `services/usage.ts` (extracted quota-core, scalar/format helpers, Antigravity/GLM/MiniMax usage families), `db/core.ts` (schema-column reconciliation, snake↔camel column-mapping), `db/apiKeys.ts` (row-parsers, model-permission matching), and `validation.ts` (URL/headers/transport leaf layer, web-cookie/Meta-AI validators, enterprise-cloud + probe, audio/speech/apikey, search/embedding/rerank, and OpenAI/Anthropic format validators). ([#4921](https://github.com/diegosouzapw/OmniRoute/pull/4921)–[#4956](https://github.com/diegosouzapw/OmniRoute/pull/4956)) - **refactor(pricing/providers):** decomposed `pricing.ts` into shared tiers + partitioned `DEFAULT_PRICING` modules, and split the `providers.ts` catalog into semantic data modules organized by provider family. ([#4917](https://github.com/diegosouzapw/OmniRoute/pull/4917), [#4918](https://github.com/diegosouzapw/OmniRoute/pull/4918)) - **refactor(open-sse):** extract `safeParseJSON` utility and dedup `tryParseJSON` call sites; extract and dedup the fallback `tool_call` ID generation helper. ([#4735](https://github.com/diegosouzapw/OmniRoute/pull/4735), [#4736](https://github.com/diegosouzapw/OmniRoute/pull/4736)) **Quality & CI** - **chore(quality):** release base-red reconciliation + ratchet rebaselines — file-size, env-doc, and catalog baseline updates across multiple gates. ([#4630](https://github.com/diegosouzapw/OmniRoute/pull/4630), [#4879](https://github.com/diegosouzapw/OmniRoute/pull/4879), [#4886](https://github.com/diegosouzapw/OmniRoute/pull/4886), [#4915](https://github.com/diegosouzapw/OmniRoute/pull/4915), [#4961](https://github.com/diegosouzapw/OmniRoute/pull/4961), [#4973](https://github.com/diegosouzapw/OmniRoute/pull/4973)) - **ci(quality):** shift heavy validation gates to the PR→release merge fast-path to accelerate the release cycle. ([#4857](https://github.com/diegosouzapw/OmniRoute/pull/4857)) - **fix(ci):** include `coverage/lcov.info` in the coverage-report artifact so SonarQube can consume it. ([#4670](https://github.com/diegosouzapw/OmniRoute/pull/4670)) - **fix(test):** validate Anthropic-compatible connections via `POST /v1/messages` for accurate connectivity testing. ([#4657](https://github.com/diegosouzapw/OmniRoute/pull/4657)) **Docs** - **docs(resilience):** document Quota-Share Concurrency Control — `max_concurrent` enforcement, serialization behavior, and cooldown-wait semantics. ([#4980](https://github.com/diegosouzapw/OmniRoute/pull/4980)) - **docs(perf):** add per-endpoint p50/p95/p99 latency and cost budgets reference. ([#4867](https://github.com/diegosouzapw/OmniRoute/pull/4867) — thanks @KooshaPari) - **docs(ops):** add canonical incident response runbook. ([#4868](https://github.com/diegosouzapw/OmniRoute/pull/4868) — thanks @KooshaPari) - **docs(ops):** document the release-green family — `green-prs`, `check:release-green`, `babysit`, and nightly gate workflows. ([#4679](https://github.com/diegosouzapw/OmniRoute/pull/4679)) - **docs(agentbridge):** document Electron `NODE_EXTRA_CA_CERTS`, real model IDs, and identity caveat for agent bridge integrations. ([#4718](https://github.com/diegosouzapw/OmniRoute/pull/4718)) - **docs:** clarify Kiro provides ~50 credits/month per account, not unlimited. ([#4690](https://github.com/diegosouzapw/OmniRoute/pull/4690)) **Misc** - **chore(claude,codex):** bump pinned CLI identities — Claude `2.1.158 → 2.1.187`, Codex `0.132.0 → 0.142.0`. ([#4883](https://github.com/diegosouzapw/OmniRoute/pull/4883)) - **chore(dashboard):** rename Qoder display label from "Qoder AI" to "Qoder". ([#4733](https://github.com/diegosouzapw/OmniRoute/pull/4733)) --- ## [3.8.35] — 2026-06-23 ### ✨ New Features - **Adaptive context compression (Phase 4)**: a four-layer compression upgrade landed across stacked PRs — an **Output Styles** registry (`terse-prose` / `less-code` / `terse-cjk`) ([#4694](https://github.com/diegosouzapw/OmniRoute/pull/4694) — thanks @diegosouzapw), an opt-in **SLM `ultra` tier** (two-tier LLMLingua with heuristic fallback) ([#4707](https://github.com/diegosouzapw/OmniRoute/pull/4707) — thanks @diegosouzapw), a **context-budget adaptive dial** (reserve-output ladder + floor) ([#4716](https://github.com/diegosouzapw/OmniRoute/pull/4716) — thanks @diegosouzapw), and an **offline evaluation harness** (PII-gated corpus, self-test judge, gold-grader, real-pipeline runner behind a `ModelClient` seam) ([#4720](https://github.com/diegosouzapw/OmniRoute/pull/4720) — thanks @diegosouzapw). All four layers share a single `CompressionRunTelemetry` contract. - **Redoc-rendered API docs**: a consolidated OpenAPI spec now lives at `docs/openapi.yaml` and is served as interactive Redoc documentation at `/api/docs`. ([#4781](https://github.com/diegosouzapw/OmniRoute/pull/4781) — thanks @KooshaPari / @diegosouzapw) ### 🔧 Bug Fixes - **db-backups**: make the database-import size cap configurable via `OMNIROUTE_DB_IMPORT_MAX_MB` (default 100 MB, 4 GB ceiling) so backups larger than 100 MB can be restored; error message now points to the env var and to VACUUM ([#4757](https://github.com/diegosouzapw/OmniRoute/pull/4757) — closes #4719, thanks @diegosouzapw). - **Onboarding**: add the missing `onboarding.tiers` step-title translation so the setup wizard no longer crashes with `MISSING_MESSAGE: onboarding.tiers` ([#4755](https://github.com/diegosouzapw/OmniRoute/pull/4755) — closes #4698, thanks @diegosouzapw). - **deepseek-web**: fold `role:"tool"` results into the single-prompt transcript (`messagesToPrompt`) so tool outputs reach the model instead of being silently dropped when a follow-up turn omits the `tools[]` array ([#4756](https://github.com/diegosouzapw/OmniRoute/pull/4756) — closes #4712, thanks @diegosouzapw). - **Dashboard**: remove the dead, unconditional `useLiveRequests()` call from `HomePageClient.tsx` — it crashed the `/home` page in production builds with `ReferenceError: useLiveRequests is not defined` (#4759, #4745) and opened the live-dashboard WebSocket even when Provider Topology was hidden (#4596). The live feed remains owned by the settings-gated `HomeProviderTopologySection` ([#4761](https://github.com/diegosouzapw/OmniRoute/pull/4761) — thanks @diegosouzapw). - **Providers dashboard**: dedupe provider nodes by id when adding a compatible provider (`upsertProviderNodeById`) so the same provider can no longer appear twice and no-op adds don't invalidate the compatible-provider memo ([#4768](https://github.com/diegosouzapw/OmniRoute/pull/4768) — closes #4746, thanks @diegosouzapw). - **Storage VACUUM**: the scheduled VACUUM job now follows the Storage page settings (`scheduledVacuum` / `vacuumHour`) as the single source of truth; the legacy env-flag control path was removed ([#4726](https://github.com/diegosouzapw/OmniRoute/pull/4726) — thanks @rdself). - **Storage SQLite tuning**: `Cache Size` is now a positive KiB setting (for example, `16384`) that applies to SQLite as `PRAGMA cache_size = -16384`; Page Size and Cache Size changes are applied to the live database instead of being persisted only in the settings table. - **Tiers**: no-auth providers are now counted as free, and the free-tier filter returns an empty set instead of falling through to every provider ([#4753](https://github.com/diegosouzapw/OmniRoute/pull/4753) — thanks @megamen32 / @diegosouzapw). - **Combos**: auto-promote `zeroLatencyOptimizationsEnabled` so legacy configs (pre-3.8.33 `fallbackCompressionMode="lite"`) round-trip cleanly on the first GUI edit ([#4774](https://github.com/diegosouzapw/OmniRoute/pull/4774) — thanks @KooshaPari / @diegosouzapw). ### 📝 Maintenance - **chatCore (#3501)**: continued the incremental decomposition of `executeProviderRequest` and the streaming/non-streaming hooks into pure leaf modules — top-level helpers + 6 pure leaves ([#4571](https://github.com/diegosouzapw/OmniRoute/pull/4571)), `resolveExecutorWithProxy` + `getExecutionCredentials` ([#4646](https://github.com/diegosouzapw/OmniRoute/pull/4646)), Claude message transforms ([#4708](https://github.com/diegosouzapw/OmniRoute/pull/4708)), `persistAttemptLogs` ([#4717](https://github.com/diegosouzapw/OmniRoute/pull/4717)), `stageTrace` + `compressionUsageReceipt` ([#4721](https://github.com/diegosouzapw/OmniRoute/pull/4721)), `prepareUpstreamBody` ([#4730](https://github.com/diegosouzapw/OmniRoute/pull/4730)), parse + non-streaming usage-stats ([#4762](https://github.com/diegosouzapw/OmniRoute/pull/4762)), `recordContextEditingTelemetryHook` ([#4779](https://github.com/diegosouzapw/OmniRoute/pull/4779)), `scheduleQuotaShareConsumption` ([#4780](https://github.com/diegosouzapw/OmniRoute/pull/4780)), `emitRequestGamificationEvent` ([#4776](https://github.com/diegosouzapw/OmniRoute/pull/4776)), `runPluginOnResponseHook` ([#4782](https://github.com/diegosouzapw/OmniRoute/pull/4782)), `scheduleStreamingQuotaShareConsumption` ([#4784](https://github.com/diegosouzapw/OmniRoute/pull/4784)), `recordCompressionCacheStats` ([#4792](https://github.com/diegosouzapw/OmniRoute/pull/4792)), `writeCavemanOutputAnalytics` ([#4794](https://github.com/diegosouzapw/OmniRoute/pull/4794)), `recordStreamingUsageStats` ([#4791](https://github.com/diegosouzapw/OmniRoute/pull/4791)), and `recordStreamingCost` ([#4790](https://github.com/diegosouzapw/OmniRoute/pull/4790)). (thanks @diegosouzapw) - **Quality**: expand `check:release-green` to reproduce the full release-PR gate set locally ([#4758](https://github.com/diegosouzapw/OmniRoute/pull/4758) — thanks @diegosouzapw). - **db**: re-export `compressionRunTelemetry` from `localDb` to satisfy the db-rules gate ([#4775](https://github.com/diegosouzapw/OmniRoute/pull/4775) — thanks @diegosouzapw). - **Security docs**: add a canonical STRIDE-based threat model ([#4783](https://github.com/diegosouzapw/OmniRoute/pull/4783) — thanks @KooshaPari). - **Tests**: add a smoke test for the home-client dashboard ([#4793](https://github.com/diegosouzapw/OmniRoute/pull/4793) — thanks @JxnLexn). - **Docs**: credit **ponytail** and **OmniCompress** in the README inspiring-projects list and restore the `check:env-doc-sync` release-green by exempting the harness-only `OMNIROUTE_EVAL_CREDENTIALS` var ([#4799](https://github.com/diegosouzapw/OmniRoute/pull/4799) — thanks @diegosouzapw); declare the Phase 4 compression layers in the README + GUIDE ([#4801](https://github.com/diegosouzapw/OmniRoute/pull/4801) — thanks @diegosouzapw). - **Quality**: trim `combo-config.test.ts` comments back under the file-size cap (follow-up to #4774) ([#4800](https://github.com/diegosouzapw/OmniRoute/pull/4800) — thanks @diegosouzapw). --- ## [3.8.34] — 2026-06-23 ### ✨ New Features - **feat(executors): Microsoft 365 Copilot pure framing + connection helpers** — adds the request/response framing and connection helpers to support `m365.cloud.microsoft/chat` for individual M365 plans. ([#4696](https://github.com/diegosouzapw/OmniRoute/pull/4696) — thanks @skyzea1 / @diegosouzapw) - **feat(compression): per-request `x-omniroute-compression` header (Phase 3)** — a request header now overrides the compression plan with the highest precedence (`request-header > routing > profile > auto-trigger > Default > off`), accepting `off` / `default` / `engine:` / ``. The response echoes `X-OmniRoute-Compression: ; source=`. ([#4645](https://github.com/diegosouzapw/OmniRoute/pull/4645) — thanks @diegosouzapw) - **feat(audio): MiniMax T2A v2 TTS dispatch in `audioSpeech`** — adds MiniMax text-to-speech dispatch (port of upstream #1043). ([#4553](https://github.com/diegosouzapw/OmniRoute/pull/4553) — thanks @diegosouzapw) - **feat(opencode): OpenCode Go DeepSeek reasoning variants** — registers the Go DeepSeek reasoning model variants. ([#4647](https://github.com/diegosouzapw/OmniRoute/pull/4647) — thanks @DevEstacion) - **feat(quota): quota scraping for OpenCode Go and Ollama Cloud** — surfaces quota windows for the OpenCode Go and Ollama Cloud providers. ([#4642](https://github.com/diegosouzapw/OmniRoute/pull/4642) — thanks @JxnLexn) - **feat(settings): expose stream recovery feature flags** — surfaces the stream-recovery toggles in settings. ([#4586](https://github.com/diegosouzapw/OmniRoute/pull/4586) — thanks @rdself) - **feat(providers): optional model ID for custom API-key validation** — custom API-key connection tests can now specify the model ID used to validate the key. ([#4555](https://github.com/diegosouzapw/OmniRoute/pull/4555) — thanks @diegosouzapw) ### 🐛 Fixed - **fix(tier): noAuth providers count as free; `auto/:free` returns an empty pool when no free candidate matches** — `freeProviders` is now the union of the legacy explicit list and every chat-tier `noAuth` provider derived from `NOAUTH_PROVIDERS` (so opencode / mimocode / duckduckgo-web are correctly classified free), the task-fitness lookup inherits a base model's `arena_elo` for its `-free` variant, and the `auto/:` filter no longer silently falls back to the full pool — a `:free` request that matches no connected free model returns empty instead of billing a paid model (opt back into the legacy fallback with `OMNIROUTE_AUTO_FREE_FALLBACK_TO_FULL_POOL=true`). Corrupted/invalid `tier_config` rows now log a structured warning and fall back to defaults instead of throwing. ([#4753](https://github.com/diegosouzapw/OmniRoute/pull/4753), [#4517](https://github.com/diegosouzapw/OmniRoute/issues/4517) — thanks @megamen32) - **fix(db): scheduled VACUUM follows Storage settings** — the SQLite VACUUM scheduler now uses the existing Storage page `scheduledVacuum` / `vacuumHour` configuration as its single source of truth, refreshes immediately when those settings are saved, and no longer exposes a separate environment-variable control path. - **fix(db): scheduled cleanup actually runs + queries target the real tables (DB-bloat / OOM)** — `runAutoCleanup` was never scheduled, so retention cleanup never executed and tables (`compression_analytics`, `usage_history`, …) grew unbounded into multi-GB SQLite files driving high RSS. Worse, several cleanup queries referenced wrong table/column names (`call_logs.created_at`→`timestamp`, `compression_analytics.created_at`→`timestamp`, `mcp_audit_log`→`mcp_tool_audit`, `a2a_events`→`a2a_task_events`, `memory_entries`→`memories`), so even a manual run silently no-op'd or errored. Fixed the five queries to match the real schema, added `cleanupProxyLogs`, and wired a `startCleanupScheduler` (startup + every 6h, VACUUM after deletes) into `server-init` alongside the existing budget-reset and reasoning-cache jobs. ([#4691](https://github.com/diegosouzapw/OmniRoute/pull/4691), extracted from [#4428](https://github.com/diegosouzapw/OmniRoute/pull/4428) — thanks @oyi77 / @diegosouzapw) - **fix(routing): include all noAuth models in auto-combos + add reka-flash + best-free template** — noAuth provider models are no longer skipped when building auto-combos, `reka-flash` is registered, and a `best-free` combo template is added. ([#4621](https://github.com/diegosouzapw/OmniRoute/pull/4621) — thanks @oyi77) - **fix: noAuth provider validation + Kimi executor routing** — corrects noAuth provider membership checks and removes a mis-routed Kimi alias. (closes #4620) ([#4699](https://github.com/diegosouzapw/OmniRoute/pull/4699) — thanks @oyi77) - **fix(executors): Firecrawl `web_fetch` 500 with `include_metadata=true`** — fixes a crash when Firecrawl web_fetch is invoked with metadata extraction enabled. ([#4692](https://github.com/diegosouzapw/OmniRoute/pull/4692) — thanks @ponkcore) - **fix(proxy): apply `pipelining:0` + connections cap to the direct dispatcher** — same-provider concurrent requests no longer serialize behind a long/streaming request on the direct path. ([#4684](https://github.com/diegosouzapw/OmniRoute/pull/4684) — thanks @jeffer1312 / @diegosouzapw) - **fix(telemetry): back off live-WS event forwarding when the sidecar is unreachable** — stops repeatedly attempting to connect to `LIVE_WS_PORT` when live monitoring is not configured. ([#4687](https://github.com/diegosouzapw/OmniRoute/pull/4687) — thanks @FikFikk / @diegosouzapw) - **fix(api): serve `GET /v1/models/{model}` as JSON, not the HTML dashboard** — the per-model endpoint (IDs with slashes via a catch-all route) now returns JSON, unbreaking Claude Code. ([#4677](https://github.com/diegosouzapw/OmniRoute/pull/4677) — thanks @papajo / @diegosouzapw) - **fix(executors): robust deepseek-web tool-call parsing and agentic context retention** — hardens DeepSeek-web tool-call parsing and preserves agentic context across turns. ([#4644](https://github.com/diegosouzapw/OmniRoute/pull/4644) — thanks @BugsBag) - **fix(cli): authenticate `omniroute logs` and honor the active context** — the `logs` command now authenticates and respects the active context. ([#4638](https://github.com/diegosouzapw/OmniRoute/pull/4638) — thanks @Rahulsharma0810) - **fix(stream): estimate input tokens when upstream reports `prompt_tokens=0`** — input token usage is estimated when the upstream omits it. ([#4615](https://github.com/diegosouzapw/OmniRoute/pull/4615) — thanks @adivekar-utexas) - **fix(plugin): auto-prefix providerId with `opencode-` for OpenCode 1.17.8+ native gate** — adapts provider IDs to the OpenCode 1.17.8+ native provider gate. ([#4527](https://github.com/diegosouzapw/OmniRoute/pull/4527) — thanks @herjarsa) - **fix(catalog): shorten no-thinking gateway prefix to `no-think/`** — renames the no-thinking gateway prefix. ([#4525](https://github.com/diegosouzapw/OmniRoute/pull/4525) — thanks @Rahulsharma0810) - **fix(models): unknown max output limits no longer default to 8192** — models without synced/registry/static `maxOutputTokens` resolve the limit as unknown instead of a generic 8192 cap; clamping/injection only happens when a real cap is known. ([#4584](https://github.com/diegosouzapw/OmniRoute/pull/4584) — thanks @rdself) - **fix(resilience): respect upstream retry-hint toggle** — honors the configured toggle for upstream retry hints. ([#4585](https://github.com/diegosouzapw/OmniRoute/pull/4585) — thanks @rdself) - **fix(providers): show revealed connection API keys** — fixes revealing stored connection API keys in the UI. ([#4583](https://github.com/diegosouzapw/OmniRoute/pull/4583) — thanks @rdself) - **fix(logs): make active-request stale sweep configurable** — exposes the stale-request sweep interval as a setting. ([#4599](https://github.com/diegosouzapw/OmniRoute/pull/4599) — thanks @rdself) - **fix(resilience): retain provider cooldowns for the configured max window** — cooldowns persist for the configured maximum window. ([#4588](https://github.com/diegosouzapw/OmniRoute/pull/4588) — thanks @KooshaPari) - **fix(resilience): reject invalid provider cooldown bounds** — validates cooldown bound configuration. ([#4589](https://github.com/diegosouzapw/OmniRoute/pull/4589) — thanks @KooshaPari) - **fix(combo): preserve production combo metrics on shadow eviction** — shadow eviction no longer drops production combo metrics. ([#4590](https://github.com/diegosouzapw/OmniRoute/pull/4590) — thanks @KooshaPari) - **fix(combo): exclude exhausted connections from auto scoring** — exhausted connections are no longer scored as auto-combo candidates. ([#4592](https://github.com/diegosouzapw/OmniRoute/pull/4592) — thanks @KooshaPari) - **fix(relay): apply IP rate limit to the Bifrost sidecar** — extends IP rate limiting to the Bifrost relay sidecar. ([#4593](https://github.com/diegosouzapw/OmniRoute/pull/4593) — thanks @KooshaPari) - **fix(bifrost): finalize SSE relay usage after stream** — finalizes relay usage accounting once the SSE stream completes. ([#4612](https://github.com/diegosouzapw/OmniRoute/pull/4612) — thanks @KooshaPari) - **fix(quota): expose Bailian quota windows** — surfaces Bailian provider quota windows. ([#4610](https://github.com/diegosouzapw/OmniRoute/pull/4610) — thanks @KooshaPari) - **fix(dashboard): gate home topology live-WS networking behind widget visibility** — the home dashboard no longer starts topology polling / live sockets when topology is hidden. ([#4618](https://github.com/diegosouzapw/OmniRoute/pull/4618), [#4606](https://github.com/diegosouzapw/OmniRoute/pull/4606) — thanks @KooshaPari) - **fix(dashboard): isolate the quota widget refresh clock** — the quota widget refresh no longer drives unrelated re-renders. ([#4611](https://github.com/diegosouzapw/OmniRoute/pull/4611) — thanks @KooshaPari) - **fix(dashboard): memoize compatible provider groups** — avoids recomputing compatible provider groups on every render. ([#4613](https://github.com/diegosouzapw/OmniRoute/pull/4613) — thanks @KooshaPari) - **fix(cli): align `omniroute` data dir and env loading with the runtime** — the CLI's data-dir/env loading no longer drifts from the server runtime configuration. ([#4619](https://github.com/diegosouzapw/OmniRoute/pull/4619), [#4607](https://github.com/diegosouzapw/OmniRoute/pull/4607) — thanks @KooshaPari) - **fix(api/settings): prevent cached `/api/settings` responses** — disables caching on the settings endpoint (port from 9router#951). ([#4566](https://github.com/diegosouzapw/OmniRoute/pull/4566) — thanks @diegosouzapw) - **fix(executors): strip temperature for the GitHub Copilot gpt-5.4 family** — removes the unsupported `temperature` param for Copilot gpt-5.4 models (port from 9router#612). ([#4564](https://github.com/diegosouzapw/OmniRoute/pull/4564) — thanks @diegosouzapw) - **fix(dashboard): keep play_arrow spinning on provider "Test All" buttons** — fixes the spinner state on the provider test buttons (port from 9router#715). ([#4563](https://github.com/diegosouzapw/OmniRoute/pull/4563) — thanks @diegosouzapw) - **fix(dashboard): surface manual config CTA when Open Claw CLI auto-detect fails** — shows a manual-config call-to-action on the Open Claw CLI card when auto-detection fails. ([#4562](https://github.com/diegosouzapw/OmniRoute/pull/4562) — thanks @diegosouzapw) - **fix(oauth): update Qwen OAuth URLs from `chat.qwen.ai` to `qwen.ai`** — refreshes the Qwen OAuth endpoints (port of decolua/9router#683). ([#4561](https://github.com/diegosouzapw/OmniRoute/pull/4561) — thanks @diegosouzapw) ### 📝 Maintenance - **refactor(imageGeneration): extract 8 provider families to co-located files** — splits the image-generation module into eight co-located per-provider files with no behavioral change. ([#4609](https://github.com/diegosouzapw/OmniRoute/pull/4609) — thanks @KooshaPari) - **deps: bump production + development groups; migrate js-yaml to v5 (ESM)** — dependency bumps plus a `js-yaml` v4→v5 migration to the ESM-only namespace import. ([#4697](https://github.com/diegosouzapw/OmniRoute/pull/4697) — thanks @diegosouzapw) - **chore(quality): release-green pre-flight validator + nightly signal** — new `npm run check:release-green` (`scripts/quality/validate-release-green.mjs`) reproduces the release-equivalent validation (full unit + vitest + ratchets + typecheck + lint, optional `--with-build` package-artifact) against the current working tree and classifies each red as **HARD** (real defect) vs **DRIFT** (ratchet, rebaselined at release) — purely diagnostic, never blocking contributors. A new `nightly-release-green` workflow runs it on the active release branch and opens/updates a tracking issue on hard failures. Closes the gap where the full gate (`ci.yml`) only ran on the release PR, so reds accrued silently on `release/**` and surfaced in layers at release time. ([#4622](https://github.com/diegosouzapw/OmniRoute/pull/4622) — thanks @diegosouzapw) - **chore(quality): reconcile file-size baseline for #4644 (`deepseek-web.ts` 1117→1125)** — rebaselines the file-size gate after the deepseek-web hardening. ([#4695](https://github.com/diegosouzapw/OmniRoute/pull/4695) — thanks @diegosouzapw) --- ## [3.8.33] — 2026-06-21 ### ✨ New Features - **feat(combo): nested combo-ref execution (`nestedComboMode: execute`)** — selection strategies can now treat a combo-reference step as a black box, executing the referenced combo as a single unit instead of flattening its targets. ([#4537](https://github.com/diegosouzapw/OmniRoute/pull/4537) — thanks @adivekar-utexas) - **feat(combo): sticky weighted selection limit with exhaustion-aware renormalization** — weighted strategies gain a configurable sticky-selection limit; once a target is exhausted, remaining weights renormalize so traffic is redistributed correctly. ([#4489](https://github.com/diegosouzapw/OmniRoute/pull/4489) — thanks @adivekar-utexas) - **feat(combos): provider-wildcard expansion in combo steps** — a combo step may now reference a whole provider via wildcard and have it expand to that provider's models at resolution time. ([#4545](https://github.com/diegosouzapw/OmniRoute/pull/4545) — thanks @Rahulsharma0810) - **feat(compression): Phase 2 — named profiles + active selector** — the compression settings panel becomes the single source of truth via a single active-profile selector (Default panel vs a named combo) wired into the runtime. ([#4521](https://github.com/diegosouzapw/OmniRoute/pull/4521) — thanks @diegosouzapw) - **feat(sse): route `web_search` requests to a configured model** — CCR-style webSearch scenario: requests carrying a `web_search*` tool can be routed to a dedicated `webSearchRouteModel`, configurable from the Routing tab. ([#4509](https://github.com/diegosouzapw/OmniRoute/pull/4509) — thanks @shafqatevo / @diegosouzapw) - **feat(mcp): `omniroute_web_fetch` tool for URL content extraction** — new MCP tool that fetches and extracts the content of a URL. ([#4510](https://github.com/diegosouzapw/OmniRoute/pull/4510) — thanks @ponkcore) - **feat(models): qualify duplicate model names with their provider prefix** — when two providers expose a same-named model, the catalog now disambiguates each with its provider prefix. ([#4516](https://github.com/diegosouzapw/OmniRoute/pull/4516) — thanks @Rahulsharma0810) - **feat(translator): accept OpenAI audio input parts in Gemini translation** — `input_audio` message parts are now translated through to Gemini. ([#4434](https://github.com/diegosouzapw/OmniRoute/pull/4434) — thanks @diegosouzapw) - **feat(webhooks): enrich Telegram request notifications** — Telegram webhook payloads carry richer request context. ([#4524](https://github.com/diegosouzapw/OmniRoute/pull/4524) — thanks @mppata-glitch) - **feat(bazaarlink): add `authHint` to the existing APIKEY_PROVIDERS entry** — surfaces the auth hint for the bazaarlink provider. ([#4522](https://github.com/diegosouzapw/OmniRoute/pull/4522) — thanks @adivekar-utexas) - **feat(usage): API-key USD quota percent + reset hints, weekly-window cutoff** — usage dashboard surfaces API-key USD quota percentage and reset hints, honoring the weekly window cutoff. ([#4398](https://github.com/diegosouzapw/OmniRoute/pull/4398) — thanks @Witroch4) - **feat(usage): surface Codex code-review weekly window + `additional_rate_limits` fallback** — exposes the Codex code-review weekly window and falls back to `additional_rate_limits` when present. ([#4494](https://github.com/diegosouzapw/OmniRoute/pull/4494) — thanks @diegosouzapw) - **feat(dashboard): per-provider dropdown filter on the quota dashboard** — filter the quota dashboard by provider. ([#4495](https://github.com/diegosouzapw/OmniRoute/pull/4495) — thanks @diegosouzapw) - **feat(dashboard): inline show/hide toggle for API keys on the API Manager page** ([#4505](https://github.com/diegosouzapw/OmniRoute/pull/4505) — thanks @diegosouzapw) - **feat(dashboard): toggle-style model deselection in the combo builder modal** ([#4498](https://github.com/diegosouzapw/OmniRoute/pull/4498) — thanks @diegosouzapw) - **feat(dashboard): Done button in the model picker for combo creation** ([#4496](https://github.com/diegosouzapw/OmniRoute/pull/4496) — thanks @diegosouzapw) - **feat(providers): expose `gpt-4o` on the built-in GitHub Copilot (`gh`) provider** ([#4487](https://github.com/diegosouzapw/OmniRoute/pull/4487) — thanks @diegosouzapw) - **feat(pricing): default pricing for the Qwen coder-model on the `qw` provider** ([#4488](https://github.com/diegosouzapw/OmniRoute/pull/4488) — thanks @diegosouzapw) ### 🔧 Bug Fixes - **fix(telemetry): back off the live-WS event bridge so a missing sidecar stops spamming ProxyFetch errors** — in single-port deployments the live-dashboard sidecar (port 20129) is not running, but `forwardDashboardEventToLiveWs` POSTed to it on every compression event. Since the global `fetch` is `proxyFetch`, each `ECONNREFUSED` logged a `[ProxyFetch] Undici dispatcher failed` warning (~272× in 42 min). The forwarder now backs off after consecutive failures (60s cooldown, lazy recovery) and clears the backoff on success, so a missing sidecar no longer floods the logs. ([#4604](https://github.com/diegosouzapw/OmniRoute/issues/4604) — thanks @FikFikk) - **fix(api): resolve a compatible provider node by base type, not only exact id** — connection→node resolution now matches on the bare derived node type when the exact id isn't found and the match is unambiguous (ambiguous → 404), via a pure `providerNodeSelect` helper. ([#4576](https://github.com/diegosouzapw/OmniRoute/pull/4576) — thanks @aleksesipenko / @diegosouzapw) - **fix(cli): supervisor restarts on spontaneous exit-0 (OOM cgroup) + waits for port before respawn** — a child that exits 0 because the cgroup OOM-killer reaped it is now restarted (not treated as a clean shutdown), the restart reset window widened 30s→60s, and the supervisor waits for the port to be free before respawning. ([#4578](https://github.com/diegosouzapw/OmniRoute/pull/4578) — thanks @oyi77 / @diegosouzapw) - **fix(combo): attribute lockout decay & success telemetry to the dynamically-selected connection** — on the combo success path the actual connection chosen by dynamic account-selection is read from the `X-OmniRoute-Selected-Connection-Id` response header (instead of the often-empty static `target.connectionId`), so model-lockout decay, `recordProviderSuccess`, LKGP and success/failure telemetry attribute to the right connection on both the priority and round-robin paths. The pre-screen "unavailable" snapshot is also no longer a permanent skip — availability is re-checked on each retry since connection cooldowns can expire mid-request. ([#4550](https://github.com/diegosouzapw/OmniRoute/pull/4550) — thanks @Chewji9875) - **fix(auto): enforce the quota cutoff before scoring (opt-in)** — auto-routing now evaluates a hard quota cutoff in `buildAutoCandidates` to drop low-quota candidates before scoring, with a 429 guard when all candidates fall below cutoff. The cutoff is **opt-in** behind `QuotaPreflightSettings.enabled` (default OFF via `QUOTA_PREFLIGHT_CUTOFF_ENABLED`), so default behavior is unchanged. ([#4483](https://github.com/diegosouzapw/OmniRoute/pull/4483) — thanks @megamen32) - **fix(antigravity): reasoning/thinking models no longer 400 with `oneOf at '/' not met`** — the Cloud Code envelope passthrough also leaked the Claude/OpenAI-native thinking fields (`thinking`, `reasoning_effort`, `reasoning`, `enable_thinking`, `thinking_budget`) the unified thinking adapter sets at the body root; Google rejected them with `400 Bad input: oneOf at '/' not met`. The whole thinking family is now stripped before the envelope is built; Gemini's own `generationConfig.thinkingConfig` is unaffected. ([#4485](https://github.com/diegosouzapw/OmniRoute/pull/4485) — port from 9router#1926, thanks @theseven99 / @diegosouzapw) - **fix(integration): restore the codex and memory pipeline contracts** — realigns the CLI fingerprint + memory-tools contracts so the codex and memory pipelines pass their integration checks again. ([#4474](https://github.com/diegosouzapw/OmniRoute/pull/4474) — thanks @KooshaPari) - **fix(sse): RTK must preserve `cache_control`-marked `tool_result` blocks** — reasoning-token-keeping no longer drops tool_result blocks that carry a `cache_control` marker. ([#4560](https://github.com/diegosouzapw/OmniRoute/pull/4560) — thanks @diegosouzapw) - **fix(auto-combo): respect model visibility (`isHidden`) in the auto-combo candidate pool** — hidden models are excluded from auto-combo candidates. ([#4558](https://github.com/diegosouzapw/OmniRoute/pull/4558) — thanks @herjarsa) - **fix(dashboard): avoid overlapping provider health polls** — guards against concurrent provider-health poll cycles overlapping. ([#4557](https://github.com/diegosouzapw/OmniRoute/pull/4557) — thanks @KooshaPari) - **fix(dashboard): make the API Manager key table usable on mobile** ([#4556](https://github.com/diegosouzapw/OmniRoute/pull/4556) — thanks @janeza2) - **fix(executors): decode Composer/Cursor ``-marked visible output** — visible text wrapped in Cursor Composer's `` markers is now decoded correctly. ([#4554](https://github.com/diegosouzapw/OmniRoute/pull/4554) — thanks @diegosouzapw) - **fix(oauth): improve Cursor auto-import reliability on macOS** ([#4552](https://github.com/diegosouzapw/OmniRoute/pull/4552) — thanks @diegosouzapw) - **fix(providers/test): probe the real Codex `/responses` endpoint** — connection test hits the actual Codex `/responses` endpoint. ([#4551](https://github.com/diegosouzapw/OmniRoute/pull/4551) — thanks @diegosouzapw) - **fix(mcp): `webFetchInput` emits `URL is required` for a missing url** — clearer validation error for the web-fetch tool. ([#4541](https://github.com/diegosouzapw/OmniRoute/pull/4541) — thanks @ponkcore / @diegosouzapw) - **fix(compression): allow `enginesExplicit` through the PUT validation schema** — the compression settings PUT no longer rejects the `enginesExplicit` flag. ([#4532](https://github.com/diegosouzapw/OmniRoute/pull/4532) — thanks @DevEstacion) - **fix(no-think): normalize provider prefix to canonical in no-think variants** ([#4531](https://github.com/diegosouzapw/OmniRoute/pull/4531) — thanks @Rahulsharma0810) - **fix(combo): pass `maxCooldownMs` from settings to the `recordModelLockoutFailure` call sites** ([#4530](https://github.com/diegosouzapw/OmniRoute/pull/4530) — thanks @Chewji9875) - **fix(combo): allow fallback on context-overflow & param-validation 400s; preserve upstream codes** — combo fallback now triggers on recoverable 400s while keeping the original upstream status. ([#4519](https://github.com/diegosouzapw/OmniRoute/pull/4519) — thanks @adivekar-utexas) - **fix(command-code): cap `max_tokens` per model using the registry `maxOutputTokens`** ([#4518](https://github.com/diegosouzapw/OmniRoute/pull/4518) — thanks @adivekar-utexas) - **fix(mitm): gate sudo prompts on server platform, not browser UA** ([#4514](https://github.com/diegosouzapw/OmniRoute/pull/4514) — thanks @diegosouzapw) - **fix(mitm): graceful sudo degradation in slim Docker / non-root containers** ([#4513](https://github.com/diegosouzapw/OmniRoute/pull/4513) — thanks @diegosouzapw) - **fix(usage): clear auth-expired message for Kiro social-auth accounts** ([#4512](https://github.com/diegosouzapw/OmniRoute/pull/4512) — thanks @diegosouzapw) - **fix(pricing): default cost rows for Antigravity Gemini 3.5 Flash tiers + `gemini-pro-agent`** ([#4508](https://github.com/diegosouzapw/OmniRoute/pull/4508) — thanks @diegosouzapw) - **fix(api): dedupe exact-duplicate ids in `/v1/models`** — low-noise model output without alias/canonical duplicates. ([#4506](https://github.com/diegosouzapw/OmniRoute/pull/4506) — thanks @Rahulsharma0810 / @diegosouzapw) - **fix(dashboard): enable Codex Apply/Reset buttons when the CLI is installed** ([#4504](https://github.com/diegosouzapw/OmniRoute/pull/4504) — thanks @diegosouzapw) - **fix(dashboard): show API-Key-compatible providers in the Antigravity CLI Tools model picker** ([#4503](https://github.com/diegosouzapw/OmniRoute/pull/4503) — thanks @diegosouzapw) - **fix(dashboard): migrate ManualConfigModal copy to the shared `useCopyToClipboard` hook** ([#4502](https://github.com/diegosouzapw/OmniRoute/pull/4502) — thanks @diegosouzapw) - **fix(sse): skip disabled providers in combo fallback** ([#4500](https://github.com/diegosouzapw/OmniRoute/pull/4500) — thanks @diegosouzapw) - **fix(usage): parse numeric-string quota reset timestamps as Unix sec/ms** ([#4493](https://github.com/diegosouzapw/OmniRoute/pull/4493) — thanks @diegosouzapw) - **fix(db): scheduled VACUUM + persist `lastVacuumAt`** — a new `vacuumScheduler.ts` persists the last run timestamp and last error to the `key_value` table (migration 102) and feeds the database settings panel; wired into the Next.js lifecycle (default 24h, window 02:00–04:00 local). The initial env-flag control path from this entry is superseded in v3.8.34 by the Storage page settings. ([#4480](https://github.com/diegosouzapw/OmniRoute/pull/4480) — thanks @KooshaPari / @oyi77) - **perf(quota): stop writing redundant `quota_snapshots` rows from idle connections** — the 60s background refresh persisted a snapshot for every window of every connection regardless of change, generating 400K+ rows/day from idle accounts. `setQuotaCache` now skips the write when a window's `remaining_percentage`/`is_exhausted` is unchanged from the last cached observation; the first observation and every real change still persist. ([#4565](https://github.com/diegosouzapw/OmniRoute/pull/4565), [#4438](https://github.com/diegosouzapw/OmniRoute/issues/4438) — thanks @oyi77) ### 🔒 Security - **fix(sse): crypto-secure RNG for combo/deck load-balancing selection** — replaces `Math.random()` with a crypto-secure source in the combo/deck weighted-selection path. ([#4455](https://github.com/diegosouzapw/OmniRoute/pull/4455) — thanks @diegosouzapw) ### 📝 Maintenance - **perf(dashboard): shrink provider assets + fix the usage rollup cutoff** — recompresses oversized provider images (nanobot/picoclaw/zeroclaw) and adds a `check:provider-assets` gate, plus a usage-analytics rollup cutoff fix. ([#4464](https://github.com/diegosouzapw/OmniRoute/pull/4464) — thanks @KooshaPari) - **refactor(chatCore): extract pure leaves from `chatCore.ts`** — incremental decomposition of the chat-core handler into pure, individually-testable leaves (system-role extraction, upstream-header build, failure usage-record builder, key-health, request-format, claude-effort, target-format, Background-Task-Redirect decision, Codex quota-state persistence). ([#4548](https://github.com/diegosouzapw/OmniRoute/pull/4548), [#4547](https://github.com/diegosouzapw/OmniRoute/pull/4547), [#4544](https://github.com/diegosouzapw/OmniRoute/pull/4544), [#4538](https://github.com/diegosouzapw/OmniRoute/pull/4538), [#4526](https://github.com/diegosouzapw/OmniRoute/pull/4526), [#4492](https://github.com/diegosouzapw/OmniRoute/pull/4492) — #3501, thanks @diegosouzapw) - **chore(i18n): remove unused config helpers** ([#4482](https://github.com/diegosouzapw/OmniRoute/pull/4482) — thanks @KooshaPari) - **chore(quality): reconcile quality baselines (complexity, cognitive-complexity, file-size) across the cycle** ([#4579](https://github.com/diegosouzapw/OmniRoute/pull/4579), [#4570](https://github.com/diegosouzapw/OmniRoute/pull/4570), [#4543](https://github.com/diegosouzapw/OmniRoute/pull/4543), [#4542](https://github.com/diegosouzapw/OmniRoute/pull/4542), [#4535](https://github.com/diegosouzapw/OmniRoute/pull/4535), [#4534](https://github.com/diegosouzapw/OmniRoute/pull/4534), [#4529](https://github.com/diegosouzapw/OmniRoute/pull/4529), [#4528](https://github.com/diegosouzapw/OmniRoute/pull/4528), [#4523](https://github.com/diegosouzapw/OmniRoute/pull/4523) — thanks @diegosouzapw) --- ## [3.8.32] — 2026-06-20 ### ✨ New Features - **feat(dashboard): inline show/hide toggle for API keys on the API Manager page** — each row in the API keys list now exposes an eye / eye-off button next to the masked key. Clicking it lazy-fetches the full key via the existing `/api/keys/{id}/reveal` endpoint (so the policy gate is unchanged), caches it client-side, and renders the full value inline; clicking again hides it. The toggle only appears when `allowKeyReveal` is true (server policy), so an installation that disables reveal still sees a locked stub. Reuses the existing i18n keys `apiManager.showKey` / `apiManager.hideKey` already shipped in every locale, and clears the cached reveal when the key is deleted. Inspired-by: toanalien. - **feat(oauth): import accounts from CLIProxyAPI** — Settings → CLIProxyAPI now has an "Import accounts" button that reads the OAuth accounts CLIProxyAPI already saved in `~/.cli-proxy-api/` and imports them as OmniRoute connections, so you don't have to log into every account individually. CLIProxyAPI's unified auth-file format is parsed by `type` discriminator and the supported account types (Gemini, Codex, Claude/Anthropic, Antigravity, Qwen, Kimi) are upserted; unknown types are skipped. The preview never exposes tokens to the client. (thanks @powellnorma) - **feat(routing): opt-in setting to echo the requested alias/combo name in the response model field** — Settings → Routing now has an "Echo requested model name in responses" toggle (default off). When enabled, the response `model` field (non-streaming and every streamed SSE chunk) reports the alias or combo name the client requested instead of the upstream model name, so strict clients such as Claude Desktop — which reject a response whose `model` does not match the request with a 401 — work with aliases and combos. (thanks @thaiphuong1202) - **feat(providers): expand the openai and gemini direct registries with first-class variants already known elsewhere** — the `openai` provider entry now exposes `gpt-4.1-mini`, `gpt-4.1-nano`, `o3-mini`, and `o4-mini` (the latter two carry `REASONING_UNSUPPORTED` like `o3`), and the `gemini` entry now exposes `gemini-2.0-flash-lite` and `gemini-3-flash-lite-preview`. These models were already first-class throughout sibling subsystems (cost estimator, task fitness, free-model catalog, multiple aggregator registries) but happened to be missing from the direct openai/gemini namespaces. Embedding/TTS/image-gen models stay in their dedicated registries (`embeddingRegistry.ts`, `audioRegistry.ts`, `imageRegistry.ts`); legacy ids OmniRoute curated out (o1, gpt-4-turbo, …) are not restored. (thanks @East-rayyy) - **feat(translator): OpenAI SSE → Gemini SSE conversion for `/v1beta/models/{model}:streamGenerateContent`** — the `@google/genai` SDK (Gemini CLI) always calls `:streamGenerateContent?alt=sse` for chat and expects Gemini SSE chunks (no `[DONE]` sentinel — the stream just closes). The v1beta route was forwarding OpenAI SSE from `handleChat` unchanged, so the SDK crashed on the OpenAI `[DONE]` line with `SyntaxError: Unexpected token 'D', "[DONE]" is not valid JSON`. A new `transformOpenAISSEToGeminiSSE()` (in `open-sse/translator/response/openai-to-gemini-sse.ts`) rewrites each OpenAI delta into `candidates[].content.parts[]`, maps `finish_reason` → `finishReason` (STOP / MAX_TOKENS / SAFETY), attaches `usageMetadata` + `modelVersion` on the final chunk, and surfaces `reasoning_content` as `{ thought: true }` parts for thinking models. The non-streaming `:generateContent` action gets a sibling `convertOpenAIResponseToGemini()` for the JSON path. Streaming intent is now keyed off the URL action suffix (canonical Gemini convention) rather than the non-standard `generationConfig.stream` body field. (thanks @SteelMorgan) - **feat(compression): unified compression configuration panel (Phase 1)** — `/dashboard/context/settings` is now the single source of truth for compression: a master toggle plus per-engine on/off and level controls, with the dispatch pipeline derived from a stored `engines` map on `CompressionConfig`. A gate (`enginesExplicit`) ensures the new map only drives dispatch when an `engines` row was actually saved from the panel, so legacy/backfilled installs (the seeded default combo from migrations 042/043) keep their existing `defaultMode` behavior unchanged. The default-combo and per-engine routes are shimmed (410). ([#4432](https://github.com/diegosouzapw/OmniRoute/pull/4432) — thanks @diegosouzapw) - **feat(mcp): register the web-session pool observability tools** — the `poolTools` MCP tool set (web-session pool stats/health) was defined but never wired into `createMcpServer()`, so it was dead. It is now registered in `server.ts` with `withScopeEnforcement` against the typed `read:health` / `write:resilience` scopes (no enum inflation), giving MCP clients visibility into the pooled web-session lifecycle. ([#4399](https://github.com/diegosouzapw/OmniRoute/pull/4399), [#3368](https://github.com/diegosouzapw/OmniRoute/issues/3368) — thanks @diegosouzapw) - **feat(providers): stronger no-auth and web-cookie provider validation (`AUTH_007`)** — provider connection validation now handles no-auth and web-cookie providers explicitly: instead of returning a generic "Provider validation not supported", these providers report a precise `AUTH_007` status so the dashboard surfaces actionable validation feedback for cookie/no-auth flows. ([#4023](https://github.com/diegosouzapw/OmniRoute/pull/4023) — thanks @oyi77) - **feat(combo): per-combo `stickyRoundRobinLimit` override on the combos page** — the round-robin sticky-affinity limit can now be set per combo from the combos page UI, overriding the global default, so a combo can pin (or loosen) how many consecutive requests stick to the same round-robin member independently of the others. ([#4472](https://github.com/diegosouzapw/OmniRoute/pull/4472) — thanks @adivekar-utexas) - **feat(usage): quota fetch for `kimi-coding-apikey`** — usage/quota tracking now supports the `kimi-coding-apikey` provider, so its remaining quota is fetched and surfaced like the other quota-aware providers. ([#4435](https://github.com/diegosouzapw/OmniRoute/pull/4435) — thanks @janeza2) - **feat(cluster): opt-in memory + Bifrost cluster profiles** — adds opt-in cluster profiles that wire the memory subsystem and the Bifrost Go sidecar into a clustered deployment (follow-up to #3932). ([#4433](https://github.com/diegosouzapw/OmniRoute/pull/4433) — thanks @KooshaPari) - **feat(models): opt-in low-noise `/v1/models` catalog mode** — a new opt-in mode trims the `/v1/models` response to a quieter, lower-noise catalog for clients that choke on or don't need the full provider/model list. ([#4427](https://github.com/diegosouzapw/OmniRoute/pull/4427) — thanks @Rahulsharma0810) - **feat(ui): expose a `targetFormat` selector in the custom-models form** — the custom-models form now lets you pick the upstream target format explicitly, so a custom model can be pinned to the right wire format instead of relying on inference. ([#4475](https://github.com/diegosouzapw/OmniRoute/pull/4475) — thanks @adivekar-utexas) - **feat(providers): expose `gpt-4o` on the built-in GitHub Copilot (`gh`) provider** — GitHub Copilot still serves the original `gpt-4o` chat model via its `/chat/completions` endpoint, but the OmniRoute registry only shipped the GPT-5.x family, so clients that explicitly request `gpt-4o` against `gh` got an unknown-model error. `gpt-4o` is now registered under the `github` provider next to the GPT-5.x lineup (chat/completions, 128k context — no `openai-responses` targetFormat). Ported from [9router#98](https://github.com/decolua/9router/pull/98). (thanks @I3eka) - **feat(pricing): default pricing for Qwen `coder-model` on the `qw` provider** — the Qwen Coder Free (`qw`) registry already exposed the `coder-model` id (Qwen3.5/3.6 Coder Model) but `DEFAULT_PRICING.qw` was missing the row, so usage tracking reported `$0.00` for that model. The pricing row is now added with the same shape as the sibling `vision-model` tier, restoring non-zero cost tracking. Ported from upstream 9router PR [decolua/9router#156](https://github.com/decolua/9router/pull/156). (thanks @LinearSakana) - **feat(usage): Codex review-quota now surfaces the weekly window and the `additional_rate_limits` fallback shape** — the dashboard's Codex usage card showed only the **session** half of `code_review_rate_limit` and dropped review descriptors that arrived inside `additional_rate_limits` (the shape some ChatGPT Codex plans report). `buildCodexUsageQuotas` now emits the secondary window as `quotas.code_review_weekly` and, when the dedicated `code_review_rate_limit` block is empty, falls back to the matching descriptor in `additional_rate_limits` (matched on `limit_name`/`metered_feature`/`limit_id` containing `code_review` / `codex_review` / `review`). The new label `code_review_weekly → "Code Review Weekly"` is registered in `ProviderLimits/utils.tsx` so the card renders both windows side-by-side. The existing `quotas.code_review` key is preserved for back-compat. Inspired by upstream decolua/9router PR #836. (thanks @hiepau1231) - **feat(dashboard): per-provider dropdown filter on the quota dashboard** — the Quota dashboard now has a "Provider" dropdown alongside the existing Status / Type / Tier / Env filters. Choosing a provider narrows the visible accounts to that provider only; the selection persists in `localStorage` (`omniroute:limits:providerFilter`) and the dropdown auto-falls back to "All providers" if the persisted key no longer matches a connection in the current session. The dropdown only renders when there are at least two distinct providers in view, so single-provider setups aren't cluttered. The upstream "Expiring first" toggle is intentionally not ported — `visibleConnections` already always sorts by soonest reset within each status group, so the toggle would be redundant. Inspired by [decolua/9router#769](https://github.com/decolua/9router/pull/769) — thanks @DEYLNN. - **feat(dashboard): "Done" button in the model picker during combo creation** — `ModelSelectModal` now supports a `keepOpenOnSelect` prop (opt-in, off by default). When set — and the combos page now sets it — picking a model no longer auto-closes the modal, and a full-width "Done" button is rendered in the modal footer so users can add several models in a row and confirm explicitly. Single-select callers (e.g. CLI tool cards) are unchanged: the prop is opt-in, so they keep auto-close. The existing `multiSelect` mode (Clear + Done footer driven by `selectedModels`) takes precedence over `keepOpenOnSelect` to avoid two competing footers. Inspired by upstream PR [decolua/9router#1031](https://github.com/decolua/9router/pull/1031). (thanks @zanuartri) - **feat(dashboard): toggle-style model deselection inside the combo builder modal** — `ModelSelectModal` (used by the combo builder) now treats clicks on an already-added model as an inline remove instead of a duplicate add: the click invokes a new `onDeselect` callback when one is supplied, and a new `closeOnSelect={false}` prop keeps the modal open so several models can be added or removed in one session before the user closes it manually. Wired into the combo builder so the existing green "✓" highlight is now actionable — clicking it removes every step that points at that qualified model. Inspired-by upstream decolua/9router PR #889. (thanks @fajarhide) ### 🐛 Fixed - **fix(sse): combo routing now skips a provider whose credentials are all disabled instead of failing the whole request** — when a combo like `antigravity/opus → github/opus` hit a leg whose only configured connections were disabled (or where no connections existed at all), `handleNoCredentials` returned `400 BAD_REQUEST`, which the combo target loop treats as a hard stop (combo's 400-break guard from PR #4316 / issue #4279 prevents infinite fallback loops on body-specific 4xx errors). The combo therefore died on the first leg even when later targets were perfectly healthy. The no-active-credentials branch now returns `404 NOT_FOUND` with `"No active credentials for provider:

"` instead — `404` flows through `checkFallbackError` as `shouldFallback: true` (generic-error catch-all path in `open-sse/services/accountFallback.ts`), so the next combo target is tried. The log level for this branch also drops from `error` to `warn` because zero active credentials is an expected operator-driven state, not a server fault. Inspired-by upstream decolua/9router PR #336. (thanks @East-rayyy) - **fix(dashboard): Manual Config modal "Copy" button now works on HTTP / non-secure deployments** — the copy handler in `ManualConfigModal` re-implemented the Clipboard-API-with-`execCommand`-fallback inline and gated the modern path on `window.isSecureContext`, so some non-secure-context browsers (and any future drift) silently lost the fallback. Migrated to the shared `useCopyToClipboard` hook (which delegates to `src/shared/utils/clipboard.ts`), giving consistent HTTP/HTTPS behavior with the rest of the dashboard and removing the duplicated code path. (thanks @anuragg-saxenaa) - **fix(dashboard): enable Codex Apply / Reset buttons when the CLI is installed** — on the Codex CLI tool card the **Apply** button was disabled whenever `selectedApiKey` was empty, but the local default `sk_omniroute` key is a valid choice when cloud mode is off or no API keys are configured — so Apply was stuck disabled even when the configuration was otherwise complete. **Reset** was also disabled when `codexStatus.hasOmniRoute` was false, which made it impossible to clear Codex configuration on installs that had never been pointed at OmniRoute. The disabled logic is now extracted into a pure helper (`codexButtonState.ts`) covered by unit tests: Apply is disabled only when no model is selected, or when cloud mode is on **and** keys exist **and** none is picked; Reset is disabled only while a reset is in flight. (thanks @anuragg-saxenaa) - **fix(mitm):** gate the sudo password prompt on the **server** platform, not the browser. The MITM control surface previously decided whether to ask for a sudo password by reading the browser's `navigator.userAgent`, which broke a Windows browser hitting a Linux server (no prompt → request rejected with `Missing sudoPassword`) and also forced an unnecessary modal on Linux hosts running as root, with NOPASSWD sudoers, or in minimal containers with no `sudo` binary on PATH. `GET /api/cli-tools/antigravity-mitm` now reports `isWin` and `needsSudoPassword` (probed via a safe `execFileSync("sudo", ["-n", "true"])`, per Hard Rule #13), and the Antigravity tool card uses the server-reported status to decide whether to show the modal. The POST/DELETE handlers stop returning 400 when sudo is genuinely not required. (thanks @hiepau1231) - **fix(embeddings):** forward output dimensions to Gemini for consistent embedding dims. (thanks @nguyenha935) - **fix(translator):** sanitize Read tool args from non-Anthropic models to prevent retry loops. (thanks @GodrezJr2) - **fix(usage):** reuse Gemini CLI project ID for quota checks (avoid re-discovery). (thanks @Delcado19) - **fix(dashboard):** surface manual config CTA when Claude CLI detection fails (remote deployments). (thanks @anuragg-saxenaa) - **fix(executors):** granular reasoning_effort handling for Claude models on GitHub Copilot. (thanks @baslr) - **fix(translator):** strip Claude output_config before MiniMax (rejected upstream). (thanks @hiepau1231) - **fix(translator): OpenAI audio input now reaches Gemini/Antigravity instead of being silently dropped** — `input_audio`/`audio` content parts on the OpenAI→Gemini path matched no handler in `convertOpenAIContentToParts` and were discarded with no error. They are now mapped to a Gemini `inlineData` part with an `audio/` mime type (wav, mp3, …). (thanks @mugnimaestra) - **fix(combo): model lockout now honors a long upstream quota reset instead of retrying within minutes** — when a combo target returned a quota error carrying an explicit long reset (e.g. Antigravity `Resets in 160h27m24s`, a `Retry-After` header), the per-model lockout capped at the short base cooldown (~minutes) and discarded the parsed reset, so the exhausted model kept being retried far too early. The lockout now applies the parsed reset when it exceeds the base cooldown, and the Antigravity error-message parser also matches the plural `Resets in …` phrasing. (thanks @Ansh7473) - **fix(antigravity): Claude models no longer 400 with `Unknown name "output_config"`** — Anthropic/Claude-Code-only fields (`output_config`, legacy `output_format`) leaked into the Google Cloud Code request envelope via its top-level field passthrough, and Google rejects unknown envelope fields with `400 Invalid JSON payload received. Unknown name "output_config"` — breaking every Claude model served through Antigravity in IDEs. Those fields are now dropped before the envelope is built. (thanks @Duongkhanhtool) - **fix(combo): round-robin members fail over faster under concurrency saturation via a configurable queue depth** — when a round-robin combo member was saturated, requests sat in the per-model semaphore's **unbounded** queue and only failed over to the next member after the full `queueTimeoutMs` (default 30s) elapsed — so a burst of agentic requests deep-queued one hot member instead of spilling to healthy ones. The per-model semaphore now accepts a bounded queue depth and emits `SEMAPHORE_QUEUE_FULL` once it is full (the round-robin loop already cascades on that code), so a configured low depth fails over immediately. A new `queueDepth` combo-config knob (global default / provider override / per-combo, default **20** for backward compatibility; **0** = never queue → fail over now) is exposed in Settings → Combo Defaults. ([#3872](https://github.com/diegosouzapw/OmniRoute/issues/3872) — thanks @KooshaPari) - **fix(pricing): default cost rows for Antigravity's Gemini 3.5 Flash tiers + `gemini-pro-agent`** — the Antigravity public catalog (`ANTIGRAVITY_PUBLIC_MODELS`) ships `gemini-3-flash-agent`, `gemini-3.5-flash-low`, and `gemini-pro-agent` as user-callable client ids, but the `ag` block in the default pricing table only carried rows for `gemini-3-flash` / `gemini-3.1-pro-high`, so `getPricingForModel("ag", id)` returned `null` and cost / quota accounting silently fell back to `$0` for those three models. The missing rows are now seeded with the per-MTok rates the upstream quota tier bills at (Flash High/Medium share the legacy `gemini-3-flash` rate; `gemini-pro-agent` shares `gemini-3.1-pro-high`). (thanks @Ansh7473) - **fix(pricing): align Claude Code (`cc`) pricing with current Anthropic per-MTok rates** — the `cc` provider block in the default pricing table had stale numbers across every Claude 4.x family entry — most visibly, `claude-opus-4-5-20251101` was billed at the deprecated Opus 4.1 rate (`input $15` / `output $75`), and `claude-haiku-4-5-20251001` was at half the current Haiku 4.5 rate. The `cached` (cache hit) and `cache_creation` (5-minute cache write) multipliers were also off across Opus 4.6/4.7/4.8, Sonnet 4.5/4.6, Haiku 4.5, and Fable 5. All eight entries now match the rates Anthropic publishes (input, 5m cache write at 1.25x input, cache hit at 0.1x input, output; reasoning billed at the output rate), so cost accounting on the dashboard and per-request usage events stop under- or over-reporting Claude Code spend. (thanks @chulanpro5) - **fix(executors): sanitize Anthropic-shape content parts before GitHub Copilot `/chat/completions`** — Claude models on GitHub Copilot driven from clients like Cursor IDE (e.g. `gh/claude-sonnet-4.6`) failed with `Provider returned error: type has to be either 'image_url' or 'text' (reset after 30s)` because the client passed through Anthropic-shape content parts (`tool_use`, `tool_result`, `thinking`) untouched, and the Copilot chat-completions endpoint only accepts `text`/`image_url`. `GithubExecutor.transformRequest` now serializes any unsupported part type as `text` (preserving the model's context), drops empty parts, and collapses to `null` when an assistant message's only content was tool_calls — `tool_calls` ride alongside untouched. Codex-family models still route through `/responses` unchanged. (thanks @cngznNN) - **fix(sse):** refactor stall detection to reduce false positives on slow but progressing streams. (thanks @zakirkun) - **fix(executors): synthesize `x-opencode-request` for custom-named OpenCode providers** — the OpenCode CLI only emits the `x-opencode-*` header set when the provider id starts with `opencode`; a custom-named provider (e.g. `omniroute`) instead sends `x-session-affinity` / `x-session-id` (mapped to `x-opencode-session` since #4022) but no request-correlation id, so `x-opencode-request` was silently dropped. `OpencodeExecutor` now synthesizes a fresh `x-opencode-request` on that session-affinity fallback path so custom-named providers are not disadvantaged on the opencode.ai upstream. `x-opencode-client` / `x-opencode-project` are intentionally **not** fabricated (no valid client source — an invented value risks upstream rejection) and remain forward-only; `DefaultExecutor` is untouched. ([#4465](https://github.com/diegosouzapw/OmniRoute/issues/4465) — thanks @pizzav-xyz) - **fix(compression): RTK now compresses Anthropic-shape `tool_result` blocks** — `applyRtkCompression` only compressed OpenAI-shape tool results (`role:"tool"`); Anthropic-shape tool results (`tool_result` content blocks inside a `role:"user"` message) were skipped, so coding agents speaking the Anthropic Messages format got zero RTK savings even though RTK's command-aware filters (e.g. `git-status`) would have compressed the output. RTK now treats a message containing a `tool_result` block as eligible (gated by `applyToToolResults`), captures Anthropic `tool_use` blocks for command resolution, and compresses each block's inner text (string or nested text-block array) while preserving `type` + `tool_use_id` exactly — matching what `caveman`/`aggressive` already did. ([#4468](https://github.com/diegosouzapw/OmniRoute/pull/4468) — thanks @diegosouzapw) - **fix(dashboard): request-log auto-refresh no longer dies from a "ghost" load-more on first page load** — the request-log viewer's infinite-scroll `IntersectionObserver` uses a 200px rootMargin, so its sentinel was already intersecting on mount whenever the first page didn't fill the scroll container. That fired a `loadMore()` with no user interaction, growing the window past `PAGE_SIZE` — and auto-refresh only polls while on the first page (`limit <= pageSize`), so it stayed permanently paused (only a manual filter change re-armed it). The observer now grows the window only after a genuine user scroll (new pure `shouldTriggerInfiniteScroll` guard), and a filter change re-arms the guard, so the default first-page view resumes its ~10s auto-refresh. ([#4269](https://github.com/diegosouzapw/OmniRoute/issues/4269) — thanks @tjengbudi) - **fix(sse): large `/v1/chat/completions` requests no longer crash the server with a Node heap OOM** — the chat request body was parsed multiple times along the route (route guard, injection guard, handler), buffering very large payloads several times and pushing concurrent agentic traffic into an out-of-memory crash. The body is now parsed **once** at the route guard and threaded through, so each request is buffered a single time. ([#4380](https://github.com/diegosouzapw/OmniRoute/issues/4380) — thanks @NakHalal) - **fix(guardrails): tighten the `system_prompt_leak` heuristic to stop false positives on agent traffic** — the leak detector flagged normal agent/tool conversations as prompt-leak attempts; it now requires an additional qualifier before flagging, so legitimate agent traffic is no longer blocked. ([#4041](https://github.com/diegosouzapw/OmniRoute/issues/4041) — thanks @KooshaPari) - **fix(translator): drop orphan tool results on the Claude→OpenAI request path** — a `tool_result` with no preceding matching `tool_use` (orphan) produced upstream 500/502 errors for Command Code / Custom OpenAI clients on ≥3.8.26. Orphan tool results are now filtered before the request is sent. ([#4385](https://github.com/diegosouzapw/OmniRoute/issues/4385) — thanks @adityapnusantara) - **fix(providers): register API-key validators for Firecrawl and Jina Reader** — both providers returned "Provider validation not supported" when validating their API key; they now have proper validators registered in `SEARCH_VALIDATOR_CONFIGS`. ([#4401](https://github.com/diegosouzapw/OmniRoute/issues/4401) — thanks @ponkcore) - **fix(providers): generic web-cookie validator must not shadow per-provider validators** — a follow-up to the `AUTH_007` validation work (#4023): the generic web-cookie validator was matching before more specific per-provider validators, so provider-specific validation was skipped. Validator resolution now prefers the per-provider validator. ([#4467](https://github.com/diegosouzapw/OmniRoute/pull/4467) — thanks @diegosouzapw) - **fix(translator): inject a placeholder message when the Responses API `input[]` is empty** — a `POST /v1/responses` with `input: []` translated to `messages: []`, which every upstream Chat-Completions provider rejects (surfaced as a confusing 406); a single placeholder user message is now injected, mirroring the existing empty-string handling. ([#4393](https://github.com/diegosouzapw/OmniRoute/pull/4393) — thanks @diegosouzapw) - **fix(providers): serve the api.airforce live `/models` catalog instead of the stale seed** — the api.airforce provider listed a stale hard-coded seed; it now serves the upstream live `/models` catalog. ([#4395](https://github.com/diegosouzapw/OmniRoute/pull/4395) — thanks @diegosouzapw) - **fix(cli): non-interactive-safe prompts + `context` alias** — the CLI's `confirm()`/prompt helpers no longer hang in non-interactive (piped/CI) contexts, and a singular `context` alias is accepted alongside `contexts`; the contexts workflow is documented. ([#4439](https://github.com/diegosouzapw/OmniRoute/pull/4439), [#4397](https://github.com/diegosouzapw/OmniRoute/pull/4397) — thanks @diegosouzapw) - **fix(cli): `omniroute update` no longer reports a stale "latest" version from npm's cache** — `getLatestVersion()` ran `npm view omniroute version` without `--prefer-online`, so npm could serve a cached value from its HTTP cache and tell users on an older build (e.g. 3.8.30) they were already "running the latest version" even after a newer one (3.8.31) was published. The version check now passes `--prefer-online` to force npm to revalidate against the registry. ([#4376](https://github.com/diegosouzapw/OmniRoute/issues/4376) — thanks @akbardwi) - **fix(sse): `web_search_20250305` no longer 400s on MiniMax's Anthropic-compatible endpoint** — PR #2960 added a Claude→Claude bypass that forwards Anthropic's typed server tool `web_search_20250305` untouched, assuming the Claude-format upstream implements Anthropic server tools. MiniMax's `/anthropic` endpoint does not, so `claude → minimax` requests carrying that tool got `HTTP 400 "invalid params, function name or parameters is empty (2013)"`. `supportsNativeWebSearchFallbackBypass` now consults the (already-plumbed) `provider` and excludes providers known not to implement server tools (currently `minimax`) from the bypass, so the built-in web-search tool is converted to the `omniroute_web_search` function fallback — which MiniMax accepts as a normal function tool. ([#4481](https://github.com/diegosouzapw/OmniRoute/issues/4481) — thanks @shafqatevo) - **fix(command-code): pass `reasoning` / `thinking` fields through to upstream params** — Command Code requests carrying `reasoning`/`thinking` controls had those fields dropped before the upstream call, so reasoning-effort and extended-thinking settings were silently ignored; they are now forwarded to the upstream params. ([#4473](https://github.com/diegosouzapw/OmniRoute/pull/4473) — thanks @adivekar-utexas) - **fix(usage): keep Kiro overage-enabled accounts routable after base quota hits zero** — a Kiro account with overage enabled was excluded from routing once its base quota reached zero, even though overage billing should keep it serving; such accounts now stay routable past base-quota exhaustion. ([#4469](https://github.com/diegosouzapw/OmniRoute/issues/4469) — thanks @heaven321357 / @CleanDev-Fix) - **fix(providers): model-aware `supportsRedactedThinking` for mixed-format providers** — the redacted-thinking capability was resolved per provider rather than per model, so a mixed-format provider (some models support redacted thinking, others don't) got the wrong answer for some models; the check is now model-aware. ([#4479](https://github.com/diegosouzapw/OmniRoute/pull/4479) — thanks @TF0rd) - **fix(usage): parse numeric-string quota reset timestamps as Unix seconds/ms** — when a provider returned the quota reset timestamp as a numeric string (e.g. `"1700000000"`), `parseResetTime` passed it straight to `new Date(str)`, which returned `Invalid Date` and dropped the reset entirely (UI showed no reset). Numeric strings are now detected and treated as Unix timestamps with the same `< 1e12` seconds-vs-ms heuristic already applied to numeric values; ISO/parseable strings are untouched. Applied symmetrically in `codexUsageQuotas.parseResetTime`. (Inspired by upstream [decolua/9router#768](https://github.com/decolua/9router/pull/768) — thanks @DEYLNN) - **fix(usage): clearer "auth expired" message for Kiro accounts added via Google/GitHub social-auth** — a Kiro account created through the `/api/oauth/kiro/social-exchange` flow (Google or GitHub social login) uses a token format that AWS CodeWhisperer's `GetUsageLimits` quota API frequently rejects with 401/403 even when `/messages` still works. The quota card was throwing the raw upstream error blob (`Failed to fetch Kiro usage: Kiro API error (401): {…}`); social-auth accounts now get the same friendly `Kiro quota API authentication expired. Chat may still work.` message that legacy social-auth users with a stored marker already see, while Builder-ID / IDC accounts keep the existing throw-on-failure behavior so transient upstream errors don't get silently masked. (thanks @anuragg-saxenaa) - **fix(dashboard): Antigravity CLI Tools model picker now lists API-Key-Compatible custom providers** — the API-Key-compatible / passthrough provider groups in `ModelSelectModal` are derived from the user's `modelAliases`, but `AntigravityToolCard` was the only CLI tool card that didn't fetch `/api/models/alias` or forward the `modelAliases` prop, so a custom OpenAI-compatible provider added in OmniRoute never surfaced in the Antigravity tool's model picker — routing a custom model to Antigravity from there was impossible. The card now mirrors the pattern already used by every sibling tool card (Codex, Claude, Cline, Kilo, Droid, OpenClaw, HermesAgent). (thanks @mxskeen) - **fix(mitm): cert/DNS operations no longer fail with `spawn sudo ENOENT` on slim Docker images** — slim Docker base images (e.g. `node:24-trixie-slim`) do not ship `sudo`, and OmniRoute's runtime stage runs as `USER node` (UID 1000, non-root), so `execFileWithPassword("sudo", …)` failed unconditionally for any MITM operation triggered from inside the container (cert install, DNS host-file write). A new `isSudoAvailable()` probe gates the `sudo -S` wrapper; when sudo is missing and the process is not root, the underlying command runs directly (same user, no elevation) — same path already taken when running as root. Privileged operations that genuinely need elevation (system trust store, `/etc/hosts`) still error explicitly so operators can mount the CA or hosts file from the host side. (thanks @lokinh) ### 🔒 Security - **fix(sse): use a crypto-secure RNG for combo/deck load-balancing selection** — random combo/deck member selection used a non-cryptographic PRNG, flagged by CodeQL (`#665`); it now uses a crypto-secure RNG. ([#4457](https://github.com/diegosouzapw/OmniRoute/pull/4457) — thanks @diegosouzapw) - **fix(sse): unbiased `crypto.randomInt` for combo selection (follow-up to #4457)** — the initial crypto-secure conversion used modulo reduction over the secure bytes, which introduces a small modulo bias; selection now uses `crypto.randomInt` (rejection sampling) for a uniform, unbiased distribution across combo/deck members. ([#4462](https://github.com/diegosouzapw/OmniRoute/pull/4462) — thanks @diegosouzapw) ### 📝 Maintenance - **refactor(chatCore):** extract `resolveChatCoreRequestSetup` (first setup-phase slice) toward modularizing the chatCore god-file. ([#4392](https://github.com/diegosouzapw/OmniRoute/pull/4392) — thanks @diegosouzapw) - **refactor(chatCore):** extract the Codex service-tier resolvers into a pure `chatCore/serviceTier.ts` leaf (continues the god-file split). ([#4477](https://github.com/diegosouzapw/OmniRoute/pull/4477), [#3501](https://github.com/diegosouzapw/OmniRoute/issues/3501) — thanks @diegosouzapw) - **perf(dashboard):** lazy-load the usage analytics charts so the dashboard's initial bundle/paint is lighter (charts hydrate on demand). ([#4466](https://github.com/diegosouzapw/OmniRoute/pull/4466) — thanks @KooshaPari) - **perf(kiro):** cut request-completion hot-path CPU and cap the DB-lock event-loop block so Kiro request completion does not stall the event loop under load. ([#4459](https://github.com/diegosouzapw/OmniRoute/pull/4459) — thanks @artickc) - **fix(catalog):** restore-green — add OpenAI `gpt-4.1-mini`/`gpt-4.1-nano` + `o3-mini`/`o4-mini` pricing rows to keep the static-parity gate green after the registry expansion (#4394), plus the web-cookie validator shadowing fix. ([#4447](https://github.com/diegosouzapw/OmniRoute/pull/4447) — thanks @diegosouzapw) - **chore(quality):** reconcile file-size + complexity baselines after the `/review-prs` round, and the `server.ts` file-size baseline after the pool-tools registration (#3368). ([#4461](https://github.com/diegosouzapw/OmniRoute/pull/4461), [#4423](https://github.com/diegosouzapw/OmniRoute/pull/4423) — thanks @diegosouzapw) - **docs(remote-mode):** add a copy-paste end-to-end verification example. ([#4430](https://github.com/diegosouzapw/OmniRoute/pull/4430) — thanks @diegosouzapw) - **docs:** add operational documentation (usage/quota, database, open-sse architecture, monitoring). ([#3455](https://github.com/diegosouzapw/OmniRoute/pull/3455) — thanks @oyi77) --- ## [3.8.31] — 2026-06-20 ### ✨ New Features - **feat(translator):** Gemini accepts OpenAI `input_audio` and `audio_url` content parts. (thanks @mugnimaestra) - **perf(dashboard): combos UI leaf-split, Next.js config tuning, 1-click Redis & Bifrost sidecar** — delivers four of the five performance/UX tracks from the #3932 thread: the combos dashboard page is split into focused leaf components (smaller bundles, faster reloads), `next.config` is tuned for the standalone build, Redis can be provisioned in one click, and a Bifrost sidecar option is wired in. (The fifth track — chatLogHelpers extraction — was already covered upstream and dropped.) ([#4381](https://github.com/diegosouzapw/OmniRoute/pull/4381) — thanks @KooshaPari) ### 🐛 Fixed - **fix(embeddings): NVIDIA NIM asymmetric embedding models inject the required `input_type`** — NVIDIA NIM asymmetric embedders (e.g. `nvidia/nv-embedqa-e5-v5`) reject requests without an `input_type` parameter with `400 "'input_type' parameter is required"`, but OmniRoute only forwarded `input_type` when the client supplied it — so callers (and OpenAI-style SDKs that don't emit the field) got a hard failure. The embedding registry now carries a model-level default (`input_type: "query"`) for the asymmetric NVIDIA model, and the embeddings handler injects a model's default params into the upstream body **only** when the client didn't already send them — a client-supplied `input_type` (e.g. `"passage"`) is respected unchanged, and symmetric models that carry no default are unaffected. ([#4341](https://github.com/diegosouzapw/OmniRoute/pull/4341) — thanks @hydraromania) - **fix(api): migrate the deprecated Codex `[features].codex_hooks` flag to `[features].hooks`** — Codex renamed the `codex_hooks` feature flag to `hooks`; recent Codex CLI versions ignore the old key and print a deprecation notice. When OmniRoute rewrites an existing `~/.codex/config.toml` (configuring/resetting the Codex provider) it now carries the user's intent forward by renaming `[features].codex_hooks` → `[features].hooks` (preserving its value, never clobbering an already-present `hooks`) and dropping the deprecated key. No-op when the flag is absent. ([#4342](https://github.com/diegosouzapw/OmniRoute/pull/4342) — thanks @Bian-Sh) - **fix(translator): same-format response path no longer leaks a `data: null` SSE event** — the streaming response translator's same-format fast path returned `[chunk]` unconditionally, so the end-of-stream null/flush signal (`chunk === null`) propagated as a literal `[null]`. Downstream this surfaced as an empty `data: null` SSE event between chunks and crashed strict clients (e.g. Factory Droid BYOK on `/v1/responses`). The fast path now drops the null flush (returns `[]`) while still passing real chunks through unchanged. ([#4344](https://github.com/diegosouzapw/OmniRoute/pull/4344) — thanks @thaitryhand) - **fix(translator): strip client-only assistant echo fields on the OpenAI target path (Mistral 422)** — strict OpenAI-compatible upstreams (e.g. `mistral/codestral-latest`) reject client-only assistant "echo" fields sent back as input history with `422 extra_forbidden` (the report hit `messages[].assistant.reasoning_content` via Codex `/responses`). Only `reasoning_content` was being stripped on the OpenAI target path; the sibling echo fields `reasoning`, `refusal`, `annotations` and `cache_control` leaked through and tripped the 422. They are now all dropped on the non-reasoner OpenAI target path. `audio` is deliberately preserved (OpenAI audio models reference a prior assistant audio response by id on multi-turn; Mistral never emits audio, so nothing is lost there). ([#4350](https://github.com/diegosouzapw/OmniRoute/pull/4350) — thanks @xxy9468615) - **fix(translator): accept AI SDK-style `{ type: "image", image: "data:…" }` content parts** — several OpenAI-input translators only recognized images shaped as `image_url.url` (or an object with `.source`/`.url`), so an AI SDK-style part where `image` is a bare data-URL **string** was silently dropped before reaching a vision provider (OpenCode is one affected client; the gap is generic). The OpenAI→Claude, OpenAI→Kiro and OpenAI→Gemini/Antigravity translators now parse a string `image` data URL into each provider's native image shape (Claude `{source:{type:"base64"}}`, Kiro `images[].source.bytes`, Gemini `inlineData`). ([#4345](https://github.com/diegosouzapw/OmniRoute/pull/4345) — thanks @mugnimaestra) - **fix(translator): Gemini accepts HTTP/HTTPS image URLs instead of silently dropping them** — the OpenAI→Gemini request helper (`convertOpenAIContentToParts`) discarded remote `image_url` parts (emitting only a `console.warn`) because Gemini's `inlineData` needs base64 and the synchronous helper can't fetch+encode upstream. It now uses Gemini's native `fileData: { fileUri }` part for HTTP/HTTPS URLs (the model fetches the asset itself), so vision requests carrying a URL — not a `data:` URI — reach Gemini intact. ([#4373](https://github.com/diegosouzapw/OmniRoute/pull/4373) — ported from 9router#344, thanks @diegosouzapw) - **fix(executors): strip `stream_options` for qwen non-streaming / thinking Claude-Code requests** — Claude-Code-compatible providers force the executor-level `stream` flag on while the outgoing body keeps the caller's original `stream: false`, so `DefaultExecutor.transformRequest` injected `stream_options: { include_usage: true }` onto a body that still said `stream: false`, and qwen rejected it with `400 "'stream_options' only set this when you set stream: true"`. The executor now strips `stream_options` whenever the body's effective `stream` is false. ([#4374](https://github.com/diegosouzapw/OmniRoute/pull/4374) — ported from 9router#663, thanks @anuragg-saxenaa / @diegosouzapw) - **fix(executors): don't inject `thinking` when `tool_choice` forces a tool (native Claude)** — the Claude-Code wire-image emulation injects `thinking: { type: "adaptive" }` for non-Haiku Claude models, but Anthropic rejects `thinking` when `tool_choice` forces a specific tool (`{type:"any"|"tool"}`) with `400 "Thinking may not be enabled when tool_choice forces tool use."`. Any Opus/Sonnet call that pins a tool (e.g. Claude Code's `message_user`, or agent harnesses that force a tool) hit a hard 400; the injection is now suppressed when `tool_choice` forces a tool. ([#4389](https://github.com/diegosouzapw/OmniRoute/pull/4389) — thanks @NomenAK) - **fix(codex): request reasoning summaries on Codex Responses requests** — Codex/OpenAI Responses can return reasoning-token accounting and empty reasoning items unless visible reasoning summaries are requested, so Codex CLI / pi.dev paths missed visible thinking text. OmniRoute now requests `reasoning.summary: "auto"` (and includes `reasoning.encrypted_content`) when reasoning is enabled — preserving an explicit client `reasoning.summary` and existing `include` entries, and skipping it for `reasoning.effort: "none"`. ([#4359](https://github.com/diegosouzapw/OmniRoute/pull/4359) — thanks @xz-dev) - **fix(sse): default the combo per-target timeout to 120s for fast failover** — a combo's per-target timeout inherited the full `FETCH_TIMEOUT_MS` (600s default) when the combo didn't set `targetTimeoutMs`, so a single hung/slow target (e.g. an openai-compatible upstream returning 524/504) could stall the **whole** combo for up to 10 minutes before failing over. A new `DEFAULT_COMBO_TARGET_TIMEOUT_MS = 120_000` is used as the default-when-unset in `resolveComboTargetTimeoutMs` (backward-compatible 3rd arg, wired in `phaseComboSetup`); an explicit ceiling/opt-out is preserved. ([#4365](https://github.com/diegosouzapw/OmniRoute/pull/4365) — thanks @diegosouzapw) - **fix(cli): Tailscale login honors `TAILSCALE_AUTHKEY` for non-interactive sign-in** — `startTailscaleLogin` built `tailscale up` without ever reading `process.env.TAILSCALE_AUTHKEY`, so on a pre-authenticated / headless daemon the login waited for an interactive auth URL and timed out (~15s). When `TAILSCALE_AUTHKEY` is set it is now passed via `--auth-key=` (as a spawn argv element — no shell interpolation) so the daemon authenticates non-interactively; when unset, behavior is unchanged. ([#4343](https://github.com/diegosouzapw/OmniRoute/pull/4343) — thanks @ipeterpetrus) - **fix(dashboard): OAuth modal shows the real error on a non-JSON server response** — the OAuth connect/reauth modal called `await res.json()` unconditionally, so when a build/OAuth endpoint returned a plain-text error (e.g. a `500 Internal Server Error` page) the modal threw `Unexpected token 'I'…` and hid the real failure. Two shared helpers (`parseResponseBody` / `getErrorMessage` in `src/shared/utils/api.ts`) now read the body safely (JSON when it is JSON, raw text otherwise) and surface a clean message either way; all modal fetch sites use them. ([#4351](https://github.com/diegosouzapw/OmniRoute/pull/4351) — thanks @DNNYF) - **fix(dashboard): a disabled connection's last error is now visible** — the provider card's error badge counts a disabled connection (`isActive === false`) that has an error (its effective status is still error/expired/unavailable), but the connection row hid the `lastError` text for disabled rows — so the operator saw the error count without being able to see what failed. The row now shows the error text whenever there is one, regardless of the active toggle. ([#4352](https://github.com/diegosouzapw/OmniRoute/pull/4352) — thanks @ntdung6868) - **fix(providers): the "Test Connection One-by-One" OAuth probe can no longer hang the queue forever** — the OAuth connection-test path called bare `fetch(url, { method, headers })` with no `AbortController`/signal/timeout, so when a provider's probe endpoint accepted the socket but never responded, the awaited fetch never settled and the one-by-one test queue stalled indefinitely (the API-key path was already bounded via `validateProviderApiKey`'s `timeoutMs`). Both the initial probe and the post-refresh retry are now bounded with `AbortSignal.timeout(30s)` — matching the API-key path's 30s budget — and a timed-out probe resolves as a failure with a clear `Test timed out after 30s` message in the same shape as every other test error. ([#4347](https://github.com/diegosouzapw/OmniRoute/pull/4347) — thanks @ntdung6868) - **fix(providers): a deactivated account is labeled distinctly from a revoked token** — a Codex connection whose OAuth refresh is fully healthy but whose ChatGPT account has been deactivated by the provider gets a `401` from the upstream API. The connection test labeled that the same as a bad credential (`Token invalid or revoked` → `upstream_auth_error`), so the operator couldn't tell a deactivated account from a revoked token. The test now reads the `401`/`403` body and, when it indicates account deactivation, classifies it as `account_deactivated` — which the dashboard already renders as "Account Deactivated". A plain auth `401` is unchanged. ([#4353](https://github.com/diegosouzapw/OmniRoute/pull/4353) — thanks @ntdung6868) - **fix(db): cascade-delete orphaned model aliases when a provider is removed** — deleting a custom provider removed its connections and node but left behind the imported model-alias rows (stored as `key=`, `value="/"`). Those stale aliases then blocked re-importing the same provider — the import dedup treated them as "already exists", so no new models appeared. A new `deleteModelAliasesForProvider(providerId)` DB helper drops every alias whose stored value begins with `/` (leaving other providers and user-defined settings aliases untouched), and the provider-node DELETE handler now calls it after removing the connections and node, so a fresh import is unblocked. ([#4348](https://github.com/diegosouzapw/OmniRoute/pull/4348) — thanks @nguyenvanhuy0612) - **fix(api): persist `max_input_tokens` / `max_output_tokens` when adding a custom model** — `POST /api/provider-models` silently dropped the per-model token limits set in the "add custom model" form: the handler destructured the rest of the body but never read `max_input_tokens` / `max_output_tokens`, and `addCustomModel()` had no parameter for them, so the values were thrown away on write. The DB layer (`inputTokenLimit` / `outputTokenLimit`) and the `/v1/models` catalog already round-trip these fields — only the write path was missing. The validation schema now accepts the two optional limits, the handler forwards them, and `addCustomModel()` persists them so a custom model's context/output window survives into the catalog. ([#4349](https://github.com/diegosouzapw/OmniRoute/pull/4349) — thanks @codename-zen) - **fix(plugin): the OpenCode static-catalog plugin prefixes combo/raw model keys with the provider id** — OpenCode's static-catalog reader misdetected the `omniroute` provider: combo keys emitted as `combo/MASTER` were parsed as provider `combo` ("No credentials for provider: omniroute"), while a bare-`MASTER` form was misread as a model with no resolvable provider, and mixed `omniroute/MASTER` + bare-raw keys were rejected by OpenCode's schema. The plugin now emits every combo and raw model key prefixed with the `omniroute` provider id, emits the provider id explicitly, and drops the legacy `combo/` prefix — so the static-catalog reader detects the provider and the auth loader returns the right credentials (the catalog-fetch timeout was also raised so a cold-start server doesn't publish an empty stub). ([#4384](https://github.com/diegosouzapw/OmniRoute/pull/4384) — thanks @herjarsa) - **fix(translator): inject placeholder message when Responses API input[] is empty (prevents upstream 400)** — a client (e.g. Fabric-AI) calling `POST /v1/responses` with `input: []` used to be translated into `messages: []`, which every upstream Chat-Completions provider rejects with `400: at least one message is required` (surfaced to the client as a confusing 406). The translator now treats an empty `input[]` the same as an empty string — a placeholder user message is injected so the request is always valid. (thanks @anuragg-saxenaa) - **fix(embeddings): NVIDIA NIM asymmetric embedding models inject the required `input_type`** — NVIDIA NIM asymmetric embedders (e.g. `nvidia/nv-embedqa-e5-v5`) reject requests without an `input_type` parameter with `400 "'input_type' parameter is required"`, but OmniRoute only forwarded `input_type` when the client supplied it — so callers (and OpenAI-style SDKs that don't emit the field) got a hard failure. The embedding registry now carries a model-level default (`input_type: "query"`) for the asymmetric NVIDIA model, and the embeddings handler injects a model's default params into the upstream body **only** when the client didn't already send them — a client-supplied `input_type` (e.g. `"passage"`) is respected unchanged, and symmetric models that carry no default are unaffected. (thanks @hydraromania) ### 🔒 Security - **fix(security): scope the OAuth callback `postMessage` to a trusted-origin allowlist** — the OAuth callback at `/callback` previously posted `{ code, state, … }` to `window.opener.postMessage(…, "*")` whenever the opener was cross-origin, so a hostile page that opened the well-known redirect URI in a popup could receive the OAuth code/state and complete the flow as the user. The wildcard fallback is replaced with iteration over a fixed allowlist (same-origin + Codex's `localhost:1455` / `127.0.0.1:1455` loopback helper); the browser silently drops `postMessage` to any opener whose origin isn't listed. ([#4372](https://github.com/diegosouzapw/OmniRoute/pull/4372) — ported from 9router#998, thanks @aeonframework / @diegosouzapw) - **fix(mitm): exact host membership in the MITM hosts test (CodeQL false positive)** — `tests/unit/mitm-tool-hosts.test.ts` checked host membership with `Array.includes(host)`, which CodeQL's `js/incomplete-url-substring-sanitization` heuristic misreads as a `String.includes()` URL-substring sanitization test (HIGH false positive). Switched to `.some((h) => h === host)` — identical semantics, no flagged pattern. ([#4386](https://github.com/diegosouzapw/OmniRoute/pull/4386)) ### 📝 Maintenance - **docs: one-time feature-documentation catch-up (v3.8.20 → v3.8.30)** — reconciled the docs with every user-facing feature shipped since v3.8.20: a new README **✨ What's New** section; new guides for [CLI integrations](docs/guides/CLI-INTEGRATIONS.md), [MITM TPROXY transparent decrypt](docs/security/MITM-TPROXY-DECRYPT.md) and [delegated Anthropic Context Editing](docs/compression/CONTEXT_EDITING.md); refreshed AUTO-COMBO (`auto/:` + Arena-ELO), API_REFERENCE (`x-omniroute-no-memory`), MEMORY (int8 quantization, off-by-default), RESILIENCE (model-lockout success-decay), RTK, AGENTBRIDGE, TRAFFIC_INSPECTOR, GUARDRAILS, CLOUD_AGENT, ENVIRONMENT; regenerated PROVIDER_REFERENCE (231 providers) and synced the provider count in README/CLAUDE/AGENTS. Going forward this runs every release (generate-release step 6b). ([#4391](https://github.com/diegosouzapw/OmniRoute/pull/4391)) - **refactor(chatCore): extract the `checkHeapPressureGuard` leaf (god-file decomposition start)** — first increment of decomposing `chatCore.ts` (~5127 LOC, the hottest path — every chat request flows through `handleChatCore`). The V8 heap-pressure guard at the top of `handleChatCore` (rejects with 503 when `heapUsed` exceeds the shed threshold) is moved to a self-contained, co-located `utils/heapPressure.ts::checkHeapPressureGuard(...)` with no behavior change. ([#4371](https://github.com/diegosouzapw/OmniRoute/pull/4371) — thanks @diegosouzapw) - **refactor(combo): de-dup the exhausted-target skip predicate across both dispatchers** — the byte-identical `#1731`/`#1731v2` pre-check (skip a target already exhausted on the provider/connection within a request) lived in both combo dispatchers; extracted to a shared `combo/comboPredicates.ts` helper. ([#4362](https://github.com/diegosouzapw/OmniRoute/pull/4362) — thanks @diegosouzapw) - **refactor(combo): de-dup the upstream-error exhaustion classification across both dispatchers** — both dispatchers ran a near-identical post-error block classifying the upstream error and updating the exhaustion Sets (`#1731` provider exhausted / `#1731v2` connection error / transient rate-limited); extracted to a shared `combo/targetExhaustion.ts::applyComboTargetExhaustion(...)`. ([#4366](https://github.com/diegosouzapw/OmniRoute/pull/4366) — thanks @diegosouzapw) - **chore(cli): localize CLI / scraping copy and stabilize fetch, memory & coverage handling** — localizes CLI and scraping UX copy plus the Adapta onboarding tutorial (and corrects the CLI Code page title), makes fetch retries honor the start timeout, tightens SSE/response typing, respects configured memory token limits during search, and reduces CI coverage-merge memory by merging V8 data incrementally. ([#4383](https://github.com/diegosouzapw/OmniRoute/pull/4383) — thanks @JxnLexn) - **test(combo): reset circuit breakers between stream-readiness cases (restore green)** — a stream-readiness fallback case failed on the release branch since the cycle-open tip due to test isolation: earlier combo-dispatch cases in the same file deliberately fail `glm` (tripping the module-level provider circuit breaker), and that OPEN state leaked into the next test so `combo.ts` skipped the model. The test now resets the circuit breakers between cases. ([#4396](https://github.com/diegosouzapw/OmniRoute/pull/4396) — thanks @diegosouzapw) - **chore(quality): reconcile the complexity ratchet baseline (1896 → 1900)** — absorbs the small complexity-metric increase from the v3.8.31 `/review-prs` merge batch into `quality-baseline.json` so the ratchet reflects the shipped code (no production change). ([#4410](https://github.com/diegosouzapw/OmniRoute/pull/4410) — thanks @diegosouzapw) - **test/gate: reconcile release-time drift surfaced by the full CI gate** — three already-merged changes left the release branch's full-CI gate red (the per-PR fast gates don't run it): the Gemini `convertOpenAIContentToParts` tests were realigned to the [#4373](https://github.com/diegosouzapw/OmniRoute/pull/4373) HTTP/HTTPS-URL `fileData` pass-through (they still asserted the old warn-and-drop behavior), the `t11` any-budget for `open-sse/executors/base.ts` was raised to 2 with a justification ([#4389](https://github.com/diegosouzapw/OmniRoute/pull/4389) compares `tool_choice` against the string literal `"any"`, not a TS `any` type), and the [#4384](https://github.com/diegosouzapw/OmniRoute/pull/4384) opencode-plugin combos test's net-assert reduction (dropping the obsolete `combo/` namespace) was allowlisted. No production behavior change. (thanks @diegosouzapw) --- ## [3.8.30] — 2026-06-20 ### ✨ New Features - **feat(dashboard): category (media serviceKind) filter on the providers page** — `/dashboard/providers` gains a media-category filter row (Image / Video / Music / Text→Speech / Speech→Text / Embedding) that composes with the existing search, free-only and "show configured only" filters. Membership is derived from the backend media registries (a provider that serves a kind is surfaced even if it never declared `serviceKinds`), keeping the UI in lockstep with the backend. ([#4240](https://github.com/diegosouzapw/OmniRoute/issues/4240)) - **feat(combo): per-step account allowlist — scope a round-robin/weighted step to a subset of a provider's connections** — a combo model step can now carry a first-class account allowlist so a round-robin (or weighted) strategy is scoped to a chosen subset of a provider's connections (e.g. only `foo1`+`foo2` out of `foo1..foo4`) without hand-pinning one step per account. Empty = the whole active pool (unchanged). When a step both has an allowlist and is tag-routed, the two intersect (most-restrictive wins); a single pinned account still takes precedence. The combo builder's Precision step editor gains an optional "Restrict to accounts" picker. ([#3266](https://github.com/diegosouzapw/OmniRoute/issues/3266)) - **feat(providers): add OpenAdapter, dit.ai and TokenRouter as OpenAI-compatible providers** — three community-requested OpenAI-compatible aggregators now register as standard named OpenAI-style providers with live `/v1/models` discovery (the zenmux pattern), falling back to a seeded catalog when the upstream list is unavailable: **OpenAdapter** (`https://api.openadapter.in/v1`, free tier, 70+ open-source models — [#4239](https://github.com/diegosouzapw/OmniRoute/issues/4239)), **dit.ai** (`https://api.dit.ai/v1`, dynamic-pricing router/gateway — [#4155](https://github.com/diegosouzapw/OmniRoute/issues/4155)), and **TokenRouter** (`https://api.tokenrouter.com/v1`, free MiniMax model — [#3841](https://github.com/diegosouzapw/OmniRoute/issues/3841), thanks @FerLuisxd). No custom executor/translator — default OpenAI passthrough. - **feat(api): `x-omniroute-no-memory` request header — per-request opt-out of memory/skills injection** — clients that manage their own context (e.g. their own RAG/memory) can send `x-omniroute-no-memory: true` (mirrors the existing `x-omniroute-no-cache` convention) to skip the gateway injecting up to `memorySettings.maxTokens` (~2k) tokens of memory **and** skills context into that chat request — avoiding the token/cost inflation it otherwise adds on every call. Absent the header, behavior is unchanged. (PRD-2026-06-19-no-memory-header) - **feat(dashboard): MITM tool card lists the exact hosts-file entries to add manually** — the CLI-tools MITM card's "How it works" section now lists the full set of `127.0.0.1 ` lines for the selected tool (sourced from the canonical MITM target registry) instead of a single example domain. Users on locked-down machines — where the automatic, sudo-gated hosts-file edit isn't available — can now copy every required entry by hand. (thanks @mrcyclo) - **feat(cli): `omniroute launch-codex` + `setup-codex` — run/configure the Codex CLI against OmniRoute** — a launcher and setup command that point the Codex CLI at an OmniRoute endpoint (remote-mode aware). ([#4270](https://github.com/diegosouzapw/OmniRoute/pull/4270)) - **feat(cli): Claude Code launcher + setup — remote mode + profiles** — `omniroute launch`/`setup` for Claude Code with remote-mode support and named connection profiles. ([#4274](https://github.com/diegosouzapw/OmniRoute/pull/4274)) - **feat(cli): OpenCode setup — OpenAI-compatible provider + remote-aware plugin** — `setup-opencode` registers OmniRoute as an OpenAI-compatible provider for OpenCode and installs a remote-aware plugin. ([#4277](https://github.com/diegosouzapw/OmniRoute/pull/4277)) - **feat(cli): one-command setup for popular AI coding tools** — new `setup-*` commands that configure each tool to talk to OmniRoute: **Cline** ([#4280](https://github.com/diegosouzapw/OmniRoute/pull/4280)), **Kilo Code** ([#4284](https://github.com/diegosouzapw/OmniRoute/pull/4284)), **Continue** ([#4289](https://github.com/diegosouzapw/OmniRoute/pull/4289)), **Cursor** ([#4291](https://github.com/diegosouzapw/OmniRoute/pull/4291)), **Roo Code** ([#4292](https://github.com/diegosouzapw/OmniRoute/pull/4292)), **Crush** ([#4298](https://github.com/diegosouzapw/OmniRoute/pull/4298)), **Goose** ([#4300](https://github.com/diegosouzapw/OmniRoute/pull/4300)), **Qwen Code** ([#4301](https://github.com/diegosouzapw/OmniRoute/pull/4301)), **Aider** ([#4302](https://github.com/diegosouzapw/OmniRoute/pull/4302)) and the **Gemini CLI** (native `/v1beta`) ([#4303](https://github.com/diegosouzapw/OmniRoute/pull/4303)). - **feat(providers): provider model sweep — live discovery, refreshed catalogs, dead-provider cleanup** — a broad sweep that enables live `/v1/models` discovery for more OpenAI-style providers (the zenmux pattern), refreshes the seeded catalogs with current models, and marks dead providers `deprecated`. ([#4324](https://github.com/diegosouzapw/OmniRoute/pull/4324)) - **feat(mitm): translate Antigravity cloudcode end-to-end (Gap B)** — the MITM decrypt path now translates Antigravity `cloudcode` traffic end-to-end. ([#4299](https://github.com/diegosouzapw/OmniRoute/pull/4299)) - **feat(keys): per-key USD usage quota controls** — an API key can now carry a USD spend quota that caps its usage once the threshold is reached. ([#4327](https://github.com/diegosouzapw/OmniRoute/pull/4327) — thanks @Witroch4) ### 🔧 Changed - **change(memory): memory is now OFF by default** — `DEFAULT_MEMORY_SETTINGS.enabled` now defaults to `false`. Enabling memory injects up to ~2,000 tokens of retrieved context into **every** chat request (and that context is billed), which was a surprising default for new installs and for clients with their own context. Memory is now an explicit opt-in: installs that already enabled it keep it on; installs that never configured it default to off. The Settings → Memory panel now shows a token-cost warning when memory is enabled. (PRD-2026-06-19-no-memory-header) ### 🐛 Fixed - **fix(translator): Gemini accepts HTTP/HTTPS image URLs (no longer silently dropped)** — OpenAI-style `image_url` parts whose URL was `http://…` or `https://…` reached `convertOpenAIContentToParts` (the OpenAI→Gemini request helper) and were dropped with only a `console.warn`, because Gemini's `inlineData` requires base64 and the helper is synchronous (it cannot fetch + encode). Gemini's `Part` schema, however, natively accepts `fileData: { fileUri }` for remote URIs — the model fetches the asset itself. The helper now emits a `fileData` part (`mimeType: "image/*"`, inferred upstream on fetch) instead of dropping, so vision requests that pass a URL — not a data: URI — now reach Gemini intact. `data:` URIs still go through `inlineData` unchanged; unsupported schemes (e.g. `ftp:`) are still skipped. (thanks @East-rayyy) - **fix(security): OAuth callback page no longer relays `code`/`state` to a wildcard `postMessage` target** — the OAuth callback at `/callback` posted `{ code, state, ... }` to `window.opener.postMessage(..., "*")` whenever the opener was cross-origin (a fallback for the legitimate remote-dashboard + local-loopback callback scenario). A hostile page that opened the callback URL in a popup against the well-known redirect URI would therefore receive the OAuth code+state and could complete the OAuth flow as the user. The wildcard fallback is replaced with an iteration over a fixed allowlist of trusted target origins (same-origin + Codex's loopback helper at `localhost:1455` / `127.0.0.1:1455`); the browser silently drops the message for any opener whose origin is not in the list. Methods 2 (`BroadcastChannel`) and 3 (`localStorage`) — already in the page — still cover same-origin parents when the opener was severed by COOP. (thanks @aeonframework) - **fix(compliance): startup cleanup honors the dashboard data-retention setting instead of always trimming to 7 days** — on every restart, `cleanupExpiredLogs()` (run at startup) read retention only from the `CALL_LOG_RETENTION_DAYS` / `APP_LOG_RETENTION_DAYS` env vars, which default to **7 days** when unset, and trimmed `usage_history` (the Usage Analysis data) before the dashboard-based `runAutoCleanup()` — which respects the configured retention — ever ran. So a dashboard "Data Retention" of 90 days was silently overridden and the Usage Analysis page only ever showed the last 7 days after a restart. Retention now follows the precedence **explicit env var → dashboard DB setting → 7-day default**, per table (`usage_history`→`usageHistory`, `call_logs`/`proxy_logs`/`request_detail_logs`→`callLogs`, `mcp_tool_audit`→`mcpAudit`); an operator who sets the env var still wins, and non-DB deployments still fall back to it. ([#4354](https://github.com/diegosouzapw/OmniRoute/issues/4354) — thanks @akbardwi) - **fix(providers): bailian-coding-plan static fallback catalog matches the registry (10 models)** — the provider-model sweep (#4324) added four current Model Studio coding-plan models (`qwen3.7-plus`, `qwen3-coder-plus`, `qwen3-coder-next`, `glm-4.7`) to the `bailian-coding-plan` registry entry but missed the static fallback mirror in `staticModels.ts`, which still listed only the older six. The static catalog (served when live discovery is unavailable) therefore diverged from the registry, and the existing static↔registry parity test went red on the release branch (only surfacing when test-impact analysis happened to select it). The static mirror now carries all ten models in registry order, restoring parity. ([#4324](https://github.com/diegosouzapw/OmniRoute/pull/4324)) - **fix(executors): ArenaLLM accepts LMArena's split Supabase SSR auth cookie** — LMArena migrated to `@supabase/ssr` chunked auth cookies: the single `arena-auth-prod-v1` cookie is now empty and the real session is split across `arena-auth-prod-v1.0`, `arena-auth-prod-v1.1`, … (ascending). A user who pasted the (now-empty) single cookie therefore sent an empty session and upstream rejected it as "invalid cookie". The LMArena executor now reconstructs the single cookie from its chunks — reading `.0`, `.1`, … in ascending numeric order until one is missing and concatenating their raw values (`@supabase/ssr`'s `combineChunks` rule: plain `join("")`, no base64-decode, no JSON-parse, the `base64-` prefix kept verbatim) — while preserving the rest of the pasted jar. A non-empty single cookie is still forwarded unchanged (back-compat). The credential UX now instructs pasting the **full Cookie header** and tracks the `.0`/`.1` storage keys. ([#4271](https://github.com/diegosouzapw/OmniRoute/issues/4271) — thanks @caussao) - **fix(compression): preserve the cacheable prefix for automatic-cache providers** — OpenAI / Codex (and Azure-OpenAI) use _automatic_ prefix caching: the upstream caches the longest matching prefix of a request (system prompt + earliest messages) **without** any explicit `cache_control` markers in the body. The cache-aware compression guard only protected that prefix when the request carried explicit `cache_control`, so for automatic-cache providers the guard was skipped — and with compression enabled and `preserveSystemPrompt: false` (or a prefix-compressing mode like `aggressive`/`ultra`) it rewrote the system prompt / earliest messages, guaranteeing a cache miss and **higher** token spend through OmniRoute than going direct. The guard now treats a caching provider as sufficient on its own (`isCachingProvider` alone, independent of `cache_control`) to skip the system prompt and downgrade prefix-compressing modes, and OpenAI/Codex/Azure are now recognized as caching providers. Compression is still off by default — this only affects operators who enabled it with prefix preservation turned off. ([#3955](https://github.com/diegosouzapw/OmniRoute/issues/3955)) - **fix(executors): DuckDuckGo AI Chat uses duckduckgo.com (fixes 400)** — the DuckDuckGo AI Chat executor fetched status/chat and set `Origin`/`Referer` against `https://duck.ai` while still sending `Sec-Fetch-Site: same-origin`, so the request's same-origin triplet (host + Origin + Referer) was inconsistent and the backend rejected it with HTTP 400. All current DDG reverse-engineering references — and the provider registry's own `baseUrl` — use `https://duckduckgo.com`; the executor now uses it consistently for the status URL, chat URL, `Origin`, and `Referer` (the same-origin header is now coherent). The `x-fe-version` scrape regex also required a 40-hex tail but the real served token has a 20-hex tail (e.g. `serp_20250401_100419_ET-19d438eb199b2bf7c300`), so it silently fell back to a hardcoded default; the pattern is relaxed to a bounded `{20,40}` tail (still ReDoS-safe). This addresses the DuckDuckGo half of the report; the separate Chipotle/`chipotle` upstream breakage is tracked independently. ([#4037](https://github.com/diegosouzapw/OmniRoute/issues/4037) — thanks @daniij) - **fix(security): bound the prompt-injection scan to the first 16 KB (hot-path perf)** — the prompt-injection guard joined every message/system string into one buffer and ran several regexes over the **whole** thing on every chat request, with no size cap — so a 300 KB body (pasted code, RAG context) meant O(body) CPU scanning on the hot path, a self-inflicted latency/GC source under concurrency. Both detection call sites (`detectInjection` in `inputSanitizer.ts` and the custom-pattern scan in `promptInjection.ts`) now slice the joined text to the first **16 KB** (`MAX_INJECTION_SCAN_BYTES`) before the regex loop. Injection directives sit near the top of a prompt, so the generous cap preserves real detection while scanning only a bounded prefix; the existing 10 MB body-size cap (which protects ingestion) is unchanged. ([#3932](https://github.com/diegosouzapw/OmniRoute/issues/3932) — thanks @KooshaPari) - **fix(sse): retry direct-connection socket failures on a fresh socket (fewer `502` bursts)** — the default direct-connection undici dispatcher pools keep-alive sockets for up to 4 s, but some edges (e.g. `nvidia`, `opencode-zen`) silently close idle keep-alive sockets within that window, so the next request reusing a pooled socket fails with `UND_ERR_SOCKET` ("other side closed") — in bursts. `proxyFetch` already retried once on such transient errors, but the retry reused the **same** pooled dispatcher and could grab another stale socket, then fell through to native fetch (which also pools) → the job sat in the rate-limit queue until the 30 s timeout → `502` + circuit-breaker open. The retry now uses a dedicated **no-keep-alive / no-pipelining** dispatcher so it opens a brand-new socket that can't be a dead pooled one; the first attempt still uses the pooled dispatcher (healthy keep-alive reuse is preserved). Complements the v3.8.29 diagnostics (`describeFetchCause`, #4281). ([#4252](https://github.com/diegosouzapw/OmniRoute/issues/4252) — thanks @klimadev) - **fix(sse): combo now stops at the first body-specific 400 instead of trying every target** — the `#2101` guard that detects a body-specific 400 (context overflow / malformed / model-access-denied, e.g. "model is not supported when using Codex with a ChatGPT account") logged "stopping combo" but executed a bare `break`, which only exited the inner retry loop; `executeTarget` then returned `null` and the outer target loop treated that as "this target produced nothing" and advanced to the next model. A combo of N targets that all reject the same request body therefore marched through all N (the report shows a 143-model Codex combo iterating every target), wasting upstream calls and per-attempt work. The guard now surfaces the 400 via the `{ ok, response }` contract (mirroring the 499 client-disconnect path) so the combo resolves and stops immediately. ([#4279](https://github.com/diegosouzapw/OmniRoute/issues/4279)) - **fix(sse): non-streaming combo over a Responses-API target no longer returns empty content** — a Responses-API target (codex/`cx`) streams from upstream even on `stream:false`, and its terminal `response.completed` snapshot can carry a non-empty `output` that lacks the assistant message item (e.g. only a `reasoning` item) while the streamed `output_text` deltas had reconstructed the full message. The SSE→JSON aggregator preferred the terminal `output` wholesale, dropping the reconstructed text → HTTP 200 with empty content (hit notably via n8n, which defaults to `stream:false`). The aggregator now falls back to the reconstructed delta output when the terminal output has no message item but the reconstruction does; the terminal snapshot still wins whenever it already carries the message. ([#3948](https://github.com/diegosouzapw/OmniRoute/issues/3948)) - **fix(executors): preserve tool-name casing on native Claude OAuth (`read` no longer leaks back as `Read`)** — native Claude OAuth traffic runs through an anti-fingerprint tool-name cloak that renames a tool literally named `read` to `Read` on the wire and records the reverse alias on a non-enumerable `_toolNameMap`, which the response side uses to restore the client's original casing. Since v3.8.27 the executor returned a JSON-round-tripped copy of the body as `transformedBody`, and that round-trip dropped the non-enumerable map — so the restore saw an empty map and the cloaked `Read` streamed verbatim to the client, corrupting the tool name. The executor now re-attaches the cloak map onto the serialized body (mirroring the Antigravity executor), so tool-name casing round-trips correctly. ([#4307](https://github.com/diegosouzapw/OmniRoute/issues/4307) — thanks @dev-cj) - **fix(api): cache-HIT `X-OmniRoute-Response-Cost` now reports the incremental cost (≈0), not the original** — on a semantic-cache HIT the gateway serves the stored response **without** an upstream call, but `X-OmniRoute-Response-Cost` was reporting the original call's full cost (recomputed from the cached `usage`). A consumer summing `response-cost` for billing was therefore charging for responses that cost ≈$0 to serve (and stale entries could inflate it). Cache hits now bill `X-OmniRoute-Response-Cost: 0.0000000000` (the real incremental cost), and the avoided cost is surfaced in a new **`X-OmniRoute-Cost-Saved`** header for cache analytics — mirroring the existing `tokens_saved` concept. The MISS path is unchanged. (PRD-2026-06-19-cache-hit-cost-reporting) - **fix(models): imported vision-capable models keep their vision capability** — after importing a provider key, vision-capable models (e.g. OpenRouter models whose `architecture` declares image input, and other synced providers) were listed as text-only in `/v1/models` and the dashboard — even though image requests actually worked. Synced model records never captured the vision flag, and the catalog's OpenRouter live-enrichment (which derives vision from `architecture.input_modalities`) is skipped once a provider has synced models. Discovery now captures `supportsVision` at sync time (from `architecture.input_modalities`, the string `architecture.modality`, or a top-level `input_modalities`), mirroring the existing `supportsThinking` capture, and the catalog surfaces `capabilities.vision` for synced models. ([#4264](https://github.com/diegosouzapw/OmniRoute/issues/4264) — thanks @FerLuisxd) - **fix(providers): Cloudflare Workers AI model discovery shows model names, not UUIDs** — importing a Cloudflare Workers AI key listed models with internal UUID identifiers (e.g. `429b9e8b-d99e-…`) instead of their usable slugs (`@cf/meta/llama-3.1-8b-instruct`). Cloudflare's `/ai/models/search` returns `{ id: "", name: "@cf/…" }`, and discovery was passing the raw objects through — so the UUID `id` became the callable model id. The `cloudflare-ai` discovery now maps each result's `name` → id, surfacing the real `@cf/…` model ids. ([#4259](https://github.com/diegosouzapw/OmniRoute/issues/4259) — thanks @FerLuisxd) - **fix(translator): clamp Responses API `call_id` to 64 characters** — the OpenAI Responses API rejects `call_id` values longer than 64 characters with a 400. Long upstream tool-call ids (some clients emit ids well over the limit) are now clamped deterministically on both the `function_call` item and its matching `function_call_output`, so the pair stays matched through the orphaned-output filter and the request is accepted. (thanks @anuragg-saxenaa, @ngapngap) - **fix(oauth): GitHub Copilot token refresh now sends the public client_id** — the `github` provider config never carried a `clientId`, so GitHub OAuth `refresh_token` exchanges either omitted `client_id` or sent the literal string `undefined` (and a bogus `client_secret=undefined`), which GitHub rejects — leaving a Copilot connection stuck once its short-lived token expired and the long-lived refresh path was needed. The provider now resolves its public device-flow `client_id` from the embedded public credential and omits `client_secret` entirely (GitHub's Copilot app is a public client with no secret). (thanks @baslr) - **fix(translator): a tool property named `pattern` survives Gemini/Antigravity schema sanitization** — the Gemini schema sanitizer strips JSON-Schema constraint keywords Gemini rejects (`pattern`, `minLength`, …) at every nesting level, but it also deleted any tool **property** literally _named_ one of those keywords. glob/grep tools declare a property called `pattern`, so on `ag/*` (Antigravity) backends that argument (and its `required` entry) was silently dropped, breaking the tools. Keyword stripping is now position-aware: it only removes constraint keywords at the schema-node level and never against the user-defined names inside a `properties` map. A genuine string-level `pattern` _constraint_ is still stripped. (thanks @youthanh) - **fix(translator): MCP `namespace` tools flatten to individual functions on the Responses→Chat path** — when a Codex CLI client routes a Responses-API request to a non-Codex backend (e.g. `kr/claude-opus-4.7`), each MCP server is declared as a `namespace` tool (`{ type:"namespace", name, tools:[…] }`). The Responses→Chat translator had no `namespace` branch, so the whole group collapsed into a single empty-schema function named `mcp____` and every MCP call returned `unsupported call: mcp____`, breaking all MCP-based workflows (context7, codegraph, custom MCPs) for that combination. The translator now expands a namespace into one Chat function per sub-tool (preserving each sub-tool's name and parameters); an empty namespace yields no tools instead of a broken placeholder. The native Codex passthrough path was already correct. (thanks @V13t4nh) - **fix(cli): the active remote-context credential wins over an ambient `OMNIROUTE_API_KEY`** — when a remote context is selected, its scoped access token now takes precedence over an `OMNIROUTE_API_KEY` present in the environment, so the connected remote is targeted as expected. ([#4364](https://github.com/diegosouzapw/OmniRoute/pull/4364)) - **fix(cli): wire the `contexts` command into the CLI program** — the `omniroute contexts` command (list/switch saved remote contexts) was implemented but never registered, so it was unreachable; it is now wired into the CLI program. ([#4369](https://github.com/diegosouzapw/OmniRoute/pull/4369)) - **fix(mitm): mask bare `Bearer ` header values in the Traffic Inspector** — the inspector now redacts bare `Authorization: Bearer …` values so tokens don't leak into captured traffic. ([#4358](https://github.com/diegosouzapw/OmniRoute/pull/4358)) - **fix(pricing): price the `gpt-5.x-pro` OpenAI models + align the opencode-go discovery test** — adds pricing for the gpt-5.x-pro models so cost telemetry reports a real cost instead of zero. ([#4355](https://github.com/diegosouzapw/OmniRoute/pull/4355)) - **fix(sse): release the reader and cancel the stream on abort/error (no more Undici pool socket leak)** — on abort or a mid-stream error the response reader is released and the stream cancelled, preventing leaked pooled sockets that degraded later requests. ([#4309](https://github.com/diegosouzapw/OmniRoute/pull/4309) — thanks @Ardem2025) - **fix(kiro): emit an early role-only start chunk to release the stream-readiness gate** — Kiro streams now send an initial role-only chunk so the stream-readiness gate releases promptly instead of stalling. ([#4311](https://github.com/diegosouzapw/OmniRoute/pull/4311) — thanks @artickc) - **fix(dashboard): the proxy modal stops pre-filling new scopes with an unrelated proxy** — adding a new scope assignment no longer inherits a previously-selected proxy's configuration. ([#4312](https://github.com/diegosouzapw/OmniRoute/pull/4312)) - **fix(open-sse): inner-ai stops silently rerouting unmatched models to `models[0]`** — an unmatched model id is no longer silently served by the first available model; the lookup now returns null and the request is handled explicitly. ([#4310](https://github.com/diegosouzapw/OmniRoute/pull/4310)) - **fix(pollinations): handle auth-required premium models (claude, gemini, midjourney)** — premium Pollinations models that require authentication are now handled correctly instead of failing. ([#4266](https://github.com/diegosouzapw/OmniRoute/pull/4266) — thanks @oyi77) - **fix(codex): isolate the Spark quota scope** — Codex Spark usage is tracked under its own quota scope so it no longer bleeds into other Codex quotas. ([#4293](https://github.com/diegosouzapw/OmniRoute/pull/4293) — thanks @xz-dev) - **fix(dashboard): improve the API "try it" functionality** — fixes the request path used by the dashboard's API "try it" panel. ([#4296](https://github.com/diegosouzapw/OmniRoute/pull/4296) — thanks @edrickrenan) - **fix: polyfill `crypto.randomUUID` for non-secure contexts** — restores UUID generation when the dashboard is served over a non-secure (plain-HTTP) origin where `crypto.randomUUID` is unavailable. ([#4287](https://github.com/diegosouzapw/OmniRoute/pull/4287) — thanks @pizzav-xyz) - **fix(proxy): allow concurrent proxy dispatcher streams** — the proxy dispatcher no longer serializes streams, so concurrent requests through a proxied connection run in parallel. ([#4288](https://github.com/diegosouzapw/OmniRoute/pull/4288) — thanks @wilsonicdev) - **fix(build): co-locate llmlingua SLM optionals into `dist/node_modules` (postinstall)** — the optional llmlingua SLM packages are co-located into the standalone build so the compression worker can actually spawn in production. ([#4286](https://github.com/diegosouzapw/OmniRoute/pull/4286)) - **fix(mitm): surface AgentBridge traffic in the Traffic Inspector (D4 ingest)** — AgentBridge requests now appear in the Traffic Inspector. ([#4285](https://github.com/diegosouzapw/OmniRoute/pull/4285)) - **fix(sse): surface undici `err.cause` on dispatcher failure** — dispatcher failures now flatten the cause chain (and `AggregateError`s) into the error detail for diagnosability. ([#4281](https://github.com/diegosouzapw/OmniRoute/pull/4281)) - **fix(cli): harden `launch`/`launch-codex` with free-claude-code patterns** — the launchers adopt the hardened launch patterns ported from free-claude-code. ([#4278](https://github.com/diegosouzapw/OmniRoute/pull/4278)) - **fix(compression): end-to-end audit — fixes across the whole compression flow** — a sweep of the compression pipeline fixing ultra/aggressive/lossless edge cases, accessibility-anchor handling, language detection, and mode decoupling. ([#4323](https://github.com/diegosouzapw/OmniRoute/pull/4323)) ### 🧪 Tests - **test: align two tests left red by merged PRs** — re-aligns the db-rules classification count (#4335) and the LMArena split-cookie metadata test (#4271) after concurrent merges. ([#4346](https://github.com/diegosouzapw/OmniRoute/pull/4346)) - **test(ci): reconcile the release/v3.8.30 baseline + test drift** — reconciles quality baselines and drifted tests accumulated on the release branch. ([#4276](https://github.com/diegosouzapw/OmniRoute/pull/4276)) ### 📝 Maintenance - **refactor(combo): `ComboContext` + extract `phaseComboSetup` (god-file split, phase 1)** — begins decomposing the combo god-file by extracting combo setup into a context object, without touching dispatch/semaphore logic. ([#4326](https://github.com/diegosouzapw/OmniRoute/pull/4326)) - **feat(quality): cap test-file size — anti-reinflation Layer 1** — freezes the existing god-tests and caps new test files at 800 lines to stop re-inflation. ([#4273](https://github.com/diegosouzapw/OmniRoute/pull/4273)) - **feat(quality): seed per-module mutationScore floors + a blocking aggregation ratchet (T3)** — adds per-module mutation-score floors with a blocking aggregate gate. ([#4305](https://github.com/diegosouzapw/OmniRoute/pull/4305)) - **feat(quality): make the a11y gate real (`@axe-core/playwright` in nightly)** — wires the previously-phantom accessibility gate into the nightly run with real baselines. ([#4321](https://github.com/diegosouzapw/OmniRoute/pull/4321)) - **feat(quality): unblock R1 — test-redundancy measurement via `disableBail`** — enables the test-redundancy measurement that was previously blocked by fail-fast. ([#4322](https://github.com/diegosouzapw/OmniRoute/pull/4322)) - **fix(quality): the complexity gate now covers `bin/` + `electron/`, and tracked-artifacts runs in pre-commit** — extends the complexity gate's scope and moves the tracked-artifacts check into the pre-commit hook. ([#4318](https://github.com/diegosouzapw/OmniRoute/pull/4318)) - **fix(quality): restore release/v3.8.30 green — 3 latent reds from concurrent merges** — fixes three latent test reds surfaced by concurrent merges into the release branch. ([#4335](https://github.com/diegosouzapw/OmniRoute/pull/4335)) - **fix(combo): keep `phaseComboSetup` under the complexity ceiling** — extracts a helper so the new combo setup phase stays under the complexity gate. ([#4338](https://github.com/diegosouzapw/OmniRoute/pull/4338)) - **ci(mutation): split over-budget batches by range/pair so every batch fits the job cap** — re-splits the mutation batches so each fits the CI job budget. ([#4272](https://github.com/diegosouzapw/OmniRoute/pull/4272)) - **chore(ci): align the electron audit gate to the root advisory policy** — the electron-workspace audit gate now follows the same advisory policy as the root. ([#4275](https://github.com/diegosouzapw/OmniRoute/pull/4275)) - **chore(quality): reconcile the complexity/quality baselines across concurrent-merge drift** — rolls up the cycle's baseline reconciliations driven by concurrent merges into the release branch. ([#4330](https://github.com/diegosouzapw/OmniRoute/pull/4330), [#4336](https://github.com/diegosouzapw/OmniRoute/pull/4336), [#4370](https://github.com/diegosouzapw/OmniRoute/pull/4370)) - **docs: ban AI-generation footers in commits/PRs/CHANGELOG (Hard Rule #16)** — codifies the prohibition on AI-generation footers and bot co-author trailers. ([#4328](https://github.com/diegosouzapw/OmniRoute/pull/4328)) - **docs(design): add the OmniRoute design system and visual identity specification** — adds the design-system / visual-identity specification document. (thanks @diegosouzapw) ### 🔒 Security - **fix(sse): harden the DuckDuckGo lite scraper sanitization (CodeQL)** — closes four HIGH CodeQL alerts in the no-key web-search scraper: `decodeEntities` now resolves `&` **last** so an already-escaped entity (e.g. `&lt;`) survives as literal text instead of being double-unescaped (`js/double-escaping`); `stripTags` decodes entities first, then strips tags in a loop to a fixpoint and drops any trailing unclosed `<…`, so entity-encoded markup like `<script>` can never reach the LLM/client as a live tag (`js/incomplete-multi-character-sanitization`); and the host checks in the search tests use `new URL().hostname` equality instead of substring `.includes` (`js/incomplete-url-substring-sanitization`). ([#4356](https://github.com/diegosouzapw/OmniRoute/pull/4356)) ### 🔧 Dependencies - **fix(deps): bump undici to 7.28.0 and dompurify to 3.4.11 (security)** — addresses the undici SOCKS5-TLS / cache advisories and the dompurify advisory. ([#4306](https://github.com/diegosouzapw/OmniRoute/pull/4306)) - **chore(deps): bump actions/checkout from 4 to 7** — CI checkout-action update. ([#4297](https://github.com/diegosouzapw/OmniRoute/pull/4297)) - **fix(executors): strip `stream_options` for qwen non-streaming / thinking-mode Claude Code requests** — Claude-Code-compatible providers force the executor-level `stream` flag on via `upstreamStream = stream || isClaudeCodeCompatible` (`open-sse/handlers/chatCore.ts`), but the outgoing body keeps the caller's original `stream: false`. The shared `stream && targetFormat === "openai"` branch in `DefaultExecutor.transformRequest` then injected `stream_options: { include_usage: true }` onto a body that still said `stream: false`, and qwen upstream rejected it with `400 "'stream_options' only set this when you set stream: true"`. Same rejection when the body carries `thinking` / `enable_thinking`. The qwen branch now skips the injection (and strips any client-sent `stream_options`) when the body explicitly says `stream: false` or requests thinking, leaving regular qwen streaming requests with the usage injection intact. (thanks @anuragg-saxenaa) --- ## [3.8.29] — 2026-06-19 ### ✨ New Features - **feat(cloud-agent): Cursor Cloud Agent via the official API-key REST API (no IDE-OAuth ban risk)** — adds a `cursor-cloud` cloud agent that drives Cursor's Background / Cloud Agents through the official REST API (`api.cursor.com`) authenticated with a user or service-account API key — the safer, first-party alternative to re-using the Cursor IDE's OAuth session (the existing `cursor` provider, which carries a ban-risk warning). Implemented as a plain REST adapter mirroring the Devin/Jules agents (`createTask`/`getStatus`/`sendMessage`/`listSources`), so it does **not** pull in the `@cursor/sdk` package and its per-platform native binaries (Cursor's SDK is itself a thin wrapper over this REST API). Cursor's UPPERCASE status enums (`CREATING`/`RUNNING`/`FINISHED`/`ERROR`) are mapped explicitly to the shared `CloudAgentStatus`, and `baseUrl` is overridable per-credential. Credentials are stored encrypted via the existing `cloud_agent_credentials` table; no schema change. ([#4227](https://github.com/diegosouzapw/OmniRoute/issues/4227) — thanks @MRDGH2821) - **feat(routing): OpenRouter-style `auto/:` combos** — auto-routing now understands suffixed combos that separate the _category_ (what kind of route) from the _tier_ (how to optimize): `auto/coding:fast`, `auto/coding:cheap` (alias `:floor`), `auto/coding:free`, `auto/coding:pro`, `auto/coding:reliable`, plus the new category roots `auto/reasoning`, `auto/vision`, `auto/multimodal`. The **tier** picks the scoring weights — `:fast` → ship-fast, `:cheap`/`:floor` → cost-saver, `:reliable` → a new reliability-first pack (circuit-breaker health + latency stability) — while `:free`/`:pro` filter the candidate pool by model tier (`classifyTier`: free-tier vs. premium models). The **category** filters the pool by capability (`vision`/`multimodal` → vision-capable models, `reasoning` → reasoning/thinking models). Any valid `auto/:` resolves on demand; a curated set is advertised in `/v1/models` and the dashboard. Filtering is fail-open — if a constraint matches no connected models the full pool is used so routing never breaks. All composition lives in the new `open-sse/services/autoCombo/suffixComposition.ts`; the core combo scorer (`combo.ts`) is untouched. Second slice of #4235 (premium account-tier weighting is a later follow-up). ([#4235](https://github.com/diegosouzapw/OmniRoute/issues/4235) — thanks @MRDGH2821) - **feat(routing): advertise the `auto/cheap`, `auto/offline`, `auto/smart` combos (catalog ↔ README sync)** — the README lists `auto/cheap` (cheapest-per-token first), `auto/offline` (most quota/rate-limit headroom first) and `auto/smart` (quality-first + 10% exploration), and they already resolved at request time via `parseAutoPrefix` → `createVirtualAutoCombo`. But they were missing from `AUTO_TEMPLATE_VARIANTS`, so `/v1/models` and the dashboard combos list (which iterate that catalog) never showed them — the catalog drifted from the docs (visible in the issue's screenshots). Added the three entries so they're advertised everywhere alongside the other built-in `auto/*` combos. First slice of #4235 (OpenRouter-style `auto/:` suffixes + new categories follow). ([#4235](https://github.com/diegosouzapw/OmniRoute/issues/4235) — thanks @MRDGH2821) - **feat(cli): remote mode — drive a remote OmniRoute with scoped access tokens** — a new CLI mode that connects to a remote OmniRoute instance using scoped access tokens, so a local CLI can drive a server you don't own a session on. ([#4256](https://github.com/diegosouzapw/OmniRoute/pull/4256)) - **feat(api): cost-telemetry parity — `X-OmniRoute-*` headers on every endpoint + a non-token cost engine** — every endpoint now emits the `X-OmniRoute-*` cost/usage headers, backed by a cost engine that also prices non-token (media/request-based) usage. ([#4247](https://github.com/diegosouzapw/OmniRoute/pull/4247)) - **feat(api): register Kimi K2.7 Code models (`kimi-k2.7-code` + `-highspeed`)** — the new Moonshot thinking-only coding models are registered (fixed sampling; `temperature`/`top_p` marked unsupported). ([#4183](https://github.com/diegosouzapw/OmniRoute/pull/4183)) - **feat(catalog): add `kimi-k2.7-code` to the kmca catalog + qwen-web models discovery** — surfaces the new Kimi coding model in the kmca catalog and wires qwen-web into model discovery. ([#4185](https://github.com/diegosouzapw/OmniRoute/pull/4185)) - **feat(api): expand the `zai` provider catalog with GLM-5.2 / GLM-4.7** — adds the real GLM-5.2, GLM-4.7 and GLM-4.7-flash model ids to the Anthropic-direct `zai` provider. ([#4201](https://github.com/diegosouzapw/OmniRoute/pull/4201)) - **feat(api): no-thinking gateway model IDs (FCC port, Fase 8.1)** — gateway model id variants that force thinking off, ported from free-claude-code. ([#4145](https://github.com/diegosouzapw/OmniRoute/pull/4145)) - **feat(sse): mid-stream continuation for truncated streams (FCC port, Task 4.4)** — when a stream is cut short, OmniRoute can transparently continue it, ported from free-claude-code. ([#4147](https://github.com/diegosouzapw/OmniRoute/pull/4147)) - **feat(sse): per-provider sliding-window rate-limit fallback (FCC port, Fase 8.2)** — a per-provider sliding-window rate limiter as a fallback path, ported from free-claude-code. ([#4146](https://github.com/diegosouzapw/OmniRoute/pull/4146)) - **feat(sse): transparent stream recovery (FCC port, Fase 4, opt-in)** — opt-in transparent recovery of interrupted upstream streams, ported from free-claude-code. ([#4131](https://github.com/diegosouzapw/OmniRoute/pull/4131)) - **feat(search): free DuckDuckGo web search as a last-resort provider (FCC port, Fase 6)** — adds a no-key DuckDuckGo web-search provider used as a last resort, ported from free-claude-code. ([#4136](https://github.com/diegosouzapw/OmniRoute/pull/4136)) - **feat(logging): credential-redaction safety net in the pino logger (FCC port, Fase 8.3)** — a logger-level redaction pass that scrubs credentials from log output, ported from free-claude-code. ([#4140](https://github.com/diegosouzapw/OmniRoute/pull/4140)) - **feat(memory): opt-in Qdrant scalar int8 quantization (F4.4 Q1)** — opt-in int8 scalar quantization for Qdrant-backed memory vectors. ([#4187](https://github.com/diegosouzapw/OmniRoute/pull/4187)) - **feat(memory): opt-in sqlite-vec int8 vector quantization (F4.4 Q2)** — opt-in int8 quantization for the sqlite-vec memory backend. ([#4190](https://github.com/diegosouzapw/OmniRoute/pull/4190)) - **feat(deploy): keep optional deps on `update` (`--include=optional`)** — the in-place update path now passes `--include=optional` so native/optional packages aren't dropped on update. ([#4260](https://github.com/diegosouzapw/OmniRoute/pull/4260)) - **feat(dashboard): unified visual identity — grid, primitives, tables, form controls (design phases 1-4)** — a sweeping design pass aligning the dashboard with the site: grid wallpaper, button/card/input primitives, theme-aware tables and form controls. ([#4122](https://github.com/diegosouzapw/OmniRoute/pull/4122)) - **feat(dashboard): grid wallpaper on all standalone screens + fluid 4K layout** — the identity grid now backs every standalone screen and the layout scales fluidly to 4K. ([#4158](https://github.com/diegosouzapw/OmniRoute/pull/4158)) - **feat(dashboard): make the identity grid visible + unify the focus ring on accent** — design follow-up making the grid actually visible and standardizing focus rings on the accent color. ([#4141](https://github.com/diegosouzapw/OmniRoute/pull/4141)) - **feat(dashboard): import only free models + free-model list controls** — the model-import page can import just the free models, with controls to manage the free-model list. ([#4176](https://github.com/diegosouzapw/OmniRoute/pull/4176) — thanks @felipesartori) - **feat(dashboard): compact grid layout for no-auth provider accounts** — a denser grid layout for provider accounts when auth is disabled. ([#4137](https://github.com/diegosouzapw/OmniRoute/pull/4137) — thanks @felipesartori) - **feat(dashboard): derive media `serviceKinds` from the registries (surface MiniMax + the media catalog)** — `/media-providers/[kind]` now derives its service kinds from the registries instead of a hand-maintained list, surfacing ~48 previously-invisible media providers (incl. MiniMax TTS/video/music). ([#4212](https://github.com/diegosouzapw/OmniRoute/pull/4212)) - **feat(traffic-inspector): live (in-flight) request filter (Gap 5)** — the Traffic Inspector can filter to in-flight requests as they happen. ([#4130](https://github.com/diegosouzapw/OmniRoute/pull/4130)) - **feat(agent-bridge): maintenance & diagnostics dashboard controls** — adds maintenance and diagnostics controls for the Agent Bridge to the dashboard. ([#4127](https://github.com/diegosouzapw/OmniRoute/pull/4127)) - **feat(mitm): TPROXY IP_TRANSPARENT native addon + conditional loader (Epic A)** — a native `IP_TRANSPARENT` addon with a conditional loader, the foundation for TPROXY capture. ([#4148](https://github.com/diegosouzapw/OmniRoute/pull/4148)) - **feat(mitm): Fase 3 Epic A spike — TPROXY command builder** — a transactional builder for the iptables/TPROXY command set. ([#4139](https://github.com/diegosouzapw/OmniRoute/pull/4139)) - **feat(mitm): TPROXY setup layer — transactional apply/revert (Epic A)** — applies and reverts the TPROXY routing setup transactionally. ([#4144](https://github.com/diegosouzapw/OmniRoute/pull/4144)) - **feat(mitm): add `setSocketMark` to the TPROXY addon (anti-loop primitive)** — exposes `setSocketMark` so OmniRoute's own egress can be marked and skipped (anti-loop). ([#4160](https://github.com/diegosouzapw/OmniRoute/pull/4160)) - **feat(mitm): TPROXY capture-mode listener + `connectMarked` (Epic A)** — the capture-mode listener plus a marked-connect primitive. ([#4169](https://github.com/diegosouzapw/OmniRoute/pull/4169)) - **feat(mitm): dynamic per-SNI cert authority for TPROXY (TLS decrypt 1/N)** — a per-SNI on-the-fly certificate authority, the first slice of TLS decrypt. ([#4173](https://github.com/diegosouzapw/OmniRoute/pull/4173)) - **feat(mitm): TLS-terminating capture for TPROXY (decrypt 2/N)** — terminates TLS to capture decrypted traffic. ([#4179](https://github.com/diegosouzapw/OmniRoute/pull/4179)) - **feat(mitm): wire the TLS decrypt engine into TPROXY capture mode (decrypt 3/N)** — connects the decrypt engine to the capture-mode pipeline. ([#4200](https://github.com/diegosouzapw/OmniRoute/pull/4200)) - **feat(mitm): TPROXY capture-mode manager (decrypt 4a/N)** — a manager coordinating the TPROXY capture lifecycle. ([#4208](https://github.com/diegosouzapw/OmniRoute/pull/4208)) - **feat(mitm): local-only route + trust-store installer for TPROXY decrypt (4b/N)** — a loopback-only management route plus a CA trust-store installer for the decrypt CA. ([#4211](https://github.com/diegosouzapw/OmniRoute/pull/4211)) - **feat(dashboard): TPROXY decrypt capture toggle in the Traffic Inspector (4c/N)** — a UI toggle to enable/disable decrypted capture. ([#4216](https://github.com/diegosouzapw/OmniRoute/pull/4216)) - **feat(compression): replace the headroom tabular encoder with a vendored GCF** — swaps the tabular encoder for a vendored GCF implementation. ([#4167](https://github.com/diegosouzapw/OmniRoute/pull/4167) — thanks @blackwell-systems) - **feat(compression): live per-engine streaming via `compression.step` (F3.3)** — streams per-engine compression progress through a `compression.step` event. ([#4217](https://github.com/diegosouzapw/OmniRoute/pull/4217)) - **feat(compression): show an engine node for single-engine runs in the studio** — the Compression Studio now renders an engine node even when only one engine runs. ([#4210](https://github.com/diegosouzapw/OmniRoute/pull/4210)) - **feat(compression): expose the WaterfallInspector via a Canvas/Waterfall toggle** — adds a Canvas/Waterfall view toggle that surfaces the WaterfallInspector. ([#4238](https://github.com/diegosouzapw/OmniRoute/pull/4238)) - **feat(compression): make `mcpAccessibility` config reachable via a settings sub-route** — exposes the `mcpAccessibility` config under a dedicated settings sub-route. ([#4237](https://github.com/diegosouzapw/OmniRoute/pull/4237)) - **feat(compression): runnable A/B benchmark CLI (F2.4)** — a CLI to run A/B compression benchmarks. ([#4220](https://github.com/diegosouzapw/OmniRoute/pull/4220)) - **feat(compression): add a transcript loader to the replay harness** — the replay harness can now load real transcripts. ([#4246](https://github.com/diegosouzapw/OmniRoute/pull/4246)) - **feat(compression): wire MCP tool-cardinality reduction (F4.3, opt-in)** — opt-in reduction of MCP tool-set cardinality to shrink prompts. ([#4221](https://github.com/diegosouzapw/OmniRoute/pull/4221)) - **feat(compression): wire RTK comment-stripping config + honor `preserveDocstrings`** — RTK comment-stripping is now configurable and honors a `preserveDocstrings` flag. ([#4242](https://github.com/diegosouzapw/OmniRoute/pull/4242)) - **feat(compression): honor the per-filter RTK `deduplicate` flag** — RTK filters now respect a per-filter `deduplicate` flag. ([#4231](https://github.com/diegosouzapw/OmniRoute/pull/4231)) - **feat(compression): honor the registry `enabled` flag in the stacked loop** — the stacked compression loop now skips engines disabled in the registry. ([#4244](https://github.com/diegosouzapw/OmniRoute/pull/4244)) - **feat(compression): persist RTK grouping config (unlock R5 `enableGrouping`)** — persists the RTK grouping configuration, unlocking the R5 `enableGrouping` rule. ([#4207](https://github.com/diegosouzapw/OmniRoute/pull/4207)) - **feat(compression): wire ultra's `modelPath`/`slmFallbackToAggressive` to the LLMLingua SLM tier** — connects the ultra tier's small-language-model knobs to the LLMLingua SLM path. ([#4257](https://github.com/diegosouzapw/OmniRoute/pull/4257)) - **feat(quality): Onda 2 mutation-gate tooling — radiography classifier (T1) + `mutationScore` ratchet (T3)** — new mutation-testing tooling: a survivor-radiography classifier and a `mutationScore` ratchet. ([#4234](https://github.com/diegosouzapw/OmniRoute/pull/4234)) - **feat(ci): wire the F2.4 compression budget-gate ratchet** — adds a CI ratchet that gates compression budget regressions. ([#4232](https://github.com/diegosouzapw/OmniRoute/pull/4232)) ### 🐛 Fixed - **fix(providers): qwen-web model discovery now lists the live catalog instead of nothing** — the `qwen-web` cookie provider had no entry in `PROVIDER_MODELS_CONFIG`, so its model-discovery page returned an empty/stale local catalog (the OAuth fallback at the top of the route only fires for `provider === "qwen"`, leaving `qwen-web` to fall through to the no-config branch). Added a `qwen-web` entry that fetches the **public** `https://chat.qwen.ai/api/v2/models` endpoint (no auth header) and parses the `{ data: { data: [{ id, name, owned_by }] } }` shape (with a flatter `{ data: [] }` fallback). This is Problem #3 of #3931 (diagnosed by @thezukiru); Problem #1 — validator bare-token false-positive — shipped earlier in #3958, and Problem #2 — empty stream from Qwen WAF bot-detection on the streaming endpoint — remains a separate upstream/stealth concern. ([#3931](https://github.com/diegosouzapw/OmniRoute/issues/3931) — thanks @thezukiru) - **fix(providers): ZenMux model discovery now lists the live catalog (incl. the free models) instead of the stale 9-entry hardcoded list** — adding a ZenMux key validated fine, but the connection then showed `API unavailable — using local catalog` and was missing the free models ZenMux advertises (`z-ai/glm-5.2-free`, `moonshotai/kimi-k2.7-code-free`). Root cause: `zenmux` carries a correct `modelsUrl` in the registry, but — like `llm7`/`byteplus` before #3976 — it was not classified by any live-fetch branch of the model-import route (not `openai-compatible-*`, not self-hosted, not in `NAMED_OPENAI_STYLE_PROVIDERS`), so the route never probed the upstream `/models` and fell through to the registry's hardcoded `models[]`. Added `zenmux` to `NAMED_OPENAI_STYLE_PROVIDERS`, so the route probes `https://zenmux.ai/api/v1/models` (the `/chat/completions`-stripped `/models` candidate) and serves the live list, falling back to the local catalog only when the upstream fetch fails — import never breaks. ([#4202](https://github.com/diegosouzapw/OmniRoute/issues/4202) — thanks @mikmaneggahommie) - **fix(providers): Vercel AI Gateway "import models" now loads the live catalog instead of nothing** — adding a Vercel AI Gateway key worked, but clicking **import** on the models page loaded nothing usable (manually adding the same models worked). Same class as #4202 (zenmux) / #3976 (llm7/byteplus): `vercel-ai-gateway` carries a real `baseUrl` (`https://ai-gateway.vercel.sh/v1/chat/completions`, format `openai`) in the registry, but was not classified by any live-fetch branch of the model-import route (not `openai-compatible-*`, not self-hosted, not in `NAMED_OPENAI_STYLE_PROVIDERS`), so the route never probed the upstream `/models` and fell through to the registry's tiny 5-entry hardcoded `models[]`. Added `vercel-ai-gateway` to `NAMED_OPENAI_STYLE_PROVIDERS`, so the route probes `https://ai-gateway.vercel.sh/v1/models` (the `/chat/completions`-stripped `/models` candidate) and serves the live list, falling back to the local catalog only when the upstream fetch fails — import never breaks. ([#4249](https://github.com/diegosouzapw/OmniRoute/issues/4249) — thanks @FerLuisxd) - **fix(sse): clear error when the request queue drops a job (no more fake-upstream "This job timed out after Nms")** — under concurrent load, requests that exceed the per-connection rate-limit queue budget (`resilienceSettings.requestQueue.maxWaitMs`) were dropped by Bottleneck with its raw `This job timed out after ms.` message. That string is indistinguishable from an upstream gateway timeout, so the 502 body and call-log `last_error` looked like a provider outage across unrelated providers (TI:0\|TO:0) — an operator spent ~3h misdiagnosing local queue saturation as upstream failures. `withRateLimit` now rewrites that specific Bottleneck error into a clear, OmniRoute-owned message that names the knob (`requestQueue.maxWaitMs`, tunable in Settings → Resilience), explicitly disclaims an upstream timeout, preserves the original as `cause`, and tags `code: "RATE_LIMIT_QUEUE_TIMEOUT"`. Behavior is unchanged — the job is still dropped so combo falls back to the next target. ([#4165](https://github.com/diegosouzapw/OmniRoute/issues/4165) — thanks @KooshaPari) - **fix(api): advertise the built-in `auto/*` combos in `/v1/models`** — OmniRoute ships a zero-setup `auto/*` catalog (`auto/best-coding`, `auto/pro-reasoning`, …, 16 variants) that the dashboard advertises and that resolve on demand, but the `/v1/models` listing only emitted persisted DB combos + provider models. Clients that build their model picker from `/v1/models` (e.g. Hermes Agent) never saw any `auto/*` option. The catalog now emits every `AUTO_TEMPLATE_VARIANTS` id (as `owned_by: "combo"`) at the top of the list, deduped against persisted combos. (Showing each `auto/*`'s dynamically-selected members is a separate enhancement.) ([#4164](https://github.com/diegosouzapw/OmniRoute/issues/4164) — thanks @MRDGH2821) - **fix(sse): restore MCP / third-party tool names on the native Claude path (MCP dispatch broken in Claude Code)** — since 3.8.27, every MCP tool call routed through OmniRoute to a native Claude OAuth provider failed client-side with `Error: No such tool available: `: tool schemas arrived fine but the streamed `tool_use.name` reached Claude Code in its cloaked form (e.g. `McpN8nMcpSearchWorkflows` instead of the registered `mcp__n8n-mcp__search_workflows`). The native-Claude tool-name cloak stashes its per-request alias→original map as a **non-enumerable** `_toolNameMap` on the request body; the request-inspector capture added in 3.8.27 rebuilds the captured body from its serialized form (`JSON.parse(JSON.stringify(...))`), which drops non-enumerable properties, so `finalBody._toolNameMap` was empty and the response-side un-cloak silently fell back to the static built-in map — never restoring dynamic MCP / snake_case names. Built-in tools (Bash/Read/…) were unaffected (static map); cross-format paths were unaffected (they attach the map enumerably). The provider-request capture now re-attaches the per-request map (kept non-enumerable, so it still never re-serializes upstream) when the captured copy lost it, restoring MCP tool dispatch. ([#4091](https://github.com/diegosouzapw/OmniRoute/issues/4091) — thanks @pedrotecinf, @NakHalal) - **fix(dashboard): Logs auto-refresh self-heals in embedded/proxied hosts that pin or mis-fire visibility** — a follow-up to #4054: the Request Logger still froze auto-refresh on some hosts (reported on 3.8.28 Docker, works on 3.8.24). #4054 made the initial visibility fail-open, but the pause is event-driven — a host that fires a one-shot `visibilitychange` → hidden and then keeps reporting `"hidden"` (or recovers without firing the event again) left the cached visibility flag stuck `false`, so the interval ticked but never polled (only the manual Refresh button worked). The poll tick now also re-checks the **live** `document.visibilityState`, and a **window `focus`** listener re-arms polling (a focused window is a reliable signal the page is actively viewed). A genuinely backgrounded browser tab still pauses (it reports `"hidden"` and never receives focus), preserving the #3109 network-saturation optimization. ([#4133](https://github.com/diegosouzapw/OmniRoute/issues/4133) — thanks @tjengbudi) - **fix(capabilities): unify vision model-id detection into one shared source** — three code paths kept independent, drifting vision-model lists, so the same model id could get up to three different verdicts. Two concrete bugs: lite compression's gate was missing pixtral / llava / qwen-vl / glm-4v / kimi-vl / mistral-medium-3, so it **stripped images for those real vision models and blinded them** (same class as #4071 / #4012); and the `/v1/models` list was too broad, flagging text models (`gemma`, bare `kimi` like `kimi-k2`) as vision. All three (`modelCapabilities` routing fallback, `/v1/models` listing, lite image-strip gate) now delegate to a single conservative source `src/shared/constants/visionModels.ts`, which also restores `glm-4v` / `gemini-3` coverage and keeps the #3328 MiniMax M3 carve-out. ([#4072](https://github.com/diegosouzapw/OmniRoute/issues/4072) — thanks @diego-anselmo) - **fix(sse): surface mid-stream Gemini errors instead of returning a truncated 200** — when an upstream Gemini SSE stream emitted some partial content and then a JSON error object (`{"error":{"code":503,"message":"…high demand…","status":"UNAVAILABLE"}}`) instead of a `candidates` payload, OmniRoute silently dropped it: the gemini→openai translator's no-candidate branch only handled `promptFeedback` (content-filter) and returned `null` for anything else, so the stream simply ended and the client got HTTP 200 with a truncated body and `finish_reason: "stop"` — masking the failure and skipping combo fallback. `geminiToOpenAIResponse` now detects an `error` object (optionally wrapped in `response`), records it as `state.upstreamError` (preserving the real status — 503/`UNAVAILABLE`, or 429 for `RESOURCE_EXHAUSTED`), and lets `stream.ts` error the stream out through the existing `onFailure`/`buildErrorBody`/`controller.error` path — the same mechanism the openai-responses translator already uses. ([#4177](https://github.com/diegosouzapw/OmniRoute/issues/4177) — thanks @hartmark) - **fix(capabilities): resolve models.dev-synced vision metadata for Mistral `-latest` aliases** — root cause behind the #4071 heuristic: `getResolvedModelCapabilities("mistral/pixtral-12b-latest").supportsVision` resolved `null` (vision came only from the #4071 model-id heuristic, with `attachment` still `null`) even though models.dev exposes the model as multimodal. Confirmed against the live models.dev API: it catalogs Pixtral 12B under the **short** id `pixtral-12b` (with `attachment: true`, `modalities.input: ["text","image"]`), while requests use the Mistral API alias `pixtral-12b-latest`. The synced lookup tried the exact / raw / static-spec-canonical ids — all of which miss the short form — so it fell through to the heuristic. `getSyncedCapabilityForResolved` now adds a last-resort fallback that retries with a trailing `-latest` stripped, so synced metadata (`attachment` / image modalities) wins for these aliases; models whose `-latest` id is stored verbatim (e.g. `pixtral-large-latest`) keep resolving directly. Note: the models.dev sync is currently manual-only (Settings → models.dev) with no scheduled refresh, so a fresh instance still relies on the #4071 heuristic until that sync runs — a periodic-refresh cadence is left as a separate follow-up. ([#4073](https://github.com/diegosouzapw/OmniRoute/issues/4073) — thanks @diego-anselmo) - **fix(sse): map Xiaomi MiMo reasoning control to its native `thinking:{type}` shape** — MiMo (`api.xiaomimimo.com`) controls chain-of-thought **only** via top-level `thinking:{type:"enabled"|"disabled"}` and does not understand OpenAI's `reasoning_effort`/`reasoning`, while its request validator is strict (`400 Param Incorrect`). OmniRoute's OpenAI path carried reasoning intent as `reasoning_effort`, and the claude→openai translator can leave a Claude-shaped `thinking:{type, budget_tokens}` — so the client's on/off choice was silently dropped and `budget_tokens`/`reasoning_effort` rode along as extra params the validator can reject. New `open-sse/services/mimoThinking.ts::normalizeMimoThinking` (wired in `chatCore` for `provider==="xiaomi-mimo"`) reduces any thinking object to just `{type}` (`disabled` stays; `enabled`/`adaptive`/other → `enabled`) and drops `reasoning_effort`/`reasoning`. It deliberately does **not** synthesize thinking from a bare `reasoning_effort` — `mimo-v2-omni` is non-thinking, so that could turn a silently-ignored param into a hard error. ([#4224](https://github.com/diegosouzapw/OmniRoute/pull/4224)) - **fix(capabilities): Xiaomi MiMo `*-pro` chat models are text-only (no vision)** — only `mimo-v2.5` and `mimo-v2-omni` accept images per Xiaomi's docs; `mimo-v2.5-pro`/`mimo-v2-pro` are text-only, but `modelSpecs` marked them vision-capable and models.dev mislabels them ([hermes-agent#18884](https://github.com/NousResearch/hermes-agent/issues/18884)). Since `resolveVisionCapability` lets a synced `attachment:true` win first, an image request could be routed to a blind model (the #4071 failure mode). Corrected the specs **and** added a hard override in `resolveVisionCapability` (checked before the synced branch, anchored so `mimo-v2.5-pro` never matches the multimodal `mimo-v2.5`) that beats the wrong synced attachment. Also registered the missing native `mimo-v2-pro` chat model and the missing `mimo-v2-tts` speech model. ([#4224](https://github.com/diegosouzapw/OmniRoute/pull/4224)) - **fix(sse): Claude Opus 4.7+/Fable 5 use adaptive thinking only (no more manual-budget 400s)** — Opus 4.7 and later (Opus 4.7/4.8, Fable 5) removed manual extended thinking: `thinking.type:"enabled"` or **any** `thinking.budget_tokens` now returns `400` ("Any request that tries to set a fixed thinking budget gets a 400" — Anthropic migration guide). Reasoning is adaptive-only, steered by `output_config.effort`. OmniRoute's OpenAI→Claude translator mapped `reasoning_effort` low/medium/high to a manual `thinking:{type:"enabled", budget_tokens}`, so those requests hard-400'd on the most-used provider (and a Claude-native passthrough client sending the legacy shape did too). A new `adaptiveThinkingOnly` model flag now drives two fixes: the translator maps `reasoning_effort` of **every** level to `{type:"adaptive"}` + `output_config.effort` (preserving the requested level, never a budget) for these models, and a `normalizeClaudeAdaptiveThinking` catch-all at the existing post-translation thinking-normalization chokepoint collapses any residual manual thinking (passthrough legacy shape, per-model defaults) to `{type:"adaptive"}`, keyed on the resolved upstream model so it covers every routing mode. Pre-4.7 models (Opus 4.6/4.5, Sonnet, Haiku) keep manual budgets unchanged. ([#4230](https://github.com/diegosouzapw/OmniRoute/pull/4230)) - **fix(providers): strip non-default temperature/top_p/top_k for Claude Opus 4.7+/Fable 5 (fixed sampling → no 400)** — Opus 4.7 and later reject non-default `temperature`/`top_p`/`top_k` with a `400` (sampling is fixed; reasoning moved to `output_config.effort`). The translator forwarded client-supplied `temperature`/`top_p` unconditionally and the Claude registry models carried no `unsupportedParams`, so a plain OpenAI-format request with `temperature: 0.7` to `claude-opus-4-8` hard-400'd. Added `unsupportedParams: ["temperature","top_p","top_k"]` to the Opus 4.7+/Fable 5 ids in both the `claude` (dashed `claude-opus-4-8`) and `anthropic` (dotted `claude-opus-4.7`) registries, so they're stripped at the existing `getUnsupportedParams` dispatch chokepoint. Pre-4.7 Claude models still accept sampling params. ([#4230](https://github.com/diegosouzapw/OmniRoute/pull/4230)) - **fix(providers): conditionally strip temperature/top_p for GPT-5 reasoning on the `openai` Chat Completions path (no 400 when an effort is active)** — GPT-5 reasoning models reject non-default `temperature`/`top_p` with a `400` whenever a reasoning effort is active, yet accept them again under `reasoning_effort:"none"` (the GPT-5.1+ default, i.e. non-reasoning mode). On the `openai` provider only `o3` carried `REASONING_UNSUPPORTED`; `gpt-5.5`/`gpt-5.4`/`gpt-5.4-mini`/`gpt-5.4-nano` carried no sampling guard, so a `temperature` + active-effort request hard-400'd. A static `unsupportedParams` list can't express the `none`-mode carve-out (it would over-strip the legitimate case), so the new `gpt5SamplingGuard` drops `temperature`/`top_p` only when the resolved effort is active — wired at the existing `getUnsupportedParams` chokepoint and scoped to the `openai` Chat Completions surface (the `codex` Responses path is already covered by the CodexExecutor allowlist; other providers are untouched). ([#4245](https://github.com/diegosouzapw/OmniRoute/pull/4245)) - **fix(codex): stop silently dropping GPT-5 output verbosity (`verbosity` / `text.verbosity`)** — the GPT-5 series added an output-verbosity control: `verbosity` (low/medium/high) on Chat Completions, nested as `text.verbosity` on the Responses API. The CodexExecutor gates translated requests through an allowlist that had no `text` entry, so for the `codex` provider the hint was dropped before reaching upstream (the `openai` Chat path already forwarded it). `normalizeCodexVerbosity` now folds whichever shape arrived into a single validated `text:{verbosity}` before the allowlist (which now permits `text`), and the OpenAI Chat↔Responses request translators map `verbosity` across formats so the hint survives a format crossing for non-codex Responses backends too. Invalid/absent verbosity collapses to no `text` (status quo). ([#4245](https://github.com/diegosouzapw/OmniRoute/pull/4245)) - **fix(sse): map `reasoning_effort` to DeepSeek V4's native `{high, max}` vocabulary** — DeepSeek V4 only understands `high`/`max` reasoning levels, so other `reasoning_effort` values are mapped onto its native vocabulary instead of being rejected. ([#4219](https://github.com/diegosouzapw/OmniRoute/pull/4219)) - **fix(glm): default `max_tokens` and an extended timeout for GLM-5.2+ thinking** — GLM-5.2+ thinking responses are slow and need headroom, so OmniRoute now sets a sensible default `max_tokens` and a longer timeout for them. ([#4255](https://github.com/diegosouzapw/OmniRoute/pull/4255) — thanks @dhaern) - **fix(antigravity): default `includeThoughts` for modern Gemini models** — modern Gemini models on the Antigravity path now default to including thoughts so reasoning isn't silently dropped. ([#4180](https://github.com/diegosouzapw/OmniRoute/pull/4180) — thanks @dhaern) - **fix(provider-registry): add correct `contextLength` to theoldllm models** — fills in accurate context-window sizes for theoldllm's models. ([#4184](https://github.com/diegosouzapw/OmniRoute/pull/4184) — thanks @herjarsa) - **fix(models): expose combo model token limits** — `/v1/models` now reports token limits for combo models. ([#4189](https://github.com/diegosouzapw/OmniRoute/pull/4189) — thanks @megamen32) - **fix(combo): keep the passthrough quota fallback scoped** — prevents the passthrough quota fallback from leaking across unrelated targets. ([#4194](https://github.com/diegosouzapw/OmniRoute/pull/4194) — thanks @Svetznaniy33) - **fix(combo): opt proactive-fallback compression into the TV1 bail-out (no silent target drop)** — proactive-fallback compression now participates in the TV1 bail-out so a target is never silently dropped. ([#4228](https://github.com/diegosouzapw/OmniRoute/pull/4228)) - **fix(compression): show engine preview output** — the Compression Studio preview now renders the engine's output. ([#4128](https://github.com/diegosouzapw/OmniRoute/pull/4128) — thanks @megamen32) - **fix(compression): harden engines against I/O failures and misconfig (F5.3)** — compression engines degrade gracefully on I/O errors and bad configuration instead of throwing. ([#4198](https://github.com/diegosouzapw/OmniRoute/pull/4198)) - **fix(compression): harden RTK raw-output redaction + ReDoS guard for custom filters (F5.3)** — broadens RTK raw-output redaction and adds a ReDoS guard for user-supplied filter patterns. ([#4203](https://github.com/diegosouzapw/OmniRoute/pull/4203)) - **fix(compression): bound `mcpAccessibility` `maxTextChars` on the live read path** — the live read path now clamps `maxTextChars` so a small value can't make tools disappear. ([#4206](https://github.com/diegosouzapw/OmniRoute/pull/4206)) - **fix(dashboard): data tables paint an opaque surface so the grid doesn't bleed through** — data tables now render on an opaque surface, fixing the grid wallpaper showing through. ([#4233](https://github.com/diegosouzapw/OmniRoute/pull/4233)) - **fix(dashboard): make the provider card hover visible (was ~1% opacity)** — the provider-card hover state was effectively invisible; it now has a visible surface. ([#4214](https://github.com/diegosouzapw/OmniRoute/pull/4214)) - **fix(vscode): sanitize implicit editor context** — redacts sensitive filenames/keywords from the implicit VS Code editor context before it's sent upstream. ([#4124](https://github.com/diegosouzapw/OmniRoute/pull/4124) — thanks @zhiru) - **fix(build): raise the Node heap for the local `next build` to stop OOM/stall** — bumps the build-time heap so the local production build no longer OOMs or stalls. ([#4171](https://github.com/diegosouzapw/OmniRoute/pull/4171)) - **fix(mitm): TPROXY OUTPUT-based recipe for local traffic (validated e2e on VPS)** — switches the TPROXY rules to an OUTPUT-chain recipe so locally-originated traffic is captured; validated end-to-end on the VPS. ([#4156](https://github.com/diegosouzapw/OmniRoute/pull/4156)) - **fix(mitm): forward anti-loop — put the bypass-marked socket on the Agent (decrypt 4d)** — places the bypass-marked socket on the HTTP Agent so OmniRoute's own forwarded traffic never re-enters the capture loop; VPS-validated. ([#4229](https://github.com/diegosouzapw/OmniRoute/pull/4229)) - **fix(free-tiers): retire dead-tier `hasFree`, round the headline to ~1.6B, regenerate the per-provider table** — drops dead free tiers from the headline math and regenerates the per-provider free-tier table. ([#4142](https://github.com/diegosouzapw/OmniRoute/pull/4142)) - **fix(free-tiers): retire 4 re-verified-dead free tiers, flag iflytek/sparkdesk ToS, clarify monsterapi one-time** — removes four confirmed-dead free tiers and annotates ToS/one-time caveats. ([#4152](https://github.com/diegosouzapw/OmniRoute/pull/4152)) ### 🧪 Tests - **test(sse): guard the Antigravity `_toolNameMap` cloak map through the request-capture round-trip** — follow-up to #4091: the generic capture fix in `createPreparedRequestLogger().body()` (#4153) re-attaches the non-enumerable `_toolNameMap` that the request-inspector drops when it rebuilds the upstream body via `JSON.parse(JSON.stringify(...))`, but the only regression test covered the native-Claude OAuth cloak (PascalCase aliases). The Antigravity cloak differs — `cloakAntigravityToolPayload` suffixes custom tools with `_ide` (`workspace_read` → `workspace_read_ide`), leaves native tools untouched, and returns the reverse map separately — so a refactor of `providerRequestLogging.ts` or the executor could silently re-break Antigravity tool dispatch without tripping the Claude test. Adds a dedicated regression test driving the real `cloakAntigravityToolPayload` through the capture round-trip and asserting the `_ide` reverse map survives, stays non-enumerable (never re-serializes upstream), and that all-native traffic produces no spurious map (verified failing with the #4153 re-attach removed). No production change. ([#4181](https://github.com/diegosouzapw/OmniRoute/issues/4181) — thanks @hertznsk) - **test(chatcore): dedicated unit tests for 6 leaves + wire into stryker mutate (QG v2 Fase 9 T5 Fase 3)** — adds focused unit tests for 6 chatCore leaf helpers and enrolls them in mutation testing. ([#4218](https://github.com/diegosouzapw/OmniRoute/pull/4218)) - **test(chatcore): telemetry / memory-skills / semantic-cache tests + wire 2 into stryker (QG v2 Fase 9 T5 Fase 3)** — new tests for the telemetry, memory-skills and semantic-cache leaves, two of which are added to the mutation set. ([#4222](https://github.com/diegosouzapw/OmniRoute/pull/4222)) - **test+ci(chatcore): semanticCache HIT-path fixture (15/15 mutate) + 350min budget headroom** — closes the semantic-cache HIT path to a full 15/15 mutation score and gives the nightly auth/accountFallback batches more budget headroom. ([#4225](https://github.com/diegosouzapw/OmniRoute/pull/4225)) - **test(compression): close F5.1 coverage gaps (replay reducer, live accumulator, StatusDot)** — fills the remaining F5.1 compression coverage gaps. ([#4192](https://github.com/diegosouzapw/OmniRoute/pull/4192)) - **test(db,sse): de-flake db-backup + chatcore streaming timing assertions** — stabilizes two timing-sensitive tests (fire-and-forget backup completion + a streaming race). ([#4132](https://github.com/diegosouzapw/OmniRoute/pull/4132)) - **test: align stale integration tests surfaced post-v3.8.28 on main** — realigns integration tests that drifted after the v3.8.28 merge. ([#4129](https://github.com/diegosouzapw/OmniRoute/pull/4129)) ### 📝 Maintenance - **refactor(sse): split chatCore.ts pure helpers into chatCore/ modules (−561 LOC)** — extracts pure helpers out of the chatCore god-file into dedicated modules (Onda 3). ([#4159](https://github.com/diegosouzapw/OmniRoute/pull/4159)) - **refactor(chatcore): extract passthrough/header/telemetry helpers (QG v2 Fase 9 T5 C2-C3-C5)** — further chatCore decomposition. ([#4188](https://github.com/diegosouzapw/OmniRoute/pull/4188)) - **refactor(chatcore): extract combo/proxy context cache + semaphore helpers (QG v2 Fase 9 T5 C6-C7)** — continues the chatCore split. ([#4193](https://github.com/diegosouzapw/OmniRoute/pull/4193)) - **refactor(combo): god-file split pilot — types + validateQuality + predicates (QG v2 Fase 9 T5 D1-D3)** — first slice of the combo.ts decomposition. ([#4162](https://github.com/diegosouzapw/OmniRoute/pull/4162)) - **refactor(combo): god-file split part 2 — shadow + sorters + structure (QG v2 Fase 9 T5 D4-D6)** — continues the combo.ts split. ([#4175](https://github.com/diegosouzapw/OmniRoute/pull/4175)) - **refactor(combo): god-file split part 3 — auto strategy (QG v2 Fase 9 T5 D8)** — extracts the auto strategy from combo.ts. ([#4186](https://github.com/diegosouzapw/OmniRoute/pull/4186)) - **refactor(combo): extract round-robin sticky state to `combo/rrState.ts` (D7a)** — moves round-robin sticky state into its own module. ([#4196](https://github.com/diegosouzapw/OmniRoute/pull/4196)) - **refactor(combo): extract the reset-aware quota block to `combo/quotaStrategies.ts` (D7b)** — moves the reset-aware quota strategies into their own module. ([#4204](https://github.com/diegosouzapw/OmniRoute/pull/4204)) - **refactor(compression): remove vestigial SLM seam + dead deprecated alias** — drops dead compression code. ([#4253](https://github.com/diegosouzapw/OmniRoute/pull/4253)) - **chore(compression): remove vestigial reconstructCcr/SessionDedup round-trip helpers** — removes unused round-trip helpers. ([#4226](https://github.com/diegosouzapw/OmniRoute/pull/4226)) - **chore(compression): remove dead exports + fix stale llmlingua docs** — prunes dead exports and corrects stale LLMLingua docs. ([#4223](https://github.com/diegosouzapw/OmniRoute/pull/4223)) - **chore(build): build + ship the TPROXY native addon in the standalone (prebuilds 4e)** — bundles the native TPROXY addon prebuilds into the standalone build. ([#4236](https://github.com/diegosouzapw/OmniRoute/pull/4236)) - **chore(ci): add quota + 6 covered chatCore leaves to stryker mutate (QG v2 Fase 9 T5 Fase 3 follow-up)** — enrolls more covered leaves into mutation testing. ([#4209](https://github.com/diegosouzapw/OmniRoute/pull/4209)) - **chore(ci): re-add 8 combo split leaves to stryker mutate + expand nightly batch-matrix 3→5 (QG v2 Fase 9 T5 Fase 3)** — restores mutation coverage for the split combo leaves and widens the nightly matrix. ([#4205](https://github.com/diegosouzapw/OmniRoute/pull/4205)) - **chore(quality): close v3.8.28 cycle gate drift (re-baseline + nightly-mutation scope)** — reconciles quality-gate baselines after the v3.8.28 cycle. ([#4135](https://github.com/diegosouzapw/OmniRoute/pull/4135)) - **ci(mutation): split nightly into 3 parallel batches to fit the 180min budget (QG v2 Fase 9 T0)** — parallelizes the nightly mutation run. ([#4150](https://github.com/diegosouzapw/OmniRoute/pull/4150)) - **ci(mutation): restore cold-seed timeout headroom (a/b lost in #4225 squash) + extend to c/d/g/h** — restores and extends per-batch cold-seed timeouts. ([#4258](https://github.com/diegosouzapw/OmniRoute/pull/4258)) - **ci(docs): harden the fabricated-docs checker + enforce `--strict` (QG v2 Fase 9 T9)** — tightens the anti-hallucination docs checker. ([#4149](https://github.com/diegosouzapw/OmniRoute/pull/4149)) - **ci: derive the oasdiff base-ref from the package version + flag the mutation-toolchain regression** — fixes the OpenAPI-diff base-ref and surfaces a mutation-toolchain regression. ([#4134](https://github.com/diegosouzapw/OmniRoute/pull/4134)) - **docs(ci): correct the mutation-gate note (no regression — `stryker -c` is `--concurrency`); record Task 12 GO** — corrects a misread of the stryker flag and records the spike GO. ([#4138](https://github.com/diegosouzapw/OmniRoute/pull/4138)) - **docs(api): document the `/api/v1/ws` chat WebSocket endpoint in openapi.yaml** — adds the WebSocket chat endpoint to the OpenAPI spec. ([#4215](https://github.com/diegosouzapw/OmniRoute/pull/4215)) - **docs(readme): expand Acknowledgments into a themed, star-counted credits hall** — reworks the README acknowledgments section. ([#4195](https://github.com/diegosouzapw/OmniRoute/pull/4195)) - **style(dashboard): shrink the identity grid cell 46px → 32px (~30% smaller)** — tightens the identity grid density. ([#4143](https://github.com/diegosouzapw/OmniRoute/pull/4143)) ### 🔧 Dependencies - **deps: bump the production group with 5 updates** — routine production-dependency bumps. ([#4121](https://github.com/diegosouzapw/OmniRoute/pull/4121)) - **chore(deps): bump github/codeql-action from 3 to 4** — CI action update. ([#4120](https://github.com/diegosouzapw/OmniRoute/pull/4120)) - **chore(deps): bump actions/setup-python from 5 to 6** — CI action update. ([#4119](https://github.com/diegosouzapw/OmniRoute/pull/4119)) --- ## [3.8.28] — 2026-06-17 ### ✨ New Features - **feat(providers): add OrcaRouter (OpenAI-compatible routing gateway)** — OrcaRouter is now registered as an API-key provider. Its adaptive router is exposed as `orcarouter/auto` (smart routing across 150+ upstream models), alongside a curated flagship set (GPT-5.5, Gemini 3.5 Flash, Claude Opus 4.8, Grok 4.3, DeepSeek V4 Pro, MiniMax M2.7, Qwen3.7 Max). `passthroughModels` is enabled so any OrcaRouter model id works. OpenAI-compatible endpoint (`https://api.orcarouter.ai/v1`), Bearer (`sk-orca-…`) auth — no custom executor or translator required. ([#4070](https://github.com/diegosouzapw/OmniRoute/pull/4070) — thanks @jinhaosong-source) - **feat(providers): add Wafer AI (Anthropic-compatible, Bearer auth)** — Wafer AI is now a built-in provider speaking the Anthropic Messages format with Bearer authentication, registered with its model catalog so it works out of the box. ([#4098](https://github.com/diegosouzapw/OmniRoute/pull/4098) — thanks @diegosouzapw) - **feat(cli): `omniroute launch` — zero-config Claude Code launcher** — a new CLI subcommand that boots OmniRoute (if not already running) and launches Claude Code pre-wired to it, with no manual env/settings editing. ([#4097](https://github.com/diegosouzapw/OmniRoute/pull/4097) — thanks @diegosouzapw) - **feat(api): exact offline token counting for the `count_tokens` fallback via tiktoken** — the local `count_tokens` fallback now uses a real tiktoken (BPE) tokenizer for exact offline counts instead of a heuristic estimate, so token budgeting is accurate even when no upstream count endpoint is reachable. ([#4087](https://github.com/diegosouzapw/OmniRoute/pull/4087) — thanks @diegosouzapw) - **feat(sse): Claude Code quota-probe bypass + command meta-request helpers** — ported from free-claude-code: OmniRoute now recognizes Claude Code's quota-probe and command meta-requests and serves them locally instead of burning an upstream call, reducing wasted quota during CLI sessions. ([#4083](https://github.com/diegosouzapw/OmniRoute/pull/4083) — thanks @diegosouzapw) - **feat(sse): generic 400 field-downgrade retry + Groq field stripping** — when an upstream rejects a request with `400` because of an unsupported field, OmniRoute now strips the offending field and retries (a generic downgrade path), with Groq-specific field stripping wired in. Aligned with the existing `context_management` retry handling. ([#4096](https://github.com/diegosouzapw/OmniRoute/pull/4096) — thanks @diegosouzapw) - **feat(sse): delegated Anthropic Context Editing — relay coverage + 400-fallback** — extends the Claude server-side Context Editing delegation (#4021) with broader relay coverage and a `400`-fallback so a request the upstream rejects for the context-management beta degrades gracefully instead of failing. ([#4065](https://github.com/diegosouzapw/OmniRoute/pull/4065) — thanks @diegosouzapw) - **feat(compression): record per-engine Context Editing telemetry** — the compression pipeline now records a `context-editing` engine entry so the dashboard attributes server-side Context Editing savings alongside the local compression engines. ([#4062](https://github.com/diegosouzapw/OmniRoute/pull/4062) — thanks @diegosouzapw) - **feat(compression): RTK learn/discover (sample source + API + UI)** — the rule-based RTK compression engine gains a learn/discover workflow: sample a source, surface candidate rules through a new API, and review/apply them in the dashboard. ([#4088](https://github.com/diegosouzapw/OmniRoute/pull/4088) — thanks @diegosouzapw) - **feat(dashboard): 2026-06-17 free-tier refresh — honest catalog, uncapped + boost tiers, Layout A budget table** — the free-tier page was refreshed with an honest, deep-researched catalog (pooled/realistic figures rather than inflated 24/7 RPM math), new `recurring-uncapped` and boost tiers, new providers, and a KPI + budget table (Layout A). ([#4089](https://github.com/diegosouzapw/OmniRoute/pull/4089) — thanks @diegosouzapw) - **feat(dashboard): Combo Studio connection-cooldown badge (U1b Slice 2)** — the Combo Live cascade now surfaces each connection's cooldown state as a badge, complementing the circuit-breaker badge shipped in 3.8.27. ([#4068](https://github.com/diegosouzapw/OmniRoute/pull/4068) — thanks @diegosouzapw) - **feat(mitm): attribute intercepted requests to the originating process (Gap 1)** — the Traffic Inspector now resolves each intercepted connection back to the originating local process (via `/proc`), so captured traffic can be attributed to the app that produced it. (ProxyBridge-inspired hardening.) ([#4085](https://github.com/diegosouzapw/OmniRoute/pull/4085) — thanks @diegosouzapw) - **feat(mitm): capture-pipeline self-test route (Gap 12)** — a diagnostic route that exercises the MITM capture pipeline end-to-end so operators can confirm interception is working without crafting a real upstream call. ([#4093](https://github.com/diegosouzapw/OmniRoute/pull/4093) — thanks @diegosouzapw) - **feat(mitm): loop-guard self-check + verbosity control in `server.cjs` (Gaps 14+15)** — the MITM proxy gains a self-referential loop guard (so it never proxies its own traffic into an infinite loop) and a `MITM_VERBOSE` routing-decision log level. ([#4101](https://github.com/diegosouzapw/OmniRoute/pull/4101) — thanks @diegosouzapw) - **feat(agent-bridge): portable JSON import/export of config (Gap 4)** — the Agent Bridge / MITM configuration can now be exported to and imported from a portable JSON file, so a working setup can be backed up or moved between machines. ([#4094](https://github.com/diegosouzapw/OmniRoute/pull/4094) — thanks @diegosouzapw) ### 🐛 Fixed - **fix(ws): start the LiveWS sidecar with `cwd` at the package root (global/systemd installs)** — the standalone LiveWS launcher (`scripts/start-ws-server.mjs`) re-spawns itself with `node --import tsx ` but did not set `cwd`. When the WebSocket sidecar was launched from outside the package directory — a global npm/homebrew install, or a `systemd`/`launchd` unit started from `$HOME` — Node could not resolve the `tsx` package (`ERR_MODULE_NOT_FOUND: Cannot find package 'tsx'`), and even from the package directory `tsx` could not resolve the tsconfig `@/*` path aliases (e.g. `@/types/databaseSettings`), so the sidecar never booted. The spawn now pins `cwd` to the package root (the directory above `scripts/`, where `package.json` + `tsconfig.json` live), which resolves both `tsx` discovery and the `@/*` aliases regardless of launch directory. ([#4055](https://github.com/diegosouzapw/OmniRoute/issues/4055) — thanks @Rahulsharma0810) - **fix(dashboard): Logs page auto-refresh now works in embedded/proxied dashboards** — the Request Logger gated each auto-refresh tick on a static `document.visibilityState === "visible"` read. Hosts that report a permanent non-`"visible"` state without ever firing a `visibilitychange` event (Docker dashboard wrappers, embedded webviews) froze auto-refresh entirely — only the manual Refresh button worked, a regression from 3.8.24's unconditional polling. The pause is now event-driven and fail-open: polling starts enabled and only pauses after a real `visibilitychange` → hidden transition (still preserving the backgrounded-tab optimization for normal browser tabs). ([#4054](https://github.com/diegosouzapw/OmniRoute/issues/4054) — thanks @tjengbudi) - **fix(docker): raise the build-stage Node heap to stop the production-build OOM** — the Docker `builder` stage ran `npm run build` with V8's default heap ceiling (~2 GB). After #4052 forced the heavier webpack engine (Turbopack panics on this Next.js version), the production optimization pass exceeded that ceiling and the build died with `FATAL ERROR: … JavaScript heap out of memory` at `[builder] npm run build`. The builder stage now sets `NODE_OPTIONS=--max-old-space-size` (default 4096 MB, overridable via `--build-arg OMNIROUTE_BUILD_MEMORY_MB=…`) before the build; the value propagates to the spawned `next build`. Build-only — the runtime heap (`OMNIROUTE_MEMORY_MB` on the runner stage) is unchanged. ([#4076](https://github.com/diegosouzapw/OmniRoute/issues/4076) — thanks @kamenkadmitry) - **fix(dashboard): "Update Available" banner reappears reliably across Docker/npm/desktop installs** — the home-page banner is gated on `GET /api/system/version`'s `updateAvailable`, which derived the latest version ONLY from `npm info omniroute version --json` via the `npm` CLI binary. When that binary is absent from the runtime PATH (Docker/desktop/locked-down installs) or the registry is unreachable, the call returned `null` → `updateAvailable=false` → the banner silently never rendered even when a newer release existed. The route now resolves the latest version through `resolveLatestVersion()`: the fast `npm` CLI path first, then an npm-binary-free fallback over the registry HTTP API (`registry.npmjs.org/omniroute/latest`), and a logged warning instead of silent degradation when both fail. Version comparison was also hardened to tolerate `v`-prefixed and pre-release version strings. ([#4100](https://github.com/diegosouzapw/OmniRoute/issues/4100)) - **fix(sse): route image requests only to confirmed-vision combo targets** — a combo could route an image-bearing request to a member that doesn't actually support vision. Routing now requires `supportsVision === true` (plus a model-id heuristic) before sending images to a target, so multimodal requests land only on members that can handle them. ([#4071](https://github.com/diegosouzapw/OmniRoute/pull/4071) — thanks @diego-anselmo) - **fix(security): injection guard respects the `INJECTION_GUARD_MODE` DB feature flag** — the prompt-injection guard ignored the database feature-flag, so operators couldn't change its mode at runtime; it now reads the flag and honors the configured mode. ([#4077](https://github.com/diegosouzapw/OmniRoute/pull/4077) — thanks @zhiru) - **fix(ws): proxy LAN `/live-ws` upgrades and warn on an unset `JWT_SECRET`** — WebSocket upgrade requests arriving over the LAN proxy path were not forwarded to the LiveWS sidecar; they are now proxied correctly, and the server logs a clear warning when `JWT_SECRET` is unset. ([#4079](https://github.com/diegosouzapw/OmniRoute/pull/4079) — thanks @Rahulsharma0810) - **fix(dev): force webpack in the custom dev server (Turbopack 16.2.x panics)** — the custom dev server now forces the webpack engine because Turbopack panics on this Next.js version, so `npm run dev` boots reliably. ([#4092](https://github.com/diegosouzapw/OmniRoute/pull/4092) — thanks @chirag127) - **fix(auto): resolve built-in `auto/*` catalog combos** — referencing a built-in `auto/*` combo returned a premature `400` because the catalog entry wasn't resolved; the built-in auto catalog is now resolved before validation so those combos work. ([#4058](https://github.com/diegosouzapw/OmniRoute/pull/4058) — thanks @megamen32) - **fix(sse): friendly 413 message for ChatGPT-web payload-too-large** — an oversized ChatGPT-web payload returned an opaque error; it now returns a clear `413` with a human-readable message. ([#4080](https://github.com/diegosouzapw/OmniRoute/pull/4080) — thanks @diegosouzapw) - **fix(ws): warm the SSE auth import on LiveWS startup; relocate the boot test to integration** — the LiveWS sidecar now pre-imports the SSE auth module at startup to avoid a first-request stall, and its boot test was moved to the integration suite. ([#4063](https://github.com/diegosouzapw/OmniRoute/pull/4063) — thanks @diegosouzapw) - **fix(mitm): crash-safe system-state teardown + socket timeouts (ProxyBridge-inspired hardening)** — the MITM proxy could leave the host's system proxy settings applied if it crashed mid-teardown, and long-lived tunnels could leak as half-open sockets. Teardown is now crash-safe (system state is always restored) and proxied sockets get an idle timeout (`MITM_IDLE_TIMEOUT_MS`, default 60s). ([#4084](https://github.com/diegosouzapw/OmniRoute/pull/4084) — thanks @diegosouzapw) - **fix(responses): clear the `/v1/responses` keep-alive timer on cancel/abort** — a cancelled or aborted `/v1/responses` stream left its keep-alive timer running, leaking a timer and burning CPU; the timer is now cleared on cancel/abort. ([#4105](https://github.com/diegosouzapw/OmniRoute/pull/4105) — thanks @artickc) - **fix(usage): reap orphaned pending-request details (unbounded memory leak)** — pending-request detail entries whose request never completed accumulated without bound; they are now reaped, closing a slow memory leak. ([#4107](https://github.com/diegosouzapw/OmniRoute/pull/4107) — thanks @artickc) - **fix(auth): prune expired entries from the login brute-force guard map (unbounded growth)** — the login brute-force guard map grew without bound because expired entries were never removed; expired entries are now pruned. ([#4111](https://github.com/diegosouzapw/OmniRoute/pull/4111) — thanks @artickc) - **fix(logger): hard-cap the error-dedup map to bound memory under unique-message bursts** — a burst of unique error messages could grow the dedup map without limit; it is now hard-capped. ([#4113](https://github.com/diegosouzapw/OmniRoute/pull/4113) — thanks @artickc) - **fix(circuit-breaker): enforce `MAX_REGISTRY_SIZE` (declared but never applied)** — the circuit-breaker registry declared a maximum size that was never enforced, so it could grow unbounded; the cap is now applied. ([#4114](https://github.com/diegosouzapw/OmniRoute/pull/4114) — thanks @artickc) - **fix(webhook): clear the abort timer in `finally` to avoid dangling timers on fetch error** — a webhook dispatch that threw before clearing its abort timer left the timer dangling; it is now cleared in a `finally` block. ([#4115](https://github.com/diegosouzapw/OmniRoute/pull/4115) — thanks @artickc) - **fix(combo): detach the per-target listener from the shared hedge abort signal** — combo hedging attached a per-target listener to a shared abort signal without detaching it, leaking listeners across requests; the listener is now detached. ([#4116](https://github.com/diegosouzapw/OmniRoute/pull/4116) — thanks @artickc) - **fix(timers): unref background interval timers so they don't block clean shutdown** — long-lived background interval timers kept the event loop alive and blocked a clean process exit; they are now `unref`'d. ([#4117](https://github.com/diegosouzapw/OmniRoute/pull/4117) — thanks @artickc) ### ⚡ Performance - **perf(registry): precompute the model→provider index in `parseModelFromRegistry`** — model→provider lookups now use a precomputed index instead of scanning the registry on every call. ([#4110](https://github.com/diegosouzapw/OmniRoute/pull/4110) — thanks @artickc) - **perf(obfuscation): cache per-word regexes instead of recompiling every request** — the obfuscation pass now caches its per-word regexes rather than recompiling them on each request. ([#4109](https://github.com/diegosouzapw/OmniRoute/pull/4109) — thanks @artickc) - **perf(stream): use `structuredClone` instead of a JSON round-trip for per-chunk reasoning split** — the per-chunk reasoning split now clones with `structuredClone` rather than `JSON.parse(JSON.stringify(...))`. ([#4108](https://github.com/diegosouzapw/OmniRoute/pull/4108) — thanks @artickc) - **perf(gemini): cache the reasoning close-tag regex instead of recompiling per token** — the Gemini reasoning close-tag regex is now compiled once and reused instead of per token. ([#4106](https://github.com/diegosouzapw/OmniRoute/pull/4106) — thanks @artickc) ### 📝 Maintenance - **ci(quality): flip the TIA impacted-unit-tests gate from advisory to blocking (Fase 9)** — the test-impact-analysis gate that runs the unit tests impacted by a diff is now blocking on PRs. ([#4069](https://github.com/diegosouzapw/OmniRoute/pull/4069) — thanks @diegosouzapw) - **ci(quality): dedup the doubly-run `check:docs-sync` + record the validated ROI backlog (Fase 9)** — `check:docs-sync` was running twice in CI; the duplicate was removed and the validated quality-gate ROI backlog recorded. ([#4099](https://github.com/diegosouzapw/OmniRoute/pull/4099) — thanks @diegosouzapw) - **docs(quality-gates): reconcile the gate inventory with `ci.yml` + add the ROI rationalization backlog** — the quality-gate inventory doc was reconciled against the actual CI jobs and a rationalization backlog added. ([#4095](https://github.com/diegosouzapw/OmniRoute/pull/4095) — thanks @diegosouzapw) - **test(infra): isolate `DATA_DIR` per test process; raise Stryker concurrency 1→4** — test processes now get an isolated `DATA_DIR` (no shared-DB cross-talk) and the mutation runner's concurrency was raised. ([#4078](https://github.com/diegosouzapw/OmniRoute/pull/4078) — thanks @diegosouzapw) - **test(dashboard): smoke e2e for the Combo Live Studio page** — adds a Playwright smoke test covering the Combo Live Studio page. ([#4075](https://github.com/diegosouzapw/OmniRoute/pull/4075) — thanks @diegosouzapw) - **docs(compression): document LLMLingua optional deps + on-demand install** — documents the optional LLMLingua dependencies and how they are installed on demand. ([#4061](https://github.com/diegosouzapw/OmniRoute/pull/4061) — thanks @diegosouzapw) - **chore(deps): freeze `@huggingface/transformers` in dependabot (hard-pin)** — the transformers dependency is hard-pinned and frozen in dependabot to protect the VPS-validated LLMLingua + memory-embeddings stack from a breaking major bump. ([#4066](https://github.com/diegosouzapw/OmniRoute/pull/4066) — thanks @diegosouzapw) - **chore(docs): update the Discord invite link to a non-expiring one** — replaces the expiring Discord invite with a permanent link. ([#4067](https://github.com/diegosouzapw/OmniRoute/pull/4067) — thanks @diegosouzapw) - **chore(docs): document the new MITM env vars + reconcile the env-doc contract** — documents `MITM_IDLE_TIMEOUT_MS` and `MITM_VERBOSE` in `.env.example` + `ENVIRONMENT.md`, allowlists the framework-internal `TURBOPACK` and the Claude Code `ANTHROPIC_AUTH_TOKEN`, and relocates/prunes stale provider/guide docs. (thanks @diegosouzapw) ### 🔧 Dependencies - **deps: bump the development group with 10 updates** — routine dependabot dev-dependency bumps. ([#4051](https://github.com/diegosouzapw/OmniRoute/pull/4051)) - **deps(electron): bump electron 42.4.0 → 42.4.1** — ([#4049](https://github.com/diegosouzapw/OmniRoute/pull/4049)) - **ci(deps): bump `actions/setup-node` 4 → 6** — ([#4048](https://github.com/diegosouzapw/OmniRoute/pull/4048)) - **ci(deps): bump `actions/cache` 4.3.0 → 5.0.5** — ([#4047](https://github.com/diegosouzapw/OmniRoute/pull/4047)) - **ci(deps): bump `actions/github-script` 7 → 9** — ([#4046](https://github.com/diegosouzapw/OmniRoute/pull/4046)) - **ci(deps): bump `ossf/scorecard-action` 2.4.0 → 2.4.3** — ([#4045](https://github.com/diegosouzapw/OmniRoute/pull/4045)) - **ci(deps): bump `actions/upload-artifact` 4 → 7** — ([#4044](https://github.com/diegosouzapw/OmniRoute/pull/4044)) --- ## [3.8.27] — 2026-06-17 ### ✨ New Features - **feat(combos): advertise combo capabilities (multimodal / reasoning / caching) on the import surfaces** — importing a combo package into a client (LobeHub / OpenCode / VS Code, via `/v1/combos` and the VS Code combo catalog) no longer requires manually enabling multimodal/image-input, reasoning, and caching afterwards. `projectCombo` now attaches a registry-derived `capabilities` block, gated conservatively: `multimodal`/`reasoning` are advertised only when **every** concrete model step proves the capability (an unprovable nested combo-ref drops them, since the strategy may route to any member), and `caching` reflects the combo's explicit Context-Cache-Protection setting (no surprise prompt-cache cost). The public `/v1/combos` default projection (#2300) is unchanged unless the caller opts in. ([#3979](https://github.com/diegosouzapw/OmniRoute/issues/3979) — thanks @xenstar) - **feat(sse): delegated Anthropic Context Editing for Claude (`clear_tool_uses`)** — Claude requests can now offload context trimming to Anthropic's server-side context-management API (beta `context-management-2025-06-27`, `clear_tool_uses_20250919`), pruning stale tool-use turns upstream instead of locally. Claude-only by nature (the edit runs server-side); multi-provider context trimming remains the job of the local compression engines. ([#4021](https://github.com/diegosouzapw/OmniRoute/pull/4021) — thanks @diegosouzapw) - **feat(sse): real LLMLingua-2 ONNX compression engine (stable)** — the LLMLingua-2 prompt-compression engine is now a real local ONNX model (TinyBERT default, transformers.js + tfjs), promoted to stable after VPS validation, replacing the previous placeholder. ([#4014](https://github.com/diegosouzapw/OmniRoute/pull/4014) — thanks @diegosouzapw) - **feat(compression): capture per-engine analytics + Lite schema fix** — the compression pipeline now persists a per-engine breakdown for historical analytics so the dashboard can attribute savings to each engine in a stacked pipeline, and a Lite-schema mismatch was corrected. ([#4018](https://github.com/diegosouzapw/OmniRoute/pull/4018) — thanks @diegosouzapw) - **feat(dashboard): real circuit-breaker state in the Combo Live cascade (U1b)** — the Combo Live cascade view now surfaces each provider's real circuit-breaker state (CLOSED / OPEN / HALF_OPEN) as a badge, read live from `/api/monitoring/health`, instead of inferring health from request outcomes. ([#4029](https://github.com/diegosouzapw/OmniRoute/pull/4029) — thanks @diegosouzapw) - **feat(openai): honor a custom base URL in model discovery + complete openai/codex pricing** — OpenAI-format providers configured with a custom base URL now have that URL honored during model discovery (not just inference), and the openai/codex pricing table was completed. Discovery is routed through the SSRF-guarded outbound fetch. ([#4005](https://github.com/diegosouzapw/OmniRoute/pull/4005) — thanks @artickc) - **feat(observability): capture actual upstream provider requests** — the request inspector now records the exact payload sent to the upstream provider (post-translation), so you can see what OmniRoute actually dispatched rather than only the client's original request. ([#3941](https://github.com/diegosouzapw/OmniRoute/pull/3941) — thanks @rdself) - **feat(providers): provider auth visibility controls** — adds controls to show/hide provider auth details in the dashboard so credentials can be revealed only when needed. ([#3953](https://github.com/diegosouzapw/OmniRoute/pull/3953) — thanks @rdself) - **feat(providers): model search filter on the provider dashboard** — the provider dashboard gains a search filter to quickly narrow a provider's model list. ([#3950](https://github.com/diegosouzapw/OmniRoute/pull/3950) — thanks @felipesartori) - **feat(compression): Indonesian caveman rules + language pack** — adds an Indonesian "caveman" rule set and language pack to the rule-based compression engine. ([#3975](https://github.com/diegosouzapw/OmniRoute/pull/3975) — thanks @Veier04) - **feat(dashboard): sidebar group separator toggles** — the dashboard sidebar can now toggle group separators for a cleaner navigation layout. ([#3971](https://github.com/diegosouzapw/OmniRoute/pull/3971) — thanks @rdself) - **feat(api): local `@@om-usage` command for cached per-key usage** — API clients can send a message that is exactly `@@om-usage` to retrieve cached Claude-style usage data locally, without forwarding the prompt to an upstream provider. Gated by a new per-key allowance flag. ([#4034](https://github.com/diegosouzapw/OmniRoute/pull/4034) — thanks @Witroch4) ### 🐛 Fixed - **fix(opencode): forward the OpenCode session id to the upstream regardless of how the user named the provider** — the `OpencodeExecutor` forwarded the `x-opencode-session/request/project/client` headers, but the OpenCode CLI only emits those when the configured `providerID` **starts with** `"opencode"`. A user who adds OmniRoute as a custom provider (e.g. `"omniroute"`) makes the CLI send `x-session-affinity` / `X-Session-Id` instead (both carry the same session id), which the executor never read — so the session-metadata forwarding was effectively dead code for the realistic provider-naming case. The opencode-family executor now falls back to `x-session-affinity` / `X-Session-Id` and maps it onto `x-opencode-session` when the client didn't send the header directly, so session continuity to the `opencode.ai` upstream works for any provider name (a direct `x-opencode-session` still wins). Scoped to this executor only — the generic `DefaultExecutor` intentionally does **not** do this, to avoid leaking the client session id to arbitrary third-party upstreams. ([#4022](https://github.com/diegosouzapw/OmniRoute/issues/4022) — thanks @pizzav-xyz) - **fix(guardrails): Vision Bridge no longer drops the image when the describe call fails (Nvidia NIM "Image unavailable")** — the Vision Bridge is enabled by default and engages for any model whose vision capability OmniRoute can't prove from the registry (`supportsVision !== true`, which includes uncatalogued models that resolve to `null`). When the per-image describe call failed (e.g. no vision model configured), it replaced the image with the literal text `[Image N]: (unavailable)` and dropped the original `image_url` — so a genuinely vision-capable upstream (Nvidia NIM) received text only and answered "Image unavailable. Cannot provide description without visual data." A describe failure is no longer destructive: `replaceImageParts` now receives `null` for failed images and **preserves the original image part** so the upstream can still see it (successful describes still replace the image with the text description; `meta.descriptions` observability is unchanged). ([#4012](https://github.com/diegosouzapw/OmniRoute/issues/4012) — thanks @daniij) - **fix(kiro): preserve `finish_reason: "tool_calls"` on the Kiro streaming path** — streaming tool-call requests through the Kiro (Responses API) provider had their terminal `finish_reason` reported as `"stop"` instead of `"tool_calls"`, so agent clients (Hermes) treated the tool-call turn as a finished turn, never ran the tool, and the next request failed with HTTP 400 on the incomplete tool state. `convertKiroToOpenAI`'s terminal `messageStopEvent`/`done` branch hardcoded `finish_reason: "stop"` regardless of whether the stream had emitted `toolUseEvent`s. The translator now records `state.sawToolUse` when a tool-use chunk is emitted and reports `finish_reason: "tool_calls"` on the terminal chunk (and in `state.finishReason`) whenever the stream produced tool calls. The non-streaming path was already correct. ([#3980](https://github.com/diegosouzapw/OmniRoute/issues/3980) — thanks @lordavadon2) - **fix(resilience): respect connection cooldown stored as a numeric epoch** — the router kept dispatching to connections still inside their rate-limit cooldown because `rate_limited_until` (a `TEXT` column) was persisted as a raw epoch number, which SQLite coerced to a string like `"1781696905131.0"` that `new Date(...)` parsed as `NaN`, so the cooling connection was never skipped. The cooldown read predicates now normalize numeric-epoch strings via a shared `cooldownUntilMs()` helper; ISO behavior is unchanged. ([#3995](https://github.com/diegosouzapw/OmniRoute/pull/3995) — thanks @diegosouzapw) - **fix(providers): fetch the live `/models` catalog for LLM7 and BytePlus** — importing an LLM7 or BytePlus key surfaced only a small, outdated hardcoded list because neither provider was classified by any live-fetch branch of the model-import route. Both are now in `NAMED_OPENAI_STYLE_PROVIDERS`, so the route probes `/models` with the key and serves the live catalog, falling back to the local catalog only when the upstream fetch fails. ([#3996](https://github.com/diegosouzapw/OmniRoute/pull/3996) — thanks @FerLuisxd / @diegosouzapw) - **fix(dashboard): logs auto-refresh reads live visibility, not a stale mount ref** — the Logs page never auto-refreshed when the tab loaded in the background because the auto-refresh interval gated each tick on a visibility ref seeded once at mount; the tick now reads the live `document.visibilityState`, so polling self-heals as soon as the tab is visible while still pausing when genuinely hidden. ([#3997](https://github.com/diegosouzapw/OmniRoute/pull/3997) — thanks @tjengbudi / @diegosouzapw) - **fix(combo): shuffle the strict-random fallback remainder to spread load** — with the `strict-random` strategy a persistently-failing model was retried on essentially every request because only the deck-selected slot 0 was shuffled while the fallback remainder stayed in fixed priority order; the remainder is now shuffled too, so fallback load (and recovery from a failing target) spreads evenly across healthy peers. ([#3998](https://github.com/diegosouzapw/OmniRoute/pull/3998) — thanks @KeNJiKunG / @diegosouzapw) - **fix(claude): forward the client `tool-search-tool-2025-10-19` anthropic-beta on the Claude OAuth path** — with deferred tools active, Claude Code negotiates the `tool-search-tool-2025-10-19` beta, but OmniRoute dropped it on both Claude code paths, so the claude.ai backend rejected every deferred-tool request with `400 Tool reference not found`. A new allowlist-merge (`mergeClientAnthropicBeta`) now unions the client's negotiated beta into the outbound set on both paths, appending only allowlisted client betas (preserving the #3415 fix). ([#3999](https://github.com/diegosouzapw/OmniRoute/pull/3999) — thanks @huohua-dev / @diegosouzapw) - **fix(executor): strip `stream_options` on non-streaming requests (NVIDIA NIM 400)** — clients that send `stream_options: { include_usage: true }` regardless of `stream` (e.g. the OpenAI Python SDK) had it passed through untouched on non-streaming calls, and NVIDIA NIM rejected it with `400 "Stream options can only be defined when stream=True"`. `DefaultExecutor.transformRequest` now strips `stream_options` whenever `stream` is false; the streaming injection path is unchanged. ([#4000](https://github.com/diegosouzapw/OmniRoute/pull/4000) — thanks @andrea-kingautomation / @daniij / @diegosouzapw) - **fix(sse): guard model-less registry entries in `getUnsupportedParams` (mimocode)** — a registry entry without a model map (mimocode) threw when computing unsupported params; the lookup now guards the model-less case so request validation no longer crashes. ([#4015](https://github.com/diegosouzapw/OmniRoute/pull/4015) — thanks @diegosouzapw) - **fix(perplexity-web): parse the schematized `diff_block` stream so answers aren't empty** — Perplexity web streamed its answer as RFC-6902 `diff_block` patches that OmniRoute didn't apply during the `PENDING` phase, so responses came back empty; the parser now applies the patches and materializes the text only on `COMPLETED`. ([#4001](https://github.com/diegosouzapw/OmniRoute/pull/4001) — thanks @artickc) - **fix(default-executor): honor a custom `providerSpecificData.baseUrl` for OpenAI-format providers** — OpenAI-format providers configured with a custom base URL had it ignored on the inference path; the default executor now honors `providerSpecificData.baseUrl` so requests reach the configured endpoint. ([#4002](https://github.com/diegosouzapw/OmniRoute/pull/4002) — thanks @artickc) - **fix(live-ws): bridge LiveWS sidecar events to the dashboard** — events emitted by the LiveWS sidecar were not reaching the dashboard; they are now bridged so live websocket activity is visible. (A cookie-auth regression in the sidecar's auth-token parsing was also corrected.) ([#4004](https://github.com/diegosouzapw/OmniRoute/pull/4004) — thanks @megamen32) - **fix(qwen-web): cookie validation false-positive — check the response body for a user object** — Qwen web cookie validation reported a valid cookie as invalid; it now inspects the response body for the `user` object instead of relying on the status code alone. ([#3958](https://github.com/diegosouzapw/OmniRoute/pull/3958) — thanks @thezukiru) - **fix(vision-bridge): force the bridge for tokenrouter deepseek models** — tokenrouter DeepSeek models are now forced through the Vision Bridge so image inputs are handled correctly. ([#3946](https://github.com/diegosouzapw/OmniRoute/pull/3946) — thanks @WormAlien) - **fix(api): return 400 (not 500) for malformed JSON on `/api/auth/login`** — a malformed JSON body on the login endpoint returned an opaque 500; it now returns a proper 400. ([#4031](https://github.com/diegosouzapw/OmniRoute/pull/4031) — thanks @rdself) - **fix(dashboard): Playground Compare tab loading + HTTP method guard** — the Playground Compare tab failed to load; the loading path was fixed and an HTTP method guard added. ([#4024](https://github.com/diegosouzapw/OmniRoute/pull/4024) — thanks @rdself) - **fix(proxy): gate the control-plane proxy direct fallback behind a feature flag (fail-closed)** — the direct-connection fallback for control-plane ops when a pinned proxy is unreachable is now gated behind a feature flag and fails closed, so a pinned proxy is never silently bypassed unless explicitly allowed. ([#3963](https://github.com/diegosouzapw/OmniRoute/pull/3963) — thanks @rdself) - **fix(db): persist backup retention days** — the backup retention-days setting was not persisted across restarts; it is now stored durably. ([#3970](https://github.com/diegosouzapw/OmniRoute/pull/3970) — thanks @rdself) - **fix(dashboard): refine the provider quota card display** — the provider quota card layout was refined for clearer quota/usage presentation. ([#3969](https://github.com/diegosouzapw/OmniRoute/pull/3969) — thanks @rdself) - **fix(dashboard): refine compression settings, storage labels, and sidebar grouping** — polishes the compression-settings UI, clarifies storage labels, and tidies the sidebar grouping. ([#4033](https://github.com/diegosouzapw/OmniRoute/pull/4033) — thanks @rdself) ### 🔒 Security & Hardening - **fix(security): eliminate a polynomial ReDoS in the combo `` tag regex** — `comboAgentMiddleware`'s cache-tag pattern wrapped the tag in an unbounded newline run (`(?:\n|\r)*`), making `.test()` / `.replace()` run in O(n²) on inputs with many newlines (CodeQL `js/polynomial-redos`). The detection pattern now matches only the core `` and the global strip pattern bounds the surrounding newline runs, keeping it linear; detection / extraction / multi-tag stripping behavior is unchanged. ([#3982](https://github.com/diegosouzapw/OmniRoute/pull/3982) — thanks @diegosouzapw) - **ci(security): harden workflows — artipacked `persist-credentials`, cache-poisoning, SC2086** — GitHub Actions workflows were hardened against the artipacked `persist-credentials` leak and cache-poisoning, and shell-quoting (`SC2086`) issues were fixed. ([#3965](https://github.com/diegosouzapw/OmniRoute/pull/3965) — thanks @diegosouzapw) - **ci(quality): flip require-tighten + osv + Trivy to blocking (cycle-end)** — the per-module require-tighten check and the OSV / Trivy scanners moved from advisory to blocking for the v3.8.27 cycle close, so new dependency or coverage regressions fail CI. ([#3984](https://github.com/diegosouzapw/OmniRoute/pull/3984) — thanks @diegosouzapw) - **chore(deps): dependabot security bumps + drop unused gray-matter** — applies a batch of Dependabot security bumps and removes the unused `gray-matter` dependency from the tree. ([#4036](https://github.com/diegosouzapw/OmniRoute/pull/4036) — thanks @diegosouzapw) - **chore(deps): automated dependency bumps** — Dependabot upgraded the production dependency group (13 updates), `vite`, `form-data`, and the `npm_and_yarn` group. ([#3915](https://github.com/diegosouzapw/OmniRoute/pull/3915), [#3942](https://github.com/diegosouzapw/OmniRoute/pull/3942), [#3943](https://github.com/diegosouzapw/OmniRoute/pull/3943), [#3944](https://github.com/diegosouzapw/OmniRoute/pull/3944) — thanks @dependabot) ### 🧹 Internal / Quality / Docs - **feat(ci): Quality Gate v2 — Onda 0 + Onda 1** — first two waves of the Quality Gate v2 program: gate flips, test-impact analysis (TIA), SAST, DAST-smoke, and mutation-testing infrastructure. ([#4016](https://github.com/diegosouzapw/OmniRoute/pull/4016) — thanks @diegosouzapw) - **refactor: modularize the provider registry into individual provider plugins** — `providerRegistry.ts` was split into individual per-provider plugin modules (non-stacked). A forward-fix restored the `byteplus` + `mimocode` modules dropped by the move. ([#3993](https://github.com/diegosouzapw/OmniRoute/pull/3993) — thanks @oyi77 / @diegosouzapw) - **refactor: modularize schemas (non-stacked)** — the request/response schema definitions were split into individual modules to reduce file size and improve maintainability. ([#3988](https://github.com/diegosouzapw/OmniRoute/pull/3988) — thanks @oyi77) - **fix: restore unit regressions dropped by the lossy schema/registry modularizations** — the schema/registry modularizations (#3988, #3993) silently dropped internal logic covered by unit tests; this PR restores the regressed units. ([#4030](https://github.com/diegosouzapw/OmniRoute/pull/4030) — thanks @diegosouzapw) - **refactor(dashboard): settings UI layout + API Keys naming** — the settings UI layout was reorganized and the "API Keys" naming clarified. ([#4020](https://github.com/diegosouzapw/OmniRoute/pull/4020) — thanks @rdself) - **大量UI显示和i18n优化 (dashboard UI display + i18n improvements)** — a batch of dashboard UI-display refinements and i18n string improvements. ([#3973](https://github.com/diegosouzapw/OmniRoute/pull/3973) — thanks @rdself) - **fix(ci): scope TIA to `node:test` unit files only** — test-impact analysis was matching files the `node:test` runner doesn't execute, producing 99 false failures; the TIA glob now mirrors the `test:unit` glob exactly. ([#4035](https://github.com/diegosouzapw/OmniRoute/pull/4035) — thanks @diegosouzapw) - **fix(ci): electron-release publish-npm needs `contents: write`** — the reusable npm-publish job invoked by the electron release lacked `contents: write`, causing a v3.8.26 `startup_failure`; the permission was granted. ([#3966](https://github.com/diegosouzapw/OmniRoute/pull/3966) — thanks @diegosouzapw) - **test(opencode-plugin): ESM default-export test (drop the stale CJS bundle test)** — replaces the stale CJS bundle test with an ESM default-export test, following up the #3883 ESM-only migration. ([#3967](https://github.com/diegosouzapw/OmniRoute/pull/3967) — thanks @diegosouzapw) - **fix(ci): Fix promptfoo security-assertion parsing** — the promptfoo (DAST/security eval) assertion parser was corrected so security assertions are read reliably. ([#4032](https://github.com/diegosouzapw/OmniRoute/pull/4032) — thanks @rdself) - **docs(troubleshooting): note that the MITM proxy cannot intercept Windows-host apps under WSL** — documents that the MITM proxy running inside WSL cannot intercept traffic from apps on the Windows host. ([#4003](https://github.com/diegosouzapw/OmniRoute/pull/4003) — thanks @diegosouzapw) - **chore(quality): maintenance roll-up** — assorted quality-gate hygiene that does not change runtime behavior: re-baseline `validation.ts` for the #3958 qwen body-check, allowlist the `socks` dependency declared by #4004, ignore jscpd major bumps (the v5 Rust rewrite breaks the pinned duplication gate), untrack an accidentally-committed root `node_modules` symlink (and gitignore it), rehome the #3972 logs auto-refresh test so a runner collects it, and open the v3.8.27 development cycle. (thanks @diegosouzapw) --- ## [3.8.26] — 2026-06-15 ### ✨ New Features - **feat(media): Vertex AI (Google) speech, transcription, music & video generation** — Vertex AI's Google media models are now routable through dynamic discovery: speech synthesis, audio transcription, music generation, and video generation. ([#3929](https://github.com/diegosouzapw/OmniRoute/pull/3929) — thanks @artickc) - **feat(glm): add GLM-5.2 with effort-tier routing (high/max)** — GLM-5.2 is registered with high/max effort-tier routing. ([#3885](https://github.com/diegosouzapw/OmniRoute/pull/3885) — thanks @dhaern) - **feat(combo): add a sticky round-robin target limit** — round-robin combos can cap how many targets stay "sticky" within a session (`stickyRoundRobinLimit`), balancing stickiness against spread. ([#3846](https://github.com/diegosouzapw/OmniRoute/pull/3846) — thanks @adivekar-utexas) - **feat(openrouter): connection presets** — OpenRouter connections support reusable presets (provider routing / sort / quantization preferences), selectable when adding a connection. ([#3878](https://github.com/diegosouzapw/OmniRoute/pull/3878) — thanks @rdself) ### 🐛 Fixed - **fix(executor): stop leaking `stream_options` onto non-streaming requests (NVIDIA NIM 400)** — clients that send `stream_options: { include_usage: true }` regardless of `stream` (e.g. the OpenAI Python SDK) had it passed through untouched on non-streaming calls, and NVIDIA NIM rejected it with `400 "Stream options can only be defined when stream=True"`. `DefaultExecutor.transformRequest` only injected/cleared `stream_options` on the streaming branch and had no branch to strip a client-sent value when the outbound request is non-streaming. It now strips `stream_options` whenever `stream` is false (the streaming injection path is unchanged). Affects all OpenAI-compatible providers; NIM is just the one that strictly rejects the violation. ([#3884](https://github.com/diegosouzapw/OmniRoute/issues/3884) — thanks @andrea-kingautomation / @daniij) - **fix(claude): forward the client-negotiated `anthropic-beta: tool-search-tool-2025-10-19` on the Claude OAuth path** — with `ENABLE_TOOL_SEARCH` active, Claude Code sends deferred tools + a `tool_search_tool_*` and negotiates the `tool-search-tool-2025-10-19` beta, but OmniRoute dropped that beta on **both** Claude code paths (the `default` executor rebuilt the header from the static `ANTHROPIC_BETA_CLAUDE_OAUTH` set, and `selectBetaFlags` only read the client beta to gate thinking/effort), so the claude.ai backend rejected every deferred-tool request with `400 Tool reference '' not found in available tools`. A new allowlist-merge (`mergeClientAnthropicBeta`) now unions the client's negotiated beta into the outbound set on both paths — appending only allowlisted client betas (currently just `tool-search-tool-2025-10-19`) so it never forces betas the client didn't request (preserving the #3415 fix) nor leaks betas the backend rejects. ([#3974](https://github.com/diegosouzapw/OmniRoute/issues/3974) — thanks @huohua-dev) - **fix(combo): strict-random spreads fallbacks across healthy peers instead of retrying a failing model** — with the `strict-random` strategy, a model that kept failing was retried on essentially every request and traffic concentrated on a few models. The strategy shuffled only the deck-selected slot 0 and left the fallback remainder in **fixed priority order**, so after any failing deck pick the dispatch chain always fell through to the same top-priority model next. The fallback remainder is now shuffled (like the `random` strategy), so the fallback load — and recovery from a persistently-failing target — spreads evenly across the healthy peers. (Note: when the client always sends `tools` (e.g. OpenCode), the combo still correctly routes only to the tool-capable models in the combo — that capability filtering is by design.) ([#3959](https://github.com/diegosouzapw/OmniRoute/issues/3959) — thanks @KeNJiKunG) - **fix(dashboard): Logs auto-refresh now works even when the tab loads in the background** — the Logs page never auto-refreshed (only the manual Refresh button worked). The auto-refresh interval gated each tick on a visibility ref seeded once at mount and updated only by a `visibilitychange` event; when the tab mounted while the document reported `hidden` (background load, bfcache restore, embedded/Docker-proxied webviews) and no `visibilitychange` ever fired, the ref stayed `false` forever, so the interval ticked but never fetched. The tick now reads the live `document.visibilityState`, so polling self-heals as soon as the tab is visible — while still pausing when genuinely hidden. ([#3972](https://github.com/diegosouzapw/OmniRoute/issues/3972) — thanks @tjengbudi) - **fix(providers): LLM7 (and BytePlus) now fetch the live `/models` catalog instead of a stale hardcoded list** — importing an LLM7 key surfaced only a small, outdated model list even though `GET https://api.llm7.io/v1/models` returns the full pro/standard catalog. Both providers carried a correct `modelsUrl` in the registry, but neither was classified by any live-fetch branch of the model-import route (not `openai-compatible-*`, not self-hosted, not in `NAMED_OPENAI_STYLE_PROVIDERS`), so the route skipped the upstream probe and served the registry's 4 hardcoded entries (`source: "local_catalog"`). Added `llm7` and `byteplus` to `NAMED_OPENAI_STYLE_PROVIDERS` so the route probes `/models` with the key and serves the live catalog, falling back to the local catalog only when the upstream fetch fails (so key import never breaks). ([#3976](https://github.com/diegosouzapw/OmniRoute/issues/3976) — thanks @FerLuisxd) - **fix(resilience): respect connection cooldown stored as a numeric epoch (router kept hammering 429 accounts)** — the router kept dispatching to connections still inside their rate-limit cooldown, causing client timeouts and "connection cooldown isn't respected" reports. Root cause: `rate_limited_until` is a `TEXT` column, but the Antigravity full-quota path (`setConnectionRateLimitUntil`) persists a raw epoch **number**, which SQLite coerces to a numeric string like `"1781696905131.0"`. The account-selection predicate then did `new Date("1781696905131.0")` → `Invalid Date` → `NaN`, so `NaN > Date.now()` was false and the cooling connection was never skipped. The cooldown read predicates (`isAccountUnavailable`, `getEarliestRateLimitedUntil`, `filterAvailableAccounts`, `parseFutureDateMs`) now normalize numeric-epoch strings as well as ISO strings/Date/number via a shared `cooldownUntilMs()` helper — ISO behavior is unchanged. ([#3954](https://github.com/diegosouzapw/OmniRoute/issues/3954)) - **fix(compression/memory): stop memory + compression from poisoning the upstream prompt cache** — with compression and/or memory enabled, requests to caching providers (Anthropic-family) missed the prompt cache on every turn, multiplying cost. Two root causes: (1) memory injection prepended the retrieved memories — which **vary per user query** — at index 0 of the message array, shifting the entire cacheable prefix every turn; memory is now inserted just before the last user message when the request carries `cache_control` breakpoints, keeping the cacheable prefix (system prompt + prior turns) byte-stable. (2) the cache-aware `skipSystemPrompt` flag computed by `getCacheAwareStrategy()` was dropped by `selectCompressionStrategy()` (which can only return a mode), so the system prompt could still be compressed under caching; a new `resolveCacheAwareConfig()` now forces `preserveSystemPrompt` on for caching requests. ([#3936](https://github.com/diegosouzapw/OmniRoute/pull/3936), closes [#3890](https://github.com/diegosouzapw/OmniRoute/issues/3890) — thanks @xenstar / @diegosouzapw) - **fix(providers): register BytePlus ModelArk so its API key can be added** — adding a BytePlus (`ark-…`) key reported "invalid". `byteplus` was present in the provider catalog (`APIKEY_PROVIDERS`) but **never registered in the routing registry**, so key validation fell through to `{ unsupported: true }` → HTTP 400 → the UI rendered every key as invalid (and the provider was unusable for inference). Added a registry entry modeled on the existing Volcengine Ark provider: OpenAI-compatible format, base `https://ark.ap-southeast.bytepluses.com/api/v3` (region `ap-southeast-1`), `Authorization: Bearer` auth, seeded with the catalog's advertised models (Seed 2.0, Kimi K2 Thinking, GLM 4.7, GPT-OSS-120B). ([#3935](https://github.com/diegosouzapw/OmniRoute/pull/3935), closes [#3877](https://github.com/diegosouzapw/OmniRoute/issues/3877) — thanks @nikohd12 / @diegosouzapw) - **fix(providers): Nous Research key validation no longer fails on a stale probe model** — adding a valid Nous Research API key reported "invalid" even though the same key worked via the portal's copy-shell `curl`. The validation probe sent `model: "nousresearch/hermes-4-70b"`, which Nous does not serve, so the API returned `400` and the validator (which only treated `200`/`429` as success) reported the key invalid. The probe now uses the real `Hermes-4-70B` slug, and any non-auth 4xx (`400`/`404`/`422`) is treated as a valid key (the request shape was wrong, not the credentials) — mirroring the longcat/nvidia validators so a future model rename can't re-break key validation. ([#3934](https://github.com/diegosouzapw/OmniRoute/pull/3934), closes [#3881](https://github.com/diegosouzapw/OmniRoute/issues/3881) — thanks @FerLuisxd / @diegosouzapw) - **fix(stream): persist mid-stream upstream failures** — when an upstream stream fails partway through, the partial response and incremental usage are now finalized and persisted instead of lost; extracts a shared `streamFailureFinalization` path and merges incremental Claude usage (follow-up to #3879). ([#3937](https://github.com/diegosouzapw/OmniRoute/pull/3937) — thanks @rdself) - **fix(perplexity-web): update the request payload to schema v2.18 (HTTP 400)** — Perplexity web requests started returning HTTP 400; the request payload was updated to Perplexity's v2.18 schema. ([#3938](https://github.com/diegosouzapw/OmniRoute/pull/3938) — thanks @artickc) - **fix(stream): keep the in-flight request payload in sync** — the pending-by-id request record is now updated in place (`Object.assign`) so the in-flight payload stays consistent with what was dispatched (coexists with #3937). ([#3940](https://github.com/diegosouzapw/OmniRoute/pull/3940) — thanks @rdself) - **fix: stabilize reasoning streams and request logs** — reasoning-token streaming and the request-log capture path were stabilized to avoid dropped/duplicated reasoning frames and inconsistent log entries. ([#3879](https://github.com/diegosouzapw/OmniRoute/pull/3879) — thanks @rdself) - **fix(opencode-plugin): include nested combo-refs in the LCD context window** — the OpenCode plugin now follows nested combo references when computing the least-common-denominator context window, so a combo nested inside another no longer reports an inflated window. ([#3910](https://github.com/diegosouzapw/OmniRoute/pull/3910) — thanks @herjarsa) - **fix(models): correct the failed-model auto-hide defaults** — the defaults governing when a failed model is auto-hidden were corrected, and auto-hide is now opt-in so models are no longer dropped unexpectedly. ([#3930](https://github.com/diegosouzapw/OmniRoute/pull/3930) — thanks @rdself) - **fix(openrouter): show the preset field when editing a connection** — the connection-preset field appeared only when creating a connection, not when editing one; it now appears in both (follow-up to #3878). ([#3921](https://github.com/diegosouzapw/OmniRoute/pull/3921) — thanks @rdself) - **fix(sse): announce the assistant role on the first delta (Responses→Chat)** — the first SSE delta of a Responses-API→Chat-Completions stream now carries `role: "assistant"`, which strict OpenAI-compatible clients expect before content deltas. ([#3911](https://github.com/diegosouzapw/OmniRoute/pull/3911) — thanks @diego-anselmo) - **fix(vertex): add the generative-language scope so SA-JSON model discovery works** — Vertex service-account (SA-JSON) model discovery failed without the `generative-language` OAuth scope; the scope is now requested. ([#3922](https://github.com/diegosouzapw/OmniRoute/pull/3922) — thanks @artickc) - **fix(proxy): direct-connection fallback for control-plane ops when a pinned proxy is unreachable** — control-plane operations (validation, discovery) now fall back to a direct connection when a connection's pinned proxy is unreachable, instead of failing outright. ([#3906](https://github.com/diegosouzapw/OmniRoute/pull/3906) — thanks @zhiru) - **fix(providers): prevent zombie-socket hangs for zai/glm and tighten the default keepAlive** — zai/glm could hang on dead keep-alive sockets; the default keepAlive was tightened to evict zombie sockets. ([#3907](https://github.com/diegosouzapw/OmniRoute/pull/3907) — thanks @insoln) - **fix(setup): remove the stale CJS bundle check from setup-open-code** — the OpenCode setup helper no longer checks for a CJS bundle that the now ESM-only plugin no longer ships. ([#3908](https://github.com/diegosouzapw/OmniRoute/pull/3908) — thanks @herjarsa) - **fix(opencode-plugin): drop the CJS bundle to fix the OpenCode plugin loader** — the plugin is now ESM-only, fixing the OpenCode loader which failed on the dual CJS/ESM build. ([#3883](https://github.com/diegosouzapw/OmniRoute/pull/3883) — thanks @herjarsa) - **fix(mcp): fall back to `node:sqlite` when the better-sqlite3 binding is missing** — the MCP server now falls back to Node's built-in `node:sqlite` when the native better-sqlite3 binding is unavailable, instead of crashing. ([#3887](https://github.com/diegosouzapw/OmniRoute/pull/3887) — thanks @megamen32) - **fix(models): correct the generate-models alias lookup** — alias resolution during model generation was corrected so aliased model ids resolve to their canonical entry. ([#3870](https://github.com/diegosouzapw/OmniRoute/pull/3870) — thanks @YunyunZhai) - **fix(combo): guard the candidate pool against an empty array** — combo candidate-pool selection no longer throws when the pool resolves to an empty array. ([#3871](https://github.com/diegosouzapw/OmniRoute/pull/3871) — thanks @YunyunZhai) ### 🔒 Security & Hardening - **fix(security): bump form-data + vite (2 HIGH), harden workflow template-injection & allowlist guarded `workflow_run`** — two HIGH Dependabot advisories (`form-data`, `vite`) were upgraded; GitHub Actions workflows were hardened against `${{ }}` template-injection (untrusted values now passed via `env:`); and the guarded `workflow_run` trigger was allowlisted. ([#3949](https://github.com/diegosouzapw/OmniRoute/pull/3949) — thanks @diegosouzapw) ### 🧹 Internal / Quality / Docs - **fix(ci): grant `contents: write` to the npm publish job for SBOM attach** — the v3.8.25 TokenPermissions hardening set the npm-publish `publish` job to `contents: read`, but its "Attach SBOM to GitHub Release" step (`gh release upload`) needs `contents: write` and failed with HTTP 403 on the v3.8.25 release (npm / GitHub Packages / opencode-plugin / Docker / Electron all published fine; only the SBOM attach broke — the v3.8.25 SBOM was attached manually). ([#3874](https://github.com/diegosouzapw/OmniRoute/pull/3874) — thanks @diegosouzapw) - **fix(providers): keep the `/v1/models` catalog alias-only (release-time follow-up to #3870)** — #3870 made `generateModels()` also key the registry by each provider's raw id, which surfaced phantom `opencode/*` entries in `/v1/models` that collide with the `opencode/` → opencode-zen route (a regression vs v3.8.25, caught by the #2798 catalog regression test). `getProviderModels()` now resolves a raw provider id to its alias at lookup time instead of mirroring raw-id keys into the model namespace, preserving #3870's intent (`getProviderModels("github")` returns the same models as the `gh` alias) without polluting the public catalog. ([#3870](https://github.com/diegosouzapw/OmniRoute/pull/3870) — thanks @diegosouzapw / @YunyunZhai) - **ci(quality): make zizmor / gitleaks / osv scanners functional + freeze advisory baselines** — the supply-chain scanners are now actually executed (correct install + invocation) with frozen advisory baselines so new findings surface as diffs. ([#3947](https://github.com/diegosouzapw/OmniRoute/pull/3947) — thanks @diegosouzapw) - **ci(quality): fix scanner install + size-limit preset, promote `codeqlAlerts` to blocking** — corrected the scanner install and the size-limit preset, and promoted the `codeqlAlerts` ratchet from advisory to blocking. ([#3945](https://github.com/diegosouzapw/OmniRoute/pull/3945) — thanks @diegosouzapw) - **ci(quality): add an OpenAPI breaking-change gate (oasdiff, advisory) + fix dangling `$ref`s** — a CI gate diffs the OpenAPI spec against the base branch (`BASE_REF`) with oasdiff to surface breaking API changes, and the spec's dangling `$ref`s were repaired. ([#3951](https://github.com/diegosouzapw/OmniRoute/pull/3951) — thanks @diegosouzapw) - **ci(quality): add a schemathesis API-fuzz nightly (advisory)** — a nightly schemathesis property/fuzz pass against the OpenAPI spec (Quality Gates Fase 8 · Bloco B.4, advisory). ([#3956](https://github.com/diegosouzapw/OmniRoute/pull/3956) — thanks @diegosouzapw) - **ci(quality): flip the secret / workflow / bundle-size scanners to ratchet-blocking** — the secret-scan, workflow-lint and bundle-size gates moved from advisory to ratchet-blocking, with their baselines frozen and unit coverage for each scanner (Etapa 2). ([#3961](https://github.com/diegosouzapw/OmniRoute/pull/3961) — thanks @diegosouzapw) - **chore(quality): re-baseline the ESLint-warning ratchet (3760 → 3769)** — absorbs the v3.8.26-cycle warning drift into `quality-baseline.json` (manual re-baseline, never an automatic upward ratchet). ([#3962](https://github.com/diegosouzapw/OmniRoute/pull/3962) — thanks @diegosouzapw) - **ci(quality): wire Stryker mutation testing as an advisory nightly** — Stryker mutation testing runs nightly (advisory) — Quality Gates Fase 7 · Task 11. ([#3898](https://github.com/diegosouzapw/OmniRoute/pull/3898) — thanks @diegosouzapw) - **ci(quality): freeze per-module coverage floors + wire require-tighten (advisory)** — per-module coverage floors are frozen with an advisory "require-tighten" check that flags modules drifting below their floor. ([#3901](https://github.com/diegosouzapw/OmniRoute/pull/3901) — thanks @diegosouzapw) - **ci(quality): enforce the stale-allowlist check on `check-known-symbols`** — stale allowlist entries (suppressing a symbol that no longer exists) now fail the gate — Fase 6A.3 follow-up. ([#3899](https://github.com/diegosouzapw/OmniRoute/pull/3899) — thanks @diegosouzapw) - **test(ci): de-flake pipeline-payloads via per-test re-seed + honest reset** — the pipeline-payloads suite now re-seeds per test and performs an honest cache reset, eliminating a cross-test ordering flake. ([#3893](https://github.com/diegosouzapw/OmniRoute/pull/3893) — thanks @diegosouzapw) - **fix(ci): drop the `secrets`-in-job-`if` from nightly-llm-security** — referencing `secrets` in a job-level `if` caused a `startup_failure` on push; the gating was moved so the workflow starts cleanly. ([#3892](https://github.com/diegosouzapw/OmniRoute/pull/3892) — thanks @diegosouzapw) - **test: reconcile the runtime-timeouts keepAlive baseline to 4000 after the #3907 source revert** — the keepAlive assertion was realigned to the source value (4000) after #3907's source-side revert. ([#3933](https://github.com/diegosouzapw/OmniRoute/pull/3933) — thanks @diegosouzapw) - **chore(repo): nest quality-gate state under `config/quality`, declutter the repo root** — baselines / allowlists / metrics moved under `config/quality/`, trimming the tracked root file count. ([#3896](https://github.com/diegosouzapw/OmniRoute/pull/3896) — thanks @diegosouzapw) - **docs: refresh the provider count to 226 + regenerate `PROVIDER_REFERENCE.md`** — the README advertised a stale `177 providers`; the canonical generator (`scripts/docs/gen-provider-reference.ts`) now reports **226 unique provider IDs**, so the README badges/anchors and the generated provider reference were brought in sync. Also adds a documentation audit/sync report. (thanks @diegosouzapw) - **docs: sync all documentation to v3.8.24 + count-guard & wiki/prose CI** — a full documentation sync with a strict provider/locale count-guard plus Vale / markdownlint prose CI. ([#3804](https://github.com/diegosouzapw/OmniRoute/pull/3804) — thanks @diegosouzapw) - **docs: regenerate stale counts to canonical values** — 226 providers / 87 MCP tools / 15 strategies / 42 locales. ([#3904](https://github.com/diegosouzapw/OmniRoute/pull/3904) — thanks @diegosouzapw) - **docs(quality): correct the stale gate count + add an opt-in agent-lsp scaffold** — ([#3902](https://github.com/diegosouzapw/OmniRoute/pull/3902) — thanks @diegosouzapw) - **docs(mcp): correct the MCP tool-inventory diagram source + text to 87 tools** — ([#3909](https://github.com/diegosouzapw/OmniRoute/pull/3909) — thanks @diegosouzapw) - **docs: update the compression section to the 9-engine multi-layer stack** — ([#3894](https://github.com/diegosouzapw/OmniRoute/pull/3894) — thanks @diegosouzapw) - **ci(docs): automate GitHub wiki sync (add missing pages + cover counts)** — ([#3900](https://github.com/diegosouzapw/OmniRoute/pull/3900) — thanks @diegosouzapw) - **docs: require a dedicated git worktree + branch per development task (Hard Rule #19)** — codifies the worktree-isolation rule after the shared-checkout incidents. ([#3939](https://github.com/diegosouzapw/OmniRoute/pull/3939) — thanks @diegosouzapw) - **fix(docs): add MDX frontmatter to `DOCUMENTATION_AUDIT_REPORT` so the fumadocs build passes** — the audit report lacked the `title:` frontmatter MDX pages require. (thanks @diegosouzapw) --- ## [3.8.25] — 2026-06-14 ### ✨ New Features - **feat(compression): pluggable compression engines + async pipeline + Compression Studios** — a new prompt-compression subsystem with selectable engines (Lite / Aggressive / Ultra), an asynchronous compression pipeline wired into the chat core, and "Compression Studios" tooling for inspecting and tuning compression. ([#3848](https://github.com/diegosouzapw/OmniRoute/pull/3848)) - **feat(compression-ui): unified compression configuration UI** — a Compression Hub with per-engine pages (Lite / Aggressive / Ultra), a combos editor, a dedicated sidebar entry, and live-WS default-on. ([#3860](https://github.com/diegosouzapw/OmniRoute/pull/3860)) - **feat(security): prompt-injection guard across every LLM route + red-team suite** — the prompt-injection guard now runs on all LLM routes (chat, responses, embeddings, images, audio, rerank, search, moderations, videos, music) with a shared input sanitizer and a promptfoo-based red-team suite (Quality Gates Fase 8 · Bloco D). ([#3857](https://github.com/diegosouzapw/OmniRoute/pull/3857)) - **feat(kiro): live per-account model discovery** — Kiro now discovers each account/tier's entitled models via CodeWhisperer `ListAvailableModels` (region-matched, with a static-catalog fallback). ([#3836](https://github.com/diegosouzapw/OmniRoute/pull/3836) — thanks @artickc) - **feat(gemini/vertex): surface Veo video models in dynamic discovery** — Veo video models (`predictLongRunning`) now appear in Gemini/Vertex dynamic model discovery. ([#3839](https://github.com/diegosouzapw/OmniRoute/pull/3839) — thanks @artickc) - **feat(mimocode): per-account proxy for multi-account round-robin** — each mimocode account can route through its own proxy (resolved per account by fingerprint via `runWithProxyContext`), with a "Distribute proxies" UI helper. ([#3837](https://github.com/diegosouzapw/OmniRoute/pull/3837) — thanks @pizzav-xyz) - **feat(intelligence): expose Arena ELO sync as a feature flag** — the LM Arena ELO leaderboard sync is now toggleable (`ARENA_ELO_SYNC_ENABLED`, DB-override + env fallback). ([#3821](https://github.com/diegosouzapw/OmniRoute/pull/3821) — thanks @rdself) ### 🐛 Fixed - **test(oauth): prove refresh_token preservation for the real gemini-cli / antigravity dispatch** — the #3679/#3766 regression test used a synthetic provider that routes through the generic `tokenUrl` path, so the fix was never proven for the actual Google-family providers, which dispatch through `refreshGoogleToken()` against the hardcoded `OAUTH_ENDPOINTS.google.token`. Added a test that drives `checkConnection` through the real `gemini-cli`/`antigravity` path (redirecting the Google token endpoint to a local server returning `invalid_grant`) and asserts the `refresh_token` is preserved (not nulled) — confirming these connections are not spuriously destroyed on a failed refresh. ([#3850](https://github.com/diegosouzapw/OmniRoute/issues/3850) — thanks @3xa228148) - **fix(oauth): clear setup message for GitLab Duo instead of "Internal server error"** — adding a GitLab Duo connection without a registered OAuth client returned an opaque `Internal server error` at the Add Connection step. `buildAuthUrl` **threw** when `GITLAB_DUO_OAUTH_CLIENT_ID` was missing, and the route swallowed it into a generic 500. It now returns `null` (mirroring the Qoder provider) and the authorize route surfaces an actionable message: register an OAuth app at `https://gitlab.com/-/profile/applications` with redirect URI `http://localhost:20128/callback` and scopes `ai_features read_user`, then set `GITLAB_DUO_OAUTH_CLIENT_ID`. ([#3861](https://github.com/diegosouzapw/OmniRoute/issues/3861) — thanks @sidinsearch) - **fix(db): persist the "Keep latest backups" retention setting** — changing the backup-retention count in Settings → Database backup retention had no effect: it always snapped back to 20 on refresh (and editing `.env` post-start was ignored too, since `process.env` isn't reloaded). `getDbBackupMaxFiles()` only read the `DB_BACKUP_MAX_FILES` env var — there was no setter and no persisted value. The value now round-trips through a dedicated `key_value` store (`getDbBackupMaxFiles` precedence: env override → persisted UI value → default 20), and the "Clean old backups" action persists the chosen count. Existing installs keep the historical default of 20 until explicitly changed. ([#3834](https://github.com/diegosouzapw/OmniRoute/issues/3834) — thanks @netstratego) - **fix(sse): clamp Gemini thinking budget to the model's real cap (`reasoning_effort`/`effort=high` 400)** — translating OpenAI `reasoning_effort=high` (and Claude-Code `output_config.effort=high`) to a Gemini target sent a hardcoded `thinkingBudget: 32768`, which exceeds Flash-tier Gemini's real max of 24576 → upstream HTTP 400 (the `thinkingLevel=high` path already used 24576 and worked on the same model). `gemini-2.5-flash` now declares its real `thinkingBudgetCap` (24576) so the existing `capThinkingBudget()` chokepoint actually clamps, and the Claude→Gemini `output_config.effort` path — which previously sent the raw value with no cap at all — now routes through the same clamp (pro-tier, real cap 32768, is left untouched). ([#3842](https://github.com/diegosouzapw/OmniRoute/issues/3842) — thanks @andrea-kingautomation) - **fix(intelligence): run pricing + models.dev sync from the live startup path** — like the Arena ELO sync (v3.8.24), the external **pricing sync** (`PRICING_SYNC_ENABLED`) and the **models.dev capability sync** (Settings → AI toggle) were only initialized from `server-init.ts`, which the Next standalone runtime never executes — and models.dev had no caller at all. Their toggles were inert in production. Both are now initialized from `instrumentation-node.ts` (self-gated, opt-in preserved, non-blocking, never fatal). (thanks @diegosouzapw) - **test(proxy): guard the per-connection 'direct' bypass over a global proxy + clearer label** — the per-connection "Proxy Off" toggle (`proxyEnabled: false`) already overrides a configured **global** proxy (`resolveProxyForConnection` short-circuits to `level: "direct"` before the global step). Added an explicit regression test proving the bypass beats a global assignment (and round-trips on re-enable), and relabeled the UI to "Direct (bypass proxy)" so operators recognize it. Closes the verification gap in [#2996](https://github.com/diegosouzapw/OmniRoute/issues/2996). (thanks @diegosouzapw) - **feat(connections): per-connection "disable cooldown" opt-out** — a connection can now opt out of the transient cooldown (`providerSpecificData.disableCooling`, with a toggle in the Edit Connection modal). When set, a recoverable failure still records the error/backoff but does **not** take the connection out of rotation, so it stays eligible for selection — useful for a primary key you never want parked on a blip. Terminal states (banned / expired / credits_exhausted) still apply. ([#2997](https://github.com/diegosouzapw/OmniRoute/issues/2997) — thanks @diegosouzapw) - **fix(combo): restore sessionless combo stickiness + reasoning-aware readiness (504 / TPS regression after v3.8.14)** — #3399 (v3.8.16) replaced the ``-tag combo pinning with a server-side context-cache pin gated on a client `sessionId`. Clients that send no session id (most OpenAI-compatible tools) lost combo stickiness, so combos re-ran strategy selection every turn → upstream prompt-cache misses → cold high-reasoning starts (~78s) → intermittent `[504] Upstream request did not return response headers` + TPS collapse (only on combos). The pin now falls back to a stable per-conversation fingerprint (`extractSessionAffinityKey(body)`) when no session id is present — **only when `context_cache_protection` is on**, so #3399's anti-leak behaviour is preserved. Separately, the stream-readiness window now grants the +30s reasoning budget **unconditionally** for high-reasoning Codex GPT-5.x (small high-reasoning prompts were 504-ing at the 80s base regardless of stickiness). ([#3825](https://github.com/diegosouzapw/OmniRoute/issues/3825) — thanks @bypanghu) - **test(combo): cover the `skipProviderBreaker` consumer gate** — the producer was tested but the consumer (whether a failed combo target trips the whole-provider circuit breaker) was not; the breaker decision is now an exported pure predicate (`shouldRecordProviderBreakerFailure`, behaviour-identical) with direct tests asserting a `connection_cooldown` 503 does not trip the breaker while a plain 503 does. Closes another deferred test gap from [#2743](https://github.com/diegosouzapw/OmniRoute/issues/2743). (thanks @diegosouzapw) - **fix(providers): surface the real Devin error + correct the Windsurf auth instructions** — Devin chat returned a generic 502 "Invalid SSE response for non-streaming request" that swallowed the real cause (e.g. "Devin CLI not found"): an error-only SSE chunk (no `choices`) is now propagated with its sanitized message. The Windsurf "Visit windsurf.com/show-auth-token" instruction (the bare URL shows no token without an IDE-supplied `?state=`) now directs users to the `Windsurf: Provide Auth Token` command-palette flow. ([#3324](https://github.com/diegosouzapw/OmniRoute/issues/3324) — thanks @mikmaneggahommie) - **fix(grok-web): clearer 403 message for anti-bot / IP-reputation blocks** — a Grok Web subscription validating from a flagged datacenter/VPS IP got a 403 that read like an invalid cookie, sending users to chase a cookie that was actually fine. A non-auth 403 (Cloudflare challenge / anti-bot body) now returns a message stating the cookie is likely OK and the block is IP-reputation-based — retry from a residential IP or configure a proxy (auth-shaped 403s keep the re-paste guidance). ([#3474](https://github.com/diegosouzapw/OmniRoute/issues/3474) — thanks @friedtofu1608) - **fix(db): make the mass-pending-migrations safety threshold env-overridable** — restoring a backup DB from an older version could trip "Detected N pending migrations … threshold is 50" with no way to override the hardcoded `50`. The threshold is now configurable via `OMNIROUTE_MAX_PENDING_MIGRATIONS` (resolved at startup; `0` disables the check). ([#3416](https://github.com/diegosouzapw/OmniRoute/issues/3416) — thanks @samuraiIT) - **test(proxy): cover the Vercel-relay `proxyFetch` path** — net-new tests for `buildVercelRelayHeaders` and the `vercel`-type relay short-circuit (`x-relay-target`/`-path`/`-auth`, TCP-skip, missing-auth fail-closed), closing one of the deferred test gaps tracked in [#2743](https://github.com/diegosouzapw/OmniRoute/issues/2743). (thanks @diegosouzapw) - **fix(cli): surface `omniroute runtime repair` in the native-module error messages** — after a Node major upgrade, `better-sqlite3`'s prebuilt binary mismatches the ABI and the service can crash-loop; the error only mentioned `npm rebuild better-sqlite3` (which fails for global / no-toolchain installs). The startup + SQLite error hints now also point to the existing self-heal command `omniroute runtime repair` (rebuilds into a user-writable runtime), and a top-level `omniroute repair` alias was added. ([#3476](https://github.com/diegosouzapw/OmniRoute/issues/3476) — thanks @Rahulsharma0810) - **fix(antigravity): per-request Pro-family upstream-id fallback chain (`gemini-3.1-pro-high` 400)** — Antigravity silently renamed the Gemini 3.1 Pro-high upstream id, so `gemini-3.1-pro-high` started returning HTTP 400 (while `-low` still worked) and the live id can't be determined statically (competitor proxies disagree). The executor now retries alternative ids on a 400 (`gemini-3.1-pro-high` → `gemini-pro-agent` → `gemini-3-pro-high`, analogous for pro-low), bounded and only on a 400, with zero extra cost on the happy path; the 1:1 tier-passthrough invariant is preserved (the chain is request-time, not a static alias remap). ([#3786](https://github.com/diegosouzapw/OmniRoute/issues/3786) — thanks @aliaksandrsen) - **fix(sse): retry once on an early stream close (`STREAM_EARLY_EOF`) for single-model requests** — flaky OpenAI-compatible upstreams (e.g. NVIDIA NIM with minimax-m3 / qwen3.5 / glm-5.1) intermittently send HTTP 200 then close the SSE with zero useful frames, surfacing as a 502 "Stream ended before producing useful content". Only Antigravity got an early-close retry; every other provider returned the 502 immediately on the non-combo single-model path. A bounded one-retry (early-close only — not readiness-timeout — and without marking the account unavailable) now generalizes it. (The separate qwen-web validation SSRF part of the same report was already fixed in v3.8.24, [#3767](https://github.com/diegosouzapw/OmniRoute/pull/3767).) ([#3758](https://github.com/diegosouzapw/OmniRoute/issues/3758) — thanks @Svatosalav) - **fix(models): preserve eye-hidden models across auto-sync / import** — hiding models via the visibility (eye) toggle to keep only a combo's models was undone on every model import or auto-sync, which re-showed all of them. The sync re-import treated "hidden" identically to "deleted" and dropped both; a distinct `isDeleted` marker now separates the trash/delete path (still dropped on re-import, #3199) from the eye toggle (preserved as listed-but-hidden), and eye-hidden models are no longer re-aliased into the routable catalog on sync. ([#3782](https://github.com/diegosouzapw/OmniRoute/issues/3782) — thanks @xenstar) - **fix(providers): correct the lmarena cookie hint (`session` → `arena-auth-prod-v1`)** — the lmarena credential hint asked for a cookie named `session`, but lmarena.ai's real auth cookie is `arena-auth-prod-v1`, so users who pasted only `session=…` hit validation failures. The credential name, placeholder and storage keys now use the correct name (the legacy `session` key is retained for back-compat with already-saved credentials). ([#3810](https://github.com/diegosouzapw/OmniRoute/issues/3810) — thanks @xspylol) - **fix(reasoning): normalize OpenAI-compatible `max` effort to `xhigh` by default** — OpenAI-compatible providers do not accept literal `max`, but some upstreams (for example DeepSeek through OpenRouter) support `xhigh`; `max` now maps to `xhigh` unless the target model explicitly opts out of `xhigh`, with Claude alias variants still honoring the canonical Claude opt-out list. ([#3826](https://github.com/diegosouzapw/OmniRoute/pull/3826) — thanks @rdself) - **fix(combo): return the replay response on the round-robin streaming path** — a round-robin combo with a streaming target returned a body already locked by the readiness peek, surfacing as a 500 "ReadableStream is locked"; the round-robin path now returns the replay clone like the priority path does. ([#3811](https://github.com/diegosouzapw/OmniRoute/pull/3811) — thanks @0xtbug) - **fix(claude): strip the reasoning-effort suffix from Claude model ids** — Claude ids carrying an effort suffix (`…-low` … `…-max`) 404'd upstream and tripped the circuit breaker into a misleading "rate-limited" state; the suffix is now stripped before dispatch. ([#3807](https://github.com/diegosouzapw/OmniRoute/pull/3807) — thanks @zhiru) - **fix(sse): flush routed SSE chunks promptly (ping/zombie readiness filter)** — combo stream-readiness now filters ping/zombie frames so routed SSE chunks stream out without waiting on the readiness window. ([#3759](https://github.com/diegosouzapw/OmniRoute/pull/3759) — thanks @rdself) - **fix(models): don't auto-hide transient (rate-limited / timeout) failures on Test All** — a parallel Test All across many models could rate-limit an account and auto-hide every model that 429'd / timed out (dropping them from `/v1/models`); transient failures now surface an error state but stay visible. ([#3849](https://github.com/diegosouzapw/OmniRoute/pull/3849) — thanks @lukmanc405) - **fix(quota): surface OpenCode Go's missing quota-API as a latched diagnostic** — OpenCode Go keys whose quota endpoints return 404/401 no longer hammer the dead endpoints; the gap is latched with a clear message and an `OMNIROUTE_OPENCODE_GO_QUOTA_URL` override hint. ([#3838](https://github.com/diegosouzapw/OmniRoute/pull/3838) — thanks @adivekar-utexas) - **fix(pricing): add the missing Kiro model pricing rows** — Kiro models the registry serves (e.g. `claude-sonnet-4.6`) had no pricing row and reported $0.00; the rows were added. ([#3835](https://github.com/diegosouzapw/OmniRoute/pull/3835) — thanks @artickc) - **fix(ui): render country flags via flagcdn SVGs for Windows compatibility** — Windows doesn't render regional-indicator flag emoji; flags now use flagcdn SVGs with an emoji fallback. ([#3814](https://github.com/diegosouzapw/OmniRoute/pull/3814) — thanks @rafacpti23) - **fix(ui): expand the request log table with a vertical resize handle** — the request log table now shows ~10 rows and can be resized vertically. ([#3820](https://github.com/diegosouzapw/OmniRoute/pull/3820) — thanks @rafacpti23) - **fix(i18n): translate the missing `embeddedServices` keys across 37 locales** — the `embeddedServices` strings showed `__MISSING__` in 37 locales; they are now translated. ([#3819](https://github.com/diegosouzapw/OmniRoute/pull/3819) — thanks @rafacpti23) ### 🔒 Security & Hardening - **fix(security): CCR cross-tenant IDOR — per-principal scope store + bounded memory** — the compression CCR scope store was shared across principals, allowing cross-tenant reads; it is now scoped per-principal with bounded memory. ([#3859](https://github.com/diegosouzapw/OmniRoute/pull/3859)) - **feat(supply-chain): build provenance, SBOM, Trivy scan & OpenSSF Scorecard (advisory)** — added npm build provenance, a CycloneDX SBOM, Trivy image scanning, and an OpenSSF Scorecard workflow (Quality Gates Fase 8 · Bloco A, advisory). ([#3824](https://github.com/diegosouzapw/OmniRoute/pull/3824)) ### 🧹 Internal / Quality / Docs - **Consolidate the email-privacy control into Settings → Appearance** — the per-page email-privacy toggles were replaced by a single global switch. ([#3822](https://github.com/diegosouzapw/OmniRoute/pull/3822) — thanks @rdself) - **docs(ui): clarify the routing-settings copy (strategy sync + sticky limit)** — ([#3843](https://github.com/diegosouzapw/OmniRoute/pull/3843) — thanks @adivekar-utexas) - **Quality Gates — Fase 7 & 8** — promoted the dead-code / cognitive-complexity / type-coverage ratchets to blocking, installed advisory CI scanners (gitleaks / osv / actionlint / zizmor), and added property + golden + SSE-correctness tests and a runtime-resilience (chaos / heap-growth / k6 soak) suite. ([#3809](https://github.com/diegosouzapw/OmniRoute/pull/3809), [#3858](https://github.com/diegosouzapw/OmniRoute/pull/3858), [#3808](https://github.com/diegosouzapw/OmniRoute/pull/3808), [#3854](https://github.com/diegosouzapw/OmniRoute/pull/3854)) - **fix(docs): add MDX frontmatter to `SUPPLY_CHAIN.md`** — the new security doc lacked the `title:` frontmatter that MDX pages require, which broke the production Build + Docker Hub publish; the frontmatter was added. ([#3864](https://github.com/diegosouzapw/OmniRoute/pull/3864)) - **chore(deps): bump `aquasecurity/trivy-action` 0.28.0 → 0.36.0** ([#3862](https://github.com/diegosouzapw/OmniRoute/pull/3862)) - **chore(quality): reconcile the file-size ratchet baseline for Prettier-inflated v3.8.25 fixes + `chat.ts` growth** — the per-file size baseline was re-frozen to absorb the formatting/line-count growth from this cycle's chat-core and combo fixes (manual edits, never an automatic upward ratchet). ([#3823](https://github.com/diegosouzapw/OmniRoute/pull/3823), [#3833](https://github.com/diegosouzapw/OmniRoute/pull/3833) — thanks @diegosouzapw) - **test(suite): green the unit suite at release time — align stale tests to this cycle's intended behavior + de-flake two new suites** — release-gate housekeeping: updated tests that lagged behind intended behavior changes (OpenCode Go latched quota message #3838, the email-privacy control consolidated into Settings #3822, SOCKS5 default-on proxy-type message, the `[id]` provider-detail strangler-fig decomposition #3501, Vertex Express-mode keys, Antigravity discovery using a current user-callable model id) and the same-provider 503 fall-through resilience test; de-flaked the compression benchmark reproducibility test (sequential passes) and the ServiceSupervisor crash test (poll instead of fixed sleep). No production code changed. Also documented `OMNIROUTE_MAX_PENDING_MIGRATIONS` (#3416) in `.env.example` + `ENVIRONMENT.md`. (thanks @diegosouzapw) --- ## [3.8.24] — 2026-06-13 ### ✨ New Features - **feat(plugins): custom plugin marketplace support** — the plugin registry now fetches from a custom URL set in system settings (`pluginMarketplaceUrl`), falling back to the local seed registry when none is configured. Adds a `GET /api/plugins/marketplace` endpoint and a revamped Marketplace UI. ([#3656](https://github.com/diegosouzapw/OmniRoute/pull/3656) — thanks @oyi77) - **feat(api-keys): strict-mode controls for the Claude Code default routing path** — `Claude Code default` is now an explicit `cc/*` model permission, so an API key can allow the default path while blocking specific model families (e.g. Fable) in strict mode. Previously the default path received dynamic/unprefixed models (`sonnet`, `opus`, `claude-opus-4-8[1m]`, …) that no single permission represented, so it broke under strict permissions. ([#3776](https://github.com/diegosouzapw/OmniRoute/pull/3776) — thanks @Witroch4) - **feat(flags): expose the emergency budget fallback in the Feature Flags page** — `OMNIROUTE_EMERGENCY_FALLBACK` is now a runtime boolean (enabled by default, applied without restart) resolved through the feature-flag stack, so a DB override can toggle it while still honoring the raw env fallback. Follow-up to [#3741](https://github.com/diegosouzapw/OmniRoute/pull/3741) by @zoispag. ([#3752](https://github.com/diegosouzapw/OmniRoute/pull/3752) — thanks @rdself) - **feat(reasoning): preserve `xhigh` reasoning effort by default** — `xhigh` now passes through unless a model explicitly sets `supportsXHighEffort: false`, with the existing `max` normalization kept separate. ([#3756](https://github.com/diegosouzapw/OmniRoute/pull/3756) — thanks @rdself) - **feat(codex): inject OmniRoute memory into Codex Responses WebSocket requests** — retrieved memory is injected into the Responses WebSocket prepare request via the `instructions` field, with the retrieval query derived from the latest user input (skipping tool/reasoning payloads) and duplicate-safe injection. ([#3749](https://github.com/diegosouzapw/OmniRoute/pull/3749) — thanks @kkkayye) - **feat(dashboard): free provider rankings page** — a new dashboard page (with sidebar entry) that ranks free providers (no-auth / OAuth / API-key) by their models' Arena ELO / intelligence scores, joining the provider registry with the `model_intelligence` table via fuzzy model-name matching. Pure computation over existing data — no external calls. ([#3799](https://github.com/diegosouzapw/OmniRoute/pull/3799) — thanks @pizzav-xyz) ### 🔒 Security - **security(proxy): IPv6-only egress enforcement + closing IP-leak paths (L1/L2/L3)** — de-brackets IPv6 literals at the SOCKS host and the proxy-health tcpCheck (so `socks5://[2001:db8::1]` and any v6-literal proxy connect instead of dying with `ENOTFOUND`), adds a per-proxy `family` policy (`auto`/`ipv4`/`ipv6`), and enforces it end-to-end across SOCKS5/HTTP/HTTPS × global/provider/key × literal/hostname. 16 commits, TDD+BDD, 73 tests. ([#3777](https://github.com/diegosouzapw/OmniRoute/pull/3777)) - **security(marketplace): harden the custom-URL SSRF guard** against three bypasses found by automated security review — IPv6/AAAA records (only `dns.resolve4` was checked, so a private AAAA record or an IPv6 literal slipped through), redirect-following (a public URL could 30x to an internal one), and DNS-rebinding TOCTOU. The guard now resolves A+AAAA via the canonical `isPrivateHost`, routes the fetch through `safeOutboundFetch` (public-only, blocks redirects to private hosts), and re-validates on fetch. Reachable only with management auth + a custom `pluginMarketplaceUrl`. Follow-up to [#3656](https://github.com/diegosouzapw/OmniRoute/pull/3656). ([#3774](https://github.com/diegosouzapw/OmniRoute/pull/3774)) - **security: resolve all open CodeQL + Dependabot alerts** — CodeQL `js/insufficient-password-hash` (the semantic-cache `apiKeyId` is now a plaintext key prefix, `${apiKeyId}.${digest}`, instead of being folded into the SHA-256 digest, clearing the false positive while preserving per-key cache isolation) and a URL-substring check tightened to an exact host match; Dependabot `esbuild < 0.28.1` pinned via override in both workspaces. ([#3778](https://github.com/diegosouzapw/OmniRoute/pull/3778)) The remaining `js/incomplete-url-substring-sanitization` instances in the api-key proxy-context test were also cleared by asserting on the parsed URL host/port instead of a substring `includes`. (thanks @diegosouzapw) ### 🐛 Fixed - **fix(dashboard): surface the Plugins page (plugin manager + marketplace) in the sidebar** — the plugins page (`/dashboard/plugins`), which hosts the custom plugin marketplace shipped in [#3656](https://github.com/diegosouzapw/OmniRoute/pull/3656), had no menu entry and was reachable only by typing the URL. It now appears under **Agentic Features**. (thanks @diegosouzapw) - **fix(proxy): add the IP-family selector (auto / IPv4-only / IPv6-only) to the proxy form** — the per-proxy `family` egress policy from [#3777](https://github.com/diegosouzapw/OmniRoute/pull/3777) was backend-only (the dashboard had no control, so every proxy stayed on `auto`). The proxy registry form now exposes the selector and the create/update schema accepts it, so IPv6-only egress can be enabled from the UI. (thanks @diegosouzapw) - **fix(combo): deep audit of the combo + quota-shared routing system** — repairs 5 dead/broken rules (streaming-USD cost recording, quota-pool usage provider resolution, provider-diversity wiring, `maxComboDepth` threading, and scoring clamp/NaN-safety incl. `connectionDensity`) and revives the dead `tierAffinity`/`specificityMatch` scoring factors — root cause was a `require()` that throws under ESM, so both factors silently collapsed to `0.5`; now a static import. Validates every auto-router strategy (cost / latency / sla-aware / lkgp / `selectWithStrategy` + aliases) and the predictive-TTFT decision, adds E2E coverage (3-hop priority failover, per-target timeout failover, real `strategy:auto` dispatch), and introduces opt-in complexity-aware routing (2026) layered over the existing specificity detector. Per-target credential+proxy isolation verified clean (`AsyncLocalStorage`). 4 TDD waves, 10 new/updated test files. ([#3779](https://github.com/diegosouzapw/OmniRoute/pull/3779) — thanks @diegosouzapw) - fix(anthropic): normalize sampling params under extended thinking — Claude models with extended thinking (e.g. Opus 4.8 via the Claude Code provider) returned **HTTP 400** when a request carried non-default `temperature`/`top_p` (`temperature may only be set to 1 …`, `top_p must be ≥ 0.95 or unset …`). Tools like VS Code Copilot's "Ollama" BYOK send `temperature: 0.7` + `top_p: 0.9`, so every thinking-enabled Claude request failed; the proxy now drops/normalizes these params at the chokepoint so the request succeeds. ([#3780](https://github.com/diegosouzapw/OmniRoute/pull/3780) — thanks @zhiru) - fix(sse): pass Claude passthrough `thinking` blocks through unchanged — the Anthropic-native Claude OAuth passthrough rewrote every assistant `thinking` block to `redacted_thinking`, which the Messages API rejects (submitted thinking blocks are validated against the original response), so every multi-turn request with extended thinking failed with `400 … thinking blocks … cannot be modified` (very visible on long Claude Code tool-loops). The blocks are now passed through verbatim; the signature is validated server-side and stays valid on replay (including across an OAuth token switch), so the redaction was unnecessary. ([#3775](https://github.com/diegosouzapw/OmniRoute/pull/3775) — thanks @havockdev) - fix(mcp): resolve the bundled MCP server entry from `dist/` instead of the legacy `app/` path — `omniroute --mcp` crashed on npm installs with `ERR_MODULE_NOT_FOUND: Cannot find package @/lib` because `bin/mcp-server.mjs` looked for the compiled entry under `app/` (a VPS-deploy path that never exists in the npm package) and fell back to the un-bundled `.ts`. ([#3765](https://github.com/diegosouzapw/OmniRoute/pull/3765) — thanks @megamen32) - fix(sse): preserve streamed tool-call arguments end-to-end — incremental tool-call argument deltas could be truncated/duplicated through SSE parsing, transformation and response translation, corrupting tool calls in CLI tool-use output. Dedup now only collapses unambiguous snapshots. ([#3762](https://github.com/diegosouzapw/OmniRoute/pull/3762), closes [#3701](https://github.com/diegosouzapw/OmniRoute/issues/3701) — thanks @Mffff4) - fix(dashboard): repair the Logs page light-mode controls — the "Clean history" button keeps readable contrast (while preserving the red destructive affordance), request-row hover uses a cool blue tint so it no longer reads like a failed-request row, and the custom auto-refresh interval persists in `localStorage` (clamped to 1–300s). Also refreshes the Feature Flags light-mode treatment. ([#3760](https://github.com/diegosouzapw/OmniRoute/pull/3760) — thanks @rdself) - fix(dashboard): make the Request Logs "Clean history" action perform a full request-history purge — clears `call_logs`, legacy `request_detail_logs`, and local JSON artifacts under `DATA_DIR/call_logs` (including orphaned artifact files) via a dedicated maintenance API route, instead of a retention-only cleanup. ([#3751](https://github.com/diegosouzapw/OmniRoute/pull/3751) — thanks @rdself) - fix(cli): detect CLI tools installed outside the GUI PATH on macOS. macOS GUI/Electron apps don't inherit the user's login-shell PATH, so Homebrew (`/opt/homebrew/bin`), nvm and volta-installed CLIs (Cline, Codex, OpenCode, Continue, Hermes, …) were reported "not installed" and the Cline runtime couldn't be spawned. CLI detection (`omniroute doctor`) and the provider-runtime lookup now enrich the lookup PATH with the login shell's PATH (`$SHELL -ilc`, darwin-only, cached, fail-safe). ([#3321](https://github.com/diegosouzapw/OmniRoute/issues/3321) — thanks @mikmaneggahommie) - fix(dashboard): repair the Playground model selector for custom OpenAI/Anthropic-compatible providers. Two bugs left it unusable: (1) when the provider's catalog prefix didn't resolve, the list was filtered by the raw connection id (which matches nothing) so the selector showed nothing; and (2) selecting a provider reset the model to empty and nothing ever picked a default, so the chat failed with "Set a model in the config pane". The selector now falls back to the full catalog instead of emptying, and auto-selects the first available model. ([#3731](https://github.com/diegosouzapw/OmniRoute/issues/3731), [#3009](https://github.com/diegosouzapw/OmniRoute/issues/3009) — thanks @tjengbudi) - fix(cursor): send a `ModelDetails` envelope (`model_id` + `display_model_id` + `display_name`) in the Cursor agent request, alongside the existing `RequestedModel`. Pinned Cursor Claude/GPT _thinking_ variants (e.g. `cursor/claude-opus-4-7-thinking-xhigh`) were returning an empty turn → `502 Provider returned empty content`, because Cursor needs the `ModelDetails` envelope (which `cursor-agent`'s real wire format sends) to resolve them; `RequestedModel` alone only resolves server-routed ids (`auto`/`composer-*`). The `-fast` parameter path on `RequestedModel` is preserved. ([#3714](https://github.com/diegosouzapw/OmniRoute/issues/3714)) - fix(docs): correct the OAuth redirect URI in the Fly.io deployment guide. It told users to register `/api/oauth//callback`, but OmniRoute's browser OAuth flow uses a single `/callback` handler (there is no per-provider callback route). The mismatch caused GitLab Duo (and any OAuth provider) to reject the flow with "The redirect URI included is not valid". Added a regression guard test. ([#3732](https://github.com/diegosouzapw/OmniRoute/issues/3732)) - fix(providers): give Ollama Cloud's `kimi-k2.7-code` its real capabilities (262K context, 262K max output, vision + thinking + tools) instead of the degraded `128000 / 8192` defaults. The model had no spec/registry entry, so importing it via "Import from /models" (whose `/v1/models` upstream returns no per-model metadata) left it as a bare custom model with fallback capabilities. Added a global `kimi-k2.7-code` model spec (parity with `kimi-k2.6`) plus a registry entry on `ollama-cloud`. ([#3761](https://github.com/diegosouzapw/OmniRoute/issues/3761) — thanks @SultanKs4) - fix(providers): repair qwen-web (chat.qwen.ai) connection validation, which failed with a misleading `provider.validation.ssrf_blocked` error. qwen-web had no specialty validator, so the generic OpenAI-compatible path probed a non-existent `/api/v2/models` URL that answers with a 307 redirect — the outbound guard blocked the redirect and the route mislabeled it as an SSRF security block. Added a `qwen-web` specialty validator that probes the real session endpoint (`GET /api/v2/user`, mirroring the executor's anti-bot headers + cookie-jar replay). Also hardened `toValidationErrorResult` so a blocked redirect is only flagged `securityBlocked` when its target is a private/internal host — a benign 3xx to a public host is no longer mislabeled as an SSRF attempt (this affected every web-cookie provider, not just qwen). ([#3288](https://github.com/diegosouzapw/OmniRoute/issues/3288), [#3758](https://github.com/diegosouzapw/OmniRoute/issues/3758)) - fix(oauth): stop nulling the stored `refresh_token` of non-rotating providers when a proactive health-check refresh fails with `invalid_grant`. The destructive `refreshToken: null` write in `tokenHealthCheck` was only meant for rotating one-time-use tokens (Codex/OpenAI), but it also fired for Google-family providers (gemini-cli / antigravity / gemini) whose refresh tokens are non-rotating. Once nulled, the connection reported "No valid refresh token available" and could never recover even after re-activation. The token is now preserved (gated on `isRotatingProvider`) so it stays as the recovery artifact. ([#3679](https://github.com/diegosouzapw/OmniRoute/issues/3679) — thanks @3xa228148) - fix(dashboard): self-host the Material Symbols icon font instead of loading it from the Google Fonts CDN. On networks where `fonts.googleapis.com` is unreachable (e.g. mainland China), the icon ligature font never loaded, so every icon rendered as its literal text name (`smart_toy`, `visibility`, …) and the layout broke — especially after importing many models. The font is now bundled locally via the `material-symbols` package (`@import "material-symbols/outlined.css"` in `globals.css`), removing the runtime CDN dependency. ([#3695](https://github.com/diegosouzapw/OmniRoute/issues/3695) — thanks @lqyiwwx) - fix(antigravity): skip Google One AI credits retry on `full_quota_exhausted` verdict — antigravity executor now calls `decide429()` before attempting the credits retry so that a quota-exhausted account (24h cooldown) bypasses the extra upstream HTTP call instead of hanging for up to ~41s. Also persists the cooldown in the DB via `setConnectionRateLimitUntil` so post-restart routing skips exhausted connections without re-learning the hard way. Bonus: `antigravity429Engine` now recognises the real Antigravity "Individual quota reached. Contact your administrator to enable overages." error message as `quota_exhausted`. ([#3707](https://github.com/diegosouzapw/OmniRoute/issues/3707) — thanks @andrea-kingautomation) - fix(cli): `ServerSupervisor.handleExit` now coerces the exit code to a number before calling `process.exit()` — Node.js v24 throws `TypeError [ERR_INVALID_ARG_TYPE]` when `process.exit()` receives a string (e.g. `'ENOENT'` from a spawn `error` event's `err.code`). The `error` callback also now passes `-1` instead of the raw `err.code`, which is an OS error string rather than a meaningful exit code. ([#3748](https://github.com/diegosouzapw/OmniRoute/issues/3748)) ### 📝 Maintenance - **feat(intelligence): enable Arena ELO sync by default + wire it into the live startup path** — the Arena AI leaderboard ELO sync (`ARENA_ELO_SYNC_ENABLED`) that powers the new Free Provider Rankings page ([#3799](https://github.com/diegosouzapw/OmniRoute/pull/3799)) is now **on by default** (was opt-in). It was also only initialized from `server-init.ts`, which the Next standalone runtime never executes (it boots through `instrumentation-node.ts`), so the sync never actually ran in production — it's now initialized from the live instrumentation path. Fetches from `api.wulong.dev` on startup (non-blocking, never fatal) and refreshes daily; set `ARENA_ELO_SYNC_ENABLED=false` to opt out of the outbound sync. (thanks @diegosouzapw) - **chore(quality): Quality Gates → 100%** — completes Fase 6A (systemic hardening: a `stale-allowlist` helper applied across ~10 gates, `docs/architecture/QUALITY_GATES.md`, and ratchet engine v2 with `--require-tighten` + per-metric `eps`) and the entire Fase 7 (20 security/dead-code/mutation/tooling gates), with the Fase 8 plan documented. ([#3757](https://github.com/diegosouzapw/OmniRoute/pull/3757)) - **docs: close the remaining documentation gaps** for proxy operations, skills internals, the memory engine, RTK customization, and compression extensibility (post-#3438 audit, 5 areas in one PR). ([#3453](https://github.com/diegosouzapw/OmniRoute/pull/3453) — thanks @oyi77) - **chore(quality): re-baseline `providerRegistry.ts` file-size** (4692→4703) after #3768's Ollama Cloud `kimi-k2.7-code` capability fix grew the file past the frozen baseline, turning the release's own Fast Quality Gates red. No source file is touched. ([#3770](https://github.com/diegosouzapw/OmniRoute/pull/3770)) - **fix(publish): clean the `@omniroute/opencode-plugin` `node_modules` after the tsup build** — the hard links npm creates on Linux ended up in the published tarball as LINK entries, which the npm registry rejects with `E415 "Hard link is not allowed"`. The dependencies are only needed for the build, never shipped. (thanks @diegosouzapw) - **chore(docs): prune internal planning/spec artifacts and sync the i18n CHANGELOG to 3.8.24.** (thanks @diegosouzapw) - **test: align two unit tests left stale by this cycle's behavior changes** — `executor-codex` now asserts that GPT 5.4 Mini's `xhigh` reasoning effort passes through unchanged (the intended #3756 default; the model ships an `xhigh` catalog variant and carries no `supportsXHighEffort:false` flag) instead of the old downgrade-to-`high`, and `plugins-route-error-sanitization` now covers the new `/api/plugins/marketplace` route from #3656 (verified compliant with Hard Rule #12: imports + uses `buildErrorBody`). No production behavior change. (thanks @diegosouzapw) - **docs: refresh root + `docs/` documentation to the current architecture** — a full codebase audit corrected stale counts/facts (MCP 43→87 tools / ~13→30 scopes, DB 45+→83 modules / 55→97 migrations, routing 14→15 strategies, A2A 5→6 skills, Node `>=22.0.0 <23 || >=24.0.0 <27`, TypeScript 6.0) across `CLAUDE.md`, `AGENTS.md`, `README.md`, and the `docs/` index, and documented the plugin marketplace, Notion/Obsidian, and embedded-services subsystems. (thanks @diegosouzapw) --- ## [3.8.23] — 2026-06-12 ### ✨ New Features - **Emergency budget fallback: opt-out env switch `OMNIROUTE_EMERGENCY_FALLBACK`** ([#3741](https://github.com/diegosouzapw/OmniRoute/pull/3741) — thanks @zoispag): adds an `OMNIROUTE_EMERGENCY_FALLBACK` environment variable that disables the budget-exhaustion emergency reroute to `nvidia/openai/gpt-oss-120b` entirely when set to `false` or `0`. Default behavior (enabled) is unchanged. - **Auto-Combo: live model intelligence scoring via Arena ELO + models.dev** ([#3660](https://github.com/diegosouzapw/OmniRoute/pull/3660) — thanks @pizzav-xyz): replaces the static fitness lookup with a 5-layer resolution chain (user override → Arena ELO → models.dev tiers → hardcoded map → neutral fallback). A sync pipeline auto-fetches Arena AI leaderboard ELO scores and derives intelligence tiers from models.dev capabilities; combo picks now update as leaderboard rankings change without any manual configuration. - **Vertex AI: dynamic model discovery** ([#3712](https://github.com/diegosouzapw/OmniRoute/pull/3712) — thanks @artickc): the `vertex` provider now queries the Generative Language models API at runtime to surface the full account catalog — including image-generation models (Imagen, `gemini-*-image`), embeddings, and partner models — instead of returning only the small hardcoded registry list. - **Vertex AI: self-tracked USD spend on the Limits page** ([#3724](https://github.com/diegosouzapw/OmniRoute/pull/3724) — thanks @artickc): since the Google Cloud Billing API is inaccessible via the proxy credential, Vertex connections now track their own cumulative USD spend locally (based on token-cost accounting) and display it on the Limits page as "$ used since account added." - **Gemini: rate-limit metadata for known per-model RPM/RPD caps** ([#3686](https://github.com/diegosouzapw/OmniRoute/pull/3686) — thanks @hartmark): injects known rate-limit headers (RPM/RPD) for Gemini models that carry per-model limits (e.g. Gemma 4's 15 RPM / generous RPD), so the cooldown engine applies them correctly instead of locking out the whole account on daily-limit hits. - **Model Lockout: full settings UI with success-decay recovery** ([#3629](https://github.com/diegosouzapw/OmniRoute/pull/3629) — thanks @Chewji9875): end-to-end wiring of the per-model lockout feature — settings UI (enable/disable, configure thresholds), backend integration, structured error classification, and a success-decay mechanism that gradually recovers a locked model's fitness as successful calls accumulate. Lockout now applies to all providers when enabled, not just per-model-quota providers. - **Provider display modes — All / Configured / Compact** ([#3743](https://github.com/diegosouzapw/OmniRoute/pull/3743) — thanks @rdself): adds a three-state display mode control to the Providers page. "All" shows every registered provider; "Configured" shows only providers with at least one connection; "Compact" shows configured providers in a condensed card layout for denser views. - **API key cost drilldown + quota % used** ([#3742](https://github.com/diegosouzapw/OmniRoute/pull/3742) — thanks @Witroch4): the API Keys page now shows a per-key cost breakdown and the percentage of quota consumed for each key. ### 🔧 Bug Fixes - **`@omniroute/opencode-plugin` bundled in the npm tarball + `omniroute setup opencode` CLI command** ([#3726](https://github.com/diegosouzapw/OmniRoute/pull/3726) — thanks @herjarsa): the plugin was never compiled as part of the publish pipeline, requiring manual extraction. Now ships pre-built inside the `omniroute` package and installed via `omniroute setup opencode` (copies plugin into `~/.config/opencode/plugins/omniroute/`, updates `opencode.json` idempotently). Also fixes `provider.models` baseURL resolution — checks `_provider.options.baseURL` as a third fallback so partner/tiered providers no longer return zero models. ([#3711](https://github.com/diegosouzapw/OmniRoute/issues/3711)) - **MiMoCode 403 "Illegal access" fixed** ([#3728](https://github.com/diegosouzapw/OmniRoute/pull/3728) — thanks @felipesartori): the Xiaomi free endpoint gates requests on a recognized MiMoCode system-prompt signature; OmniRoute forwarded raw requests without the marker, causing 403 on every call. The executor now injects the required anti-abuse signature. - **"Test all models" flow: i18n crash, status icons, auto-hide** ([#3729](https://github.com/diegosouzapw/OmniRoute/pull/3729) — thanks @felipesartori): three bugs in the provider-detail test-all-models flow — `providerText()` crash because the `testAllResults` template requires `{ok, total}` but callers passed `{ok, error}`; missing `online`/`offline` status icons on model rows; results panel not auto-hiding after run completes. - **OAuth token-refresh invalidation loop fixed** ([#3692](https://github.com/diegosouzapw/OmniRoute/pull/3692) — thanks @diegosouzapw): `refreshClaudeOAuthToken` returned `null` instead of the error sentinel on non-canonical 400 bodies, causing the caller to retry every 60 seconds — observed as 1,352 consecutive refresh attempts on one Claude account. Fixed alongside hardening of `safeResolveProxy` (proxy resolution errors now warn instead of silently falling back to DIRECT) and adding egress-IP visibility to `safeLogEvents`. - **`safeLogEvents` async hotfix** (thanks @diegosouzapw): PR #3692 introduced a `lazy await import(proxyEgress)` inside a sync `safeLogEvents` — an ES syntax error that broke every consumer loading `chatHelpers` via `tsx` and caused 14 tests to fail at module load. Made `safeLogEvents` async; `void`-ed the single `chat.ts` call site. - **Kiro: quota tracking for IAM Identity Center accounts** ([#3722](https://github.com/diegosouzapw/OmniRoute/pull/3722) — thanks @artickc): `getKiroUsage` returned "0 used" for IAM Identity Center accounts (and `kiro-cli` imports) because those connections frequently lack a persisted `profileArn`. Now falls back to a name-based profile lookup so quota displays correctly. - **Empty Claude SSE stream now surfaces a real error** ([#3689](https://github.com/diegosouzapw/OmniRoute/pull/3689) — thanks @TechNickAI): when a Claude stream completed with lifecycle events but no content block, the proxy returned a synthetic `"[Proxy Error] The upstream API returned an empty response"` as a _successful_ assistant message. Now emits a proper SSE error event; the missing-finalizer synthetic path is preserved for streams that already produced content. - **Vertex AI Express-mode API keys** ([#3690](https://github.com/diegosouzapw/OmniRoute/pull/3690) — thanks @artickc): the Vertex executor rejected every non-JSON credential with "Vertex AI requires a valid Service Account JSON." Now accepts Express-mode API key strings (`AIza*`) alongside Service Account JSON, routing them through the correct token endpoint. - **Anthropic: strip `top_p` when `temperature` is set** ([#3691](https://github.com/diegosouzapw/OmniRoute/pull/3691) — thanks @zhiru): Anthropic API rejects requests containing both `temperature` and `top_p`; VS Code's Claude extension sends both in every request, causing 400s on all routed calls. The OpenAI→Claude translator now drops `top_p` when `temperature` is present. - **Combo reasoning token buffer: conservative application + feature flag** ([#3700](https://github.com/diegosouzapw/OmniRoute/pull/3700) — thanks @rdself): tightens the #3588 buffer (only applies when the model is explicitly thinking-capable, has a non-default known output cap, and the full buffered value fits inside that cap) and adds a `reasoningTokenBufferEnabled` feature flag in combo defaults so users can fully disable it from Settings. - **Emergency budget fallback: cross-provider credential leak fixed** ([#3699](https://github.com/diegosouzapw/OmniRoute/pull/3699) — thanks @diegosouzapw): the executor-level emergency hop re-sent the failing provider's API key to the emergency provider's endpoint (e.g. the OpenAI `Authorization` header going to `integrate.api.nvidia.com`). Now orchestrated exclusively by the routing layer, which resolves credentials for the emergency provider via account selection and no longer fires inside combo targets. - **`/v1/messages/count_tokens` now honors the connection's proxy assignment** ([#3699](https://github.com/diegosouzapw/OmniRoute/pull/3699) — thanks @diegosouzapw): token count calls went DIRECT regardless of configured proxies, leaking the host IP for proxy-isolated setups. Now wraps execution in `runWithProxyContext`, exactly like chat execution. - **Gemini: context-mode fallback for signatureless tool calls** ([#3688](https://github.com/diegosouzapw/OmniRoute/pull/3688) — thanks @diegosouzapw): fixes HTTP 400 on multi-turn thinking-model tool calls when `thought_signature` is unavailable — standard Gemini provider now falls back to context mode instead of sending the unsigned call. - **Antigravity: preserve `gemini-3.1-pro` High/Low budget tiers** ([#3696](https://github.com/diegosouzapw/OmniRoute/pull/3696) — thanks @diegosouzapw): upstream accepts the suffixed ids; stop collapsing to bare `gemini-3.1-pro`. - **Stream combo: fail over on empty/content-filtered response** ([#3685](https://github.com/diegosouzapw/OmniRoute/pull/3685) — thanks @diegosouzapw): streaming combos now route to the next target instead of surfacing a blank reply. - **Qwen Web: migrated to v2 chat API** ([#3723](https://github.com/diegosouzapw/OmniRoute/pull/3723) — thanks @diegosouzapw): the legacy `/api/chat/completions` endpoint was retired upstream returning `504` HTML from Alibaba's gateway for all requests. The executor now uses the two-step v2 flow (`/api/v2/chats/new` → `/api/v2/chat/completions?chat_id=`), replays the full browser cookie jar (cna + ssxmod_itna/itna2 + token) required by Alibaba's WAF instead of only a Bearer token, parses phase-based SSE (think→reasoning, answer→content), and refreshes the model catalog to current ids (`qwen3.7-max`, `qwen3.7-plus`, `qwen3.6-plus`; legacy ids kept as aliases). 17 unit tests. (Closes [#3288](https://github.com/diegosouzapw/OmniRoute/issues/3288)) - **Responses API: `stream` defaults to `false` when omitted (spec compliance)** ([#3708](https://github.com/diegosouzapw/OmniRoute/pull/3708) — thanks @diegosouzapw): `/v1/responses` requests that omit `stream` no longer 502 (`STREAM_EARLY_EOF`) when the upstream returns a valid JSON response. `resolveStreamFlag` now applies the OpenAI Responses API spec default (stream=false) in addition to the existing Anthropic Messages API default — previously only `sourceFormat=claude` triggered this path, leaving `sourceFormat=openai-responses` to fall through to the wildcard-Accept heuristic (`Accept: */*` → streaming intent), which caused spec-compliant upstreams that return JSON to appear as a dead stream. Codex CLI (always sends `stream: true`) and explicit SSE clients (`Accept: text/event-stream`) are unaffected. - **Semantic cache: scope to requesting API key** ([#3740](https://github.com/diegosouzapw/OmniRoute/pull/3740) — thanks @diegosouzapw): two callers with different API keys sending the same prompt and model no longer receive each other's cached responses. `generateSignature` now includes the `api_key_id` dimension in the SHA-256 hash; unauthenticated requests (no API key) remain isolated from keyed requests. Existing cache entries (generated without the key dimension) are cleared by migration `098`. - **Model-family fallback: dot-notation model IDs now resolve correctly** (thanks @diegosouzapw): `getNextFamilyFallback` normalizes dots to hyphens for the initial lookup but also falls back to the bare model name, supporting IDs like `gemini-3.1-pro-high` whose dots are part of the literal name. Previously, `gemini-3.1-pro-high` silently returned null and bypassed the entire family. ### ♻️ Code Quality - **Dashboard god-component (#3501): Phases 1g → 1t complete — ≤800 LOC target reached** ([#3717](https://github.com/diegosouzapw/OmniRoute/pull/3717), [#3721](https://github.com/diegosouzapw/OmniRoute/pull/3721), [#3725](https://github.com/diegosouzapw/OmniRoute/pull/3725), [#3727](https://github.com/diegosouzapw/OmniRoute/pull/3727) — thanks @diegosouzapw): four extraction phases bring `ProviderDetailPageClient.tsx` from 4,062 to 781 LOC — the ≤800 target set at the start of the refactor. Extracted OAuth flow helpers, quota display, traffic-inspector panel, logs viewer, combo-target editor, and remaining inline UI into standalone components under `providers/[id]/components/`. ### 🌍 Internationalization - **zh-CN: comprehensive Simplified Chinese translation improvements** ([#3736](https://github.com/diegosouzapw/OmniRoute/pull/3736) — thanks @sdfsdfw2): broad pass on Simplified Chinese UI strings for accuracy and consistency. ### 📝 Maintenance - **CI: bump GitHub Actions artifacts/cache actions to latest** (thanks @diegosouzapw): `actions/download-artifact` 4→8 ([#3733](https://github.com/diegosouzapw/OmniRoute/pull/3733)), `actions/cache` 4→5 ([#3734](https://github.com/diegosouzapw/OmniRoute/pull/3734)), `actions/upload-artifact` 4→7 ([#3735](https://github.com/diegosouzapw/OmniRoute/pull/3735)). - **File-size ratchet baseline reconciled** ([#3705](https://github.com/diegosouzapw/OmniRoute/pull/3705) — thanks @diegosouzapw): freezes 27 inherited/previously-grown files at their current LOC and registers `providerLimits.ts` in the gate; ongoing shrink tracked via #3501. - **docs: add FUNDING.yml and README Support section** ([#3698](https://github.com/diegosouzapw/OmniRoute/pull/3698) — thanks @diegosouzapw) - **docs(changelog): restore `#3590` bullet lost on the v3.8.20 release branch** (thanks @diegosouzapw): the fix reached `main` pre-tag via cherry-pick `#3591`, but its changelog bullet only existed on `release/v3.8.20` after the squash-merge; restored per the 2026-06-12 release-branch leftover audit. ### ✅ Tests - **Combo strategy fallback coverage** (`tests/unit/combo-strategy-fallbacks.test.ts`, 11 tests): fill-first / p2c / random / cost-optimized / strict-random fallback paths (previously happy-path only), price-tie stability, stale strict-random deck degradation, unknown-strategy normalization to priority, and circuit-breaker HALF_OPEN recovery inside the combo loop + `preScreenTargets` (lazy-recovery contract). - **`#1731` fast-skip suite restored** (`tests/integration/combo-provider-exhaustion.test.ts`): the five skipped tests were rewritten against the current routing policy (quota-exhausted 429 marks the provider for the request; transient 429 retries other connections; connection errors skip per-connection; nothing persists across requests) and re-enabled — 8/8 green. - **Proxy context passthrough** (`tests/integration/proxy-context-passthrough.test.ts`): combo targets each execute under their own connection's proxy; `count_tokens` runs inside the connection's proxy context. --- ## [3.8.22] — 2026-06-11 ### ✨ Added - **MiMoCode free-tier provider** ([#3659] — thanks @pizzav-xyz): new no-auth provider `mimocode` (alias `mcode`) exposing Xiaomi's `mimo-auto` model (1M context) via device-fingerprint bootstrap-JWT auth (`/api/free-ai/bootstrap` → Bearer JWT → `/api/free-ai/openai/chat`). Supports multiple accounts (N fingerprints → round-robin with exponential cooldown), re-bootstrap on 401/403, and cooldown on 429. Reuses a new generic `NoAuthAccountCard` dashboard component (also wired for `opencode`). 22 unit tests; upstream validated live during review. (Maintainer follow-up: added the required `authHeader: "none"` field to the registry entry.) Co-authored with @pizzav-xyz. - **Prefer Claude Code for unprefixed `claude-*` model IDs** ([#3540] — thanks @Witroch4): opt-in setting (default off) that routes bare `claude-*` model IDs from Claude Code clients through the Claude Code OAuth account instead of requiring a provider prefix. Configurable via the `OMNIROUTE_PREFER_CLAUDE_CODE_FOR_UNPREFIXED_CLAUDE_MODELS` env flag or a dashboard toggle on the Claude provider page; explicit provider prefixes still win. Full layer coverage (resolver + DB setting + zod schemas + types + UI) with 6 tests. Co-authored with @Witroch4. - **Codex Responses-WebSocket call history** ([#3616] — thanks @kkkayye): Codex `/v1/responses` WebSocket calls are now persisted to request history — success completions plus prepare-failures, upstream WS errors and premature closes — with `sanitizeErrorMessage` applied to the stored error. Two proxy-side integration tests cover the success and failure paths. - **Obsidian/WebDAV**: add the `/api/v1/webdav` file server (PROPFIND/GET/PUT/DELETE/MKCOL/MOVE, Basic-Auth, path-traversal hardened) so Obsidian mobile can sync the vault (#3485, part 2). Implemented in the custom server layer (`scripts/dev/webdav-handler.mjs`) — intercepted before Next.js to support non-standard HTTP methods (`PROPFIND`, `MKCOL`, `MOVE`, `LOCK`). Reads vault path and credentials (with enc:v1: AES-256-GCM decryption) directly from the SQLite `key_value` table; credentials configured via PR1's `/api/settings/obsidian/webdav` endpoint. 36 TDD unit tests covering traversal guard, constant-time auth, decrypt round-trip, XML generation, and full CRUD cycle. - **Quota overview**: deactivate/activate an account directly from the quota card header (toggle button) so users can park a near-zero-quota account without navigating to the provider detail page. ([#3675](https://github.com/diegosouzapw/OmniRoute/pull/3675) — thanks @leninejunior) ### ♻️ Code Quality - **providers/[id]**: extract `useProviderConnections`, `useProviderSettings`, `useProviderModels` hooks from the god-component — #3501 Phase 1f. `ProviderDetailPageClient.tsx`: 4,948 → 4,063 LOC (−885 lines). New hooks in `hooks/`: `useProviderConnections.ts` (954 LOC — all connection management, batch ops, proxy/CLIProxyAPI state, batch-test runner with MAX_BULK_IDS chunking), `useProviderSettings.ts` (264 LOC — Codex global service mode + Claude routing preference), `useProviderModels.ts` (155 LOC — model metadata, aliases). Frozen baselines updated. 10 Phase-1f smoke tests; typecheck/cycles/lint green. Co-authored with @oyi77. - **providers/[id]**: extract `useModelCompatState` hook + model sections (`ModelRow`, `PassthroughModelRow`, `PassthroughModelsSection`, `CustomModelsSection`, `CompatibleModelsSection`) from the god-component — #3501 Phase 1e. `ProviderDetailPageClient.tsx`: 6,838 → 4,922 LOC (−1,916 lines). New leaf `hooks/useModelCompatState.ts` (101 LOC); compat helpers moved to `providerPageHelpers.ts`. Frozen baselines: `providerPageHelpers.ts: 822`. 12 Phase-1e smoke tests; typecheck/cycles/lint green; #3610 auto-hide fix preserved. - **providers/[id]**: extract `ConnectionRow` (+ `CooldownTimer`/`inferErrorType`/`getStatusPresentation`), `ModelCompatPopover` (+ `recordToHeaderRows`), and `SiliconFlowEndpointModal` from the god-component into `components/` — #3501 Phase 1d. `ProviderDetailPageClient.tsx`: 8,092 → 6,838 LOC (−1,254 lines). Frozen baselines: `ConnectionRow.tsx: 941`. 7 new Phase-1d smoke tests; typecheck/cycles/lint green. - **providers/[id]: extract AddApiKeyModal + EditConnectionModal (+ WebSessionCredentialGuide) from the god-component into components/** ([#3501] Phase 1c): extracted the two heaviest inline modals — `AddApiKeyModal` (~787-LOC body) and `EditConnectionModal` (~1091-LOC body) — plus shared `WebSessionCredentialGuide` (~103 LOC) into standalone files under `providers/[id]/components/modals/` and `providers/[id]/components/` respectively. Added `ERROR_TYPE_LABELS` and `formatTimeAgo` to `providerPageHelpers.ts` (leaf) so `EditConnectionModal` and `ConnectionRow` share them without cycles. Pruned 14 now-unused imports from the god-component. `ProviderDetailPageClient.tsx`: 9,981 → 8,092 LOC (−1,889 lines). Frozen baselines: `AddApiKeyModal.tsx: 842`, `EditConnectionModal.tsx: 1170`. 6 new Phase-1c smoke tests; all 21 vitest modal tests pass; typecheck/cycles/lint green. - **refactor: small db/utils cleanup** ([#3523] — thanks @androw): table-driven `compression_analytics` column migration (replaces 17 repeated `ALTER TABLE` calls), a single merged `serializeJsonField` helper in `db/providers.ts` (folded two byte-identical serializers), and removal of the dead no-op `syncProviderDataToCloud`/`getProvidersNeedingRefresh` stubs from `shared/utils/machine.ts` (no remaining callers). Pure refactor; behavior unchanged. - **Provider-detail god-component decomposition — Phase 2b (remaining shared helpers→leaf)** ([#3501]): extended `providers/[id]/providerPageHelpers.ts` with all remaining pure helpers needed by the heavy modals (`AddApiKeyModal`/`EditConnectionModal`) before they can be extracted. Moved 22 symbols: web-session credential label/hint/check/title helpers; upstream-headers helpers (`upstreamHeadersRecordsEqual`, `headerRowsToRecord`, `effectiveUpstreamHeadersForProtocol`, `anyUpstreamHeadersBadge`, `getProtoSlice`) plus their `HeaderDraftRow`/`CompatModelRow`/`CompatModelMap`/`CompatByProtocolMap` types; Codex consts and helpers (`CODEX_REASONING_STRENGTH_OPTIONS`, `CODEX_ACCOUNT_SERVICE_TIER_VALUES`, `CODEX_GLOBAL_SERVICE_MODE_VALUES`, `getCodexServiceTierLabel`, `normalizeCodexLimitPolicy`, `getCodexRequestDefaults`, `getClaudeCodeCompatibleRequestDefaults`); misc helpers (`compatProtocolLabelKey`, `extractCommandCodeCredentialInput`, `normalizeAndValidateHttpBaseUrl`, `SILICONFLOW_ENDPOINTS`, `CommandCodeAuthFlowState`). New transitive imports wired into the leaf: `MODEL_COMPAT_PROTOCOL_KEYS` (`@/shared/constants/modelCompat`), `CodexServiceTier`/`getCodexRequestDefaults`/`getClaudeCodeCompatibleRequestDefaults` (`@/lib/providers/requestDefaults`), `CodexGlobalServiceMode` (`@/lib/providers/codexFastTier`), `WebSessionCredentialRequirement` (`./webSessionCredentials`). `ProviderDetailPageClient.tsx`: 10,288 → 9,980 LOC. Leaf module: 589 LOC (acyclic). 25-assertion unit test suite passes; smoke test 3/3; no import cycles. Co-authored with @oyi77. - **Provider-detail god-component decomposition — Phase 2 (helpers→lib)** ([#3501]): extracted the pure shared helpers — `ProviderMessageTranslator`/`LocalProviderMetadata` types, `providerText`/`providerCountText`/`readBooleanToggle`, and the provider base-URL + routing-tag/excluded-model parse/format block — into a new leaf `providers/[id]/providerPageHelpers.ts` (imports only `@/shared`, so the client and modals share them with no import cycle). `ProviderDetailPageClient.tsx`: 10,435 → 10,288 LOC. Unblocks extracting the heavier `AddApiKeyModal`/`EditConnectionModal` (which depend on these helpers) without cycling. The Phase 0 smoke test caught a missing transitive import (`isSelfHostedChatProvider`) at mount — now wired + locked by a new helpers unit test (12 assertions). Co-authored with @oyi77. - **#3500 fully resolved** — Hard Rule #5 (no raw SQL in route handlers): all 13 internal offenders migrated to `src/lib/db/` modules across slices (call*logs, usage_history/daily_usage_summary, community_servers, usage_logs, semantic_cache, proxy_logs, skills UPDATE, db-backups). The gate's `KNOWN_RAW_SQL` set is renamed to `EXTERNAL_DB_ALLOWED` (with a back-compat alias) and now holds only the **2 external-DB reads** (`oauth/cursor/auto-import`, `oauth/kiro/auto-import`) — these open \_another app's* SQLite to import credentials, so by design they cannot live in OmniRoute's `db/` domain. The gate still blocks any NEW raw SQL against OmniRoute's DB. - **#3500 fully resolved** — Hard Rule #5 (no raw SQL in route handlers): all 13 internal offenders migrated to `src/lib/db/` modules across slices (call*logs, usage_history/daily_usage_summary, community_servers, usage_logs, semantic_cache, proxy_logs, skills UPDATE, db-backups). The gate's `KNOWN_RAW_SQL` set is renamed to `EXTERNAL_DB_ALLOWED` (with a back-compat alias) and now holds only the **2 external-DB reads** (`oauth/cursor/auto-import`, `oauth/kiro/auto-import`) — these open \_another app's* SQLite to import credentials, so by design they cannot live in OmniRoute's `db/` domain. The gate still blocks any NEW raw SQL against OmniRoute's DB. - **chore(db-gate):** reclassify `KNOWN_UNEXPORTED` → `INTENTIONALLY_INTERNAL` in `scripts/check/check-db-rules.mjs` ([#3499]): a full audit of all 25 db modules confirmed each is consumed via direct/dynamic import per Hard Rule #2 ("Never barrel-import from localDb.ts"). The old framing labelled them as "debt", which was misleading — they are the correct pattern. The gate's blocking behaviour is unchanged (a NEW unexported module still fails); only the name, comments, and per-module justifications were updated to reflect audited truth. Four modules flagged `DEAD?` (`compressionScheduler`, `discovery`, `pluginMetrics`, `prompts`) have zero production importers and are documented as schema-reserved. A new regression-guard test (`tests/unit/check-db-rules-classification.test.ts`) asserts every non-dead module in the set has ≥1 real importer, so a future consumer removal surfaces as a test failure requiring explicit reclassification. - **refactor(db): move `call_logs` aggregations into `callLogStats` db module** ([#3500]): extracted raw SQL from three route handlers (`/api/provider-metrics`, `/api/search/stats`, `/api/v1/search/analytics`) into a new `src/lib/db/callLogStats.ts` domain module (`getProviderMetrics`, `getSearchProviderStats`, `getRecentSearchLogs`, `getSearchAggregateStats`, `getSearchProviderCounts`). First slice of #3500 (call_logs cluster). Behavior unchanged; the three routes are removed from `KNOWN_RAW_SQL` in the gate. Validated with TDD unit tests (6 assertions seeding an in-memory SQLite fixture). - **refactor(db): move `usage_history`/`daily_usage_summary` SQL into `usageAnalytics` db module** ([#3500]): extracted all inline `db.prepare(...)` calls from two route handlers (`/api/usage/analytics`, `/api/settings/export-json`) into a new `src/lib/db/usageAnalytics.ts` module and extended `src/lib/db/callLogStats.ts` with `getFallbackStats`. New exports: `buildUnifiedSource`, `buildPresetUnifiedSource` (UNION CTE builders), plus 12 typed query functions covering summary, daily, daily-cost, heatmap, model, provider, account, api-key, service-tier, weekly-pattern, and preset-cost aggregations, plus `getAllUsageHistory`/`getAllDomainCostHistory`/`getAllDomainBudgets` for backup export. Second slice of #3500. `KNOWN_RAW_SQL` drops from 12 → 10. Validated with 21 TDD unit tests (`tests/unit/db-usage-analytics-3500.test.ts`) seeding a temp SQLite fixture. - **refactor(db): move `community_servers` auth look-up into `gamification` db module** ([#3500]): extracted raw SQL from two federation route handlers (`/api/gamification/federation/leaderboard`, `/api/gamification/federation/score`) into a new `getConnectedServerByKeyHash(apiKeyHash)` function in `src/lib/db/gamification.ts`. Third slice of #3500 (gamification federation cluster). Behavior unchanged; the two routes are removed from `KNOWN_RAW_SQL` in the gate. Validated with TDD unit tests (3 assertions seeding a temp SQLite fixture). - **refactor(db): move `skills UPDATE` + `db-backups` SQL into db modules** ([#3500]): fifth slice of #3500. Extracted the dynamic `UPDATE skills SET …` from `src/app/api/skills/[id]/route.ts` into a new `src/lib/db/skills.ts` module (`updateSkill(id, patch)`). The dynamic SET clause is injection-safe: column names are validated against a hard-coded allowlist of known writable columns before being interpolated; unknown keys are silently ignored. Extended `src/lib/db/backup.ts` with three new functions: `exportAllSummaryRows()` (multi-table SELECT for key_value / combos / provider_connections / api_keys, used by exportAll), `getTableNamesFromAdapter()` (sqlite_master introspection via an adapter arg, used by import validation), and `countImportedRows()` (post-import COUNT(\*) per table). The backup domain module is the correct home for sqlite_master introspection — it is not "raw SQL in a route" once moved there. `KNOWN_RAW_SQL` drops by 3 (from 8 → 5). Validated with 11 TDD unit tests (`tests/unit/db-backups-skills-3500.test.ts`). - **refactor(db): move `skills UPDATE` + `db-backups` SQL into db modules** ([#3500]): fifth slice of #3500. Extracted the dynamic `UPDATE skills SET …` from `src/app/api/skills/[id]/route.ts` into a new `src/lib/db/skills.ts` module (`updateSkill(id, patch)`). The dynamic SET clause is injection-safe: column names are validated against a hard-coded allowlist of known writable columns before being interpolated; unknown keys are silently ignored. Extended `src/lib/db/backup.ts` with three new functions: `exportAllSummaryRows()` (multi-table SELECT for key_value / combos / provider_connections / api_keys, used by exportAll), `getTableNamesFromAdapter()` (sqlite_master introspection via an adapter arg, used by import validation), and `countImportedRows()` (post-import COUNT(\*) per table). The backup domain module is the correct home for sqlite_master introspection — it is not "raw SQL in a route" once moved there. `KNOWN_RAW_SQL` drops by 3 (from 8 → 5). Validated with 11 TDD unit tests (`tests/unit/db-backups-skills-3500.test.ts`). - **refactor(db): move `usage_logs`/`semantic_cache`/`proxy_logs` SQL into db modules** ([#3500]): extracted raw `db.prepare(...)` SQL from three route handlers (`/api/analytics/auto-routing` → `usageLogs.ts`; `/api/cache/entries` → `semanticCache.ts`; `/api/logs/export` → `proxyLogs.ts`) into new `src/lib/db/` domain modules. New exports: `getAutoRoutingTotalCount`, `getAutoRoutingVariantBreakdown`, `getAutoRoutingTopProviders` (usage_logs), `listSemanticCacheEntries`, `deleteSemanticCacheBySignature`, `deleteSemanticCacheByModel` (semantic_cache), and `exportProxyLogsSince` (proxy_logs). Fourth slice of #3500. `KNOWN_RAW_SQL` drops from 8 → 5. Validated with 13 TDD unit tests (`tests/unit/db-logs-cache-3500.test.ts`) seeding temp SQLite fixtures. - **Provider-detail god-component decomposition — Phase 0** ([#3501]): introduced `ProviderDetailPageClient.tsx` and reduced `providers/[id]/page.tsx` to a thin 9-line route wrapper (was 12,882 LOC), following the repo's `*PageClient` convention. Added the first-ever smoke render test for the page (Hard Rule #8) as the safety net every later extraction phase is diffed against. Behavior unchanged; the `check-file-size` ratchet now tracks the extracted client. Foundation for Phases 1–6 (strangler-fig). Thanks @oyi77 for the parallel modularization effort in #3627. - **Provider-detail god-component decomposition — Phase 1a** ([#3501]): extracted the three self-contained auth-import modal clusters (Codex/Claude/Gemini `Import*AuthModal` + `Apply*AuthModal` + their co-located helpers, ~2,160 LOC) into `providers/[id]/components/modals/`. `ProviderDetailPageClient.tsx` drops 12,882 → 10,719 LOC. Behavior unchanged (smoke test green; clusters had clean `{ onClose, onSuccess }` / inline-prop interfaces). Co-authored with @oyi77. - **Provider-detail god-component decomposition — Phase 1b** ([#3501]): extracted `EditCompatibleNodeModal` (+ its node/props types) into `providers/[id]/components/modals/`, and moved the shared `CC_COMPATIBLE_DEFAULT_CHAT_PATH` constant into a leaf `providerDetailConstants.ts` so the page client and the modal can both import it without a circular dependency. Also removed two dangling section comments left by Phase 1a. `ProviderDetailPageClient.tsx`: 10,719 → 10,435 LOC. Behavior unchanged (smoke test + a new standalone modal render test green; `check:cycles` clean). Co-authored with @oyi77. ### 🔧 Bug Fixes - **Combos / Auto-Combo: premature context compaction ("agent keeps forgetting things")** ([#3680]): two related context-window bugs fixed. (1) `GET /api/combos/auto` now advertises `context_length` / `max_output_tokens` (MAX across the candidate pool — safe because the auto-combo context pre-filter routes oversized requests to large-window candidates), and the opencode plugin consumes them instead of hardcoding `limit: { context: 0 }` — a zero context silently disables opencode's smart auto-compaction, letting sessions grow until the gateway's destructive history purge kicks in. (2) chatCore's proactive compression for DB combos (incl. quota-shared pools) no longer compresses at `min(...allTargets)`: it now uses the EXECUTING target's own window (`resolveComboContextLimit`), keeping min-of-targets only as a defensive fallback when the current provider/model resolves no specific limit. TDD: 8 server tests (`tests/unit/auto-combo-context-advertising.test.ts`) + 3 plugin tests (`tests/auto-combo-context.test.ts`). - **Obsidian/WebDAV**: add the `/api/settings/obsidian/webdav` config route (enable/disable vault sync), encrypt WebDAV credentials at rest, and remove the duplicate UI block (#3485, part 1). - **OpenCode Free / passthrough**: "Test all models" now respects "Auto-hide failed models" and switches the list to the visible filter so hidden models actually disappear (#3610). Three related bugs fixed: `autoHideFailed` is now threaded from the outer component into `PassthroughModelsSection` via a prop (single shared checkbox); the `/api/models/test-all` request body now includes `autoHideFailed: true` so the server persists the hide; and after the loop, `visibilityFilter` is switched to `"visible"` when ≥1 model was hidden. Two pure-function helpers (`buildPassthroughTestBody`, `shouldSwitchToVisibleFilter`) extracted to `providerPageHelpers.ts` with 7 unit tests. - **Resilience**: clear stale transient connection cooldowns on startup so a prior unclean crash no longer makes every request time out at 120s after restart (#3625) - **fix(home topology): restore live in-flight request pulse** ([#3507]): the animated "pulse" edges in the home Provider Topology panel went dead after PR #3401 unified request visibility, because `activeRequests` was hardcoded to `[]`. Re-wired to `useLiveRequests()` (the existing WebSocket hook on port 20129) so that every pending/running request drives the animation in real time. A pure `selectActiveRequests` mapping helper was extracted to `home/topologyUtils.ts` with 5 unit tests. - **Electron desktop**: launch the peer-stamping `server-ws.mjs` entrypoint so local-only routes (AgentBridge, MCP, services) no longer return 403 LOCAL_ONLY (#3386) - **Provider Topology**: stop flagging healthy providers as errored based on stale historical failures; use current request status (#3619) - **OpenCode Free**: fetch the live model catalog from the provider's `modelsUrl` for the no-auth model picker instead of serving a stale hardcoded list (#3611) - **Hermes Agent**: honour the `HERMES_HOME` env var when writing/reading the agent config instead of always using `~/.hermes` (#3628). Introduced `getHermesHome()` / `getHermesConfigPath()` helpers (read at call-time) and routed all four hardcoded callsites through them so OmniRoute's config lands in the same directory that the Hermes PowerShell installer configures on Windows. - **MITM/cert**: remove the duplicated "Command failed:" prefix in system-command error messages ([#3641](https://github.com/diegosouzapw/OmniRoute/issues/3641)): `execFileText` was prepending its own `"Command failed: "` prefix on top of Node's `execFile` error message, which already begins with `"Command failed: "` for non-zero exits. The error message now surfaces Node's message directly (no double prefix), with stderr appended only when non-empty. - **fix(reasoning): replay `reasoning_content` on plain DeepSeek turns** ([#3632] — thanks @adivekar-utexas): the reasoning-replay gate previously only fired when an assistant message already carried `reasoning_content`. Plain (non-tool-call) turns whose `reasoning_content` was stripped by the client (e.g. Cursor) were forwarded without it, so DeepSeek V4+ rejected the request with 400 "the reasoning_content in the thinking mode must be passed back". The gate now also covers missing/empty `reasoning_content` on DeepSeek replay targets, injecting the cached reasoning (or the non-Anthropic placeholder) so multi-turn text conversations no longer 400. Fixes #1682. 2 regression tests. - **fix(kiro): route enterprise IAM Identity Center accounts to their regional endpoint** ([#3631] — thanks @artickc): Kiro/CodeWhisperer access tokens and Q Developer profile ARNs are region-bound, so enterprise IAM Identity Center accounts outside `us-east-1` (e.g. `eu-central-1`) were rejected by the default host. Adds `resolveKiroRegion` (stored region → profileArn region → `us-east-1`) and `kiroRuntimeHost` (regional `q.{region}.amazonaws.com`, legacy `codewhisperer.us-east-1` for the default), routes chat + usage to the regional endpoint, and discovers the region-matched `profileArn` via `ListAvailableProfiles` in a best-effort `postExchange` hook. 9 tests. - **fix(combo): skip same-provider/connection targets on connection-level errors** ([#3637] — thanks @herjarsa): on connection-level upstream errors (408/500/502/503/504/524), remaining same-`provider:connection` targets in a combo request are now skipped to avoid hammering a known-bad connection, in both the priority and round-robin paths. Adjusted in review to **exclude OmniRoute circuit-breaker-open responses** (503 + `X-OmniRoute-Provider-Breaker` / `provider_circuit_open`) from this skip, preserving the invariant that a breaker-open is an ordinary target failure (the next same-provider target is still tried). Co-authored with @herjarsa. - **/v1/responses**: detect stream readiness for tool-call-only and `object`-less chunks so Codex-shaped (reasoning + tools) requests no longer fail with "Stream ended before producing useful content" (#3612) - **RTL locales (ar/fa/he/ur)**: use logical CSS direction utilities for the sidebar and key overlays so the layout mirrors correctly under `dir=rtl` (#3541, partial — core layout) - **Kiro/AWS auto-import**: set a descriptive account name and dedupe by `profileArn` so imports no longer create nameless duplicate "OAuth Account" rows (#3615) - **fix(guardrails):** the `/api/guardrails/test` route now validates its body through `validateBody()` (Zod) instead of parsing raw JSON directly, aligning it with the repo-wide input-validation pattern (Hard Rule #7). ([#3621](https://github.com/diegosouzapw/OmniRoute/pull/3621) — thanks @diegosouzapw) - **fix(dashboard): bulk provider connection actions** — close audit, API, and UX gaps in the batch activate/deactivate flow: register the `provider.credentials.batch_updated` event in `HIGH_LEVEL_ACTIONS` and `ACTIVITY_ICONS` (was silently dropped from the Activity feed); fix `/api/providers` PATCH to return `warn` status when `notFound` is non-empty instead of always `success`; `/api/providers/test-batch` empty-result early-return now includes a `summary` so stale-ID mode reports to the user; bulk activate/deactivate chunks selection by 100 to avoid the Zod 400 cap on large provider accounts. ([#3673](https://github.com/diegosouzapw/OmniRoute/pull/3673) — thanks @leninejunior) ### 📝 Maintenance - **fix(ci):** increase the `execFileSync` `maxBuffer` in `validate-pack-artifact` so the npm-pack inventory no longer overflows on large tarballs during release validation — follow-up to the v3.8.21 pack-artifact hotfix. ([#3622](https://github.com/diegosouzapw/OmniRoute/pull/3622) — thanks @diegosouzapw) --- ## [3.8.21] — 2026-06-11 ### ✨ Added - **feat(cli):** `omniroute autostart` now accepts the shorthand the headless / `omniroute serve` path was missing — `omniroute autostart on` / `... true` (aliases of `enable`), `... off` / `... false` (aliases of `disable`), a new `... toggle`, and a default `... status` (bare `omniroute autostart` is a safe read-only). Previously autostart could only be toggled from the tray (`serve --tray`) or the Electron Appearance tab, so a plain `omniroute serve` user had no way to enable it. (The cross-platform launchd/systemd/registry logic is unchanged — this only wires the ergonomic CLI surface.) ([#3331](https://github.com/diegosouzapw/OmniRoute/issues/3331) — thanks @uniQta) ### ♻️ Code Quality - **refactor(chatCore):** extract the chatCore request phases — idempotency check, semantic cache check, common request sanitization, and memory/skills injection — into dedicated `open-sse/handlers/chatCore/` modules (`idempotency.ts`, `semanticCache.ts`, `sanitization.ts`, `memorySkillsInjection.ts`), slimming the monolithic handler with no behavior change. (Maintainer follow-up: re-derive `idempotencyKey` at the Phase 9.2 save site after the check moved into the module, fixing a `ReferenceError` on successful non-cached responses.) ([#3598](https://github.com/diegosouzapw/OmniRoute/pull/3598) — thanks @oyi77) - **docs(opencode-provider):** soft-deprecate `@omniroute/opencode-provider` in favour of `@omniroute/opencode-plugin`. The provider package writes a **static** model list to `opencode.json` that drifts behind the live OmniRoute catalog, whereas the plugin fetches `/v1/models` at OpenCode startup. The package keeps working (no code/behavior change), but its npm description and README now carry a deprecation banner with the one-line migration, and a guard test pins the notice. ([#3419](https://github.com/diegosouzapw/OmniRoute/issues/3419) — thanks @herjarsa) - **chore(review):** pre-release hardening from a multi-reviewer `/review-reviews` battery over the v3.8.21 diff (7 Opus reviewers; zero blocker/high). Resolved findings: npm tarball no longer ships co-located test files (`files[]` negations + reconciled `.npmignore`; the #3578 closure gate now asserts the real `npm pack` output in both directions); `getSanitizedCachedProviderLimitsMap` scopes its connection scan to antigravity/agy instead of decrypting every active connection on each dashboard poll; the Antigravity quota-tier remap (`toClientAntigravityQuotaModelId`) is centralized in `antigravityModelAliases.ts` (was an inline if-ladder in `usage.ts`); the chatCore idempotency check returns its resolved key so the save site reuses a single derivation; and new tests pin the chatCore extracted modules, the Antigravity `usage_history` fallback contract, the reasoning-wrapper prefix-preservation heuristic, the Antigravity SSE `markdown` branch, and the upstream-ca/test no-persist guarantee. (Live-verified that agy consumer tokens are accepted by the non-daily `cloudcode-pa` host used by `retrieveUserQuota`, so #3604 is not agy-host-limited.) ### 🔧 Bug Fixes - **fix(routing):** reasoning models (deepseek-v4-flash, nemotron, etc.) no longer return empty content in combo routing when they spend all of `max_tokens` on reasoning — `validateResponseQuality` now rejects an empty-content-but-`reasoning_content` response when reasoning consumed ≥90% of completion tokens (so the combo loop retries/falls back), and reasoning models receive a `max_tokens` buffer (+50%, +1000 floor) so reasoning and content both fit. (Maintainer follow-up: the round-robin buffer is applied to a per-attempt copy so it does not compound across models/retries — `4096 → 6144 → 9216 → …`.) ([#3588](https://github.com/diegosouzapw/OmniRoute/pull/3588) — thanks @herjarsa) - **fix(routing):** a valid `max_tokens`-truncated upstream response is no longer misclassified as empty content and rewritten into a fake 502 — `isEmptyContentResponse()` flagged any Claude `content:[]` / OpenAI empty-choice payload regardless of `stop_reason`/`finish_reason`, so a Claude Code `max_tokens: 1` connectivity ping (HTTP 200, `stop_reason:"max_tokens"`, empty content) became a synthetic `502 "Provider returned empty content"` and triggered a needless family fallback. The guard now treats a terminal truncation/tool signal (Claude `stop_reason` `max_tokens`/`tool_use`, OpenAI `finish_reason` `length`/`tool_calls`) as a legitimate completion; genuinely empty responses (no terminal reason, or `stop`/`end_turn` with empty content) are still caught. ([#3572](https://github.com/diegosouzapw/OmniRoute/issues/3572)) - **fix(api):** `/v1/completions` now returns the legacy OpenAI Completions shape (`object:"text_completion"`, `choices[].text`) instead of chat payloads (`choices[].message|delta.content`) — the endpoint routes internally through the chat pipeline, so legacy Completion clients like TabbyML's `openai/completion` backend crashed with `missing field "text"`. The response (both non-streaming JSON and the SSE stream) is now translated back to the text-completion shape; `[DONE]` and error bodies pass through unchanged. ([#3571](https://github.com/diegosouzapw/OmniRoute/issues/3571)) - **fix(usage):** the z.ai/GLM coding-plan quota card no longer shows "Monthly 0%" — coding plans have no monthly cap (only 5-hour windows), so the quota API reports the `TIME_LIMIT` ("Monthly") entry with `total=0`, and the `total>0 ? … : 0` fallback rendered a misleading 0% remaining (which can skew downstream model-choice). With no absolute cap the remaining percentage now falls back to the percentage-derived value (full/100% when 0% used). ([#3580](https://github.com/diegosouzapw/OmniRoute/issues/3580)) - **docs(discovery):** mark `DISCOVERY_TOOL_DESIGN.md`'s API Endpoints table with an explicit "⚠️ Not yet implemented — Phase 2" banner — the discovery routes are a design proposal (Phase-1 stub only), and the banner makes clear the `KNOWN_STALE_DOC_REFS` gate suppression is intentional, not stale drift. ([#3498](https://github.com/diegosouzapw/OmniRoute/issues/3498)) - **fix(agent-bridge):** add the missing `POST /api/tools/agent-bridge/upstream-ca/test` route — the UpstreamCaField "Test" button POSTed to it but it didn't exist (404). The new validate-only route checks the CA file exists and is a parseable PEM certificate (returns the subject/expiry) **without** persisting the path or activating it; it inherits the `/api/tools/agent-bridge/` LOCAL_ONLY classification. ([#3488](https://github.com/diegosouzapw/OmniRoute/issues/3488)) - **fix(gamification):** the dashboard Profile page no longer hits three 404s — added the missing `GET /api/gamification/{level,badges,badges/earned}` routes (management-scoped). The page is operator-wide (no `apiKeyId`), so `level`/`badges/earned` aggregate across all keys (with an optional `?apiKeyId` for a single key), and `badges` seeds the built-in catalog first (idempotent) so the grid is populated even on installs that never seeded it (see #3472). ([#3484](https://github.com/diegosouzapw/OmniRoute/issues/3484)) - **security(oauth):** migrate the five public OAuth client_ids (Claude, Codex, Qwen, Kimi, GitHub Copilot — 9 server-side call-sites in `providerRegistry.ts` + `oauth.ts`) from string literals to `resolvePublicCred()` (Hard Rule #11), matching the existing Gemini/Antigravity pattern. The values decode byte-for-byte to the same public client_ids (env overrides still win), so OAuth flows are unchanged; the `check-public-creds` allowlist is now empty. The browser-bundled `codexDeviceFlow.ts` copy stays a literal by necessity (it cannot import `open-sse`). ([#3493](https://github.com/diegosouzapw/OmniRoute/issues/3493)) - **fix(mcp):** `omniroute --mcp` no longer crashes on npm installs with `ERR_MODULE_NOT_FOUND` (e.g. `src/lib/combos/steps.ts`) — the MCP server runs from raw TypeScript and imports across `src/` + `open-sse/`, but the published `files` allowlist only shipped a handful of cherry-picked paths, so the transitive closure (~400 files) was absent from the tarball. `files` now ships the backend source the MCP server needs (`open-sse/` + `src/{domain,lib,mitm,server,shared,sse,types}/`, excluding the `src/app` UI), and a new regression test computes the MCP import closure and fails if any reachable source file is not covered by `files`. ([#3578](https://github.com/diegosouzapw/OmniRoute/issues/3578)) - **fix(api):** `API_REFERENCE.md` no longer documents a non-existent `/api/guardrails*` / `/api/shadow*` surface (doc-fiction flagged by `check-docs-symbols`, frozen in `KNOWN_STALE_DOC_REFS`). The guardrail pipeline is real (`src/lib/guardrails`), so the two routes that map to actual behavior are now implemented — `GET /api/guardrails` (list the registered guardrails + status) and `POST /api/guardrails/test` (dry-run the pre-call pipeline over a sample input), both management-scoped — while the fictional `enable`/`disable`/`logs` rows and the entire `/api/shadow*` table (shadow A-B comparison is combo-config + `/api/combos/metrics`) were removed from the doc and dropped from the allowlist. ([#3496](https://github.com/diegosouzapw/OmniRoute/issues/3496)) - **fix(agent-bridge):** the MITM "Start" button no longer reports a misleading "port 443 may be in use" for every failure cause — `startMitm()` only matched the EADDRINUSE stderr line and always threw the port-443 message, so a missing `ROUTER_API_KEY` or an `EACCES` permission error sent users debugging the wrong thing. The startup watcher now buffers the MITM child's stderr and `interpretMitmStartupError()` maps the real `server.cjs` `❌` cause (port-in-use / permission-denied / missing API key / any other diagnostic line) into the surfaced error; with no captured output it stays generic instead of guessing port 443. ([#3606](https://github.com/diegosouzapw/OmniRoute/issues/3606)) - **fix(oauth):** Kiro "Import Token" no longer reports a bare `Internal server error` that hides the real cause — the import validates/refreshes the pasted refresh token against AWS, and the catch returned a generic 500 string, so an `invalid_grant`, an expired token, or a region mismatch all surfaced identically in the dashboard. The import error now carries the sanitized upstream cause via `sanitizeErrorMessage()` (Hard Rule #12 — no stack, no secrets), keeping the same `{ error: }` response shape, and still falls back to the generic message when there is nothing to report. ([#3589](https://github.com/diegosouzapw/OmniRoute/issues/3589)) - **fix(antigravity):** the Antigravity/agy Gemini 3.5 Flash catalog now exposes clean public tier IDs (`gemini-3.5-flash-low`/`-medium`/`-high`, matching Antigravity 2.0.4's Low/Medium/High selector) and maps them to the live upstream IDs at the executor boundary, instead of the old confusing `-preview`/`-agent` names. Antigravity model-id normalization moved out of the global model resolver into the executor so client-visible IDs are no longer rewritten before account/credential routing and logging. (Maintainer follow-up: kept `gemini-3.5-flash-preview` as a hidden backward-compat alias routing to the High tier so saved combos/configs keep working; live-validated the tier set via the `agy` CLI catalog.) ([#3603](https://github.com/diegosouzapw/OmniRoute/pull/3603) — thanks @dhaern) - **fix(usage):** Antigravity/agy Provider Limits now report accurate consumption — `retrieveUserQuota` (live usage) is preferred over the `fetchAvailableModels` catalog view (which keeps reporting full buckets after real usage), with a local `usage_history` fallback for buckets that are only catalog-visible; cached entries are sanitized so retired upstream IDs are not re-exposed, and a deduplicated post-usage refresh keeps the dashboard fresh after each request. (Maintainer follow-up: the post-usage refresh is decoupled through a lightweight `usageEvents` bus so `usageHistory` no longer imports `providerLimits`/the executors graph, keeping the `typecheck:core` surface stable.) ([#3604](https://github.com/diegosouzapw/OmniRoute/pull/3604) — thanks @dhaern) - **fix(gemini):** textual reasoning wrappers emitted as assistant prose (``/``/``/``, including malformed/open tags like ` 0`, so saving with one field filled 400'd and clearing all limits was impossible. Limits now accept `0` (= "no limit for this period"; enforcement only kicks in above 0) and the cross-field minimum was removed; negatives are still rejected. ([#3537](https://github.com/diegosouzapw/OmniRoute/issues/3537)) - **fix(gamification):** badge-unlock events no longer re-fire on every request — the "already unlocked?" guard used `getBadges()`, which INNER-JOINs `badge_definitions` (empty until seeded), so it always reported "not earned" and re-emitted `events.badge_unlocked` per request. Added a `hasBadge()` helper that reads `user_badges` directly, so dedup is correct regardless of whether definitions are seeded. ([#3472](https://github.com/diegosouzapw/OmniRoute/issues/3472)) - **fix(routing):** the `auto` model keyword now works on the Codex `/v1/responses` path — `resolveResponsesApiModel` rewrote the bare `auto` keyword to `codex/auto`, which ChatGPT rejects (`The 'auto' model is not supported when using Codex with a ChatGPT account`). `auto` (OmniRoute's zero-config auto-routing keyword) now passes through untouched so combo routing handles it. ([#3509](https://github.com/diegosouzapw/OmniRoute/issues/3509)) - **fix(cli-tools):** saving the OpenCode/CLI tool config no longer 400s in cloud mode — every CLI tool card posts `apiKey: null` (the real key is resolved server-side from `keyId`), but `guideSettingsSaveSchema` used `z.string().optional()`, which rejects `null`. The schema now normalizes `null` → `undefined`, so the save succeeds and the `keyId`/default path is used. ([#3552](https://github.com/diegosouzapw/OmniRoute/issues/3552)) - **fix(catalog):** PublicAI is no longer miscatalogued as keyless/free — it requires an API key (registry `authType:"apikey"`; signup grants a one-time credit, then it bills). The three PublicAI models moved from `freeType:"keyless"` (which could pick them into the no-auth pool and dispatch with no `Authorization` header) to `"one-time-initial"`, and the provider's `hasFree` flag is now `false` — matching `freeTierCatalog.ts`, which already excluded publicai. ([#3558](https://github.com/diegosouzapw/OmniRoute/issues/3558)) - **fix(gemini-web):** a missing Playwright Chromium browser no longer loops and trips the provider breaker — when the browser binary is not installed, `chromium.launch()` threw an error surfaced as a retryable **500**, so accountFallback marked the account unavailable and retry-looped. It is now classified as a host/config problem and returns **503** with an actionable message (`npx playwright install chromium`) and the `X-Omni-Fallback-Hint: connection_cooldown` header, which skips the provider circuit breaker and applies a short non-exponential cooldown. ([#3516](https://github.com/diegosouzapw/OmniRoute/issues/3516)) - **fix(proxy):** the SOCKS5 proxy option now follows the runtime `ENABLE_SOCKS5_PROXY` env instead of the build-time `NEXT_PUBLIC_ENABLE_SOCKS5_PROXY` — Next.js inlines `NEXT_PUBLIC_*` at build time, so a prebuilt Docker image ignored a runtime setting and the SOCKS5 type stayed hidden. The proxy modal now reads `socks5Enabled` from `GET /api/settings/proxies` (server-side `ENABLE_SOCKS5_PROXY`), with the build-time value kept only as a static-deploy fallback. ([#3508](https://github.com/diegosouzapw/OmniRoute/issues/3508)) - **fix(playground):** the playground model selector now lists models from custom-endpoint (OpenAI/Anthropic-compatible) providers — it filtered `/v1/models` by the provider's connection id, but the catalog emits compatible-provider models under the node's custom prefix (`prefix/model`), so the list came up empty ("None"/"-"). The selector now filters by the node prefix (exposed additively as `modelPrefix` on provider options; the connection id is unchanged, so translator send/translate and connection lookups are unaffected). ([#3505](https://github.com/diegosouzapw/OmniRoute/issues/3505)) - **fix(usage):** the Kiro quota card no longer renders a blank when the account returns no usage breakdown — `getKiroUsage` returned `quotas:{}` for a successful GetUsageLimits response without a `usageBreakdownList` (observed with some AWS IAM / Builder ID accounts), which the dashboard showed as an unexplained empty card. It now returns an informative message (surfaced via the card's connection-message path). ([#3506](https://github.com/diegosouzapw/OmniRoute/issues/3506)) - **fix(security):** route raw `err.message` through `sanitizeErrorMessage()` in five web executors (`adapta-web`, `deepseek-web`, `perplexity-web`, `qoder`, `veoaifree-web`) and the embeddings + search handlers (Hard Rule #12) — these built error response bodies from the raw upstream/exception message, which could leak internal detail. ([#3494](https://github.com/diegosouzapw/OmniRoute/issues/3494), [#3495](https://github.com/diegosouzapw/OmniRoute/issues/3495)) - **fix(dashboard):** correct two dashboard fetches that hit non-existent routes (404) — `CustomHostsManager` called `/api/tools/traffic-inspector/custom-hosts` (the real route is `/hosts`), and `FeatureFlagsGrid`'s post-restart liveness probe called `/api/health` (the real lightweight endpoint is `/api/health/ping`). ([#3486](https://github.com/diegosouzapw/OmniRoute/issues/3486), [#3487](https://github.com/diegosouzapw/OmniRoute/issues/3487)) - **chore(providers):** remove the dead `krutrim` registry entry — it was half-registered (present in `providerRegistry.ts` with a baseUrl + one model, but absent from `providers.ts`, with no executor/translator/OAuth), so it was never selectable. Dropped its `ProviderIcon` entry and the `KNOWN_REGISTRY_ONLY` exception. ([#3483](https://github.com/diegosouzapw/OmniRoute/issues/3483)) - **docs(api):** fix the agent-bridge per-agent state route in `openapi.yaml` and `AGENTBRIDGE.md` — both documented `/api/tools/agent-bridge/agents/{id}/state`, which has no route; corrected to the real per-agent `/api/tools/agent-bridge/agents/{id}` (global state remains `/api/tools/agent-bridge/state`). ([#3489](https://github.com/diegosouzapw/OmniRoute/issues/3489)) - **docs(api):** correct `API_REFERENCE.md` endpoints that documented non-existent routes — skills (`PUT /api/skills/[id]`, `POST`/`GET /api/skills/executions`), plugins (`[id]`→`[name]`, `activate`/`deactivate`), ACP (`DELETE`/`POST /api/acp/agents` via `?id`/`{action:"refresh"}`), cache (`DELETE /api/cache/reasoning`, `/api/cache/entries`), and removed the fabricated `/api/admin/circuit-breaker`, `/api/admin/rate-limits`, and `/api/system-info` (admin only exposes `/concurrency`). ([#3497](https://github.com/diegosouzapw/OmniRoute/issues/3497)) - **fix(executor):** strip provider prefix from versioned built-in tool model field — Anthropic rejects `tools[N].model: "cc/claude-opus-4-8"` from Claude Code's `advisor_20260301` and similar versioned built-in tools; the native Claude OAuth execute path now strips any provider prefix from `model` on tools whose name matches `name_YYYYMMDD`. ([#3532](https://github.com/diegosouzapw/OmniRoute/pull/3532) — thanks @ggiak) - **fix(dashboard):** handle DEGRADED and unknown provider breaker states on the Runtime page — an unrecognised breaker state (e.g. DEGRADED) caused a crash because the styling map had no entry for it; now falls back to a neutral style so the page never throws on unknown states. ([#3533](https://github.com/diegosouzapw/OmniRoute/pull/3533) — thanks @rdself) - **fix(usage):** make opencode-go quota fetcher fail-open instead of throwing 500 — the quota API rejects chat API keys with a JSON-401 body even though the same key works for chat; previously this threw and crashed the dashboard with a red error banner. It now returns an informative message and keeps rendering like other connection-message cases. ([#3522](https://github.com/diegosouzapw/OmniRoute/pull/3522) — thanks @wilsonicdev) - **fix(translator):** map the Codex `local_shell` tool type — `local_shell` was absent from the translator's tool-type map, causing it to fall through as an unknown type; it is now forwarded correctly to the upstream. ([#3534](https://github.com/diegosouzapw/OmniRoute/pull/3534) — thanks @kamaka) - **fix(images):** prefer bare combo names over built-in image model aliases — a user combo named `gpt-image-2` can now shadow the native OpenAI alias so image requests route through the combo; provider-qualified IDs like `openai/gpt-image-2` still resolve via the built-in path. ([#3527](https://github.com/diegosouzapw/OmniRoute/pull/3527) — thanks @AveryanAlex) - **fix(translator):** fix OpenAI→Gemini translation of historical tool calls — tool results from earlier turns were being converted to text, causing Gemini to pattern-match the response as prose rather than structured content; they now use the native Gemini `functionResponse` part format. ([#3569](https://github.com/diegosouzapw/OmniRoute/pull/3569) — thanks @hartmark) - **fix(plugins):** forward plugin lifecycle hooks (`onInstall`, `onActivate`, `onDeactivate`, `onUninstall`) via IPC and wrap `onDeactivate`/`onUninstall` in try/catch so a buggy plugin handler can no longer brick teardown; also removes redundant `RegExp()` wrappers in `accountFallback.ts` and fixes indentation in `requestLogger.ts`. ([#3562](https://github.com/diegosouzapw/OmniRoute/pull/3562) — thanks @oyi77) - **fix(auto-update):** use a stable PROJECT_ROOT walker instead of frozen `process.cwd()` — `resolveProjectRoot` now walks up from `__dirname` to find the nearest directory containing `package.json` or `.git` (bounded at 16 levels), preventing ENOENT errors when the working directory is not the project root. ([#3561](https://github.com/diegosouzapw/OmniRoute/pull/3561) — thanks @oyi77 / @ViFigueiredo via [#3423](https://github.com/diegosouzapw/OmniRoute/pull/3423)) - **fix(resilience):** expose `providerCooldown` in `GET /api/resilience` and accept it in `PATCH` — PR #3556 added the global provider cooldown tracker to the settings model and `ResilienceTab` UI, but the API route never returned the field (causing a crash when the tab loaded) and `updateResilienceSchema` (`.strict()`) rejected PATCH bodies containing it with 400. `providerCooldownSettingsSchema` is now wired into the Zod schema, returned in the GET response, and merged in the PATCH handler. Caught during v3.8.20 VPS homologation (Hard Rule #18 — 5 TDD tests); shipped to main via [#3591](https://github.com/diegosouzapw/OmniRoute/pull/3591). ([#3590](https://github.com/diegosouzapw/OmniRoute/pull/3590) — thanks @diegosouzapw) --- ## [3.8.19] — 2026-06-09 > Focused quality-infrastructure release: the complete **quality-gate ratchet + anti-hallucination guardrail system** (Phases 0–6 + fast-tracked 6A.1/6A.2). No external PRs were taken this cycle by design — community PRs carry over to the next cycle. ### ✨ New Features - **feat(quality):** quality-gate ratchet + anti-hallucination/rule-enforcement guardrails (Phases 0–6) — generic multi-metric ratchet engine (`quality-baseline.json` + collector + comparator, regression-only) and ~18 deterministic gates wired into CI: provider-consistency, dashboard `fetch()`→route and OpenAPI/docs→route resolution (anti-hallucination), dependency allowlist (anti-slopsquatting), file-size/duplication/complexity ratchets (frozen debt only shrinks), anti test-masking (assert-removal/tautology detection on PR diffs), error-helper (Hard Rule #12), public-creds (Rule #11), route-guard membership (Rules #15/#17), db-rules (Rules #2/#5), known-symbols (executors/strategies/translators), migration numbering. Re-enabled the cheap pre-commit hook, tiered `npm audit`, reconciled the CI coverage gate (40→60) and wired 3 orphaned contract gates. ([#3471](https://github.com/diegosouzapw/OmniRoute/pull/3471) — thanks @diegosouzapw) - **feat(quality):** test-discovery gate + 135 orphan tests re-wired + vitest in CI (fast-tracked Phase 6A.1/6A.2) — new `check:test-discovery` proves every `*.test.ts|tsx` is collected by a runner that actually executes (15 collectors with textual drift-check; orphans frozen in a shrink-only baseline). Found **195 orphan test files** (incl. `authz/routeGuard.test.ts` guarding Rules #15/#17 — already rotten); 135 re-wired into the node runner via explicit-braces recursive globs across all scripts + 4 CI call sites; the remaining 60 are categorized debt. New `test-vitest` CI job: `test:vitest` blocking (146/146), `test:vitest:ui` informational (14 pre-existing UI-drift fails, triage 2026-06-16). ([#3536](https://github.com/diegosouzapw/OmniRoute/pull/3536) — thanks @diegosouzapw) ### 🔧 Bug Fixes - **fix(authz):** restored the missing `BYPASS_PREFIX_NOT_ALLOWED` schema guard (Hard Rules #15/#17) — the zod refine documented as layer-1 in `routeGuard.ts` was absent from the live `settingsSchemas.ts`, so `PATCH /api/settings` accepted spawn-capable prefixes (e.g. `/api/cli-tools/runtime/`) into the manage-scope bypass list (the layer-2 runtime predicate still refused to honour them). Surfaced by re-wired orphan tests AC-8/AC-10c, which now stand as the permanent regression guard. ([#3536](https://github.com/diegosouzapw/OmniRoute/pull/3536) — thanks @diegosouzapw) - **fix(db):** `closeDbInstance()`/`resetDbInstance()` now fire the `stateReset.ts` module-state resetters (previously only backup-restore did) — `apiKeys.ts` kept a process-level schema memo across a recreated DB, so the stale re-prepare exploded with `no such column: is_active` and clients received **503 instead of 403** for an invalid bearer; the same path hit production when restoring an older backup snapshot. Includes a dedicated regression test; a test that had accommodated the buggy 503 now asserts the deterministic 403. ([#3536](https://github.com/diegosouzapw/OmniRoute/pull/3536) — thanks @diegosouzapw) ### 🔒 Security - **fix(security):** block the cloud-metadata SSRF pivot in the cli-tools catalog fetch (CodeQL `js/request-forgery`, **critical**) — `fetchOmniRouteCatalog()` built its `/v1/models` URL from a user-controlled `baseUrl` and fetched it. Since the legitimate target is the user's own OmniRoute (loopback), the public-only guard can't apply; `assertSafeCatalogUrl()` now blocks the cloud-metadata/link-local pivot (`169.254.169.254`, `metadata.google.internal`, …) unconditionally, plus non-http(s) protocols and embedded credentials, and the request fetches the re-parsed (taint-severed) URL. Loopback and public OmniRoute Cloud targets stay allowed. ([#3544](https://github.com/diegosouzapw/OmniRoute/pull/3544) — thanks @diegosouzapw) ### 📝 Maintenance - **docs(quality):** Phase 6A critical-audit plan + Phase 7 community-tooling additions, both stored with an activation gate of **2026-06-16** — 6A: stale-allowlist enforcement, ratchet `--require-tighten`, gate scope expansions, remaining orphan/UI-suite triage; Phase 7 additions: gitleaks (Betterleaks noted), actionlint + zizmor, SPDX license compliance. ([#3530](https://github.com/diegosouzapw/OmniRoute/pull/3530) — thanks @diegosouzapw) - **chore(quality):** conscious, documented re-baselines so the quality-gate debuts holding the REAL published line — file-size frozen at current sizes for 9 files that grew in the v3.8.18 era (RequestLoggerV2 +281, stream +101, combo +73, chatCore +45, …) and `eslintWarnings` 3482→3501 (the published v3.8.18 tag already measured 3501; this cycle is neutral). Driving both down is Phase 6A work. ([#3538](https://github.com/diegosouzapw/OmniRoute/pull/3538) — thanks @diegosouzapw) - **chore(release):** open the v3.8.19 development cycle (version bump + electron lockfile sync) and ignore generated yt-downloader artifacts. (thanks @diegosouzapw) - **test:** release-gate stabilization — the re-wired suites + the debuting CI gates surfaced and fixed 6 latent test defects: 2 suites depended on the dev machine's configured password (now hermetic), the breaker reset-timeout test ran on a 5ms margin, the bypass-prefix schema test consecrated the pre-#3536 bug, the chatcore upstream-timeout test had a structurally-broken pending-detail predicate (tested `.providerRequest` on an array — never passed isolated, even at the published v3.8.18 tag), and internal planning docs were excluded from the docs-symbols gate. Coverage floors re-baselined to the honest post-re-wire denominator (78.4% measured: previously-never-imported modules now count). (thanks @diegosouzapw) --- ## [3.8.18] — 2026-06-09 ### ✨ New Features - **feat(ui):** unified Active + Finished requests into a single view — the dashboard now shows in-flight and completed requests in one list with deep-linking, live streaming detail, and a dedicated `/api/logs/[id]` detail route; pending requests are tracked per connection and finalized as they complete. ([#3401](https://github.com/diegosouzapw/OmniRoute/pull/3401) — thanks @hartmark / @diegosouzapw) - **feat(plugins):** plugin lifecycle hooks + theme-manager example — adds `onInstall`/`onActivate`/`onDeactivate`/`onUninstall` lifecycle events dispatched by the plugin manager, thins `index.ts` to a backward-compatible re-export shim over `hooks.ts`, and ships theme-manager + request-logger example plugins. ([#3473](https://github.com/diegosouzapw/OmniRoute/pull/3473) — thanks @oyi77 / @diegosouzapw) - **feat(browserPool):** Playwright proxy resolved from the proxy registry — browser-backed providers (claude-web/gemini-web) now route through the configured per-provider/global proxy instead of connecting directly, matching how OAuth/token-refresh already honor `resolveProxyForProvider` (closes the VPS IP-rate-limit gap for the browser path). Fully additive with graceful degradation. ([#3492](https://github.com/diegosouzapw/OmniRoute/pull/3492) — thanks @borodulin) ### 🔧 Bug Fixes - **fix(executor):** Llama / OpenAI-compat base URL normalization — a `baseURL` without a path (e.g. `llama.example.foo`) or with a non-`/v1` path (e.g. `bar.example.com/foo`) now correctly gets `/v1/chat/completions` appended, fixing the 404 on message sends while `GET /model` still worked. ([#3519](https://github.com/diegosouzapw/OmniRoute/pull/3519) — thanks @hartmark) - **fix(sse):** empty-choices chunks without usage are dropped instead of injecting retry text — a streamed chunk carrying an empty `choices` array and no `usage` is now silently skipped rather than emitting placeholder retry text into the stream, eliminating spurious content for clients that send such keepalive-style frames. ([#3513](https://github.com/diegosouzapw/OmniRoute/pull/3513) — thanks @diegosouzapw) - **fix(types):** restored a clean `typecheck:core` — typed `getPendingRequests()` to its real shape (`Record>`) so the unified-requests view (#3401) no longer treats pending counts as `unknown`, cast the `streamChunks` log payload to its declared type, and aligned `preScreenTargets` (#3169) to the canonical `IsModelAvailable` signature (sync-or-async, normalized via `Promise.resolve`). (thanks @diegosouzapw) - **fix(opencode-plugin):** repaired the corrupted `index.ts` that broke the npm `publish-opencode-plugin` build (introduced by the #3435 branch) — removed two duplicated code blocks (apiFormat + debug-logging), dropped the local `normaliseFreeLabel` superseded by the `naming.ts` extraction, fixed an undefined `sdkBaseURL` reference, declared the missing `startupDebug` / `logLevel` feature-schema fields, and fixed `shortProviderLabel` dropping the prefix on a long displayName with no alias. Plugin now builds (DTS clean) with all 254 tests green. ([#3435](https://github.com/diegosouzapw/OmniRoute/pull/3435) — thanks @diegosouzapw) - **fix(catalog):** Codex CLI model-catalog refresh no longer errors — `GET /v1/models` now returns a top-level `models: []` array for Codex clients (detected via the `originator` / `user-agent` = `codex_*` headers it sends on `GET /v1/models?client_version=...`), so `codex_models_manager` stops failing to decode the OpenAI-standard response and no longer logs `failed to refresh available models` on every startup. The array is intentionally empty: Codex replaces its built-in per-model agent prompt (`base_instructions`, ~21k chars) with whatever a populated entry carries for the selected model, so emitting our catalog would break Codex's agent behaviour — an empty list keeps Codex on its built-in model info (same inference as before, minus the error). Non-Codex OpenAI clients receive the unchanged `{object,data}` response. ([#3481](https://github.com/diegosouzapw/OmniRoute/pull/3481) — thanks @diegosouzapw) - **fix(provider):** Cursor's Responses-API-shaped bodies on `/chat/completions` are detected and handled — a body with `input` but no `messages` is now classified as `openai-responses` (instead of forcing `openai` and building from undefined `messages` → upstream 400); standard OpenAI clients are unaffected by the `messages===undefined` guard. ([#3490](https://github.com/diegosouzapw/OmniRoute/pull/3490) — thanks @borodulin) - **fix(sse):** numeric provider IDs normalized to strings across 4 more surfaces — extends #3427 to the Responses-API SSE passthrough (`response_id`/`item_id`/`call_id`), the buffered/flush path in `stream.ts`, the dedup-key builders, and `sseParser.ts`, preventing `undefined` lookups when IDs arrive as numbers. ([#3451](https://github.com/diegosouzapw/OmniRoute/pull/3451) — thanks @disafronov) - **fix(theoldllm):** `X-Request-Token` generated server-side, dropping the Playwright dependency — replicates the site's client `rie()` token (djb2 hash + `oldllm-client-2026` seed + UA prefix + 8-hex `crypto.randomUUID` suffix) directly, so The Old LLM no longer needs a headless browser to mint tokens. ([#3491](https://github.com/diegosouzapw/OmniRoute/pull/3491) — thanks @borodulin / @diegosouzapw) - **fix(combo):** parallel pre-screen + circuit-breaker fast-exit for priority combos — provider profiles and model availability for all targets are pre-screened concurrently (max 5), and targets whose circuit breaker is OPEN are skipped immediately, reducing first-token latency on multi-target priority combos. ([#3169](https://github.com/diegosouzapw/OmniRoute/pull/3169) — thanks @pizzav-xyz) - **fix(authz):** URL-tokenized client endpoints (`/api/v1/vscode//...`) authenticate again when the caller sends its own non-OmniRoute `Authorization` header — a non-`Bearer ` header (e.g. VS Code Copilot's own, or an empty `Bearer `) no longer short-circuits auth; it falls through to the path-scoped URL token (still validated downstream), instead of 401'ing under `REQUIRE_API_KEY=true`. ([#3504](https://github.com/diegosouzapw/OmniRoute/pull/3504) — thanks @zhiru / @diegosouzapw) - **fix(playground):** the dashboard provider Test playground works under `REQUIRE_API_KEY=true` — it previously sent the **masked** key (`sk-xxxx****yyyy`) as a bearer (always invalid → 401). It now authenticates via the dashboard session and sends only the key **id** (`x-omniroute-playground-key-id`); the gateway resolves the secret server-side, honored **only** for an authenticated session and never putting the key secret on the wire. ([#3503](https://github.com/diegosouzapw/OmniRoute/pull/3503) — thanks @zhiru / @diegosouzapw) ### 📝 Maintenance - **feat(docs):** doc-accuracy gate — new `npm run check:fabricated-docs` (`scripts/check/check-fabricated-docs.mjs`) indexes the codebase (api routes, env vars, CLI commands) and flags API-path/env-var/CLI/hook/file-ref claims in `docs/**` + `AGENTS.md` that don't exist in source (soft-fail by default, `--strict` for CI; wired into `check:docs-all`). Also refreshes the AGENTS.md live counts against source. ([#3510](https://github.com/diegosouzapw/OmniRoute/pull/3510) — thanks @oyi77) - **chore:** ignore local quality reports and prompt artifacts (`quality-metrics.json`, `PLANO-/RELATORIO-QUALITY-GATES.md`, stray prompt `.txt` files) so they no longer surface in `git status`. (thanks @diegosouzapw) ### 🔒 Security - **fix(opencode-plugin):** bounded the regex quantifiers in `normaliseFreeLabel` to close a polynomial-ReDoS (CodeQL `js/polynomial-redos`) — an unbounded `\s*` before an anchored `\s*$` allowed O(n²) backtracking on attacker-influenced provider/model display names; bounded to `{0,8}`/`{1,8}`. (thanks @diegosouzapw) --- ## [3.8.17] — 2026-06-09 ### ✨ New Features - **feat(providers):** LMArena provider — routes requests to the LMArena battle platform via the new `lmarena` executor; supports streaming chat completions. ([#3421](https://github.com/diegosouzapw/OmniRoute/pull/3421) — thanks @oyi77) - **feat(providers):** ZenMux provider — adds the `zenmux` executor for ZenMux's OpenAI-compatible endpoint with streaming support. ([#3429](https://github.com/diegosouzapw/OmniRoute/pull/3429) — thanks @oyi77) - **feat(providers):** Gemini Business provider — adds the `gemini-business` executor (Phase 2C of the Google provider expansion), enabling Gemini models via Google Workspace accounts. ([#3436](https://github.com/diegosouzapw/OmniRoute/pull/3436) — thanks @oyi77) - **feat(plugin+api):** auto-combos API + free model quota display — new `GET /api/combos/auto` endpoint lists dynamically scored combos; provider pages now surface free-tier quotas inline; MCP-plugin surface extended to match. ([#3435](https://github.com/diegosouzapw/OmniRoute/pull/3435) — thanks @mrmm) - **feat(opencode-plugin):** per-prefix API format selection, debug logging, and free-label normaliser — three backports from the mrmm fork: each route prefix can specify its own wire format (OpenAI / Anthropic / Gemini), structured debug output is toggled via env var, and free-tier labels are normalized across providers. ([#3420](https://github.com/diegosouzapw/OmniRoute/pull/3420) — thanks @herjarsa) - **feat(connections):** connection pagination, health filter, batch-delete confirmation, and custom banned keywords — the provider connections table is now paginated; a health-state filter lets operators show only healthy/degraded/failed connections; multi-select + confirm dialog for bulk deletes; per-connection keyword denylist for content safety. ([#3454](https://github.com/diegosouzapw/OmniRoute/pull/3454) — thanks @sdfsdfw2) - **feat(settings):** Endpoint Token Saver visibility toggle — operators can now show or hide the Token Saver widget on the endpoint page from Settings → Appearance. ([#3461](https://github.com/diegosouzapw/OmniRoute/pull/3461) — thanks @rdself) - **feat(catalog):** model catalog name feature flag — a new feature flag controls whether the catalog exposes provider-prefixed model names, letting deployments opt into the legacy bare-name format for downstream tooling compatibility. ([#3464](https://github.com/diegosouzapw/OmniRoute/pull/3464) — thanks @rdself) ### 🔧 Bug Fixes - **fix(translator):** Vertex AI tool calls no longer fail with `400 Unknown name "id"` — the OpenAI-style `id` field is stripped from `functionCall`/`functionResponse` parts for `vertex`/`vertex-partner`; the public Gemini API still receives `id` as required for Gemini 3+ signature matching. ([#3457](https://github.com/diegosouzapw/OmniRoute/pull/3457) — thanks @nullbytef0x / @diegosouzapw) - **fix(claude):** Claude Code `claude-opus-4-8` tool calls no longer break with `tool call could not be parsed` — OmniRoute no longer force-injects `interleaved-thinking` / `advanced-tool-use` / `effort` beta flags the client never negotiated; clients sending their own `anthropic-beta` header control those betas themselves. ([#3458](https://github.com/diegosouzapw/OmniRoute/pull/3458) — thanks @Forcerecon / @diegosouzapw) - **fix(catalog):** imported/custom models on no-auth providers (e.g. The Old LLM) now appear in `GET /api/v1/models` and the Playground model selector — the eligibility gate required a DB connection row which no-auth providers never have, silently dropping every imported model for them. ([#3463](https://github.com/diegosouzapw/OmniRoute/pull/3463) — thanks @tjengbudi / @diegosouzapw) - **fix(browser):** optional `cloakbrowser` import no longer causes bundle errors when the package is absent — the import is now wrapped in a dynamic require so the build succeeds on environments that don't install the optional dep. ([#3460](https://github.com/diegosouzapw/OmniRoute/pull/3460) — thanks @rdself) - **fix(claude-web):** claude-web session handling cleanup — corrects an edge case where session cookies were not properly refreshed after a Turnstile challenge, and removes stale wrapper code left over from the provider split. ([#3449](https://github.com/diegosouzapw/OmniRoute/pull/3449) — thanks @androw) - **fix(analytics):** SQL named params are now scoped per query context — a shared params object was being mutated across concurrent analytics queries, causing `SQLITE_MISUSE: named parameter not found` errors under load. ([#3447](https://github.com/diegosouzapw/OmniRoute/pull/3447) — thanks @ReqX) - **fix(command-code):** chat endpoint reverted to `/alpha/generate` and model-sync discovery fixed — a prior refactor incorrectly targeted the wrong path, causing Command Code completions to silently 404; model listing now also resolves from the correct discovery endpoint. ([#3432](https://github.com/diegosouzapw/OmniRoute/pull/3432) — thanks @TapZe) - **fix(command-code):** CLI version header aligned to current Command Code release — the `X-Command-Code-Version` header value was pinned to a stale version string, causing upstream version-gated features to be rejected. ([#3462](https://github.com/diegosouzapw/OmniRoute/pull/3462) — thanks @hevener10) - **fix(sse):** provider IDs are normalized to strings before lookup — numeric provider IDs (e.g. from legacy DB rows) caused `undefined` lookups in the executor registry; all IDs are now coerced to string at the SSE entry point. ([#3427](https://github.com/diegosouzapw/OmniRoute/pull/3427) — thanks @disafronov) - **fix(stream):** textual tool-call slicing index mismatch resolved and `containsTextualToolCallMarker` deduplicated — two related bugs in the rolling-buffer parser caused partial tool-call chunks to be emitted twice or sliced from the wrong offset, producing garbled JSON in streamed tool responses. ([#3413](https://github.com/diegosouzapw/OmniRoute/pull/3413) — thanks @Ardem2025) - **fix(stream):** OpenAI usage-only chunks (empty `choices: []`) are now passed through instead of being dropped — some providers emit a trailing stats-only chunk after the last content delta; discarding it caused usage counters to be missing in logged responses. ([#3422](https://github.com/diegosouzapw/OmniRoute/pull/3422) — thanks @xz-dev) - **fix(translator):** empty-string `reasoning_content` replaced with placeholder on cache miss — `injectEmptyReasoningContentForToolCalls` pre-sets `reasoning_content=""` before the cache lookup; the old guard checked for `undefined`, never firing on miss and leaving `""` in place, which DeepSeek V4+ rejects with a 400. ([#3433](https://github.com/diegosouzapw/OmniRoute/pull/3433) — thanks @ViFigueiredo) - **fix(catalog):** combos auto-compute `context_length` for any provider-ID form — the context-length resolution only matched exact-string provider IDs, missing combos declared with a numeric or aliased ID; the lookup now normalizes before matching. ([#3417](https://github.com/diegosouzapw/OmniRoute/pull/3417) — thanks @herjarsa) - **fix(healthcheck):** container bridge network IP probed correctly — the healthcheck script was hard-coded to `localhost` which resolves to IPv6 `::1` inside some container runtimes; it now queries the bridge gateway IP so the probe succeeds on both bridge and host networking modes. ([#3434](https://github.com/diegosouzapw/OmniRoute/pull/3434) — thanks @naimo84) - **fix(publish):** onnxruntime CUDA binary removed from npm tarball — the native `.node` binary exceeded npm's 413 payload limit and was never needed at runtime (OmniRoute uses the CPU build); the pack policy now excludes the CUDA artifact. ([#3437](https://github.com/diegosouzapw/OmniRoute/pull/3437) — thanks @herjarsa) ### 📝 Maintenance - **docs:** critical documentation gaps closed — new guides for ACP protocol, router strategies, compression, REST API reference, and updated AUTO-COMBO deep-dive; getting-started section added with Quick Start, Providers, Free Tiers, Auto-Combo, and Troubleshooting pages. ([#3438](https://github.com/diegosouzapw/OmniRoute/pull/3438) — thanks @oyi77) - **docs(opencode-plugin):** plugin README rewritten to lead with the why — positions the plugin as the recommended integration path over the legacy `@omniroute/opencode-provider` package, with migration guidance. ([#3418](https://github.com/diegosouzapw/OmniRoute/pull/3418) — thanks @herjarsa) - **docs(env):** `COMMAND_CODE_VERSION` override documented — environment variable added to `.env.example` and reference docs so operators can pin the CLI version header without a code change. ([#3462](https://github.com/diegosouzapw/OmniRoute/pull/3462) — thanks @hevener10) - **test(auto-combo):** same-provider connection identity assertion added — regression test covering the case where two connections for the same provider share an account ID, verifying the combo engine selects the correct one. ([#3378](https://github.com/diegosouzapw/OmniRoute/pull/3378) — thanks @oyi77) - **deps:** electron upgraded to 42.3.3; electron-builder to 26.15.2; electron-updater to 6.8.9; 4 development-group and 10 production-group packages bumped via Dependabot. ([#3441](https://github.com/diegosouzapw/OmniRoute/pull/3441) / [#3442](https://github.com/diegosouzapw/OmniRoute/pull/3442) / [#3443](https://github.com/diegosouzapw/OmniRoute/pull/3443) / [#3444](https://github.com/diegosouzapw/OmniRoute/pull/3444) / [#3445](https://github.com/diegosouzapw/OmniRoute/pull/3445) — thanks @diegosouzapw) - **chore(release):** v3.8.17 development cycle opened from `main`. (thanks @diegosouzapw) --- ## [3.8.16] — 2026-06-08 ### ✨ New Features - **feat(vision-bridge):** auto-routing to the fastest available vision model — when a request carries image content and the selected model does not support vision, OmniRoute now transparently delegates to the best-match vision-capable model instead of returning an error. ([#3377](https://github.com/diegosouzapw/OmniRoute/pull/3377) — thanks @herjarsa) - **feat(web-session):** web-session pool observability — new MCP tool `get_web_session_pool_health` and a health-matrix REST response (`GET /api/web-session-pool/health`) expose per-provider slot counts, lease ages, and error budgets so operators can diagnose pool exhaustion without digging through logs. ([#3395](https://github.com/diegosouzapw/OmniRoute/pull/3395) — thanks @oyi77) - **feat(web-session):** adaptive keepalive threshold — the keepalive heartbeat interval now self-adjusts based on observed provider idle-disconnect behaviour instead of using a fixed constant, reducing both unnecessary pings and unexpected session drops. ([#3397](https://github.com/diegosouzapw/OmniRoute/pull/3397) — thanks @oyi77) - **feat(web-session):** bulk credential import endpoint (`POST /api/web-session/import`) — import a JSON array of session credentials in one call; each entry is validated and inserted atomically, with per-entry success/failure reported in the response. ([#3403](https://github.com/diegosouzapw/OmniRoute/pull/3403) — thanks @oyi77) - **feat(api):** REST API for session pool health (`GET /api/session-pool/health`) — a dashboard-facing endpoint that aggregates live slot usage, wait-queue depth, and error rates across all active session pools; wired to a new dashboard widget. ([#3404](https://github.com/diegosouzapw/OmniRoute/pull/3404) — thanks @oyi77) ### 🔧 Bug Fixes - **fix(sse):** eliminate race window in `usageTokenBuffer` settings update — a concurrent save + stream-start could race to apply stale settings, causing token counts to roll back by up to 2 000 tokens after a restart; the update now uses an atomic read-modify-write on the shared settings ref. ([#3405](https://github.com/diegosouzapw/OmniRoute/pull/3405) — thanks @diegosouzapw) - **fix(context-cache):** server-side context-cache pinning now correctly persists across restarts; proxy message content no longer leaks into the upstream prompt; and the `context_cache_protection` toggle is properly saved to the DB on change. ([#3399](https://github.com/diegosouzapw/OmniRoute/pull/3399) — thanks @k0valik) - **fix(providers):** the provider settings page now refreshes its model list after a successful `sync-models` call — previously the stale list remained until a full page reload. ([#3402](https://github.com/diegosouzapw/OmniRoute/pull/3402) — thanks @0xtbug) - **fix(stream):** empty-choices chunks (choices array present but empty, no `finish_reason`) are now silently dropped rather than emitted as a `retry:` SSE event — removes spurious retry lines from streaming responses for providers that emit heartbeat keep-alive chunks. ([#3400](https://github.com/diegosouzapw/OmniRoute/pull/3400) — thanks @0xtbug) - **fix(account-fallback):** the connection cooldown deduplication state is now preserved across the fallback retry chain — previously a second concurrent failure on the same account could clear the dedupe flag set by the first, allowing the cooldown window to be extended twice. ([#3381](https://github.com/diegosouzapw/OmniRoute/pull/3381) — thanks @oyi77) - **fix(stream):** false-positive textual tool-call marker truncation — `containsTextualToolCallMarker` now tracks how much of the accumulated streamed content has already been emitted, so it only withholds the unemitted tail rather than re-scanning from the start on every new chunk. ([#3382](https://github.com/diegosouzapw/OmniRoute/pull/3382) — thanks @Ardem2025) - **fix(sanitizer):** `containsTextualToolCallContent()` now requires the complete `[Tool call: name]\nArguments:` header pattern instead of a bare `.includes("[Tool call:")` check — prevents the non-streaming response sanitizer from nulling out model responses that merely quote `[Tool call:]` in prose or code examples. ([#3355](https://github.com/diegosouzapw/OmniRoute/pull/3410) — thanks @diegosouzapw) - **fix(stream):** the streaming textual tool-call guard now flushes any remaining buffered content as plain text when the stream ends, regardless of whether the buffer contains `"Arguments:"` — previously, a partial/incomplete tool-call header that arrived at end-of-stream was silently dropped. ([#3355](https://github.com/diegosouzapw/OmniRoute/pull/3410) — thanks @diegosouzapw) - **fix(executor):** Mistral (and any provider in `PROVIDERS_REQUIRING_USER_LAST_MESSAGE`) no longer receives a trailing `assistant` message with plain text content — `stripTrailingAssistantForProvider` drops it on the upstream-send path, fixing the `400: Expected last role User or Tool … but got assistant` rejection. ([#3396](https://github.com/diegosouzapw/OmniRoute/pull/3409) — thanks @diegosouzapw) - **fix(mitm):** `getMitmStatus()` in the build-time stub (Docker image) now returns a graceful `{ running: false }` status instead of throwing, so the Agent Bridge UI shows a clean "stopped" state rather than an error banner in containerised deployments. ([#3390](https://github.com/diegosouzapw/OmniRoute/pull/3408) — thanks @diegosouzapw) - **fix(env):** corrected casing of `OMNIROUTE_TRACE` in `.env.example` and all related documentation files — was previously mixed-case in some places, causing the variable to be silently ignored on case-sensitive file systems. ([#3393](https://github.com/diegosouzapw/OmniRoute/pull/3393) — thanks @androw) - **fix(featureFlags):** `PRICING_SYNC_ENABLED` description now clearly states that the feature requires the corresponding environment variable to be set — removes the ambiguity that led operators to enable it via the UI only and wonder why sync never ran. ([#3394](https://github.com/diegosouzapw/OmniRoute/pull/3394) — thanks @androw) ### 📝 Maintenance - **ci(docker):** the CI pipeline now builds and publishes the `-web` image variant in the same Docker publish workflow, so both the standard and browser-backed images stay in sync on every release. ([#3389](https://github.com/diegosouzapw/OmniRoute/pull/3389) — thanks @zhiru) - **ci(e2e):** E2E shard suite hardened — timeout raised to 45 min for the heaviest shard; build artifact now uses an explicit `tar` bundle to avoid `upload-artifact@v4` LCA path ambiguity; `node_modules` copied into standalone after download; browser cache added to cut cold-shard time; `sync-models` endpoint mocked in `providers-management.spec.ts` so the import modal reaches "done" immediately. (thanks @diegosouzapw) - **docs:** Codex CLI configuration guide added to the dashboard (`/dashboard/codex-config`) — covers profile naming, model selection, and the `CODEX_*` environment variables accepted by OmniRoute. (thanks @diegosouzapw) - **chore(agentSkills):** catalog expanded to 43 entries — `config-codex-cli` added as a new `CONFIG_SKILL_IDS` category; all skill-count assertions updated across unit and integration test suites; `next-fetch` opts cast to satisfy the TypeScript overload signature in the skill runner. (thanks @diegosouzapw) --- ## [3.8.15] — 2026-06-07 ### ✨ New Features - **feat(error-rules):** provider-specific error classification with scope — a declarative rules layer lets providers map upstream error shapes to the right resilience action (provider circuit-breaker vs connection cooldown vs model lockout) at the correct scope, instead of relying on generic status-code heuristics. ([#3370](https://github.com/diegosouzapw/OmniRoute/pull/3370) — thanks @herjarsa) ### 🔧 Bug Fixes - **fix(combo):** add `429` to `PROVIDER_FAILURE_ERROR_CODES` so a rate-limited target no longer drives an infinite retry loop — the combo now cools the target down and moves on. ([#3366](https://github.com/diegosouzapw/OmniRoute/pull/3366) — thanks @herjarsa) - **fix(catalog):** add a `getTokenLimit` fallback for combo targets with an unknown context window, so a target whose context can't be resolved no longer breaks token-limit computation for the combo. ([#3369](https://github.com/diegosouzapw/OmniRoute/pull/3369) — thanks @herjarsa) - **fix(auto-combo):** include no-auth providers in Auto-Combo declaratively (driven by provider metadata rather than a hard-coded list), so keyless providers are eligible candidates. ([#3365](https://github.com/diegosouzapw/OmniRoute/pull/3365) — thanks @oyi77) - **fix(auto-combo):** validate web-session credentials before selecting a web-cookie provider as an Auto-Combo target, so an expired/empty session doesn't get picked. ([#3371](https://github.com/diegosouzapw/OmniRoute/pull/3371) — thanks @oyi77) - **fix(command-code):** update the Command Code base URL from `/alpha/` to `/provider/v1/` (upstream moved the endpoint). ([#3372](https://github.com/diegosouzapw/OmniRoute/pull/3372) — thanks @TapZe) - **fix(kiro):** probe `%APPDATA%\kiro\storage.db` on Windows during Kiro auto-import, so the import finds the credential store where Kiro actually writes it on Windows. ([#3375](https://github.com/diegosouzapw/OmniRoute/pull/3375), fixes #3363 — thanks @diegosouzapw; reported by @Gerashka2) ### 📝 Maintenance - **fix(migrations):** restore `095_provider_node_custom_headers.sql` — it was twice deleted from the release branch by a contributor branch's `git rm` of a duplicate getting folded into the squash merge; restored and guarded. (thanks @diegosouzapw) ### 🙌 Contributors Thanks to everyone whose work landed in v3.8.15: | Contributor | PRs / Issues | | ------------------------------------------------ | -------------------------------------------------- | | [@herjarsa](https://github.com/herjarsa) | #3366, #3369, #3370 | | [@oyi77](https://github.com/oyi77) | #3365, #3371 | | [@TapZe](https://github.com/TapZe) | #3372 | | [@Gerashka2](https://github.com/Gerashka2) | reported #3363 | | [@diegosouzapw](https://github.com/diegosouzapw) | maintainer — #3375 shepherding, migration restores | --- ## [3.8.14] — 2026-06-07 ### ✨ New Features - **feat(api):** per-provider **custom headers** for OpenAI/Anthropic-compatible provider nodes — attach operator-defined headers (e.g. tenant/routing headers) to upstream requests via a new `customHeaders` field on provider nodes (`custom_headers_json` column, migration 095). Hardened on merge: values/names validated through the canonical `upstreamHeadersRecordSchema` (CRLF/control-char/length/16-max) with a single shared `isForbiddenCustomHeaderName()` denylist (hop-by-hop + auth), applied case-insensitively, and honored for `anthropic-compatible-cc-*` nodes too. ([#3338](https://github.com/diegosouzapw/OmniRoute/pull/3338) — thanks @pizzav-xyz / @diegosouzapw) ### 🔒 Security - **fix(security):** provider auto-sync self-fetch now uses a trusted loopback/env-pinned origin (`getModelSyncInternalBaseUrl()`) instead of `new URL(request.url).origin`, so a management-authenticated caller can no longer redirect the credential-bearing internal request to an arbitrary host via the `Host` header (CodeQL `js/request-forgery`, critical). Shipped to Docker/Electron in v3.8.13; reaches npm here (npm `3.8.13` was immutable). ([#3336](https://github.com/diegosouzapw/OmniRoute/pull/3336), CodeQL #323 — thanks @diegosouzapw) ### 🔧 Bug Fixes - **fix(translator):** every Gemini/Vertex `functionDeclaration.parameters` is now coerced to an OBJECT-typed schema before cleaning. Clients like GitHub Copilot send some tools (e.g. `terminal_last_command`) whose `parameters` is present but lacks a top-level `type: "object"` (just `{ properties }`, a scalar type, or `{}`); these slipped through `buildGeminiTools`' `params || default` guard and Vertex rejected them with `[400] ... functionDeclaration parameters schema should be of type OBJECT`. Hardens every OpenAI→Gemini tool request (Vertex / antigravity / agy / gemini). (#3357 — thanks @nullbytef0x) - **fix(gemini):** normalize Gemini/Antigravity textual `[Tool call: ...]` markers — stop suppressing **false positives** (legitimate assistant prose that merely mentions `[Tool call: terminal]`, e.g. in backticks, is preserved instead of being swallowed) and correctly buffer markers **split across streaming chunks** (`[Tool` + ` call: terminal]` + `Arguments: {...}`), flushing the text when it turns out not to be a tool call. Dedups the parsing/validation into a shared `open-sse/utils/textualToolCall.ts` (new `isValidToolCallHeaderPrefix`) and adds `gemini-2.5-flash`/`gemini-3.5-flash-low` model specs. ([#3358](https://github.com/diegosouzapw/OmniRoute/pull/3358) — thanks @Ardem2025 / @diegosouzapw) - **fix(electron):** clicking "Exit" (or applying an update) now terminates the **whole** server process tree, not just the direct child. The embedded server runs as `omniroute.exe`-as-node (`ELECTRON_RUN_AS_NODE`) and spawns grandchildren (embedded services, MITM proxy, tunnels); on Windows `ChildProcess.kill()` only terminates the direct child, so survivors kept `omniroute.exe` locked — the process "hung in memory" after Exit and updates failed with "file in use". New `killProcessTree()` helper uses `taskkill /PID /T /F` on Windows (signal-based on POSIX); wired into `stopNextServer`, the `waitForServerExit` force-kill, and `installUpdate`. (#3347 — thanks @Flexible78) - **fix(proxy):** proxy auto-selection is now **opt-in** (new `PROXY_AUTO_SELECT_ENABLED` flag, default off). Previously a single proxy in the registry silently became a global fallback for **all** provider connections (the Step-11 fallback listed every registry proxy, ignoring assignments and per-connection `proxy_enabled`). It now no-ops unless the operator enables the flag. (#3332 — thanks @hertznsk) - **fix(cli):** write the OpenCode config to `~/.config/opencode/opencode.json` on **all** platforms — on Windows OmniRoute wrote to `%APPDATA%\opencode\` but OpenCode reads from `%USERPROFILE%\.config\opencode\` (XDG), so dashboard-saved config silently had no effect. (#3330 — thanks @abdulkadirozyurt) - **fix(catalog):** remove `minimaxai/minimax-m3` from the **NVIDIA NIM** tier — NVIDIA does not host it yet, so every request 404'd (`404 page not found`), while sibling `minimax-m2.7` on the same provider works. MiniMax M3 stays available on the tiers that actually serve it. (#3329 — thanks @mikmaneggahommie) - **fix(sse):** treat **MiniMax M3** as multimodal so the compression layer no longer strips image parts from vision requests — `lite.ts modelSupportsVision` now keeps images for `minimax-m3*` (see also the registry `supportsVision` alignment in Maintenance). ([#3328](https://github.com/diegosouzapw/OmniRoute/pull/3328) — thanks @diegosouzapw) - **fix(oauth):** Kiro **Builder ID** token import no longer fails with "Bad credentials" — `validateImportToken` only ever tried the social-auth refresh; it now uses the cached AWS SSO `clientId`/`clientSecret` (`~/.aws/sso/cache/*.json`) and the OIDC refresh path (`authMethod: "builder-id"`), with a TDD harness. ([#3333](https://github.com/diegosouzapw/OmniRoute/pull/3333) — thanks @quanturbo / @diegosouzapw) - **fix(provider-proxy):** honor per-account proxy toggles — a connection with `proxy_enabled = false` is no longer forced through an assigned/registry proxy. ([#3349](https://github.com/diegosouzapw/OmniRoute/pull/3349) — thanks @rdself) - **fix(providers):** reduce proxy label noise on the provider page (clearer proxy assignment/state display). ([#3346](https://github.com/diegosouzapw/OmniRoute/pull/3346) — thanks @wilsonicdev) - **fix(noauth):** expose only **usable** model aliases for no-auth providers, so the catalog no longer advertises aliases that can't actually be called. ([#3345](https://github.com/diegosouzapw/OmniRoute/pull/3345) — thanks @oyi77) - **fix(duckduckgo):** restore the bare `Response` contract for the DuckDuckGo/browser-backed executor (rebased onto the cycle), fixing a wrapping-contract regression. ([#3323](https://github.com/diegosouzapw/OmniRoute/pull/3323) — thanks @oyi77 / @diegosouzapw) - **fix(dashboard):** drop the duplicate "Distribute Proxies" button on the provider page — it rendered twice at once (provider toolbar + accounts-list header) whenever connections existed and none were selected. The toolbar button (global) and the per-tag-group buttons remain. ([#3352](https://github.com/diegosouzapw/OmniRoute/pull/3352) — thanks @diegosouzapw) - **fix(electron):** ship `loginManager.js` in the packaged app — #3292 added it (and a `require("./loginManager")` in `main.js`) without adding it to electron-builder's `build.files`, so the packaged app crashed at startup with "Cannot find module" on the Linux/macOS smoke tests. Plus a regression test asserting every local `require("./x")` in the Electron entry points is shipped. ([#3334](https://github.com/diegosouzapw/OmniRoute/pull/3334) — thanks @diegosouzapw) - **fix(startup):** correct the #3292 auto-refresh daemon import (`@/open-sse/...` → `@omniroute/open-sse/services/autoRefreshDaemon`); the `@/` alias maps to `src/`, so the daemon silently never ran in the built standalone (non-fatal "Cannot find module", caught at runtime). Adds a regression test banning `@/open-sse/*` imports in `src/`. ([#3335](https://github.com/diegosouzapw/OmniRoute/pull/3335) — thanks @diegosouzapw) - **fix(electron):** wrap `autoUpdater.checkForUpdates()` so a 404/offline/rate-limited update check can no longer surface as an unhandled rejection (the `error` event still notifies the user); fixes the macOS-intel packaged-app smoke failure. ([#3339](https://github.com/diegosouzapw/OmniRoute/pull/3339) — thanks @diegosouzapw) - **fix(dashboard):** stop the infinite render loop on `/dashboard/cli-agents/hermes-agent` — `HermesAgentToolCard` listed `currentRoles` in the config-load effect's deps while `loadCurrentConfig()` set `currentRoles` to a fresh object on every fetch, so the effect re-fired → refetched → re-set forever (the page spun and spammed `GET /api/cli-tools/hermes-agent-settings` in the console; it manifested only on the always-expanded detail page). `loadCurrentConfig` is now memoized and the batch-seed reads `currentRoles` via a functional update, so the effect runs once. Adds a jsdom regression test asserting the settings endpoint is fetched a bounded number of times. ([#3353](https://github.com/diegosouzapw/OmniRoute/pull/3353) — thanks @diegosouzapw) - **fix(dashboard):** the Usage Analytics card now surfaces the **real** backend error (status + message) instead of a generic placeholder when `/api/usage/analytics` fails — a new shared `fetchError.ts` helper extracts a useful message. ([#3356](https://github.com/diegosouzapw/OmniRoute/pull/3356) — thanks @diegosouzapw) ### 📝 Maintenance - **fix(review):** harden the per-provider custom-headers feature surfaced by the `/review-reviews` battery — `updateProviderNode` no longer wipes stored `custom_headers_json` on a partial update that omits the field; `customHeadersSchema` reuses the canonical `upstreamHeadersRecordSchema` guards (CRLF/control-char/length/16-max) and rejects auth header names via a single shared `isForbiddenCustomHeaderName()` denylist (executor + schema no longer keep divergent copies); custom headers now reach the wire for `anthropic-compatible-cc-*` nodes and override the executor's own `Content-Type`/`Accept` case-insensitively instead of duplicating them; and `rowToCamel` normalizes a NULL `_json` column to `baseKey: null`. ([#3350](https://github.com/diegosouzapw/OmniRoute/pull/3350) — thanks @diegosouzapw) - **fix(catalog):** flag every `minimax-m3` registry entry `supportsVision` (not just the opencode free tier) so the vision-bridge guardrail and the compression layer agree the model is multimodal on all tiers (completes #3328). (thanks @diegosouzapw) - **fix(oauth):** Kiro Builder ID import forwards the requested `region` to the OIDC validation refresh (no longer pinned to `us-east-1`), prefers the region-matching cached SSO client registration over the first file found, and falls `expiresIn` back to 3600 on the OIDC path. (thanks @diegosouzapw) - **fix(db):** migration `095` gains an `isSchemaAlreadyApplied` guard so a fresh DB (where `SCHEMA_SQL` already creates `custom_headers_json`) skips it cleanly instead of throwing-then-catching a duplicate-column error. (thanks @diegosouzapw) - **test:** align stale cycle tests with shipped behavior — NVIDIA `minimaxai/minimax-m3` removal (#3329), the 29th feature flag (`PROXY_AUTO_SELECT_ENABLED`, #3332), and the OpenCode `~/.config` path on Windows (#3330). (thanks @diegosouzapw) - **docs:** add a documentation comment to the exported `GET` handler in the context-analytics route. ([#3337](https://github.com/diegosouzapw/OmniRoute/pull/3337) — thanks @Lang-Qiu) - **docs(i18n):** translate 25 core documentation files to Indonesian. ([#3348](https://github.com/diegosouzapw/OmniRoute/pull/3348) — thanks @KrisnaSantosa15) ### 🙌 Contributors Thanks to everyone whose work landed in v3.8.14: | Contributor | PRs / Issues | | -------------------------------------------------------- | ------------------------------------------------------------------------------------------------------ | | [@pizzav-xyz](https://github.com/pizzav-xyz) | #3338 | | [@quanturbo](https://github.com/quanturbo) | #3333 | | [@oyi77](https://github.com/oyi77) | #3323, #3345 | | [@rdself](https://github.com/rdself) | #3349 | | [@wilsonicdev](https://github.com/wilsonicdev) | #3346 | | [@hertznsk](https://github.com/hertznsk) | #3332 | | [@abdulkadirozyurt](https://github.com/abdulkadirozyurt) | #3330 | | [@mikmaneggahommie](https://github.com/mikmaneggahommie) | #3329 | | [@Flexible78](https://github.com/Flexible78) | #3347 | | [@Lang-Qiu](https://github.com/Lang-Qiu) | #3337 | | [@KrisnaSantosa15](https://github.com/KrisnaSantosa15) | #3348 | | [@nullbytef0x](https://github.com/nullbytef0x) | #3357 | | [@Ardem2025](https://github.com/Ardem2025) | #3358 | | [@diegosouzapw](https://github.com/diegosouzapw) | maintainer — #3334, #3335, #3336, #3339, #3350, #3352, #3353, #3356; review/hardening across the cycle | --- ## [3.8.13] — 2026-06-06 ### ✨ New Features - **feat(web-cookie):** self-service login infrastructure for 21 web-cookie providers — three login pathways (Electron BrowserWindow, Playwright dashboard fallback, `POST /api/providers/{id}/login`), token-extraction configs, and a 15-min cookie-validity auto-refresh daemon. Hardened on merge: error bodies sanitized (Hard Rule #12), the spawn-capable login route classified LOCAL_ONLY (Hard Rules #15/#17), and the Electron status listener de-duplicated. ([#3292](https://github.com/diegosouzapw/OmniRoute/pull/3292), closes #3070 — thanks @oyi77 / @diegosouzapw) - **feat(api):** accept path-scoped API keys on client API routes — keys may arrive via `/api/v1/vscode//…` path aliases (incl. `raw`/`combos`); explicit `Authorization`/`x-api-key` headers still take precedence. Split out of #3073. ([#3300](https://github.com/diegosouzapw/OmniRoute/pull/3300) — thanks @zhiru) - **feat(api):** model-catalog enrichment + MCP `model-catalog` tools — richer per-model metadata (context window, capabilities) surfaced through `/v1/models` and new MCP tools, plus `readHeaderValue` header-record support. Split out of #3073; reconciled on merge with the #3309 URL-token hardening (kept the security gate — no query-string credential fallback, management auth stays header-only). ([#3306](https://github.com/diegosouzapw/OmniRoute/pull/3306) — thanks @zhiru / @diegosouzapw) - **feat(dashboard):** internationalize the proxy settings UI — `ProxyTab` + the proxy `DocumentationTab`/`FreePoolTab`/`VercelRelayModal` now render via `t(...)`, with matching `en`/`pt-BR` message keys. Split out of #3073. ([#3307](https://github.com/diegosouzapw/OmniRoute/pull/3307), [#3310](https://github.com/diegosouzapw/OmniRoute/pull/3310) — thanks @zhiru) - **feat(provider):** provider test-all endpoint + per-connection rate-limit overrides + model visibility — `POST /api/models/test-all` runs parallel model tests (chunked, timeout-skip) atop a shared `runSingleModelTest` runner; per-connection rate-limit overrides land via `PATCH /api/providers/:id` (new `rate_limit_overrides_json` column + Zod schema); a dashboard model-visibility toolbar (All / Visible / Hidden) drives a `/v1/models` catalog that excludes user-hidden models; models auto-fetch on every connection add; and passthrough (OpenRouter) models gain test buttons. Folds in dashboard fixes on merge (missing alias/delete handlers, duplicate-model-ID React keys, "Hide all" restored) and a build fix so empty `.env` values no longer override real config. ([#3267](https://github.com/diegosouzapw/OmniRoute/pull/3267) — thanks @Vinayrnani) - **feat(api):** VS Code Copilot Ollama-compatible BYOK endpoint — exposes an Ollama-shaped surface so VS Code Copilot's "bring your own key" Ollama provider can target OmniRoute directly, with a `VscodeTokenAliasCard` in the dashboard endpoint tab to generate the path-scoped token alias. ([#3316](https://github.com/diegosouzapw/OmniRoute/pull/3316) — thanks @zhiru) - **feat(combo):** Auto-Combo candidate-expansion optimization + playground model dropdown + "only configured" model toggle — reworks the `auto` strategy's candidate selection in `combo.ts` and surfaces a model picker in the playground `StudioConfigPane` / `useAvailableModels`. ([#3322](https://github.com/diegosouzapw/OmniRoute/pull/3322) — thanks @oyi77) ### 🔒 Security - **fix(auth):** follow-up hardening of the client-API key extractor (#3300) — removed the generic query-string token fallbacks (`?token=`/`?key=`/`?apiKey=`/`?api_key=`), which leak credentials into access logs / Referer headers, and gated URL-borne tokens to client routes only (management auth is now header-only) so a credential in the URL can never authenticate a management route. The path-scoped `/vscode//…` form the VS Code integration needs is unchanged. (security review follow-up to [#3300](https://github.com/diegosouzapw/OmniRoute/pull/3300) — thanks @zhiru / @diegosouzapw) ### 🔧 Bug Fixes - **fix(dashboard):** Agent Bridge page (`/dashboard/tools/agent-bridge`) no longer crashes with "Internal Server Error" — the page replaced its well-shaped state with the raw `/api/tools/agent-bridge/state` response (`{ server, agents }`), leaving `serverState` undefined and throwing `Cannot read properties of undefined (reading 'running')`. A shared `normalizeAgentBridgeState()` now maps the route shape into the page contract (incl. `server.certExists → certTrusted`) and always returns safe defaults, used by both the SSR loader and the polling hook. (#3318 — thanks @tycronk20) - **fix(codex):** strip client-only params (`prompt_cache_retention`, `safety_identifier`, `user`) on the native `codex/` `/v1/responses` passthrough — Codex upstream rejects them with `400 Unsupported parameter`, which broke Factory Droid and any client injecting those fields. The chat-completions path already stripped them; the responses→responses passthrough now does too. (#3317 — thanks @tycronk20) - **fix(theoldllm):** stop the `[502]: Body is unusable: Body has already been read` error on the cached-token path — the executor read the same upstream `Response` body with `.text()` twice; it now reads it once and only re-reads after a token-rejection refetch. (#3296 — thanks @onizukashonan14-png) - **fix(dashboard):** keep no-auth providers (opencode, duckduckgo-web, theoldllm, veoaifree-web) visible under the "Show configured only" filter — they never create a connection row (`stats.total === 0`) but are always usable and already appear in `/v1/models`, so the filter now treats `displayAuthType === "no-auth"` as configured. (#3290 — thanks @uniQta) - **fix(dashboard):** refresh the connection list after a Codex/Claude/Gemini auth import — the import modals called `fetchData()` (which only reloads provider metadata), so a freshly-imported connection stayed invisible until a manual reload; they now call `fetchConnections()`. ([#3320](https://github.com/diegosouzapw/OmniRoute/pull/3320) — thanks @zhiru) - **fix(cli):** `omniroute update` no longer always fails on a global install — `getCurrentVersion()` and `createBackup()` now resolve `package.json`/`bin` relative to the script (`import.meta.url`) instead of `process.cwd()` (the user's working dir on a global npm/brew install → _"Could not determine current version"_), and the backup copies the `cli` directory with `cpSync({recursive:true})` instead of `copyFileSync`, which threw a swallowed `EISDIR` → _"Failed to create backup. Aborting"_. (#3295 — thanks @uniQta) - **fix(sse):** harden the passthrough stream against empty upstream responses — emit a synthetic retry chunk on an empty `choices: []` (fixes a Copilot Chat crash) and log empty post-`tool_calls` completions; also registers **MiniMax M3** (1M context) across 8 provider tiers. ([#3297](https://github.com/diegosouzapw/OmniRoute/pull/3297), #3110 — thanks @wilsonicdev) - **fix(opencode-provider):** extract `contextLength` from the live `/v1/models` catalog (live > `modelContextLengths` > static map) so passthrough models outside the legacy 8-model map no longer silently truncate to OpenCode's 128K default. ([#3298](https://github.com/diegosouzapw/OmniRoute/pull/3298) — thanks @herjarsa / @diegosouzapw) - **fix(dev):** auto-rebuild `better-sqlite3` on a Node ABI mismatch at `npm run dev` startup (nvm 22↔24) — dev-only, no-op on the healthy path, unrelated errors not swallowed. ([#3301](https://github.com/diegosouzapw/OmniRoute/pull/3301) — thanks @zhiru) - **fix(api):** remove the bundled **Completions.me** provider preset — empirically verified to return Rick Astley lyrics instead of real completions for every model/prompt. ([#3302](https://github.com/diegosouzapw/OmniRoute/pull/3302), discussion #3293 — thanks @diegosouzapw; reported by @mikmaneggahommie) - **fix(ci):** skip the auto-deploy step when the VPS SSH port is unreachable from the GitHub runner (private LAN / firewall) instead of red-failing every release pipeline; genuine deploy/boot failures still fail honestly. ([#3299](https://github.com/diegosouzapw/OmniRoute/pull/3299) — thanks @diegosouzapw) - **fix(sse):** strip leaked internal tool-call envelopes (`to=functions.*` / `multi_tool_use.parallel { … }`) from visible assistant text and sanitize Responses-API streaming (drop `commentary`-phase output items) so harness syntax never reaches the client. ([#3311](https://github.com/diegosouzapw/OmniRoute/pull/3311) — thanks @zhiru) - **fix(sse):** expose the Claude (`claude-opus-4-6-thinking`, `claude-sonnet-4-6`) and Gemini budget tiers (`gemini-3.1-pro-{high,low}`, `gemini-3.5-flash-{low,extra-low}`) in the Antigravity catalog — they are user-callable on the Antigravity OAuth backend (agy parity), correcting an earlier assumption that Claude had been removed. ([#3303](https://github.com/diegosouzapw/OmniRoute/pull/3303), discussion #3184 — thanks @diegosouzapw) - **fix(catalog):** compute a combo's `context_length` from the known targets only — a single target with unknown context no longer collapses the whole combo to `undefined`; also accepts live `{id, contextLength}` model entries in the opencode-provider helper (follow-up to #3298). ([#3304](https://github.com/diegosouzapw/OmniRoute/pull/3304) — thanks @herjarsa / @diegosouzapw) ### 📝 Maintenance - **test(catalog):** align the Antigravity preview-alias catalog test with the #3303 budget tiers — asserts the restored Claude/Gemini tiers are surfaced, locking in the behavior so a future tier change can't silently drop them again (thanks @diegosouzapw) - **docs:** rename the `resolve-issues` skill references to `review-issues` across the docs/skill surfaces, matching the renamed governance skill (thanks @diegosouzapw) - **docs:** document the VS Code / Ollama endpoints (API reference + new `docs/reference/CLI-TOOLS.md`) and improve the env-bootstrap + i18n key-coverage tooling. ([#3319](https://github.com/diegosouzapw/OmniRoute/pull/3319) — thanks @zhiru) - **chore(release):** open the v3.8.13 development cycle (version bump + cycle bookkeeping) and finalize this changelog (thanks @diegosouzapw) ### 🙌 Contributors Thanks to everyone whose work landed in v3.8.13: | Contributor | PRs / Issues | | -------------------------------------------------------------- | ------------------------------------------------------------------------------------ | | [@zhiru](https://github.com/zhiru) | #3300, #3306, #3307 / #3310, #3309, #3301, #3311, #3320, #3319, #3316 | | [@tycronk20](https://github.com/tycronk20) | #3317, #3318 | | [@Vinayrnani](https://github.com/Vinayrnani) | #3267 | | [@oyi77](https://github.com/oyi77) | #3292 (closes #3070), #3322 | | [@onizukashonan14-png](https://github.com/onizukashonan14-png) | #3296 | | [@uniQta](https://github.com/uniQta) | #3290, #3295 | | [@wilsonicdev](https://github.com/wilsonicdev) | #3297 | | [@herjarsa](https://github.com/herjarsa) | #3298, #3304 | | [@mikmaneggahommie](https://github.com/mikmaneggahommie) | reported the Completions.me rickroll (discussion #3293) | | [@diegosouzapw](https://github.com/diegosouzapw) | maintainer — #3299, #3302, #3303; co-author on #3292 / #3306 / #3298 / #3304 / #3309 | --- ## [3.8.12] — 2026-06-06 ### ✨ New Features - **chipotle:** add Chipotle Pepper AI — a free provider implemented via the reverse-engineered Amelia protocol, with its executor error body routed through `sanitizeErrorMessage()` (Hard Rule #12) ([#3250](https://github.com/diegosouzapw/OmniRoute/pull/3250) — thanks @oyi77) - **web-cookie:** add tool-call translation to 8 web-cookie executors via shared `webTools` helpers, so cookie-backed providers can participate in tool/function calling through a single serialize/parse path ([#3259](https://github.com/diegosouzapw/OmniRoute/pull/3259) — thanks @oyi77) - **free-tiers:** per-model free-token budget catalog + a Monthly Budget dashboard card surfacing each provider's monthly free allowance (joins the honest free-token catalog/API/headline work from #3257) ([#3263](https://github.com/diegosouzapw/OmniRoute/pull/3263), [#3257](https://github.com/diegosouzapw/OmniRoute/pull/3257) — thanks @diegosouzapw) - **dashboard:** bulk activate / deactivate / retest for selected provider connections — multi-select with batch lifecycle actions on the providers page ([#3271](https://github.com/diegosouzapw/OmniRoute/pull/3271) — thanks @leninejunior) - **models:** register MiniMax-M3 (frontier coding/agentic model, 1M context, Anthropic-compatible) across 8 provider tiers — `minimax`, `minimax-cn`, `opencode` (free), `opencode-go`, `opencode-zen`, `trae`, `ollama-cloud`, `nvidia` ([#3287](https://github.com/diegosouzapw/OmniRoute/pull/3287), #3110 — thanks @wilsonicdev) ### 🔧 Bug Fixes - **api/responses:** combo names without a slash (e.g. `paid-premium`, `n8n-text`) are no longer force-rewritten to `codex/` on `/v1/responses` — `resolveResponsesApiModel` now returns the request unchanged when the model resolves to a combo (regression from the v3.8.9 Codex WS→HTTP fallback) ([#3268](https://github.com/diegosouzapw/OmniRoute/pull/3268), fixes #3227 / #3233 — thanks @wilsonicdev; supersedes the earlier closed #3242) - **sse:** strip **every** `` tag before forwarding to the provider, not just the first — a global-regex variant prevents stray routing tags from leaking into the upstream prompt ([#3248](https://github.com/diegosouzapw/OmniRoute/pull/3248), fixes #454 — thanks @MikeTuev) - **grok-web:** add TLS fingerprint impersonation to bypass Cloudflare anti-bot on the Grok web endpoint, with the executor's error bodies routed through `sanitizeErrorMessage()` (Hard Rule #12) ([#3249](https://github.com/diegosouzapw/OmniRoute/pull/3249), fixes #3180 — thanks @wilsonicdev) - **providers:** improve provider refresh/validation and the model-catalog UI — including the OpenRouter catalog and the proxy UI, plus the NVIDIA NIM `/models`-suffix probe path (real-VPS validated) ([#3261](https://github.com/diegosouzapw/OmniRoute/pull/3261) — thanks @strangersp) - **embeddings:** block cross-dimension failover inside embedding combos so a fallback target with a different vector dimension can no longer corrupt results ([#3256](https://github.com/diegosouzapw/OmniRoute/pull/3256) — thanks @diegosouzapw) - **sse/web-tools:** web-cookie providers (e.g. `ds-web`/`deepseek-v4-pro`) that wrap tool calls as `{json}` are now parsed correctly — the real tool name is read from the JSON body instead of the tag attribute, and the call is no longer silently dropped when `arguments` is absent ([#3275](https://github.com/diegosouzapw/OmniRoute/pull/3275), fixes #3260 — thanks @diegosouzapw) - **sse/groq:** non-reasoning Groq models (`llama-3.3-70b-versatile`, `llama-4-scout`) are now flagged `supportsReasoning: false`, so `reasoning_effort` / `output_config.effort` / `thinking` are stripped before dispatch instead of being forwarded and rejected with HTTP 400 — fixes the Claude Code → Groq regression of #764 ([#3277](https://github.com/diegosouzapw/OmniRoute/pull/3277), fixes #3258 — thanks @diegosouzapw) - **api/images:** `POST /v1/images/edits` to a custom OpenAI-compatible provider no longer forwards an empty `model`. The multipart body is now built as a `Buffer` with an explicit boundary instead of a global `FormData` — the patched undici `fetch` serialized a native `FormData` as the literal string `[object FormData]` (text/plain), dropping every field including `model` ([#3278](https://github.com/diegosouzapw/OmniRoute/pull/3278), fixes #3273 — thanks @diegosouzapw) - **db:** detect SQLite driver-unavailable errors to avoid a destructive DB rename + an optional FTS5 migration guard, so a transient driver-load failure no longer triggers the backup-and-recreate path on a healthy database (split from #3073) ([#3274](https://github.com/diegosouzapw/OmniRoute/pull/3274) — thanks @zhiru) - **quota:** repair the Quota Sharing Engine — `poolUsageWithDimensions()` promoted onto the `QuotaStore` interface (kills the dynamic type-narrowing hack), single-snapshot burn rate via `computeBurnRateFromWindow()` (the dashboard previously always showed 0), zero-weight allocations normalized to equal distribution, Anthropic `anthropic-ratelimit-*` saturation signals, a `quota.exceeded` webhook fired on block, and quota enforcement extended to the embeddings handler ([#3280](https://github.com/diegosouzapw/OmniRoute/pull/3280) — thanks @oyi77) - **plugins:** `emitHookBlocking` now chains the payload between handlers — each blocking handler receives the body/metadata as mutated by previous handlers, so a later plugin can observe an earlier plugin's changes (previously every handler got the original static payload) ([#3286](https://github.com/diegosouzapw/OmniRoute/pull/3286) — thanks @oyi77) - **api/webhooks:** webhook URLs may now target a private/internal address (e.g. `192.168.x`, a docker-internal host) when `OMNIROUTE_ALLOW_PRIVATE_PROVIDER_URLS=true` — the webhook guard reuses the same explicit opt-in as private provider URLs (default OFF; protocol and embedded-credential checks stay unconditional). Cloud-metadata / link-local endpoints (`169.254.169.254`, `metadata.google.internal`, `100.100.100.200`, `169.254.0.0/16`) are blocked **unconditionally** even with the opt-in on, and the webhook test endpoint redacts the upstream response body for private targets (no SSRF→IAM-credential pivot, no content exfiltration) ([#3279](https://github.com/diegosouzapw/OmniRoute/pull/3279), [#3281](https://github.com/diegosouzapw/OmniRoute/pull/3281), fixes #3269 — thanks @diegosouzapw) - **sse/qoder:** a valid Qoder Personal Access Token is no longer wrongly reported as "expired" when the Cosy validation endpoint returns a generic `Internal Server Error` (HTTP 500). A Cosy 500 only marks the PAT invalid when its body carries an explicit auth signal; a generic server fault now falls back to the #1391 valid-bypass rule ([#3283](https://github.com/diegosouzapw/OmniRoute/pull/3283), fixes #3247 — thanks @wilsonicdev, who independently diagnosed the same root cause and filed [#3282](https://github.com/diegosouzapw/OmniRoute/pull/3282); refined here to keep rejecting on an explicit-auth-signal 500) ### 📝 Maintenance - **ci:** `deploy-vps` recreates the PM2 process via the `omniroute` bin (instead of a bare `pm2 restart` pinned to the removed `app/server-ws.mjs` path) and gates the deploy on `/api/monitoring/health` reporting `"status":"healthy"`, failing the job (with recent PM2 logs) when the box never becomes healthy — supersedes #3262 ([#3270](https://github.com/diegosouzapw/OmniRoute/pull/3270) — thanks @diegosouzapw) - **security:** harden the Chipotle executor against CodeQL findings — `Math.random()` → `crypto.randomInt()`/`crypto.randomUUID()` (imported from `node:crypto`) for session/server IDs, and a strict `new URL().hostname` check (replacing a substring match) in its test ([#3285](https://github.com/diegosouzapw/OmniRoute/pull/3285) — thanks @oyi77) - **governance:** raise the coverage gate from 40% to 60% (statements/lines/functions/branches) now that real coverage sits at ~80% — brings the threshold in line with Hard Rule #9 (thanks @diegosouzapw) - **docs:** consolidate the community links (Discord + Telegram + WhatsApp) at the top of the README and promote the Free-Token Budget section ([#3289](https://github.com/diegosouzapw/OmniRoute/pull/3289) — thanks @diegosouzapw) - **docs:** richer free-tier budget-card image (28 models + first-month strip) and softer ToS framing (caution rather than warning) ([#3284](https://github.com/diegosouzapw/OmniRoute/pull/3284) — thanks @diegosouzapw) ### 🙌 Contributors Thanks to everyone whose work landed in v3.8.12: | Contributor | PRs / Issues | | ------------------------------------------------ | --------------------------------------------------------------------------------- | | [@oyi77](https://github.com/oyi77) | #3250, #3259, #3280, #3285, #3286 | | [@wilsonicdev](https://github.com/wilsonicdev) | #3249, #3268, #3282 / #3283 (co-author, #3247 diagnosis), #3287 | | [@strangersp](https://github.com/strangersp) | #3261 | | [@MikeTuev](https://github.com/MikeTuev) | #3248 | | [@leninejunior](https://github.com/leninejunior) | #3271 | | [@zhiru](https://github.com/zhiru) | #3274 | | [@diegosouzapw](https://github.com/diegosouzapw) | maintainer — #3256, #3263, #3270, #3275, #3277, #3278, #3279, #3281, #3284, #3289 | --- ## [3.8.11] — 2026-06-05 ### ✨ New Features - **theoldllm:** add The Old LLM — a free, Playwright-backed provider with dual-mode operation (cached browser token + direct fetch) bridged through a Vercel relay (#3217 — thanks @oyi77) - **codex:** add Codex login via OpenAI's browser-driven device authorization flow, exposed as a shareable "Adicionar Externo" public link (`/connect/codex/{token}`) so a third party can complete the OpenAI device login without dashboard access (#3195 — thanks @zhiru) - **proxy:** per-connection proxy distribution — `proxy_enabled` DB schema + Zod-validated resolution backend, automatic proxy-fallback selection when provider validation hits a network error, and a dashboard UI with per-connection toggles and a tag-filtered "Distribute Proxies" button (#3170, #3171, #3172 — thanks @pizzav-xyz) - **api:** `/v1/images/generations` and `/v1/images/edits` now resolve a bare combo/alias model name (e.g. `image`) to its single image target, and `/v1/images/edits` forwards multipart edits to custom OpenAI-compatible providers' `{base_url}/images/edits` (also accepting JSON/data-URL edit input) instead of rejecting everything but chatgpt-web (#3214, #3215 — thanks @ngocquynh85) ### 🔧 Bug Fixes - **api:** combo names sent to `/v1/responses` are no longer force-rewritten to `codex/` — the Codex CLI WS→HTTP fallback rewrite now skips bare names that are combos, so combos (e.g. `n8n-text`, `paid-premium`) route correctly again instead of failing with "No credentials for provider: codex" (regression since v3.8.9) (#3227, #3233 — thanks @Marcus1Pierce, @Dima-Kal) - **antigravity:** the `agy` `gemini-3.1-pro-high`/`-low` models now alias to the plain `gemini-3.1-pro` upstream id (the `-high`/`-low` suffix is rejected for gemini-3.x), and non-streaming upstream 4xx/5xx errors surface as real error bodies instead of being masked as an empty `chat.completion` envelope (#3229) - **auth:** honor the effective `REQUIRE_API_KEY` feature flag (DB override > env > default) in client API auth instead of reading `process.env` directly, and align the route-local optional-auth checks (`/v1/embeddings`, `/v1/web/fetch`, `/v1/combos`, playground) with it (#3188 — thanks @xz-dev) - **oauth:** use `api.anthropic.com` for the Claude OAuth token exchange so self-hosted VPS deployments are no longer blocked by Cloudflare Bot Management on `console.anthropic.com` (#3203, fixes #3192 — thanks @wilsonicdev; the same root cause was independently diagnosed by @ibanunmangun in [#3193](https://github.com/diegosouzapw/OmniRoute/pull/3193), credited here as co-author) - **oauth:** validate OAuth client IDs against `resolvePublicCred` so adding an Antigravity / Gemini CLI / AGY connection with the built-in public client no longer fails with a Google `redirect_uri_mismatch` (#3206 — thanks @juandisay) - **auto-combo:** include zero-config OpenCode Free in `auto/*` virtual combos even with no `provider_connections` row, reusing the synthetic `noauth` connection id and routing through the `oc/` prefix (#3189, fixes #3155 — thanks @wilsonicdev) - **sse:** refine Kimi thinking-block handling and add regression tests for assistant tool-call replay (#3191 — thanks @bypanghu) - **openrouter:** report the true upstream `context_length` for passthrough models instead of the 128K default — `normalizeDiscoveredModels` now reads `context_length`/`top_provider.context_length` (and `max_completion_tokens` for output) when `inputTokenLimit` is absent (#3202 — thanks @pulyankote) - **images:** custom image-generation providers now use the provider node's base URL (`providerSpecificData.baseUrl`) and resolve the `prefix/model` form, instead of silently falling back to the Gemini endpoint (#3205 — thanks @ngocquynh85) - **docker:** the container healthcheck now probes `127.0.0.1`/`localhost`/`::1` and prints the failure to stderr instead of swallowing it, fixing false "unhealthy" status when the server binds to a non-loopback address (#3151 — thanks @naimo84) - **docker:** copy `scripts/dev/healthcheck.mjs` into the runner-base image — the Next.js standalone output doesn't trace it, so the `HEALTHCHECK CMD ["node", "healthcheck.mjs"]` probe silently exited 1 (#3201 — thanks @wilsonicdev) - **llama-cpp:** fall back to the provider's local default base URL (`127.0.0.1:8080/v1`) when a local connection has no base URL set, instead of silently routing to OpenAI (residual of #3136) (#3197 — thanks @tjengbudi) - **provider-models:** allow deleting synced/fetched models (e.g. llama-cpp) via `DELETE /api/provider-models` — the handler now clears the `syncedAvailableModels` namespace, not just `customModels` (#3204, fixes #3199 — thanks @wilsonicdev); and a deleted synced model now stays deleted across an auto-fetch re-import (the DELETE marks it hidden and the re-import skips hidden ids) (#3199 — thanks @tjengbudi) - **db/electron:** fix `Cannot find module 'better-sqlite3'` crash when importing a database backup in the packaged Electron app (Windows installer) — the `db-backups/import` route now opens its integrity-check DB through the resilient driver factory (better-sqlite3 → node:sqlite → sql.js) instead of a static native import that is stripped from the standalone server bundle; a guard test prevents any API route from reintroducing a direct native import (#3025 — thanks @yeardie) - **dashboard:** the home provider-topology graph now shows the friendly provider name instead of the internal UUID for custom providers — the label precedence let `getProviderConfig`'s `{ name: providerId }` fallback shadow the pre-resolved name (#3198 — thanks @tjengbudi) - **providers:** NVIDIA key validation now probes the universally-available `meta/llama-3.1-8b-instruct` instead of the catalog's first model (`z-ai/glm-5.1`), which requires the "Public API Endpoints" account permission and could hang/be DEGRADED — making a valid key fail with a misleading "Upstream Error" (#3116 — thanks @miracuves) - **providers:** NVIDIA NIM key validation no longer times out (504) — the probe bypasses the global undici `fetch` proxy patch (`open-sse/utils/proxyFetch.ts`) that is incompatible with NVIDIA's endpoint and made the request hang silently (#3226 — thanks @miracuves) - **dashboard:** corrected two misleading provider credential hints — Grok Web now states both `sso` and `sso-rw` cookies are required (was just `sso`), and the Vertex AI Service Account field shows real instructional placeholder text instead of an untranslated stub across 40 locales (#3180, #3091 — thanks @YoursSweetDom, @Guru01100101) - **i18n:** normalize dotted `compliance.eventTypes` keys into nested objects at load time so next-intl no longer throws `INVALID_KEY: Namespace keys cannot contain "."` (the same PR also corrects the Codex import-auth provider hint) ([#3185](https://github.com/diegosouzapw/OmniRoute/pull/3185) — thanks @zhiru; the same i18n bug was independently fixed by @androw in [#3167](https://github.com/diegosouzapw/OmniRoute/pull/3167), credited here as co-author) - **usage:** route the `agy` provider's quota through the existing Antigravity usage implementation (register `agy` in `USAGE_FETCHER_PROVIDERS`, all four `getUsageForProvider` call sites + `parseQuotaData` + `syncAntigravitySubscriptionIfNeeded`) so it no longer falls through to "Usage API not implemented" (#3232, fixes #3230 — thanks @wilsonicdev) - **cli:** show OpenCode Free in the Hermes Agent model picker even with no active connection — new optional `alwaysIncludeProviders` prop on `ModelSelectModal` (defaults to `[]`, so other callers are unaffected) lets zero-config providers like `opencode` surface in the grouped list (#3240 — thanks @wilsonicdev) - **gemini:** refresh the Gemini (AI Studio) static fallback so the provider tab exposes current 3.x / 2.5 models on first run, preserving the `gemini-2.0-flash` default ordering; the full catalog still comes from API sync once a key is added (#3241, fixes #3231 — thanks @wilsonicdev) ### 📝 Maintenance - **build:** finish the build-output-isolation cleanup — `assembleStandalone.mjs` now derives both its async (`syncStandalone*`) and sync copy paths from a single `NATIVE_ASSET_ENTRIES`/`EXTRA_MODULE_ENTRIES` source of truth (previously two hand-maintained lists that could silently drift), guarded by a new parity test; and the `Dockerfile` drops 5 redundant per-module `COPY` overrides (`@swc/helpers`, `pino-abstract-transport`, `pino-pretty`, `split2`, `migrations`) now that `assembleStandalone` bundles them into the standalone regardless of NFT/Turbopack tracing (validated with a real Turbopack `docker build` + boot → `/api/monitoring/health` 200; `better-sqlite3` stays explicit since only its native `build/` is synced) (#3187 — thanks @diegosouzapw) - **combo:** add a regression guard asserting the same-provider cascade is short-circuited by the connection-cooldown layer (#3200 — thanks @diegosouzapw) - **repo:** housekeeping — ignore the generated `coverage/` output dir and prune deprecated `.agents/skills/*` SKILL definitions superseded by the current workflow skills (thanks @diegosouzapw) ### 🙌 Contributors Thanks to everyone whose work landed in v3.8.11: | Contributor | PRs / Issues | | -------------------------------------------------- | ----------------------------------------------- | | [@wilsonicdev](https://github.com/wilsonicdev) | #3189, #3201, #3203, #3204, #3232, #3240, #3241 | | [@pizzav-xyz](https://github.com/pizzav-xyz) | #3170, #3171, #3172 | | [@zhiru](https://github.com/zhiru) | #3185, #3195 | | [@oyi77](https://github.com/oyi77) | #3217 | | [@miracuves](https://github.com/miracuves) | #3116, #3226 | | [@ngocquynh85](https://github.com/ngocquynh85) | #3205, #3214, #3215 | | [@xz-dev](https://github.com/xz-dev) | #3188 | | [@bypanghu](https://github.com/bypanghu) | #3191 | | [@juandisay](https://github.com/juandisay) | #3206 | | [@tjengbudi](https://github.com/tjengbudi) | #3197, #3198, #3199 | | [@naimo84](https://github.com/naimo84) | #3151 | | [@yeardie](https://github.com/yeardie) | #3025 | | [@pulyankote](https://github.com/pulyankote) | #3202 | | [@YoursSweetDom](https://github.com/YoursSweetDom) | #3180 | | [@Guru01100101](https://github.com/Guru01100101) | #3091 | | [@androw](https://github.com/androw) | #3167 (co-author) | | [@ibanunmangun](https://github.com/ibanunmangun) | #3193 (co-author) | | [@diegosouzapw](https://github.com/diegosouzapw) | maintainer — #3187, #3200, issue-fix batches | --- ## [3.8.10] — 2026-06-04 OAuth resilience & observability release: spaced/sequential quota sync for OAuth accounts, a per-provider proactive-refresh skip list to keep short-TTL providers (Kimi) alive without re-exposing the Codex Auth0 cascade, token-expiry visibility on the provider cards, a new provider-stats dashboard, plus a wide batch of provider fixes (DeepSeek-web tool calls, Antigravity, Qoder, MiniMax, GitHub Copilot, Fireworks, llama.cpp, t3.chat-web, Kiro, Kilocode) and Podman deployment support. ### ✨ New Features - **dashboard:** new Provider Stats page + `/api/provider-stats` endpoint — per-provider and per-model aggregates from `call_logs` plus live combo/telemetry/tool-latency overlays. (#3175 — thanks @pizzav-xyz / @diegosouzapw) - **metrics:** cross-request TTFT and gap-after-tool-call latency tracking, aggregated per provider. (#3173 — thanks @pizzav-xyz / @diegosouzapw) - **quota:** show the OAuth token expiry on provider cards (small, blue, informative — "Token expires in …" / "Token expired"). (#3178 — thanks @diegosouzapw) - **responses:** strip `previous_response_id` for stateless Responses upstreams, with an auto/strip/preserve setting + UI so stateless clients (e.g. VS Code Custom Endpoint) keep context. (#3143 — thanks @JxnLexn) - **deploy:** Podman/rootless deployment support (contrib units + `CONTAINER_HOST` hint) and larger upload body-size limits for `/v1/files`. (#3128 — thanks @hartmark) ### 🔧 Bug Fixes - **usage:** sequential + spaced OAuth quota sync (`PROVIDER_LIMITS_SYNC_SPACING_MS`) so a host no longer bursts simultaneous usage/refresh requests; reactive forced re-mint after a 401 on the per-card refresh (recovers imported accounts); a genuine 401 now surfaces a re-authenticate hint. (#3156 — thanks @diegosouzapw) - **healthcheck:** per-provider proactive-refresh skip list (`OMNIROUTE_HEALTHCHECK_SKIP_PROVIDERS`) — keep rotating-cascade providers (Codex/OpenAI) reactive-only while short-TTL providers (Kimi-coding) keep refreshing proactively. (#3159 — thanks @diegosouzapw) - **providers:** on `?refresh=true` with no remote models, don't resurface the just-cleared synced cache into the local-catalog fallback. (#3181 — thanks @diegosouzapw) - **providers:** use synced models as the authoritative local catalog across all providers (even on connections that didn't run the sync). (#3148 — thanks @herjarsa) - **web-tools:** parse bare-JSON tool calls for DeepSeek-web with fuzzy tool-name matching scoped to the requested tools. (#3157 — thanks @wilsonicdev) - **responses:** normalize `image_url` parts across every Responses input path (message content, replayed output items, `function_call_output`) to avoid upstream 400s. (#3150 — thanks @wilsonicdev) - **antigravity:** dynamic upstream model resolution via the MITM alias table (server-only executor), with a guard against corrupted alias values. (#3144 — thanks @herjarsa) - **qoder:** bifurcate validation by token type — PAT (`pt-`) → Cosy, regular API key → dashscope — matching the executor's routing. (#3149 — thanks @herjarsa) - **api-manager:** preserve API key expiration in local time (the `datetime-local` input no longer silently shifts to UTC) + a clear button. (#3146 — thanks @xz-dev) - **opencode-plugin:** map `caps.thinking → ModelV2.capabilities.interleaved` for single models and combos. (#3138 — thanks @mrmm) - **kiro:** optional `targetProvider` on the social-OAuth exchange so Kiro-based providers can reuse the social login flow. (#3176 — thanks @pizzav-xyz) - **misc:** broaden the DeepSeek reasoning-replay regex (`-free` / `zen/deepseek-v4`), export `ProviderProfile`, and guard a non-string directory entry in the binary manager. (#3177 — thanks @pizzav-xyz) - **providerRegistry:** point kilocode at the OpenAI format + default executor (matching its sibling `kilo-gateway`). (#3166 — thanks @androw) - **fireworks:** preserve fully-qualified router/model IDs so Fire Pass router IDs (`accounts/fireworks/routers/...`) are no longer double-prefixed into an upstream 404. (#3133 — thanks @KooshaPari) - **llama-cpp:** route requests to the configured local baseUrl instead of OpenAI's API (which returned an OpenAI-worded 401). (#3136 — thanks @tjengbudi) - **t3-chat-web:** parse cookies + convexSessionId from the single stored credential so t3.chat web connections work (the executor previously read fields the credential pipeline never produced). (#3007 — thanks @minhtran162) - **minimax:** stop capping MiniMax-M3 / MiniMax-M2.7 `max_tokens` at the 8192 default — add the M3 model spec (512K output) and make model-spec lookups case-insensitive. (#3141 — thanks @totaltube) - **github-copilot:** discover the model catalog live from `api.githubcopilot.com/models` so Import Models refreshes and only entitled models are listed (with fallback to the static catalog). (#3120, #3121 — thanks @gabrielmoreira) - **combo:** invalidate the nested-combo cache on combo edits so removed targets/models stop being served within the 10s window; log the resolved DATA_DIR at startup to diagnose multi-replica volume mismatches. (#3147 — thanks @ViFigueiredo) - **providers:** resolve web-provider alias collisions. (thanks @diegosouzapw) ### 📝 Maintenance - **deps:** bump hono from 4.12.18 to 4.12.23. (#3179 — thanks @dependabot) - **ci(electron):** make the macOS-arm64 smoke step best-effort (headless GPU crash). (#3137 — thanks @diegosouzapw) - **chore(release):** open the v3.8.10 development cycle. (thanks @diegosouzapw) --- ## [3.8.9] — 2026-06-03 ### ✨ New Features - **Obsidian context source — 24 MCP tools** (`read:obsidian` / `write:obsidian`) — search, read, write, and bidirectional sync against a local Obsidian vault via the [Local REST API community plugin](https://github.com/obsidianmd/obsidian-local-rest-api). Dashboard "Context Sources" tab, settings API, DB config. (#3077 — thanks @branben) - **cursor:** vision (`image_url`) input for the Cursor provider — OpenAI image parts are encoded as `SelectedContext.selected_images[]` in the `agent.v1` protobuf, plus a tool-commit directive (lifts composer-2.5's tool-call rate), `tool_choice` none/required/specific handling, and `response_format`/`max_tokens`/`stop` output constraints surfaced to the agent. Hardened with SSRF + DNS-rebinding guards, a 1 MiB pre-decode cap, and a protobuf length-overrun check. (#3104 — thanks @payne0420) - **deepseek-web:** opt-in persistent session + rolling-window conversation memory (`persistSession`, `historyWindow` per-connection settings) and bidirectional tool-call translation — tool schemas are injected as a system prompt and `{…}` blocks in the reply are parsed back into OpenAI `tool_calls` (replacing the old hard `400`). ([#2942](https://github.com/diegosouzapw/OmniRoute/issues/2942), [#2820](https://github.com/diegosouzapw/OmniRoute/issues/2820)) - **i18n:** Turkish locale-aware search & sorting — a `turkishText` helper (`normalizeForSearch`, `matchesSearch`, `compareTr`) folds the dotted/dotless İ/ı correctly and uses `Intl.Collator("tr")`, wired across dashboard search/sort call-sites with an ESLint guard (warn) against raw `toLowerCase().includes()`. (#3115 — thanks @osrt91) - **kiro:** add Claude Opus 4.8 to the Kiro (AWS CodeWhisperer) model catalog — Kiro previously topped out at Opus 4.7 even though Opus 4.8 was already defined and served by the `claude` provider. (#3131 — thanks @artickc) ### 🔧 Bug Fixes - **sse:** stop 502'ing streaming requests when a "reasoning" openai-compatible upstream ignores `stream:true` and returns a complete `application/json` body — the streaming readiness check only recognized SSE `data:` frames, so such a JSON body (even with valid `content`/`reasoning_content`) produced a spurious `STREAM_EARLY_EOF`. OmniRoute now detects a non-SSE JSON upstream body on the streaming path and synthesizes an equivalent OpenAI SSE stream (`synthesizeOpenAiSseFromJson`), preserving content + reasoning_content. ([#3089](https://github.com/diegosouzapw/OmniRoute/issues/3089)) - **cache:** serve semantic-cache hits as SSE for streaming clients — a cache hit returned `application/json` regardless of the `stream` flag, so OpenAI-compatible streaming clients lost `reasoning_content` (and got a non-stream body) on cached responses. Stream requests now SSE-wrap the cached completion. ([#2952](https://github.com/diegosouzapw/OmniRoute/issues/2952)) - **i18n:** fill the missing Chinese (zh-CN) and Russian (ru) UI translations — both locales were missing 9 entire sections (`quotaPlans`, `activity`, `agentBridge`, `trafficInspector`, `cliCommon`, `cliCode`, `cliAgents`, `acpAgents`, `agentSkills`, ~823 keys each) added after the last translation sweep, so those buttons/labels rendered in English. Both catalogs are now at full key parity with `en.json` (8025 keys). ([#3026](https://github.com/diegosouzapw/OmniRoute/issues/3026), [#3067](https://github.com/diegosouzapw/OmniRoute/issues/3067)) - **dashboard:** fix "Ambiguous model" error in the provider Playground for vendor-namespaced models — the Playground only prefixed models without a `/`, so ids like `moonshotai/kimi-k2.6` or `nvidia/zyphra/zamba2-7b-instruct` (NVIDIA NIM) were sent bare and rejected when the same id exists under multiple providers. The Playground now always qualifies the selected model with its `providerId/` prefix (without double-prefixing). ([#3050](https://github.com/diegosouzapw/OmniRoute/issues/3050)) - **db:** stop accepting duplicate API keys for the same provider — `createProviderConnection` now dedups by the decrypted key value (not just by name), so re-adding the same key under a different/blank name updates the existing connection instead of inserting a second row. Whitespace-only differences also dedup. ([#3023](https://github.com/diegosouzapw/OmniRoute/issues/3023)) - **dashboard:** "Import from /models" now works for no-auth providers (e.g. OpenCode Free) — the button used to silently no-op because no-auth providers have no connection row, so `handleImportModels` returned early and the models route 404'd. The route now serves the provider's model catalog when called with a no-auth provider id, and the dashboard falls back to the provider id when there is no connection. ([#3047](https://github.com/diegosouzapw/OmniRoute/issues/3047)) - **providers:** forward Grok's paired `sso-rw` cookie for grok-web — both the executor and the connection validator now send `sso=…; sso-rw=…` (via the new `buildGrokCookieHeader` helper) when the pasted blob carries `sso-rw`, fixing the `403` _"Request rejected by anti-bot rules"_ that Grok returns for `sso` alone. The add-account hint now asks for the full cookie line. ([#3063](https://github.com/diegosouzapw/OmniRoute/issues/3063)) - **providers:** fix claude-web persistent 403 — `execute()` was calling the synchronous `normalizeClaudeSessionCookie()` which never injects `cf_clearance`; changed to async `normalizeClaudeSessionCookieWithAutoRefresh()` with `allowAutoSolve:true`. Also removes dead executor `claude-web-auto-refresh.ts` and correctly reclassifies `duckduckgo-web` and `veoaifree-web` as `NOAUTH_PROVIDERS`. (#3090 — thanks @oyi77) - **autoCombo:** rotate across all provider connections, never waste capacity — `buildAutoCandidates` now expands each provider into one candidate per active connection (e.g. 43 Cerebras keys → 43 candidates). Adds `ScoreTierRotator` with per-combo round-robin state, combo-name-aware tier preferences (smart/fast/cheap/coding), `connectionDensity` factor (weight 0.05), and budget-cap degradation using the rotator. (#3078 — thanks @oyi77) - **providers:** fix SiliconFlow model sync from configured endpoint — routes model discovery through `providerSpecificData.baseUrl` so CN (`api.siliconflow.cn`) vs Global endpoint selection is respected, and prevents `/sync-models` from treating `source: "local_catalog"` fallback responses as successful remote syncs. (#3094 — thanks @xz-dev) - **resilience:** a per-model subscription/permission `403` from a passthrough provider (e.g. Ollama Cloud `deepseek-v4-pro` → _"this model requires a subscription"_) now locks out **only that model** instead of cooling down the whole connection — the free models on the same key keep serving, and repeated paid-model 403s no longer escalate a connection-wide backoff. Generalizes the grok-web 403 precedent to all `hasPerModelQuota` providers; terminal/credential 403s (banned/deactivated key) still deactivate the connection. ([#3027](https://github.com/diegosouzapw/OmniRoute/issues/3027)) - **cache:** preserve client-side `cache_control` breakpoints for Xiaomi MiMo — added `xiaomi-mimo` to the prompt-caching provider allowlist so Claude Code (via cc-switch) cache hints are no longer stripped by the OpenAI-format translator, restoring cache hits. ([#3088](https://github.com/diegosouzapw/OmniRoute/issues/3088)) - **tools:** keep opaque object schemas open — empty object schemas (and the `web_search` passthrough shim) now get `additionalProperties: true` so GPT-5.5/Codex stop pruning untyped nested payloads (e.g. `SPLOX_EXECUTE_TOOL.args`). (#3097 — thanks @nmime) - **codex:** preserve native Responses passthrough tools and history — `tool_search` and `custom` tools (e.g. `apply_patch`) survive `normalizeCodexTools`, and `phase:"commentary"` history items are kept, only on the native passthrough path (`_nativeCodexPassthrough`). (#3107 — thanks @yinaoxiong) - **responses:** resolve bare ChatGPT model ids (e.g. `gpt-5.5`) to `codex/…` on the `/v1/responses` HTTP fallback path, fixing the Codex CLI WS→HTTP fallback that was routing to a credential-less provider (#3113). - **sse:** bound the Antigravity 429 short-retry loop (per-URL `MAX_AUTO_RETRIES` guard — no more infinite loop on a persistent 429) and lock quota-exhausted accounts for the full "Resets in XhYmZs" window via model lockout. (#3122 — thanks @ahmet-cetinkaya) - **image-gen:** add an AbortController timeout to `fetchImageEndpoint` so a stuck image provider surfaces a `504` instead of hanging until the server timeout. (#3105 — thanks @mgarmash) - **logs (perf):** fix browser freeze and network saturation on `/dashboard/logs` — smaller page size, 15s polling, pause polling on a hidden tab / past the first page, and memoized derived lists. (#3109 — thanks @0xtbug) - **cli:** handle Windows `.exe` healthchecks with spaces in the path — direct executables skip the shell (so `cmd.exe` doesn't split `C:\…\Name With Spaces\…\claude.exe`) while `.cmd`/`.bat` wrappers still run through it. (#3111 — thanks @EmpRider) - **cli:** don't write `STORAGE_ENCRYPTION_KEY` to `.env` on informational commands — `omniroute --version`/`--help` no longer generate a key or create `~/.omniroute/.env`; provisioning is scoped to commands that actually touch encrypted storage (#3129). - **tests:** remove a stale lowercase `db-apikeys-crud.test.ts` duplicate that collided with the canonical `db-apiKeys-crud.test.ts` on case-insensitive filesystems (no coverage lost). (#3125 — thanks @juandisay) - **kimi:** add a dedicated `KimiExecutor` so Kimi thinking-mode responses no longer drop `reasoning_content` — the reasoning stream is now surfaced instead of being lost. (#3132 — thanks @bypanghu) - **handler:** provide a `connectionId` fallback when it is undefined, fixing kilo (kilocode) calls that were silently not being written to `call_logs`. (#3130 — thanks @androw) ### 🔧 Build - **build-output-isolation:** unified standalone assembly into one shared `assembleStandalone` module; isolated build output into `.build/` (intermediates, gitignored) and `dist/` (shippable bundle, gitignored), replacing the old repo-root `app/` and `.next/` directories; dropped the duplicate `next build` that prepublish previously ran; added `build:release` script for a clean rebuild with a `dist/BUILD_SHA` HEAD sentinel that guards against deploying stale bundles. **Operators using custom `app/` paths:** the published bundle directory on the VPS image (`/usr/lib/node_modules/omniroute/app/`) is unchanged — only the in-repo build output path moved. Update any local scripts that reference the repo-local `app/` build output to `dist/` instead. - **build:** re-apply the build-reorg follow-ups that landed after the main refactor merged — the `serve` CLI now falls back from `dist/` to the legacy `app/` location for upgrade safety, and the deploy skills `pm2 stop` before `rsync --delete` to avoid a transient `Cannot find module ./chunks/…` race (#3127). - **build:** fix the standalone static-asset path so the dashboard renders after the build-output reorg — `assembleStandalone` was copying `static/` into `/.next/static`, but the standalone server (built with `distDir=.build/next`) serves `/_next/static` from `/.build/next/static`, so every JS/CSS chunk 404'd and the login UI rendered as a blank page. The static (and `required-server-files.json` / Turbopack chunk) destinations are now derived from the configured `distDir` instead of a hard-coded `.next`. ### 📦 Dependencies - **electron:** bump to 42.3.2 (crash fix desktopCapturer, Chromium 148.0.7778.218, ThinLTO perf) (#3083) - **electron-updater:** bump to 6.8.8 (security: harden auto-update flow against path traversal and env var intercepts) (#3084) - **electron-builder:** bump to 26.14.0 (security hardening, pure-JS blockmap/icon migration) (#3082) - **dev deps:** bump eslint-config-next 16.2.7, lint-staged 17.0.7, typescript-eslint 8.60.1, vitest 4.1.8 (#3086) - **prod deps:** bump next 16.2.7, react/react-dom 19.2.7, tsx 4.22.4, ws 8.21.0, parse5 8.0.1, commander 15.0.0, and 15 other packages (#3085) ### 🙌 Contributors Huge thanks to everyone whose work shipped in v3.8.9: @branben (Obsidian context source), @oyi77 (claude-web 403 fix, autoCombo connection rotation), @xz-dev (SiliconFlow model sync), @nmime (open opaque tool schemas), @payne0420 (Cursor vision input), @mgarmash (image-gen fetch timeout), @yinaoxiong (Codex native passthrough tools/history), @0xtbug (logs page perf), @EmpRider (Windows CLI healthcheck paths), @ahmet-cetinkaya (Antigravity 429 retry bound + quota lockout), @juandisay (duplicate test cleanup), @osrt91 (Turkish locale-aware search & sorting), @artickc (Kiro Opus 4.8 catalog), @bypanghu (Kimi thinking-mode reasoning_content fix), and @androw (connectionId fallback + kilo call logging). And thank you to the OmniRoute community for the bug reports, reproductions, and testing that drove these fixes. 🎉 --- ## [3.8.8] — 2026-06-03 ### Added - **Plugins framework** (`src/lib/plugins/`, `/api/plugins/*`, `/dashboard/plugins`) — hooks + registry unification, plugin SDK (`definePlugin`), worker-thread sandbox, per-plugin hook rate limiting, SHA-256 integrity verification, semver-gated upgrade, and execution analytics. Plugin routes are loopback-only (`isLocalOnlyPath`) and `child_process` exec is opt-in via `OMNIROUTE_PLUGINS_ALLOW_EXEC`. (#2913 / #3041 — thanks @oyi77) - **Plugin system: response-hook wiring + startup load + example plugin** — wires the plugin `onResponse` hook into the chat success path, loads active plugins on server startup so they survive restarts (`pluginManager.loadAll()` in `server-init`), and ships a `welcome-banner` example plugin (`examples/plugins/`) plus a comprehensive plugin test suite. (#3045 — thanks @oyi77) - **API key option: disable non-published models** — a per-key flag restricting the key to discovered, public models (combos / `auto/*` / `qtSd/*` routing still allowed). (#3017 — thanks @androw) - **SessionPool — modular & provider-agnostic** (`open-sse/services/sessionPool/`) — pooled cookie/session manager with round-robin fingerprint rotation (distinct fingerprint per pooled session), per-session cooldown/backoff, and a provider-agnostic `webExecutorWrapper`. Adds pool support for DuckDuckGo Web and LLM7 providers and an MCP `poolTools` toolset. (#2954 / #2978 — thanks @oyi77) - **AgentBridge** (`/dashboard/tools/agent-bridge`) — MITM proxy consolidating 9 IDE agents (Antigravity, Kiro, GitHub Copilot, OpenAI Codex, Cursor IDE, Zed Industries, Claude Code, Open Code, Trae stub) with server card, per-agent setup wizard, model mapping table, bypass list, upstream CA cert support, and redirect from legacy `/dashboard/system/mitm-proxy`. See `docs/frameworks/AGENTBRIDGE.md`. (#2858 — thanks @diegosouzapw) - **Traffic Inspector** (`/dashboard/tools/traffic-inspector`) — LLM-aware HTTPS debugger with 4 capture modes (AgentBridge hook, Custom Hosts DNS, HTTP_PROXY :8080, System-wide proxy), DevTools split UI, 7 detail tabs (Conversation, Headers, Request, Response, Timing, LLM Details, Stats), resizable panels, session recording (.har/.jsonl export), SSE stream merger, conversation normalizer (multi-provider), system-prompt fingerprint colorization, and annotations. See `docs/frameworks/TRAFFIC_INSPECTOR.md`. - **MITM handler base + 9 agent handlers** (`src/mitm/handlers/`) — `MitmHandlerBase` abstract class with `hookBufferStart`/`hookBufferUpdate` for Traffic Inspector integration; concrete handlers for all 9 agents. - **MITM targets registry** (`src/mitm/targets/`) — declarative `MitmTarget` shape per agent; emits `DATA_DIR/mitm/targets.json` for dynamic `server.cjs` resolution. - **Traffic Inspector core** (`src/mitm/inspector/`) — `TrafficBuffer` in-memory ring, `kindDetector`, `sseMerger` (MIT port from chouzz/llm-interceptor), `conversationNormalizer` (MIT port), `contextKey` fingerprinting, `httpProxyServer`, `systemProxyConfig`. - **AgentBridge passthrough + bypass** (`src/mitm/passthrough.ts`) — TCP tunnel for non-mapped hosts; bypass list with default sensitive-host patterns + user-defined patterns. - **Upstream CA cert** (`src/mitm/upstreamTrust.ts`) — `AGENTBRIDGE_UPSTREAM_CA_CERT` for corporate TLS environments. - **Secret masking** (`src/mitm/maskSecrets.ts`) — sk-/Bearer/generic token masking before any log or Traffic Inspector broadcast. - **DB migrations 073–075** — `agent_bridge_state`, `agent_bridge_mappings`, `agent_bridge_bypass`, `inspector_custom_hosts`, `inspector_sessions`, `inspector_session_requests`. - **~28 API routes** under `/api/tools/agent-bridge/` (12 routes) and `/api/tools/traffic-inspector/` (16+ routes). All LOCAL_ONLY + SPAWN_CAPABLE. - **i18n** PT-BR + EN for all new keys in `agentBridge.*` and `trafficInspector.*` namespaces; all other locales fall back to EN automatically. - **E2E smoke tests** — `tests/e2e/agent-bridge.spec.ts`, `tests/e2e/traffic-inspector.spec.ts`, `tests/e2e/agent-bridge-traffic-cross.spec.ts` (skip-gated on CI by `RUN_AGENT_BRIDGE_E2E` / `RUN_TRAFFIC_INSPECTOR_E2E` / `RUN_CROSS_E2E`). - **Documentation** — `docs/frameworks/AGENTBRIDGE.md` and `docs/frameworks/TRAFFIC_INSPECTOR.md`; `docs/architecture/REPOSITORY_MAP.md` updated; `docs/openapi.yaml` updated with ~28 new routes and 20+ new schemas. - **i18n:** translate Ukrainian (uk-UA) menu and UI strings, plus complete uk-UA UI coverage (#2981 / #2988 — thanks @Lion-killer) - **providers:** add SiliconFlow endpoint selector (#2975 — thanks @xz-dev) - **oauth:** add Trae SOLO provider (work/code modes) (#2964 — thanks @S0yora) - **providers:** add Qwen Web (chat.qwen.ai) web-cookie provider (#2947 — thanks @oyi77) - **Quota Share Engine — multi-provider quota pools** — Monitoring/Costs reorg plus a Quota Share Engine: group selector, grouped pool cards, exclusive-quota API keys (`allowedQuotas`), `quotaShared-*` routing models via combos, a 3-step pool wizard (legacy Plans page retired), endpoint + key preview, and full pool editing. Adds quota-pool DB migrations. (#2859 / #3022 / #3032 — thanks @diegosouzapw) - **Dashboard page redesigns (Nav Restructure)** — agent-skills + omni-skills with a dynamic 42-skill catalog and MCP/A2A discovery (#2827); CLI Code's + CLI Agents + ACP Agents pages (#2839); translator friendly redesign, 5 tabs → 2 (#2847); functional `/batch` + `/batch/files` redesign (#2849); Playground Studio + Search Tools Studio (#2869); memory engine redesign — sqlite-vec + hybrid RRF + Studio UI (#2873). (thanks @diegosouzapw) - **notion:** add Notion as an MCP context source — 6 tools (`notion_search`, `notion_list_databases`, `notion_get_database`, `notion_query_database`, `notion_read`, `notion_append_blocks`) scoped under `read:notion` / `write:notion`, with dashboard "Context Sources" tab, settings API, and token persistence in `key_value` table (#2959 — thanks @branben) - **Per-API-key stream default mode** — a per-key setting that forces JSON or SSE as the default response shape (migration `077_api_key_stream_default_mode`), so integrations that expect non-streaming JSON work without client changes. (thanks @JxnLexn) - **Codex Responses-over-WebSocket** — opt-out flag `OMNIROUTE_CODEX_WS_ENABLED` (default ON) upgrading Codex Responses traffic to a WebSocket bridge with a clean handshake and bridge-secret auth; the Quota Share endpoints card now surfaces the Responses + codex-WS endpoints. (thanks @diegosouzapw) - **Xiaomi MiMo usage tracking** — self-reported usage accounting for Xiaomi MiMo plus a monthly cap preset; DeepSeek USD preset and a Claude plan preset (percent 5h + weekly) seeded into the plan registry. (thanks @diegosouzapw) - **API Manager: Normal vs Quota key sections** — the API keys screen now splits keys into Normal and Quota sections in a compact 2-table layout, and the Quota Share screen gains a beta banner, live per-account upstream quota, and a real-time Codex quota view backed by the cascade-safe serialized refresh. (thanks @diegosouzapw) ### Changed - Sidebar Tools group: added `agent-bridge` and `traffic-inspector` items after `cloud-agents`. - `/api/tools/agent-bridge/` and `/api/tools/traffic-inspector/` added to `LOCAL_ONLY_API_PREFIXES` and `SPAWN_CAPABLE_PREFIXES` in `src/server/authz/routeGuard.ts`. - `.env.example`: documented 9 new env vars (`AGENTBRIDGE_UPSTREAM_CA_CERT`, `INSPECTOR_BUFFER_SIZE`, `INSPECTOR_HTTP_PROXY_PORT`, `INSPECTOR_HTTP_PROXY_AUTOSTART`, `INSPECTOR_TLS_INTERCEPT`, `INSPECTOR_SYSTEM_PROXY_GUARD_MINUTES`, `INSPECTOR_MAX_BODY_KB`, `INSPECTOR_MASK_SECRETS`, `INSPECTOR_LLM_HOSTS_EXTRA`, `INSPECTOR_INTERNAL_INGEST_TOKEN`). ### Fixed - **memory:** the `recent` retrieval strategy no longer drops recent memories whose text doesn't overlap the current prompt. It was internally mapped to the `exact` path, which relevance-filtered by the forwarded prompt (`score > 0`), so recency-based injection silently returned nothing for unrelated prompts. The prompt is no longer forwarded for `recent` (semantic/hybrid still use it for vector search). - **combo:** custom-provider credential lookup now expands `provider_nodes` prefixes (e.g. `78code/gpt-5.4`) to the generated internal connection ids during account selection, so combos targeting compatible/custom providers resolve their live credentials instead of failing to find a connection. (#3058) - **build:** Docker image build (`docker compose --profile cli build`, which runs `next build` with Turbopack) no longer errors. Two Turbopack-only failures were fixed: `sqlite-vec` is now externalized so Turbopack stops trying to bundle its native `vec0.so` ("Unknown module type"), and `manager.stub.ts` now exports `getAllAgentsStatus` (statically imported by `/api/tools/agent-bridge/state` — the missing export aborted the build). The webpack-based VM build was unaffected, which is why the deploy validated while the Docker build errored. The sqlite-vec native binary is also now bundled into the standalone output, so vector/semantic memory keeps working in the container instead of silently degrading to FTS5 keyword search. (#3066 — thanks @freefrank) - **codex/providers:** `POST /api/providers/[id]/refresh` (the manual/auto "refresh token" endpoint) no longer rotates rotating-refresh providers (Codex/OpenAI share one Auth0 `client_id`). This was the last unguarded proactive-refresh entry point: when the dashboard auto-refreshed every expiring connection on a page load (or an old cached frontend bulk-called it), each Codex account's single-use refresh_token was rotated, and Auth0 revoked the whole token family (`openai/codex#9648`) — every account but the last died with `[403] `. The quota path now skips proactive refresh for rotating providers (`rotationGroupFor`) and reuses the current access*token, deferring genuine expiry to the reactive, serialized 401 path. Defense in depth: `serializeRefresh` now leaves a settle gap between two \_queued* sibling refreshes (default 2000 ms, tunable via `CODEX_REFRESH_SPACING_MS`, `"0"` to opt out) while releasing a lone refresh immediately, so the reactive path adds no latency. - **payload-rules:** saved payload rules now survive a server restart. When no in-memory override is set (fresh process before the boot hook ran, or a separate module instance in the standalone build), `getPayloadRulesConfig` now reads the DB-persisted rules (the source of truth) before the file config, instead of silently returning the empty file default. (#2986) - **models/custom:** custom models can now carry a per-model `targetFormat` override (e.g. an opencode-go custom model that must use the Anthropic Messages shape). Previously custom models always routed as OpenAI-compatible because `targetFormat` was neither persisted nor consulted at routing time. Threaded through `addCustomModel`/`replaceCustomModels`/`updateCustomModel`, the API schema/route, `getModelInfo`, and chatCore's targetFormat resolution. (#2905) - **providers/pollinations:** route to `gen.pollinations.ai/v1` instead of the retired `text.pollinations.ai` host, which now returns `404 "legacy API"` for all models. The gen gateway is the current OpenAI-compatible endpoint. (#2987) - **executors/codex:** drop the CLI-injected `image_generation` hosted tool for free-plan Codex accounts (`workspacePlanType === "free"`), which can't run it server-side and would otherwise get an upstream 400. Paid plans keep it. (mirrors CLIProxyAPI's free-plan guard; spun off from the #2980 analysis) - **dashboard:** custom providers (`openai-compatible-*` / `anthropic-compatible-*`) now show their user-given node name instead of the raw UUID id across the active-requests panel, proxy logger, and home-page provider topology. The display-label resolver was extracted into a shared util reused by all surfaces (previously only the request-log viewer resolved it). (#2968) - **docker:** the standalone launcher (Docker `CMD`) now honors `OMNIROUTE_MEMORY_MB` (default 512, clamped [64, 16384]) and overrides the image `NODE_OPTIONS` fallback, fixing random OOM crashes under load / with large SQLite DBs. Previously only `omniroute serve` honored the knob. (#2939) - **docker:** add a `web` compose profile (`omniroute-web`, target `runner-web`, image `omniroute:web`) so web-cookie providers (gemini-web, claude-web, claude-turnstile) work out of the box — the default `base` image ships without Chromium/Playwright, which made those providers fail with "Executable doesn't exist at .../ms-playwright/chromium...". (#2832) - **routing/codex:** fix two gpt-5.5 Codex defects (#2877). (A) For a Codex-only account, a bare `gpt-5.5` Responses request was rerouted to codex with the model hardcoded to `gpt-5.5-medium` (`chatHelpers.ts`); the executor read that `-medium` suffix as an explicit `modelEffort` that (per #2331) overrode a client `reasoning.effort=xhigh`, silently demoting it — now it keeps the bare `gpt-5.5` id so the client effort wins. (B) `gpt-5.5-xhigh`/`-high`/`-low` misrouted to `openai` (→ "No credentials" for codex-only users); the suffixed variants are now in `CODEX_PREFERRED_UNPREFIXED_MODELS` so they infer codex. - **sse/chatCore:** remove a duplicate `const settings` declaration in `handleChatCore` (introduced alongside the per-key stream-default-mode feature). The same-scope redeclaration made esbuild/tsx fail with "The symbol 'settings' has already been declared", which turned every unit test that imports chatCore red and broke the production build. The earlier consolidated `settings` const is now reused. - **db/migrations:** resolve a `077` migration version collision (`077_api_key_stream_default_mode.sql` vs `077_quota_pools.sql`) that made `getMigrationFiles()` throw and blocked `getDbInstance()` at startup (app would not boot; every DB-touching test was red). Renumbered the dependency-free, idempotent `quota_pools` migration to `085`, kept the non-idempotent `api_key_stream_default_mode` `ALTER` at `077`, added a retroactive `isSchemaAlreadyApplied` guard (case `085`), and a regression test enforcing unique migration prefixes. - **routing/reasoning-replay:** OpenCode `big-pickle` (provider `opencode`/`oc` and `opencode-zen`) now declares the interleaved `reasoning_content` contract via a new `RegistryModel.interleavedField` field, so follow-up/tool-use turns replay reasoning_content. Previously `big-pickle` matched no replay pattern and failed with `[400] The reasoning_content in the thinking mode must be passed back to the API` (its DeepSeek-thinking upstream is not detectable from the model id, and `requiresReasoningReplay` does not consume `supportsReasoning`). `getResolvedModelCapabilities` now surfaces the registry `interleavedField`. (#2900) - **providers/github-copilot:** built-in GitHub Copilot Claude Opus and Gemini models (`claude-opus-4.7`, `claude-opus-4-5-20251101`, `gemini-3.1-pro-preview`, `gemini-3-flash-preview`) no longer carry `targetFormat: "openai-responses"`, so they route through `chat/completions` (the provider default, like the working `claude-opus-4.6`) instead of the Responses API, which Copilot does not serve for non-OpenAI models (returned `[400]`). Native OpenAI `gpt-*` models keep the Responses API. (#2911) - **translator/responses:** Codex Desktop injects an `image_generation` hosted tool into every Responses API request (even text-only ones), which OmniRoute rejected with `[400] image_generation tool type is not supported`. It is now treated like `tool_search`: allowed past the tool-type validator and dropped silently from the tools array before forwarding to Chat Completions. (#2950) - **combo/builder:** no-auth OpenCode Free combo entries now use the `oc/` routing alias instead of the `opencode/` prefix. `parseModel("opencode/")` resolves to the `opencode-zen` api-key tier (via a manual `ALIAS_TO_PROVIDER_ID` override), so combos built with the bare provider id misrouted away from the no-auth `opencode` provider; `oc/` resolves correctly. (#2901) - **resilience/providers:** a route-restriction `403` (e.g. Fireworks Fire Pass `fpk_*` keys returning "…not authorized for this route." on `/models`, while chat still works) no longer marks the connection unavailable. Provider validation falls through to the chat probe for such 403s instead of returning "Invalid API key", and `checkFallbackError` short-circuits them to no cooldown. Genuine auth failures (401 / generic 403) still fail fast. (#2929) - **auth/opencode-zen:** the OpenCode Zen free model now works in the Playground and combos without an API key. `opencode-zen` serves the public, signup-free endpoint (`https://opencode.ai/zen/v1`); when no api-key connection is configured, credential resolution now falls back to anonymous (no-auth) access instead of failing with "No credentials for provider: opencode-zen". A configured, active key is still used when present. (#2962) - **translator/responses:** fixed an upstream `[400] Messages with role 'tool' must be a response to a preceding message with 'tool_calls'` when a Codex client sent a `function_call` with an empty/missing `call_id`. The orphaned `function_call_output` previously slipped past the orphan filter. Now empty-`call_id` function calls are skipped (no dangling assistant tool_call) and any tool result without a matching tool_call id is dropped. (#2893) - **deps:** remove the `proxifly` npm dependency (#3000 — thanks @terence71-glitch) - **proxy:** use connection proxy for OAuth refresh (#3012 — thanks @terence71-glitch) - **usage:** export pure helper functions for unit testing (#3015 — thanks @oyi77) - **docs/docker:** align memory default docs to 1024MB (#3006 — thanks @terence71-glitch) - **providers:** fix DuckDuckGo missing API key & update OpenCode free model list (#3008 — thanks @NekoMonci12) - **claude:** bump Claude Code identity to 2.1.158 and sync beta flags (#3010 — thanks @Tentoxa) - **test:** increase DB and usage utils coverage to >60% (#3018 — thanks @oyi77) - **oom:** resolve memory leak in Bottleneck limiter caches and provider registry (#2965 — thanks @soyelmismo) - **proxy:** show registry provider proxies in dashboard after Custom proxy flow moved them into the proxy registry (#2963 — thanks @terence71-glitch) - **routing:** add agy to executor map so it uses AntigravityExecutor (#2957 — thanks @ReqX) - **skills:** avoid Claude assistant tool_result blocks (#2956 — thanks @terence71-glitch) - **perf:** CPU leak from Bottleneck limiter accumulation + per-request optimizations (#2951 — thanks @soyelmismo) - **combo:** combo credential resolution ignores target.providerId — prefer combo target's providerId over model-inferred provider (#2946 — thanks @oyi77) - **dashboard:** v3.8.8 screen fixes — agent-bridge SSR + audit/logs/memory/playground (#2944) - **claude:** sanitize tool schemas + cloak third-party tool names on native Claude OAuth (#2943 — thanks @NomenAK) - **auth:** prevent Codex multi-account refresh_token family revocation (#2941) - **combo:** fix combo vision passthrough and Codex tool history repair (#2940 — thanks @charithharshana) - **claude:** map WebSearch to Responses web_search (#2938 — thanks @makcimbx) - **claude:** strip empty Read pages tool input (#2937 — thanks @makcimbx) - **dashboard:** improve self-service provider quota visibility (#2931 — thanks @guanbear) - **antigravity:** avoid visible signatureless tool history (#2927 — thanks @dhaern) - **sse/web-search:** bypass the web-search fallback on a Claude → Claude passthrough so native Claude requests aren't rewritten (#2960 — thanks @terence71-glitch) - **oom:** prevent per-request memory accumulation (~256MB heap growth) (#2973 — thanks @soyelmismo) - **perf/proxy:** parallelize provider proxy overlay lookups (#2984 — thanks @terence71-glitch) - **privacy/PII:** resolve the PII feature flag correctly and fix PII response sanitization in streaming SSE requests (#3021 — thanks @dangeReis) - **electron:** improve macOS window chrome (#3029 — thanks @bobbyunknown) - **i18n:** fix missing API key scope translations (#3031 — thanks @guanbear) - **stream/responses:** drop a leaked chat bootstrap chunk for Responses-API clients (#3035 — thanks @CitrusIce) - **docker:** warn-only on the `/app/data` permission check instead of `exit 1`, so a non-writable bind mount no longer kills the container at boot (#3036 — thanks @wussh) - **mcp:** resolve streamable-HTTP transport readiness reporting an offline status (#3037 — thanks @Chewji9875) - **dashboard:** use a lightweight ping endpoint for the MaintenanceBanner (fixes #3040) (#3043 — thanks @herjarsa) - **test:** resolve pre-existing test failures — env sync, PII, quota, sidebar (#3039 — thanks @oyi77) - **docs/mcp:** regenerate the mcp-tools diagram for 43 tools and fix the tool count (#3028 — thanks @diegosouzapw) - **mcp:** move `enforceScopes` guard before `MCP_TOOL_MAP` lookup, add inline `scopes` parameter to `withScopeEnforcement()`, and declare scopes on all 24 dynamic tool definitions (memory, skills, plugins, gamification, compression) to fix scope enforcement for dynamic MCP tool groups (#2958 — thanks @branben) - **sse/chatCore:** the heap-pressure guard now auto-calibrates its threshold to 85% of the live V8 heap ceiling (floor 400 MB) instead of a fixed 200 MB that sat below the app's ~260 MB baseline and returned `503 Service temporarily unavailable due to resource pressure` for every request once the heap warmed up. It now tracks `--max-old-space-size` across 1 GB / 2 GB / large VPS; `HEAP_PRESSURE_THRESHOLD_MB` still overrides. (#3052) - **proxy:** fail closed for OAuth usage-account proxies (#3051 — thanks @terence71-glitch) - **proxy:** resolve registry proxy assignments for combo and key levels (#3048 — thanks @terence71-glitch) - **providers/web:** wire the session pool for fingerprint rotation on Pollinations / DuckDuckGo (#3049 — thanks @oyi77) - **providers/claude-web:** add `cf_clearance` cookie support and session-pool fingerprint rotation for Pollinations / DuckDuckGo (#3046 — thanks @oyi77) - **usage:** handle MiniMax coding-plan percent quotas (`general`/percent dimension) so MiniMax coding plans report remaining quota correctly. (thanks @diegosouzapw) - **home:** pass `providerId` to the quota widget icons so provider brand icons resolve on the home dashboard (#3064 — thanks @xz-dev) - **quota:** block `qtSd/*` models for keys with no quota-pool allocation (enforcement Check 2.9), and never flag rotating-refresh providers (Codex/OpenAI) as expired during the quota sync (#3030). (thanks @diegosouzapw) ### 🏆 Contributors A special thanks to everyone who contributed to this release — 746 commits since `v3.8.7`: | Contributor | PRs / Contribution | | -------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------- | | [@diegosouzapw](https://github.com/diegosouzapw) | maintainer — AgentBridge, Traffic Inspector, Quota Share Engine, Nav Restructure, Plugins integration, releases & upstream ports | | [@oyi77](https://github.com/oyi77) | #2913, #2947, #2954, #2978, #3015, #3018, #3039, #3041, #3045, #3046, #3049 | | [@terence71-glitch](https://github.com/terence71-glitch) | #2956, #2960, #2963, #2984, #3000, #3006, #3012, #3048, #3051 | | [@soyelmismo](https://github.com/soyelmismo) | #2951, #2965, #2973 | | [@branben](https://github.com/branben) | #2958, #2959 | | [@makcimbx](https://github.com/makcimbx) | #2937, #2938 | | [@guanbear](https://github.com/guanbear) | #2931, #3031 | | [@Lion-killer](https://github.com/Lion-killer) | #2981, #2988 | | [@JxnLexn](https://github.com/JxnLexn) | per-API-key stream default mode | | [@androw](https://github.com/androw) | #3017 | | [@xz-dev](https://github.com/xz-dev) | #2975, #3064 | | [@S0yora](https://github.com/S0yora) | #2964 | | [@NekoMonci12](https://github.com/NekoMonci12) | #3008 | | [@Tentoxa](https://github.com/Tentoxa) | #3010 | | [@ReqX](https://github.com/ReqX) | #2957 | | [@NomenAK](https://github.com/NomenAK) | #2943 | | [@charithharshana](https://github.com/charithharshana) | #2940 | | [@dhaern](https://github.com/dhaern) | #2927 | | [@dangeReis](https://github.com/dangeReis) | #3021 | | [@bobbyunknown](https://github.com/bobbyunknown) | #3029 | | [@CitrusIce](https://github.com/CitrusIce) | #3035, #3058 | | [@wussh](https://github.com/wussh) | #3036 | | [@Chewji9875](https://github.com/Chewji9875) | #3037 | | [@herjarsa](https://github.com/herjarsa) | #3043 | | [@freefrank](https://github.com/freefrank) | #3066 (reported the Docker build failure) | A special thanks to everyone who contributed code, reviews, and tests for this release: @androw, @bobbyunknown, @branben, @charithharshana, @Chewji9875, @CitrusIce, @dangeReis, @dhaern, @diegosouzapw, @freefrank, @guanbear, @herjarsa, @JxnLexn, @Lion-killer, @makcimbx, @NekoMonci12, @NomenAK, @oyi77, @ReqX, @S0yora, @soyelmismo, @Tentoxa, @terence71-glitch, @wussh, @xz-dev --- ## [3.8.7] — 2026-05-29 ### ✨ New Features - **api (self-service):** add `GET /api/v1/me/status` so a delegated API key can view its own usage (USD used, budget percent, token totals) and optional shared Codex account quota, backed by migration `075_api_key_self_service_usage_scopes` (#2908 — thanks @guanbear). - **analytics:** roll up usage logs to `daily_usage_summary` before raw log cleanup, and query a SQL `UNION` of raw and rolled-up data to prevent analytics history data loss (#2904 — thanks @unitythemaker). - **perf (RAM):** reduce server memory footprint by capping 11 in-memory caches, limiting SQLite page cache, lazy-loading provider registries via Proxy, and optimizing Next.js startup database probes (#2903 — thanks @soyelmismo). ### 🔧 Bug Fixes - **sse:** guard against numeric or non-string upstream error codes and malformed model strings to prevent runtime string-method crashes in `proxyFetch`, `parseModel`, and combo routing (#2463) - **docker:** add dedicated `runner-web` Docker stage with Playwright + Chromium + system libs so web-cookie providers (Gemini Web, Claude Turnstile) work in container deployments without bloating the base image (#2832) - **token-accounting:** prefer `prompt_tokens` over compatibility `input_tokens` for Anthropic Claude streams to avoid double-counting cached tokens (#2904 — thanks @unitythemaker). - **agy:** add the **Antigravity CLI (`agy`)** as a standalone OAuth provider next to `gemini-cli`/`antigravity`. It reuses the antigravity inference backend (identical Google client, `daily-cloudcode-pa.googleapis.com`) but ships its own model catalog — notably the Claude models the backend exposes (`claude-opus-4-6-thinking`, `claude-sonnet-4-6`) — its own account pool, and connection methods: import the `agy` CLI token file (paste/upload), auto-detect a local CLI login (`~/.gemini/antigravity-cli/antigravity-oauth-token`), browser OAuth, and bulk/ZIP import. New routes: `POST /api/providers/agy-auth/{import,import-bulk,zip-extract,apply-local}`. ### Breaking Changes - **proxy-logs:** `GET /api/usage/proxy-logs` now returns `clientIp` instead of `publicIp` for each log entry. External consumers reading `log.publicIp` must update to `log.clientIp`. The underlying SQLite column (`public_ip`) is unchanged, so callers that query the database directly are unaffected (#2880 — thanks @rdself). ### Known Inconsistency - **log-export:** `GET /api/logs/export?type=proxy-logs` returns raw SQLite rows whose IP field is still named `public_ip` (the historical column name). This differs from the `clientIp` field exposed by `GET /api/usage/proxy-logs`. The two endpoints are intentionally inconsistent for now and will be aligned in a future migration (#2880). ### ✨ New Features - **usage:** add per-API-key token limits scoped to model/provider/global with two-tier inline enforcement and in-memory cache accelerator (#2888 — thanks @mugnimaestra). - **providers:** audit web cookie providers, fix 4 missing registry entries, and add DuckDuckGo AI Chat provider (#2862 — thanks @oyi77). - **compression:** expand pt-BR pack with 34 new rules inspired by the troglodita project (#2818 — thanks @leninejunior). ### 🔧 Bug Fixes - **oauth:** hotfix Windsurf login — drop dead PKCE flow, promote import-token, and resolve SQLite bind type errors (#2884 — thanks @yunaamelia). - **models:** prune stale synced available models for inactive connections and dynamically map Antigravity MITM aliases loop-safely (#2886 — thanks @herjarsa). - **antigravity:** harden signatureless tool history replay by making text representation inert (#2878 — thanks @dhaern). - **i18n:** complete 144 missing Portuguese (pt-BR) locale keys and synchronize them with English (#2870 — thanks @alltomatos). - **opencode-go:** add OpenCode Go provider limits quota fetcher to retrieve Z.AI quota windows (#2861 — thanks @RajvardhanPatil07). - **reasoning:** gate reasoning trace replay injection on model interleaved capability metadata (#2843 — thanks @nickwizard). - **audio:** construct multipart body manually for transcription form-data to prevent dropped boundary headers under Next.js fetch (#2842 — thanks @soyelmismo). - **gemini-cli:** prefer real Google Cloud project IDs over default-project during model synchronization (#2841 — thanks @nickwizard). - **mcp:** redirect console.log and console.warn startup messages to stderr in stdio MCP mode to prevent JSON-RPC parsing failures (#2840 — thanks @disonjer). - **antigravity:** normalize unescaped tool calls and classify resource exhaustion 429 errors as lockout cooldowns (#2828 — thanks @Ardem2025). - **sse:** repair RTK engine defaults to resolve consecutive-line deduplication and direct compression calls (#2825 — thanks @leninejunior). - **fix(usage):** add opencode-go / opencode / opencode-zen quota fetcher so the provider limits page surfaces $12/5h, $30/wk, $60/mo windows alongside other quota-aware providers ([#2852](https://github.com/diegosouzapw/OmniRoute/issues/2852) — thanks @apoapostolov) --- ## [3.8.6] — 2026-05-29 ### ✨ New Features - **providers (Unlimited LLM Access):** add 7 new web-cookie providers plus a research catalog and discovery tool, expanding free/session-based model access ([#2887](https://github.com/diegosouzapw/OmniRoute/pull/2887) — thanks @oyi77) - **combo (Zero-Latency Combos):** add Hedging, Proactive Compression, and Predictive TTFT strategies for lower tail latency on combo routing ([#2868](https://github.com/diegosouzapw/OmniRoute/pull/2868) — thanks @herjarsa) - **api,oauth (agy):** add the `agy` (Antigravity CLI) standalone provider with CLI token import ([#2899](https://github.com/diegosouzapw/OmniRoute/pull/2899) — thanks @diegosouzapw) - **usage:** per-API-key token limits scoped to model / provider / global, backed by migration `073_per_model_token_limits` ([#2888](https://github.com/diegosouzapw/OmniRoute/pull/2888) — thanks @mugnimaestra) - **providers (web-cookie audit):** fix 4 missing registry entries and add DuckDuckGo ([#2862](https://github.com/diegosouzapw/OmniRoute/pull/2862) — thanks @oyi77) - **logs:** add clean log history action button to Logs page dashboard (#2799 — thanks @apoapostolov) - **settings:** restore settings-driven home page layout toggles and auto-refresh limits widget (#2800 — thanks @apoapostolov) - **modelSpecs:** register explicit model specifications and context/output caps for Moonshot, Qwen, Hunyuan, DeepSeek, MiniMax, GLM on the `opencode-go` provider (#2802 — thanks @jeferssonlemes) - **claude:** default `xhigh` reasoning-effort support for newer Opus models ([#2874](https://github.com/diegosouzapw/OmniRoute/pull/2874) — thanks @rdself) - **compression (RTK):** add RTK command filters for `kubectl`, `docker-build`, `composer`, and `gh` ([#2824](https://github.com/diegosouzapw/OmniRoute/pull/2824) — thanks @leninejunior) - **compression:** expand the pt-BR "troglodita" compression pack from 15 to 49 rules ([#2818](https://github.com/diegosouzapw/OmniRoute/pull/2818) — thanks @leninejunior) - **opencode-go:** register 4 missing models from the upstream catalog ([#2790](https://github.com/diegosouzapw/OmniRoute/pull/2790) — thanks @jeferssonlemes) - **build:** nix multi-OS package-manager install (`flake.nix` / `flake.lock`) ([#2806](https://github.com/diegosouzapw/OmniRoute/pull/2806) — thanks @levonk) ### 🛡️ Security - **mitm:** refactor `runElevatedPowerShell` to write the elevated payload to a per-call temp `.ps1` file (mode 0o600) and reference it via `-File` instead of `-EncodedCommand `, removing the textbook fingerprint flagged by Socket.dev (#2863 — thanks @a-dmx) - **cloud-sync:** require HMAC verification of the Cloud response (`X-Cloud-Sig`) when `OMNIROUTE_CLOUD_SYNC_SECRET` is set; default-off opt-in `OMNIROUTE_CLOUD_SYNC_SECRETS` flag now required to overwrite `accessToken` / `refreshToken` / `providerSpecificData` from the Cloud payload. Closes silent-credential-swap surface (#2863) - **providers/zed-import:** split into 2-step `discover` + `import` flow. `/import` now requires `confirmedAccounts: [{ service, account, fingerprint }]` and re-reads the keychain server-side to filter by fingerprint, so a tampered discover response cannot trick the endpoint into saving an unrelated token. `OMNIROUTE_ZED_IMPORT_LEGACY_ONE_STEP=true` preserves v3.8.5 behaviour (deprecated, removed in v3.9) (#2863) - **build:** add `OMNIROUTE_BUILD_PROFILE=minimal` (`npm run build:secure`) that physically removes the four sensitive modules (MITM cert install, Zed keychain reader, Cloud Sync, 9router installer) from the standalone bundle via webpack `NormalModuleReplacementPlugin` aliases. Stubs return HTTP 503 `feature-disabled` at runtime. Intended for the `omniroute-secure` artifact (#2863) - **docs:** add `docs/security/SOCKET_DEV_FINDINGS.md` per-finding maintainer attestation + `socket.yml` v2 config + in-source `SECURITY-AUDITOR-NOTE:` blocks at every flagged call site (#2863) - **windsurf:** redact the public Firebase Web key from the Windsurf provider spec (secret-scanning #7) and document the SHA-256 cache-key rationale (code-scanning #261) ([#2894](https://github.com/diegosouzapw/OmniRoute/pull/2894), [#2896](https://github.com/diegosouzapw/OmniRoute/pull/2896) — thanks @diegosouzapw) ### 🔧 Bug Fixes - **antigravity:** harden signature-less tool history handling to prevent malformed tool-call replays ([#2878](https://github.com/diegosouzapw/OmniRoute/pull/2878) — thanks @dhaern) - **providers:** provider model-sync pruning and dynamic Antigravity MITM proxy mappings ([#2886](https://github.com/diegosouzapw/OmniRoute/pull/2886) — thanks @herjarsa) - **audio:** build the multipart body manually to preserve `Content-Type` on transcription requests ([#2842](https://github.com/diegosouzapw/OmniRoute/pull/2842) — thanks @soyelmismo) - **opencode-go:** add a provider-limits quota fetcher so quota state is reported correctly ([#2861](https://github.com/diegosouzapw/OmniRoute/pull/2861) — thanks @RajvardhanPatil07) - **validation:** add specialty validators for connection test, bypassing the `/models` probe for providers that don't expose it ([#2837](https://github.com/diegosouzapw/OmniRoute/pull/2837) — thanks @oyi77) - **cli:** restore `omniroute logs` command — create missing `/api/cli-tools/logs` route that `log-streamer.ts` was calling, returning filtered pino log entries with `follow` and `filter` query-param support (#2756) - **cli:** replace `cli-table3` dependency with a ~50-line hand-rolled ASCII formatter to resolve Node 24 / ESM interop breakage and remove tourniquet `package.json` overrides pinning `ansi-regex@^5`, `strip-ansi@^6`, `string-width@^4` (#2752) - **fix(opencode-go,opencode-zen):** mark qwen3.7-max / 3.6-plus / 3.5-plus as supportsVision:false to stop forwarding image blocks to vision-incapable upstream models ([#2822]) - **nous-research:** append /chat/completions to provider baseUrl so DefaultExecutor's default URL builder hits the correct endpoint instead of returning 404 ([#2826]) - **fix(quota):** honor explicit per-connection `quotaPreflightEnabled: false` even when the provider has global window defaults — adds early-return guard before the AND-of-negations gate in auth.ts ([#2831]) - **api:** include noAuth providers (opencode, etc.) in `/v1/models` active aliases so their models surface without a DB connection row (#2798) - **opencode-go:** route Qwen3.x via Claude messages format and repair `fixMissingToolResponses` helper for Claude-shape upstreams (#2791 — thanks @jeferssonlemes) - **validation:** register missing validation helper checks for web-cookie providers (`claude-web`, `gemini-web`, `copilot-web`, `t3-web`) (#2793 — thanks @oyi77) - **docker:** check and warn if `/app/data` is not writable in the Docker entrypoint script to fail fast with helpful host instructions (#2795 — thanks @hartmark) - **oauth:** repair native Google loopback callback flow and support remote callbacks via state matching on 127.0.0.1 (#2796 — thanks @akarray) - **combo:** resolve custom `openai-compatible-responses-*` provider targets correctly when called via combo name — combo steps storing the internal UUID-prefixed provider id now match the provider node by id as well as by prefix, fixing 503 errors for users with custom providers used inside combos (#2778) - **combos:** fix combo handling so transient 429 rate limit errors do not poison or persist the rate limited state for the same-provider connection (#2800 — thanks @apoapostolov) - **gemini:** translate signature-less Gemini thinking model tool calls to text parts to prevent `400 "missing thought_signature"` errors (#2801 — thanks @herjarsa) - **translator:** strip `safety_identifier` from `/v1/responses` body before forwarding to Chat Completions upstream; fixes LobeHub-originated `400` errors (#2770) - **warning-cleanup:** relax node engine constraint to `>=22.0.0` and clean dependencies (keeping `marked-terminal` to prevent TUI REPL crash) (#2792 — thanks @oyi77) - **combo:** normalize upstream Headers into a plain object before classification to avoid Node 24 / undici cross-instance `Cannot read private member #headers` crash on combo failover (#2751) - **translator:** silently drop `tool_search` built-in tool type instead of returning 400 — newer Codex clients send `tool_search` as a Responses API built-in with no Chat Completions equivalent (#2766) - **usage:** un-invert GitHub Copilot Free / limited plan quota — `limited_user_quotas` is the _remaining_ count, not used, so the dashboard now shows 100% when the quota is untouched and 0% when fully exhausted (#2876 — thanks @androw) - **fix(cli):** register openclaw in the CLI tool-detector so it appears in `omniroute status` alongside its existing API and config support ([#2833](https://github.com/diegosouzapw/OmniRoute/issues/2833)) - **oauth (windsurf):** hotfix Windsurf login — drop the dead PKCE flow and promote the import-token flow as the default ([#2884](https://github.com/diegosouzapw/OmniRoute/pull/2884) — thanks @yunaamelia) - **antigravity:** normalize textual SSE tool calls and classify Gemini Antigravity resource exhaustion as a model lockout instead of a connection failure ([#2828](https://github.com/diegosouzapw/OmniRoute/pull/2828) — thanks @Ardem2025) - **reasoning:** gate reasoning replay by the `interleaved` capability field and guard the interleaved capability lookup ([#2843](https://github.com/diegosouzapw/OmniRoute/pull/2843) — thanks @nickwizard) - **gemini-cli:** prefer real project IDs over `default-project` during discovery ([#2841](https://github.com/diegosouzapw/OmniRoute/pull/2841) — thanks @nickwizard) - **geminiHelper:** support the `rec.image` content shape and warn on dropped remote image URLs ([#2855](https://github.com/diegosouzapw/OmniRoute/pull/2855) — thanks @Tushar49) - **deepseek-web:** return `400` when the client sends `tools[]` — `chat.deepseek.com` has no tool support ([#2854](https://github.com/diegosouzapw/OmniRoute/pull/2854) — thanks @Tushar49) - **claude:** preserve max reasoning effort for supported models ([#2875](https://github.com/diegosouzapw/OmniRoute/pull/2875) — thanks @rdself) - **github:** route `claude-opus-4.6` via the chat-completions path ([#2821](https://github.com/diegosouzapw/OmniRoute/pull/2821) — thanks @marchlhw) - **logs:** rename proxy-log "Public IP" to "Client IP" ([#2880](https://github.com/diegosouzapw/OmniRoute/pull/2880) — thanks @rdself) - **qoder:** reject invalid/expired PATs that surface as a Cosy `500` error ([#2860](https://github.com/diegosouzapw/OmniRoute/pull/2860) — thanks @herjarsa) - **combo:** preserve system messages during context-handoff summary generation ([#2865](https://github.com/diegosouzapw/OmniRoute/pull/2865) — thanks @herjarsa) - **cli:** allow nullable/optional `apiKey` in `cliMitmStartSchema` ([#2857](https://github.com/diegosouzapw/OmniRoute/pull/2857) — thanks @herjarsa) - **chatCore:** wire CLIProxyAPI fallback settings into the chatCore routing engine ([#2866](https://github.com/diegosouzapw/OmniRoute/pull/2866) — thanks @oyi77) - **skills:** skip interception for unregistered client-native tools ([#2817](https://github.com/diegosouzapw/OmniRoute/pull/2817) — thanks @jeferssonlemes) - **mcp:** redirect `console.log`/`console.warn` to stderr in `--mcp` stdio mode so they don't corrupt the JSON-RPC stream ([#2840](https://github.com/diegosouzapw/OmniRoute/pull/2840) — thanks @disonjer) - **cli:** respect the `PORT` env var in the `serve` command ([#2845](https://github.com/diegosouzapw/OmniRoute/pull/2845) — thanks @gogones) - **sse (RTK):** repair RTK engine defaults so dedup and direct calls work ([#2825](https://github.com/diegosouzapw/OmniRoute/pull/2825) — thanks @leninejunior) - **i18n:** translate 144 new `__MISSING__` pt-BR strings ([#2816](https://github.com/diegosouzapw/OmniRoute/pull/2816) — thanks @leninejunior); complete and sync remaining pt-BR strings with `en.json` ([#2870](https://github.com/diegosouzapw/OmniRoute/pull/2870) — thanks @alltomatos); translate 162 missing zh-CN UI strings ([#2789](https://github.com/diegosouzapw/OmniRoute/pull/2789) — thanks @InkshadeWoods) ### 🧹 Chores - **ci:** resolve `release/v3.8.6` gate failures — docs-sync, any-budget, and pack-artifact ([#2895](https://github.com/diegosouzapw/OmniRoute/pull/2895) — thanks @diegosouzapw) - **security (re-land):** re-integrate the Socket.dev supply-chain mitigations, secrets opt-in, and minimal build profile onto the release branch ([#2871](https://github.com/diegosouzapw/OmniRoute/pull/2871) — thanks @diegosouzapw) - **skills:** implement automated skill workflows and update system configuration + validation schemas (thanks @diegosouzapw) - **tests:** stabilize unit suites (blackbox-web, schema-coercion, translator-helper-branches, usage-service-hardening, audio-transcription) and isolate `services-branch-hardening` DB directory to avoid concurrency flakes (thanks @diegosouzapw) - **chore:** remove stale agent skill documentation files and streamline maintenance workflows (thanks @diegosouzapw) - **gitignore:** ignore `.claude/settings.local.json` so per-user Claude Code permissions never get committed by accident - **release:** version bump and metadata sync (package.json, package-lock.json, electron, open-sse, openapi.yaml) ### 🏆 Contributors A special thanks to everyone who contributed to this release. Ranked by commits since `v3.8.6` (105 commits total): | Contributor | Commits | PRs | | ---------------------------------------------------------- | ------: | ----------------------------------------------- | | [@diegosouzapw](https://github.com/diegosouzapw) | 38 | maintainer — releases, upstream ports & fixes | | [@oyi77](https://github.com/oyi77) | 10 | #2887, #2862, #2866, #2837, #2885, #2792, #2793 | | [@yunaamelia](https://github.com/yunaamelia) | 7 | #2884 | | [@herjarsa](https://github.com/herjarsa) | 6 | #2868, #2886, #2865, #2860, #2857, #2801 | | [@leninejunior](https://github.com/leninejunior) | 4 | #2818, #2824, #2825, #2816 | | [@jeferssonlemes](https://github.com/jeferssonlemes) | 3 | #2791, #2802, #2815, #2817 | | [@rdself](https://github.com/rdself) | 3 | #2874, #2875, #2880 | | Dmitry Kuznetsov | 3 | textual tool-call & lockout hardening | | [@apoapostolov](https://github.com/apoapostolov) | 2 | #2799, #2800 | | [@unitythemaker](https://github.com/unitythemaker) | 2 | #2904 | | Nikolay Alafuzov | 2 | reasoning interleaved gating | | [@Tushar49](https://github.com/Tushar49) | 2 | #2854, #2855, #2807 | | [@guanbear](https://github.com/guanbear) | 2 | #2908 | | [@soyelmismo](https://github.com/soyelmismo) | 2 | #2903, #2842 | | [@RajvardhanPatil07](https://github.com/RajvardhanPatil07) | 1 | #2861 | | [@mugnimaestra](https://github.com/mugnimaestra) | 1 | #2888 | | [@dhaern](https://github.com/dhaern) | 1 | #2878 | | [@hartmark](https://github.com/hartmark) | 1 | #2795, #2771 | | [@marchlhw](https://github.com/marchlhw) | 1 | #2821 | | [@alltomatos](https://github.com/alltomatos) | 1 | i18n pt-BR | | [@akarray](https://github.com/akarray) | 1 | #2796 | | [@gogones](https://github.com/gogones) | 1 | #2845 | | [@disonjer](https://github.com/disonjer) | 1 | #2840 | | [@nickwizard](https://github.com/nickwizard) | 1 | #2841 | | [@levonk](https://github.com/levonk) | 1 | #2806 | _Reviews & additional contributions: @androw, @Ardem2025, @InkshadeWoods._ A special thanks to everyone who contributed code, reviews, and tests for this release: @akarray, @alltomatos, @androw, @apoapostolov, @Ardem2025, @dhaern, @disonjer, @gogones, @hartmark, @herjarsa, @InkshadeWoods, @jeferssonlemes, @leninejunior, @levonk, @marchlhw, @mugnimaestra, @nickwizard, @oyi77, @RajvardhanPatil07, @rdself, @soyelmismo, @Tushar49, @yunaamelia, Dmitry Kuznetsov, Nikolay Alafuzov --- ## [3.8.5] — 2026-05-27 ### ✨ New Features - **auth:** support restricting API keys to specific endpoint categories (e.g., chat only, search only, embeddings only) with full dashboard configuration and centralized policy enforcement (#2777 — thanks @hijak) - **batch:** recover stale `in_progress` and `finalizing` batches back to `validating` state on startup, reset counters, and apply a configurable concurrency limit (`BATCH_MAX_CONCURRENT`) (#2755 — thanks @hartmark) ### 🔧 Bug Fixes - **docker:** rebuild `better-sqlite3` native bindings after hardened install to resolve container startup crash (#2772 — thanks @thanet-s) - **combos:** make combo target timeout configurable, inheriting resolved request timeout by default and clamping values so they only shorten fallback latency (#2775 — thanks @rdself) - **oauth:** use public callbacks for remote Google OAuth with custom creds (#2787 — thanks @akarray) - **combos:** allow rate-limited provider connections after transient 429s (#2786 — thanks @JxnLexn) - **logs:** keep database log settings in sync with the pipeline toggle (#2785 — thanks @JxnLexn) - **docker:** speedup docker creation by reducing steps and bunch up copy operations (#2784 — thanks @hartmark) - **codex:** apply global service tiers to combo request bodies (#2783 — thanks @JxnLexn) ### ⚡ Performance / CI - **ci:** build Docker platforms on native runners (linux/amd64 on ubuntu-24.04 and linux/arm64 on ubuntu-24.04-arm) instead of emulated QEMU, reducing build times significantly (#2774 — thanks @thanet-s) ### 📝 Documentation - **docs:** fix broken documentation links in README after Fumadocs migration (#2782 — thanks @kjhq) ### 🏆 Hall of Contributors A special thanks to everyone who contributed code, reviews, and tests for this release: @akarray, @hartmark, @hijak, @JxnLexn, @kjhq, @rdself, @thanet-s --- ## [3.8.4] — 2026-05-26 ### 🔒 Security - **authz:** redirect `/home` and `/home/:path*` to `/login` when unauthenticated — Next.js middleware matcher omitted `/home`, so any visit reached the page directly on `REQUIRE_LOGIN` deployments (#2712 — thanks @diegosouzapw) - **review:** resolve v3.8.4 important + minor findings from consolidated review including SSRF guards (#2749 — thanks @diegosouzapw) ### ✨ New Features - **feat(credential-health):** fail-fast credential health check with TTL cache and background scheduler — validates API key + OAuth connections before combo dispatch, skips failed targets in <1ms instead of 10-30s timeout - **feat(middleware):** pre-request middleware pipeline with global, combo-specific, and per-request scopes — hooks can mutate body/headers/model, short-circuit, or skip remaining hooks - **feat(websocket):** live dashboard WebSocket server on port 20129 with EventBus integration — real-time request started/combo target attempt/succeeded/failed and credential health events - **feat(circuit-breaker):** three-state circuit breaker (CLOSED→DEGRADED→OPEN) with adaptive backoff per failure kind (rate-limit/auth/timeout), escalation count, and historical state tracking - **feat(key-groups):** API key groups with migration 066 — key_groups, group_model_permissions, key_group_members tables and CRUD, REST endpoints, group auth integration - **feat(copilot):** OmniRoute Copilot with CodeGraph knowledge base and CLI harness — LLM-guided configurator at POST /api/copilot/chat - **feat(combo-playground):** combo routing simulation API and dashboard UI under /dashboard/combos/playground/ - **feat(pwa):** improved PWA manifest with icons, categories, and service worker with push notification support - **feat(relay):** serverless relay proxies with migration 067 — relay_tokens, relay_rate_limits, relay_logs, public endpoint at /api/v1/relay/chat/completions, management API, dashboard UI - **feat(cost):** cost optimization engine with alerts (budget/spike/trend thresholds), 6 REST endpoints, dashboard alerts UI - **feat(backup):** backup and restore system with export/import API and dashboard UI - **feat(config-templates):** config templates with migration 070, seed data, CRUD + apply API, dashboard UI - **feat(custom-models):** custom model registry with migration 069, CRUD API, dashboard UI - **feat(webhooks-cicd):** webhook CI/CD actions with migration 071 — ActionEngine supporting deploy/restart/sync actions, REST API - **feat(multitenant):** multi-tenant dashboard with per-API-key usage aggregation and provider/model breakdown - **feat(sla):** SLA dashboard with uptime/latency/error rate queries, summary/trend APIs, uptime badges and sparklines - **feat(routing-analytics):** AI-powered usage pattern analysis and routing recommendations — combo_metrics queries, hourly failure heatmap, provider breakdown, cost-vs-latency scatter chart - **feat(teams):** fixed team execution with 13 git worktrees and project-level team configs - **feat(providers):** add Inner.ai provider support with native executor, translation support, and model catalog definitions (#2704 — thanks @df4p) - **feat(proxy):** unified free proxy pool, Vercel Relay serverless endpoints, and a redesigned 4-tab proxy dashboard interface (#2705 — thanks @diegosouzapw) - **feat(webhooks):** 3-step configuration wizard for Slack, Telegram, Discord, and Custom webhook destinations, with reorganized React components (#2703 — thanks @diegosouzapw) - **feat(openapi):** comprehensive API endpoints content audit with 100% schema coverage, authz security tiers, and full i18n localization support (#2701 — thanks @diegosouzapw) - **feat(providers):** add BluesMinds, FreeModel.dev, and FreeAIAPIKey to the provider catalog (#2709 — thanks @oyi77) - **feat(routing/providers):** broaden routing, provider capabilities, and dashboard views — adds AWS Bedrock provider executor, combo scoring inspector, route explainability, reset-aware combo routing, and improves UI views for quota and resilience (#2750 — thanks @JxnLexn) - **feat(batch-fixes):** clean batch UI, Docker compose base profile, and support for parallel testing execution (#2761 — thanks @diegosouzapw) - **chore(deps):** added ws + @types/ws for WebSocket support, recharts ^3.8.1 for analytics charts ### 🔧 Bug Fixes - **validation:** add Poolside specialty validator (direct `/chat/completions` probe — Poolside has no `/v1/models` endpoint and returns 401 for unknown routes, which the generic `/models` flow misread as "invalid API key") (#2723) - **validation:** add NVIDIA NIM specialty validator and harden `normalizeBaseUrl` against non-string `providerSpecificData.baseUrl` — fixes the `e.startsWith is not a function` TypeError that surfaced after minification (#2463) - **cli:** `omniroute compression *` falls back to direct REST endpoints (`/api/settings/compression`, `/api/context/combos`, `/api/context/analytics`) when `/api/mcp/tools/call` returns 404; normalize `none → off` / `hybrid → stacked` engine aliases (#2688) - **cli:** import `cli-helper/tool-detector` and `cli-helper/doctor/checks` with the explicit `.ts` extension that tsx resolves directly, so the published npm package (which ships only the `.ts` source) no longer crashes with `Cannot find module '…tool-detector.js'` (#2509) - **authz:** make the DB feature-flag override authoritative over `process.env` for `OMNIROUTE_ALLOW_PRIVATE_PROVIDER_URLS`, so toggling "Allow Private Provider URLs" in the Electron dashboard takes effect without restarting the spawned server (#2575) - **fix(antigravity):** stabilize model detection, OAuth handling, and token refresh logic (#2757 — thanks @oyi77) - **fix(batch):** recover and resume stale batch jobs on server restart instead of failing them, and add configurable concurrency limit (#2755 — thanks @hartmark) - **fix(harness):** resolve Headers private slot errors and type check compiler issues, and stabilize cooldown retry test flakiness (#2763 — thanks @diegosouzapw) - Fix combo cascade skipping on credential check timeout - Fix team sessions going idle (worktree initialization) - **feat(providers):** enhance Google Gemini, CLI, and Antigravity resilience and features — introduces explicit TypeScript typing to translation layers, adds new Gemini 2.0 models, implements backoff and retry logic in the Gemini CLI executor, extracts Google Search grounding metadata into standard `citations`, and adds backend definitions for the `vertex-partner` provider. ([#2676](https://github.com/diegosouzapw/OmniRoute/pull/2676) — thanks @alltomatos) - **fix(proxy):** atomically save and assign custom dashboard proxies in a single SQLite transaction, preventing orphan configuration rows (#2697 — thanks @terence71-glitch) - **fix(reasoning):** inject thinking blocks into Claude-format messages for Kimi K2 to prevent infinite tool-calling loops (#2699 — thanks @herjarsa) - **fix(antigravity):** default exhausted quota status display to 0% instead of 100% (#2700 — thanks @ahmet-cetinkaya) - **fix(electron):** add Caps Lock indicator, custom reset warnings, and suppress shell window spawning on startup (#2714 — thanks @benzntech) - **fix(combos):** resolve context handoff tags ordering issue and enforce a 60-second request timeout limit per combo target to prevent capacity leaks (#2717 — thanks @herjarsa) - **fix(oauth):** resolve parallel token refresh race conditions in Codex and implement comprehensive error checking across OAuth providers (#2718 — thanks @diegosouzapw) - **fix(docker):** install `python3`, `make`, and `g++` in the Docker builder stage to support native Node.js addon compilation (#2713 — thanks @mrmm) - **fix(i18n):** restore real hint and placeholder translation strings for web-cookie providers in `en.json` (#2694 — thanks @diegosouzapw) - **fix(db):** resolve migration version prefix collision between services and webhook metadata tables (#2727 — thanks @diegosouzapw) - **fix(vision-bridge):** ensure images are processed when a vision-capable model is matched through a combo routing mapping (#2706 — thanks @herjarsa) - **mcp:** break callLogs ↔ compliance ESM cycle that deadlocks the bundled MCP server on Node.js 24 — extract no-log state to `compliance/noLog.ts`, switch callers to the leaf module, keep `compliance/index.ts` re-exports for backwards compat (#2650 — thanks @disonjer) - **deepseek:** guard PoW solver Web Worker handler so `require()` no longer throws `ReferenceError: onmessage is not defined` under Node strict mode (#2724 — thanks @thanet-s) - **combos:** include no-auth providers (FreeAIAPIKey, BluesMinds, FreeModel.dev, opencode, …) in the combo builder picker — they were invisible because they never get rows in `provider_connections` (#2737 — thanks @herjarsa) - **translator:** allow the `web_search` server-tool family (`web_search_20250305`, `web_search_20250101`, plain `web_search`) in the Responses API translator and preserve the original versioned name on output (#2695 — thanks @diegosouzapw) - **oauth:** register the missing `trae` provider with `import_token` flow so the Trae IDE no longer 500s during token import (#2658 — thanks @diegosouzapw) - **model:** merge settings-based aliases with the legacy DB alias namespace so aliases set via the Settings UI (e.g. `gpt-5.4 → cx/gpt-5.4`) are honored instead of being overridden by provider inference (#2618, #2208 — thanks @diegosouzapw) - **kiro:** fall back to `document.execCommand("copy")` when the Clipboard API is unavailable (HTTP/non-secure contexts), so the "Copy authorization link" button works on LAN deployments (#2689 — thanks @disonjer) - **cli:** raise `omniroute serve` ready timeout from 20s to 60s and add a TCP-listening fallback so Windows users no longer get phantom timeouts during slow Next.js cold start (#2460 — thanks @benzntech) - **mcp:** break circular await deadlock in compliance→callLogs + Kiro refresh resilience (#2747 — thanks @disonjer) - **ui:** claude-web provider shows 'API Key' label instead of 'Session Cookie' (#2744 — thanks @oyi77) - **deepseek-web:** lazy start session refresh (#2742 — thanks @thanet-s) - **docker:** keep fumadocs doc assets in Docker build context (#2741 — thanks @janeza2) - **vision-bridge:** force bridge for opencode-go/zen models that overstate vision support (#2740 — thanks @herjarsa) - **combos:** enable universal handoff by default to preserve cross-model conversation context (#2736 — thanks @herjarsa) ### 🚀 Embedded Services - **feat(services):** embedded service manager for 9Router and CLIProxyAPI — introduces a full lifecycle management system for locally-run AI proxy daemons accessible on loopback only: - **ServiceSupervisor** (`src/lib/services/supervisor.ts`) — EventEmitter-based child process manager with state machine (`not_installed → stopped → starting → running → stopping → error`), ring-buffer log capture (5 MB/service), health polling, and configurable stop timeout. - **ServiceRegistry** (`src/lib/services/registry.ts`) — process-scoped map of active `ServiceSupervisor` instances; integrates with `bootstrap.ts` for auto-start on app launch. - **9Router lifecycle** — npm-installer (`src/lib/services/installers/ninerouter.ts`), 8 REST endpoints under `/api/services/9router/` (install, start, stop, restart, update, status, auto-start, rotate-key), NineRouterExecutor at `open-sse/executors/ninerouter.ts`, model-sync job, and provider registration. - **CLIProxyAPI lifecycle** — GitHub-release installer (`src/lib/services/installers/cliproxy.ts`), 7 REST endpoints under `/api/services/cliproxy/` (install, start, stop, restart, update, status, auto-start), health probe at `/v1/models` (CPA 6.x has no `/health` endpoint). - **SSE log streaming** — `/api/services/{name}/logs` with `tail` and `filter` query params, `snapshot` + `log` SSE events, 30-second heartbeat. - **WebSocket proxy** — `/api/services/{name}/ws` reverse-proxies WebSocket connections to the embedded service UI port (port 20131); `isLocalOnlyPath()` guard in `routeGuard.ts` (Hard Rule #17). - **HTTP UI proxy** — `/api/services/9router/proxy/[...path]` for iframe asset loading. - **Dashboard page** `/dashboard/providers/services` — URL-based tab navigation (`?tab=cliproxy` default / `?tab=9router`), shared components (`ServiceStatusCard`, `ServiceLifecycleButtons`, `ServiceLogsPanel`), sidebar item under Omni Proxy (hideable, `material-symbols-outlined: deployed_code`). - **CliproxyServiceTab** — auto-start toggle, fallback routing card (enable/disable, URL, status codes); fallback settings remain mirrored in Settings → CLIProxyAPI for backward compatibility. - **NinerouterServiceTab** — auto-start toggle, API key display + rotation, collapsible embedded Web UI iframe (`sandbox="allow-scripts allow-same-origin allow-forms"`, loopback-only). - **DB migration 071** (originally 068, renumbered post-merge to avoid collision with `068_free_proxies` and `068_webhooks_kind_metadata`) — extends `version_manager` table with `autoStart`, `autoUpdate`, `providerExpose`, `apiKey`, and `port` columns. `migrationRunner.ts` now throws at boot if two `.sql` files share the same numeric prefix. - All service routes classified as `LOCAL_ONLY` in `routeGuard.ts`; loopback enforcement is unconditional before any auth check (leaked JWT via tunnel cannot trigger process spawning). ### 🏆 Hall de Contribuidores Um agradecimento especial a todos que contribuíram com código, revisões e testes para este release: @ahmet-cetinkaya, @alltomatos, @benzntech, @Chewji9875, @df4p, @diegosouzapw, @disonjer, @hartmark, @herjarsa, @janeza2, @JxnLexn, @mrmm, @oyi77, @thanet-s, @terence71-glitch --- ## [3.8.3] — 2026-05-24 ### ✨ New Features - **feat(combos):** universal context handoff for cross-model conversation continuity — structured XML summary system (``) that preserves conversation continuity and handles state transfer when combo routing switches models. ([#2653](https://github.com/diegosouzapw/OmniRoute/pull/2653) — thanks @herjarsa) - **feat(docs):** migrate `/docs` to Fumadocs MDX with nested routes — replaces the custom docs engine with Fumadocs, adding `[...slug]` catch-all routing, search API at `/docs/api/search`, `source.config.ts` content configuration, and `meta.json` navigation files across 8 doc sections (`architecture/`, `compression/`, `frameworks/`, `guides/`, `ops/`, `reference/`, `routing/`, `security/`). Includes 50+ URL redirects for backward compatibility via `next.config.mjs`. ([#2614](https://github.com/diegosouzapw/OmniRoute/pull/2614) — thanks @ovehbe) - **feat(dashboard):** add search and filters to `/dashboard/api-manager` — filter bar with search by name/key, active-only toggle (persisted to localStorage), status filter (active/disabled/banned/expired), type filter (standard/manage/restricted), filter count badges, and empty state with "Clear Filters" button. ([#2628](https://github.com/diegosouzapw/OmniRoute/pull/2628) / [#2641](https://github.com/diegosouzapw/OmniRoute/pull/2641) — thanks @diegosouzapw) - **feat(dashboard):** free-tier grouping with symbolic link in `/dashboard/providers` — groups and displays all free-tier providers from all categories dynamically using `hasFree: true` properties without removing them from their native lists. Displays category dot and amber dot with localizable tooltips, dedupes search results by provider ID, and corrects free tier count statistics. ([#2632](https://github.com/diegosouzapw/OmniRoute/pull/2632) — thanks @diegosouzapw) - **feat(dashboard):** risk notice modal for sensitive providers — show a soft, informative warning modal when connecting to session-based or OAuth providers (like Claude, Cursor, Copilot) for the first time. Adds `subscriptionRisk` properties to 20 providers, localizable templates, and stores acknowledgment in localStorage. ([#2633](https://github.com/diegosouzapw/OmniRoute/pull/2633) / [#2638](https://github.com/diegosouzapw/OmniRoute/pull/2638) — thanks @diegosouzapw) - **feat(dashboard):** refactor free-tier provider dashboard layout — cleans up visual clutter, reorganizes categories, hides redundant banners, and integrates free-tier categories nicely into the primary provider interface. ([#2640](https://github.com/diegosouzapw/OmniRoute/pull/2640) — thanks @diegosouzapw) - **feat(dashboard):** mini-playground inline (Phase 4) — integrated interactive mini-playground capabilities to provider details pages, including support for specialized example cards (Embedding, Image, LLM Chat, Music, STT, TTS, Video, Web Fetch, Web Search), unified API key loading hooks, model listing hooks, and curl command builder. ([#2648](https://github.com/diegosouzapw/OmniRoute/pull/2648) — thanks @diegosouzapw) - **feat(webfetch):** category support with dedicated media providers page and executors for Firecrawl, Jina Reader, and Tavily. ([#2645](https://github.com/diegosouzapw/OmniRoute/pull/2645) — thanks @diegosouzapw) - **feat(adapta):** integrate Adapta Org (`adapta-web`) provider with automatic Clerk authentication refresh and custom onboarding tutorial modal. ([#2643](https://github.com/diegosouzapw/OmniRoute/pull/2643) — thanks @df4p) - **feat(i18n):** complete translations for Simplified Chinese — translates 1220 missing keys bringing UI coverage to 98.8% with 0 placeholders. ([#2655](https://github.com/diegosouzapw/OmniRoute/pull/2655) — thanks @L-aros) - **feat(dashboard):** add Cmd+K / Ctrl+K command palette for sidebar navigation and a slideover panel for LLM provider testing UI. ([#2656](https://github.com/diegosouzapw/OmniRoute/pull/2656) — thanks @mrmm) - **feat(i18n):** finish Simplified Chinese (zh-CN) UI coverage with 377 translated entries. ([#2659](https://github.com/diegosouzapw/OmniRoute/pull/2659) — thanks @L-aros) - **feat(dashboard):** chat-first test slide-over layout — consolidates header controls (Model/Key selects, Clear button) into a unified toolbar, maximizes vertical conversation space, integrates live tailing of provider logs in Logs tab, and locks composer focus for keyboard-only convenience. ([#2660](https://github.com/diegosouzapw/OmniRoute/pull/2660) — thanks @mrmm) - **feat(cli):** desktop updates, autostart, and headless CLI modes — integrates native auto-updater checks, login autostart (Linux .desktop, macOS/Windows login items), and a background headless server CLI daemon mode (`--headless` or `OMNIROUTE_HEADLESS=true`) into the Electron app wrapper. ([#2662](https://github.com/diegosouzapw/OmniRoute/pull/2662) — thanks @benzntech) - **feat(quota):** card-grid layout and provider group headers under quota management — replaces monolithic table with a beautiful 4-column card grid in limits. ([#2667](https://github.com/diegosouzapw/OmniRoute/pull/2667) — thanks @Gi99lin) - **feat(dashboard):** real-time WebSocket live monitoring daemon — runs a Node.js WebSocket daemon sidecar on port `20129` to emit real-time events for request starts/completes/fails, combo attempts, and credential status in the dashboard logs. ([#2668](https://github.com/diegosouzapw/OmniRoute/pull/2668) — thanks @herjarsa) - **feat(copilot):** AI assistant with CodeGraph + CLI + knowledge base — integrates a dashboard assistant with CodeGraph knowledge base access and CLI capabilities for app exploration. ([#2669](https://github.com/diegosouzapw/OmniRoute/pull/2669) — thanks @ovehbe / @herjarsa) - **feat(pipeline):** pre-request middleware hooks — pipeline executing custom JS hooks before routing/combo logic to mutate headers/body or short-circuit requests. ([#2670](https://github.com/diegosouzapw/OmniRoute/pull/2670) — thanks @herjarsa) - **feat(resilience):** credential health check + adaptive circuit breaker v2 — background connection health check scheduler with progressive circuit breaker adding DEGRADED state and HALF-OPEN recovery validation to avoid latency spikes. ([#2671](https://github.com/diegosouzapw/OmniRoute/pull/2671) — thanks @herjarsa) - **feat(playground):** combo routing visual simulator — interactive route simulation page at `/dashboard/combos/playground` to showcase cascade hops, latency, and cost estimates. ([#2672](https://github.com/diegosouzapw/OmniRoute/pull/2672) — thanks @herjarsa) - **feat(auth):** API key groups with model-level permissions — group definitions with model-level wildcards/denies where API keys inherit group-scoped restrictions. ([#2673](https://github.com/diegosouzapw/OmniRoute/pull/2673) — thanks @herjarsa) - **feat(pwa):** enhanced manifest + push notification support — polishes offline shortcuts, screenshots, display metadata, and push service workers. ([#2674](https://github.com/diegosouzapw/OmniRoute/pull/2674) — thanks @herjarsa) - **feat(proxy):** serverless relay proxy endpoints with rate limiting — public relay proxy endpoints with cost caps and rate limits, CRUD API, and dashboard usage tracking. ([#2675](https://github.com/diegosouzapw/OmniRoute/pull/2675) — thanks @herjarsa) ### 🔧 Bug Fixes - **fix(settings):** Require Login modal Cancel button text and dismissal — modal now renders localized cancel label via the `common` namespace and closes correctly without modifying settings when cancelled. ([#2649](https://github.com/diegosouzapw/OmniRoute/pull/2649) — thanks @Chewji9875) - **fix(deepseek-web):** re-apply SSE parser, prompt format, and error handling fixes — handles all 3 DeepSeek SSE stream formats (initial fragments, APPEND operations, bare string tokens), uses non-greedy regex for markdown image stripping, simplifies prompt to single-turn, checks `json.code` before token extraction, and uses `accessToken` fallback for session cache eviction on auth errors. ([#2616](https://github.com/diegosouzapw/OmniRoute/pull/2616) — thanks @ovehbe) - **fix(deepseek-web):** SSE thinking/search routing and session lifecycle — properly routes thinking vs content fragments based on `thinking_enabled` flag, handles search results with citation indices, appends search result footnotes, refactors `transformSSE()` and `collectSSEContent()` with shared helpers. ([#2624](https://github.com/diegosouzapw/OmniRoute/pull/2624) — thanks @ovehbe) - **fix(codex):** use allowlist to strip non-Responses-API fields in non-passthrough path — strips residual Chat Completions fields (`stream_options`, `service_tier`, `store`, `metadata`) from the request body when routing through the non-passthrough (translation) code path, preventing GPT-5.5 from receiving invalid parameters. ([#2615](https://github.com/diegosouzapw/OmniRoute/pull/2615) — thanks @diegosouzapw) - **fix(catalog):** skip static PROVIDER_MODELS when synced models exist — prevents stale/duplicate model entries in `/v1/models` for auto-synced providers. ([#2625](https://github.com/diegosouzapw/OmniRoute/pull/2625) — thanks @herjarsa) - **fix(qoder):** Cosy auth fallback for PAT tokens + vision support for qwen3-vl-plus — when a PAT token gets 401, falls back to Cosy auth against `api1.qoder.sh`; adds `supportsVision: true` to qwen3-vl-plus. ([#2629](https://github.com/diegosouzapw/OmniRoute/pull/2629) — thanks @herjarsa) - **fix(cli):** register tsx loader and add opencode config subcommand — registers `tsx/esm` at CLI startup so dynamic `.ts` imports resolve; adds `omniroute config opencode` convenience alias. ([#2631](https://github.com/diegosouzapw/OmniRoute/pull/2631) — thanks @amogus22877769) - **fix(claude):** improve Pi and OpenCode compatibility — adds Pi Coding Agent anchors to system transform removal, stores `_toolNameMap` as non-enumerable, strips `context_management` when thinking is disabled. ([#2621](https://github.com/diegosouzapw/OmniRoute/pull/2621) — thanks @unitythemaker) - **fix(passthrough):** restore semantic passthrough system-role-only extraction — reverts full `normalizeClaudeUpstreamMessages()` to lighter `extractSystemRoleMessages()` in CC semantic passthrough paths, preventing document/tool chain corruption. ([#2620](https://github.com/diegosouzapw/OmniRoute/pull/2620) — thanks @Tentoxa) - **fix(kiro):** stabilize conversationId across prompt compression — captures pre-compression body and uses the original first user message as seed for UUID v5, keeping Kiro's AWS conversation context stable. ([#2630](https://github.com/diegosouzapw/OmniRoute/pull/2630) — thanks @HALDRO) - **fix(t3-chat-web):** close implementation gaps for t3.chat TanStack Start, tracking of stream_options, and retry configurations — parses TSS Turbo Stream Serialization from `_serverFn/*`, tracks request `combo_strategy` via database migration `062_usage_history_combo_strategy.sql`, and makes batch retry backoffs custom-configurable via environment variables. ([#2634](https://github.com/diegosouzapw/OmniRoute/pull/2634) — thanks @oyi77) - **fix(reasoning):** extend empty `reasoning_content` injection to prevent tool call loops in Kimi K2 and replay models — injects empty `reasoning_content` field to Kimi models during tool-calling sequences to bypass loop issues. ([#2639](https://github.com/diegosouzapw/OmniRoute/pull/2639) — thanks @herjarsa) - **fix(cli):** Linux autostart via systemd user service on headless VPS — adds auto-generating systemd user service unit for headless setups on Linux, updating tray configs and system variables allowlist (`LOGNAME` and `XDG_CURRENT_DESKTOP`). ([#2635](https://github.com/diegosouzapw/OmniRoute/pull/2635) — thanks @janeza2) - **fix(combo):** preserve `` tag in SSE stream output for combos when using `context_cache_protection` to ensure correct context pinning round-trips. ([#2646](https://github.com/diegosouzapw/OmniRoute/pull/2646) — thanks @herjarsa) - **fix(rtk):** prevent false positives in RTK compression by skipping content-based filter matching for non-shell tool results (e.g. read_file, grep_search). ([#2642](https://github.com/diegosouzapw/OmniRoute/pull/2642) — thanks @HALDRO) - **fix(translator):** enable Claude extended thinking for Copilot Responses-API requests — handles reasoning budget and translations for Copilot. ([#2647](https://github.com/diegosouzapw/OmniRoute/pull/2647) — thanks @ivan-mezentsev) - **fix(tests):** remove duplicate assertion in schema coercion & fix(cli): ignore system vars in env check. (thanks @diegosouzapw) - **fix(combo):** resolve pending request leaks on unresponsive combo targets — implements a default 60-second per-target timeout during combo routing loops to abort hanging upstream requests and release capacity limits. ([#2663](https://github.com/diegosouzapw/OmniRoute/pull/2663) — thanks @Chewji9875) - **fix(proxy):** save custom dashboard proxies directly in SQLite registry — writes new provider/account/global/combo custom proxies directly to the modern `proxy_registry` database and assigns them via `proxy_assignments` instead of creating duplicate configurations. ([#2661](https://github.com/diegosouzapw/OmniRoute/pull/2661) — thanks @terence71-glitch) - **fix(settings):** expand effortLevel enum to support xhigh and max reasoning efforts — adds `xhigh` and `max` levels to the updateThinkingBudgetSchema to resolve validation failures that silently discarded top-effort request payloads. ([#2666](https://github.com/diegosouzapw/OmniRoute/pull/2666) — thanks @mrmm) - **fix(codex):** Codex OAuth refresh token reuse race condition under parallel requests. ([#2667](https://github.com/diegosouzapw/OmniRoute/pull/2667) — thanks @diegosouzapw) ### 📝 Maintenance - **chore(config):** ignore additional agent workflow command files (`.agents/commands/`). (thanks @diegosouzapw) - **chore(config):** ignore `memory-bank` and Cursor agent rules from tracking. (thanks @ovehbe) - **chore(ci):** publish @omniroute/opencode-plugin to npm — adds a parallel build, test, and publish job to the npm release workflow for automated package deployment. ([#2666](https://github.com/diegosouzapw/OmniRoute/pull/2666) — thanks @mrmm) ### 🏆 Hall de Contribuidores Um agradecimento especial a todos que contribuíram com código, revisões e testes para este release: @amogus22877769, @benzntech, @Chewji9875, @df4p, @diegosouzapw, @Gi99lin, @HALDRO, @herjarsa, @ivan-mezentsev, @janeza2, @L-aros, @mrmm, @ovehbe, @oyi77, @Tentoxa, @terence71-glitch, @unitythemaker --- ## [3.8.2] — 2026-05-22 ### ✨ New Features - **feat(@omniroute/opencode-plugin):** upstream-provider suffix in model display name — appends provider label to enriched names (e.g. `Claude Opus 4.7 · Claude` vs `Claude Opus 4.7 · Kiro`) so the OC TUI model picker can differentiate same-id models routed through different upstream connections. Default-on, opt-out via `features.providerTag: false`. ([#2602](https://github.com/diegosouzapw/OmniRoute/pull/2602) — thanks @mrmm) - **feat(@omniroute/opencode-plugin):** provider-tag becomes a prefix + traffic-light compression emoji — provider label now prepends (`Claude - Claude Opus 4.7`) for better TUI column grouping, with smart abbreviation for long labels (`GitHub Models` → `GHM`). Compression pipelines render intensity as emoji (🟢🟡🟠🔴). ([#2604](https://github.com/diegosouzapw/OmniRoute/pull/2604) — thanks @mrmm) - **feat(providers):** add 7 free-tier providers (Wave 1) — Arcee AI, InclusionAI, Krutrim, Liquid AI, MonsterAPI, Nomic, and Poolside now available as new API-key providers with provider icons, model specs, and full routing support. ([#2479](https://github.com/diegosouzapw/OmniRoute/pull/2479) — thanks @oyi77) - **feat(providers):** add Astraflow provider support with global + China endpoints — new provider with dual-region base URLs for global and mainland China access. ([#2486](https://github.com/diegosouzapw/OmniRoute/pull/2486) — thanks @ucloudnb666) - **feat(providers):** add `claude-web` provider — cookie-based Claude Web chat access without OAuth. ([#2476](https://github.com/diegosouzapw/OmniRoute/pull/2476) — thanks @oyi77) - **feat(providers):** add 14 free-tier providers (Wave 1b) — 360AI, Baichuan, Baidu, ByteDance/Doubao, IDEO, Kuaishou/Kling, Kunlun/Skywork, SenseTime/SenseNova, Stepfun, Tencent HunYuan, Zhipu GLM, Replicate, RunPod, and Modal with provider icons, model specs, and routing support. ([#2488](https://github.com/diegosouzapw/OmniRoute/pull/2488) — thanks @oyi77) - **feat(hermes):** add rich multi-role Hermes Agent CLI support — 7 configurable roles (default, delegation, vision, compression, web_extract, skills_hub, approval), per-role model selection with YAML config generation, dashboard card with preview, and home widget integration. ([#2526](https://github.com/diegosouzapw/OmniRoute/pull/2526) — thanks @apoapostolov) - **feat(cloud-agents):** cloud agents UX overhaul — tabs (tasks/agents/settings), status filters, Material icons, duration formatting, cloud agent credentials and health API endpoints, memory stats endpoint. ([#2516](https://github.com/diegosouzapw/OmniRoute/pull/2516) — thanks @oyi77) - **feat(authz):** manage-scope API keys may reach `/api/mcp/*` from non-loopback — Route Guard Tiers system (LOCAL_ONLY / ALWAYS_PROTECTED / MANAGEMENT), narrow carve-out for remote MCP access gated by `manage` scope; `/api/cli-tools/runtime/*` stays strict-loopback. Includes dashboard AuthzSection, inventory API, and comprehensive docs. ([#2473](https://github.com/diegosouzapw/OmniRoute/pull/2473) — thanks @mrmm) - **feat(home):** home page customization for experienced users — pin Provider Quota to home, toggle Quick Start and Provider Topology visibility via Appearance settings. ([#2531](https://github.com/diegosouzapw/OmniRoute/pull/2531) — thanks @apoapostolov) - **feat(home):** automatic refresh of Provider Quota — configurable interval (60s–600s) with toggle in Appearance settings; auto-refreshes pinned quota on the home page. ([#2532](https://github.com/diegosouzapw/OmniRoute/pull/2532) — thanks @apoapostolov) - **feat(@omniroute/opencode-plugin):** OmniRoute OpenCode plugin — live models fetched from OmniRoute API, combo-aware model listing, Gemini request sanitization, multi-instance support, auth flow integration, and 10 test files. ([#2529](https://github.com/diegosouzapw/OmniRoute/pull/2529) — thanks @mrmm) - **feat(executors):** forward OpenCode client headers to upstream providers — OpenCode-specific headers are now forwarded through the executor pipeline for improved compatibility. ([#2538](https://github.com/diegosouzapw/OmniRoute/pull/2538) — thanks @kang-heewon) - **feat(fireworks):** add new models with `modelIdPrefix` support — generic registry field that stores short model IDs and prepends the full path prefix before upstream API calls. Adds 6 new Fireworks models, `modelsUrl` for dynamic sync, and Qwen3 reranker. ([#2560](https://github.com/diegosouzapw/OmniRoute/pull/2560) — thanks @HALDRO) - **feat(@omniroute/opencode-plugin):** readable + filterable + offline-resilient model picker — `usableOnly` filter (only show providers with healthy connections), `diskCache` for offline hydration, `Combo:` prefix labeling, and compression metadata tags in combo display names. ([#2572](https://github.com/diegosouzapw/OmniRoute/pull/2572) — thanks @mrmm) - **feat(smart-pipeline):** multi-stage pipeline for auto combo routing — rule-based + intent-classifier + domain-specific stages with configurable pipeline router, accuracy benchmarks, and comprehensive tests. ([#2551](https://github.com/diegosouzapw/OmniRoute/pull/2551) — thanks @oyi77) - **feat(ops):** skip DB health check on startup via `OMNIROUTE_SKIP_DB_HEALTHCHECK=1` — replaces slow `integrity_check` (7+ min on large WAL) with `quick_check`, and adds env var to skip entirely. ([#2554](https://github.com/diegosouzapw/OmniRoute/pull/2554) — thanks @soyelmismo) - **refactor(dashboard):** Provider Quota grouped layout with vertical rail — restructures the page to a 2-column per-provider layout (left rail with icon/name/status, right content with dynamic per-provider columns), new `providerColumns.ts` / `ProviderGroup.tsx` / `AccountRow.tsx` components, env chip-filter row, bulk-refresh per group, and inline expanded panels. ([#2528](https://github.com/diegosouzapw/OmniRoute/pull/2528) — thanks @Gi99lin) - **feat(providers):** add 26 free-tier providers missing from registry — Novita, Avian, Chutes, Kluster, Targon, Nineteen, Celery, Ditto, Atoma, and more. ([#2590](https://github.com/diegosouzapw/OmniRoute/pull/2590) — thanks @oyi77) - **feat(providers):** add api-airforce free provider with 55 models. ([#2587](https://github.com/diegosouzapw/OmniRoute/pull/2587) — thanks @oyi77) - **feat(dashboard):** configurable sidebar — presets, drag-and-drop ordering, smart-grouping, and new Settings → Sidebar page. ([#2581](https://github.com/diegosouzapw/OmniRoute/pull/2581) — thanks @Gi99lin) ### 🔧 Bug Fixes - **fix(validation):** stop appending a second `/models` when the Gemini base URL already ends in `/models` — Google AI Studio connections using the default base URL were validating against `.../v1beta/models/models` and failing with `404` for every connection. ([#2545](https://github.com/diegosouzapw/OmniRoute/issues/2545)) - **fix(cloudflare-ai):** flatten OpenAI content-part arrays to plain strings for the Workers AI (`cf/`) executor — Workers AI's `/ai/v1/chat/completions` rejects `content: [{type:"text",...}]` with HTTP 400, so requests with array content now have their text parts joined into a string. ([#2539](https://github.com/diegosouzapw/OmniRoute/issues/2539)) - **fix(i18n):** replace leftover Portuguese strings in the English source with English on the Quota dashboards — the quota-share Beta notice (`betaConfigSaved*`) and the Provider Quota row's `Edit cutoffs` / `Refresh now` fallbacks were showing Portuguese. ([#2540](https://github.com/diegosouzapw/OmniRoute/issues/2540)) - **fix(proxy):** honor the legacy per-provider/global proxy config in `resolveProxyForProvider` — the Claude OAuth token exchange and token refresh only consulted the new proxy registry, so a proxy configured the legacy way (`/api/settings/proxy?level=provider`) was ignored and the exchange went out directly from the host, tripping Anthropic's IP `rate_limit_error` on VPS deployments. It now falls back to the legacy config, mirroring `resolveProxyForConnection`. ([#2456](https://github.com/diegosouzapw/OmniRoute/issues/2456)) - **fix(antigravity):** auto-discover a missing Cloud Code `projectId` via `loadCodeAssist` before failing — a freshly re-added Antigravity account whose stored `projectId` was empty (OAuth-time discovery returned nothing) now recovers the project on the first request instead of returning `422 Missing Google projectId`, mirroring the `gemini-cli` bootstrap. ([#2334](https://github.com/diegosouzapw/OmniRoute/issues/2334), [#2541](https://github.com/diegosouzapw/OmniRoute/issues/2541)) - **fix(stream):** keep the `/v1/responses` SSE connection warm for strict clients — emit an early keepalive while the upstream produces its first token and lower the heartbeat cadence to 4s, so Codex CLI's `reqwest` client (≈5s idle-read timeout) no longer drops the stream "before completion" on slow/reasoning models. `curl` was unaffected because it has no idle timeout. ([#2544](https://github.com/diegosouzapw/OmniRoute/issues/2544)) - **fix(electron):** wait longer for the server on first launch and reload once it responds — long post-upgrade DB migrations could exceed the 30s readiness probe, leaving the desktop app stuck on the "Server starting" screen even though the backend was healthy. The probe now targets the auth-exempt health endpoint with a generous timeout and reloads the window once the server comes up. ([#2460](https://github.com/diegosouzapw/OmniRoute/issues/2460)) - **fix(cli):** mark `bin/omniroute.mjs` as executable (mode 755) so the globally-installed CLI runs directly without a manual `chmod +x`. ([#2469](https://github.com/diegosouzapw/OmniRoute/issues/2469) — thanks @disonjer) - **fix(settings):** restore the Global System Prompt into the in-memory config on server startup and after JSON/SQLite import — it was only loaded by the PUT endpoint, so the toggle/prompt silently reverted to defaults after any restart or import. ([#2470](https://github.com/diegosouzapw/OmniRoute/issues/2470) — thanks @disonjer) - **fix(settings):** append the Global System Prompt **after** existing system content instead of prepending it, so provider/agent instructions (Kiro, OpenCode, Hermes, …) injected into the system message no longer override the user's global prompt via recency bias. ([#2468](https://github.com/diegosouzapw/OmniRoute/issues/2468) — thanks @disonjer) - **fix(kiro):** refresh imported social tokens (`authMethod === "imported"`) via the Kiro social-auth endpoint instead of AWS SSO OIDC — imported tokens carry a registered `clientId`/`clientSecret` but a social-issued refresh token the OIDC client cannot refresh, so auto-refresh was failing with "provider returned no new token". ([#2467](https://github.com/diegosouzapw/OmniRoute/issues/2467) — thanks @disonjer) - **fix(antigravity):** resolve the Cloud Code `projectId` from `providerSpecificData` as a fallback (and preserve it across token refresh) so the Gemini `/v1beta` streaming path stops returning a spurious `422 Missing Google projectId` for connections that store the project there. ([#2480](https://github.com/diegosouzapw/OmniRoute/issues/2480)) - **fix(api):** `GET /v1beta/models` now lists only models whose provider has an active/validated connection, matching the OpenAI-format `/v1/models` behavior, instead of returning the entire catalog. ([#2483](https://github.com/diegosouzapw/OmniRoute/issues/2483)) - **fix(cli):** persist `STORAGE_ENCRYPTION_KEY` into `DATA_DIR` (not only `~/.omniroute`) and refuse to auto-generate a fresh key when a `storage.sqlite` already exists — a new key cannot decrypt previously-encrypted credentials, so silently regenerating it locked users out of their database. The CLI now mirrors the server `bootstrapEnv` guard. (reported by Daniel Nach; original key persistence by @Chewji9875 — follow-up to [#1622](https://github.com/diegosouzapw/OmniRoute/issues/1622)) - **fix(gemini):** preserve and re-attach the `thoughtSignature` on Gemini thinking-model tool calls — thread the signature namespace through the `FORMATS.GEMINI` and `FORMATS.GEMINI_CLI` request translators so the cached signature (keyed by connection + tool-call id) is found on the follow-up turn. Fixes `[400]: Function call is missing a thought_signature in functionCall parts` on agentic Gemini tool use. ([#2504](https://github.com/diegosouzapw/OmniRoute/issues/2504)) - **fix(translator):** accept PDFs sent in the Responses-API `input_file` shape on the Gemini path, and the Gemini-style `document` shape on the Responses/Codex path — content parts are now normalized across `input_file` / `file` / `document` so a PDF reaches the model regardless of which field name the client used. ([#2515](https://github.com/diegosouzapw/OmniRoute/issues/2515)) - **fix(stream):** count `thinking` arrays and `reasoning_details` as useful stream output — a reasoning-only response (e.g. Mistral/StepFun with a low `max_tokens`) was misclassified as "Stream ended before producing useful content" and turned into a spurious 502; it is now recognized as valid output. ([#2520](https://github.com/diegosouzapw/OmniRoute/issues/2520)) - **fix(claude):** extract system/developer role messages in Claude Code semantic passthrough paths — moves `role:"system"` / `role:"developer"` messages from the `messages[]` array to the top-level `system` parameter before sending to Anthropic, which rejects them inside messages. Fixes memory injection context being silently dropped. ([#2497](https://github.com/diegosouzapw/OmniRoute/pull/2497) — thanks @unitythemaker) - **fix(vision-bridge):** auto-route non-standard provider models through OmniRoute self-loop — vision-bridge now detects when a model doesn't natively support vision and automatically re-routes the image through OmniRoute's own endpoint for format translation. ([#2487](https://github.com/diegosouzapw/OmniRoute/pull/2487) — thanks @herjarsa) - **fix(mitm):** add IPv6 DNS redirect, modular antigravity target, improved logging — MITM DNS handler now correctly redirects IPv6 (AAAA) queries alongside IPv4, adds a dedicated `antigravity.ts` target module, and enhances DNS/TLS logging for debugging. ([#2514](https://github.com/diegosouzapw/OmniRoute/pull/2514) — thanks @herjarsa) - **fix(usage):** improve Claude and MiniMax plan label detection — better tier name resolution for Claude OAuth usage (tier/plan/subscription_type/org fields) and new MiniMax plan label inference from quota totals. ([#2498](https://github.com/diegosouzapw/OmniRoute/pull/2498) — thanks @Gi99lin) - **fix(codex):** fan out image `n` requests in parallel — when Codex requests `n > 1` images, the image-generation handler now dispatches them concurrently instead of sequentially, significantly reducing total latency. ([#2499](https://github.com/diegosouzapw/OmniRoute/pull/2499) — thanks @nmime) - **fix(embeddings):** strip stale `Content-Encoding` headers from upstream response — prevents clients from receiving gzip-encoded responses with `identity` encoding declared, which caused silent data corruption. ([#2477](https://github.com/diegosouzapw/OmniRoute/pull/2477) — thanks @lordavadon2) - **fix(model):** return clear error instead of silent OpenAI default for unrecognized models — previously, an unrecognized model silently fell back to OpenAI; now returns a 404 with a descriptive message listing known providers. ([#2492](https://github.com/diegosouzapw/OmniRoute/pull/2492) — thanks @herjarsa) - **fix(dark-mode):** correct background token on Compression Override select — the combo compression override `