# codex-shim Run **Codex Desktop** against any BYOK model you can describe in `~/.codex-shim/models.json`, plus an optional passthrough to your **ChatGPT subscription's Codex model** — without rebuilding Codex. The shim is a local Python/aiohttp server that exposes an OpenAI Responses-compatible endpoint on loopback. Codex points at the shim; the shim routes each request to the matching upstream (OpenAI chat completions, Anthropic Messages, a generic OpenAI-shaped chat endpoint, or ChatGPT Codex passthrough), then translates streaming responses back into the shape Codex expects. > Tested on Codex Desktop **0.133.0-alpha.1** for macOS arm64. The shim server > and routing layer are plain Python/aiohttp and work on Windows, macOS, Linux, > WSL, and Git Bash. The only macOS-specific piece is the optional Desktop picker > ASAR patch, needed when Codex hides custom catalog entries. --- ## What this gives you Codex Desktop only shows models allowed by its server-side config. If you have OpenAI / Anthropic / Z.ai / DeepSeek / Gemini / OpenRouter / local proxy models you want as first-class picker entries, this wires them in locally. The practical win is that Codex keeps its native UX while model routing moves local: - **BYOK models in the normal Codex picker.** No Codex rebuild, no request replay workflow. - **Native Codex agent loops stay intact.** Function calls, tool outputs, reasoning blocks, image-capable models, shell-command metadata, and streaming SSE are translated instead of flattened into plain text. - **ChatGPT/Codex passthrough.** If `~/.codex/auth.json` has a valid Codex access token, the shim can route Codex's native `/v1/responses` traffic to ChatGPT's Codex backend under the `gpt-5.5` slug used by current Codex builds. - **Cursor/Composer passthrough.** If `cursor-agent login` is active, the shim exposes `composer-2-5` and routes through your Cursor subscription — no Dashboard API key (`crsr_…`) required. See [`docs/subscription-integration.md`](docs/subscription-integration.md). - **Auto Router (optional).** Add an `Auto (smart routing)` picker entry that uses a cheap classifier model to route each task to the cheapest configured model that can handle it — trivial turns stay cheap, hard turns escalate. See [`docs/AUTO_ROUTER.md`](docs/AUTO_ROUTER.md). - **Prompt-catching/proxy-friendly architecture.** Put a local proxy in front of the shim to dedupe boilerplate, inject stable instructions, repair pseudo-tool text, or route prompts by policy before they hit an upstream. - **Maintainer-side wins on real coding-agent runs.** In the maintainer's internal Codex tasks, ChatGPT passthrough plus a prompt-catching proxy in front of the shim has produced multi-x reductions in billed input tokens and noticeably faster wall time vs. the baseline route. No reproducible benchmark script ships with the repo yet, so treat that as anecdata — the benchmark section below explains how to measure your own setup against an explicit oracle before quoting numbers. --- ## Requirements - Python 3.11+. - Codex CLI/Desktop installed and authenticated. - One of: - `~/.codex-shim/models.json` with configured BYOK/upstream models; - a compatible JSON file passed with `--settings`; - `~/.codex/auth.json` containing `tokens.access_token` for ChatGPT/Codex passthrough-only use. - Windows: PowerShell/cmd works when installed via the Python package entry point; WSL or Git Bash is needed only for the optional `bin/` shell wrappers. - macOS only: `npx` and `codesign` if you need the optional Desktop picker patch. --- ## Install Recommended on macOS/Linux/WSL/Git Bash (installs the `codex-shim` entry point from `pyproject.toml`): ```bash git clone https://github.com/0xSero/codex-shim ~/codex-shim cd ~/codex-shim python3 -m pip install --user -e . ``` Recommended on native Windows PowerShell/cmd: ```powershell git clone https://github.com/0xSero/codex-shim $HOME\codex-shim cd $HOME\codex-shim py -3.11 -m pip install --user -e . ``` That pulls in `aiohttp` and installs the portable Python console command `codex-shim`. On POSIX-like shells, the optional `codex-app` and `codex-model` shortcuts live in `bin/`; symlink them if you want them on `PATH` too: ```bash mkdir -p ~/.local/bin ln -sf "$PWD/bin/codex-app" ~/.local/bin/codex-app ln -sf "$PWD/bin/codex-model" ~/.local/bin/codex-model ``` If you move the checkout, recreate those symlinks; `codex-shim app` launches `codex app` through the installed Python entry point and does not need them. Alternative on macOS/Linux/WSL/Git Bash (no install, run straight from the checkout): ```bash git clone https://github.com/0xSero/codex-shim ~/codex-shim cd ~/codex-shim python3 -m pip install --user aiohttp mkdir -p ~/.local/bin ln -sf "$PWD/bin/codex-shim" ~/.local/bin/codex-shim ln -sf "$PWD/bin/codex-app" ~/.local/bin/codex-app ln -sf "$PWD/bin/codex-model" ~/.local/bin/codex-model ``` For running the test suite: ```bash python3 -m pip install --user pytest pytest-asyncio ``` If your POSIX shell cannot find the commands, make sure `~/.local/bin` is on `PATH`: ```bash export PATH="$HOME/.local/bin:$PATH" ``` If PowerShell cannot find `codex-shim`, add your Python user Scripts directory to `Path`. For Python 3.11 installed from python.org, the usual path is: ```powershell $env:APPDATA\Python\Python311\Scripts ``` You can also skip `PATH` entirely and run through Python: ```powershell py -3.11 -m codex_shim.cli status ``` --- ## Windows support Yes, the shim works on Windows. The core shim is Python/aiohttp, binds to `127.0.0.1`, and writes the same Codex provider config that macOS/Linux use. Use one of these setups: | Setup | Status | Notes | |---|---|---| | Native Windows PowerShell/cmd | Supported | Install with `py -3.11 -m pip install --user -e .` and run `codex-shim ...`. | | WSL | Supported | Works like Linux. Best when Codex CLI/Desktop is also being driven from WSL. | | Git Bash | Supported | Works with the POSIX `bin/` wrappers if Python/Codex are on `PATH`. | | `bin/codex-app`, `bin/codex-model` in PowerShell/cmd | Not native | These are shell scripts. Use `codex-shim app ...` and `codex-shim model ...` instead. | | `patch-app` / `restore-app` | macOS only | They target `/Applications/Codex.app` and Electron ASAR signing on macOS. | Native Windows quick check: ```powershell py -3.11 -m pip install --user -e . codex-shim generate codex-shim start codex-shim status codex-shim list ``` If `codex-shim` is not on `Path`, use the module form: ```powershell py -3.11 -m codex_shim.cli generate py -3.11 -m codex_shim.cli start py -3.11 -m codex_shim.cli status ``` Path behavior is intentionally ordinary: - In native Windows, `~/.codex-shim/models.json` means `%USERPROFILE%\.codex-shim\models.json` and Codex config lives under `%USERPROFILE%\.codex\config.toml`. - In WSL, `~/.codex-shim/models.json` and `~/.codex/config.toml` are inside the Linux home directory unless you explicitly point `--settings` at a Windows path under `/mnt/c/...`. - Do not mix a WSL-generated `~/.codex/config.toml` with native Windows Codex and expect both to share files automatically. If Codex is native Windows, run the native Windows install path or manually keep the Windows config in sync. - The local provider URL is still `http://127.0.0.1:8765/v1`. The optional macOS picker patch is not required for the shim server to work. On Windows, if Codex can read the generated catalog/provider config, requests route through the same local endpoint as every other platform. Windows Store/MSIX Codex Desktop builds are stricter than the CLI. They may treat custom local/BYOK slugs as unavailable, rewrite `model = ""` back to `gpt-5.5`, and add `[tui.model_availability_nux]` entries on launch. That is a Desktop allowlist behavior, not a shim routing behavior: `codex exec`, the TUI, and the shim endpoint still use the configured model slug. The macOS `patch-app` helper does not apply to MSIX packages under `C:\\Program Files\\WindowsApps`. If Windows has a system proxy such as Clash/V2Ray, make sure loopback bypasses it: ```powershell setx NO_PROXY "127.0.0.1,localhost,::1" setx no_proxy "127.0.0.1,localhost,::1" ``` `codex-shim codex -- ...` and `codex-shim app ...` add those loopback entries to the launched process environment automatically; set them globally too if you run `codex.exe` directly. --- ## Quick start ### 1. Generate the catalog and start the shim ```bash codex-shim generate # reads ~/.codex-shim/models.json if present codex-shim start # background daemon on 127.0.0.1:8765 codex-shim list # show generated slugs and upstream routes codex-shim status # health probe + model count ``` Generated runtime files live under the repo-local `.codex-shim/` directory: ```text .codex-shim/custom_model_catalog.json # model picker catalog for Codex .codex-shim/config.toml # opt-in Codex provider config .codex-shim/shim.pid # daemon pid .codex-shim/shim.log # stdout/stderr + request summaries ``` The server binds `127.0.0.1` by default. It is meant to be a local loopback adapter, not an Internet-facing proxy. ### 2. Point Codex Desktop at it ```bash codex-shim app . # launch Codex Desktop with the shim wired in ``` `app` generates the catalog, starts the local daemon if needed, and writes a small managed block into `~/.codex/config.toml` so Codex Desktop uses the local provider. The previous config is backed up under `.codex-shim/` and the managed block can be removed with: ```bash codex-shim disable ``` After this, Codex Desktop sees every entry from `~/.codex-shim/models.json`, plus the `GPT-5.5` ChatGPT passthrough slug if (and only if) `~/.codex/auth.json` holds a valid `tokens.access_token`. If your Codex Desktop's model picker only shows `default` and refuses to render the catalog entries, apply the macOS picker patch below. ### 3. Switch the active Desktop model ```bash codex-model list codex-model gpt-5.5 # or any other slug from `list` codex-app # relaunch Codex with new default ``` `codex-model ` is a shortcut for `codex-shim model use `. It writes only the shim-managed block in `~/.codex/config.toml`. ### 4. Use the Codex CLI without writing config For one-off CLI runs, use inline `-c` overrides instead of changing `~/.codex/config.toml`: ```bash codex-shim codex -- "inspect this repo and summarize the architecture" ``` --- ## Custom config file The shim defaults to `~/.codex-shim/models.json`. If that file is missing, the shim still generates a catalog — and adds the `gpt-5.5` ChatGPT passthrough entry only when `~/.codex/auth.json` contains a valid `tokens.access_token`. You can point it at any compatible file: ```bash codex-shim --settings /path/to/my-models.json generate codex-shim --settings /path/to/my-models.json start ``` Recommended schema: ```json { "models": [ { "model": "gpt-5.5", "provider": "openai", "base_url": "https://api.openai.com/v1", "api_key": "sk-…", "display_name": "OpenAI GPT-5.5", "max_context_limit": 400000 }, { "model": "claude-opus-4-7-20251109", "provider": "anthropic", "base_url": "https://api.anthropic.com/v1", "api_key": "sk-ant-…", "display_name": "Claude Opus 4.7" }, { "model": "deepseek-v4-pro", "provider": "anthropic", "base_url": "https://api.deepseek.com/anthropic", "api_key": "…", "display_name": "DeepSeek V4 Pro", "no_image_support": true } ] } ``` The loader also accepts camelCase aliases (`baseUrl`, `apiKey`, `apiKeyEnv`, `displayName`, `maxContextLimit`, `maxOutputTokens`, `noImageSupport`, `extraHeaders`) and a legacy top-level `customModels` array, so existing model config exports can be used directly. The shim **never writes your API keys** into the generated catalog. Put literal keys in your settings file or reference them with `api_key_env`; credentials are resolved when requests are handled. Supported `provider` values: | provider | upstream API | |---|---| | `openai` | OpenAI `/v1/chat/completions` | | `generic-chat-completion-api` | OpenAI-shaped chat completions | | `anthropic` | Anthropic `/v1/messages` | The shim also accepts Anthropic Messages requests at `http://127.0.0.1:8765/v1/messages`. For `openai` and `generic-chat-completion-api` models, it translates Messages requests to OpenAI-shaped chat completions and converts responses back to Anthropic shape. For `anthropic` models, it passes the request through to the upstream `/messages` endpoint with the configured model name. The bridge supports text, image inputs, basic function tools/tool results, and streaming SSE. Provider features such as prompt caching, extended thinking signatures, files, and token counting remain upstream-specific. Useful model fields: | field | behavior | |---|---| | `display_name` | Human-readable picker label. | | `api_key_env` | Name of an environment variable that contains the upstream API key. | | `max_context_limit` | Catalog context window and compaction limits. | | `max_output_tokens` | Default max output when translating to Anthropic. | | `no_image_support` | When true, catalog advertises text-only input. | | `extra_headers` | Optional upstream headers merged into requests. | ### OpenCode Go OpenCode Go adds and updates models over time. Refresh the local settings from the live OpenCode Go catalog instead of copying a hard-coded model list: ```bash export OPENCODE_GO_API_KEY="..." codex-shim opencode-go refresh codex-shim generate codex-shim start ``` The refresh command calls `https://opencode.ai/zen/go/v1/models`, probes each model through both `/chat/completions` and `/messages`, and writes `ocgo-*` entries into `~/.codex-shim/models.json`. Models that work through chat completions are configured as `generic-chat-completion-api`; models that only work through Messages are configured as `anthropic`. Use `--settings` to write a different file, `--api-key-env` to use a different environment variable name, or `--prefer messages` if you want models that support both routes to prefer Anthropic Messages: ```bash codex-shim --settings /path/to/models.json opencode-go refresh --prefer messages ``` If you need a minimal manual fallback, add one model with the same key env: ```json { "models": [ { "slug": "ocgo-glm-5-1", "model": "glm-5.1", "display_name": "OpenCode Go GLM 5.1", "provider": "generic-chat-completion-api", "base_url": "https://opencode.ai/zen/go/v1", "api_key_env": "OPENCODE_GO_API_KEY" } ] } ``` The current OpenCode Go model list and endpoint split are documented at . ### Ollama / local OpenAI-compatible chat endpoints Codex sends the Responses API. Ollama and many local servers expose OpenAI-shaped `/v1/chat/completions` instead. Keep Codex pointed at the shim with `wire_api = "responses"`; configure Ollama as `generic-chat-completion-api` so the shim translates Responses ⇄ chat completions: ```json { "models": [ { "model": "llama3.2", "display_name": "Ollama Llama 3.2", "provider": "generic-chat-completion-api", "base_url": "http://127.0.0.1:11434/v1", "api_key": "ollama" } ] } ``` `codex-shim --settings /path/to/ollama-launch-models.json generate` also accepts launch-model style files with a top-level `launchModels` / `launch_models` array, including bare strings. `provider: "ollama"` is normalized to `generic-chat-completion-api` with `http://127.0.0.1:11434/v1` when no base URL is supplied. Repeated `codex-shim enable`, `codex-shim app`, and `codex-shim model use ...` runs are idempotent: the shim-managed top-level keys and `[model_providers.codex_shim]` block are removed before the new managed block is written, so duplicate profile/provider keys should not accumulate. Codex may make small background calls to OpenAI model slugs such as `gpt-5.4-mini` for its own product behavior. Those calls are not Ollama routing failures; use the shim request log to confirm the actual selected model for the agent turn. --- ## Picker patch for Codex Desktop on macOS Codex Desktop has a Statsig server-side allowlist (`use_hidden_models: true`) that hides any model whose slug is not on a hardcoded list. Custom catalog entries fall into the hidden bucket and never render in the picker. A single-boolean ASAR patch flips the allowlist branch off so the picker only checks the local `hidden` flag (which this catalog never sets). On recent Codex Desktop builds, the patch also changes the local recent-thread loader from `modelProviders: null` to `modelProviders: []` so the sidebar continues to show existing native `openai` chats while Desktop is routed through the `codex_shim` provider. The combined patch has been tested on Codex Desktop **26.519.41501** / `codex-cli 0.133.0-alpha.1` on macOS arm64. > Back up `app.asar` and `Info.plist` before patching. ```bash APP=/Applications/Codex.app sudo cp -R "$APP" "$APP.unpatched-$(date +%Y%m%d-%H%M%S)" # 1. Extract the ASAR cd /tmp && rm -rf codex-asar-patch && mkdir codex-asar-patch && cd codex-asar-patch npx --yes @electron/asar extract "$APP/Contents/Resources/app.asar" extracted # 2. Patch the picker filter (single occurrence in tested builds) PATCH_FILE=$(grep -RIl 'useHiddenModels' extracted/webview/assets/model-queries-*.js | head -n1) sed -i.bak -E 's/let u=c\.useHiddenModels&&o!==`amazonBedrock`,d;/let u=!1,d;/' "$PATCH_FILE" diff "$PATCH_FILE.bak" "$PATCH_FILE" || true rm "$PATCH_FILE.bak" # 3. Patch the sidebar recent-thread provider filter (single occurrence) SIDEBAR_FILE=$(grep -RIl 'listRecentThreads' extracted/webview/assets/app-server-manager-signals-*.js | head -n1) python3 - "$SIDEBAR_FILE" <<'PY' from pathlib import Path import sys path = Path(sys.argv[1]) text = path.read_text() old = "listRecentThreads({cursor:e,limit:t}){return this.params.requestClient.sendRequest(`thread/list`,{limit:t,cursor:e,sortKey:this.recentConversationSortKey,modelProviders:null,archived:!1,sourceKinds:ke})}" new = "listRecentThreads({cursor:e,limit:t}){return this.params.requestClient.sendRequest(`thread/list`,{limit:t,cursor:e,sortKey:this.recentConversationSortKey,modelProviders:[],archived:!1,sourceKinds:ke})}" if text.count(old) != 1: raise SystemExit("expected one sidebar provider filter occurrence") path.write_text(text.replace(old, new, 1)) PY # 4. Repack npx --yes @electron/asar pack extracted app.asar.new sudo cp app.asar.new "$APP/Contents/Resources/app.asar" ``` That alone can crash Codex on next launch with `EXC_BREAKPOINT`. Electron's `ElectronAsarIntegrity` field in `Info.plist` is a SHA-256 of the **JSON header** of the ASAR archive (not the whole file). Recompute it and re-sign: ```bash # 5. Compute new header hash HEADER_HASH=$(python3 - "$APP/Contents/Resources/app.asar" <<'PY' import struct, hashlib, sys with open(sys.argv[1], 'rb') as f: data_size, header_size, _, json_size = struct.unpack('<4I', f.read(16)) header_json = f.read(json_size) print(hashlib.sha256(header_json).hexdigest()) PY ) echo "new header hash: $HEADER_HASH" # 6. Patch Info.plist (replaces the hash for Resources/app.asar) sudo /usr/libexec/PlistBuddy -c \ "Set :ElectronAsarIntegrity:Resources/app.asar:hash $HEADER_HASH" \ "$APP/Contents/Info.plist" # 7. Ad-hoc re-sign sudo codesign --force --deep --sign - "$APP" # 8. Launch open "$APP" ``` To roll back: `sudo rm -rf "$APP" && sudo mv "$APP.unpatched-…" "$APP"`. The CLI also has helper commands for patching/restoring `app.asar` and the matching ASAR integrity metadata: ```bash codex-shim patch-app codex-shim restore-app ``` If Codex still crashes after `patch-app`, restore with `codex-shim restore-app` and re-check the manual patch needles against the installed Desktop build. --- ## ChatGPT/Codex passthrough If `~/.codex/auth.json` exists and contains `tokens.access_token`, the shim exposes a synthetic `gpt-5.5` catalog entry that proxies straight to: ```text https://chatgpt.com/backend-api/codex/responses ``` The entry is **only** advertised in `/health`, `/v1/models`, `codex-shim list`, and the generated `custom_model_catalog.json` while that token is present. Once you `codex logout` or the file is missing, the slug stops appearing — so the picker never shows an option that would 401 on first use. Run `codex login` to mint a new token and the entry comes back automatically on the next `codex-shim generate`. The passthrough keeps Codex's native `/v1/responses` payload intact, changes the model to `gpt-5.5`, and sends your Codex access token as `Authorization: Bearer ` with the ChatGPT account id from `auth.json` when present. It bypasses configured BYOK routes entirely and uses your ChatGPT subscription quota. It is already in `.codex-shim/custom_model_catalog.json` after `codex-shim generate`. Select `GPT-5.5` in the picker, or run: ```bash codex-model gpt-5.5 ``` Older local configs or notes may refer to `openai-gpt-5-5`; the server accepts that prefix as an alias and routes it to the same passthrough. --- ## Cursor/Composer passthrough (subscription) If `cursor-agent status` shows you are logged in, the shim exposes **Composer 2.5** as slug `composer-2-5` and routes each request by spawning `cursor-agent` with your CLI OAuth session — the same pattern [Open Design](https://github.com/nexu-io/open-design) uses for Cursor Agent. ```bash cursor-agent login scripts/codex-shim-install-cursor-composer codex-shim model use composer-2-5 codex-app ``` The install helper is optional; it regenerates the local catalog/config and sets `composer-2-5` as the active model when `cursor-agent status` reports an active login. Troubleshoot with `cursor-agent status` and `/health`, which reports `cursor_passthrough: true` when the shim can expose Composer. Do **not** configure Composer via `cursor-api.standardagents.ai` unless you intentionally want Dashboard API-key billing (`crsr_…`). That path is BYOK, not CLI subscription. --- ## How routing works ```text Codex Desktop ── /v1/responses ──▶ codex-shim (127.0.0.1:8765) │ ├── slug "gpt-5.5" │ └─▶ chatgpt.com/backend-api/codex/responses │ (Authorization: Bearer ) │ ├── provider "openai" / "generic-…" │ └─▶ baseUrl/chat/completions │ (Authorization: Bearer apiKey) │ └── provider "anthropic" └─▶ baseUrl/messages (x-api-key: apiKey, anthropic-version: …) ``` The shim translates Codex's Responses-API request into the upstream's shape (chat completions or Anthropic Messages) and translates the streamed reply back. Extended-thinking blocks from Anthropic-shaped upstreams (Claude, DeepSeek, GLM, etc.) round-trip through `reasoning.encrypted_content` items. --- ## Auto Router (smart routing) Optionally add one extra picker entry — **`Auto (smart routing)`** (slug `codex-auto`) — that chooses the right model *per task*: trivial turns go to a cheap model, hard turns escalate to your strongest one. It runs entirely on the models you already configure. On each new task the shim asks a cheap **classifier** model you nominate to score every candidate `0.0–1.0` (how likely it nails the task first try), reading a short **capability card** per candidate. It then routes to the **cheapest candidate whose score clears `threshold`** (default `0.7`), caches that decision for the task's tool-call round-trips, and falls back safely on any error. The classifier never sees price, so it can't be biased toward expensive models. Turn it on by adding a `router` block to `~/.codex-shim/models.json`: ```jsonc "router": { "enabled": true, "slug": "codex-auto", "classifier": "minimax-m3", // slug of a cheap configured model "threshold": 0.7, "default": "minimax-m3", "cache": true, "candidates": [ { "slug": "minimax-m3", "cost": 0.3, "supports_images": false, "card": "Cheap, fast. Single-file edits, codegen, simple refactors." }, { "slug": "opus", "cost": 5.0, "supports_images": true, "card": "Frontier. Big multi-file refactors, hard debugging, images." } ] } ``` Prove it end to end with no keys and no network: ```bash python3 examples/auto_router_demo.py ``` It spins up a mock multi-backend server, starts the **real** shim with the router on, and shows trivial→cheap, medium→mid, hard→strong, image→image-capable, and a repeat served from cache. Full configuration, env knobs (`CODEX_SHIM_ROUTER_LOG`, `CODEX_SHIM_DISABLE_ROUTER`, …), and failure behavior are in [`docs/AUTO_ROUTER.md`](docs/AUTO_ROUTER.md). --- ## Tool calls and agent loops Codex expects Responses-API output items. Most BYOK upstreams speak either OpenAI chat completions or Anthropic Messages. The shim bridges the gap: | Codex/Responses item | OpenAI-shaped upstream | Anthropic upstream | |---|---|---| | `tools: [{type: "function", ...}]` | `tools: [{type: "function", function: ...}]` | `tools: [{name, description, input_schema}]` | | `function_call` output item | Chat `tool_calls[]` | `tool_use` content block | | `function_call_output` input item | Chat `role: "tool"` message | `tool_result` user content block | | streamed argument deltas | `response.function_call_arguments.delta` | `response.function_call_arguments.delta` | | parallel calls | Preserved via `parallel_tool_calls` where supported | Multiple `tool_use` blocks | This is the piece that makes the shim useful for real Codex runs instead of only text chat. A model can ask Codex to run tools, Codex sends the tool output back through the shim, and the upstream model continues the same loop. Native Responses-only tools now have BYOK fallbacks: | Responses tool | Chat/Anthropic fallback | |---|---| | `computer_use` / `computer_use_preview` | `computer_use` function with `{action, x, y, text, ...}` | | `web_search` / `web_search_preview` | `web_search` function with `{query, ...}` | | `apply_patch` | `apply_patch` function with `{patch, ...}` | | `local_shell` / `shell` | `local_shell` function with `{command, ...}` | | Codex MCP functions | Passed through as normal function tools | That keeps BYOK models inside the Codex agent loop even when the upstream API is chat-completions or Anthropic Messages instead of native Responses. ChatGPT passthrough remains the highest-fidelity path for first-party hosted tool item shapes, but BYOK routes no longer drop those tools. Visual feedback is preserved for vision-capable BYOK providers: Responses `input_image`, `computer_call_output` screenshots, and visual `function_call_output` payloads become OpenAI chat `image_url` parts or Anthropic image blocks instead of being flattened to text. Known edge cases: - BYOK native-tool fallbacks depend on the Codex client/harness recognizing and executing the fallback function call. The shim translates tool schemas and round-trips tool outputs; it does not execute computer, shell, patch, or MCP actions itself. - Some OpenAI-compatible providers advertise tool calls but stream malformed JSON arguments. The shim preserves deltas; the provider still has to emit valid JSON by the end of the call. - If a provider ignores `parallel_tool_calls`, Codex may still request one tool at a time. That is an upstream behavior, not a catalog issue. --- ## Compaction Codex can compact long sessions through `POST /v1/responses/compact`. | route | behavior | |---|---| | ChatGPT passthrough (`gpt-5.5` / `openai-gpt-5-5*`) | Forwards to ChatGPT's native `/backend-api/codex/responses/compact` endpoint and rewrites returned model metadata back to the requested shim slug. | | BYOK OpenAI/chat-completions providers | Sends a non-streaming summarization request through `/chat/completions`, then returns a Responses-shaped compacted window whose `output` can be used as the next `input`. | | BYOK Anthropic providers | Sends a non-streaming compact request through `/messages`, then returns the same Responses-shaped compacted window. | The BYOK path intentionally strips provider-hostile fields such as `stream` and `service_tier` before forwarding. It preserves the practical Codex behavior — a smaller next context window — without pretending third-party chat APIs can emit OpenAI's opaque encrypted compaction items. --- ## Computer use, shell commands, images, and MCP The generated catalog advertises the Codex-facing capabilities Codex needs to run as an agent: | catalog field | value | |---|---| | `shell_type` | `shell_command` | | `apply_patch_tool_type` | `freeform` | | `web_search_tool_type` | `text_and_image` | | `supports_parallel_tool_calls` | `true` | | `input_modalities` | `text,image` unless `noImageSupport: true` | | `supports_image_detail_original` | disabled when `noImageSupport: true` | What that means in practice: - **Shell/file operations** are still executed by Codex Desktop/CLI. The shim only translates the model request and response stream. - **Images/screenshots** can pass to providers that accept images. Responses `input_image` items, `computer_call_output` screenshots, and visual tool outputs are preserved as OpenAI chat `image_url` parts or Anthropic image blocks. Set `noImageSupport: true` for text-only upstreams so Codex does not send image content they cannot parse. - **Computer-use/native hosted tools** use native Responses item types on the ChatGPT passthrough path. BYOK chat/Anthropic routes receive deterministic function-tool fallbacks (`computer_use`, `web_search`, `apply_patch`, `local_shell`) so they can stay in the same Codex tool loop. Codex Desktop forwards three generic MCP tools to every model: - `list_mcp_resources` - `list_mcp_resource_templates` - `read_mcp_resource` It does **not** flatten individual MCP server tools into the function list. That is a Codex client behavior, not a shim limitation. Shim-routed models receive the same MCP tools as built-in OpenAI models. The model is expected to call `list_mcp_resources` to discover what is available. --- ## Prompt catching and request interception There are two useful interception layers: ### 1. Built-in request summaries Every `/v1/responses` request is summarized into `.codex-shim/shim.log`. Use it while debugging model routing, tool schemas, and prompt size: ```bash tail -f .codex-shim/shim.log ``` The log is intentionally summary-level so it does not dump API keys or full prompt bodies by default. ### 2. Local prompt-catching proxy in front of this shim For deeper control, put a small local proxy in front of `codex-shim` and point Codex at that proxy. That layer can inspect the full Responses request before it reaches this shim, then forward to `http://127.0.0.1:8765/v1/responses`. Common uses: - inject a stable system/developer preamble; - strip repeated boilerplate before it burns tokens; - repair pseudo-tool text such as XML-ish `` drafts into structured tool calls before Codex sees them; - route some prompts to ChatGPT passthrough and others to BYOK models; - redact or hash large file blobs in logs. Minimal aiohttp forwarder shape: ```python from aiohttp import ClientSession, web UPSTREAM = "http://127.0.0.1:8765" async def responses(request): body = await request.json() body = catch_prompt(body) # mutate or record the Responses payload async with ClientSession() as s: async with s.post(f"{UPSTREAM}/v1/responses", json=body, headers=request.headers) as r: return web.Response(body=await r.read(), status=r.status, headers=r.headers) def catch_prompt(body): # Keep this deterministic. Codex retries are much easier to debug when the # same input produces the same transformed payload. return body app = web.Application() app.router.add_post("/v1/responses", responses) web.run_app(app, host="127.0.0.1", port=8766) ``` Then launch Codex with the shim provider URL set to `http://127.0.0.1:8766/v1` instead of `8765`. Keep prompt catching outside `codex_shim/translate.py` unless you want every BYOK route to share the same mutation policy. --- ## Benchmarking cost and speed The right benchmark is an actual Codex task, not a synthetic hello-world completion. Measure the same repository, prompt, model, and tool budget across routes. Suggested quick protocol: 1. Pick one real task that uses tools, e.g. "find the bug, edit the file, run the focused test". 2. Run it once through your baseline Codex route and once through `gpt-5.5` passthrough or your BYOK model. 3. Record wall time, request count, prompt tokens, output tokens, tool-call count, and final test result. 4. Compare only successful end-to-end runs. Useful shell timing wrapper: ```bash /usr/bin/time -f 'wall=%E cpu=%P max_rss_kb=%M' codex-shim codex -- "your task here" ``` The `--` separator is accepted and stripped by the wrapper. It is optional, but it keeps task prompts that start with `-` from being parsed as wrapper flags. A good report looks like: ```text Oracle: same repo commit, same prompt, same focused test command Baseline: 12 requests, 210k input tokens, 19k output tokens, 18m42s, test passed Shim: 8 requests, 31k input tokens, 11k output tokens, 2m35s, test passed Result: 6.8x fewer billed input tokens, 7.2x faster wall time ``` The exact multiplier depends on model, prompt catcher policy, repo size, network path, and how often the agent calls tools. --- ## Commands ```text codex-shim generate regenerate catalog/config without starting daemon codex-shim start regenerate catalog and start local shim daemon codex-shim enable start daemon and write managed ~/.codex/config.toml block codex-shim status health check + model count codex-shim doctor read-only local diagnostics report codex-shim stop stop daemon codex-shim disable remove managed config block and stop daemon codex-shim restart stop, regenerate, and start daemon codex-shim list list generated slugs and upstream routes codex-shim opencode-go refresh refresh OpenCode Go models into the settings file codex-shim model list list slugs currently usable in the picker codex-shim model use set the Desktop default model in managed config codex-shim codex -- exec `codex` CLI through inline shim overrides codex-shim app [path] launch Codex Desktop through managed shim config codex-shim patch-app patch macOS Codex Desktop picker allowlist codex-shim restore-app restore macOS app.asar from patch backup codex-app [path] shortcut for `codex-shim app` codex-model [list|] shortcut for `codex-shim model …` ``` Global flags: - `--settings `: used by catalog/model/start/app/codex/doctor flows. - `--port `: used by daemon/provider/doctor flows. `patch-app` and `restore-app` always target `/Applications/Codex.app`, do not use `--settings`, and exit with a clear error on Windows/Linux. --- ## Model picker (web UI) The shim exposes a small browser UI for switching the active model without restarting the CLI: - `GET /picker` — self-contained HTML page (dark theme) listing every model the shim currently knows about, with the active one highlighted. - `GET /api/models` — JSON list backing the picker. - `POST /api/switch` — `{"slug": "...", "restart_codex": true|false}`. The shim rewrites `model = "..."` and the `[model_providers.codex_shim]` `name = "..."` in `~/.codex/config.toml` so the Codex Desktop UI shows the selected model's display name (e.g. "Kimi K2.6") instead of the generic "Codex Shim" label, and optionally relaunches Codex Desktop (`open -a Codex` on macOS, `taskkill` + `Codex.exe` on Windows). This state-changing picker endpoint requires the per-process `X-Codex-Shim-Picker-Token` header embedded in `/picker`. All picker routes are behind the same `Host`-header allowlist as the rest of the shim, so a visited web page cannot drive them via DNS rebinding. The state-changing `/api/switch` endpoint also requires a per-process picker token, so third-party pages cannot trigger model switches just because the loopback server is reachable. --- ## Security and privacy - The shim binds to `127.0.0.1` by default. - The shim validates the `Host` header on every request and rejects anything that is not a loopback name (`127.0.0.1`, `localhost`, `::1`), the configured bind host, or an entry in `CODEX_SHIM_ALLOWED_HOSTS`. This blocks DNS-rebinding attacks where a web page you visit resolves its own domain to `127.0.0.1` and drives the shim with your credentials. If you deliberately bind to a non-loopback host, add the host(s) you reach it by to `CODEX_SHIM_ALLOWED_HOSTS` (comma-separated). - The model picker protects its state-changing `/api/switch` endpoint with a per-process picker token, so cross-site pages cannot switch the active model or request a Desktop restart without loading the picker page. - API keys stay in your settings file; the generated catalog does not contain them. - Request logs are summary-level by default and avoid full prompt/API-key dumps. - ChatGPT passthrough reads `~/.codex/auth.json` at request time and forwards the access token only to ChatGPT's Codex endpoint. - If you put a prompt-catching proxy in front of the shim, that proxy controls what it logs. Redact or hash large/private prompt bodies there. --- ## Limitations - Codex internals and model-picker bundles change. The ASAR patch is version sensitive by nature. - The ChatGPT passthrough endpoint is the endpoint current Codex builds use; it may move or change shape in a future Codex release. - BYOK providers vary wildly in tool-call quality. The shim translates shapes; it cannot make an upstream model reliably emit valid tool-call JSON. - Hosted Responses-only tools are highest fidelity on the ChatGPT passthrough path. BYOK routes get normal function-tool translation. - The `bin/codex-app` and `bin/codex-model` shortcuts are POSIX shell scripts. In native Windows shells, use the installed `codex-shim` command instead. --- ## Troubleshooting ### Shim will not start ```bash codex-shim doctor codex-shim status tail -n 80 .codex-shim/shim.log ``` `codex-shim doctor` prints a read-only diagnostics report grouped by section (Python, dependencies, Codex CLI, settings, runtime files, daemon health, passthrough availability, proxy bypass, and Codex config). It never writes configuration, starts/stops the daemon, calls model providers, or prints API keys/tokens. It exits 1 only when a hard `FAIL` is detected; warnings are meant as local setup hints. Common causes: - Python is older than 3.11. - `aiohttp` is not installed in the Python used by the wrapper. - Port `8765` is already in use. Start on another port: ```bash codex-shim --port 8766 restart codex-shim --port 8766 app . ``` ### `~/.codex-shim/models.json` is missing That is fine for ChatGPT passthrough-only use, **provided** `~/.codex/auth.json` has a valid `tokens.access_token`. In that case `codex-shim generate` writes a catalog containing just `gpt-5.5`. If neither file is present, the catalog will be empty and `codex-shim list` will exit non-zero with a hint to run `codex login` or pass a compatible settings file: ```bash codex-shim --settings /path/to/my-models.json generate ``` ### `codex-shim list` exits 1 with "No models available" You have neither configured models in `~/.codex-shim/models.json` nor a valid Codex login. Pick one: ```bash codex login # populate ~/.codex/auth.json # or codex-shim --settings /path/to/my-models.json list ``` ### Codex shows only `default` Run: ```bash codex-shim generate codex-shim model list ``` If the catalog contains your models but Desktop still hides them, apply the macOS picker patch. On Windows Store/MSIX Desktop, the same allowlist can rewrite the active model back to `gpt-5.5`; use `codex-shim codex -- ...` / Codex CLI for BYOK routes, or a non-MSIX/Desktop build that can read the custom catalog without rewriting the config. ### Windows proxy sends loopback traffic away from the shim If `codex.exe` returns proxy/502 errors while the shim is healthy, a system proxy may be intercepting `http://127.0.0.1:8765`. Set both uppercase and lowercase bypass variables before launching Codex: ```powershell $env:NO_PROXY = "127.0.0.1,localhost,::1" $env:no_proxy = "127.0.0.1,localhost,::1" ``` `codex-shim app ...` and `codex-shim codex -- ...` set those entries for the child process automatically. ### Model appears but requests 404 The selected slug is not in the current generated catalog. Regenerate after editing `~/.codex-shim/models.json` or the file passed with `--settings`: ```bash codex-shim generate codex-model list codex-model ``` ### Upstream returns 401/403 The API key in your model settings file is wrong, expired, or missing a provider-specific header. For ChatGPT passthrough, refresh Codex login so `~/.codex/auth.json` contains a valid `tokens.access_token`. ### Tool calls turn into text Use the ChatGPT passthrough path first to confirm Codex itself is sending tools. If passthrough works but a BYOK route does not, the upstream probably lacks native tool-call support or emits malformed streamed arguments. Check `.codex-shim/shim.log` for the requested model and tool count. ### Images fail on a text-only model Set `"noImageSupport": true` for that model in the settings file and regenerate the catalog. ### Streaming hangs Check whether the upstream streams correctly outside Codex. Then restart the local daemon: ```bash codex-shim restart tail -f .codex-shim/shim.log ``` The server uses a long read timeout because real coding-agent turns can stream for a while; a silent hang is usually upstream/network/provider behavior. ### macOS app crashes after patching You repacked `app.asar` but did not update `ElectronAsarIntegrity` and re-sign, or the patch hit the wrong JavaScript bundle. Restore and retry: ```bash codex-shim restore-app codex-shim patch-app ``` ### Reset generated shim state ```bash codex-shim stop # Remove .codex-shim manually if you want a completely fresh generated state. codex-shim generate codex-shim start ``` --- ## File layout ```text codex_shim/ python source (server + cli + translation) bin/codex-shim main entrypoint bin/codex-app shortcut wrapping `codex-shim app` bin/codex-model shortcut wrapping `codex-shim model …` .codex-shim/ generated catalog, config, logs, pid (gitignored) tests/ pytest suite ``` Config behavior: - `codex-shim generate`, `start`, `stop`, `restart`, `list`, `status`, and `codex-shim codex -- ...` do not persistently modify `~/.codex/config.toml`. - `codex-shim enable`, `codex-shim app`, and `codex-shim model use ` write managed blocks to `~/.codex/config.toml`. If existing top-level Codex model keys are displaced, the managed block records them so disable can restore those keys without reverting unrelated config edits. - `codex-shim disable` removes the managed blocks, restores displaced top-level model keys when present, and stops the daemon. --- ## Development checks ```bash python3 -m pytest tests/ python3 -m compileall codex_shim/ -q ``` The tests cover settings/catalog generation, request translation, server routing, and CLI settings-file UX. Add regression tests when changing translation behavior; tool-call shape bugs are easy to miss by eyeballing streams. --- ## Contributing Good contributions include: - new provider translation tests; - captured stream fixtures for tricky tool-call/reasoning cases; - compatibility notes for new Codex Desktop builds; - safer picker patch detection for changed ASAR bundles; - docs for known-good provider configs. Before opening a PR, run the development checks above and include the Codex Desktop/CLI version you tested. --- ## License MIT — see `LICENSE`. Codex Desktop is a trademark of OpenAI. This project is unaffiliated.