# OpenClaude Advanced Setup

This guide is for users who want source builds, Bun workflows, provider profiles, diagnostics, or more control over runtime behavior.

## Install Options

OpenClaude requires Node.js `>=22.0.0` for npm installs and runtime. Bun is
only required when building or running from source.

### Option A: npm

```bash
npm install -g @gitlawb/openclaude@latest
```

### Option B: From source with Bun

Use Bun `1.3.13` or newer for source builds. Older Bun versions can fail during `bun run build`.

```bash
git clone https://github.com/Gitlawb/openclaude.git
cd openclaude

bun install
bun run build
npm link
```

### Option C: Run directly with Bun

```bash
git clone https://github.com/Gitlawb/openclaude.git
cd openclaude

bun install
bun run dev
```

## Provider Examples

### OpenAI

```bash
export CLAUDE_CODE_USE_OPENAI=1
export OPENAI_API_KEY=sk-...
export OPENAI_MODEL=gpt-4o
```

### Codex via ChatGPT auth

`codexplan` maps to GPT-5.5 on the Codex backend with high reasoning.
`codexspark` maps to GPT-5.3 Codex Spark for faster loops.

If you use the in-app provider wizard, choose `Codex OAuth` to open ChatGPT sign-in in your browser and let OpenClaude store Codex credentials securely.

If you already use the Codex CLI, OpenClaude reads `~/.codex/auth.json` automatically. You can also point it elsewhere with `CODEX_AUTH_JSON_PATH` or override the token directly with `CODEX_API_KEY`.

If you set `CODEX_API_KEY` manually and are not relying on `auth.json` or stored
Codex OAuth credentials, also set `CHATGPT_ACCOUNT_ID` (or
`CODEX_ACCOUNT_ID`).

```bash
export CLAUDE_CODE_USE_OPENAI=1
export OPENAI_MODEL=codexplan

# optional if you do not already have ~/.codex/auth.json
export CODEX_API_KEY=...
export CHATGPT_ACCOUNT_ID=...

openclaude
```

### DeepSeek

```bash
export CLAUDE_CODE_USE_OPENAI=1
export OPENAI_API_KEY=sk-...
export OPENAI_BASE_URL=https://api.deepseek.com/v1
export OPENAI_MODEL=deepseek-v4-flash
```

Use `deepseek-v4-pro` when you want the stronger model. `deepseek-chat` and `deepseek-reasoner` remain available as DeepSeek's legacy API aliases.

### Google Gemini

```bash
export CLAUDE_CODE_USE_GEMINI=1
export GEMINI_API_KEY=...
export GEMINI_MODEL=gemini-3-flash-preview
```

### Claude on Vertex AI

The Vertex route uses Anthropic's Claude-on-Vertex API. It is not a general
Vertex AI Model Garden adapter for Gemini or arbitrary partner models; use the
Gemini provider for Gemini models and OpenAI-compatible routes for compatible
third-party gateways.

Authentication uses Google Application Default Credentials through
`google-auth-library`. There is no `OPENAI_API_KEY`-style API key for this
route. Authenticate with either a service-account file or local ADC:

```bash
gcloud auth application-default login
```

Minimal setup:

```bash
export CLAUDE_CODE_USE_VERTEX=1
export ANTHROPIC_VERTEX_PROJECT_ID=my-gcp-project
export GOOGLE_CLOUD_PROJECT=my-gcp-project
export CLOUD_ML_REGION=us-east5

openclaude --model claude-sonnet-4-6
```

`CLOUD_ML_REGION` is optional and defaults to `us-east5`. Model-specific
Vertex region override variables are also supported for Claude models; see
`src/utils/envUtils.ts` for the current override names.

### Gemini via OpenRouter

```bash
export CLAUDE_CODE_USE_OPENAI=1
export OPENAI_API_KEY=sk-or-...
export OPENAI_BASE_URL=https://openrouter.ai/api/v1
export OPENAI_MODEL=google/gemini-2.5-pro
```

OpenRouter model availability changes over time. If a model stops working, try another current OpenRouter model before assuming the integration is broken.

### Ollama

```bash
ollama pull llama3.3:70b

export CLAUDE_CODE_USE_OPENAI=1
export OPENAI_BASE_URL=http://localhost:11434/v1
export OPENAI_MODEL=llama3.3:70b
```

#### Ollama Context Length

OpenClaude sends the current conversation history to Ollama on each turn and
uses Ollama's native chat API for Ollama endpoints. Native chat lets OpenClaude
send `options.num_ctx` with each request, so Ollama receives a 32768-token
context window by default instead of falling back to the smaller context often
used by Ollama's OpenAI-compatible `/v1/chat/completions` shim.

To choose a different request-level context size, set
`OPENCLAUDE_OLLAMA_NUM_CTX` before launching OpenClaude:

```bash
export OPENCLAUDE_OLLAMA_NUM_CTX=65536
```

You can also start Ollama with a global context length:

macOS / Linux:

```bash
# Stop any existing Ollama app/server first, then run:
OLLAMA_CONTEXT_LENGTH=32768 ollama serve
```

Windows PowerShell:

```powershell
# Quit any existing Ollama app/server first, then run:
$env:OLLAMA_CONTEXT_LENGTH="32768"
ollama serve
```

After a chat request, verify the loaded model is using the requested context:

```bash
ollama ps
```

Check the `CONTEXT` column. If it still shows a small value such as `4K` after a
new OpenClaude request, stop the existing Ollama app/server, start it again, and
retry the request.

Use a concrete recall test after changing the setting, such as asking the model
to repeat the first topic from the current chat. Questions like "do you remember our
conversation?" can trigger generic local-model disclaimers even when history is
present.

### Atomic Chat (local, Apple Silicon)

```bash
export CLAUDE_CODE_USE_OPENAI=1
export OPENAI_BASE_URL=http://127.0.0.1:1337/v1
export OPENAI_MODEL=your-model-name
```

No API key is needed for Atomic Chat local models.

Or use the profile launcher:

```bash
bun run dev:atomic-chat
```

Download Atomic Chat from [atomic.chat](https://atomic.chat/). The app must be running with a model loaded before launching.

### LM Studio

```bash
export CLAUDE_CODE_USE_OPENAI=1
export OPENAI_BASE_URL=http://localhost:1234/v1
export OPENAI_MODEL=your-model-name
```

### Together AI

```bash
export CLAUDE_CODE_USE_OPENAI=1
export OPENAI_API_KEY=...
export OPENAI_BASE_URL=https://api.together.xyz/v1
export OPENAI_MODEL=meta-llama/Llama-3.3-70B-Instruct-Turbo
```

### Groq

```bash
export CLAUDE_CODE_USE_OPENAI=1
export GROQ_API_KEY=gsk_...
export OPENAI_BASE_URL=https://api.groq.com/openai/v1
export OPENAI_MODEL=llama-3.3-70b-versatile
```

`GROQ_API_KEY` matches the built-in Groq gateway preset. `OPENAI_API_KEY` also works as a fallback on the generic OpenAI-compatible path, but `GROQ_API_KEY` is the preferred variable for Groq-specific setup.

### OpenCode Zen (pay-as-you-go)

```bash
export CLAUDE_CODE_USE_OPENAI=1
export OPENCODE_API_KEY=...
export OPENAI_BASE_URL=https://opencode.ai/zen/v1
export OPENAI_MODEL=gpt-5.4

openclaude
```

OpenCode Zen is a pay-as-you-go AI gateway with 48 models (GPT, Claude, Gemini,
Qwen, MiniMax, GLM, Kimi, Grok, Big Pickle, DeepSeek, Nemotron). Uses the same
`OPENCODE_API_KEY` as OpenCode Go. Get your key from https://opencode.ai.

### OpenCode Go (subscription)

```bash
export CLAUDE_CODE_USE_OPENAI=1
export OPENCODE_API_KEY=...
export OPENAI_BASE_URL=https://opencode.ai/zen/go/v1
export OPENAI_MODEL=glm-5.1

openclaude
```

OpenCode Go is a $10/mo subscription for 13 open models (GLM, Kimi, DeepSeek,
MiMo, MiniMax, Qwen). Uses the same `OPENCODE_API_KEY` as OpenCode Zen.

### Gitlawb Opengateway

```bash
export CLAUDE_CODE_USE_OPENAI=1
export OPENAI_BASE_URL=https://opengateway.gitlawb.com/v1
export OPENGATEWAY_API_KEY=ogw_live_...
export OPENAI_MODEL=mimo-v2.5-pro
```

The Opengateway route is the fresh-install startup default and requires an API
key from https://gitlawb.com/opengateway/keys. Keep the base URL at `/v1` and
switch models with `/model` or `OPENAI_MODEL`. Current partner models include:

- `mimo-v2.5-pro`
- `google/gemini-3.1-flash-lite-preview`

### Xiaomi MiMo

```bash
export CLAUDE_CODE_USE_OPENAI=1
export MIMO_API_KEY=...
export OPENAI_BASE_URL=https://api.xiaomimimo.com/v1
export OPENAI_MODEL=mimo-v2.5-pro
```

The `/provider` Xiaomi MiMo preset uses the same endpoint and stores the key as `MIMO_API_KEY`. `OPENAI_API_KEY` also works as a compatibility fallback, but `MIMO_API_KEY` keeps the profile tied to the MiMo route.

### NEAR AI

```bash
export CLAUDE_CODE_USE_OPENAI=1
export NEARAI_API_KEY=...
export OPENAI_BASE_URL=https://cloud-api.near.ai/v1
export OPENAI_MODEL=anthropic/claude-sonnet-4-6

openclaude
```

NEAR AI is a unified OpenAI-compatible gateway that proxies Anthropic, OpenAI,
and Google models alongside TEE-hosted open models (GLM 5.1, Qwen3.5, Kimi K2.6).
All models are accessible from a single endpoint with one API key.
Get your key from https://cloud.near.ai/dashboard/organizations.

Model IDs use `provider/model-name` format (e.g. `anthropic/claude-opus-4-7`,
`openai/gpt-5.5`, `google/gemini-3.5-flash`, `zai-org/GLM-5.1-FP8`).

For direct TEE completions (lower latency, verifiable privacy):

```bash
export OPENAI_BASE_URL=https://qwen35-122b.completions.near.ai/v1
```

### Mistral

```bash
export CLAUDE_CODE_USE_MISTRAL=1
export MISTRAL_API_KEY=...
export MISTRAL_MODEL=devstral-latest
```

### Azure OpenAI

```bash
export CLAUDE_CODE_USE_OPENAI=1
export OPENAI_API_KEY=your-azure-key
export OPENAI_BASE_URL=https://your-resource.openai.azure.com/openai/deployments/your-deployment/v1
export OPENAI_MODEL=gpt-4o
```

### Microsoft Foundry / Azure OpenAI (resource URL + deployment)

When your endpoint is the **resource base URL** (not the full `.../deployments/.../v1` path), set `OPENAI_MODEL` to the **deployment name** and `AZURE_OPENAI_API_VERSION` to your API version. The OpenAI shim builds:

`{base}/openai/deployments/{OPENAI_MODEL}/chat/completions?api-version={AZURE_OPENAI_API_VERSION}`

and sends the key in the `api-key` header for Azure hosts.

```bash
export CLAUDE_CODE_USE_OPENAI=1
export OPENAI_API_KEY=your-azure-key
export OPENAI_BASE_URL=https://your-resource.openai.azure.com
export OPENAI_MODEL=your-deployment-name
export AZURE_OPENAI_API_VERSION=2024-12-01-preview
```

If your hostname is not detected as Azure (for example some inference endpoints), force Azure URL and header behavior:

```bash
export OPENAI_AZURE_STYLE=1
```

### Fireworks AI

Fireworks AI provides a fully OpenAI-compatible endpoint. Model IDs use the full path format `accounts/fireworks/models/<model-name>`.

```bash
export CLAUDE_CODE_USE_OPENAI=1
export FIREWORKS_API_KEY=fw_your_key_here
export OPENAI_BASE_URL=https://api.fireworks.ai/inference/v1
export OPENAI_MODEL=accounts/fireworks/models/llama-v3p1-70b-instruct
```

The **OpenClaude VS Code extension** can store the key in Secret Storage and set these variables for you when you launch from the Control Center. See `vscode-extension/openclaude-vscode/README.md`.

## Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `CLAUDE_CODE_USE_OPENAI` | OpenAI-compatible only | Set to `1` to enable the OpenAI-compatible provider path |
| `OPENAI_API_KEYS` | One of `OPENAI_API_KEYS` or `OPENAI_API_KEY` for non-local OpenAI-compatible cloud routes* | Comma-separated OpenAI-compatible API key pool. Takes precedence over `OPENAI_API_KEY` and rotates to the next key on auth, quota, or rate-limit failures (`*` not needed for local models like Ollama, LM Studio, Atomic Chat, or other local OpenAI-compatible proxies). |
| `OPENAI_API_KEY` | Required only when `OPENAI_API_KEYS` is unset or empty for non-local OpenAI-compatible cloud routes* | Your API key (`*` not needed for local models like Ollama, LM Studio, Atomic Chat, or other local OpenAI-compatible proxies). A comma-separated list also enables key rotation. |
| `OPENAI_MODEL` | OpenAI-compatible only | Model name such as `gpt-4o`, `deepseek-v4-flash`, or `llama3.3:70b` |
| `OPENAI_BASE_URL` | No | API endpoint, defaulting to `https://api.openai.com/v1` |
| `OPENAI_API_BASE` | No | Compatibility alias for `OPENAI_BASE_URL` |
| `OPENCLAUDE_OLLAMA_NUM_CTX` | Ollama only | Request-level Ollama context window. Defaults to `32768`; set a larger value for longer same-session history if your model and hardware can handle it. |
| `CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS` | No | JSON map of OpenAI-compatible model names to context windows, such as `{"custom-model":1000000}`. Use this when a custom provider does not expose context metadata from `/v1/models`. |
| `OPENCODE_API_KEY` | OpenCode Zen / Go | Shared API key for OpenCode Zen (pay-as-you-go) and OpenCode Go (subscription); get yours from https://opencode.ai |
| `MIMO_API_KEY` | Xiaomi MiMo route | Xiaomi MiMo API key for `https://api.xiaomimimo.com/v1`; mirrored into the OpenAI-compatible auth env when the MiMo route is active |
| `CLAUDE_CODE_USE_GEMINI` | Gemini only | Set to `1` to enable the direct Gemini provider path |
| `GEMINI_API_KEY` / `GOOGLE_API_KEY` | Gemini API-key auth | Gemini API key for direct Gemini setup |
| `GEMINI_MODEL` | Gemini only | Model name such as `gemini-3-flash-preview` or `gemini-2.5-pro` |
| `GEMINI_BASE_URL` | No | Override the Gemini base URL |
| `CLAUDE_CODE_USE_MISTRAL` | Mistral only | Set to `1` to enable the dedicated Mistral provider path |
| `MISTRAL_API_KEY` | Mistral only | Mistral API key |
| `MISTRAL_MODEL` | Mistral only | Model name such as `devstral-latest` |
| `MISTRAL_BASE_URL` | No | Override the Mistral base URL |
| `CODEX_API_KEY` | Codex only | Codex or ChatGPT access token override |
| `CHATGPT_ACCOUNT_ID` / `CODEX_ACCOUNT_ID` | Codex only | Required for manual Codex env setup when the account id is not coming from `auth.json` or stored OAuth credentials |
| `CODEX_AUTH_JSON_PATH` | Codex only | Path to a Codex CLI `auth.json` file |
| `CODEX_HOME` | Codex only | Alternative Codex home directory |
| `OPENCLAUDE_MAX_RETRIES` | No | Maximum retry attempts for retryable API failures, capped at 100 (default: 10). Set to `0` to disable retries after the initial request. If unset, deprecated `CLAUDE_CODE_MAX_RETRIES` is still honored for compatibility. |
| `OPENCLAUDE_RETRY_DELAY_MS` | No | Base retry delay in milliseconds for APIs that do not send `Retry-After`; exponential backoff starts from this value, capped at 60000 (default: 500) |
| `OPENCLAUDE_DISABLE_CO_AUTHORED_BY` | No | Suppress the default `Co-Authored-By` trailer in generated git commits |
| `OPENCLAUDE_LOG_TOKEN_USAGE` | No | When truthy (e.g. `verbose`), emits one JSON line on stderr per API request with input/output/cache tokens and the resolved provider. **User-facing debug output** — complements the REPL display controlled by `/config showCacheStats`. Distinct from `CLAUDE_CODE_ENABLE_TOKEN_USAGE_ATTACHMENT`, which is **model-facing** (injects context usage info into the prompt itself). Both can run together. |

Model env vars are provider-scoped: first-party Anthropic sessions read
`ANTHROPIC_MODEL`, OpenAI-compatible sessions read `OPENAI_MODEL`, Gemini reads
`GEMINI_MODEL`, and Mistral reads `MISTRAL_MODEL`. For manual Bedrock, Vertex,
or Foundry launches, select the model with `--model`.

## Runtime Hardening

Use these commands to validate your setup and catch mistakes early:

```bash
# quick startup sanity check
bun run smoke

# validate provider env + reachability
bun run doctor:runtime

# print machine-readable runtime diagnostics
bun run doctor:runtime:json

# persist a diagnostics report to reports/doctor-runtime.json
bun run doctor:report

# print a redacted public issue report
openclaude doctor report --markdown

# write a redacted JSON issue report for attachment
openclaude doctor report --json --out openclaude-report.json

# write a deterministic JSON task report from a session transcript
openclaude report --json --transcript ~/.openclaude/projects/-path-to-project/session-id.jsonl --out task-report.json

# full local hardening check (smoke + runtime doctor)
bun run hardening:check

# strict hardening (includes project-wide typecheck)
bun run hardening:strict
```

Notes:

- `doctor:runtime` fails fast if `CLAUDE_CODE_USE_OPENAI=1` with a placeholder key or a missing key for non-local providers.
- `doctor:runtime` also validates the dedicated Gemini and Mistral env paths when `CLAUDE_CODE_USE_GEMINI=1` or `CLAUDE_CODE_USE_MISTRAL=1`.
- Local providers such as `http://localhost:11434/v1`, `http://10.0.0.1:11434/v1`, and `http://127.0.0.1:1337/v1` can run without `OPENAI_API_KEY`.
- Codex profiles validate `CODEX_API_KEY` or the Codex CLI auth file and probe `POST /responses` instead of `GET /models`.
- `openclaude doctor report` is redacted by default and is intended for GitHub issues. It summarizes provider/runtime/build/settings state without prompts, transcripts, raw settings files, API keys, MCP command details, or full home-directory paths.
- `openclaude report --json` summarizes observed session facts such as tool uses, Bash commands, validation commands, changed files, branch metadata, warnings, and linked issue/PR references. Use `--transcript <file>` for an explicit transcript, `--session <id>` for a stored session, or omit both to report the latest session for the current project. Large previews are truncated and credential-shaped strings are redacted. When no validation command is observed, the report keeps `validations` empty and includes a warning instead of claiming checks passed.

## Provider Launch Profiles

Use profile launchers to avoid repeated environment setup:

```bash
# one-time profile bootstrap (prefer viable local Ollama, otherwise OpenAI)
bun run profile:init

# preview the best provider/model for your goal
bun run profile:recommend -- --goal coding --benchmark

# auto-apply the best available local/openai provider/model for your goal
bun run profile:auto -- --goal latency

# codex bootstrap (defaults to codexplan and ~/.codex/auth.json)
bun run profile:codex

# openai bootstrap with explicit key
bun run profile:init -- --provider openai --api-key sk-...

# gemini bootstrap with explicit key
bun run profile:init -- --provider gemini --api-key ...

# ollama bootstrap with custom model
bun run profile:init -- --provider ollama --model llama3.1:8b

# ollama bootstrap with intelligent model auto-selection
bun run profile:init -- --provider ollama --goal coding

# atomic-chat bootstrap (auto-detects running model)
bun run profile:init -- --provider atomic-chat

# codex bootstrap with a fast model alias
bun run profile:init -- --provider codex --model codexspark

# launch using persisted user-level provider profile
bun run dev:profile

# codex profile (uses CODEX_API_KEY or ~/.codex/auth.json)
bun run dev:codex

# OpenAI profile (uses the saved OpenAI profile, or OPENAI_API_KEYS / OPENAI_API_KEY from your shell)
bun run dev:openai

# Gemini profile (uses the saved Gemini profile, or GEMINI_API_KEY / GOOGLE_API_KEY from your shell)
bun run dev:gemini

# Ollama profile (defaults: localhost:11434, llama3.1:8b)
bun run dev:ollama

# Atomic Chat profile (Apple Silicon local LLMs at 127.0.0.1:1337)
bun run dev:atomic-chat
```

`profile:recommend` ranks installed Ollama models for `latency`, `balanced`, or `coding`, and `profile:auto` can persist the recommendation directly.

If no profile exists yet, `dev:profile` uses the same goal-aware defaults when picking the initial model.

### Provider Profile Model Picker Mode

When a saved provider profile is active, `/model` can either show the provider's
catalog/discovered models or only the models explicitly listed in the profile.
Configure this in `~/.openclaude.json`:

```json
{
  "providerProfileModelPickerMode": "auto"
}
```

Supported values:

- `auto` (default): single-model profiles show the provider catalog; multi-model
  profiles show the explicit profile list; native vendor routes keep their full
  provider catalog.
- `provider`: show the provider catalog/discovery list first and append
  profile-only custom model IDs.
- `profile`: show only explicitly configured profile models.

Use `--provider ollama` when you want a local-only path. Auto mode falls back to OpenAI when no viable local chat model is installed.

Use `--provider atomic-chat` when you want Atomic Chat as the local Apple Silicon provider.

Use `profile:codex` or `--provider codex` when you want the ChatGPT Codex backend.

`dev:openai`, `dev:gemini`, `dev:ollama`, `dev:atomic-chat`, and `dev:codex`
run `doctor:runtime` first and only launch the app if checks pass.

For `dev:ollama`, make sure Ollama is running locally before launch.

For `dev:atomic-chat`, make sure Atomic Chat is running with a model loaded before launch.

## Message-Count Compaction Threshold

By default, OpenClaude compacts conversations based on token usage. A secondary
message-count-based trigger (`OPENCLAUDE_MAX_ACTIVE_MESSAGES`) exists for
diagnostics but is disabled by default.

If you frequently resume long sessions that accumulate hundreds of small
tool-result messages with negligible token cost, you can opt in to message-count
compaction via the in-app `/config` command:

```text
/config
```

Select **Message-count compaction** and choose a threshold (`100`, `200`, `500`,
or `1000`). Setting it to `off` (default) disables the message-count trigger.

This setting is intended for power users debugging specific edge cases. Most
users should leave it at `off`.

The legacy `OPENCLAUDE_MAX_ACTIVE_MESSAGES` environment variable is still
honored when the setting is `off`.