# motosan-ai

> Multi-provider LLM SDK for Python and Rust. Unified API for Anthropic, OpenAI, Ollama, MiniMax, and Gemini — with streaming, tool use, retry, and ThinkStripper built in.

- Python 0.12.1 · Rust 0.20.0
- GitHub: https://github.com/motosan-dev/motosan-ai
- PyPI: https://pypi.org/project/motosan-ai/
- crates.io: https://crates.io/crates/motosan-ai

## Install

```bash
# Python — pick providers via extras
pip install "motosan-ai[anthropic]"
pip install "motosan-ai[gemini]"
pip install "motosan-ai[anthropic,openai,ollama,minimax,gemini]"
```

```toml
# Rust — pick features in Cargo.toml
[dependencies]
motosan-ai = { version = "0.20.0", features = ["anthropic"] }
# features: anthropic | openai | minimax | ollama | ollama_native | full
#           gemini | gemini-code-assist
# CLI backends (Rust features; Python has built-in ClaudeCodeClient/CodexCliClient/GeminiCliClient):
#   claude-code  → ClaudeCodeProvider → `claude --print --output-format stream-json`
#   codex-cli    → CodexCliProvider   → `codex exec --json -`
#   gemini-cli   → GeminiCliProvider  → `gemini -p "" -o stream-json`
```

## Environment Variables

| Provider  | Env var             |
|-----------|---------------------|
| Anthropic | `ANTHROPIC_API_KEY` |
| OpenAI    | `OPENAI_API_KEY`    |
| MiniMax   | `MINIMAX_API_KEY`   |
| Gemini    | `GEMINI_API_KEY`    |
| Ollama    | (none — local)      |
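
A quick way to see which providers are usable in the current shell is to check these variables before building a client. The sketch below uses only the standard library and the variable names from the table; the `configured_providers` helper is a hypothetical name for illustration and is not part of `motosan-ai`.

```python
import os
from typing import Mapping

# Provider name -> environment variable, copied from the table above.
# Ollama is omitted because it runs locally and needs no key.
PROVIDER_ENV_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "minimax": "MINIMAX_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def configured_providers(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the providers whose API key variable is set and non-empty."""
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]
```

Calling `configured_providers()` with no arguments inspects the real process environment; passing a plain dict makes the check easy to unit test.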

## Model Defaults

| Provider  | Default model             |
|-----------|---------------------------|
| Anthropic | `claude-sonnet-4-6`       |
| OpenAI    | Python: `gpt-4o` · Rust: `gpt-5.3-codex` |
| MiniMax   | Python: `MiniMax-Text-01` · Rust: `MiniMax-M2.7` |
| Ollama    | `llama3.2`               |
| Gemini    | Python: `gemini-2.5-flash` · Rust: `gemini-2.0-flash` |
| GeminiCodeAssist | `gemini-2.5-flash` |

Override per client or per request. Anthropic's model catalog includes `claude-opus-4-8`; pass it as an override when you want Opus. For Opus 4.8/4.7/4.6, `thinking` is sent as Anthropic adaptive thinking (`thinking.type = "adaptive"`, summarized display, `output_config.effort = "high"`). OAuth adaptive-thinking requests omit the legacy `interleaved-thinking` beta header, matching pi.

---

## Python API

### Client

```python
from motosan_ai import Client, Message

# Factory methods (reads env var automatically)
client = Client.anthropic()
client = Client.openai(model="gpt-4o")
client = Client.minimax()
client = Client.gemini()
client = Client.ollama(model="llama3.2", base_url="http://localhost:11434")
```

Python HTTP providers: Anthropic, OpenAI, MiniMax, Gemini, Ollama. The Gemini provider reads `GEMINI_API_KEY`, defaults to `gemini-2.5-flash`, and supports text, images, tools and tool choice, streaming, stop sequences, and usage tokens. Document/PDF blocks are rejected before any HTTP request is made.

### chat() — single turn

```python
resp = await client.chat([Message.user("Hello")])
print(resp.content)        # str
print(resp.stop_reason)    # StopReason: end_turn | max_tokens | tool_use | stop
print(resp.usage)          # Usage: input_tokens, output_tokens
print(resp.tool_calls)     # list[ToolCall] — empty if no tool use
```

### chat() — full control via keyword args

```python
resp = await client.chat(
    [Message.user("Hello")],
    system="You are a helpful assistant.",
    temperature=0.7,
    max_tokens=1024,
    tools=[...],
    provider_options={"key": "val"},
)
```

### stream() — streaming text

```python
async for event in client.stream([Message.user("Tell me a story")]):
    if event.event_type == "text":
        print(event.content, end="", flush=True)
    if event.done:
        break
```

### stream() — with tools

```python
async for event in client.stream(
    messages,
    tools=tools,
    system="You are helpful.",
):
    ...  # handle events (see the Streaming Events section)
```

### Rust-parity Client methods (Python v0.10.0)

```python
from motosan_ai import ChatRequest

# Full ChatRequest passthrough; request.model falls back to client.model.
# Opus 4.8/4.7/4.6 thinking is sent as adaptive thinking.
req = (
    ChatRequest.builder()
    .message(Message.user("Hi"))
    .model("claude-opus-4-8")
    .thinking(1024)
    .build()
)
resp = await client.chat_with(req)

async for event in client.stream_with(req):
    ...

# Stream-to-ChatResponse assembly.
resp = await client.stream_collect([Message.user("Hi")])
resp = await client.stream_collect_with(req)
```

Use `*_with` for fields not exposed by `chat()`/`stream()` kwargs:
`thinking`, `tool_choice`, `mcp_servers`, `system_blocks`, and `stop_sequences`.
The SDK is async-only — wrap in `asyncio.run(client.chat(...))` from sync code.
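
As a rough illustration of calling an async method from synchronous code, the sketch below wraps `asyncio.run` in a small helper. `call_sync` and `fake_chat` are hypothetical names; `fake_chat` stands in for an async client method such as `client.chat(...)` so the example runs without network access or API keys.

```python
import asyncio
from typing import Any, Callable, Coroutine


def call_sync(
    async_fn: Callable[..., Coroutine[Any, Any, Any]], *args: Any, **kwargs: Any
) -> Any:
    """Run one async call to completion from synchronous code.

    asyncio.run() creates a fresh event loop and raises RuntimeError if a
    loop is already running in this thread, so use this only from plain
    synchronous entry points (scripts, CLI commands), not inside async code.
    """
    return asyncio.run(async_fn(*args, **kwargs))


# Hypothetical stand-in for an async client method such as client.chat().
async def fake_chat(prompt: str) -> str:
    await asyncio.sleep(0)  # yield to the event loop once, like real I/O would
    return f"echo: {prompt}"


reply = call_sync(fake_chat, "Hello")
```

With a real client the call would look like `call_sync(client.chat, [Message.user("Hello")])`. For several calls in a row, a single `asyncio.run(main())` around an async `main` avoids creating a new event loop per call.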

### Message Helpers

```python
Message.user("Hello")
Message.assistant("Hi there")
Message.system("You are a helpful assistant")
Message.assistant_with_tool_calls("Let me check", tool_calls=[...])
Message.tool_result(tool_call_id="call_123", content='{"result": 42}')
```

### Retry

```python
# default retries = 3
client = Client.anthropic(api_key="...", max_retries=3)

# disable retries
client = Client.anthropic(api_key="...", max_retries=0)
```

### Error Handling

```python
from motosan_ai import MotosanError, AuthError, RateLimitError

try:
    resp = await client.chat([Message.user("Hi")])
except RateLimitError as e:
    print(f"Rate limited: {e}")
except AuthError as e:
    print(f"Auth failed: {e}")
except MotosanError as e:
    print(f"Error: {e}")
```

Error hierarchy: `MotosanError` → `AuthError`, `RateLimitError`, `InvalidRequestError`, `ConfigError`, `ProviderError`, `NetworkError`, `StreamError`

---

## Rust API

### Client

```rust
use motosan_ai::{Client, Message, Provider, RetryPolicy};

let client = Client::builder()
    .provider(Provider::Anthropic)    // required
    .api_key(std::env::var("ANTHROPIC_API_KEY")?)  // required for HTTP providers
    .model("claude-sonnet-4-6")       // optional
    .retry_policy(RetryPolicy::new()) // optional
    .stream_read_timeout_secs(30)      // optional — terminate stream after 30s of silence
    .build()?;
```

Provider variants: `Anthropic` | `OpenAI` | `Minimax` | `Ollama` | `Gemini` | `GeminiCodeAssist` | `ClaudeCode` | `CodexCli` | `GeminiCli`

`Gemini` (v0.13.0, feature `gemini`) — HTTP client for `generativelanguage.googleapis.com`. API key via `x-goog-api-key` header. Default model `gemini-2.0-flash`. **Gemini-specific convention**: `Message::tool_result` must pass the function name (not an opaque call ID) as `tool_call_id`, because Gemini's `functionResponse.name` must be the function name.

`GeminiCodeAssist` (Rust v0.13.0 feature `gemini-code-assist`; Python v0.10.0 `GeminiCodeAssistProvider` / `Provider.gemini_code_assist`) is an HTTP client for `cloudcode-pa.googleapis.com/v1internal`. It authenticates with an OAuth Bearer token (`ya29.*`, obtained from `motosan-ai-oauth` in Rust or the `motosan_ai.oauth` PKCE flow in Python) and requires a GCP project ID. Default model `gemini-2.5-flash`. Billing is subscription-based (not per-token). `chat()` is implemented internally as `stream()` + collect because there is no non-streaming endpoint. Separately, Python Anthropic Claude Pro/Max OAuth is available via `motosan_ai.oauth.claude_pro_max_config()` + `login()`, which returns an `sk-ant-oat01-*` token.

CLI backends (`ClaudeCode` / `CodexCli` / `GeminiCli`) go through the **same** `client.chat(...)` / `client.stream(...)` API. `api_key` is optional on the builder for these paths; configure provider-specific flags by passing a pre-built `ClaudeCodeProvider` / `CodexCliProvider` / `GeminiCliProvider` via `.claude_code(...)` / `.codex_cli(...)` / `.gemini_cli(...)`. `ClaudeCode` and `CodexCli` landed in v0.11.0; `GeminiCli` followed in the same generation. See the CLI Backends section under Provider-Specific Notes for a full example.

### chat() — single turn

```rust
let resp = client.chat(vec![Message::user("Hello")]).await?;
println!("{}", resp.content);            // String
println!("{:?}", resp.stop_reason);      // StopReason enum
println!("{}+{}", resp.usage.input_tokens, resp.usage.output_tokens);
```

### chat_with() — full control

```rust
use motosan_ai::ChatRequest;

let request = ChatRequest::builder()
    .messages(vec![Message::user("Hello")])
    .system("You are helpful")
    .model("claude-opus-4-8")
    .temperature(0.7)
    .max_tokens(1024)
    .tools(vec![...])
    .build();

let resp = client.chat_with(request).await?;
```

### stream() — streaming text

```rust
use futures_util::StreamExt;

let mut stream = client.stream(vec![Message::user("Tell me a story")]).await?;
while let Some(item) = stream.next().await {
    let event = item?;
    if !event.content.is_empty() {
        print!("{}", event.content);
    }
    if event.done {
        // Terminal event carries the stop reason when the provider reports one.
        // Streams emit exactly ONE done event (since v0.10.1), even on
        // non-conformant proxies that skip [DONE] and finish_reason.
        if let Some(reason) = event.stop_reason {
            eprintln!("\n[stop_reason: {reason:?}]");
        }
        break;
    }
}
```

### stream_with() — streaming + tools

```rust
let request = ChatRequest::builder()
    .messages(messages)
    .tools(tools)
    .system("You are helpful")
    .build();

let mut stream = client.stream_with(request).await?;
while let Some(item) = stream.next().await {
    let event = item?;
    match event.event_type {
        StreamEventType::Text => print!("{}", event.content),
        StreamEventType::ToolCallStart => { /* event.tool_call_id, event.tool_call_name */ },
        StreamEventType::ToolCallArgs  => { /* event.tool_call_args_delta */ },
        StreamEventType::ToolCallEnd   => { /* tool call complete */ },
        StreamEventType::Usage         => { /* event.usage */ },
        StreamEventType::ThinkingDelta => { /* live extended-thinking chunk (Anthropic only) */ },
        StreamEventType::ThinkingDone  => { /* full thinking text on block close (Anthropic only) */ },
    }
    if event.done { break; }
}
```

### Message Helpers

```rust
Message::user("Hello")
Message::assistant("Hi")
Message::system("You are helpful")
Message::tool_result("call_id", "result JSON string")
Message::tool("result", "call_id")                     // alias for tool_result
Message::assistant_with_tool_calls("text", tool_calls)
Message::user_with_image("prompt", "base64data", "image/png")  // image + text
Message::user_with_blocks(vec![ContentBlock::Image { .. }, ContentBlock::Text { .. }])
Message::user_with_pdf_base64("prompt", "base64data")  // document (Anthropic only)
```

### BoxStream Type

```rust
// stream() and stream_with() return:
Result<BoxStream, MotosanError>
// where BoxStream = Pin<Box<dyn Stream<Item = Result<StreamEvent, MotosanError>> + Send>>
```

Note: stream items are fallible (Rust 0.20+). Use `let event = item?;` in `StreamExt::next()` loops; mid-stream provider/timeout errors surface as `Err(...)`.

### RetryPolicy

```rust
use motosan_ai::RetryPolicy;

let policy = RetryPolicy::new()
    .max_retries(3)        // default 3
    .base_delay_ms(100)    // default 100
    .max_delay_ms(2_000)   // default 2000
    .jitter(true)          // default true
    .respect_retry_after(true);  // default true
```

Retries on: 429 (rate limit), 5xx (server error), timeout/connect errors.
Backoff: `base_delay * 2^(attempt-1)`, capped at `max_delay_ms`, with optional jitter.
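
To make the backoff formula concrete, here is a small language-neutral sketch of the arithmetic in Python, using the documented defaults (100 ms base, 2000 ms cap). The function name and the shape of the jitter (a uniform factor in [0.5, 1.0]) are assumptions for illustration; the source only says jitter is optional and does not specify how it is applied.

```python
import random
from typing import Optional


def backoff_delay_ms(
    attempt: int,
    base_delay_ms: int = 100,
    max_delay_ms: int = 2_000,
    rng: Optional[random.Random] = None,
) -> float:
    """Delay before retry number `attempt` (1-based).

    Follows the documented formula base_delay * 2^(attempt-1), capped at
    max_delay_ms. When `rng` is given, the capped delay is scaled by a
    uniform factor in [0.5, 1.0]; that jitter shape is an assumption made
    for this sketch, not taken from the library.
    """
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    delay = min(base_delay_ms * 2 ** (attempt - 1), max_delay_ms)
    if rng is not None:
        delay *= rng.uniform(0.5, 1.0)
    return float(delay)


# Without jitter the schedule doubles until it hits the cap.
schedule = [backoff_delay_ms(n) for n in range(1, 7)]
# schedule == [100.0, 200.0, 400.0, 800.0, 1600.0, 2000.0]
```

With the defaults, the cap is reached on the sixth attempt (100 × 2⁵ = 3200 ms, clamped to 2000 ms), so raising `max_retries` well beyond 5 mostly adds fixed 2-second waits unless `max_delay_ms` is raised too.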

### Error Handling

```rust
use motosan_ai::MotosanError;

match client.chat(messages).await {
    Ok(resp) => println!("{}", resp.content),
    Err(MotosanError::Auth(msg)) => eprintln!("Auth: {msg}"),
    Err(MotosanError::RateLimit(msg)) => eprintln!("Rate limited: {msg}"),
    Err(MotosanError::InvalidRequest(msg)) => eprintln!("Bad request: {msg}"),
    Err(MotosanError::Config(msg)) => eprintln!("Config: {msg}"),
    Err(MotosanError::ProviderError(msg)) => eprintln!("Provider: {msg}"),
    Err(MotosanError::Network(msg)) => eprintln!("Network: {msg}"),
    Err(MotosanError::Stream(msg)) => eprintln!("Stream: {msg}"),
    Err(MotosanError::StreamReadTimeout(secs)) => eprintln!("Stream timed out after {secs}s"),
    Err(MotosanError::UnsupportedFeature(msg)) => eprintln!("Unsupported: {msg}"),
}
```

---

## Tool Use

### Define Tools

**Python:**
```python
from motosan_ai import Tool

tools = [Tool(
    name="get_weather",
    description="Get weather for a city",
    input_schema={"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]},
)]
```

**Rust (0.18.0+):** `Tool` composes `motosan_agent_primitives::ToolSchema`
(re-exported as `motosan_ai::ToolSchema`). `description` and `input_schema`
are required fields on `ToolSchema` — no longer `Option`:
```rust
use motosan_ai::{Tool, ToolSchema};
use serde_json::json;

let tools = vec![Tool::from(ToolSchema {
    name: "get_weather".into(),
    description: "Get weather for a city".into(),
    input_schema: json!({"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]}),
})];
```

When bridging from agent-side declarations, pass `ToolSchema`s directly via
`ChatRequest::builder().tool_schemas(&schemas)` — the old `tool_defs(&[ToolDef])`
builder, the `agent-tool` feature, and the `motosan-agent-tool` dependency were
all removed in 0.18.0.

### Multi-turn Tool Loop (Python)

```python
from motosan_ai import Client, Message, StopReason
import json

async def agent_loop(client, user_input: str, tools):
    messages = [Message.user(user_input)]
    while True:
        resp = await client.chat(messages, tools=tools)
        if resp.stop_reason != StopReason.tool_use or not resp.tool_calls:
            return resp.content

        messages.append(Message.assistant_with_tool_calls(resp.content, resp.tool_calls))
        for tc in resp.tool_calls:
            result = await execute_tool(tc.name, tc.input)
            messages.append(Message.tool_result(tool_call_id=tc.id, content=json.dumps(result)))
```

### Multi-turn Tool Loop (Rust)

```rust
use motosan_ai::{ChatRequest, Client, Message, MotosanError, StopReason, Tool};

async fn agent_loop(client: &Client, input: &str, tools: Vec<Tool>) -> Result<String, MotosanError> {
    let mut messages = vec![Message::user(input)];
    loop {
        let request = ChatRequest::builder()
            .messages(messages.clone())
            .tools(tools.clone())
            .build();
        let resp = client.chat_with(request).await?;
        if resp.stop_reason != StopReason::ToolUse || resp.tool_calls.is_empty() {
            return Ok(resp.content);
        }
        messages.push(Message::assistant_with_tool_calls(&resp.content, resp.tool_calls.clone()));
        for tc in &resp.tool_calls {
            let result = execute_tool(&tc.name, &tc.input).await?;
            messages.push(Message::tool_result(&tc.id, &result));
        }
    }
}
```

### ToolCall Fields

```
tool_call.id    — unique call ID (required for tool_result)
tool_call.name  — which tool was called
tool_call.input — parsed arguments (Python: dict, Rust: serde_json::Value)
```

---

## Streaming Events

### StreamEvent Fields

```
event.event_type           — "text" | "tool_call_start" | "tool_call_args" | "tool_call_end" | "usage"
event.content              — text delta (String)
event.done                 — True on last event (bool)
event.tool_call_id         — tool call ID (Optional)
event.tool_call_name       — tool name, on tool_call_start (Optional)
event.tool_call_args_delta — incremental JSON args, on tool_call_args (Optional)
event.usage                — token tally on usage events (Optional)
event.stop_reason          — StopReason on the terminal `done` event (Optional)
                             • Anthropic / MiniMax: from message_delta.delta.stop_reason
                             • OpenAI: from choices[0].finish_reason
                             • None on intermediate events
                             • None on done events for providers that don't report a reason
```

Each provider stream emits **exactly one** `done` event — guaranteed since
v0.10.1 even when the upstream provider closes the connection without `[DONE]`
and without a `finish_reason` chunk (some non-conformant proxies do this). If
the provider reports a stop reason, it lands on that event. `collect_stream`
honors the explicit `stop_reason` and only falls back to its tool-calls
heuristic when none was reported.
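
The precedence described above (explicit `stop_reason` first, tool-call heuristic second) can be sketched as follows. This is not the library's `collect_stream`: the `Event` dataclass is a simplified stand-in for `StreamEvent`, and the fallback values `"tool_use"` / `"end_turn"` are assumptions about what the heuristic produces.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Event:
    """Simplified stand-in for StreamEvent with only the fields used here."""
    event_type: str
    content: str = ""
    done: bool = False
    stop_reason: Optional[str] = None


def collect(events: Iterable[Event]) -> tuple[str, str]:
    """Fold a stream into (text, stop_reason)."""
    parts: list[str] = []
    saw_tool_call = False
    reported: Optional[str] = None
    for event in events:
        if event.event_type == "text":
            parts.append(event.content)
        elif event.event_type == "tool_call_start":
            saw_tool_call = True
        if event.done:
            # Exactly one done event is expected, so stop at the first one.
            reported = event.stop_reason
            break
    # Prefer the provider-reported reason; otherwise guess from tool calls.
    # The fallback values below are assumptions for this sketch.
    fallback = "tool_use" if saw_tool_call else "end_turn"
    return "".join(parts), reported or fallback
```

A provider that reports `max_tokens` on its terminal event keeps that reason even if tool calls were seen, which is the behavior the heuristic alone would get wrong.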

### Event Sequence for Tool Calls

```
tool_call_start  → tool_call_id + tool_call_name set
tool_call_args   → tool_call_args_delta accumulates (may arrive in multiple chunks)
tool_call_args   → ...
tool_call_end    → tool_call_id set, args complete
```

Buffer the `tool_call_args_delta` chunks until `tool_call_end`, then parse the accumulated string as JSON.

### Streaming Tool Use (Python)

```python
import json

# `request` and `execute_tool` are defined elsewhere in your application.
pending = {}  # tool_call_id → {"name": str, "args": str}
async for event in client.stream(
    request.messages,
    tools=request.tools,
    system=request.system,
):
    match event.event_type:
        case "text":
            print(event.content, end="", flush=True)
        case "tool_call_start":
            pending[event.tool_call_id] = {"name": event.tool_call_name, "args": ""}
        case "tool_call_args":
            pending[event.tool_call_id]["args"] += event.tool_call_args_delta
        case "tool_call_end":
            tc = pending.pop(event.tool_call_id)
            args = json.loads(tc["args"])
            result = await execute_tool(tc["name"], args)
    if event.done:
        break
```

---

## ThinkStripper

`<think>...</think>` blocks in LLM output (e.g. DeepSeek, QwQ, MiniMax reasoning models) are stripped **automatically** at the `stream()` / `stream_with()` level.

Manual use (Python):
```python
from motosan_ai import ThinkStripper
stripper = ThinkStripper()
clean = stripper.feed("<think>reasoning...</think>Hello")  # → "Hello"
tail = stripper.flush()  # call at stream end
```

Manual use (Rust):
```rust
use motosan_ai::think_stripper::ThinkStripper;
let mut stripper = ThinkStripper::new();
let clean = stripper.feed("<think>reasoning...</think>Hello");  // → "Hello"
let tail = stripper.flush();  // call at stream end
```

---

## Provider-Specific Notes

### Provider Capabilities (Rust, v0.13.1+)

Providers declare which content types they accept. Passing unsupported content returns `Err(MotosanError::UnsupportedFeature(...))` **before** any network call.

| Provider | Image | Document |
|----------|-------|----------|
| Anthropic | ✓ | ✓ |
| OpenAI | ✓ | ✗ |
| Gemini / GeminiCodeAssist | ✓ | ✗ |
| MiniMax / Ollama / CLI backends | ✗ | ✗ |

```rust
use motosan_ai::ProviderCapabilities;

// Check programmatically
let caps = provider.capabilities();
if caps.supports_image { /* send image */ }
```

When implementing a custom `ProviderImpl`, override `capabilities()` to declare what the provider supports:

```rust
fn capabilities(&self) -> ProviderCapabilities {
    ProviderCapabilities::with_image()   // or full() / text_only()
}
```
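
The capability table can also be read as a pre-flight check: look up the provider, then reject unsupported block types before sending anything. The Python sketch below mirrors the table as plain data. The names `Capabilities`, `check_content`, `UnsupportedFeature`, and the `supports_document` field are hypothetical stand-ins for illustration (only `supports_image` appears in the Rust snippet above), and CLI backends are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    supports_image: bool = False
    supports_document: bool = False  # field name assumed for this sketch


# Mirrors the capability table above (CLI backends left out).
CAPABILITIES = {
    "anthropic": Capabilities(supports_image=True, supports_document=True),
    "openai": Capabilities(supports_image=True),
    "gemini": Capabilities(supports_image=True),
    "gemini_code_assist": Capabilities(supports_image=True),
    "minimax": Capabilities(),
    "ollama": Capabilities(),
}


class UnsupportedFeature(Exception):
    """Hypothetical stand-in for MotosanError::UnsupportedFeature."""


def check_content(provider: str, block_types: list[str]) -> None:
    """Raise before any network call if a block type is unsupported."""
    caps = CAPABILITIES[provider]
    for kind in block_types:
        if kind == "image" and not caps.supports_image:
            raise UnsupportedFeature(f"{provider} does not accept images")
        if kind == "document" and not caps.supports_document:
            raise UnsupportedFeature(f"{provider} does not accept documents")
```

Failing at this stage keeps the error local and cheap: no request is built, no tokens are billed, and the message names the provider and content type directly.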

### Gemini (HTTP — generativelanguage.googleapis.com)

```rust
// Cargo.toml: features = ["gemini"]
let client = Client::builder()
    .provider(Provider::Gemini)
    .api_key("AIzaSy...")               // Google AI Studio API key
    .model("gemini-2.0-flash")          // optional, this is the default
    .build()?;

// tool use multi-turn: pass function name (not call ID) as tool_call_id
let resp = client.chat_with(req_with_tools).await?;
let tc = &resp.tool_calls[0];
let result = Message::tool_result(&tc.name, tool_output); // ← tc.name, NOT tc.id
```

Billing: pay-per-token. Free tier available but rate-limited. Default model `gemini-2.0-flash`.

### GeminiCodeAssist (HTTP — cloudcode-pa.googleapis.com)

```rust
// Cargo.toml: features = ["gemini-code-assist"]
// Also: motosan-ai-oauth = { version = "0.2", features = ["gemini"] }

use motosan_ai_oauth::{refresh, providers::gemini};

// Refresh OAuth token (stored refresh_token from initial PKCE login)
let token = refresh(&gemini(), &stored_refresh_token).await?;

// Get project ID from loadCodeAssist API (one-time setup):
// POST https://cloudcode-pa.googleapis.com/v1internal:loadCodeAssist
// Authorization: Bearer <token>
// → response["cloudaicompanionProject"] = "your-project-id"

let client = Client::builder()
    .provider(Provider::GeminiCodeAssist)
    .api_key(token.access_token)        // ya29.xxx OAuth token
    .gemini_code_assist_project_id("your-gcp-project-id")
    .model("gemini-2.5-flash")          // optional, this is the default
    .build()?;
```

Billing: subscription-based ($19–$45/seat/month), with 1,000–2,000 requests/day depending on tier. An active Gemini CLI subscription covers this usage at no extra cost. The model must be `gemini-2.5-flash` or `gemini-2.5-pro`; `gemini-2.0-flash` is not available on `cloudcode-pa` for standard-tier accounts.

### Anthropic Auth

- Regular API key (`sk-ant-api*`): uses `x-api-key` header
- OAuth token (`sk-ant-oat01*`): auto-switches to OAuth mode with `Authorization: Bearer`, streaming required, `chat()` auto-redirects to `stream()`
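
The prefix-based switch can be pictured as a small header-selection function. This sketch covers only the header choice listed above; `anthropic_auth_headers` is a hypothetical name, and any additional headers the provider sends in OAuth mode are deliberately left out.

```python
def anthropic_auth_headers(credential: str) -> dict[str, str]:
    """Pick the auth header from the credential prefix described above."""
    if credential.startswith("sk-ant-oat01"):
        # OAuth access token: sent as a Bearer token.
        return {"Authorization": f"Bearer {credential}"}
    # Regular API key (sk-ant-api*): sent in x-api-key.
    return {"x-api-key": credential}
```

Because the mode is derived from the credential itself, the same `api_key` argument works for both kinds of token without a separate configuration flag.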

### Anthropic Base URL

```rust
// Point at a proxy, on-prem deployment, or Anthropic-compatible third-party endpoint.
// Defaults to https://api.anthropic.com when unset. (Rust v0.14.2+)
client_builder.anthropic_base_url("https://proxy.example.com/anthropic")
```

Getter: `client.anthropic_base_url() -> Option<&str>` returns the override, or `None` if the default is in use.

### OpenAI

```rust
// Custom auth header
client_builder.openai_auth_x_api_key()
client_builder.openai_auth_custom_header("X-Auth-Token")

// Point at an OpenAI-compatible endpoint. Pass the FULL URL — no /v1 injection,
// no base_url heuristics. What you pass is what gets POSTed.
client_builder.openai_chat_url("https://api.groq.com/openai/v1/chat/completions")
client_builder.openai_chat_url("https://api.deepseek.com/v1/chat/completions")
client_builder.openai_chat_url("https://my-proxy.example.com/any/path")

// Fallback to the Responses API when chat completions returns 404
client_builder.openai_responses_fallback(true)
// Override the Responses URL (only needed for non-OpenAI hosts that expose it)
client_builder.openai_responses_url("https://api.openai.com/v1/responses")
```

Defaults: `https://api.openai.com/v1/chat/completions` + `https://api.openai.com/v1/responses`. Exported as `DEFAULT_OPENAI_CHAT_URL` / `DEFAULT_OPENAI_RESPONSES_URL` from `motosan_ai::providers::openai`.

### MiniMax

- Routed via Anthropic-compatible `/anthropic/v1/messages`
- Default model (Rust): `MiniMax-M2.7` (`MiniMax-M2.7-highspeed` also supported); the Python default is `MiniMax-Text-01`
- Default base URL: `https://api.minimax.io/anthropic`
- CN base URL override: `.minimax_base_url("https://api.minimaxi.com/anthropic")`
- Uses Anthropic serialization (`tool_use` / `tool_result` content blocks)
- Declares text-only capabilities (`ProviderCapabilities::text_only()`)

### Ollama

```rust
// OpenAI-compatible mode (default)
Client::builder().provider(Provider::Ollama).api_key("ollama")

// Native mode with extra options
Client::builder()
    .provider(Provider::Ollama)
    .api_key("ollama")
    .ollama_native(true)
    .ollama_base_url("http://localhost:11434")
    .ollama_keep_alive("5m")
    .ollama_num_ctx(4096)
```

### CLI Backends

Rust has three opt-in CLI backends (`ClaudeCodeProvider`, `CodexCliProvider`, `GeminiCliProvider`) that shell out to local binaries instead of calling HTTP APIs. Python v0.9.2+ has `ClaudeCodeClient`, `CodexCliClient`, and `GeminiCliClient` with Rust-compatible flag coverage. All CLI paths return the same `ChatResponse` / `StreamEvent` types as HTTP providers.

**Rust unified `Client::builder()` dispatch** (v0.11.0+):

```rust
use motosan_ai::claude_code::{EffortLevel, PermissionMode};
use motosan_ai::codex_cli::SandboxMode;
use motosan_ai::gemini_cli::ApprovalMode;
use motosan_ai::{
    Client, ClaudeCodeProvider, CodexCliProvider, GeminiCliProvider, Message, Provider,
};

// Claude Code via Client::builder — full config via pre-built ClaudeCodeProvider
let client = Client::builder()
    .provider(Provider::ClaudeCode)
    .claude_code(
        ClaudeCodeProvider::new()
            .bare(true)                                 // --bare (daemon-safe; skip hooks/plugins/auto-memory)
            .model("sonnet")                            // --model
            .system_prompt("Be terse.")                 // --system-prompt (full replacement)
            .permission_mode(PermissionMode::Plan)      // --permission-mode plan
            .effort(EffortLevel::Low)                   // --effort low
            .fallback_model("opus")                     // --fallback-model
            .add_dir("/tmp/workspace")                  // --add-dir (repeatable)
            .allow_tool("Edit")                         // --allowed-tools (variadic)
            .disallow_tool("WebFetch")                  // --disallowed-tools (variadic)
            .mcp_config("./mcp.json")                   // --mcp-config (variadic)
            .strict_mcp_config(true)                    // --strict-mcp-config
            .settings("./settings.json")                // --settings
            .setting_source("user")                     // --setting-sources user,project
            .setting_source("project")
            .no_session_persistence(true)               // --no-session-persistence
            .max_budget_usd(2.5),                       // --max-budget-usd
    )
    .build()?;  // no api_key needed
let response = client.chat(vec![Message::user("hi")]).await?;

// Codex CLI via Client::builder — full config via pre-built CodexCliProvider
let client = Client::builder()
    .provider(Provider::CodexCli)
    .codex_cli(
        CodexCliProvider::new()
            .model("gpt-5.1-codex")
            .sandbox(SandboxMode::WorkspaceWrite)
            .profile("work")
            .ephemeral(true)
            .add_dir("/tmp/output")
            .enable_feature("fast_mode")
            .config_override("model_reasoning_effort", "\"low\""),
    )
    .build()?;
let response = client.chat(vec![Message::user("hi")]).await?;

// Gemini CLI via Client::builder
let client = Client::builder()
    .provider(Provider::GeminiCli)
    .gemini_cli(
        GeminiCliProvider::new()
            .model("gemini-2.5-pro")
            .approval_mode(ApprovalMode::Yolo)
            .sandbox(true),
    )
    .build()?;
let response = client.chat(vec![Message::user("hi")]).await?;
```

**Direct use** (still supported, bypasses `Client`):

```rust
use motosan_ai::{CodexCliProvider, codex_cli::SandboxMode};

let provider = CodexCliProvider::new().sandbox(SandboxMode::ReadOnly);
let response = provider.chat(request).await?;
let stream = provider.stream(request).await?;
```

**Python Claude Code direct use** (v0.9.0+):

```python
from motosan_ai import ChatRequest, ClaudeCodeClient, Message

client = (
    ClaudeCodeClient()
    .model("sonnet")
    .system_prompt("Be terse.")      # --system-prompt (full replacement)
    .permission_mode("plan")         # --permission-mode plan
    .effort("low")                   # --effort low
    .allow_tool("Edit")              # --allowed-tools Edit (variadic)
    .max_budget_usd(2.5)              # --max-budget-usd
)
resp = await client.chat(ChatRequest(messages=[Message.user("hi")]))

async for event in client.stream(ChatRequest(messages=[Message.user("hi")])):
    if event.event_type == "usage":
        print(event.usage)
```

**Python Codex CLI direct use** (v0.9.1+):

```python
from motosan_ai import ChatRequest, CodexCliClient, Message, SandboxMode

client = (
    CodexCliClient()
    .sandbox(SandboxMode.workspace_write)      # --sandbox workspace-write
    .model("gpt-5.1-codex")                   # --model
    .profile("work")                          # --profile
    .config_override("approval_policy", "never")  # -c approval_policy=never
)
resp = await client.chat(ChatRequest(messages=[Message.user("hi")]))
```

**Python Gemini CLI direct use** (v0.9.2+):

```python
from motosan_ai import ApprovalMode, ChatRequest, GeminiCliClient, Message

client = (
    GeminiCliClient()
    .model("gemini-2.5-pro")
    .approval_mode(ApprovalMode.plan)       # --approval-mode plan
    .include_dir("/tmp/workspace")          # --include-directories
)
resp = await client.chat(ChatRequest(messages=[Message.user("hi")]))
```

Key notes:
- Runs `claude --print --output-format stream-json` / `codex exec --json --skip-git-repo-check -` / `gemini -p "" -o stream-json` with the prompt on stdin.
- Authentication: `claude` uses your existing login state; `codex` uses `CODEX_API_KEY` env var or `~/.codex/auth.json` (**not** `OPENAI_API_KEY`); `gemini` uses its own local auth (`gemini auth` once — personal Google account or API key).
- Blocking `chat().tool_calls` is always empty on all three. But `stream()` now surfaces CLI tool use as `ToolCallStart → ToolCallArgs → ToolCallEnd` events (Claude `tool_use`; Codex `command_execution` / `mcp_tool_call`; Gemini `tool_use`). Tool *results* run inside the CLI sandbox and are not surfaced.
- **Per-run CLI knobs (v0.20.0), shared by all three providers**: `.cwd(dir)` runs the child with `Command::current_dir` (Codex uses `.cd()` → `--cd`); `.env(k,v)` / `.envs(iter)` inject a per-run secret bundle into the child (redacted from `Debug`, never logged); `.timeout(dur)` / `.no_timeout()` bound each invocation (default: Claude 300 s, Codex/Gemini 600 s) and the `stream()` per-line read-stall deadline (→ `Err(MotosanError::StreamReadTimeout)`); `.resume(id)` continues a prior session (Codex `exec resume <id>`; Gemini/Claude `--resume`), and the provider-minted id is surfaced on `StreamEvent::session_id` / `ChatResponse::session_id`.
- Codex emits **complete** `agent_message` items (not token deltas); `stream()` yields one text event per finalized message. `chat()` treats the last `agent_message` as `content` and folds earlier ones (preamble / tool narration) into `thinking`.
- Gemini emits delta chunks (`{"type":"message","role":"assistant","content":"...","delta":true}`) followed by a terminal `{"type":"result","stats":{...}}` event; both paths share one parser. Usage maps from `stats.input_tokens` / `output_tokens` / `cached` (→ `cache_read_input_tokens`).
- Gemini has no `--system-prompt` flag — `GeminiCliProvider` merges the system text into the stdin payload as a blank-line-separated prefix.
- `SandboxMode` variants (Codex): `ReadOnly` / `WorkspaceWrite` / `DangerFullAccess`. Can coexist with `agent_mode(true)` (`--full-auto`).
- `ApprovalMode` variants (Gemini): `Default` / `AutoEdit` / `Yolo` / `Plan`. `yolo(true)` is shorthand for `--yolo` (same effect as `ApprovalMode::Yolo`).
- `PermissionMode` variants (Claude Code): `AcceptEdits` / `Auto` / `BypassPermissions` / `Default` / `DontAsk` / `Plan`. Distinct from `.agent_mode(true)`, which forwards `--dangerously-skip-permissions` and switches the blocking path to `--output-format json` so usage tokens can be parsed.
- `EffortLevel` variants (Claude Code): `Low` / `Medium` / `High` / `Max`. Maps to `--effort`.
- Claude Code isolation: `.bare(true)` forwards `--bare`, so the spawned `claude` skips hooks, plugins, auto-memory, keychain reads, and user/project settings discovery. Recommended for daemons / servers that must not inherit the operator's interactive state; leave `false` for workflows that should pick up `~/.claude/`. Emitted before `--dangerously-skip-permissions`.
- Claude Code system prompts: `.system_prompt(...)` forwards `--system-prompt` (full replacement); message-extracted system prompts flow through `--append-system-prompt`. Both can coexist — Claude applies append on top of replace.
- Claude Code stream usage: Python v0.9.0+ and Rust both emit a `StreamEvent` with `event_type == "usage"` before terminal `done` when Claude Code's NDJSON `result` event includes usage.
- Codex CLI stream usage: Python v0.9.1+ and Rust both map `turn.completed.usage.cached_input_tokens` to `Usage.cache_read_input_tokens` and emit usage before terminal `done`.
- Gemini CLI stream usage: Python v0.9.2+ and Rust both map `result.stats.cached` to `Usage.cache_read_input_tokens` and emit usage before terminal `done`. Gemini CLI has no trailing `-` stdin marker; system prompts are prepended to stdin with `\n\n`.
- Claude Code budget: `.max_budget_usd(amount)` drops non-finite and negative values at argv-build time so the CLI never receives an invalid number. Only meaningful under `--print`.
- Claude Code session flags (`--session-id` / `--resume` / `--continue` / `--fork-session` / `--no-session-persistence`) all live on the same `ClaudeCodeProvider` builder; blank string inputs are skipped so `.resume("")` is a no-op.
- Claude Code argv order is stable and locked by `common_args_full_loadout_order_is_stable`. Don't grep-reorder it.
- `config_override` values (Codex) are parsed as TOML by the CLI — string values need escaped quotes (`"\"low\""`).
- `codex features list` prints the valid feature names accepted by `.enable_feature()` / `.disable_feature()` — codex validates them against a strict allowlist.
- Only `codex exec` is supported. `codex exec resume` and `codex review` subcommands are out of scope.
- The Rust `ClaudeCodeClient` / `CodexCliClient` type aliases from v0.10.0 were removed in v0.11.0. Use `ClaudeCodeProvider` / `CodexCliProvider` / `GeminiCliProvider`.
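
Several of the notes above describe argv-building rules rather than API calls. The sketch below restates four of them as small standalone functions. All function names (`budget_arg`, `push_value_flag`, `toml_string`, `gemini_stdin`) are hypothetical and are not part of `motosan-ai`. Treating whitespace-only input as blank is also an assumption.

```rust
/// Hypothetical helper: format a budget for argv, or return `None` when the
/// value is non-finite or negative (mirrors the `.max_budget_usd` note).
fn budget_arg(amount: f64) -> Option<String> {
    if amount.is_finite() && amount >= 0.0 {
        Some(amount.to_string())
    } else {
        None
    }
}

/// Hypothetical helper: push `flag value` onto argv, skipping blank values
/// (mirrors the note that `.resume("")` is a no-op).
fn push_value_flag(args: &mut Vec<String>, flag: &str, value: &str) {
    if !value.trim().is_empty() {
        args.push(flag.to_string());
        args.push(value.to_string());
    }
}

/// Hypothetical helper: wrap a value as a TOML basic string so the CLI's TOML
/// parser reads it as a string (mirrors the `config_override` quoting note).
fn toml_string(value: &str) -> String {
    let escaped = value.replace('\\', "\\\\").replace('"', "\\\"");
    format!("\"{escaped}\"")
}

/// Hypothetical helper: prepend system text to the stdin payload with a blank
/// line in between (mirrors the Gemini CLI system-prompt note).
fn gemini_stdin(system: Option<&str>, prompt: &str) -> String {
    match system {
        Some(text) if !text.trim().is_empty() => format!("{text}\n\n{prompt}"),
        _ => prompt.to_string(),
    }
}
```

For example, `toml_string("low")` produces the text `"low"` including the quotes, which is the same value the `"\"low\""` literal in the note encodes.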

---

## codex-oauth

Standalone crate for browser-based PKCE OAuth login against `auth.openai.com`. Returns an access token usable with the ChatGPT backend API.

- crates.io: https://crates.io/crates/codex-oauth
- Version: 0.1.0

### Install

```toml
codex-oauth = "0.1"
```

### API

```rust
// Login (opens browser, listens on localhost:1455, times out after 120s)
let token = codex_oauth::login().await?;

// Refresh
let token = codex_oauth::refresh(&token.refresh_token).await?;

// Expiry check
if token.is_expired() { /* refresh */ }

// Token fields
token.access_token   // Bearer token for chatgpt.com/backend-api
token.refresh_token  // long-lived, use with refresh()
token.expires_in     // lifetime in seconds
token.issued_at      // Unix timestamp of issue time
```

`Token` implements `Serialize`/`Deserialize` for disk persistence.
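
The expiry check can be understood from the two documented timestamps. The sketch below is a stand-in for illustration, not the crate's implementation: `TokenTimes`, `is_expired_at`, `unix_now`, the `u64` field types, and the `skew` margin are all assumptions, and the real `is_expired()` may differ.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical stand-in mirroring two of the documented `Token` fields.
/// The integer types are an assumption.
struct TokenTimes {
    issued_at: u64,  // Unix timestamp of issue time, in seconds
    expires_in: u64, // lifetime in seconds
}

impl TokenTimes {
    /// Returns true once `now + skew` reaches `issued_at + expires_in`.
    /// `skew` is an optional safety margin in seconds (an assumption).
    fn is_expired_at(&self, now: u64, skew: u64) -> bool {
        let expiry = self.issued_at.saturating_add(self.expires_in);
        now.saturating_add(skew) >= expiry
    }
}

/// Current Unix time in seconds; falls back to 0 if the clock is before 1970.
fn unix_now() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|elapsed| elapsed.as_secs())
        .unwrap_or(0)
}
```

A caller would typically evaluate `is_expired_at(unix_now(), 60)` before each request and call `refresh()` when it returns true, so a token is never sent in its final minute.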

---

## anthropic-oauth

Standalone crate for browser-based PKCE OAuth login against `claude.ai`. Returns an `sk-ant-oat01-*` access token usable with `motosan-ai`'s `AnthropicProvider` (which auto-detects the prefix and applies Claude Code identity headers).

- crates.io: https://crates.io/crates/anthropic-oauth
- Version: 0.1.0

### Install

```toml
anthropic-oauth = "0.1"
```

### API

```rust
// Login (opens browser, listens on 127.0.0.1:53692, times out after 120s).
// The redirect URI registered with Anthropic uses hostname "localhost"; the
// bind address is still 127.0.0.1 — the OAuthConfig handles this split.
let token = anthropic_oauth::login().await?;

// Refresh
let token = anthropic_oauth::refresh(&token.refresh_token).await?;

// Expiry check
if token.is_expired() { /* refresh */ }

// Token fields (same shape as codex-oauth)
token.access_token   // "sk-ant-oat01-..." — Bearer for Anthropic API
token.refresh_token  // long-lived, use with refresh()
token.expires_in     // lifetime in seconds
token.issued_at      // Unix timestamp of issue time
```

`Token` implements `Serialize`/`Deserialize` for disk persistence.
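
The prefix auto-detection mentioned above can be pictured as a simple branch on the credential string. This is an illustrative sketch only: `AuthHeader` and `auth_header_for` are hypothetical names, and the provider's actual detection logic and identity headers are not shown here. The one external fact assumed is that regular Anthropic API keys are sent in the `x-api-key` header.

```rust
/// Hypothetical representation of the header a request would carry.
#[derive(Debug, PartialEq)]
enum AuthHeader {
    /// `Authorization: Bearer <token>` for OAuth access tokens.
    Bearer(String),
    /// `x-api-key: <key>` for regular API keys.
    ApiKey(String),
}

/// Prefix of access tokens returned by the OAuth login flow.
const OAUTH_PREFIX: &str = "sk-ant-oat01-";

/// Hypothetical helper: choose the auth header from the credential prefix.
fn auth_header_for(credential: &str) -> AuthHeader {
    if credential.starts_with(OAUTH_PREFIX) {
        AuthHeader::Bearer(format!("Bearer {credential}"))
    } else {
        AuthHeader::ApiKey(credential.to_string())
    }
}
```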

**ToS disclosure**: this crate uses the OAuth `client_id` registered by Anthropic's Claude Code CLI, so the resulting access token authenticates as a Claude Code CLI session. Anthropic has not published this `client_id` for third-party use. Use for purposes other than running the `claude` CLI is subject to change, may be rate limited, and may violate Anthropic's terms of service. See the project README for the full disclosure.

---

## Release

The Python, Rust, and TypeScript SDKs and the OAuth crates (`motosan-ai-oauth`, `codex-oauth`, `anthropic-oauth`) are versioned and released **independently**.

### Tag Convention

| SDK              | Tag format                | Example                   |
|------------------|---------------------------|---------------------------|
| Python           | `python-vX.Y.Z`           | `python-v0.4.2`           |
| Rust             | `rust-vX.Y.Z`             | `rust-v0.3.3`             |
| TypeScript       | `ts-vX.Y.Z`               | `ts-v0.10.0`              |
| motosan-ai-oauth | `motosan-ai-oauth-vX.Y.Z` | `motosan-ai-oauth-v0.2.0` |
| codex-oauth      | `codex-oauth-vX.Y.Z`      | `codex-oauth-v0.1.0`      |
| anthropic-oauth  | `anthropic-oauth-vX.Y.Z`  | `anthropic-oauth-v0.1.0`  |
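
Because every tag ends in `-vX.Y.Z`, the package name and version can be recovered by splitting at the last `-v`. The function below is a minimal sketch of that idea; `parse_release_tag` is a hypothetical name, and the strict three-part numeric check is an assumption (it would reject pre-release suffixes such as `-rc.1`).

```rust
/// Hypothetical helper: split a release tag into (package, version).
/// Returns `None` unless the tag looks like `<package>-vX.Y.Z` with
/// three numeric components.
fn parse_release_tag(tag: &str) -> Option<(&str, &str)> {
    // Split at the LAST "-v" so package names containing hyphens survive.
    let (package, version) = tag.rsplit_once("-v")?;
    let parts: Vec<&str> = version.split('.').collect();
    let numeric = parts.len() == 3
        && parts
            .iter()
            .all(|part| !part.is_empty() && part.bytes().all(|b| b.is_ascii_digit()));
    if package.is_empty() || !numeric {
        return None;
    }
    Some((package, version))
}
```

With the examples from the table, `parse_release_tag("motosan-ai-oauth-v0.2.0")` yields the pair `motosan-ai-oauth` and `0.2.0`.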

### Release Steps (Python)

```bash
# 1. Bump version
#    sdks/python/pyproject.toml → version = "X.Y.Z"

# 2. Update CHANGELOG
#    sdks/python/CHANGELOG.md → ## [X.Y.Z] - YYYY-MM-DD

# 3. Update version references in:
#    - README.md (root) — Languages table, provider table
#    - AGENTS.md — "Current versions" section, Releasing table
#    - llms.txt — header version line, Install section
#    - skills/motosan-ai/SKILL.md — header version, Install section

# 4. Commit
git add sdks/python/pyproject.toml sdks/python/CHANGELOG.md README.md AGENTS.md llms.txt skills/motosan-ai/SKILL.md
git commit -m "chore: release python-vX.Y.Z"

# 5. Tag + push (triggers publish-python.yml → PyPI)
git tag -a python-vX.Y.Z -m "python-vX.Y.Z — summary"
git push origin main python-vX.Y.Z
```

### Release Steps (Rust)

```bash
# 1. Bump version
#    sdks/rust/Cargo.toml → version = "X.Y.Z"

# 2. Update CHANGELOG
#    sdks/rust/CHANGELOG.md → ## [X.Y.Z] - YYYY-MM-DD

# 3. Update version references in:
#    - README.md (root) — Languages table, provider table
#    - AGENTS.md — "Current versions" section, Releasing table
#    - llms.txt — header version line, Install section
#    - skills/motosan-ai/SKILL.md — header version, Install section

# 4. Commit
git add sdks/rust/Cargo.toml sdks/rust/CHANGELOG.md README.md AGENTS.md llms.txt skills/motosan-ai/SKILL.md
git commit -m "chore: release rust-vX.Y.Z"

# 5. Tag + push (triggers publish-rust.yml → crates.io)
git tag -a rust-vX.Y.Z -m "rust-vX.Y.Z — summary"
git push origin main rust-vX.Y.Z
```

### Release Steps (TypeScript)

```bash
# 1. Bump version
#    sdks/typescript/package.json → "version": "X.Y.Z"

# 2. Update CHANGELOG
#    sdks/typescript/CHANGELOG.md → ## [X.Y.Z] - YYYY-MM-DD

# 3. Update version references in:
#    - README.md (root) — Languages table
#    - AGENTS.md — Releasing paragraph
#    - llms.txt — Tag Convention + this section

# 4. Commit
git add sdks/typescript/package.json sdks/typescript/CHANGELOG.md README.md AGENTS.md llms.txt skills/motosan-ai/SKILL.md
git commit -m "chore: release ts-vX.Y.Z"

# 5. Tag + push (triggers publish-typescript.yml → npm)
git tag ts-vX.Y.Z
git push origin main ts-vX.Y.Z
```

### Release Steps (motosan-ai-oauth)

```bash
# 1. Bump version
#    sdks/rust/crates/motosan-ai-oauth/Cargo.toml → version = "0.2.0"

# 2. Update CHANGELOG
#    sdks/rust/crates/motosan-ai-oauth/CHANGELOG.md → ## [0.2.0] - YYYY-MM-DD

# 3. Commit
git add sdks/rust/crates/motosan-ai-oauth/Cargo.toml sdks/rust/crates/motosan-ai-oauth/CHANGELOG.md
git commit -m "chore: release motosan-ai-oauth-v0.2.0"

# 4. Tag + push (triggers publish-motosan-ai-oauth.yml → crates.io)
git tag -a motosan-ai-oauth-v0.2.0 -m "motosan-ai-oauth-v0.2.0 — summary"
git push origin main motosan-ai-oauth-v0.2.0
```

Publish `motosan-ai-oauth` before publishing wrapper crates (`codex-oauth`,
`anthropic-oauth`) that depend on its new version.

### Release Steps (codex-oauth)

```bash
# 1. Bump version
#    sdks/rust/crates/codex-oauth/Cargo.toml → version = "0.1.1"

# 2. Update CHANGELOG
#    sdks/rust/crates/codex-oauth/CHANGELOG.md → ## [0.1.1] - YYYY-MM-DD

# 3. Commit
git add sdks/rust/crates/codex-oauth/Cargo.toml sdks/rust/crates/codex-oauth/CHANGELOG.md
git commit -m "chore: release codex-oauth-v0.1.1"

# 4. Tag + push (triggers publish-codex-oauth.yml → crates.io)
git tag -a codex-oauth-v0.1.1 -m "codex-oauth-v0.1.1 — summary"
git push origin main codex-oauth-v0.1.1
```

### Release Steps (anthropic-oauth)

```bash
# 1. Bump version
#    sdks/rust/crates/anthropic-oauth/Cargo.toml → version = "0.1.1"

# 2. Update CHANGELOG
#    sdks/rust/crates/anthropic-oauth/CHANGELOG.md → ## [0.1.1] - YYYY-MM-DD

# 3. Commit
git add sdks/rust/crates/anthropic-oauth/Cargo.toml sdks/rust/crates/anthropic-oauth/CHANGELOG.md
git commit -m "chore: release anthropic-oauth-v0.1.1"

# 4. Tag + push (triggers publish-anthropic-oauth.yml → crates.io)
git tag -a anthropic-oauth-v0.1.1 -m "anthropic-oauth-v0.1.1 — summary"
git push origin main anthropic-oauth-v0.1.1
```

### CI Pipeline

Tag push triggers GitHub Actions:
- **publish-python.yml**: `uv build` → `pypa/gh-action-pypi-publish` (secret: `PYPI_API_TOKEN`)
- **publish-rust.yml**: `cargo fmt --check` → `cargo clippy` → `cargo test --all-features` → `cargo publish` (secret: `CARGO_REGISTRY_TOKEN`)
- **publish-typescript.yml**: `npm ci` → `npm run build` → `npm run test` → version-matches-tag guard → `npm publish --provenance --access public` (secret: `NPM_TOKEN`)
- **publish-motosan-ai-oauth.yml**: `cargo fmt --check` → `cargo clippy` → `cargo test` → `cargo publish` (secret: `CARGO_REGISTRY_TOKEN`)
- **publish-codex-oauth.yml**: `cargo fmt --check` → `cargo clippy` → `cargo test` → `cargo publish` (secret: `CARGO_REGISTRY_TOKEN`)
- **publish-anthropic-oauth.yml**: `cargo fmt --check` → `cargo clippy` → `cargo test` → `cargo publish` (secret: `CARGO_REGISTRY_TOKEN`)

All Rust workflows also support `workflow_dispatch` for manual triggering.
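
The version-matches-tag guard in the TypeScript workflow is named but not spelled out. One way such a check could work is sketched below. `tag_matches_version` is a hypothetical function, and the actual workflow step may be implemented differently (most likely as a shell step).

```rust
/// Hypothetical guard: succeed only when the pushed tag, minus its prefix,
/// equals the version declared in the package manifest.
fn tag_matches_version(
    tag: &str,
    tag_prefix: &str,
    manifest_version: &str,
) -> Result<(), String> {
    match tag.strip_prefix(tag_prefix) {
        Some(version) if version == manifest_version => Ok(()),
        Some(version) => Err(format!(
            "tag version {version} does not match manifest version {manifest_version}"
        )),
        None => Err(format!("tag {tag} does not start with {tag_prefix}")),
    }
}
```

Failing the job before `npm publish` keeps a mistyped tag from publishing a package whose version differs from the tag that triggered it.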

### Pre-Push Validation (Local)

```bash
./scripts/pre-push-gate.sh
# [1/4] Python unit tests
# [2/4] Rust unit tests
# [3/4] Python live tests (skipped if no ANTHROPIC_API_KEY)
# [4/4] Rust live tests (skipped if no ANTHROPIC_API_KEY)
```

### Emergency Manual Publish

```bash
# Python
cd sdks/python && uv build --out-dir dist && uv publish dist/*

# Rust
cd sdks/rust && cargo publish

# TypeScript
cd sdks/typescript && npm ci && npm run build && npm publish --access public
```