# motosan-ai

> Multi-provider LLM SDK for Python and Rust. Unified API for Anthropic, OpenAI, Ollama, MiniMax, and Gemini — with streaming, tool use, retry, and ThinkStripper built in.

- Python 0.12.1 · Rust 0.20.0
- GitHub: https://github.com/motosan-dev/motosan-ai
- PyPI: https://pypi.org/project/motosan-ai/
- crates.io: https://crates.io/crates/motosan-ai

## Install

```bash
# Python — pick providers via extras
pip install "motosan-ai[anthropic]"
pip install "motosan-ai[gemini]"
pip install "motosan-ai[anthropic,openai,ollama,minimax,gemini]"
```

```toml
# Rust — pick features in Cargo.toml
[dependencies]
motosan-ai = { version = "0.20.0", features = ["anthropic"] }
# features: anthropic | openai | minimax | ollama | ollama_native | full
#           gemini | gemini-code-assist
# CLI backends (Rust features; Python has built-in ClaudeCodeClient/CodexCliClient/GeminiCliClient):
#   claude-code → ClaudeCodeProvider → `claude --print --output-format stream-json`
#   codex-cli   → CodexCliProvider   → `codex exec --json -`
#   gemini-cli  → GeminiCliProvider  → `gemini -p "" -o stream-json`
```

## Environment Variables

| Provider  | Env var             |
|-----------|---------------------|
| Anthropic | `ANTHROPIC_API_KEY` |
| OpenAI    | `OPENAI_API_KEY`    |
| MiniMax   | `MINIMAX_API_KEY`   |
| Gemini    | `GEMINI_API_KEY`    |
| Ollama    | (none — local)      |

## Model Defaults

| Provider         | Default model                                          |
|------------------|--------------------------------------------------------|
| Anthropic        | `claude-sonnet-4-6`                                    |
| OpenAI           | Python: `gpt-4o` · Rust: `gpt-5.3-codex`               |
| MiniMax          | Python: `MiniMax-Text-01` · Rust: `MiniMax-M2.7`       |
| Ollama           | `llama3.2`                                             |
| Gemini           | Python: `gemini-2.5-flash` · Rust: `gemini-2.0-flash`  |
| GeminiCodeAssist | `gemini-2.5-flash`                                     |

Override per client or per request. Anthropic's model catalog includes `claude-opus-4-8`; use it as an override when you want Opus.
For Opus 4.8/4.7/4.6, `thinking` uses Anthropic adaptive thinking (`thinking.type = "adaptive"`, summarized display, `output_config.effort = "high"`), and OAuth adaptive-thinking requests omit the legacy `interleaved-thinking` beta header, matching pi.

---

## Python API

### Client

```python
from motosan_ai import Client, Message

# Factory methods (reads env var automatically)
client = Client.anthropic()
client = Client.openai(model="gpt-4o")
client = Client.minimax()
client = Client.gemini()
client = Client.ollama(model="llama3.2", base_url="http://localhost:11434")
```

Python HTTP providers: Anthropic, OpenAI, MiniMax, Gemini, Ollama. Gemini uses `GEMINI_API_KEY`, default model `gemini-2.5-flash`, supports text, images, tools/tool choice, streaming, stop sequences, and usage tokens. Document/PDF blocks are rejected before HTTP.

### chat() — single turn

```python
resp = await client.chat([Message.user("Hello")])
print(resp.content)      # str
print(resp.stop_reason)  # StopReason: end_turn | max_tokens | tool_use | stop
print(resp.usage)        # Usage: input_tokens, output_tokens
print(resp.tool_calls)   # list[ToolCall] — empty if no tool use
```

### chat() — full control via keyword args

```python
resp = await client.chat(
    [Message.user("Hello")],
    system="You are a helpful assistant.",
    temperature=0.7,
    max_tokens=1024,
    tools=[...],
    provider_options={"key": "val"},
)
```

### stream() — streaming text

```python
async for event in client.stream([Message.user("Tell me a story")]):
    if event.event_type == "text":
        print(event.content, end="", flush=True)
    if event.done:
        break
```

### stream() — with tools

```python
async for event in client.stream(
    messages,
    tools=tools,
    system="You are helpful.",
):
    ...  # handle events (see Streaming section)
```

### Rust-parity Client methods (Python v0.10.0)

```python
from motosan_ai import ChatRequest

# Full ChatRequest passthrough; request.model falls back to client.model.
# Opus 4.8/4.7/4.6 thinking is sent as adaptive thinking.
req = (
    ChatRequest.builder()
    .message(Message.user("Hi"))
    .model("claude-opus-4-8")
    .thinking(1024)
    .build()
)
resp = await client.chat_with(req)
async for event in client.stream_with(req):
    ...

# Stream-to-ChatResponse assembly.
resp = await client.stream_collect([Message.user("Hi")])
resp = await client.stream_collect_with(req)
```

Use `*_with` for fields not exposed by `chat()`/`stream()` kwargs: `thinking`, `tool_choice`, `mcp_servers`, `system_blocks`, and `stop_sequences`. The SDK is async-only — wrap in `asyncio.run(client.chat(...))` from sync code.

### Message Helpers

```python
Message.user("Hello")
Message.assistant("Hi there")
Message.system("You are a helpful assistant")
Message.assistant_with_tool_calls("Let me check", tool_calls=[...])
Message.tool_result(tool_call_id="call_123", content='{"result": 42}')
```

### Retry

```python
# default retries = 3
client = Client.anthropic(api_key="...", max_retries=3)

# disable retries
client = Client.anthropic(api_key="...", max_retries=0)
```

### Error Handling

```python
from motosan_ai import MotosanError, AuthError, RateLimitError

try:
    resp = await client.chat([Message.user("Hi")])
except RateLimitError as e:
    print(f"Rate limited: {e}")
except AuthError as e:
    print(f"Auth failed: {e}")
except MotosanError as e:
    print(f"Error: {e}")
```

Error hierarchy: `MotosanError` → `AuthError`, `RateLimitError`, `InvalidRequestError`, `ConfigError`, `ProviderError`, `NetworkError`, `StreamError`

---

## Rust API

### Client

```rust
use motosan_ai::{Client, Message, Provider, RetryPolicy};

let client = Client::builder()
    .provider(Provider::Anthropic) // required
    .api_key(std::env::var("ANTHROPIC_API_KEY")?)
    // required for HTTP providers
    .model("claude-sonnet-4-6") // optional
    .retry_policy(RetryPolicy::new()) // optional
    .stream_read_timeout_secs(30) // optional — terminate stream after 30s of silence
    .build()?;
```

Provider variants: `Anthropic` | `OpenAI` | `Minimax` | `Ollama` | `Gemini` | `GeminiCodeAssist` | `ClaudeCode` | `CodexCli` | `GeminiCli`

`Gemini` (v0.13.0, feature `gemini`) — HTTP client for `generativelanguage.googleapis.com`. API key via `x-goog-api-key` header. Default model `gemini-2.0-flash`. **Gemini-specific convention**: `Message::tool_result` must pass the function name (not an opaque call ID) as `tool_call_id`, because Gemini's `functionResponse.name` must be the function name.

`GeminiCodeAssist` (Rust v0.13.0 feature `gemini-code-assist`; Python v0.10.0 `GeminiCodeAssistProvider` / `Provider.gemini_code_assist`) — HTTP client for `cloudcode-pa.googleapis.com/v1internal`. OAuth Bearer token (`ya29.*` from `motosan-ai-oauth` in Rust or `motosan_ai.oauth` PKCE flow in Python). Requires GCP project ID. Default model `gemini-2.5-flash`. Billing is subscription-based (not per-token). `chat()` implemented internally as `stream()` + collect (no non-streaming endpoint).

Python Anthropic Claude Pro/Max OAuth is available via `motosan_ai.oauth.claude_pro_max_config()` + `login()` (returns an `sk-ant-oat01-*` token).

CLI backends (`ClaudeCode` / `CodexCli` / `GeminiCli`) go through the **same** `client.chat(...)` / `client.stream(...)` API. `api_key` is optional on the builder for these paths; configure provider-specific flags by passing a pre-built `ClaudeCodeProvider` / `CodexCliProvider` / `GeminiCliProvider` via `.claude_code(...)` / `.codex_cli(...)` / `.gemini_cli(...)`. `ClaudeCode` and `CodexCli` landed in v0.11.0; `GeminiCli` followed in the same generation. See the CLI Backends section under Provider-Specific Notes for a full example.
### chat() — single turn

```rust
let resp = client.chat(vec![Message::user("Hello")]).await?;
println!("{}", resp.content);       // String
println!("{:?}", resp.stop_reason); // StopReason enum
println!("{}+{}", resp.usage.input_tokens, resp.usage.output_tokens);
```

### chat_with() — full control

```rust
use motosan_ai::ChatRequest;

let request = ChatRequest::builder()
    .messages(vec![Message::user("Hello")])
    .system("You are helpful")
    .model("claude-opus-4-8")
    .temperature(0.7)
    .max_tokens(1024)
    .tools(vec![...])
    .build();
let resp = client.chat_with(request).await?;
```

### stream() — streaming text

```rust
use futures_util::StreamExt;

let mut stream = client.stream(vec![Message::user("Tell me a story")]).await?;
while let Some(item) = stream.next().await {
    let event = item?;
    if !event.content.is_empty() {
        print!("{}", event.content);
    }
    if event.done {
        // Terminal event carries the stop reason when the provider reports one.
        // Streams emit exactly ONE done event (since v0.10.1), even on
        // non-conformant proxies that skip [DONE] and finish_reason.
        if let Some(reason) = event.stop_reason {
            eprintln!("\n[stop_reason: {reason:?}]");
        }
        break;
    }
}
```

### stream_with() — streaming + tools

```rust
let request = ChatRequest::builder()
    .messages(messages)
    .tools(tools)
    .system("You are helpful")
    .build();

let mut stream = client.stream_with(request).await?;
while let Some(item) = stream.next().await {
    let event = item?;
    match event.event_type {
        StreamEventType::Text => print!("{}", event.content),
        StreamEventType::ToolCallStart => { /* event.tool_call_id, event.tool_call_name */ },
        StreamEventType::ToolCallArgs => { /* event.tool_call_args_delta */ },
        StreamEventType::ToolCallEnd => { /* tool call complete */ },
        StreamEventType::Usage => { /* event.usage */ },
        StreamEventType::ThinkingDelta => { /* live extended-thinking chunk (Anthropic only) */ },
        StreamEventType::ThinkingDone => { /* full thinking text on block close (Anthropic only) */ },
    }
    if event.done {
        break;
    }
}
```

### Message Helpers

```rust
Message::user("Hello")
Message::assistant("Hi")
Message::system("You are helpful")
Message::tool_result("call_id", "result JSON string")
Message::tool("result", "call_id") // alias for tool_result
Message::assistant_with_tool_calls("text", tool_calls)
Message::user_with_image("prompt", "base64data", "image/png") // image + text
Message::user_with_blocks(vec![ContentBlock::Image { .. }, ContentBlock::Text { .. }])
Message::user_with_pdf_base64("prompt", "base64data") // document (Anthropic only)
```

### BoxStream Type

```rust
// stream() and stream_with() return: Result<BoxStream<StreamEvent>>
// where BoxStream<T> = Pin<Box<dyn Stream<Item = Result<T>> + Send>>
```

Note: stream items are fallible (Rust 0.20+). Use `let event = item?;` in `StreamExt::next()` loops; mid-stream provider/timeout errors surface as `Err(...)`.
### RetryPolicy

```rust
use motosan_ai::RetryPolicy;

let policy = RetryPolicy::new()
    .max_retries(3)             // default 3
    .base_delay_ms(100)         // default 100
    .max_delay_ms(2_000)        // default 2000
    .jitter(true)               // default true
    .respect_retry_after(true); // default true
```

Retries on: 429 (rate limit), 5xx (server error), timeout/connect errors. Backoff: `base_delay * 2^(attempt-1)`, capped at `max_delay_ms`, with optional jitter. With the defaults and jitter disabled, the three retries wait 100 ms, 200 ms, and 400 ms.

### Error Handling

```rust
use motosan_ai::MotosanError;

match client.chat(messages).await {
    Ok(resp) => println!("{}", resp.content),
    Err(MotosanError::Auth(msg)) => eprintln!("Auth: {msg}"),
    Err(MotosanError::RateLimit(msg)) => eprintln!("Rate limited: {msg}"),
    Err(MotosanError::InvalidRequest(msg)) => eprintln!("Bad request: {msg}"),
    Err(MotosanError::Config(msg)) => eprintln!("Config: {msg}"),
    Err(MotosanError::ProviderError(msg)) => eprintln!("Provider: {msg}"),
    Err(MotosanError::Network(msg)) => eprintln!("Network: {msg}"),
    Err(MotosanError::Stream(msg)) => eprintln!("Stream: {msg}"),
    Err(MotosanError::StreamReadTimeout(secs)) => eprintln!("Stream timed out after {secs}s"),
    Err(MotosanError::UnsupportedFeature(msg)) => eprintln!("Unsupported: {msg}"),
}
```

---

## Tool Use

### Define Tools

**Python:**

```python
from motosan_ai import Tool

tools = [Tool(
    name="get_weather",
    description="Get weather for a city",
    input_schema={"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]},
)]
```

**Rust (0.18.0+):** `Tool` composes `motosan_agent_primitives::ToolSchema` (re-exported as `motosan_ai::ToolSchema`).
`description` and `input_schema` are required fields on `ToolSchema` — no longer `Option`:

```rust
use motosan_ai::{Tool, ToolSchema};
use serde_json::json;

let tools = vec![Tool::from(ToolSchema {
    name: "get_weather".into(),
    description: "Get weather for a city".into(),
    input_schema: json!({"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]}),
})];
```

When bridging from agent-side declarations, pass `ToolSchema`s directly via `ChatRequest::builder().tool_schemas(&schemas)` — the old `tool_defs(&[ToolDef])` builder, the `agent-tool` feature, and the `motosan-agent-tool` dependency were all removed in 0.18.0.

### Multi-turn Tool Loop (Python)

```python
from motosan_ai import Client, Message, StopReason
import json

# execute_tool is your own async dispatcher; it is not part of the SDK.
async def agent_loop(client, user_input: str, tools):
    messages = [Message.user(user_input)]
    while True:
        resp = await client.chat(messages, tools=tools)
        if resp.stop_reason != StopReason.tool_use or not resp.tool_calls:
            return resp.content
        messages.append(Message.assistant_with_tool_calls(resp.content, resp.tool_calls))
        for tc in resp.tool_calls:
            result = await execute_tool(tc.name, tc.input)
            messages.append(Message.tool_result(tool_call_id=tc.id, content=json.dumps(result)))
```

### Multi-turn Tool Loop (Rust)

```rust
use motosan_ai::{Client, Message, ChatRequest, StopReason, Tool};

// The boxed error type is illustrative; use whatever your crate standardizes on.
async fn agent_loop(
    client: &Client,
    input: &str,
    tools: Vec<Tool>,
) -> Result<String, Box<dyn std::error::Error>> {
    let mut messages = vec![Message::user(input)];
    loop {
        let request = ChatRequest::builder()
            .messages(messages.clone())
            .tools(tools.clone())
            .build();
        let resp = client.chat_with(request).await?;
        if resp.stop_reason != StopReason::ToolUse || resp.tool_calls.is_empty() {
            return Ok(resp.content);
        }
        messages.push(Message::assistant_with_tool_calls(&resp.content, resp.tool_calls.clone()));
        for tc in &resp.tool_calls {
            let result = execute_tool(&tc.name, &tc.input).await?;
            messages.push(Message::tool_result(&tc.id, &result));
        }
    }
}
```

### ToolCall Fields

```
tool_call.id    — unique call ID (required for tool_result)
tool_call.name  — which tool was called
tool_call.input — parsed arguments (Python: dict, Rust: serde_json::Value)
```

---

## Streaming Events

### StreamEvent Fields

```
event.event_type           — "text" | "tool_call_start" | "tool_call_args" | "tool_call_end" | "usage"
event.content              — text delta (String)
event.done                 — True on last event (bool)
event.tool_call_id         — tool call ID (Optional)
event.tool_call_name       — tool name, on tool_call_start (Optional)
event.tool_call_args_delta — incremental JSON args, on tool_call_args (Optional)
event.usage                — token tally on usage events (Optional)
event.stop_reason          — StopReason on the terminal `done` event (Optional)
                             • Anthropic / MiniMax: from message_delta.delta.stop_reason
                             • OpenAI: from choices[0].finish_reason
                             • None on intermediate events
                             • None on done events for providers that don't report a reason
```

Each provider stream emits **exactly one** `done` event — guaranteed since v0.10.1 even when the upstream provider closes the connection without `[DONE]` and without a `finish_reason` chunk (some non-conformant proxies do this). If the provider reports a stop reason, it lands on that event. `collect_stream` honors the explicit `stop_reason` and only falls back to its tool-calls heuristic when none was reported.

### Event Sequence for Tool Calls

```
tool_call_start → tool_call_id + tool_call_name set
tool_call_args  → tool_call_args_delta accumulates (may arrive in multiple chunks)
tool_call_args  → ...
tool_call_end   → tool_call_id set, args complete
```

Buffer `tool_call_args_delta` until `tool_call_end`, then JSON parse.
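To make the event sequence and the `collect_stream` fallback concrete, here is a minimal, self-contained sketch. It does not call motosan-ai: `Event` and `collect` are hypothetical stand-ins that mirror the field names listed above, and the `"tool_use"` / `"end_turn"` values chosen by the fallback are an assumption about what the heuristic produces, not a statement of the SDK's internals.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """Hypothetical stand-in mirroring the StreamEvent fields listed above."""
    event_type: str
    content: str = ""
    done: bool = False
    tool_call_id: Optional[str] = None
    tool_call_name: Optional[str] = None
    tool_call_args_delta: Optional[str] = None
    stop_reason: Optional[str] = None


def collect(events):
    """Fold an event stream into one response-like dict."""
    text = []
    pending = {}  # tool_call_id -> {"name": str, "args": str}
    tool_calls = []
    stop_reason = None
    for ev in events:
        if ev.event_type == "text":
            text.append(ev.content)
        elif ev.event_type == "tool_call_start":
            pending[ev.tool_call_id] = {"name": ev.tool_call_name, "args": ""}
        elif ev.event_type == "tool_call_args":
            pending[ev.tool_call_id]["args"] += ev.tool_call_args_delta or ""
        elif ev.event_type == "tool_call_end":
            call = pending.pop(ev.tool_call_id)
            # Parse only once the buffered arguments are complete.
            tool_calls.append({
                "id": ev.tool_call_id,
                "name": call["name"],
                "input": json.loads(call["args"] or "{}"),
            })
        if ev.done:
            stop_reason = ev.stop_reason  # may be None if the provider reported nothing
            break
    if stop_reason is None:
        # Assumed fallback: infer the reason from whether any tool calls were seen.
        stop_reason = "tool_use" if tool_calls else "end_turn"
    return {"content": "".join(text), "tool_calls": tool_calls, "stop_reason": stop_reason}


events = [
    Event("text", content="Checking. "),
    Event("tool_call_start", tool_call_id="call_1", tool_call_name="get_weather"),
    Event("tool_call_args", tool_call_id="call_1", tool_call_args_delta='{"city": '),
    Event("tool_call_args", tool_call_id="call_1", tool_call_args_delta='"Taipei"}'),
    Event("tool_call_end", tool_call_id="call_1"),
    Event("text", done=True),  # terminal event without a reported stop reason
]
result = collect(events)
print(result["stop_reason"], result["tool_calls"][0]["input"])
```

An explicit `stop_reason` on the terminal event wins over the fallback, which is the ordering the paragraph above describes.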
### Streaming Tool Use (Python)

```python
import json

pending = {}  # tool_call_id → {"name": str, "args": str}

async for event in client.stream(
    request.messages,
    tools=request.tools,
    system=request.system,
):
    match event.event_type:
        case "text":
            print(event.content, end="", flush=True)
        case "tool_call_start":
            pending[event.tool_call_id] = {"name": event.tool_call_name, "args": ""}
        case "tool_call_args":
            pending[event.tool_call_id]["args"] += event.tool_call_args_delta
        case "tool_call_end":
            tc = pending.pop(event.tool_call_id)
            args = json.loads(tc["args"])
            result = await execute_tool(tc["name"], args)
    if event.done:
        break
```

---

## ThinkStripper

`<think>...</think>` blocks in LLM output (e.g. DeepSeek, QwQ, MiniMax reasoning models) are stripped **automatically** at the `stream()` / `stream_with()` level.

Manual use (Python):

```python
from motosan_ai import ThinkStripper

stripper = ThinkStripper()
clean = stripper.feed("<think>reasoning...</think>Hello")  # → "Hello"
tail = stripper.flush()  # call at stream end
```

Manual use (Rust):

```rust
use motosan_ai::think_stripper::ThinkStripper;

let mut stripper = ThinkStripper::new();
let clean = stripper.feed("<think>reasoning...</think>Hello"); // → "Hello"
let tail = stripper.flush(); // call at stream end
```

---

## Provider-Specific Notes

### Provider Capabilities (Rust, v0.13.1+)

Providers declare which content types they accept. Passing unsupported content returns `Err(MotosanError::UnsupportedFeature(...))` **before** any network call.

| Provider                        | Image | Document |
|---------------------------------|-------|----------|
| Anthropic                       | ✓     | ✓        |
| OpenAI                          | ✓     | ✗        |
| Gemini / GeminiCodeAssist       | ✓     | ✗        |
| MiniMax / Ollama / CLI backends | ✗     | ✗        |

```rust
use motosan_ai::ProviderCapabilities;

// Check programmatically
let caps = provider.capabilities();
if caps.supports_image { /* send image */ }
```

When implementing a custom `ProviderImpl`, override `capabilities()` to declare what the provider supports:

```rust
fn capabilities(&self) -> ProviderCapabilities {
    ProviderCapabilities::with_image() // or full() / text_only()
}
```

### Gemini (HTTP — generativelanguage.googleapis.com)

```rust
// Cargo.toml: features = ["gemini"]
let client = Client::builder()
    .provider(Provider::Gemini)
    .api_key("AIzaSy...")      // Google AI Studio API key
    .model("gemini-2.0-flash") // optional, this is the default
    .build()?;

// tool use multi-turn: pass function name (not call ID) as tool_call_id
let resp = client.chat_with(req_with_tools).await?;
let tc = &resp.tool_calls[0];
let result = Message::tool_result(&tc.name, tool_output); // ← tc.name, NOT tc.id
```

Billing: pay-per-token. Free tier available but rate-limited. Default model `gemini-2.0-flash`.
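
The Gemini tool-result convention above is easy to get wrong in code that targets several providers, so one option is to centralize the choice in a small helper. This is a hedged sketch rather than SDK code: `ToolCall` is a hypothetical stand-in for the SDK's type, `tool_result_id` and the `"gemini"` provider label are names invented for illustration, and this document only states the convention for the Rust `Gemini` provider.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """Hypothetical stand-in with the id/name fields described under ToolCall Fields."""
    id: str
    name: str


def tool_result_id(provider: str, call: ToolCall) -> str:
    """Return the value to pass as tool_call_id when building a tool result."""
    # Gemini's functionResponse.name must be the function name, not an opaque call ID.
    if provider == "gemini":
        return call.name
    return call.id


call = ToolCall(id="call_123", name="get_weather")
print(tool_result_id("gemini", call))     # get_weather
print(tool_result_id("anthropic", call))  # call_123
```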

### GeminiCodeAssist (HTTP — cloudcode-pa.googleapis.com)

```rust
// Cargo.toml: features = ["gemini-code-assist"]
// Also: motosan-ai-oauth = { version = "0.2", features = ["gemini"] }
use motosan_ai_oauth::{refresh, providers::gemini};

// Refresh OAuth token (stored refresh_token from initial PKCE login)
let token = refresh(&gemini(), &stored_refresh_token).await?;

// Get project ID from loadCodeAssist API (one-time setup):
//   POST https://cloudcode-pa.googleapis.com/v1internal:loadCodeAssist
//   Authorization: Bearer <access_token>
//   → response["cloudaicompanionProject"] = "your-project-id"

let client = Client::builder()
    .provider(Provider::GeminiCodeAssist)
    .api_key(token.access_token) // ya29.xxx OAuth token
    .gemini_code_assist_project_id("your-gcp-project-id")
    .model("gemini-2.5-flash") // optional, this is the default
    .build()?;
```

Billing: subscription-based ($19–$45/seat/month). 1,000–2,000 requests/day depending on tier. If you have an active Gemini CLI subscription, this is effectively free (no extra cost). The model must be `gemini-2.5-flash` or `gemini-2.5-pro`; `gemini-2.0-flash` is not available on `cloudcode-pa` for standard-tier accounts.

### Anthropic Auth

- Regular API key (`sk-ant-api*`): uses `x-api-key` header
- OAuth token (`sk-ant-oat01*`): auto-switches to OAuth mode with `Authorization: Bearer`, streaming required, `chat()` auto-redirects to `stream()`

### Anthropic Base URL

```rust
// Point at a proxy, on-prem deployment, or Anthropic-compatible third-party endpoint.
// Defaults to https://api.anthropic.com when unset. (Rust v0.14.2+)
client_builder.anthropic_base_url("https://proxy.example.com/anthropic")
```

Getter: `client.anthropic_base_url() -> Option<&str>` returns the override, or `None` if the default is in use.

### OpenAI

```rust
// Custom auth header
client_builder.openai_auth_x_api_key()
client_builder.openai_auth_custom_header("X-Auth-Token")

// Point at an OpenAI-compatible endpoint.
// Pass the FULL URL — no /v1 injection,
// no base_url heuristics. What you pass is what gets POSTed.
client_builder.openai_chat_url("https://api.groq.com/openai/v1/chat/completions")
client_builder.openai_chat_url("https://api.deepseek.com/v1/chat/completions")
client_builder.openai_chat_url("https://my-proxy.example.com/any/path")

// Fallback to the Responses API when chat completions returns 404
client_builder.openai_responses_fallback(true)

// Override the Responses URL (only needed for non-OpenAI hosts that expose it)
client_builder.openai_responses_url("https://api.openai.com/v1/responses")
```

Defaults: `https://api.openai.com/v1/chat/completions` + `https://api.openai.com/v1/responses`. Exported as `DEFAULT_OPENAI_CHAT_URL` / `DEFAULT_OPENAI_RESPONSES_URL` from `motosan_ai::providers::openai`.

### MiniMax

- Routed via Anthropic-compatible `/anthropic/v1/messages`
- Default model: `MiniMax-M2.7` (`MiniMax-M2.7-highspeed` also supported)
- Default base URL: `https://api.minimax.io/anthropic`
- CN base URL override: `.minimax_base_url("https://api.minimaxi.com/anthropic")`
- Uses Anthropic serialization (`tool_use` / `tool_result` content blocks)
- Declares text-only capabilities (`ProviderCapabilities::text_only()`)

### Ollama

```rust
// OpenAI-compatible mode (default)
let client = Client::builder()
    .provider(Provider::Ollama)
    .api_key("ollama")
    .build()?;

// Native mode with extra options
let client = Client::builder()
    .provider(Provider::Ollama)
    .api_key("ollama")
    .ollama_native(true)
    .ollama_base_url("http://localhost:11434")
    .ollama_keep_alive("5m")
    .ollama_num_ctx(4096)
    .build()?;
```

### CLI Backends

Rust has three opt-in CLI backends (`ClaudeCodeProvider`, `CodexCliProvider`, `GeminiCliProvider`) that shell out to local binaries instead of calling HTTP APIs. Python v0.9.2+ has `ClaudeCodeClient`, `CodexCliClient`, and `GeminiCliClient` with Rust-compatible flag coverage. All CLI paths return the same `ChatResponse` / `StreamEvent` types as HTTP providers.
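
As a rough illustration of what shelling out involves, the sketch below runs a child process, writes the prompt to its stdin, and parses newline-delimited JSON from its stdout. It is not the SDK's implementation: `run_ndjson_cli` is a hypothetical helper, and the demo substitutes a tiny Python child for a real CLI so that it runs anywhere. The exact output shapes of the real `claude`, `codex`, and `gemini` binaries are not modeled here.

```python
import json
import subprocess
import sys


def run_ndjson_cli(argv, prompt, timeout=60.0):
    """Run argv, send prompt on stdin, and return the parsed NDJSON events."""
    proc = subprocess.run(
        argv,
        input=prompt,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    events = []
    for line in proc.stdout.splitlines():
        line = line.strip()
        if line:  # skip blank lines between events
            events.append(json.loads(line))
    return events


# Stand-in child process: echoes the prompt back as two NDJSON events.
child = (
    "import json, sys\n"
    "prompt = sys.stdin.read()\n"
    "print(json.dumps({'type': 'text', 'content': prompt.upper()}))\n"
    "print(json.dumps({'type': 'result', 'done': True}))\n"
)
events = run_ndjson_cli([sys.executable, "-c", child], "hi")
print(events)
```

This version waits for the child to exit before parsing, which corresponds to a blocking call; a streaming variant would read stdout line by line instead.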

**Rust unified `Client::builder()` dispatch** (v0.11.0+):

```rust
use motosan_ai::claude_code::{EffortLevel, PermissionMode};
use motosan_ai::codex_cli::SandboxMode;
use motosan_ai::gemini_cli::ApprovalMode;
use motosan_ai::{
    Client, ClaudeCodeProvider, CodexCliProvider, GeminiCliProvider, Message, Provider,
};

// Claude Code via Client::builder — full config via pre-built ClaudeCodeProvider
let client = Client::builder()
    .provider(Provider::ClaudeCode)
    .claude_code(
        ClaudeCodeProvider::new()
            .bare(true)                            // --bare (daemon-safe; skip hooks/plugins/auto-memory)
            .model("sonnet")                       // --model
            .system_prompt("Be terse.")            // --system-prompt (full replacement)
            .permission_mode(PermissionMode::Plan) // --permission-mode plan
            .effort(EffortLevel::Low)              // --effort low
            .fallback_model("opus")                // --fallback-model
            .add_dir("/tmp/workspace")             // --add-dir (repeatable)
            .allow_tool("Edit")                    // --allowed-tools (variadic)
            .disallow_tool("WebFetch")             // --disallowed-tools (variadic)
            .mcp_config("./mcp.json")              // --mcp-config (variadic)
            .strict_mcp_config(true)               // --strict-mcp-config
            .settings("./settings.json")           // --settings
            .setting_source("user")                // --setting-sources user,project
            .setting_source("project")
            .no_session_persistence(true)          // --no-session-persistence
            .max_budget_usd(2.5),                  // --max-budget-usd
    )
    .build()?; // no api_key needed
let response = client.chat(vec![Message::user("hi")]).await?;

// Codex CLI via Client::builder — full config via pre-built CodexCliProvider
let client = Client::builder()
    .provider(Provider::CodexCli)
    .codex_cli(
        CodexCliProvider::new()
            .model("gpt-5.1-codex")
            .sandbox(SandboxMode::WorkspaceWrite)
            .profile("work")
            .ephemeral(true)
            .add_dir("/tmp/output")
            .enable_feature("fast_mode")
            .config_override("model_reasoning_effort", "\"low\""),
    )
    .build()?;
let response = client.chat(vec![Message::user("hi")]).await?;

// Gemini CLI via Client::builder
let client = Client::builder()
    .provider(Provider::GeminiCli)
    .gemini_cli(
        GeminiCliProvider::new()
            .model("gemini-2.5-pro")
            .approval_mode(ApprovalMode::Yolo)
            .sandbox(true),
    )
    .build()?;
let response = client.chat(vec![Message::user("hi")]).await?;
```

**Direct use** (still supported, bypasses `Client`):

```rust
use motosan_ai::{CodexCliProvider, codex_cli::SandboxMode};

let provider = CodexCliProvider::new().sandbox(SandboxMode::ReadOnly);
let response = provider.chat(request).await?;
let stream = provider.stream(request).await?;
```

**Python Claude Code direct use** (v0.9.0+):

```python
from motosan_ai import ChatRequest, ClaudeCodeClient, Message

client = (
    ClaudeCodeClient()
    .model("sonnet")
    .system_prompt("Be terse.")  # --system-prompt (full replacement)
    .permission_mode("plan")     # --permission-mode plan
    .effort("low")               # --effort low
    .allow_tool("Edit")          # --allowed-tools Edit (variadic)
    .max_budget_usd(2.5)         # --max-budget-usd
)
resp = await client.chat(ChatRequest(messages=[Message.user("hi")]))
async for event in client.stream(ChatRequest(messages=[Message.user("hi")])):
    if event.event_type == "usage":
        print(event.usage)
```

**Python Codex CLI direct use** (v0.9.1+):

```python
from motosan_ai import ChatRequest, CodexCliClient, Message, SandboxMode

client = (
    CodexCliClient()
    .sandbox(SandboxMode.workspace_write)         # --sandbox workspace-write
    .model("gpt-5.1-codex")                       # --model
    .profile("work")                              # --profile
    .config_override("approval_policy", "never")  # -c approval_policy=never
)
resp = await client.chat(ChatRequest(messages=[Message.user("hi")]))
```

**Python Gemini CLI direct use** (v0.9.2+):

```python
from motosan_ai import ApprovalMode, ChatRequest, GeminiCliClient, Message

client = (
    GeminiCliClient()
    .model("gemini-2.5-pro")
    .approval_mode(ApprovalMode.plan)  # --approval-mode plan
    .include_dir("/tmp/workspace")     # --include-directories
)
resp = await client.chat(ChatRequest(messages=[Message.user("hi")]))
```

Key notes:

- Runs `claude --print --output-format stream-json` / `codex exec --json --skip-git-repo-check -` / `gemini -p "" -o stream-json` with the
  prompt on stdin.
- Authentication: `claude` uses your existing login state; `codex` uses `CODEX_API_KEY` env var or `~/.codex/auth.json` (**not** `OPENAI_API_KEY`); `gemini` uses its own local auth (`gemini auth` once — personal Google account or API key).
- Blocking `chat().tool_calls` is always empty on all three. But `stream()` now surfaces CLI tool use as `ToolCallStart → ToolCallArgs → ToolCallEnd` events (Claude `tool_use`; Codex `command_execution` / `mcp_tool_call`; Gemini `tool_use`). Tool *results* run inside the CLI sandbox and are not surfaced.
- **Per-run CLI knobs (v0.20.0), shared by all three providers**:
  - `.cwd(dir)` runs the child with `Command::current_dir` (Codex uses `.cd()` → `--cd`).
  - `.env(k,v)` / `.envs(iter)` inject a per-run secret bundle into the child (redacted from `Debug`, never logged).
  - `.timeout(dur)` / `.no_timeout()` bound each invocation (default: Claude 300 s, Codex/Gemini 600 s) and the `stream()` per-line read-stall deadline (→ `Err(MotosanError::StreamReadTimeout)`).
  - `.resume(id)` continues a prior session (Codex `exec resume <id>`; Gemini/Claude `--resume`), and the provider-minted id is surfaced on `StreamEvent::session_id` / `ChatResponse::session_id`.
- Codex emits **complete** `agent_message` items (not token deltas); `stream()` yields one text event per finalized message. `chat()` treats the last `agent_message` as `content` and folds earlier ones (preamble / tool narration) into `thinking`.
- Gemini emits delta chunks (`{"type":"message","role":"assistant","content":"...","delta":true}`) followed by a terminal `{"type":"result","stats":{...}}` event; both paths share one parser. Usage maps from `stats.input_tokens` / `output_tokens` / `cached` (→ `cache_read_input_tokens`).
- Gemini has no `--system-prompt` flag — `GeminiCliProvider` merges the system text into the stdin payload as a blank-line-separated prefix.
- `SandboxMode` variants (Codex): `ReadOnly` / `WorkspaceWrite` / `DangerFullAccess`.
  Can coexist with `agent_mode(true)` (`--full-auto`).
- `ApprovalMode` variants (Gemini): `Default` / `AutoEdit` / `Yolo` / `Plan`. `yolo(true)` is shorthand for `--yolo` (same effect as `ApprovalMode::Yolo`).
- `PermissionMode` variants (Claude Code): `AcceptEdits` / `Auto` / `BypassPermissions` / `Default` / `DontAsk` / `Plan`. Distinct from `.agent_mode(true)`, which forwards `--dangerously-skip-permissions` and switches the blocking path to `--output-format json` so usage tokens can be parsed.
- `EffortLevel` variants (Claude Code): `Low` / `Medium` / `High` / `Max`. Maps to `--effort`.
- Claude Code isolation: `.bare(true)` forwards `--bare`, so the spawned `claude` skips hooks, plugins, auto-memory, keychain reads, and user/project settings discovery. Recommended for daemons / servers that must not inherit the operator's interactive state; leave `false` for workflows that should pick up `~/.claude/`. Emitted before `--dangerously-skip-permissions`.
- Claude Code system prompts: `.system_prompt(...)` forwards `--system-prompt` (full replacement); message-extracted system prompts flow through `--append-system-prompt`. Both can coexist — Claude applies append on top of replace.
- Claude Code stream usage: Python v0.9.0+ and Rust both emit a `StreamEvent` with `event_type == "usage"` before terminal `done` when Claude Code's NDJSON `result` event includes usage.
- Codex CLI stream usage: Python v0.9.1+ and Rust both map `turn.completed.usage.cached_input_tokens` to `Usage.cache_read_input_tokens` and emit usage before terminal `done`.
- Gemini CLI stream usage: Python v0.9.2+ and Rust both map `result.stats.cached` to `Usage.cache_read_input_tokens` and emit usage before terminal `done`. Gemini CLI has no trailing `-` stdin marker; system prompts are prepended to stdin with `\n\n`.
- Claude Code budget: `.max_budget_usd(amount)` drops non-finite and negative values at argv-build time so the CLI never receives an invalid number. Only meaningful under `--print`.
- Claude Code session flags (`--session-id` / `--resume` / `--continue` / `--fork-session` / `--no-session-persistence`) all live on the same `ClaudeCodeProvider` builder; blank string inputs are skipped so `.resume("")` is a no-op.
- Claude Code argv order is stable and locked by `common_args_full_loadout_order_is_stable`. Don't grep-reorder it.
- `config_override` values (Codex) are parsed as TOML by the CLI — string values need escaped quotes (`"\"low\""`).
- `codex features list` prints the valid feature names accepted by `.enable_feature()` / `.disable_feature()` — codex validates them against a strict allowlist.
- Only `codex exec` is supported, including `codex exec resume` through `.resume(id)` (v0.20.0). The `codex review` subcommand is out of scope.
- The v0.10.0 `ClaudeCodeClient` / `CodexCliClient` type aliases were removed in v0.11.0. Use `ClaudeCodeProvider` / `CodexCliProvider` / `GeminiCliProvider`.

---

## codex-oauth

Standalone crate for browser-based PKCE OAuth login against `auth.openai.com`. Returns an access token usable with the ChatGPT backend API.

- crates.io: https://crates.io/crates/codex-oauth
- Version: 0.1.0

### Install

```toml
codex-oauth = "0.1"
```

### API

```rust
// Login (opens browser, listens on localhost:1455, times out after 120s)
let token = codex_oauth::login().await?;

// Refresh
let token = codex_oauth::refresh(&token.refresh_token).await?;

// Expiry check
if token.is_expired() { /* refresh */ }

// Token fields
token.access_token  // Bearer token for chatgpt.com/backend-api
token.refresh_token // long-lived, use with refresh()
token.expires_in    // lifetime in seconds
token.issued_at     // Unix timestamp of issue time
```

`Token` implements `Serialize`/`Deserialize` for disk persistence.

---

## anthropic-oauth

Standalone crate for browser-based PKCE OAuth login against `claude.ai`. Returns an `sk-ant-oat01-*` access token usable with `motosan-ai`'s `AnthropicProvider` (which auto-detects the prefix and applies Claude Code identity headers).

- crates.io: https://crates.io/crates/anthropic-oauth
- Version: 0.1.0

### Install

```toml
anthropic-oauth = "0.1"
```

### API

```rust
// Login (opens browser, listens on 127.0.0.1:53692, times out after 120s).
// The redirect URI registered with Anthropic uses hostname "localhost"; the
// bind address is still 127.0.0.1 — the OAuthConfig handles this split.
let token = anthropic_oauth::login().await?;

// Refresh
let token = anthropic_oauth::refresh(&token.refresh_token).await?;

// Expiry check
if token.is_expired() { /* refresh */ }

// Token fields (same shape as codex-oauth)
token.access_token  // "sk-ant-oat01-..." — Bearer for Anthropic API
token.refresh_token // long-lived, use with refresh()
token.expires_in    // lifetime in seconds
token.issued_at     // Unix timestamp of issue time
```

`Token` implements `Serialize`/`Deserialize` for disk persistence.

**ToS disclosure**: this crate uses the OAuth `client_id` registered by Anthropic's Claude Code CLI. The resulting access token authenticates as a Claude Code CLI session. Anthropic has not published this `client_id` for third-party use; using it for purposes other than running `claude` CLI may be subject to change, rate limited, or violate Anthropic's terms of service. See the project README for the full disclosure.

---

## Release

Python, Rust, and TypeScript SDKs are versioned and released **independently**.

### Tag Convention

| SDK              | Tag format                | Example                   |
|------------------|---------------------------|---------------------------|
| Python           | `python-vX.Y.Z`           | `python-v0.4.2`           |
| Rust             | `rust-vX.Y.Z`             | `rust-v0.3.3`             |
| TypeScript       | `ts-vX.Y.Z`               | `ts-v0.10.0`              |
| motosan-ai-oauth | `motosan-ai-oauth-vX.Y.Z` | `motosan-ai-oauth-v0.2.0` |
| codex-oauth      | `codex-oauth-vX.Y.Z`      | `codex-oauth-v0.1.0`      |
| anthropic-oauth  | `anthropic-oauth-vX.Y.Z`  | `anthropic-oauth-v0.1.0`  |

### Release Steps (Python)

```bash
# 1. Bump version
#    sdks/python/pyproject.toml → version = "X.Y.Z"

# 2. Update CHANGELOG
#    sdks/python/CHANGELOG.md → ## [X.Y.Z] - YYYY-MM-DD

# 3. Update version references in:
#    - README.md (root) — Languages table, provider table
#    - AGENTS.md — "Current versions" section, Releasing table
#    - llms.txt — header version line, Install section
#    - skills/motosan-ai/SKILL.md — header version, Install section

# 4. Commit
git add sdks/python/pyproject.toml sdks/python/CHANGELOG.md README.md AGENTS.md llms.txt skills/motosan-ai/SKILL.md
git commit -m "chore: release python-vX.Y.Z"

# 5. Tag + push (triggers publish-python.yml → PyPI)
git tag -a python-vX.Y.Z -m "python-vX.Y.Z — summary"
git push origin main python-vX.Y.Z
```

### Release Steps (Rust)

```bash
# 1. Bump version
#    sdks/rust/Cargo.toml → version = "0.4.1"

# 2. Update CHANGELOG
#    sdks/rust/CHANGELOG.md → ## [0.4.1] - YYYY-MM-DD

# 3. Update version references in:
#    - README.md (root) — Languages table, provider table
#    - AGENTS.md — "Current versions" section, Releasing table
#    - llms.txt — header version line, Install section
#    - skills/motosan-ai/SKILL.md — header version, Install section

# 4. Commit
git add sdks/rust/Cargo.toml sdks/rust/CHANGELOG.md README.md AGENTS.md llms.txt skills/motosan-ai/SKILL.md
git commit -m "chore: release rust-v0.4.1"

# 5. Tag + push (triggers publish-rust.yml → crates.io)
git tag -a rust-v0.4.1 -m "rust-v0.4.1 — summary"
git push origin main rust-v0.4.1
```

### Release Steps (TypeScript)

```bash
# 1. Bump version
#    sdks/typescript/package.json → "version": "X.Y.Z"

# 2. Update CHANGELOG
#    sdks/typescript/CHANGELOG.md → ## [X.Y.Z] - YYYY-MM-DD

# 3. Update version references in:
#    - README.md (root) — Languages table
#    - AGENTS.md — Releasing paragraph
#    - llms.txt — Tag Convention + this section

# 4. Commit
git add sdks/typescript/package.json sdks/typescript/CHANGELOG.md README.md AGENTS.md llms.txt skills/motosan-ai/SKILL.md
git commit -m "chore: release ts-vX.Y.Z"

# 5. Tag + push (triggers publish-typescript.yml → npm)
git tag ts-vX.Y.Z
git push origin main ts-vX.Y.Z
```

### Release Steps (motosan-ai-oauth)

```bash
# 1. Bump version
#    sdks/rust/crates/motosan-ai-oauth/Cargo.toml → version = "0.2.0"

# 2. Update CHANGELOG
#    sdks/rust/crates/motosan-ai-oauth/CHANGELOG.md → ## [0.2.0] - YYYY-MM-DD

# 3. Commit
git add sdks/rust/crates/motosan-ai-oauth/Cargo.toml sdks/rust/crates/motosan-ai-oauth/CHANGELOG.md
git commit -m "chore: release motosan-ai-oauth-v0.2.0"

# 4. Tag + push (triggers publish-motosan-ai-oauth.yml → crates.io)
git tag -a motosan-ai-oauth-v0.2.0 -m "motosan-ai-oauth-v0.2.0 — summary"
git push origin main motosan-ai-oauth-v0.2.0
```

Publish `motosan-ai-oauth` before publishing wrapper crates (`codex-oauth`, `anthropic-oauth`) that depend on its new version.

### Release Steps (codex-oauth)

```bash
# 1. Bump version
#    sdks/rust/crates/codex-oauth/Cargo.toml → version = "0.1.1"

# 2. Update CHANGELOG
#    sdks/rust/crates/codex-oauth/CHANGELOG.md → ## [0.1.1] - YYYY-MM-DD

# 3. Commit
git add sdks/rust/crates/codex-oauth/Cargo.toml sdks/rust/crates/codex-oauth/CHANGELOG.md
git commit -m "chore: release codex-oauth-v0.1.1"

# 4. Tag + push (triggers publish-codex-oauth.yml → crates.io)
git tag -a codex-oauth-v0.1.1 -m "codex-oauth-v0.1.1 — summary"
git push origin main codex-oauth-v0.1.1
```

### Release Steps (anthropic-oauth)

```bash
# 1. Bump version
#    sdks/rust/crates/anthropic-oauth/Cargo.toml → version = "0.1.1"

# 2. Update CHANGELOG
#    sdks/rust/crates/anthropic-oauth/CHANGELOG.md → ## [0.1.1] - YYYY-MM-DD

# 3. Commit
git add sdks/rust/crates/anthropic-oauth/Cargo.toml sdks/rust/crates/anthropic-oauth/CHANGELOG.md
git commit -m "chore: release anthropic-oauth-v0.1.1"

# 4.
Tag + push (triggers publish-anthropic-oauth.yml → crates.io) git tag -a anthropic-oauth-v0.1.1 -m "anthropic-oauth-v0.1.1 — summary" git push origin main anthropic-oauth-v0.1.1 ``` ### CI Pipeline Tag push triggers GitHub Actions: - **publish-python.yml**: `uv build` → `pypa/gh-action-pypi-publish` (secret: `PYPI_API_TOKEN`) - **publish-rust.yml**: `cargo fmt --check` → `cargo clippy` → `cargo test --all-features` → `cargo publish` (secret: `CARGO_REGISTRY_TOKEN`) - **publish-typescript.yml**: `npm ci` → `npm run build` → `npm run test` → version-matches-tag guard → `npm publish --provenance --access public` (secret: `NPM_TOKEN`) - **publish-motosan-ai-oauth.yml**: `cargo fmt --check` → `cargo clippy` → `cargo test` → `cargo publish` (secret: `CARGO_REGISTRY_TOKEN`) - **publish-codex-oauth.yml**: `cargo fmt --check` → `cargo clippy` → `cargo test` → `cargo publish` (secret: `CARGO_REGISTRY_TOKEN`) - **publish-anthropic-oauth.yml**: `cargo fmt --check` → `cargo clippy` → `cargo test` → `cargo publish` (secret: `CARGO_REGISTRY_TOKEN`) All Rust workflows support `workflow_dispatch` for manual trigger. ### Pre-Push Validation (Local) ```bash ./scripts/pre-push-gate.sh # [1/4] Python unit tests # [2/4] Rust unit tests # [3/4] Python live tests (skipped if no ANTHROPIC_API_KEY) # [4/4] Rust live tests (skipped if no ANTHROPIC_API_KEY) ``` ### Emergency Manual Publish ```bash # Python cd sdks/python && uv build --out-dir dist && uv publish dist/* # Rust cd sdks/rust && cargo publish # TypeScript cd sdks/typescript && npm ci && npm run build && npm publish --access public ```
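
A manual publish skips whatever checks the tag-triggered workflows run, so it can be worth confirming by hand that the tag you intend to push agrees with the manifest version. The sketch below is one possible way to do that in Python. It is not part of the repository: `parse_release_tag` and `tag_matches_version` are hypothetical helper names, the prefixes are copied from the Tag Convention table, and it assumes plain `X.Y.Z` versions with no pre-release suffix.

```python
import re

# Tag prefixes copied from the Tag Convention table (assumed complete).
TAG_PREFIXES = (
    "python",
    "rust",
    "ts",
    "motosan-ai-oauth",
    "codex-oauth",
    "anthropic-oauth",
)

# <prefix>-vX.Y.Z, where X, Y, Z are plain integers (assumption: no pre-release suffix).
_TAG_RE = re.compile(r"^(?P<prefix>[a-z-]+)-v(?P<version>\d+\.\d+\.\d+)$")


def parse_release_tag(tag: str) -> tuple[str, str]:
    """Split a release tag into (prefix, version); raise ValueError if malformed."""
    match = _TAG_RE.match(tag)
    if match is None or match["prefix"] not in TAG_PREFIXES:
        raise ValueError(f"not a recognized release tag: {tag!r}")
    return match["prefix"], match["version"]


def tag_matches_version(tag: str, prefix: str, manifest_version: str) -> bool:
    """Return True when the tag names this package and the given manifest version."""
    return parse_release_tag(tag) == (prefix, manifest_version)
```

For example, `tag_matches_version("rust-v0.3.3", "rust", "0.3.3")` holds, while the same tag checked against a manifest version of `0.3.4` does not. The manifest version is passed in as a string so the helper stays independent of how `pyproject.toml`, `Cargo.toml`, or `package.json` is read.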