# llm-proxy

[English](./README.md) | [简体中文](./README.zh.md)

A local LLM proxy server: a single port serving both the admin UI and the AI API, with multi-protocol routing, protocol translation, streaming SSE conversion, token tracking, and protocol capture debugging.

## Features

- 🔀 **Multi-Protocol**: Anthropic, OpenAI, and OpenAI Responses on a single port
- 🔄 **Protocol Translation**: Bidirectional conversion across all three protocols (streaming + non-streaming)
- 📸 **External Vision**: Image-to-text fallback for non-multimodal models; auto-converts images via a configured vision model, with persistent LRU cache
- 🖥️ **macOS App**: Native menu bar app with built-in proxy; zero dependencies, drag & drop install
- 📊 **Admin UI**: Alpine.js SPA with dashboard, provider management, adapter config, vision settings, and capture debugger
- 🎯 **Virtual Adapters**: Custom endpoints with model remapping (`/{adapter-name}/v1/...`)
- 📡 **SSE Streaming**: 4 bidirectional stream converters with per-line timestamps
- 🔍 **Protocol Capture**: Ring buffer recording raw request/response pairs with side-by-side diff
- 🔥 **Hot Reload**: Atomic config swap without dropping in-flight requests
- 📈 **Token Tracking**: Per-provider token usage statistics

## Screenshots

- macOS menu bar: service control, adapter switching, language settings; adapter list and model mapping
- Admin dashboard: provider status, token usage, proxy key management
- Provider management: add/edit/delete AI providers, pull model lists, set input modalities
- Adapter configuration: virtual endpoints with model remapping

## Install

**macOS (recommended):** Download `LLMProxy.dmg` from [Releases](https://github.com/maplezzk/llm-proxy/releases), drag to `/Applications`.

If macOS blocks the app, run:

```bash
xattr -cr /Applications/LLMProxy.app
```

Then open again. Includes everything: CLI, proxy, and admin UI.

**macOS (Homebrew):**

```bash
brew tap maplezzk/tap && brew install --cask llm-proxy
```

**CLI only:**

```bash
npm install -g @maplezzk/llm-proxy
```

## Quick Start

```bash
# Start proxy
llm-proxy start

# Open admin UI → http://127.0.0.1:9000/admin/
```

On first launch, the config directory is created automatically. Open the admin UI to configure everything in your browser; no manual YAML editing is needed.

The admin UI supports:

- **Provider management**: Add/edit/delete AI providers, pull model lists from APIs, declare input modalities (text/image)
- **Adapter config**: Create virtual endpoints with model remapping and protocol adaptation
- **Vision settings**: Enable external vision (image-to-text) for non-multimodal models, view cache stats
- **Proxy key**: Set API authentication key
- **Live test**: Send test requests directly to verify configuration
- **Protocol capture**: Real-time request/response inspection

## Configuration

`~/.llm-proxy/config.yaml`:

```yaml
log_level: debug              # debug | info | warn | error
port: 9000                    # Optional: default 9000
max_body_size: 10485760       # Optional: max request body in bytes (default 10MB)
proxy_key: sk-xxx             # Optional: if set, /v1/* requires auth

providers:
  - name: deepseek
    type: openai              # anthropic | openai | openai-responses
    api_key: ${DEEPSEEK_API_KEY}
    api_base: https://api.deepseek.com
    models:
      - id: deepseek-chat

  - name: anthropic
    type: anthropic
    api_key: ${ANTHROPIC_API_KEY}
    models:
      # Multimodal model: declare image input modality
      - id: claude-sonnet-4
        input: [text, image]
        thinking:
          budget_tokens: 10000
      # Non-multimodal model: images will be auto-converted via vision provider
      - id: deepseek-reasoner
        # input omitted → defaults to [text]; image requests trigger vision fallback
      # MiniMax adaptive thinking passthrough (non-standard thinking.type)
      - id: MiniMax-M2
        thinking:
          type: enabled

adapters:
  - name: my-tool
    type: anthropic
    models:
      - sourceModelId: claude-sonnet-4
        provider: anthropic
        targetModelId: claude-sonnet-4-20250514

# External Vision: image-to-text for non-multimodal models
vision:
  provider: anthropic         # Required: vision-capable provider
  model: claude-sonnet-4      # Required: multimodal model ID
  prompt: |                   # Optional: custom prompt (default shown below)
    请详细描述这张图片的内容,包括其中的文字、物体、场景、颜色等关键信息。
```

API keys use environment variable interpolation (`${VAR}`), so they are never stored in plain text.

## External Vision

When the routed model **does not** declare `input: image`, llm-proxy can automatically convert image content to text using a configured vision model:

1. Inbound request contains an image block (Anthropic `image`, OpenAI Chat `image_url`, or OpenAI Responses `input_image`)
2. llm-proxy extracts each image and calls the configured vision provider/model via the proxy itself (recursive routing)
3. The image is replaced with a `