# 🐱 PrismCat [English](./README.md) | [įŽ€äŊ“中文](./README_CN.md) ![GitHub Release](https://img.shields.io/github/v/release/paopaoandlingyia/PrismCat) ![License](https://img.shields.io/github/license/paopaoandlingyia/PrismCat) ![Docker Image](https://img.shields.io/badge/image-ghcr.io%2Fpaopaoandlingyia%2Fprismcat-blue) > **You never know how much junk your SDK silently injects into your prompts — until you use PrismCat.** PrismCat is a **self-hosted, transparent proxy and debugging console for LLM APIs**. Change one line — your `base_url` — and instantly see every request and response between your app and OpenAI / Claude / Gemini / Ollama / any LLM API, including streaming (SSE). ![PrismCat Dashboard](assets/dashboard.png) --- ## ⚡ Get Started in 30 Seconds ### 1. Launch Grab the binary for your system from [Releases](https://github.com/paopaoandlingyia/PrismCat/releases). | Platform | How to Start | |----------|-------------| | **Windows** | Run `prismcat.exe` — it lives in your system tray | | **Linux / macOS** | Run `./prismcat` | | **Docker** | See [Docker Deployment](#-docker-deployment) | Open **`http://localhost:8080`** in your browser. ### 2. Add an Upstream In the Settings page, add an upstream. For example: | Name | Target | |------|--------| | `openai` | `https://api.openai.com` | PrismCat gives you a proxy address: **`http://openai.localhost:8080`** ### 3. Change One Line, Start Capturing ```python from openai import OpenAI client = OpenAI( base_url="http://openai.localhost:8080/v1", # ← change only this api_key="sk-..." ) # everything else stays exactly the same response = client.chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": "Hello!"}], ) ``` Go back to the dashboard. Your full request and response are already there. That's it. --- ## 🧩 How It Works PrismCat uses **subdomain routing** for truly transparent proxying. When you add an upstream named `openai`: ``` Your App PrismCat OpenAI │ │ │ │ openai.localhost:8080 │ api.openai.com │ │ ─────────────────────────>│ ────────────────────────────>│ │ │ logs request ✓ │ │<─────────────────────────│<────────────────────────────│ │ │ logs response ✓ │ ``` **Why subdomains?** Because they make the proxy truly transparent — your request paths (like `/v1/chat/completions`) stay exactly the same. No path rewriting, no SDK quirks. Any language, any SDK, any LLM — as long as it lets you set a `base_url`, it just works. You can even chain proxies (App → PrismCat → relay → OpenAI) with zero friction. > **💡 About `*.localhost`**: Modern browsers and most operating systems automatically resolve `*.localhost` to `127.0.0.1` — no hosts file editing required. If your environment doesn't support this, see [Path Routing Mode](#-fallback-path-routing-mode) or add a hosts entry manually. --- ## ✨ Key Features ### 📊 Full Traffic Observability - Complete request/response headers and bodies - **SSE streaming** captured in full — view raw chunks or the merged result - Auto-formatted JSON, smart Base64 folding (no more drowning in image data) with one-click image preview ![Image Preview](assets/image_preview.png) ### 🎮 One-Click Replay (Playground) See a failed request? Hit **Replay**, tweak the prompt or parameters right in your browser, and resend instantly. No need to re-run your Python/Node script. ### 🔐 Privacy & Security - **Fully local** — data stays in local SQLite + filesystem, no third-party servers - Automatic masking of sensitive headers (`Authorization`, `api-key`) ### đŸˇī¸ Log Tagging Add `X-PrismCat-Tag: my-tag` to any request header to categorize logs in the UI. Perfect for shared proxies with multiple users or projects. ### đŸ“Ļ Dead-Simple Deployment Single binary, zero dependencies. Windows system tray support. Native Docker image available. ### 🔄 Always-On, Always Reviewable PrismCat is designed to run as a **silent, 24/7 LLM black box**. You don't need to "remember to start capturing" when a bug happens — it's already recording. Automatic log retention cleanup and large-body offloading keep storage healthy over months of continuous operation. Perfect for monitoring autonomous Agents that you can't fully predict — just go back and review what they actually sent and received, days after the fact. --- ## đŸŽ¯ Who Needs PrismCat? | Your Problem | How PrismCat Helps | |-------------|-------------------| | "Why is my token usage so high? My prompt is short!" | See the hidden system prompts and few-shot examples your SDK/framework silently injects | | "Function Calling keeps returning broken JSON" | Capture the raw model output, tweak your prompt in the Playground, and retry instantly | | "Streaming output sometimes freezes or gets truncated" | Every SSE chunk is recorded — pinpoint whether the issue is the model, gateway, or client | | "I run local models with Ollama, want to inspect the traffic" | Add an upstream pointing to `http://localhost:11434` — it's a universal HTTP proxy | | "Multiple people share one API key — whose request failed?" | Use `X-PrismCat-Tag` to tag by user, find the culprit in seconds | | "My Agent went rogue and I have no idea what it did" | PrismCat silently logs every API call — review the full behavior chain anytime | --- ## 🤔 PrismCat vs. Alternatives | | PrismCat | mitmproxy | Langfuse / Helicone | |---|---------|-----------|---------------------| | Deployment | Single binary / Docker | Local install + certs | SaaS or complex self-host | | LLM-Optimized | ✅ JSON formatting, Base64 folding, SSE merge | ❌ Generic HTTP inspector | ✅ But geared toward production monitoring | | One-Click Replay | ✅ Built-in Playground | ❌ | Partial | | Integration | Change `base_url` | System-wide proxy / certs | Instrument SDK code | | Data Ownership | Fully local | Fully local | Third-party dependent | | Stream Playback | ✅ Raw + merged view | Poor UX | Partial | | Long-Term Running | ✅ Auto-cleanup, silent background | Ad-hoc debugging tool | ✅ But requires external infra | --- ## đŸŗ Docker Deployment ```yaml services: prismcat: image: ghcr.io/paopaoandlingyia/prismcat:latest container_name: prismcat ports: - "8080:8080" environment: # Hosts allowed to access the dashboard - PRISMCAT_UI_HOSTS=localhost,127.0.0.1 # Base domain for subdomain routing - PRISMCAT_PROXY_DOMAINS=localhost # Set a password for public-facing deployments - PRISMCAT_UI_PASSWORD=your_strong_password - PRISMCAT_RETENTION_DAYS=30 volumes: - ./data:/app/data restart: always ``` --- ## 🔀 Fallback: Path Routing Mode If your environment can't resolve `*.localhost` (some Windows network configurations, or inside certain containers), enable **path routing mode** in Settings to route by URL path instead of subdomain: ```python # Path routing mode — no subdomain resolution needed client = OpenAI( base_url="http://localhost:8080/_proxy/openai/v1", api_key="sk-..." ) ``` Enable via config or environment variable: ```yaml # config.yaml server: enable_path_routing: true path_routing_prefix: "/_proxy" ``` ```bash # or via environment variable PRISMCAT_ENABLE_PATH_ROUTING=true ``` > **Note**: Path routing adds a prefix to your request URL (e.g., `/_proxy/openai/...`), which may require extra care with how some SDKs construct paths. Subdomain mode doesn't have this caveat. --- ## 🌐 Production Deployment (Nginx + Wildcard Domain) For public-facing deployments, use a wildcard domain (e.g., `*.prismcat.example.com`) with Nginx: ```nginx server { listen 80; server_name prismcat.example.com *.prismcat.example.com; location / { proxy_pass http://127.0.0.1:8080; proxy_set_header Host $host; # Required: pass original Host for subdomain routing # Required for SSE / streaming proxy_http_version 1.1; proxy_set_header Connection ""; proxy_buffering off; client_max_body_size 50M; } } ``` Then add `prismcat.example.com` to PrismCat's `proxy_domains`. Your upstream `openai` will be accessible at `openai.prismcat.example.com`. --- ## âš™ī¸ Configuration Reference The config file lives at `data/config.yaml` and is created on first launch. Most settings can also be changed from the Settings page in the UI.
Full config example ```yaml server: port: 8080 ui_password: "" # Dashboard password proxy_domains: # Base domains for subdomain routing - localhost logging: max_request_body: 1048576 # Max request body to log (1MB) max_response_body: 10485760 # Max response body to log (10MB) sensitive_headers: # Headers to auto-mask - Authorization - api-key - x-api-key detach_body_over_bytes: 262144 # Store bodies > 256KB as separate files early_request_body_snapshot: true storage: retention_days: 30 # Log retention in days; 0 = keep forever upstreams: openai: target: "https://api.openai.com" timeout: 120 gemini: target: "https://generativelanguage.googleapis.com" timeout: 120 ```
--- ## 🧩 FAQ
Q: openai.localhost doesn't work? Most modern systems resolve `*.localhost` to `127.0.0.1` automatically. If yours doesn't: 1. Add `127.0.0.1 openai.localhost` to your hosts file 2. Or enable [Path Routing Mode](#-fallback-path-routing-mode) as a workaround 3. Or use your own wildcard domain (see [Nginx Deployment](#-production-deployment-nginx--wildcard-domain))
Q: Streaming feels "stuck"? If you're behind a reverse proxy (e.g., Nginx), make sure you have: - `proxy_buffering off;` - `proxy_http_version 1.1;` Nginx buffers entire responses by default, making streaming look like it's hanging.
Q: Which LLM services are supported? PrismCat is a generic HTTP proxy — it's not tied to any specific LLM provider. Any HTTP/HTTPS API works, including: - OpenAI / Azure OpenAI - Anthropic Claude - Google Gemini - Ollama / LM Studio (local models) - API relay services / aggregators
Q: Does it add latency? PrismCat uses asynchronous log writing. The proxy overhead is typically under 1ms. Logging never blocks request forwarding.
--- ## đŸ›Ąī¸ License [MIT License](LICENSE)