# 🐱 PrismCat
[English](./README.md) | [简体中文](./README_CN.md)
  
> **You never know how much junk your SDK silently injects into your prompts until you use PrismCat.**

PrismCat is a **self-hosted, transparent proxy and debugging console for LLM APIs**.
Change one line (your `base_url`) and instantly see every request and response between your app and OpenAI / Claude / Gemini / Ollama / any LLM API, including streaming (SSE).

---
## ⚡ Get Started in 30 Seconds
### 1. Launch
Grab the binary for your system from [Releases](https://github.com/paopaoandlingyia/PrismCat/releases).
| Platform | How to Start |
|----------|-------------|
| **Windows** | Run `prismcat.exe`; it lives in your system tray |
| **Linux / macOS** | Run `./prismcat` |
| **Docker** | See [Docker Deployment](#-docker-deployment) |
Open **`http://localhost:8080`** in your browser.
### 2. Add an Upstream
In the Settings page, add an upstream. For example:
| Name | Target |
|------|--------|
| `openai` | `https://api.openai.com` |
PrismCat gives you a proxy address: **`http://openai.localhost:8080`**
### 3. Change One Line, Start Capturing
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://openai.localhost:8080/v1",  # ← change only this
    api_key="sk-...",
)

# everything else stays exactly the same
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
Go back to the dashboard. Your full request and response are already there. That's it.
---
## 🧩 How It Works
PrismCat uses **subdomain routing** for truly transparent proxying. When you add an upstream named `openai`:
```
 Your App                      PrismCat                        OpenAI
     │                            │                              │
     │   openai.localhost:8080    │        api.openai.com        │
     │───────────────────────────>│─────────────────────────────>│
     │                            │  logs request                │
     │<───────────────────────────│<─────────────────────────────│
     │                            │  logs response               │
```
**Why subdomains?** Because they make the proxy truly transparent: your request paths (like `/v1/chat/completions`) stay exactly the same. No path rewriting, no SDK quirks. Any language, any SDK, any LLM: as long as it lets you set a `base_url`, it just works. You can even chain proxies (App → PrismCat → relay → OpenAI) with zero friction.
> **💡 About `*.localhost`**: Modern browsers and most operating systems automatically resolve `*.localhost` to `127.0.0.1`, so no hosts-file editing is required. If your environment doesn't support this, see [Path Routing Mode](#-fallback-path-routing-mode) or add a hosts entry manually.
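The subdomain rule amounts to stripping the configured base domain from the request's `Host` header and treating the remainder as the upstream name. The sketch below is a hypothetical illustration of that lookup, not PrismCat's actual code; the function name `upstream_for` and the `proxy_domains` default are made up here:

```python
# Illustrative sketch of subdomain routing: "openai.localhost:8080" -> "openai".
# Not PrismCat's implementation; names are hypothetical.
def upstream_for(host: str, proxy_domains: tuple = ("localhost",)):
    hostname = host.split(":")[0]  # drop the port, if any
    for base in proxy_domains:
        suffix = "." + base
        if hostname.endswith(suffix):
            return hostname[: -len(suffix)]  # subdomain part = upstream name
    return None  # bare or unknown domain -> serve the dashboard instead

print(upstream_for("openai.localhost:8080"))  # openai
print(upstream_for("localhost:8080"))         # None (dashboard)
```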
---
## ✨ Key Features
### 🔍 Full Traffic Observability
- Complete request/response headers and bodies
- **SSE streaming** captured in full: view raw chunks or the merged result
- Auto-formatted JSON and smart Base64 folding (no more drowning in image data), with one-click image preview
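To illustrate what a "merged result" means for a chat-completions stream: the `content` fragments carried in each SSE chunk's `delta` are concatenated in order. This is a generic sketch of that reassembly, not PrismCat's source:

```python
import json

def merge_sse(lines):
    """Concatenate streamed delta fragments into the final message text."""
    parts = []
    for line in lines:
        payload = line.removeprefix("data: ")
        if payload == "[DONE]":  # OpenAI-style end-of-stream sentinel
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

chunks = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo!"}}]}',
    "data: [DONE]",
]
print(merge_sse(chunks))  # Hello!
```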

### 🎮 One-Click Replay (Playground)
See a failed request? Hit **Replay**, tweak the prompt or parameters right in your browser, and resend instantly. No need to re-run your Python/Node script.
### 🔒 Privacy & Security
- **Fully local**: data stays in local SQLite + the filesystem; no third-party servers
- Automatic masking of sensitive headers (`Authorization`, `api-key`)
### 🏷️ Log Tagging
Add `X-PrismCat-Tag: my-tag` to any request header to categorize logs in the UI. Perfect for shared proxies with multiple users or projects.
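With the stdlib `urllib`, for instance, the tag is just another request header (the tag value `team-alice` below is arbitrary); most SDKs expose an equivalent hook, such as the OpenAI Python client's `default_headers` parameter:

```python
import urllib.request

# Attach a PrismCat tag so this request can be filtered in the UI.
req = urllib.request.Request(
    "http://openai.localhost:8080/v1/chat/completions",
    headers={"X-PrismCat-Tag": "team-alice"},
)
# send with urllib.request.urlopen(req) as usual
```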
### 📦 Dead-Simple Deployment
Single binary, zero dependencies. Windows system tray support. Native Docker image available.
### 🌙 Always-On, Always Reviewable
PrismCat is designed to run as a **silent, 24/7 LLM black box**. You don't need to "remember to start capturing" when a bug happens: it's already recording. Automatic log retention cleanup and large-body offloading keep storage healthy over months of continuous operation. Perfect for monitoring autonomous Agents you can't fully predict: just go back and review what they actually sent and received, days after the fact.
---
## 🎯 Who Needs PrismCat?
| Your Problem | How PrismCat Helps |
|-------------|-------------------|
| "Why is my token usage so high? My prompt is short!" | See the hidden system prompts and few-shot examples your SDK/framework silently injects |
| "Function Calling keeps returning broken JSON" | Capture the raw model output, tweak your prompt in the Playground, and retry instantly |
| "Streaming output sometimes freezes or gets truncated" | Every SSE chunk is recorded, so you can pinpoint whether the issue is the model, the gateway, or the client |
| "I run local models with Ollama and want to inspect the traffic" | Add an upstream pointing to `http://localhost:11434`; it's a universal HTTP proxy |
| "Multiple people share one API key; whose request failed?" | Use `X-PrismCat-Tag` to tag by user and find the culprit in seconds |
| "My Agent went rogue and I have no idea what it did" | PrismCat silently logs every API call, so you can review the full behavior chain anytime |
---
## 🤔 PrismCat vs. Alternatives
| | PrismCat | mitmproxy | Langfuse / Helicone |
|---|---------|-----------|---------------------|
| Deployment | Single binary / Docker | Local install + certs | SaaS or complex self-host |
| LLM-Optimized | ✅ JSON formatting, Base64 folding, SSE merge | ❌ Generic HTTP inspector | ✅ But geared toward production monitoring |
| One-Click Replay | ✅ Built-in Playground | ❌ | Partial |
| Integration | Change `base_url` | System-wide proxy / certs | Instrument SDK code |
| Data Ownership | Fully local | Fully local | Third-party dependent |
| Stream Playback | ✅ Raw + merged view | Poor UX | Partial |
| Long-Term Running | ✅ Auto-cleanup, silent background | Ad-hoc debugging tool | ✅ But requires external infra |
---
## 🐳 Docker Deployment
```yaml
services:
  prismcat:
    image: ghcr.io/paopaoandlingyia/prismcat:latest
    container_name: prismcat
    ports:
      - "8080:8080"
    environment:
      # Hosts allowed to access the dashboard
      - PRISMCAT_UI_HOSTS=localhost,127.0.0.1
      # Base domain for subdomain routing
      - PRISMCAT_PROXY_DOMAINS=localhost
      # Set a password for public-facing deployments
      - PRISMCAT_UI_PASSWORD=your_strong_password
      - PRISMCAT_RETENTION_DAYS=30
    volumes:
      - ./data:/app/data
    restart: always
```
---
## 🔀 Fallback: Path Routing Mode
If your environment can't resolve `*.localhost` (some Windows network configurations, or inside certain containers), enable **path routing mode** in Settings to route by URL path instead of subdomain:
```python
# Path routing mode: no subdomain resolution needed
client = OpenAI(
    base_url="http://localhost:8080/_proxy/openai/v1",
    api_key="sk-...",
)
```
Enable via config or environment variable:
```yaml
# config.yaml
server:
  enable_path_routing: true
  path_routing_prefix: "/_proxy"
```
```bash
# or via environment variable
PRISMCAT_ENABLE_PATH_ROUTING=true
```
> **Note**: Path routing adds a prefix to your request URL (e.g., `/_proxy/openai/...`), which may require extra care with how some SDKs construct paths. Subdomain mode doesn't have this caveat.
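Concretely, SDKs append endpoint paths to whatever `base_url` you give them, so in path-routing mode the proxy prefix ends up inside every request URL (a plain string illustration):

```python
# In path-routing mode the proxy prefix becomes part of every request URL.
base_url = "http://localhost:8080/_proxy/openai/v1"
endpoint = "/chat/completions"

full_url = base_url.rstrip("/") + endpoint
print(full_url)  # http://localhost:8080/_proxy/openai/v1/chat/completions
```

An SDK that normalizes or rebuilds paths may mishandle this prefix, which is the caveat noted above; subdomain mode keeps paths untouched.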
---
## 🌐 Production Deployment (Nginx + Wildcard Domain)
For public-facing deployments, use a wildcard domain (e.g., `*.prismcat.example.com`) with Nginx:
```nginx
server {
    listen 80;
    server_name prismcat.example.com *.prismcat.example.com;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;  # Required: pass the original Host for subdomain routing

        # Required for SSE / streaming
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_buffering off;

        client_max_body_size 50M;
    }
}
```
Then add `prismcat.example.com` to PrismCat's `proxy_domains`. Your upstream `openai` will be accessible at `openai.prismcat.example.com`.
---
## ⚙️ Configuration Reference
The config file lives at `data/config.yaml` and is created on first launch. Most settings can also be changed from the Settings page in the UI.
**Full config example:**
```yaml
server:
  port: 8080
  ui_password: ""        # Dashboard password
  proxy_domains:         # Base domains for subdomain routing
    - localhost

logging:
  max_request_body: 1048576      # Max request body to log (1 MB)
  max_response_body: 10485760    # Max response body to log (10 MB)
  sensitive_headers:             # Headers to auto-mask
    - Authorization
    - api-key
    - x-api-key
  detach_body_over_bytes: 262144 # Store bodies > 256 KB as separate files
  early_request_body_snapshot: true

storage:
  retention_days: 30   # Log retention in days; 0 = keep forever

upstreams:
  openai:
    target: "https://api.openai.com"
    timeout: 120
  gemini:
    target: "https://generativelanguage.googleapis.com"
    timeout: 120
```
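The retention setting above (`retention_days: 30`, with `0` meaning keep forever) boils down to a check like the following; this is a hypothetical sketch of the semantics, not PrismCat's actual cleanup code:

```python
from datetime import datetime, timedelta, timezone

def should_purge(created_at, retention_days, now):
    """retention_days == 0 disables cleanup entirely ("keep forever")."""
    if retention_days == 0:
        return False
    return now - created_at > timedelta(days=retention_days)

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
old_entry = now - timedelta(days=45)
print(should_purge(old_entry, 30, now))  # True: older than 30 days
print(should_purge(old_entry, 0, now))   # False: 0 = keep forever
```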
---
## 🧩 FAQ
**Q: `openai.localhost` doesn't work?**
Most modern systems resolve `*.localhost` to `127.0.0.1` automatically. If yours doesn't:
1. Add `127.0.0.1 openai.localhost` to your hosts file
2. Or enable [Path Routing Mode](#-fallback-path-routing-mode) as a workaround
3. Or use your own wildcard domain (see [Nginx Deployment](#-production-deployment-nginx--wildcard-domain))
**Q: Streaming feels "stuck"?**
If you're behind a reverse proxy (e.g., Nginx), make sure you have:
- `proxy_buffering off;`
- `proxy_http_version 1.1;`
Nginx buffers entire responses by default, making streaming look like it's hanging.
**Q: Which LLM services are supported?**
PrismCat is a generic HTTP proxy; it's not tied to any specific LLM provider. Any HTTP/HTTPS API works, including:
- OpenAI / Azure OpenAI
- Anthropic Claude
- Google Gemini
- Ollama / LM Studio (local models)
- API relay services / aggregators
**Q: Does it add latency?**
PrismCat uses asynchronous log writing. The proxy overhead is typically under 1ms. Logging never blocks request forwarding.
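The non-blocking pattern described here can be sketched with a queue and a background writer thread; this is a generic illustration of the technique, not PrismCat's source:

```python
import queue
import threading

log_q = queue.Queue()
written = []  # stand-in for the SQLite log store

def writer():
    # Background thread: drains the queue and persists entries.
    while True:
        entry = log_q.get()
        if entry is None:  # shutdown sentinel
            break
        written.append(entry)

def record(entry):
    # Proxy hot path: enqueue and return immediately; never waits on disk I/O.
    log_q.put_nowait(entry)

t = threading.Thread(target=writer, daemon=True)
t.start()
record({"path": "/v1/chat/completions", "status": 200})
log_q.put(None)  # flush remaining entries and stop the writer
t.join()
print(written)
```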
---
## 🛡️ License
[MIT License](LICENSE)