# 🤫 llm-whisperer — full reference

> New here? Start with the [project README](../README.md) for a friendly intro.
> This page is the detailed reference (architecture, every provider, caveats).

Whisper to the **free web chat UIs** of Qwen, ChatGPT, Claude, DeepSeek, GLM, Kimi, MiniMax, Grok, Pi and ERNIE — and talk to all of them through **one local HTTP API** (native or OpenAI-compatible, with streaming) — no paid API keys.

```
your app ──HTTP──▶ LLM-Whisperer ──▶ browser tabs ──▶ chat.qwen.ai / claude.ai / ...
```

## ⚠️ Terms of Service disclaimer

This tool automates web interfaces that are intended for human use. Most providers explicitly **prohibit automated or programmatic access** to their free web UI in their Terms of Service:

| Provider | ToS reference |
|---|---|
| OpenAI / ChatGPT | [Terms of Use](https://openai.com/policies/terms-of-use/) — prohibits scraping and automated access |
| Anthropic / Claude | [Usage Policy](https://www.anthropic.com/legal/aup) — prohibits automated web UI access |
| xAI / Grok | [Terms of Service](https://x.ai/legal/terms-of-service) — prohibits automated access |
| Alibaba / Qwen | Prohibits bots and automated access |
| DeepSeek | Prohibits automated access |
| Zhipu / GLM | Prohibits automated access |
| Moonshot / Kimi | Prohibits automated access |
| MiniMax | Prohibits automated access |
| Inflection / Pi | Prohibits automated access |
| Baidu / ERNIE | Prohibits automated access |

**Use this tool for personal experimentation, research, or local prototyping only — at your own risk.** For production use, pay for the official API. Publishing this package on npm does not imply endorsement of violating any provider's terms.

---

## How it works

- **One browser window** — all providers share a single Chromium instance; each gets a separate tab. Cookies are scoped to each site's domain, so sessions never mix.
- **Config-driven selectors** — every provider is the same generic driver; only the CSS selectors in `providers.yaml` differ. Fix a broken provider without touching code.
- **Persistent login** — log in once per service by hand; the session is saved and reused (including headless).
- **Conversation continuity** — each request continues the open tab's chat by default. Pass `newChat: true` to start fresh.
- **OpenAI-compatible** — a `/v1/chat/completions` endpoint with real SSE streaming, so OpenAI clients (Cursor, Open WebUI, Continue.dev, the `openai` SDK) work by just changing the base URL.
- **Model switching** — request `provider/model-name` to flip the model in the web UI before sending.
- **Real API keys too** — already paying for OpenAI, DeepSeek, etc.? Add an API-key provider and it calls the official OpenAI-compatible HTTP API instead of a browser — same endpoints, no scraping. See [API-key providers](#api-key-providers).
- **Optional API key** — set `WSPR_API_KEY` to gate the API for LAN exposure.
- **Stealth** — `puppeteer-extra-plugin-stealth` to reduce bot-detection.

## Available providers & models

| Provider key | Site | Model | Login | Status |
|---|---|---|---|---|
| `qwen` | chat.qwen.ai | Qwen3.7-Plus | yes | Verified ✓ |
| `pi` | pi.ai | Pi (Inflection) | **no** | Verified ✓ |
| `deepseek` | chat.deepseek.com | DeepSeek V3 / R1 | yes | Template |
| `chatgpt` | chatgpt.com | GPT-4o (free tier) | yes | Cloudflare + login |
| `claude` | claude.ai | Claude Sonnet (free tier) | yes | Template |
| `glm` | chat.z.ai | GLM-4 | yes | Template |
| `kimi` | kimi.com | Kimi K2 (Moonshot) | yes | Partial |
| `minimax` | agent.minimax.io | MiniMax-M3 | yes | Partial |
| `grok` | grok.com | Grok 3 (xAI) | yes | Partial |
| `ernie` | yiyan.baidu.com | ERNIE (Baidu) | yes | Partial |

> **Verified** = driven end-to-end. **Partial** = input/login confirmed, response
> selector needs a `HEADLESS=false` check. **Template** = best-effort selectors.
> `pi` needs no login — the quickest way to try the tool. See
> [providers.md](./providers.md) for details and how to fix selectors.

### API-key providers

If you have a real key, you can skip the browser entirely. These call an OpenAI-compatible HTTP API and ship in `providers.yaml`:

| Provider key | Endpoint | Default model | Key env var |
|---|---|---|---|
| `openai` | api.openai.com | `gpt-4o-mini` | `OPENAI_API_KEY` |
| `gemini` | generativelanguage.googleapis.com | `gemini-2.5-flash` | `GEMINI_API_KEY` |
| `groq` | api.groq.com | `llama-3.3-70b-versatile` | `GROQ_API_KEY` |
| `openrouter` | openrouter.ai | `openai/gpt-oss-120b:free` | `OPENROUTER_API_KEY` |
| `cerebras` | api.cerebras.ai | `gpt-oss-120b` | `CEREBRAS_API_KEY` |
| `mistral` | api.mistral.ai | `mistral-small-latest` | `MISTRAL_API_KEY` |
| `cloudflare` | api.cloudflare.com | `@cf/meta/llama-3.3-70b-instruct-fp8-fast` | `CLOUDFLARE_API_TOKEN` + `CLOUDFLARE_ACCOUNT_ID` |
| `digitalocean` | inference.do-ai.run | `llama3.3-70b-instruct` | `DIGITALOCEAN_INFERENCE_KEY` |

> `gemini`, `groq`, `openrouter`, `cerebras`, `mistral`, and `cloudflare` all
> have **free tiers** (OpenRouter's `:free` models need no credits at all);
> `openai` and `digitalocean` are paid. Free quotas, models, and rate limits
> change often — check each provider's docs for current limits.
>
> **Where to get each key** (and the exact Cloudflare steps) is in
> [providers.md](./providers.md#api-key-providers). `cloudflare` also
> needs your account id in `CLOUDFLARE_ACCOUNT_ID` (it goes into the request URL).

Set the matching env var (e.g. in `.env`), then use the provider like any other — `{"model":"openai"}` or `{"model":"digitalocean/deepseek-r1-distill-llama-70b"}` to pick a model. Add any other OpenAI-compatible service (Groq, Together, …) by copying the `api:` block. Keys are read from the environment, never stored in the YAML. See [providers.md](./providers.md#api-key-providers).
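As a rough illustration of the model-string convention above, the Python sketch below builds a request body for an API-key provider and reports which environment variables from the table are still unset. The names `REQUIRED_ENV`, `missing_env_vars`, and `build_payload` are hypothetical helpers, not part of llm-whisperer; only the provider keys, the env var names, and the `provider/model-name` form come from this page.

```python
import os

# Env vars per API-key provider, copied from the table above.
REQUIRED_ENV = {
    "openai": ["OPENAI_API_KEY"],
    "gemini": ["GEMINI_API_KEY"],
    "groq": ["GROQ_API_KEY"],
    "openrouter": ["OPENROUTER_API_KEY"],
    "cerebras": ["CEREBRAS_API_KEY"],
    "mistral": ["MISTRAL_API_KEY"],
    "cloudflare": ["CLOUDFLARE_API_TOKEN", "CLOUDFLARE_ACCOUNT_ID"],
    "digitalocean": ["DIGITALOCEAN_INFERENCE_KEY"],
}


def missing_env_vars(provider, env=None):
    """Return the env vars for `provider` that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV[provider] if not env.get(name)]


def build_payload(provider, prompt, model=None):
    """Build a chat request body (hypothetical helper).

    With no `model`, the bare provider key selects the default model;
    otherwise the model is sent as `provider/model-name`.
    """
    model_field = provider if model is None else f"{provider}/{model}"
    return {
        "model": model_field,
        "messages": [{"role": "user", "content": prompt}],
    }
```

Calling `missing_env_vars("cloudflare")` before the first request is a cheap way to catch a forgotten `CLOUDFLARE_ACCOUNT_ID`, and the dictionary returned by `build_payload` can be serialized with `json.dumps` and posted to `/v1/chat/completions`.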
## Install

```bash
npm install -g llm-whisperer
npx playwright install chromium   # one-time browser download (~170 MB)
```

On Linux/WSL, browser providers also need Chromium system libraries. If launch fails with a missing shared library such as `libnspr4.so`, run:

```bash
sudo npx playwright install-deps chromium
```

Or run without installing:

```bash
npx llm-whisperer serve
```

## Quick start

`pi` needs no login, so you can try the tool immediately:

```bash
# 1. Start the API
wspr serve

# 2. Chat (native endpoint)
curl -s -X POST http://localhost:9777/chat \
  -H "Content-Type: application/json" \
  -d '{"provider":"pi","messages":[{"role":"user","content":"Hello!"}]}' \
  | jq .message.content
```

For login-gated providers like Qwen, run `wspr login qwen` first (with `serve` stopped), then use `"provider":"qwen"`.

### OpenAI-compatible (with streaming)

```bash
curl -N http://localhost:9777/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"pi","stream":true,"messages":[{"role":"user","content":"Count to 5"}]}'
```

Or with the `openai` SDK — set `base_url="http://localhost:9777/v1"`.

See **[quickstart.md](./quickstart.md)** for a full walkthrough.
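Where the `openai` SDK is not available, the streaming endpoint can also be read with the Python standard library alone. The sketch below assumes the server emits standard OpenAI-style SSE chunks (`data: {...}` lines carrying `choices[0].delta.content`, ended by `data: [DONE]`); that wire format is inferred from the "OpenAI-compatible" description rather than confirmed here, and `iter_deltas` and `stream_chat` are hypothetical helper names.

```python
import json
import urllib.request


def iter_deltas(lines):
    """Yield text fragments from an iterable of SSE lines (bytes or str)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream marker
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            yield text


def stream_chat(prompt, model="pi", base_url="http://localhost:9777/v1"):
    """POST a streaming chat request and yield the reply piece by piece."""
    body = json.dumps({
        "model": model,
        "stream": True,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # An HTTP response object iterates line by line as bytes.
        yield from iter_deltas(resp)
```

With `wspr serve` running, `for piece in stream_chat("Count to 5"): print(piece, end="", flush=True)` prints the reply as it arrives. Keeping the parsing in `iter_deltas` separate from the network call means it can be checked against canned lines without a live server.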
## Documentation

| Doc | Description |
|---|---|
| [quickstart.md](./quickstart.md) | Step-by-step first run |
| [api.md](./api.md) | HTTP API: `/chat`, OpenAI `/v1/chat/completions`, streaming, model selection, auth |
| [providers.md](./providers.md) | Provider status, login, selector & model-switching reference |
| [configuration.md](./configuration.md) | Env vars (`WSPR_API_KEY`, CDP mode, …), concurrency |
| [pnpm.md](../wiki/pnpm.md) | pnpm usage and publishing notes |

## Caveats

- UI updates break selectors — run with `HEADLESS=false` to debug, fix in `providers.yaml`
- Aggressive use triggers rate limits or Cloudflare challenges
- Sessions expire — run `wspr login <provider>` to refresh
- `wspr login` requires `wspr serve` to be stopped first (Chrome profile lock)
- This is for personal, low-volume use; respect each service's terms

## From source

```bash
git clone https://github.com/aananda-giri/llm-whisperer
cd llm-whisperer
pnpm install
pnpm exec playwright install chromium
pnpm run serve
```

## License

MIT — see [LICENSE](../LICENSE).