specification: API Commons Rate Limits
specificationVersion: '0.1'
schema: https://raw.githubusercontent.com/api-evangelist/interface-research/main/schema/api-commons.yml#/$defs/RateLimits
provider: Plandex
providerId: plandex
created: '2026-05-29'
modified: '2026-05-29'
tags:
  - AI Coding Agent
  - Developer Tools
  - CLI
  - LLM
  - Open Source
  - Rate Limiting
  - Quotas
description: >-
  Machine-readable rate-limit definitions for the Plandex server REST API and
  the Plandex Cloud commercial surface. The open-source Plandex server does not
  document fixed HTTP rate limits; operators control throttling through their
  own deployment configuration. Practical limits are imposed by the upstream
  model providers (OpenAI, Anthropic, OpenRouter, Google, etc.) used by the
  configured model pack. The historical Plandex Cloud trial tier capped plans
  and model responses per plan rather than per-request rates. The hosted
  service is winding down as of 2025-10-03.
status: historical
headers:
  limit: X-RateLimit-Limit
  remaining: X-RateLimit-Remaining
  reset: X-RateLimit-Reset
  retryAfter: Retry-After
  policy: RateLimit-Policy
responseCodes:
  throttled: 429
  quotaExceeded: 429
  serviceUnavailable: 503
limits:
  - tier: self-hosted
    name: Self-Hosted Default
    scope: operator-controlled
    metric: requests_per_second
    limit: -1
    timeFrame: second
    notes: >-
      The open-source Plandex server does not enforce a fixed request rate.
      Operators may put it behind any reverse proxy (nginx, Caddy, Cloud Run)
      to apply per-IP or per-token throttling. The effective ceiling is set by
      the upstream model provider's TPM/RPM rate limits on the configured API
      keys.
    applies:
      - Plandex Server REST API
  - tier: self-hosted
    name: Streaming Connections Per Plan
    scope: plan
    metric: concurrent_streams
    limit: 1
    timeFrame: second
    notes: >-
      Only one in-progress streaming run (tell/build/connect) is supported per
      plan branch at a time; subsequent calls reconnect to the active stream.
    applies:
      - Plandex Server REST API
  - tier: cloud-byo
    name: Plandex Cloud — BYO Trial Plans
    scope: account
    metric: plans
    limit: 10
    timeFrame: trial
    notes: Historical; Plandex Cloud no longer accepts new users.
    applies:
      - Plandex Cloud (BYO API Key Mode)
  - tier: cloud-byo
    name: Plandex Cloud — BYO Trial Responses Per Plan
    scope: plan
    metric: model_responses_per_plan
    limit: 20
    timeFrame: trial
    notes: Historical; Plandex Cloud no longer accepts new users.
    applies:
      - Plandex Cloud (BYO API Key Mode)
  - tier: cloud-integrated
    name: Plandex Cloud — Integrated Credit Balance
    scope: account
    metric: credits_usd_balance
    limit: -1
    timeFrame: continuous
    notes: >-
      Throughput is effectively gated by the user's Plandex credit balance and
      configured monthly budget rather than by per-request quotas. Historical;
      Plandex Cloud no longer accepts new users.
    applies:
      - Plandex Cloud (Integrated Models Mode)
upstreamProviders:
  - name: OpenRouter
    note: Default provider for self-hosted Plandex. OpenRouter applies its own per-key TPM/RPM limits.
  - name: OpenAI
    note: Direct OpenAI API limits apply when OPENAI_API_KEY is set.
  - name: Anthropic
    note: Direct Anthropic limits apply when ANTHROPIC_API_KEY (or a Claude Pro/Max subscription) is used.
  - name: Google Gemini / Vertex AI
    note: Google AI Studio / Vertex per-project quotas apply.
  - name: Azure OpenAI
    note: Azure per-deployment TPM/RPM quotas apply.
  - name: AWS Bedrock
    note: Bedrock per-region per-model quotas apply.
  - name: DeepSeek / Perplexity / Ollama / Custom Providers
    note: Each provider enforces its own limits.
policies:
  - name: Backoff Strategy
    description: >-
      The Plandex CLI implements client-side retries with exponential backoff
      for transient upstream-provider failures and honors Retry-After when
      surfaced by the upstream provider. OpenRouter is used as a failover route
      when other configured providers error.
  - name: Streaming Reconnect
    description: >-
      Clients reconnect to in-progress plan runs via
      PATCH /plans/{planId}/{branch}/connect. The server is designed to be
      resilient to brief client disconnects during long-running coding tasks.
  - name: Operator Throttling
    description: >-
      Self-hosting operators are responsible for any rate limiting they wish to
      enforce in front of the Plandex server (e.g., per-IP, per-token,
      per-org).
  - name: Concurrency Per Plan
    description: >-
      Only one streaming run per plan branch is active at any time; CLI flags
      like --bg run additional plans in the background under separate streams.
maintainers:
  - FN: Kin Lane
    email: kin@apievangelist.com