# Runtime Policy Reference

Prodex reads `policy.toml` from the Prodex root, usually `~/.prodex/policy.toml` unless `PRODEX_HOME` is set.
Environment variables override policy values, and unset values fall back to built-in defaults.

Relative `runtime.log_dir` values are resolved under the Prodex root. `PRODEX_RUNTIME_LOG_DIR` is used as provided.
Use `prodex info` for effective tuning values and `prodex doctor --runtime --json` for the resolved runtime log directory, format, and current `log_path`.

```bash
prodex doctor --runtime --json
prodex doctor --runtime --json | jq -r '.log_path'
prodex doctor --runtime --json | jq -r '.runtime_logs.directory'
```

Defaults below are production defaults. Test builds use smaller timeouts and limits in several places.

## Runtime Keys

| Policy key | Environment override | Default | Meaning |
| --- | --- | --- | --- |
| `runtime.log_dir` | `PRODEX_RUNTIME_LOG_DIR` | OS temp directory, usually `/tmp` on Linux | Directory for `prodex-runtime-latest.path` and per-run `prodex-runtime-*.log` files. |
| `runtime.log_format` | `PRODEX_RUNTIME_LOG_FORMAT` | `text` | Runtime proxy log format. Valid values: `text`, `json`. |

## Gateway Keys

`prodex gateway` runs a standalone OpenAI-compatible HTTP gateway.
Native OpenAI-compatible upstreams are passed through for `/v1/responses`, `/v1/chat/completions`, `/v1/embeddings`, `/v1/images/*`, `/v1/audio/*`, `/v1/batches`, `/v1/rerank`, `/v1/a2a`, `/v1/messages`, and `/v1/models`.
Provider bridges translate `/v1/responses` where supported and pass native-compatible side endpoints through to the selected upstream.

| Policy key | Environment override | Default | Meaning |
| --- | --- | --- | --- |
| `gateway.listen_addr` | none | `127.0.0.1:4000` | Gateway bind address. Non-loopback binds require `--auth-token`, `PRODEX_GATEWAY_TOKEN`, or `gateway.virtual_keys`. |
| `gateway.provider` | CLI `--provider` | OpenAI-compatible upstream | Provider preset: `anthropic`, `copilot`, `deepseek`, or `gemini`. |
| `gateway.base_url` | CLI `--base-url`; `OPENAI_BASE_URL` for OpenAI-compatible mode | Provider default, or `https://api.openai.com/v1` | Upstream base URL. OpenAI-compatible mode appends `/v1` when the URL has no path. |
| `gateway.require_auth` | none | `false` | Require gateway bearer auth even on loopback. Token value comes from `--auth-token`, `PRODEX_GATEWAY_TOKEN`, or configured virtual key env vars. |
| `gateway.adaptive_routing.enabled` | none | `false` | Enable adaptive routing telemetry and shadow recommendations. Live route selection remains deterministic until an explicit non-shadow policy is implemented. |
| `gateway.adaptive_routing.shadow_mode` | none | `true` | Keep adaptive routing recommendations observational only. Continuation affinity and quota/safety constraints still win. |
| `gateway.adaptive_routing.window_size` | none | `128` | Bounded owner-attributed feedback window size used by adaptive quality scoring. |
| `gateway.adaptive_routing.min_samples` | none | `8` | Minimum samples before a model can be recommended by the adaptive shadow scorer. |
| `gateway.adaptive_routing.exploration_rate` | none | `0.0` | Reserved exploration rate in the range `0.0..=1.0`; currently parsed and validated but not applied to live routing. |
| `gateway.state.backend` | none | `file` | Gateway admin/usage state backend. Valid values: `file`, `sqlite`, `postgres`, `redis`. `postgres` stores admin-managed virtual keys, usage counters, and billing ledger rows in a shared Postgres database. `redis` stores the same data in Redis using locked JSON snapshots and a Redis list for the ledger. |
| `gateway.state.sqlite_path` | none | `gateway-state.sqlite` under the Prodex root when `backend=sqlite` | SQLite database path for admin-managed virtual keys, usage counters, and schema migrations. Relative paths are resolved under the Prodex root. |
| `gateway.state.postgres_url_env` | none | empty | Environment variable containing a Postgres connection URL. Required when `backend="postgres"`. |
| `gateway.state.redis_url_env` | none | empty | Environment variable containing a Redis connection URL. Required when `backend="redis"`. |
| `gateway.admin_tokens` | env vars named by `token_env` | empty | Additional admin-plane bearer tokens. They protect `/v1/prodex/gateway/*` only and do not authorize model traffic. |
| `gateway.admin_tokens[].role` | none | `admin` | Admin-plane role: `admin` can create/update/delete keys; `viewer` can read keys, usage, metrics, and OpenAPI only. |
| `gateway.admin_tokens[].allowed_key_prefixes` | none | empty | Optional virtual-key name prefixes this admin token can see and mutate. Empty means global access. |
| `gateway.admin_tokens[].tenant_id` | none | empty | Optional tenant boundary for this admin token. Tenant-scoped admins only see and mutate keys, SCIM users, usage, ledger rows, summaries, CSV exports, and metrics in that tenant. |
| `gateway.admin_tokens[].team_id` / `project_id` / `user_id` / `budget_id` | none | empty | Optional governance boundaries for this admin token. Scoped admins only see, create, mutate, and export virtual keys, usage, ledger rows, summaries, CSV exports, and metrics matching those dimensions. |
| `gateway.sso.proxy_token_env` | named env var | empty | Enable trusted reverse-proxy SSO for admin endpoints. The proxy must send this shared token in `gateway.sso.token_header`; Prodex then trusts the configured identity headers. |
| `gateway.sso.token_header` | none | `x-prodex-sso-token` | Header carrying the trusted proxy shared token. |
| `gateway.sso.user_header` | none | `x-prodex-sso-user` | Header carrying the authenticated user name/email from the upstream SSO proxy. |
| `gateway.sso.role_header` | none | `x-prodex-sso-role` | Optional header carrying `admin` or `viewer`; missing/invalid values fall back to `gateway.sso.default_role`. |
| `gateway.sso.key_prefixes_header` | none | `x-prodex-sso-key-prefixes` | Optional comma/semicolon/newline-separated virtual-key prefixes visible to the SSO principal. Empty means global access. |
| `gateway.sso.tenant_header` | none | `x-prodex-sso-tenant` | Optional tenant id header from a trusted SSO proxy. Missing values fall back to an active matching SCIM user's tenant. SCIM users can also carry `team_id`, `project_id`, `user_id`, and `budget_id`; SSO/OIDC admin requests inherit those dimensions from the matching active SCIM user. |
| `gateway.sso.oidc_issuer` | none | empty | Enable native OIDC/JWT admin auth for bearer tokens issued by this issuer. Requires `oidc_audience`; Prodex discovers JWKS from this issuer when `oidc_jwks_url` is omitted. |
| `gateway.sso.oidc_audience` | none | empty | Required audience for OIDC/JWT admin bearer tokens. |
| `gateway.sso.oidc_jwks_url` | none | issuer discovery | Optional JWKS URL used to verify OIDC/JWT admin bearer token signatures. |
| `gateway.sso.oidc_user_claim` | none | `email` | Claim used as the admin principal name before SCIM lookup. Runtime falls back to `email`, `preferred_username`, then `sub`. |
| `gateway.sso.oidc_role_claim` | none | `prodex_role` | Optional claim carrying `admin` or `viewer`; missing/invalid values fall back to SCIM user role or `gateway.sso.default_role`. |
| `gateway.sso.oidc_tenant_claim` | none | `prodex_tenant` | Optional claim carrying the admin tenant id; missing values fall back to an active matching SCIM user's tenant. |
| `gateway.sso.oidc_key_prefixes_claim` | none | `prodex_key_prefixes` | Optional string or string-array claim carrying visible virtual-key prefixes; missing values fall back to SCIM user prefixes. |
| `gateway.sso.default_role` | none | `admin` | Default role for SSO-authenticated admin requests when the role header is absent. |
| `gateway.route_aliases` | none | empty | Declarative model aliases. Matching request `model` values are rewritten according to each alias `strategy`. |
| `gateway.route_aliases[].strategy` | none | `fallback` | Routing strategy for the alias: `fallback` rewrites to `combo:...`, `round-robin` selects one target by request id, `least-busy` selects the target with the fewest in-flight gateway requests, `first` always picks the first target. |
| `gateway.route_aliases[].model_metrics` | none | catalog defaults where known | Optional per-model routing hints for metric strategies: cost, latency, RPM limit, and TPM limit. Policy values override the embedded provider/model catalog. |
| `gateway.virtual_keys` | env vars named by `token_env` | empty | Static virtual gateway keys. Each key can enforce model allowlists, persisted request/spend budgets, RPM, and TPM, and can carry governance dimensions for admin and FinOps reporting. |
| `gateway.virtual_keys[].token_env` | named env var | required per key | Environment variable containing the bearer token for this virtual key. Missing or empty env vars are configuration errors. |
| `gateway.virtual_keys[].tenant_id` | none | empty | Optional tenant id assigned to this policy-backed key for tenant-scoped admin visibility. |
| `gateway.virtual_keys[].team_id` / `project_id` / `user_id` / `budget_id` | none | empty | Optional governance dimensions returned by the admin API and SDK for team, project, user, and budget attribution. When multiple virtual keys share a non-empty `budget_id`, `request_budget` and `budget_usd` also act as shared caps for that budget bucket. |
| `gateway.virtual_keys[].allowed_models` | none | empty | Optional model allowlist checked against the request `model` before route alias rewrite. |
| `gateway.virtual_keys[].budget_usd` | none | empty | Optional persisted spend cap for estimated request cost when catalog or policy cost is available. |
| `gateway.virtual_keys[].request_budget` | none | empty | Optional persisted total request cap for the virtual key name. |
| `gateway.virtual_keys[].rpm_limit` / `gateway.virtual_keys[].tpm_limit` | none | empty | Optional per-minute request/token caps. TPM uses Prodex's semantic request-token estimator. |
| `gateway.observability.sinks` | none | `runtime-log` | Enabled gateway observability sinks. `runtime-log` is always enabled; `jsonl` and `http` are enabled automatically when their target fields are set. |
| `gateway.observability.call_id_header` | none | `x-prodex-call-id` | Response header containing a stable per-request call id such as `prodex-42`. |
| `gateway.observability.jsonl_path` | none | empty | Optional JSONL export path for structured `gateway_spend` events. Relative paths are resolved under the Prodex root. |
| `gateway.observability.http_endpoint` | none | empty | Optional HTTP JSON export endpoint for structured `gateway_spend` events. |
| `gateway.observability.http_schema` | none | `generic` | HTTP export payload schema: `generic`, `otel`, `datadog`, or `langfuse`. |
| `gateway.observability.http_bearer_token_env` | none | empty | Environment variable name containing a bearer token for `gateway.observability.http_endpoint`. |
| `gateway.guardrails.blocked_keywords` | none | empty | Case-insensitive pre-call keyword blocks applied before upstream send. |
| `gateway.guardrails.blocked_output_keywords` | none | empty | Case-insensitive output keyword blocks. Buffered responses are replaced with `403 policy_violation`; streaming responses are stopped and logged when a keyword is observed. |
| `gateway.guardrails.allowed_models` | none | empty | Optional pre-call allowlist for request `model` values, checked before route alias rewrite. |
| `gateway.guardrails.presidio_redaction` | CLI `--presidio` / `--no-presidio` | `false` | Enable Presidio request-body redaction for gateway traffic. |
| `gateway.guardrails.prompt_injection_detection` | none | `false` | Enable built-in prompt-injection heuristic checks before upstream send. |
| `gateway.guardrails.pii_redaction` | none | `false` | Enable local best-effort request-body redaction for emails, secret-like bearer/API-key values, and long digit groups before upstream send. |
| `gateway.guardrails.webhook_url` | none | empty | Optional external guardrail HTTP endpoint. Prodex sends base64 request/response bodies and expects JSON such as `{"allow": false, "reason": "...", "message": "..."}` to block. |
| `gateway.guardrails.webhook_phases` | none | both phases | External guardrail phases: `pre` for requests before upstream send, `post` for buffered responses before returning to caller. |
| `gateway.guardrails.webhook_bearer_token_env` | none | empty | Environment variable name containing a bearer token for `gateway.guardrails.webhook_url`. |
| `gateway.guardrails.webhook_fail_closed` | none | `false` | Block when the external guardrail endpoint fails or returns non-2xx. |

Admin-managed virtual keys and SCIM users default to file state under the Prodex root as `gateway-virtual-keys.json`; request/spend usage defaults to `gateway-virtual-key-usage.json`, and response-reconciled billing ledger records default to `gateway-billing-ledger.jsonl`. Set `[gateway.state] backend = "sqlite"` to store admin-managed keys, SCIM users, usage counters, billing ledger records, and schema migrations in one SQLite database, `backend = "postgres"` with `postgres_url_env` to store the same admin/usage/ledger/SCIM state in a shared Postgres database, or `backend = "redis"` with `redis_url_env` to store gateway state in Redis. The configured gateway admin token from `--auth-token` or `PRODEX_GATEWAY_TOKEN` has admin role and can `GET`/`POST` `/v1/prodex/gateway/keys`, `GET`/`PATCH`/`DELETE` `/v1/prodex/gateway/keys/{name}`, `GET`/`POST` `/v1/prodex/gateway/scim/v2/Users`, `GET`/`PATCH`/`PUT`/`DELETE` `/v1/prodex/gateway/scim/v2/Users/{id}`, read `/v1/prodex/gateway/usage`, read `/v1/prodex/gateway/ledger`, read aggregated billing totals from `/v1/prodex/gateway/ledger/summary`, export billing CSV from `/v1/prodex/gateway/ledger.csv` and `/v1/prodex/gateway/ledger/summary.csv`, scrape Prometheus text metrics from `/v1/prodex/gateway/metrics`, inspect provider adapter contracts through `/v1/prodex/gateway/providers`, inspect active observability and guardrail configuration through `/v1/prodex/gateway/observability` and `/v1/prodex/gateway/guardrails`, fetch `/v1/prodex/gateway/openapi.json`, and use the built-in gateway admin dashboard at `/v1/prodex/gateway/admin`. Prometheus virtual-key metrics include `tenant_id`, `team_id`, `project_id`, `user_id`, and `budget_id` labels when the key has those governance dimensions. Additional `[[gateway.admin_tokens]]` entries can be `admin` or read-only `viewer`, and can set `allowed_key_prefixes`, `tenant_id`, `team_id`, `project_id`, `user_id`, and/or `budget_id` to restrict key list/read/mutation, SCIM user management, usage, ledger, summary, CSV, and metrics visibility. `[gateway.sso]` can trust an authenticated reverse proxy by requiring a shared proxy token header and mapping user, role, tenant, and key-prefix headers to the same admin RBAC model. It can also verify native OIDC/JWT bearer tokens against a configured issuer and audience, using either a configured JWKS URL or the issuer discovery document; role, tenant, key-prefix, and governance dimensions can come from an active matching SCIM user when token/header values are absent. An inactive matching SCIM user is rejected. Virtual-key bearer tokens cannot use these admin endpoints. `POST /keys` returns a generated bearer token once when `token` is omitted, while persisted state stores only its hash. Keys configured by `policy.toml` stay source `policy` and are read-only through the admin API. Admin-managed key create, update, rotate, delete, and SCIM user mutations are recorded as `gateway_admin` events in `prodex audit` without storing bearer token material. Gateway observability emits `gateway_spend` with `phase=request` after upstream response headers and `phase=response` after buffered response completion or streaming EOF/drop. Provider catalog edits should pass `npm run catalog:providers`.

Example:

```toml
[gateway]
listen_addr = "127.0.0.1:4000"
provider = "gemini"
require_auth = true

[gateway.state]
backend = "sqlite"
sqlite_path = "gateway-state.sqlite"

[[gateway.admin_tokens]]
name = "ops"
token_env = "PRODEX_GATEWAY_OPS_TOKEN"
role = "admin"

[[gateway.admin_tokens]]
name = "auditor"
token_env = "PRODEX_GATEWAY_AUDITOR_TOKEN"
role = "viewer"
allowed_key_prefixes = ["team-a-"]
tenant_id = "tenant-a"
team_id = "platform"
project_id = "codex-gateway"
user_id = "alice@example.com"
budget_id = "budget-platform"

[gateway.sso]
proxy_token_env = "PRODEX_GATEWAY_SSO_PROXY_TOKEN"
user_header = "x-auth-request-email"
role_header = "x-prodex-role"
key_prefixes_header = "x-prodex-key-prefixes"
tenant_header = "x-prodex-tenant"
default_role = "viewer"

# Or verify native OIDC/JWT admin bearer tokens directly.
# oidc_issuer = "https://idp.example"
# oidc_audience = "prodex-gateway"
# oidc_jwks_url = "https://idp.example/.well-known/jwks.json" # optional
# oidc_user_claim = "email"
# oidc_role_claim = "prodex_role"
# oidc_tenant_claim = "prodex_tenant"
# oidc_key_prefixes_claim = "prodex_key_prefixes"

[[gateway.route_aliases]]
alias = "prodex-fast"
models = ["gemini-3-flash", "gemini-2.5-flash"]
strategy = "fallback"

[[gateway.route_aliases.model_metrics]]
model = "gemini-3-flash"
input_cost_per_million_microusd = 100
output_cost_per_million_microusd = 200
latency_ms = 300
rpm_limit = 60
tpm_limit = 100000

[[gateway.virtual_keys]]
name = "team-a"
token_env = "PRODEX_GATEWAY_TEAM_A_TOKEN"
tenant_id = "tenant-a"
team_id = "platform"
project_id = "codex-gateway"
user_id = "alice@example.com"
budget_id = "budget-platform"
allowed_models = ["prodex-fast"]
budget_usd = 10.0
request_budget = 1000
rpm_limit = 60
tpm_limit = 100000

[gateway.observability]
sinks = ["runtime-log", "jsonl", "http"]
call_id_header = "x-prodex-call-id"
jsonl_path = "gateway-spend.jsonl"
http_endpoint = "https://otel-collector.example/v1/events"
http_schema = "otel"
http_bearer_token_env = "PRODEX_GATEWAY_OBSERVABILITY_TOKEN"

[gateway.guardrails]
blocked_keywords = ["secret project"]
blocked_output_keywords = ["do not reveal"]
allowed_models = ["prodex-fast"]
presidio_redaction = true
prompt_injection_detection = true
pii_redaction = true
webhook_url = "https://guardrails.example/check"
webhook_phases = ["pre", "post"]
webhook_bearer_token_env = "PRODEX_GATEWAY_GUARDRAIL_TOKEN"
webhook_fail_closed = true
```

## Runtime Proxy Keys

`runtime_proxy.preset` selects a conservative preset before individual `runtime_proxy` keys are applied.
Valid values are `low`, `default`, `many-terminals`, and `aggressive`; `PRODEX_RUNTIME_PROXY_PRESET` selects the preset from the environment.
Specific environment overrides for individual keys still have highest priority.
Unknown policy preset values are rejected when `policy.toml` is parsed. Unknown environment preset values are ignored so the configured policy or built-in defaults still apply.
The preset changes only local concurrency and admission tuning; transport timeouts remain on their normal defaults unless configured directly.

## Runtime Proxy Contract

`runtime_proxy` tuning must preserve these invariants:

- Prodex stays a scoped Codex gateway, not a general-purpose LLM SDK.
- Profile selection must be visible through policy, `prodex info`, `prodex doctor`, and runtime logs.
- Pre-commit retry and fallback paths must stay bounded per request.
- Runtime hot paths must avoid broad disk reads, quota probes, or blocking state saves.
- Quota, budget, transport, and local pressure signals must stay classified separately.
- Selection, admission, affinity, backoff, and first-chunk events must be structured in runtime logs.
- Upstream HTTP/WebSocket connection reuse should be preserved where it does not change Codex semantics.
- Secrets remain profile-isolated, redacted in diagnostics, and covered by audit events for Prodex-owned mutations.

<!-- BEGIN GENERATED RUNTIME_PROXY_KEYS -->
| Policy key | Environment override | Default | Meaning |
| --- | --- | --- | --- |
| `runtime_proxy.worker_count` | `PRODEX_RUNTIME_PROXY_WORKER_COUNT` | CPU parallelism clamped to `4..12` | Short-lived proxy worker pool size. |
| `runtime_proxy.long_lived_worker_count` | `PRODEX_RUNTIME_PROXY_LONG_LIVED_WORKER_COUNT` | `parallelism * 2` clamped to `8..24` | Worker pool for long-lived streams and websocket work. |
| `runtime_proxy.probe_refresh_worker_count` | `PRODEX_RUNTIME_PROBE_REFRESH_WORKER_COUNT` | CPU parallelism clamped to `2..4` | Background profile probe refresh workers. |
| `runtime_proxy.async_worker_count` | `PRODEX_RUNTIME_PROXY_ASYNC_WORKER_COUNT` | CPU parallelism clamped to `2..4` | Async runtime worker count. |
| `runtime_proxy.long_lived_queue_capacity` | `PRODEX_RUNTIME_PROXY_LONG_LIVED_QUEUE_CAPACITY` | `long_lived_worker_count * 8` clamped to `128..1024` | Queue capacity for long-lived proxy work. |
| `runtime_proxy.active_request_limit` | `PRODEX_RUNTIME_PROXY_ACTIVE_REQUEST_LIMIT` | `worker_count + long_lived_worker_count * 3` clamped to `64..512` | Global local admission cap for fresh runtime proxy requests. |
| `runtime_proxy.responses_active_limit` | `PRODEX_RUNTIME_PROXY_RESPONSES_ACTIVE_LIMIT` | `75%` of global limit, clamped to `4..global` | Lane cap for main Responses traffic. |
| `runtime_proxy.compact_active_limit` | `PRODEX_RUNTIME_PROXY_COMPACT_ACTIVE_LIMIT` | `25%` of global limit, clamped to `2..6` | Lane cap for `/responses/compact`. |
| `runtime_proxy.websocket_active_limit` | `PRODEX_RUNTIME_PROXY_WEBSOCKET_ACTIVE_LIMIT` | `long_lived_worker_count` clamped to `2..global` | Lane cap for websocket transport. |
| `runtime_proxy.standard_active_limit` | `PRODEX_RUNTIME_PROXY_STANDARD_ACTIVE_LIMIT` | `worker_count / 2` clamped to `2..8` | Lane cap for other unary proxy traffic. |
| `runtime_proxy.profile_inflight_soft_limit` | `PRODEX_RUNTIME_PROXY_PROFILE_INFLIGHT_SOFT_LIMIT` | `4` | Fresh selection starts penalizing profiles above this in-flight count. |
| `runtime_proxy.profile_inflight_hard_limit` | `PRODEX_RUNTIME_PROXY_PROFILE_INFLIGHT_HARD_LIMIT` | `8` | Fresh selection avoids profiles above this in-flight count; hard affinity still wins. |
| `runtime_proxy.admission_wait_budget_ms` | `PRODEX_RUNTIME_PROXY_ADMISSION_WAIT_BUDGET_MS` | `750` | Normal wait budget for local admission pressure. |
| `runtime_proxy.pressure_admission_wait_budget_ms` | `PRODEX_RUNTIME_PROXY_PRESSURE_ADMISSION_WAIT_BUDGET_MS` | `200` | Shorter admission wait budget when proxy is already under pressure. |
| `runtime_proxy.long_lived_queue_wait_budget_ms` | `PRODEX_RUNTIME_PROXY_LONG_LIVED_QUEUE_WAIT_BUDGET_MS` | `750` | Normal wait budget for long-lived queue pressure. |
| `runtime_proxy.pressure_long_lived_queue_wait_budget_ms` | `PRODEX_RUNTIME_PROXY_PRESSURE_LONG_LIVED_QUEUE_WAIT_BUDGET_MS` | `200` | Shorter long-lived queue wait budget under pressure. |
| `runtime_proxy.http_connect_timeout_ms` | `PRODEX_RUNTIME_PROXY_HTTP_CONNECT_TIMEOUT_MS` | `5000` | Upstream HTTP connect timeout. |
| `runtime_proxy.stream_idle_timeout_ms` | `PRODEX_RUNTIME_PROXY_STREAM_IDLE_TIMEOUT_MS` | `300000` | Responses stream idle timeout, aligned with Codex behavior. |
| `runtime_proxy.compact_request_timeout_ms` | `PRODEX_RUNTIME_PROXY_COMPACT_REQUEST_TIMEOUT_MS` | `90000` | Total request timeout for unary remote compact calls before Codex can observe failure and retry. |
| `runtime_proxy.sse_lookahead_timeout_ms` | `PRODEX_RUNTIME_PROXY_SSE_LOOKAHEAD_TIMEOUT_MS` | `1000` | Pre-commit SSE lookahead timeout. |
| `runtime_proxy.prefetch_backpressure_retry_ms` | `PRODEX_RUNTIME_PROXY_PREFETCH_BACKPRESSURE_RETRY_MS` | `10` | Retry delay while stream prefetch is backpressured. |
| `runtime_proxy.prefetch_backpressure_timeout_ms` | `PRODEX_RUNTIME_PROXY_PREFETCH_BACKPRESSURE_TIMEOUT_MS` | `1000` | Max wait for stream prefetch backpressure to clear. |
| `runtime_proxy.prefetch_max_buffered_bytes` | `PRODEX_RUNTIME_PROXY_PREFETCH_MAX_BUFFERED_BYTES` | `786432` | Max buffered prefetch bytes before backpressure. |
| `runtime_proxy.websocket_connect_timeout_ms` | `PRODEX_RUNTIME_PROXY_WEBSOCKET_CONNECT_TIMEOUT_MS` | `15000` | Upstream websocket connect timeout. |
| `runtime_proxy.websocket_happy_eyeballs_delay_ms` | `PRODEX_RUNTIME_PROXY_WEBSOCKET_HAPPY_EYEBALLS_DELAY_MS` | `200` | Delay before alternate websocket TCP connect attempt. |
| `runtime_proxy.websocket_precommit_progress_timeout_ms` | `PRODEX_RUNTIME_PROXY_WEBSOCKET_PRECOMMIT_PROGRESS_TIMEOUT_MS` | `8000` | Websocket pre-commit progress timeout. |
| `runtime_proxy.websocket_connect_worker_count` | `PRODEX_RUNTIME_WEBSOCKET_CONNECT_WORKER_COUNT` | CPU parallelism clamped to `4..16` | Worker count for bounded websocket TCP connect executor. |
| `runtime_proxy.websocket_connect_queue_capacity` | `PRODEX_RUNTIME_WEBSOCKET_CONNECT_QUEUE_CAPACITY` | `websocket_connect_worker_count * 8` clamped to `32..128` | Bounded queue capacity for websocket TCP connect work; effective value is at least the worker count. |
| `runtime_proxy.websocket_connect_overflow_capacity` | `PRODEX_RUNTIME_WEBSOCKET_CONNECT_OVERFLOW_CAPACITY` | `websocket_connect_queue_capacity * 4` clamped to `32..512` | Overflow queue capacity for websocket TCP connect work after the bounded queue fills; `0` disables overflow buffering. |
| `runtime_proxy.websocket_dns_worker_count` | `PRODEX_RUNTIME_WEBSOCKET_DNS_WORKER_COUNT` | CPU parallelism clamped to `2..8` | Worker count for bounded websocket DNS resolution executor. |
| `runtime_proxy.websocket_dns_queue_capacity` | `PRODEX_RUNTIME_WEBSOCKET_DNS_QUEUE_CAPACITY` | `websocket_dns_worker_count * 4` clamped to `16..64` | Bounded queue capacity for websocket DNS resolution work; effective value is at least the worker count. |
| `runtime_proxy.websocket_dns_overflow_capacity` | `PRODEX_RUNTIME_WEBSOCKET_DNS_OVERFLOW_CAPACITY` | `websocket_dns_queue_capacity * 2` clamped to `16..128` | Overflow queue capacity for websocket DNS resolution work after the bounded queue fills; `0` disables overflow buffering. |
| `runtime_proxy.websocket_previous_response_reuse_stale_ms` | `PRODEX_RUNTIME_PROXY_WEBSOCKET_PREVIOUS_RESPONSE_REUSE_STALE_MS` | `60000` | Window for reusing a websocket previous-response binding before treating it as stale. |
| `runtime_proxy.broker_ready_timeout_ms` | `PRODEX_RUNTIME_BROKER_READY_TIMEOUT_MS` | `15000` | Startup wait for the runtime broker to become ready. |
| `runtime_proxy.broker_health_connect_timeout_ms` | `PRODEX_RUNTIME_BROKER_HEALTH_CONNECT_TIMEOUT_MS` | `750` | Broker health check connect timeout. |
| `runtime_proxy.broker_health_read_timeout_ms` | `PRODEX_RUNTIME_BROKER_HEALTH_READ_TIMEOUT_MS` | `1500` | Broker health check read timeout. |
| `runtime_proxy.sync_probe_pressure_pause_ms` | `PRODEX_RUNTIME_PROXY_SYNC_PROBE_PRESSURE_PAUSE_MS` | `5` | Pause before synchronous probe work when local pressure is detected. |
| `runtime_proxy.responses_critical_floor_percent` | `PRODEX_RUNTIME_PROXY_RESPONSES_CRITICAL_FLOOR_PERCENT` | `2` | Minimum remaining Responses quota percentage treated as critical; valid range `1..10`. |
| `runtime_proxy.startup_sync_probe_warm_limit` | `PRODEX_RUNTIME_STARTUP_SYNC_PROBE_WARM_LIMIT` | `1` | Startup synchronous quota probe warm-up limit, capped internally at `3`. |
<!-- END GENERATED RUNTIME_PROXY_KEYS -->

Positive integer values are required for numeric policy keys, except websocket overflow capacity keys, which may be `0`, and `responses_critical_floor_percent`, which must be between `1` and `10`.
Some effective values are clamped after env or policy resolution to protect runtime bounds.

## Example

```toml
version = 1

[runtime]
log_format = "json"
log_dir = "runtime-logs"

[runtime_proxy]
preset = "many-terminals"
worker_count = 16
active_request_limit = 128
responses_active_limit = 96
profile_inflight_soft_limit = 6
profile_inflight_hard_limit = 10
```