specification: API Commons Rate Limits
specificationVersion: '0.1'
schema: https://raw.githubusercontent.com/api-evangelist/interface-research/main/schema/api-commons.yml#/$defs/RateLimits
provider: Gravitee
providerId: gravitee
created: '2026-05-04'
modified: '2026-05-15'
reconciled: true
tags:
  - Rate Limiting
  - API Gateway
  - API Management
  - Event Streaming
description: Gravitee is a self-hosted (or Gravitee-hosted) API and Event Management platform; it
  does not impose per-call rate limits on customer traffic. Instead, it provides Rate Limit, Quota,
  and Spike Arrest policies that customers configure on their own gateways to throttle their consumers.
  Plan-level limits (gateways included, support tier) are commercial caps rather than runtime
  request caps, and Gravitee advertises "unlimited API calls and events" within the subscription.
sources:
  - https://www.gravitee.io/pricing
  - https://documentation.gravitee.io/apim/policies/rate-limit
  - https://documentation.gravitee.io/apim/policies/quota
responseCodes:
  throttled: 429
  quotaExceeded: 429
  serviceUnavailable: 503
limits:
  - name: Customer-facing API calls (subscription)
    scope: subscription
    metric: requests_per_month
    limit: -1
    timeFrame: month
    notes: Gravitee subscription plans cover unlimited API calls and events; no runtime cap applies
      to call volume, and the subscription fee is the only related cost constraint.
  - name: Production gateways (per plan)
    scope: subscription
    metric: gateway
    limit: 'see plan'
    notes: Planet=1, Galaxy=2, Universe=4+, Comet=1, Meteor=2, Asteroid=4. Beyond these, additional
      gateways require an upgrade or add-on.
  - name: Configurable Rate Limit policy
    scope: api/consumer
    metric: requests_per_second
    limit: 'operator-defined'
    notes: API publishers configure short-window limits (per second or per minute) in the Gravitee
      Rate Limit policy on each API.
  - name: Configurable Quota policy
    scope: api/consumer/subscription
    metric: requests_per_period
    limit: 'operator-defined'
    notes: API publishers set per-hour, per-day, per-week, or per-month quotas via the Quota policy.
  - name: AI Prompt Token Tracking (Agent Management)
    scope: api/consumer/model
    metric: llm_tokens_per_period
    limit: 'operator-defined'
    notes: The AI Prompt Token Tracking policy meters input / output tokens per consumer, per LLM
      route. Combined with the Quota or Rate Limit policy, it enables LLM-cost throttling.
policies:
  - name: Rate Limit policy
    description: Short-window throttling (e.g. requests per second) applied at the gateway, with
      configurable burst and refill semantics.
  - name: Quota policy
    description: Longer-window quotas (per day / week / month) per consumer or per subscription
      plan.
  - name: Spike Arrest
    description: Smooths traffic bursts by splitting the configured window into smaller time
      slices and rejecting requests that exceed the allowed peak within a slice.
  - name: Backoff strategy
    description: When a customer-configured policy returns 429, API consumers should honor the
      Retry-After header when it is present and retry with exponential backoff.
  - name: AI Prompt Token Tracking policy
    description: Meters LLM input / output token usage per consumer, per route, per model. Backed by
      Redis or Hazelcast distributed counters and feeds the Quota / Rate Limit policy for AI cost caps.
  - name: AI Prompt Guard-Rails policy
    description: Allows, denies, or classifies prompts before they reach an LLM upstream; rejects
      requests whose prompts match a deny rule with 4xx response codes.
maintainers:
  - FN: Kin Lane
    email: kin@apievangelist.com