aid: vespa-ai
name: Vespa
description: Vespa is an open-source AI search engine, big-data serving engine, and vector database originally developed
  inside Yahoo and spun out as Vespa.ai AS. Vespa combines vector search, text search (BM25), structured filtering, and
  machine-learned ranking, including native tensor inference, into a single distributed serving engine that scales to
  billions of documents with sub-100ms latency. Vespa Cloud is the fully managed commercial offering operated by the
  Vespa.ai team across AWS and GCP, with Startup, Basic, Commercial, and Enterprise plans plus a Self-Managed option for
  customers running the open-source engine on their own infrastructure. Vespa is used at Spotify, Perplexity,
  Yahoo, Farfetch, and Elicit for search, recommendation, personalization, and Retrieval-Augmented Generation (RAG).
type: Index
position: Provider
access: 3rd-Party
image: https://kinlane-productions.s3.amazonaws.com/apis-json/apis-json-logo.jpg
tags:
  - AI
  - Search
  - Vector Database
  - Big Data
  - Machine Learning
  - Semantic Search
  - Retrieval Augmented Generation
  - Open Source
  - Tensor
  - Recommendations
url: https://raw.githubusercontent.com/api-evangelist/vespa-ai/refs/heads/main/apis.yml
created: '2026-05-25'
modified: '2026-05-25'
specificationVersion: '0.19'
apis:
  - aid: vespa-ai:vespa-query-api
    name: Vespa Query API
    description: The Vespa Query (Search) API executes structured and vector queries against a Vespa application using
      YQL (Vespa Query Language). It supports text search with BM25, approximate-nearest-neighbor vector search over HNSW
      indexes, hybrid search, machine-learned ranking with multi-phase rank profiles, grouping/aggregation, pagination,
      result presentation, and tracing. Queries can be issued as GET requests with query-string parameters or POST
      requests with a JSON body for complex expressions.
    humanURL: https://docs.vespa.ai/en/query-api.html
    tags:
      - AI
      - Search
      - Query
      - YQL
      - Vector Search
      - Ranking
      - Hybrid Search
    properties:
      - url: openapi/vespa-query-api-openapi.yml
        type: OpenAPI
      - url: https://docs.vespa.ai/en/query-api.html
        type: Documentation
      - url: https://docs.vespa.ai/en/reference/api/query.html
        type: Documentation
      - url: https://docs.vespa.ai/en/getting-started.html
        type: GettingStarted
      - url: capabilities/vespa-query.yaml
        type: NaftikoCapability
  - aid: vespa-ai:vespa-document-api
    name: Vespa Document API
    description: The Vespa Document API (/document/v1) provides synchronous REST access to document operations against a
      Vespa content cluster. It supports Put, Get, Update (partial update with assign/add/remove operators), Remove, and
      Visit (streaming visit, copy, delete-where, update-where) over JSON or JSON Lines, with conditional writes,
      multi-tenant namespaces, field-set projection, time-window selection, and pagination via continuation tokens.
    humanURL: https://docs.vespa.ai/en/reference/document-v1-api-reference.html
    tags:
      - Documents
      - CRUD
      - Indexing
      - Data
      - Streaming
    properties:
      - url: openapi/vespa-document-api-openapi.yml
        type: OpenAPI
      - url: https://docs.vespa.ai/en/reference/document-v1-api-reference.html
        type: Documentation
      - url: https://docs.vespa.ai/en/writing/document-v1-api-guide.html
        type: Documentation
      - url: https://docs.vespa.ai/en/reads-and-writes.html
        type: Documentation
      - url: capabilities/vespa-documents.yaml
        type: NaftikoCapability
  - aid: vespa-ai:vespa-deploy-api
    name: Vespa Deploy API
    description: The Vespa Deploy API (/application/v2) manages application packages on a Vespa configuration server.
      It supports preparing, activating, and tearing down application packages, session-based deployments, schema
      validation, and zero-downtime updates of services, schemas, and rank profiles.
    humanURL: https://docs.vespa.ai/en/reference/deploy-rest-api-v2.html
    tags:
      - Deployment
      - Configuration
      - Application
      - DevOps
    properties:
      - url: https://docs.vespa.ai/en/reference/deploy-rest-api-v2.html
        type: Documentation
      - url: https://docs.vespa.ai/en/application-packages.html
        type: Documentation
  - aid: vespa-ai:vespa-tenant-api
    name: Vespa Tenant and Application API
    description: The Vespa Tenant API (/application/v2/tenant) manages tenants and applications hosted on a Vespa
      configuration server or Vespa Cloud control plane. It exposes operations for creating tenants, listing
      applications, and binding application sessions to a tenant.
    humanURL: https://docs.vespa.ai/en/reference/application-v2-tenant.html
    tags:
      - Tenants
      - Applications
      - Multi-Tenancy
      - Administration
    properties:
      - url: https://docs.vespa.ai/en/reference/application-v2-tenant.html
        type: Documentation
  - aid: vespa-ai:vespa-config-api
    name: Vespa Config API
    description: The Vespa Config API (/config/v2) lets services in a Vespa application retrieve their configuration
      from a Vespa configuration server using the config-server / config-proxy protocol. It is primarily used by Vespa
      services and tooling rather than end users, but is documented as a stable HTTP API.
    humanURL: https://docs.vespa.ai/en/reference/config-rest-api-v2.html
    tags:
      - Configuration
      - Internal
    properties:
      - url: https://docs.vespa.ai/en/reference/config-rest-api-v2.html
        type: Documentation
  - aid: vespa-ai:vespa-cluster-api
    name: Vespa Cluster Controller API
    description: The Vespa Cluster Controller API (/cluster/v2) exposes runtime state and management endpoints for a
      Vespa content cluster, including node state queries, maintenance-mode transitions, and storage cluster orchestration.
    humanURL: https://docs.vespa.ai/en/reference/cluster-v2.html
    tags:
      - Cluster
      - Operations
      - Content
      - State
    properties:
      - url: https://docs.vespa.ai/en/reference/cluster-v2.html
        type: Documentation
  - aid: vespa-ai:vespa-state-api
    name: Vespa State API
    description: The Vespa State API (/state/v1) exposes per-service health, version, and metrics endpoints for any
      Vespa node. Orchestration tooling, monitoring agents, and load balancers use these endpoints to check liveness,
      readiness, and runtime metrics.
    humanURL: https://docs.vespa.ai/en/reference/state-v1.html
    tags:
      - Health
      - Monitoring
      - Metrics
      - Observability
    properties:
      - url: https://docs.vespa.ai/en/reference/state-v1.html
        type: Documentation
      - url: capabilities/vespa-state.yaml
        type: NaftikoCapability
  - aid: vespa-ai:vespa-metrics-api
    name: Vespa Metrics API
    description: Vespa exposes a family of metrics endpoints (/metrics/v1, /metrics/v2, /prometheus/v1) that publish
      Vespa engine and application metrics in JSON or Prometheus exposition format for scraping by Prometheus,
      Grafana, or other observability stacks.
    humanURL: https://docs.vespa.ai/en/operations/metrics.html
    tags:
      - Metrics
      - Prometheus
      - Observability
      - Monitoring
    properties:
      - url: https://docs.vespa.ai/en/operations/metrics.html
        type: Documentation
      - url: https://docs.vespa.ai/en/reference/metrics-v1.html
        type: Documentation
      - url: https://docs.vespa.ai/en/reference/metrics-v2.html
        type: Documentation
      - url: https://docs.vespa.ai/en/reference/prometheus-v1.html
        type: Documentation
common:
  - type: Website
    url: https://vespa.ai
  - type: Documentation
    url: https://docs.vespa.ai/
  - type: GettingStarted
    url: https://docs.vespa.ai/en/getting-started.html
  - type: Tutorials
    url: https://docs.vespa.ai/en/learn/tutorials/
  - type: GitHubOrganization
    url: https://github.com/vespa-engine
  - type: GitHubRepository
    url: https://github.com/vespa-engine/vespa
  - type: License
    url: https://github.com/vespa-engine/vespa/blob/master/LICENSE
  - type: Blog
    url: https://blog.vespa.ai/
  - type: BlogRSS
    url: https://blog.vespa.ai/feed.xml
  - type: Pricing
    url: https://cloud.vespa.ai/pricing
  - type: Console
    url: https://console.vespa-cloud.com/
  - type: Slack
    url: https://slack.vespa.ai/
  - type: Support
    url: https://github.com/vespa-engine/vespa/issues
  - type: ChangeLog
    url: https://github.com/vespa-engine/vespa/releases
  - type: SDK
    name: Vespa CLI (Go)
    url: https://github.com/vespa-engine/vespa/tree/master/client/go
  - type: SDK
    name: pyvespa (Python)
    url: https://github.com/vespa-engine/pyvespa
  - type: SDK
    name: pyvespa Documentation
    url: https://vespa-engine.github.io/pyvespa/
  - type: SDK
    name: vespa-feed-client (Java)
    url: https://github.com/vespa-engine/vespa/tree/master/vespa-feed-client
  - type: SDK
    name: vespa-search (JavaScript)
    url: https://github.com/vespa-engine/vespa-search
  - type: SampleApps
    url: https://github.com/vespa-engine/sample-apps
  - type: PrometheusExporter
    url: https://github.com/vespa-engine/vespa_exporter
  - type: DockerImage
    url: https://github.com/vespa-engine/docker-image
  - type: GitHubAction
    url: https://github.com/vespa-engine/setup-vespa-cli-action
  - type: SpectralRules
    url: rules/vespa-ai-rules.yml
  - type: Vocabulary
    url: vocabulary/vespa-ai-vocabulary.yml
  - type: JSONLDContext
    url: json-ld/vespa-ai-context.jsonld
  - type: Plans
    url: plans/vespa-ai-plans-pricing.yml
  - type: RateLimits
    url: rate-limits/vespa-ai-rate-limits.yml
  - type: FinOps
    url: finops/vespa-ai-finops.yml
  - type: Features
    data:
      - Open-source under Apache 2.0
      - Vector search with HNSW indexes
      - BM25 text search and hybrid search
      - Native tensor and ML model inference at serving time
      - YQL (Vespa Query Language) for structured queries
      - Multi-phase ranking (match-phase, first-phase, second-phase, global-phase)
      - Document API with conditional writes, visits, and JSON Lines streaming
      - Multi-tenant namespaces and document groups
      - Real-time indexing with sub-100ms query latency
      - Distributed content clusters with automatic sharding and replication
      - Streaming search mode for personal/private corpora
      - Built-in machine learning inference (TensorFlow, ONNX, XGBoost, LightGBM)
      - Approximate nearest neighbor and exact nearest neighbor operators
      - Application packages with schemas, services.xml, and rank profiles
      - Container API for custom searchers, document processors, and handlers
      - Self-managed (Apache 2.0) or fully managed Vespa Cloud (AWS, GCP)
      - Vespa Cloud Startup plan from $0.05 / vCPU-hour, $0.005 / GiB-memory-hour
      - Vespa Cloud Commercial plan with 24/7 1-hour SLA support
      - Vespa Cloud Enterprise plan with $20k/month minimum and 15-minute SLA
      - Up to 50% volume discounts and 15% committed-spend discount
    sources:
      - https://cloud.vespa.ai/price-calculator.html
      - https://docs.vespa.ai/
    updated: '2026-05-25'
  - type: UseCases
    data:
      - name: Hybrid Search
        description: Combine BM25 text relevance with vector similarity and structured filters in a single query
          executed by Vespa's multi-phase ranking pipeline.
      - name: Retrieval Augmented Generation
        description: Serve grounded context to large language models by indexing documents, chunks, and embeddings in
          Vespa and retrieving them with hybrid search at sub-100ms latency.
      - name: Recommendation and Personalization
        description: Power recommendation systems with machine-learned ranking, real-time feature updates, and tensor
          inference over user and item embeddings.
      - name: Ad Targeting and Real-Time Bidding
        description: Match candidate ads against user context and serve ranked impressions within tight latency
          budgets using Vespa's distributed serving engine.
      - name: E-Commerce Search and Browse
        description: Combine faceted navigation, structured filters, text relevance, and learned ranking for
          large product catalogs with frequent updates.
      - name: Streaming Search for Personal Data
        description: Use streaming search mode to scan a single user's documents on demand instead of building an
          index, which suits mail, messaging, and document search where each user searches only a private corpus.
  - type: Integrations
    data:
      - name: AWS
      - name: Google Cloud
      - name: Prometheus
      - name: Grafana
      - name: TensorFlow
      - name: ONNX Runtime
      - name: XGBoost
      - name: LightGBM
      - name: Kubernetes
      - name: LangChain
      - name: LlamaIndex
      - name: Haystack
integrations:
  - name: AWS
  - name: Google Cloud
  - name: LangChain
  - name: LlamaIndex
  - name: Haystack
  - name: Prometheus
maintainers:
  - FN: Kin Lane
    email: kin@apievangelist.com