aid: parasail
url: https://raw.githubusercontent.com/api-evangelist/parasail-ai/refs/heads/main/apis.yml
name: Parasail
description: |
  Parasail is an AI Supercloud, a pay-per-token GPU inference platform aimed at AI
  startups and developers. Parasail orchestrates rented GPU capacity across 40+
  data centers in 15+ countries to serve open-weight LLMs, vision/multimodal models,
  embedding models, and TTS/STT models on a serverless, dedicated, or batch basis.
  The platform exposes OpenAI-compatible /v1 endpoints for chat completions,
  completions, embeddings, batch, and models, plus a control-plane /api/v1 for
  managing dedicated GPU deployments of any Hugging Face or custom model. Parasail
  serves 500B+ tokens per day and is positioned as up to 30x cheaper than legacy
  cloud providers, with no quotas, no rate-limit penalties, and no long-term
  contracts. The company was co-founded by Mike Henry (ex-Mythic) and Tim Harris
  (ex-Swift Navigation). It raised a $32M Series A in April 2026 (Touring Capital
  and Kindred Ventures), bringing total funding to $42M.
tags:
  - AI
  - Artificial Intelligence
  - GPU
  - Inference
  - Large Language Models
  - Open Source Models
  - Hugging Face
  - Batch
  - Embeddings
  - Tokenmaxxing
  - Supercloud
kind: contract
image: https://kinlane-images.s3.amazonaws.com/shared/apis-json/apis-json-logo.jpg
access: 3rd-Party
apis:
  - aid: parasail:parasail-inference-api
    name: Parasail Inference API
    tags:
      - AI
      - Artificial Intelligence
      - Inference
      - Chat
      - Embeddings
      - Models
    humanURL: https://docs.parasail.io/parasail-docs/
    baseURL: https://api.parasail.io/v1
    properties:
      - url: https://docs.parasail.io/parasail-docs/
        type: Documentation
      - url: openapi/parasail-inference-api-openapi.yml
        type: OpenAPI
      - url: json-schema/parasail-chat-completion-schema.json
        type: JSONSchema
      - url: json-ld/parasail-context.jsonld
        type: JSONLD
    description: |
      OpenAI-compatible real-time and streaming inference API exposing serverless
      access to popular open-weight LLMs, embedding models, and the model catalog.
      Endpoints: /v1/chat/completions, /v1/completions, /v1/embeddings, /v1/models.
      Uses bearer-token authentication and pay-per-token billing, and supports
      streaming, tool use, and structured outputs. Works with the OpenAI Python and
      TypeScript clients when the client base URL (base_url in Python, baseURL in
      TypeScript) is set to https://api.parasail.io/v1.
  - aid: parasail:parasail-batch-api
    name: Parasail Batch API
    tags:
      - AI
      - Artificial Intelligence
      - Batch
      - Files
    humanURL: https://docs.parasail.io/parasail-docs/
    baseURL: https://api.parasail.io/v1
    properties:
      - url: https://docs.parasail.io/parasail-docs/
        type: Documentation
      - url: openapi/parasail-batch-api-openapi.yml
        type: OpenAPI
      - url: json-schema/parasail-batch-schema.json
        type: JSONSchema
    description: |
      OpenAI-compatible Batch API for asynchronous inference workloads at 50% off
      serverless pricing (with an additional 30% off cached tokens). Supports
      /v1/chat/completions and /v1/embeddings in the OpenAI Batch file format
      (JSONL) with a 24-hour completion window. Includes a Files surface for
      uploading and downloading input/output/error JSONL files. Ideal for offline
      enrichment, dataset processing, and large-scale tokenmaxxing.
  - aid: parasail:parasail-dedicated-api
    name: Parasail Dedicated Deployments API
    tags:
      - AI
      - Artificial Intelligence
      - GPU
      - Deployments
      - Dedicated
    humanURL: https://docs.parasail.io/parasail-docs/
    baseURL: https://api.parasail.io/api/v1
    properties:
      - url: https://docs.parasail.io/parasail-docs/
        type: Documentation
      - url: openapi/parasail-dedicated-api-openapi.yml
        type: OpenAPI
      - url: json-schema/parasail-deployment-schema.json
        type: JSONSchema
    description: |
      Control-plane API for managing Parasail Dedicated and Dedicated Serverless
      deployments. Provision reserved GPU capacity (H100, A100, H200, etc.) running
      any Hugging Face or custom model, then list, retrieve, update, pause, resume,
      and delete deployments. Read-only API keys can list and retrieve deployments
      but cannot create, change, or delete them. Endpoint: /api/v1/dedicated/deployments.
common:
  - type: Portal
    url: https://parasail.io
  - type: Documentation
    url: https://docs.parasail.io/parasail-docs/
  - type: Signup
    url: https://www.saas.parasail.io/
  - type: Pricing
    url: https://www.saas.parasail.io/pricing
  - type: Blog
    url: https://parasail.io/blogs
  - type: AboutUs
    url: https://parasail.io/about-us
  - type: Careers
    url: https://job-boards.greenhouse.io/parasail
  - type: PrivacyPolicy
    url: https://parasail.io/legal/privacy-policy
  - type: TermsOfService
    url: https://parasail.io/legal/terms-of-service
  - type: GitHubOrganization
    url: https://github.com/parasail-ai
  - type: Forums
    url: https://discord.gg/parasail
  - type: LinkedIn
    url: https://www.linkedin.com/company/parasail-ai
  - type: X
    url: https://x.com/parasail_io
  - url: https://github.com/parasail-ai/openai-batch
    name: openai-batch
    type: SDKs
  - url: https://github.com/parasail-ai/cookbook
    name: Parasail Cookbook
    type: CodeExamples
  - url: https://github.com/parasail-ai/kvcached
    name: kvcached
    type: Tools
  - url: https://github.com/parasail-ai/vllm-public
    name: vllm-public
    type: Tools
  - url: https://github.com/parasail-ai/curator
    name: curator
    type: Tools
  - url: https://github.com/parasail-ai/simple-evals
    name: simple-evals
    type: Tools
  - url: https://github.com/parasail-ai/VLMEvalKit
    name: VLMEvalKit
    type: Tools
  - url: plans/parasail-plans-pricing.yml
    type: Plans
  - url: rate-limits/parasail-rate-limits.yml
    type: RateLimits
  - url: finops/parasail-finops.yml
    type: FinOps
  - type: Features
    data:
      - Pay-per-token serverless GPU inference with no quotas or contracts
      - OpenAI-compatible /v1/chat/completions, /v1/completions, /v1/embeddings, /v1/models
      - Batch API at 50% off serverless pricing (plus 30% off cached tokens) with a 24-hour completion window
      - Dedicated and Dedicated Serverless deployments for reserved GPU capacity
      - Bring-your-own model from Hugging Face or custom weights
      - Day-0 support for frontier open-weight LLMs (DeepSeek, Qwen, Llama, OLMo, Kimi)
      - Vision, multimodal, embeddings, and TTS (Resemble, Orpheus) model surfaces
      - Global GPU orchestration across 40+ data centers in 15+ countries
      - 500B+ tokens served per day
      - Sub-500ms latency suitable for voice agents
      - Up to 30x cheaper than legacy cloud providers
      - Speculative decoding (EAGLE) and KV-cache virtualization for performance
      - Free starter credits and automatic usage-tier advancement (5 / 500 / 1000 / 4000 requests per minute)
      - OpenAI Python and TypeScript SDK compatibility via base_url override
      - $42M total funding (April 2026 Series A), backed by Touring Capital, Kindred Ventures, and Samsung NEXT
    sources:
      - https://parasail.io/
      - https://docs.parasail.io/parasail-docs/
      - https://www.saas.parasail.io/pricing
      - https://parasail.io/blogs
      - https://github.com/parasail-ai
    updated: '2026-05-25'
maintainers:
  - FN: Kin Lane
    email: info@apievangelist.com
    X: apievangelist
    url: https://apievangelist.com
created: '2026-05-25T00:00:00.000Z'
modified: '2026-05-25'
position: Consuming
specificationVersion: '0.16'