# Fireworks AI GraphQL Schema

This directory contains a conceptual GraphQL schema for the Fireworks AI fast inference platform. The schema is derived from the public Fireworks AI REST API surface documented at https://docs.fireworks.ai/ and expresses the same capabilities as GraphQL types, queries, mutations, and subscriptions.

## Provider

- **Name:** Fireworks AI
- **Website:** https://fireworks.ai/
- **API Docs:** https://docs.fireworks.ai/
- **Base URL:** https://api.fireworks.ai/inference/v1

## Schema File

- `fireworks-ai-schema.graphql` — Full conceptual GraphQL schema (60+ named types)

## Coverage

The schema covers all major Fireworks AI API surface areas:

### Models
- `Model` — catalog entry with type, provider, capabilities, and context length
- `ModelDetails` — architecture, parameter count, quantization, license, HuggingFace ID
- `ModelType` — CHAT, COMPLETION, EMBEDDING, RERANK, IMAGE, AUDIO, REASONING, MULTIMODAL
- `ModelProvider` — Meta, Mistral, Qwen, DeepSeek, Stability AI, Black Forest Labs, Fireworks, Community
- `ModelCapability` — TEXT_GENERATION, FUNCTION_CALLING, STRUCTURED_OUTPUT, VISION, etc.
- `ModelConfig` — temperature, top-p, top-k, max tokens, stop sequences, seed

### Deployments
- `Deployment` — on-demand dedicated GPU deployment with status lifecycle
- `DeploymentDetails` — GPU type (H100/H200/B200/B300), region, replica counts, inference URL, cost-per-GPU-second
- `DeploymentStatus` — PENDING, INITIALIZING, RUNNING, SCALING, PAUSED, STOPPED, FAILED, DELETING
- `DeploymentConfig` — auto-scale, scale-to-zero, idle timeout, LoRA adapters
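
As a rough illustration of how the cost-per-GPU-second field on `DeploymentDetails` might be used, the sketch below estimates what a deployment costs over a time window. The function name, its parameters, and the rate in the example are hypothetical; none of them are taken from the schema or from Fireworks AI pricing.

```python
def estimate_deployment_cost(cost_per_gpu_second: float, gpus_per_replica: int,
                             replicas: int, seconds: float) -> float:
    """Cost in USD = rate x GPUs per replica x replicas x seconds.

    All inputs are illustrative; the schema only states that a
    cost-per-GPU-second value exists on DeploymentDetails.
    """
    if min(cost_per_gpu_second, gpus_per_replica, replicas, seconds) < 0:
        raise ValueError("inputs must be non-negative")
    return cost_per_gpu_second * gpus_per_replica * replicas * seconds


# Made-up rate of $0.002 per GPU-second, 2 replicas with 1 GPU each, one hour.
hourly = estimate_deployment_cost(0.002, gpus_per_replica=1, replicas=2, seconds=3600)
print(f"{hourly:.2f}")
```

With scale-to-zero enabled, `replicas` would be 0 while the deployment is idle, so the same formula yields no GPU cost for that period.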

### Chat Completions
- `ChatCompletion` — top-level response with choices, usage, and system fingerprint
- `ChatMessage` — role-tagged message with optional tool calls and refusal field
- `ChatRole` — SYSTEM, USER, ASSISTANT, TOOL, FUNCTION
- `ChatCompletionChoice` — individual generation with finish reason and logprobs

### Text Completions
- `CompletionRequest` — legacy /completions endpoint request
- `CompletionDetails` — all sampling parameters
- `CompletionChoice` — text output with logprobs
- `CompletionMessage` — message wrapper with tool call support

### Function Calling / Tools
- `Tool` — name, description, JSON parameters schema, strict mode
- `ToolFunction` — function definition for the tool
- `FunctionCall` — structured call emitted by the model (id, type, function details)
- `ToolChoice` — NONE, AUTO, REQUIRED enum
- `ToolResult` — tool output to feed back in the next turn
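
The types above describe a round trip: the model emits a `FunctionCall`, the client runs the matching function, and a `ToolResult` is sent back in the next turn. A minimal sketch of the client side follows. It assumes the dictionary keys follow the common OpenAI-compatible layout (`id`, `function.name`, `function.arguments` as a JSON string); these keys, the `get_weather` function, and the `REGISTRY` mapping are illustrative and not copied from the schema.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical local function the model is allowed to call."""
    return {"city": city, "forecast": "sunny"}


# Maps tool names to local callables (illustrative).
REGISTRY = {"get_weather": get_weather}


def run_function_call(call: dict) -> dict:
    """Execute one FunctionCall-shaped dict and return a ToolResult-shaped dict."""
    fn = call["function"]
    handler = REGISTRY.get(fn["name"])
    if handler is None:
        content = json.dumps({"error": f"unknown tool {fn['name']}"})
    else:
        args = json.loads(fn["arguments"])  # assumed to arrive as a JSON string
        content = json.dumps(handler(**args))
    return {"role": "tool", "tool_call_id": call["id"], "content": content}


call = {"id": "call_1", "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}}
result = run_function_call(call)
print(result)
```

Returning an error payload for an unknown tool, rather than raising, lets the model see the failure and recover in the following turn.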

### Embeddings
- `EmbeddingRequest` — model, input strings, encoding format, dimensions
- `EmbeddingResult` — list of embedding vectors with usage
- `EmbeddingVector` — index, object type, dense float32 vector
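
`EmbeddingRequest` carries an encoding format and `EmbeddingVector` holds a dense float32 vector. If a base64 encoding is requested, the client has to unpack the bytes itself. The sketch below assumes the payload is a base64 string of packed little-endian float32 values, as in OpenAI-compatible APIs; that layout is an assumption here and is not confirmed by the schema.

```python
import base64
import struct


def decode_embedding(b64: str) -> list[float]:
    """Decode a base64 string of packed little-endian float32 values."""
    raw = base64.b64decode(b64)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round trip with values that are exactly representable in float32.
packed = base64.b64encode(struct.pack("<3f", 0.5, -1.0, 2.25)).decode("ascii")
vector = decode_embedding(packed)
print(vector)
```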

### Audio
- `SpeechToText` — transcription request (Whisper-compatible)
- `AudioTranscription` — text, language, duration, segments, word-level timestamps
- `AudioTranslation` — translation to English with segments
- `AudioSegment` — chunk with start/end, tokens, log-prob, compression ratio
- `AudioWord` — word-level timestamp pair
- `AudioFormat` — MP3, WAV, FLAC, OGG, PCM, OPUS

### Image Generation
- `ImageGeneration` — text-to-image and image-to-image request (FLUX, Stable Diffusion)
- `ImageResult` — list of generated images with seeds
- `GeneratedImage` — URL or base64, seed, revised prompt, dimensions
- `SamplerType` — DDIM, DDPM, Euler, Euler-A, DPM++2M, DPM++SDE
- `ImageFormat` — PNG, JPEG, WEBP

### Reasoning
- `ReasoningModel` — model with thinking budget and streaming thoughts support
- `ReasoningStep` — thought, action, observation with token count

### Fine-Tuning
- `FineTuningJob` — supervised or reinforcement fine-tuning job lifecycle
- `FineTuningStatus` — QUEUED, RUNNING, SUCCEEDED, FAILED, CANCELLED
- `FineTuningDetails` — method (LoRA / full-param / RFT), hyperparameters, estimated cost
- `TrainingFile` — uploaded dataset file with line count and status
- `CheckpointDetails` — step, metrics (train/valid loss and accuracy)
- `CheckpointMetrics` — quantitative training metrics per checkpoint

### Batch Inference
- `BatchJob` — asynchronous batch request billed at a 50% discount relative to serverless pricing
- `BatchJobStatus` — VALIDATING, IN_PROGRESS, COMPLETED, FAILED, CANCELLED, EXPIRED
- `BatchDetails` — request counts, token totals, discount percentage, metadata
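
To make the discount arithmetic concrete, the sketch below applies the discount percentage from `BatchDetails` to an equivalent serverless cost. The function name and the dollar amount are hypothetical; only the 50% figure comes from the list above.

```python
from decimal import Decimal


def batch_cost(serverless_cost_usd: Decimal,
               discount_percentage: Decimal = Decimal("50")) -> Decimal:
    """Apply a percentage discount to the equivalent serverless cost."""
    if not Decimal("0") <= discount_percentage <= Decimal("100"):
        raise ValueError("discount_percentage must be between 0 and 100")
    return serverless_cost_usd * (Decimal("100") - discount_percentage) / Decimal("100")


# A workload that would cost $12.40 on serverless (made-up figure).
print(batch_cost(Decimal("12.40")))
```

`Decimal` is used instead of `float` so that currency values are not subject to binary rounding.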

### Usage and Cost
- `Usage` — account-level usage for a billing period with breakdown by model
- `TokenUsage` — prompt, completion, total, cached, audio, image token counts
- `CostEstimate` — prompt cost, completion cost, GPU cost, storage cost in USD
- `UsageBreakdown` — per-model slice of usage and cost

### Prompts and Templates
- `PromptTemplate` — named template with variable substitution and linked system prompt
- `TemplateVariable` — name, description, default, required flag
- `SystemPrompt` — reusable system message with token count
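
One way the variable substitution described above could work is sketched below. The `{{name}}` placeholder syntax, the `render` function, and this Python `TemplateVariable` class are assumptions made for illustration; the schema only states that a variable has a name, description, default, and required flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TemplateVariable:
    name: str
    default: Optional[str] = None
    required: bool = False


def render(template: str, variables: list[TemplateVariable],
           values: dict[str, str]) -> str:
    """Fill {{name}} placeholders, applying defaults and enforcing required flags."""
    out = template
    for var in variables:
        value = values.get(var.name, var.default)
        if value is None:
            if var.required:
                raise KeyError(f"missing required variable: {var.name}")
            value = ""  # optional variable with no default renders as empty
        out = out.replace("{{" + var.name + "}}", value)
    return out


text = render(
    "Translate {{text}} into {{language}}.",
    [TemplateVariable("text", required=True),
     TemplateVariable("language", default="English")],
    {"text": "bonjour"},
)
print(text)  # Translate bonjour into English.
```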

### Account and Auth
- `Account` — top-level account with plan and API keys
- `AccountDetails` — email, org, credit balance, monthly spend limit
- `AccountPlan` — FREE, GROWTH, SCALE, ENTERPRISE
- `APIKey` — key with prefix, scopes, expiry, last-used timestamp
- `Token` — OAuth-style access token envelope

### Error Handling and Rate Limits
- `Error` — code, message, type, param, details
- `RateLimitInfo` — per-minute request/token limits, remaining counts, reset times, retry-after
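
A client could use `RateLimitInfo` to decide how long to pause before retrying. The sketch below prefers the server's retry-after hint and falls back to capped exponential backoff. The key names `retryAfter` and `remainingRequests`, and the default `base` and `cap` values, are hypothetical and may not match the field names in the schema file.

```python
def seconds_to_wait(rate_limit: dict, attempt: int,
                    base: float = 1.0, cap: float = 60.0) -> float:
    """Return how many seconds to sleep before the next request."""
    retry_after = rate_limit.get("retryAfter")
    if retry_after is not None:
        # The server told us how long to wait; honor it up to the cap.
        return min(float(retry_after), cap)
    if rate_limit.get("remainingRequests", 1) > 0:
        return 0.0  # budget left in the current window
    # No hint and no budget: exponential backoff, doubling per attempt.
    return min(base * (2 ** attempt), cap)


print(seconds_to_wait({"retryAfter": 5}, attempt=0))
print(seconds_to_wait({"remainingRequests": 0}, attempt=3))
```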

## Operations

### Queries (14)
`model`, `models`, `deployment`, `deployments`, `account`, `usage`, `fineTuningJob`, `fineTuningJobs`, `trainingFile`, `trainingFiles`, `batchJob`, `batchJobs`, `apiKeys`, `promptTemplates`

### Mutations (16)
`createCompletion`, `createChatCompletion`, `createEmbedding`, `generateImage`, `transcribeAudio`, `translateAudio`, `createDeployment`, `updateDeployment`, `stopDeployment`, `deleteDeployment`, `createFineTuningJob`, `cancelFineTuningJob`, `uploadTrainingFile`, `deleteTrainingFile`, `createBatchJob`, `cancelBatchJob`

### Subscriptions (5)
`chatCompletionStream`, `completionStream`, `fineTuningJobUpdates`, `batchJobUpdates`, `deploymentStatusUpdates`
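
Because the schema is conceptual, the following is only a sketch of how one of the operations above could be packaged as a conventional GraphQL-over-HTTP JSON body. The `models` query name comes from the list above; the `id` field selection and the assumption that `models` returns objects directly are guesses that should be checked against `fireworks-ai-schema.graphql`.

```python
import json
from typing import Optional

# Field selection is assumed; adjust it to the fields defined on Model.
MODELS_QUERY = """
query Models {
  models {
    id
  }
}
"""


def build_request(query: str, variables: Optional[dict] = None) -> str:
    """Serialize a GraphQL operation into a JSON POST body."""
    body = {"query": query.strip()}
    if variables:
        body["variables"] = variables
    return json.dumps(body)


payload = build_request(MODELS_QUERY)
print(payload)
```

Queries and mutations both fit this request shape. Subscriptions such as `chatCompletionStream` normally need a streaming transport, which this sketch does not cover.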

## Type Count

63 named types (scalars, enums, object types) plus Query, Mutation, and Subscription root types.

## Related Resources

- OpenAPI spec: `openapi/fireworks-ai-merged-openapi.yml`
- AsyncAPI spec: `asyncapi/fireworks-ai-asyncapi.yml`
- Plans and pricing: `plans/fireworks-ai-plans-pricing.yml`
- Rate limits: `rate-limits/fireworks-ai-rate-limits.yml`
- FinOps: `finops/fireworks-ai-finops.yml`