--- name: elevenlabs-deploy-integration description: 'Deploy ElevenLabs TTS applications to Vercel, Fly.io, and Cloud Run. Use when deploying ElevenLabs-powered apps to production, configuring platform-specific secrets, or setting up serverless TTS. Trigger: "deploy elevenlabs", "elevenlabs Vercel", "elevenlabs Cloud Run", "elevenlabs Fly.io", "elevenlabs serverless", "host TTS API". ' allowed-tools: Read, Write, Edit, Bash(vercel:*), Bash(fly:*), Bash(gcloud:*) version: 1.0.0 license: MIT author: Jeremy Longshore tags: - saas - voice - ai - elevenlabs - deployment - serverless compatibility: Designed for Claude Code --- # ElevenLabs Deploy Integration ## Overview Deploy ElevenLabs TTS/voice applications to cloud platforms. Covers Vercel (serverless), Fly.io (containers), and Google Cloud Run with proper secrets management, timeout configuration, and streaming support. ## Prerequisites - ElevenLabs API key for production - Platform CLI installed (`vercel`, `fly`, or `gcloud`) - Application code tested locally ## Instructions ### Vercel Deployment (Serverless) **Key constraint:** Vercel functions have a 10-second timeout on Hobby (30s on Pro). Use Flash model for speed. ```bash # Set secrets vercel env add ELEVENLABS_API_KEY production vercel env add ELEVENLABS_API_KEY preview # Deploy vercel --prod ``` **API Route (Next.js / Vercel):** ```typescript // app/api/tts/route.ts import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js"; import { NextResponse } from "next/server"; export const runtime = "nodejs"; export const maxDuration = 30; // Vercel Pro max const client = new ElevenLabsClient(); export async function POST(req: Request) { const { text, voiceId = "21m00Tcm4TlvDq8ikWAM" } = await req.json(); if (!text || text.length > 5000) { return NextResponse.json( { error: "Text required, max 5000 characters" }, { status: 400 } ); } try { const audio = await client.textToSpeech.convert(voiceId, { text, model_id: "eleven_flash_v2_5", // Fast for serverless output_format: "mp3_22050_32", voice_settings: { stability: 0.5, similarity_boost: 0.75, }, }); return new Response(audio as any, { headers: { "Content-Type": "audio/mpeg", "Cache-Control": "public, max-age=3600", }, }); } catch (error: any) { const status = error.statusCode || 500; return NextResponse.json( { error: error.message || "TTS generation failed" }, { status } ); } } ``` **vercel.json:** ```json { "env": { "ELEVENLABS_API_KEY": "@elevenlabs_api_key" }, "functions": { "app/api/tts/route.ts": { "maxDuration": 30 } } } ``` ### Fly.io Deployment (Container) Better for long-running TTS, WebSocket streaming, and high concurrency. **fly.toml:** ```toml app = "my-tts-service" primary_region = "iad" [env] NODE_ENV = "production" # Use the closest region to ElevenLabs servers (US East) ELEVENLABS_MODEL = "eleven_multilingual_v2" [http_service] internal_port = 3000 force_https = true auto_stop_machines = true auto_start_machines = true min_machines_running = 1 [http_service.concurrency] type = "requests" hard_limit = 25 soft_limit = 20 [[vm]] cpu_kind = "shared" cpus = 1 memory_mb = 512 ``` ```bash # Set secrets fly secrets set ELEVENLABS_API_KEY=sk_your_prod_key fly secrets set ELEVENLABS_WEBHOOK_SECRET=whsec_your_secret # Deploy fly deploy # Check logs fly logs ``` **Express server with streaming:** ```typescript // server.ts import express from "express"; import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js"; import { Readable } from "stream"; const app = express(); app.use(express.json()); const client = new ElevenLabsClient(); // Streaming TTS endpoint app.post("/api/tts/stream", async (req, res) => { const { text, voiceId = "21m00Tcm4TlvDq8ikWAM", modelId } = req.body; res.setHeader("Content-Type", "audio/mpeg"); res.setHeader("Transfer-Encoding", "chunked"); try { const stream = await client.textToSpeech.stream(voiceId, { text, model_id: modelId || "eleven_flash_v2_5", output_format: "mp3_22050_32", }); // Pipe streaming audio directly to response const readable = Readable.fromWeb(stream as any); readable.pipe(res); } catch (error: any) { if (!res.headersSent) { res.status(error.statusCode || 500).json({ error: error.message }); } } }); // Health check app.get("/health", async (_req, res) => { try { const user = await client.user.get(); res.json({ status: "healthy", quota: { used: user.subscription.character_count, limit: user.subscription.character_limit, }, }); } catch { res.status(503).json({ status: "unhealthy" }); } }); app.listen(3000, () => console.log("TTS service running on :3000")); ``` ### Google Cloud Run ```bash # Build and deploy gcloud run deploy tts-service \ --source . \ --region us-central1 \ --platform managed \ --allow-unauthenticated \ --set-secrets=ELEVENLABS_API_KEY=elevenlabs-api-key:latest \ --timeout=60 \ --concurrency=10 \ --min-instances=0 \ --max-instances=5 # Store secret in Secret Manager first echo -n "sk_your_prod_key" | gcloud secrets create elevenlabs-api-key --data-file=- ``` **Dockerfile:** ```dockerfile FROM node:20-slim WORKDIR /app COPY package*.json ./ RUN npm ci --only=production COPY . . EXPOSE 3000 CMD ["node", "dist/server.js"] ``` ## Platform Comparison for ElevenLabs | Feature | Vercel | Fly.io | Cloud Run | |---------|--------|--------|-----------| | Max timeout | 30s (Pro) | No limit | 60min | | WebSocket streaming | Limited | Full support | Full support | | Cold start | ~1-3s | ~0.5-2s | ~1-5s | | Concurrency | Per-function | Per-VM | Per-instance | | Best for | Simple TTS API | Streaming/WebSocket | Variable load | | Min cost | Free tier | ~$2/mo | Free tier | ## Error Handling | Issue | Cause | Solution | |-------|-------|----------| | Vercel timeout | TTS > 10s on Hobby | Upgrade to Pro (30s) or use Flash model | | Cold start slow | Container initialization | Set `min_instances=1` (Cloud Run) or `min_machines=1` (Fly) | | Secret not found | Missing platform config | Add via platform CLI | | Streaming broken | Proxy buffering | Disable response buffering in nginx/CDN | | CORS errors | Missing headers | Add `Access-Control-Allow-Origin` to TTS endpoint | ## Resources - [Vercel Functions](https://vercel.com/docs/functions) - [Fly.io Node.js](https://fly.io/docs/languages-and-frameworks/node/) - [Cloud Run Docs](https://cloud.google.com/run/docs) - [ElevenLabs API Quickstart](https://elevenlabs.io/docs/eleven-api/quickstart) ## Next Steps For webhook handling, see `elevenlabs-webhooks-events`.