---
name: mistral-deploy-integration
description: 'Deploy Mistral AI integrations to Vercel, Docker, and Cloud Run platforms. Use when deploying Mistral AI-powered applications to production, configuring platform-specific secrets, or setting up deployment pipelines. Trigger with phrases like "deploy mistral", "mistral Vercel", "mistral production deploy", "mistral Cloud Run", "mistral Docker".'
allowed-tools: Read, Write, Edit, Bash(vercel:*), Bash(docker:*), Bash(gcloud:*)
version: 1.0.0
license: MIT
author: Jeremy Longshore
tags:
  - saas
  - mistral
  - deployment
compatibility: Designed for Claude Code, also compatible with Codex and OpenClaw
---

# Mistral AI Deploy Integration

## Overview

Deploy Mistral AI-powered applications to production with secure API key management. Covers Vercel (Edge + Serverless), Docker, Cloud Run, and self-hosted vLLM deployments. All variants connect to `api.mistral.ai` or to your own inference endpoint.

## Prerequisites

- Mistral AI production API key
- Platform CLI installed (vercel, docker, or gcloud)
- Application using the `@mistralai/mistralai` SDK

## Instructions

### Step 1: Platform Secret Configuration

```bash
set -euo pipefail

# Vercel
vercel env add MISTRAL_API_KEY production
vercel env add MISTRAL_MODEL production  # optional: default model

# Cloud Run
echo -n "your-key" | gcloud secrets create mistral-api-key --data-file=-

# Docker
echo "MISTRAL_API_KEY=your-key" > .env.production
echo ".env.production" >> .gitignore
```

### Step 2: Vercel Edge Function

```typescript
// api/chat.ts — Vercel Edge Function with streaming
import { Mistral } from '@mistralai/mistralai';

export const config = { runtime: 'edge' };

export default async function handler(req: Request) {
  const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY! });
  const { messages, stream = false } = await req.json();

  if (stream) {
    const streamResponse = await client.chat.stream({
      model: process.env.MISTRAL_MODEL ?? 'mistral-small-latest',
      messages,
    });

    const encoder = new TextEncoder();
    const readable = new ReadableStream({
      async start(controller) {
        for await (const event of streamResponse) {
          const content = event.data?.choices?.[0]?.delta?.content;
          if (content) {
            controller.enqueue(encoder.encode(`data: ${JSON.stringify({ content })}\n\n`));
          }
        }
        controller.enqueue(encoder.encode('data: [DONE]\n\n'));
        controller.close();
      },
    });

    return new Response(readable, {
      headers: {
        'Content-Type': 'text/event-stream',
        'Cache-Control': 'no-cache',
      },
    });
  }

  const response = await client.chat.complete({
    model: process.env.MISTRAL_MODEL ?? 'mistral-small-latest',
    messages,
  });
  return Response.json(response);
}
```

### Step 3: Docker Deployment

```dockerfile
FROM node:20-slim AS builder
WORKDIR /app
COPY package*.json ./
# Install dev dependencies needed for the build step
RUN npm ci --include=dev
COPY . .
RUN npm run build

FROM node:20-slim
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
ENV NODE_ENV=production
EXPOSE 3000

# node:20-slim does not ship curl, so use Node's built-in fetch for the health check
HEALTHCHECK --interval=30s --timeout=5s \
  CMD node -e "fetch('http://localhost:3000/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"

CMD ["node", "dist/index.js"]
```

```bash
set -euo pipefail

docker build -t mistral-app .
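# Alternative (sketch): reuse the .env.production file created in Step 1 instead of
# separate -e flags; assumes it contains MISTRAL_API_KEY (and optionally MISTRAL_MODEL).
# docker run -d --name mistral-app -p 3000:3000 --env-file .env.production mistral-app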
docker run -d --name mistral-app \
  -p 3000:3000 \
  -e MISTRAL_API_KEY="$MISTRAL_API_KEY" \
  -e MISTRAL_MODEL="mistral-small-latest" \
  mistral-app
```

### Step 4: Cloud Run Deployment

```bash
set -euo pipefail

# Build and push
gcloud builds submit --tag gcr.io/$PROJECT_ID/mistral-app

# Deploy with secret injection
gcloud run deploy mistral-service \
  --image gcr.io/$PROJECT_ID/mistral-app \
  --region us-central1 \
  --platform managed \
  --set-secrets=MISTRAL_API_KEY=mistral-api-key:latest \
  --set-env-vars=MISTRAL_MODEL=mistral-small-latest \
  --min-instances=1 \
  --max-instances=10 \
  --memory=512Mi \
  --timeout=60s
```

### Step 5: Self-Hosted with vLLM

For data sovereignty or latency requirements, self-host open-weight Mistral models:

```bash
set -euo pipefail

# Serve Mistral with vLLM (OpenAI-compatible API)
docker run --runtime nvidia --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -p 8000:8000 \
  -e HF_TOKEN="$HF_TOKEN" \
  vllm/vllm-openai:latest \
  --model mistralai/Mistral-Small-24B-Instruct-2501 \
  --dtype auto \
  --api-key "your-local-key"
```

Point the SDK at your local endpoint:

```typescript
import { Mistral } from '@mistralai/mistralai';

const client = new Mistral({
  apiKey: 'your-local-key',
  serverURL: 'http://localhost:8000', // vLLM endpoint
});
```

### Step 6: Health Check Endpoint

```typescript
import { Mistral } from '@mistralai/mistralai';

export async function GET() {
  const start = performance.now();
  try {
    const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY! });
    await client.models.list();
    return Response.json({
      status: 'healthy',
      provider: 'mistral',
      latencyMs: Math.round(performance.now() - start),
    });
  } catch (error: any) {
    return Response.json(
      { status: 'unhealthy', error: error.message },
      { status: 503 },
    );
  }
}
```

## Error Handling

| Issue | Cause | Solution |
|-------|-------|----------|
| API key not found | Missing env/secret | Verify secret config on platform |
| Function timeout | Long completion | Increase timeout, use streaming |
| Cold start latency | Serverless spin-up | Set `min-instances=1` or use edge |
| vLLM OOM | Model too large for GPU | Use quantized model or smaller variant |

## Resources

- [Mistral AI Documentation](https://docs.mistral.ai/)
- [vLLM Deployment](https://docs.mistral.ai/deployment/self-deployment/vllm/)
- [Cloud Deployment](https://docs.mistral.ai/deployment/ai-studio/)

## Output

- Platform-specific deployment configurations
- Secure API key management per platform
- Streaming support for Edge/Serverless
- Health check endpoint
- Self-hosted option with vLLM
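## Verification

Once a deployment is live, a quick smoke test against the health check from Step 6 confirms the API key and endpoint wiring. This is a minimal sketch; `APP_URL` and the `/health` path are placeholders that depend on your deployed URL and on where the handler is mounted:

```bash
set -euo pipefail

# Replace with your Vercel domain, Cloud Run service URL, or http://localhost:3000 for Docker
APP_URL="https://your-app.example.com"

# Expect a JSON body like {"status":"healthy","provider":"mistral","latencyMs":...}
curl -sf "$APP_URL/health"
```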