---
name: flyio-performance-tuning
description: 'Optimize Fly.io application performance with auto-stop/start tuning, VM sizing, multi-region latency optimization, and connection pooling. Trigger: "fly.io performance", "fly.io cold start", "fly.io latency", "fly.io VM sizing".'
allowed-tools: Read, Write, Edit, Bash(fly:*)
version: 1.0.0
license: MIT
author: Jeremy Longshore
tags:
- saas
- edge-compute
- flyio
compatibility: Designed for Claude Code
---

# Fly.io Performance Tuning

## Overview

Optimize Fly.io performance: eliminate cold starts, right-size VMs, deploy to multiple regions for low latency, and tune concurrency settings.

## Instructions

### Step 1: Eliminate Cold Starts

```toml
# fly.toml: suspend instead of stop for a faster resume (roughly ~100ms vs ~5s)
[http_service]
  auto_stop_machines = "suspend"   # Suspend to RAM, not a full stop
  auto_start_machines = true
  min_machines_running = 1         # Keep one Machine always warm

# min_machines_running only counts Machines in the primary region.
# For latency-critical apps that need warm Machines in every region,
# set auto_stop_machines = "off" instead.
```

### Step 2: Right-Size VMs

```bash
# Check current allocation
fly scale show -a my-app

# Start small and scale up based on metrics (pick one, do not run all four)
fly scale vm shared-cpu-1x --memory 256    # Start here
fly scale vm shared-cpu-1x --memory 512    # If memory-constrained
fly scale vm shared-cpu-2x --memory 1024   # If CPU-bound
fly scale vm performance-2x --memory 4096  # For compute-heavy workloads
```

| Workload | VM size | Memory | When to use |
|----------|---------|--------|-------------|
| Static site / API proxy | shared-cpu-1x | 256 MB | Low traffic |
| Node.js API | shared-cpu-1x | 512 MB | Most apps |
| Heavy processing | shared-cpu-2x | 1 GB | Background jobs |
| Database / ML | performance-2x | 4 GB | Compute-intensive |

### Step 3: Multi-Region Latency Optimization

```bash
# Deploy close to your users
fly scale count 1 --region iad  # US East
fly scale count 1 --region lhr  # Europe
fly scale count 1 --region nrt  # Asia Pacific

# Fly automatically routes each request to the nearest region via Anycast

# Verify: curl with timing
curl -w "DNS: %{time_namelookup}s, Connect: %{time_connect}s, Total: %{time_total}s\n" \
  -o /dev/null -s https://my-app.fly.dev/health
```

### Step 4: Connection Pooling for Postgres

```typescript
// Use connection pooling for Fly Postgres.
// This step assumes a pooler such as PgBouncer listens on port 5433 and
// direct Postgres connections use 5432. The port layout depends on how the
// database was provisioned, so confirm it for your cluster before switching.
const pooledUrl = process.env.DATABASE_URL?.replace(':5432/', ':5433/');

// Prisma: add pgbouncer=true so it avoids features the pooler does not support
// DATABASE_URL="postgres://user:pass@my-db.internal:5433/db?pgbouncer=true"
```

### Step 5: Tune Concurrency

```toml
[http_service.concurrency]
  type = "requests"   # or "connections"
  hard_limit = 250    # At this load the proxy stops sending traffic to the Machine
  soft_limit = 200    # At this load the proxy starts preferring other Machines
```

## Resources

- [Auto Stop/Start](https://fly.io/docs/launch/autostop-autostart/)
- [Machine Sizing](https://fly.io/docs/machines/)
- [Suspend/Resume](https://fly.io/docs/reference/suspend-resume/)

## Next Steps

For cost optimization, see `flyio-cost-tuning`.
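
The port swap in Step 4 uses a plain string replace, which silently does nothing when the URL lacks an explicit `:5432/`. A minimal sketch of a stricter alternative using the standard `URL` class is below. The helper name `toPooledUrl` and its `pooledPort` parameter are hypothetical, and the default of 5433 is only the pooled port that Step 4 assumes, so check it against your own cluster.

```typescript
// Hypothetical helper, not part of any Fly.io or Prisma API.
// Points a Postgres URL at a pooler port and adds Prisma's pgbouncer flag.
function toPooledUrl(directUrl: string, pooledPort: number = 5433): string {
  // new URL() throws on a malformed string instead of passing it through unchanged
  const url = new URL(directUrl);
  url.port = String(pooledPort); // works whether or not a port was present
  url.searchParams.set("pgbouncer", "true");
  return url.toString();
}

// Example with the placeholder URL from Step 4
console.log(toPooledUrl("postgres://user:pass@my-db.internal:5432/db"));
```

Parsing the URL rather than pattern-matching it also handles connection strings that omit the port or already carry query parameters.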