--- name: castai-cost-tuning description: 'Maximize Kubernetes cost savings with CAST AI spot strategies and right-sizing. Use when analyzing cloud spend, optimizing spot-to-on-demand ratios, or configuring CAST AI for maximum savings. Trigger with phrases like "cast ai cost", "cast ai savings", "cast ai spot strategy", "reduce kubernetes cost", "cast ai budget". ' allowed-tools: Read, Write, Edit, Bash(curl:*), Grep version: 1.0.0 license: MIT author: Jeremy Longshore tags: - saas - kubernetes - cost-optimization - castai compatibility: Designed for Claude Code --- # CAST AI Cost Tuning ## Overview Maximize Kubernetes cost savings through CAST AI: spot instance strategies, workload right-sizing, cluster hibernation, and savings tracking. Typical savings: 50-70% on cloud compute costs. ## Prerequisites - CAST AI Phase 2 enabled with full automation - Savings report available (requires 24h+ of data) - Understanding of workload criticality tiers ## Instructions ### Step 1: Analyze Current Savings ```bash # Get savings breakdown curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \ "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/savings" \ | jq '{ currentMonthlyCost: .currentMonthlyCost, optimizedMonthlyCost: .optimizedMonthlyCost, monthlySavings: .monthlySavings, savingsPercentage: .savingsPercentage, spotSavings: .spotSavings, rightSizingSavings: .rightSizingSavings }' ``` ### Step 2: Maximize Spot Usage ```bash # Enable aggressive spot with diversity and fallbacks curl -X PUT -H "X-API-Key: ${CASTAI_API_KEY}" \ -H "Content-Type: application/json" \ "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/policies" \ -d '{ "enabled": true, "spotInstances": { "enabled": true, "clouds": ["aws"], "spotDiversityEnabled": true, "spotDiversityPriceIncreaseLimitPercent": 20, "spotBackups": { "enabled": true, "spotBackupRestoreRateSeconds": 600 } } }' ``` **Spot allocation strategy by workload tier:** | Workload Type | Spot % | Rationale | |---------------|--------|-----------| | Batch jobs, CI runners | 100% spot | Interruptible, restartable | | Stateless APIs (behind LB) | 80% spot | Can handle brief interruptions | | Stateful services, databases | 0% spot | Use on-demand or reserved | | ML training | 80-100% spot | Checkpointing handles interrupts | ### Step 3: Workload Right-Sizing ```bash # Get resource waste analysis curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \ "https://api.cast.ai/v1/workload-autoscaling/clusters/${CASTAI_CLUSTER_ID}/workloads" \ | jq '[.items[] | select(.estimatedSavingsPercent > 20) | { name: .workloadName, namespace: .namespace, wastedCpu: (.currentCpuRequest - .recommendedCpuRequest), wastedMemory: (.currentMemoryRequest - .recommendedMemoryRequest), savingsPercent: .estimatedSavingsPercent }] | sort_by(-.savingsPercent) | .[0:10]' ``` ### Step 4: Cluster Hibernation (Dev/Staging) ```bash # Hibernate non-production clusters during off-hours # Scales nodes to zero, resume on demand # Enable hibernation curl -X POST -H "X-API-Key: ${CASTAI_API_KEY}" \ -H "Content-Type: application/json" \ "https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/hibernate" \ -d '{ "schedule": { "enabled": true, "hibernateAt": "20:00", "wakeUpAt": "08:00", "timezone": "America/New_York", "weekdaysOnly": true } }' ``` ### Step 5: Cost Tracking Dashboard ```typescript interface CostReport { cluster: string; period: string; currentCost: number; optimizedCost: number; savings: number; spotPercent: number; } async function generateMonthlyCostReport( clusterIds: string[] ): Promise { const reports: CostReport[] = []; for (const clusterId of clusterIds) { const [cluster, savings, nodes] = await Promise.all([ castaiGet(`/v1/kubernetes/external-clusters/${clusterId}`), castaiGet(`/v1/kubernetes/clusters/${clusterId}/savings`), castaiGet(`/v1/kubernetes/external-clusters/${clusterId}/nodes`), ]); const spotNodes = nodes.items.filter( (n: { lifecycle: string }) => n.lifecycle === "spot" ).length; reports.push({ cluster: cluster.name, period: new Date().toISOString().slice(0, 7), currentCost: savings.currentMonthlyCost, optimizedCost: savings.optimizedMonthlyCost, savings: savings.monthlySavings, spotPercent: nodes.items.length > 0 ? (spotNodes / nodes.items.length) * 100 : 0, }); } return reports; } ``` ## Cost Optimization Checklist - [ ] Spot instances enabled with diversity - [ ] Workload autoscaler right-sizing resources - [ ] Dev/staging clusters hibernated off-hours - [ ] Empty node downscaler enabled - [ ] Instance families include latest generation (cheaper) - [ ] Reserved/savings plan for baseline on-demand nodes - [ ] Weekly savings report review ## Error Handling | Issue | Cause | Solution | |-------|-------|----------| | Savings lower than expected | Too many on-demand constraints | Relax node template constraints | | Spot interruptions too frequent | Single instance type | Enable spot diversity | | Hibernation not triggering | Schedule timezone wrong | Use IANA timezone format | | Right-sizing too aggressive | Low headroom | Increase memory headroom to 20% | ## Resources - [CAST AI Savings Report](https://docs.cast.ai/docs/getting-started) - [Spot Instance Best Practices](https://docs.cast.ai/docs/autoscaler-settings) - [Cluster Hibernation](https://docs.cast.ai/docs/autoscaling-cluster-hibernation) ## Next Steps For architecture patterns, see `castai-reference-architecture`.