--- name: agent-cost-optimizer description: Real-time cost tracking, budget enforcement, and ROI measurement for AI agent operations. Track token usage, predict costs, enforce budget caps ($50-70/month typical), optimize model selection, cache results, measure cost-to-value. Use when tracking AI costs, preventing budget overruns, optimizing spend, measuring ROI, or ensuring cost-effective AI operations. allowed-tools: Read, Write, Edit, Glob, Grep, Bash --- # Agent Cost Optimizer ## Overview agent-cost-optimizer provides comprehensive cost tracking, budget enforcement, and ROI measurement for AI agent operations. **Purpose**: Control and optimize AI spending while maximizing value delivered **Pattern**: Task-based (7 operations for cost management) **Key Innovation**: Real-time cost tracking with automatic budget enforcement and cost-effective fallbacks **Industry Context** (2025): - Average AI spending: **$85,521/month** (36% YoY increase) - Only **50% of organizations can measure AI ROI** - IT teams struggle with hidden costs **Solution**: Comprehensive cost management from tracking to optimization --- ## When to Use Use agent-cost-optimizer when: - Tracking AI costs across skills - Preventing budget overruns - Measuring ROI (cost vs. value delivered) - Optimizing model selection (Opus vs. Sonnet vs. Haiku) - Planning AI budgets - Cost-effective development - Enterprise cost accountability --- ## Prerequisites ### Required - AI API access (Anthropic, OpenAI, Google) - Cost tracking capability (API usage data) ### Optional - Paid.ai or AgentOps integration (advanced cost tracking) - Prometheus/Grafana (cost visualization) - Budget approval workflow --- ## Cost Operations ### Operation 1: Track Token Usage **Purpose**: Monitor token consumption per skill invocation **Process**: 1. **Initialize Tracking**: ```json { "tracking_id": "track_20250126_1200", "skill": "multi-ai-verification", "started_at": "2025-01-26T12:00:00Z", "tokens": { "prompt": 0, "completion": 0, "total": 0 }, "cost": { "amount_usd": 0.00, "model": "claude-sonnet-4-5" } } ``` 2. **Track During Execution**: ```typescript // After each AI call trackTokens({ prompt_tokens: response.usage.input_tokens, completion_tokens: response.usage.output_tokens, model: 'claude-sonnet-4-5' }); // Update running totals ``` 3. **Finalize Tracking**: ```json { "tracking_id": "track_20250126_1200", "skill": "multi-ai-verification", "completed_at": "2025-01-26T12:45:00Z", "duration_minutes": 45, "tokens": { "prompt": 15234, "completion": 8932, "total": 24166 }, "cost": { "amount_usd": 0.073, "model": "claude-sonnet-4-5", "rate": "$3 per million tokens" } } ``` 4. **Save to Cost Log**: ```bash # Append to daily cost log cat tracking.json >> .cost-tracking/$(date +%Y-%m-%d).json ``` **Outputs**: - Token usage per invocation - Cost per invocation - Model used - Daily/monthly aggregates **Validation**: - [ ] Tracking initialized - [ ] Tokens counted accurately - [ ] Cost calculated correctly - [ ] Logs saved **Time Estimate**: Automatic (integrated into skills) --- ### Operation 2: Calculate Costs **Purpose**: Compute accurate costs based on provider pricing **Pricing** (2025 rates): **Anthropic** (Claude): | Model | Input (per MTok) | Output (per MTok) | |-------|------------------|-------------------| | Claude Opus 4.5 | $15 | $75 | | Claude Sonnet 4.5 | $3 | $15 | | Claude Haiku 4.5 | $0.80 | $4 | **OpenAI** (Codex): | Model | Input | Output | |-------|-------|--------| | GPT-5.1-codex | $5 | $15 | | o3 | $10 | $40 | | o4-mini | $1.50 | $6 | **Google** (Gemini): | Model | Input | Output | |-------|-------|--------| | Gemini 2.5 Pro | $1.25 | $5 | | Gemini 2.5 Flash | $0.15 | $0.60 | **Process**: ```typescript function calculateCost(usage, model) { const pricing = { 'claude-sonnet-4-5': { input: 3, output: 15 }, 'claude-haiku-4-5': { input: 0.80, output: 4 }, 'claude-opus-4-5': { input: 15, output: 75 }, // ... more models }; const rates = pricing[model]; const inputCost = (usage.prompt_tokens / 1_000_000) * rates.input; const outputCost = (usage.completion_tokens / 1_000_000) * rates.output; return { input_cost: inputCost, output_cost: outputCost, total_cost: inputCost + outputCost, currency: 'USD' }; } ``` **Outputs**: - Accurate cost per invocation - Model-specific pricing - Input vs. output cost breakdown --- ### Operation 3: Enforce Budget Caps **Purpose**: Prevent exceeding monthly budget limits **Process**: 1. **Set Budget**: ```json { "monthly_budget_usd": 100, "skill_budgets": { "multi-ai-verification": 70, "multi-ai-research": 20, "multi-ai-testing": 10 }, "alert_thresholds": { "warning": 0.80, "critical": 0.95 } } ``` 2. **Check Before Operation**: ```typescript async function checkBudget(skill, estimated_cost) { const usage = getCurrentMonthUsage(); const remaining = budget.monthly_budget_usd - usage.total_cost; if (estimated_cost > remaining) { return { allowed: false, reason: `Budget exceeded: $${usage.total_cost}/$${budget.monthly_budget_usd}`, overage: estimated_cost - remaining }; } if (usage.total_cost / budget.monthly_budget_usd > 0.80) { return { allowed: true, warning: `80% of monthly budget used ($${usage.total_cost}/$${budget.monthly_budget_usd})` }; } return { allowed: true }; } ``` 3. **Enforce**: ```typescript const budgetCheck = await checkBudget('multi-ai-verification', 0.50); if (!budgetCheck.allowed) { console.log(`❌ Budget exceeded: ${budgetCheck.reason}`); console.log(`Options:`); console.log(` 1. Use cheaper model (Sonnet → Haiku)`); console.log(` 2. Skip optional verification layers`); console.log(` 3. Request budget increase`); return; } if (budgetCheck.warning) { console.log(`⚠️ ${budgetCheck.warning}`); } // Proceed with operation ``` **Outputs**: - Budget status checked - Operations blocked if over budget - Warnings at 80% - Cost-effective alternatives suggested --- ### Operation 4: Optimize Model Selection **Purpose**: Choose cost-effective model for each task **Decision Matrix**: | Task Type | Recommended Model | Cost | Rationale | |-----------|-------------------|------|-----------| | **Simple verification** (Layer 1-2) | Haiku | $ | Rules-based, fast, cheap | | **Code generation** | Sonnet | $$ | Balanced quality/cost | | **Complex reasoning** (architecture) | Opus | $$$ | Best quality, worth premium | | **LLM-as-judge** | Sonnet or external model | $$ | Good judgment, reasonable cost | | **Test generation** | Sonnet | $$ | Comprehensive coverage needed | | **Research** | Sonnet/Haiku mix | $-$$ | Haiku for search, Sonnet for synthesis | **Auto-Optimization**: ```typescript function selectModel(task_type, criticality, budget_remaining) { // Critical + budget OK → Use Opus if (criticality === 'critical' && budget_remaining > 20) { return 'claude-opus-4-5'; } // Standard → Use Sonnet if (criticality === 'standard') { return 'claude-sonnet-4-5'; } // Budget low or simple task → Use Haiku if (budget_remaining < 5 || task_type === 'simple') { return 'claude-haiku-4-5'; } return 'claude-sonnet-4-5'; // Default } ``` **Outputs**: - Optimal model selected - Cost minimized - Quality maintained --- ### Operation 5: Cost-Effective Caching **Purpose**: Avoid re-computing identical operations **Process**: 1. **Cache Key Generation**: ```typescript function generateCacheKey(operation, inputs) { // Hash inputs to create unique key const content_hash = crypto .createHash('sha256') .update(JSON.stringify(inputs)) .digest('hex'); return `${operation}_${content_hash}`; } ``` 2. **Check Cache Before Operation**: ```typescript const cacheKey = generateCacheKey('verify_code', { files: ['src/auth.ts'], file_hashes: {'src/auth.ts': 'abc123'} }); const cached = readCache(cacheKey); if (cached && !isExpired(cached, 24)) { // Use cached result console.log('📦 Using cached verification result'); return cached.result; } // Cache miss → run verification const result = await runVerification(); // Save to cache saveCache(cacheKey, result, ttl: 24 hours); ``` 3. **Cache Structure**: ```json { "cache_key": "verify_code_abc123def456", "created_at": "2025-01-26T12:00:00Z", "expires_at": "2025-01-27T12:00:00Z", "inputs": { "files": ["src/auth.ts"], "file_hashes": {"src/auth.ts": "abc123"} }, "result": { "quality_score": 92, "layers_passed": 5, "issues": [] }, "cost_saved": 0.073 } ``` **Outputs**: - 90% reduction in re-verification costs - Instant results for unchanged code - Cache hit/miss tracking **Validation**: - [ ] Cache key correctly identifies identical operations - [ ] File changes invalidate cache - [ ] Expired cache not used - [ ] Cost savings tracked --- ### Operation 6: Measure ROI **Purpose**: Calculate return on investment for AI spending **Process**: 1. **Track Time Saved**: ```json { "task": "Implement user authentication", "without_ai": { "estimated_hours": 40, "developer_rate": 100, "total_cost": 4000 }, "with_ai": { "actual_hours": 11.3, "developer_rate": 100, "developer_cost": 1130, "ai_cost": 2.50, "total_cost": 1132.50 }, "roi": { "time_saved_hours": 28.7, "cost_saved": 2867.50, "roi_percentage": 253, "payback_period_hours": 0.025 } } ``` 2. **Calculate ROI**: ```typescript function calculateROI(task) { const time_saved = task.without_ai.estimated_hours - task.with_ai.actual_hours; const cost_saved = task.without_ai.total_cost - task.with_ai.total_cost; const roi_percentage = (cost_saved / task.with_ai.ai_cost) * 100; return { time_saved_hours: time_saved, cost_saved_usd: cost_saved, roi_percentage: roi_percentage, payback_period: task.with_ai.ai_cost / (task.without_ai.developer_rate * (time_saved / 40)) // weeks }; } ``` 3. **Monthly ROI Report**: ```markdown # Monthly ROI Report - January 2025 ## AI Spending - Total AI costs: $87.50 - Breakdown: - multi-ai-verification: $62.30 (71%) - multi-ai-research: $18.40 (21%) - multi-ai-testing: $6.80 (8%) ## Time Savings - Tasks completed: 8 - Total hours saved: 156 hours - Average savings per task: 19.5 hours ## Cost Savings - Developer cost avoided: $15,600 (156h × $100/h) - AI costs: $87.50 - Net savings: $15,512.50 ## ROI - ROI: 17,728% ($177 saved per $1 spent) - Payback period: 0.03 weeks (immediate) - Value multiplier: 178x ## Recommendations - ✅ Current spending highly cost-effective - Continue using AI for all qualifying tasks - Consider increasing budget (high ROI) ``` **Outputs**: - Comprehensive ROI analysis - Time and cost savings quantified - Monthly reports - Business justification for AI spending --- ### Operation 7: Predict Costs **Purpose**: Estimate costs before starting expensive operations **Process**: 1. **Historical Data**: ```json { "operation": "multi-ai-verification", "mode": "all_5_layers", "historical_costs": [ {"date": "2025-01-15", "tokens": 24166, "cost": 0.073}, {"date": "2025-01-18", "tokens": 21893, "cost": 0.066}, {"date": "2025-01-20", "tokens": 26543, "cost": 0.080} ], "avg_cost": 0.073, "std_dev": 0.007 } ``` 2. **Predict Before Operation**: ```typescript const prediction = predictCost('multi-ai-verification', { mode: 'all_5_layers', code_size_lines: 850 }); console.log(`💰 Estimated cost: $${prediction.estimated_cost} ± $${prediction.std_dev}`); console.log(` Range: $${prediction.min_cost} - $${prediction.max_cost}`); console.log(` Confidence: ${prediction.confidence}%`); // Check budget if (prediction.estimated_cost > budget_remaining) { console.log(`⚠️ Estimated cost exceeds remaining budget`); console.log(`Options:`); console.log(` 1. Use Haiku (estimated: $${prediction.estimated_cost * 0.27})`); console.log(` 2. Skip Layer 5 (save ~60%: $${prediction.estimated_cost * 0.4})`); console.log(` 3. Increase budget`); } ``` **Outputs**: - Cost predictions with confidence intervals - Budget impact assessment - Cost-effective alternatives suggested --- ## Cost Optimization Strategies ### Strategy 1: Model Selection **Baseline** (All Sonnet): $100/month **Optimized** (Smart selection): - Layer 1-2: Haiku ($20) - Layer 3-4: Sonnet ($40) - Layer 5: Sonnet ($30) - **Total**: $90/month (10% savings) **Aggressive** (Maximum savings): - Layer 1-2: Haiku ($20) - Layer 3-4: Haiku ($15) - Layer 5: Sonnet ($30) - **Total**: $65/month (35% savings) **Trade-off**: Some quality reduction at Layers 3-4 --- ### Strategy 2: Caching **Without Caching**: Re-verify same code multiple times **With Caching** (24-hour TTL): - First verification: $0.073 - Same code next 24h: $0 (cache hit) - **Savings**: 90% on unchanged code **Implementation**: ```typescript const cacheKey = hash(files_to_verify); const cached = getCache(cacheKey); if (cached && !isExpired(cached, 24)) { return cached.result; // $0 cost } const result = await verify(); // $0.073 cost saveCache(cacheKey, result); ``` --- ### Strategy 3: Ensemble Optimization **Baseline** (Always 5-agent ensemble): - 5 agents × $0.073 = $0.365 per verification **Optimized** (Conditional ensemble): - Critical features: 5 agents ($0.365) - Standard features: 3 agents ($0.219) - Simple changes: 1 agent ($0.073) - **Average savings**: 60% **Decision Logic**: ```typescript function shouldUseEnsemble(criticality, code_size, budget) { if (criticality === 'critical') return 5; if (criticality === 'high' && budget > 20) return 3; return 1; // Single agent } ``` --- ### Strategy 4: Layer Skipping **Full Verification** (All 5 layers): ~$0.073 **Fast-Track** (Layers 1-2 only): - Rules + Functional only - ~$0.015 (80% savings) - Use for: minor changes, docs, config **Decision**: ```typescript if (lines_changed < 50 && files_changed.every(f => !isCritical(f))) { // Fast-track: Layers 1-2 only mode = 'fast_track'; estimated_cost = 0.015; } else { // Full verification mode = 'standard'; estimated_cost = 0.073; } ``` --- ## Budget Management ### Monthly Budget Planning **Sample Budget** ($100/month): ```json { "monthly_budget_usd": 100, "allocation": { "multi-ai-verification": { "budget": 70, "rationale": "Most expensive (LLM-as-judge)" }, "multi-ai-research": { "budget": 20, "rationale": "Occasional use, tri-AI" }, "multi-ai-testing": { "budget": 10, "rationale": "Mostly automated" }, "buffer": 10 }, "assumptions": { "features_per_month": 8, "verifications_per_feature": 1.5, "research_per_month": 2 } } ``` ### Budget Tracking **Daily**: ```bash # Check today's spending cat .cost-tracking/$(date +%Y-%m-%d).json | jq '[.[] | .cost.amount_usd] | add' # Output: $3.45 today ``` **Monthly**: ```bash # Check month-to-date cat .cost-tracking/2025-01-*.json | jq '[.[] | .cost.amount_usd] | add' # Output: $67.80 this month (68% of budget) ``` **Projection**: ```typescript // Project end-of-month const days_elapsed = 26; const days_in_month = 31; const current_spend = 67.80; const projected = (current_spend / days_elapsed) * days_in_month; // = $81.14 projected (within budget ✅) ``` --- ## Cost Alerting ### Alert Levels **80% Budget** (Warning): ```markdown ⚠️ BUDGET ALERT: 80% Used **Current**: $80.00 / $100.00 (80%) **Remaining**: $20.00 **Days left**: 5 **Projected EOMs**: $93.75 (within budget) **Recommendations**: - Monitor spending closely - Use Haiku for simple tasks - Cache aggressively - Skip optional layers where safe ``` **95% Budget** (Critical): ```markdown 🚨 CRITICAL: 95% Budget Used **Current**: $95.00 / $100.00 (95%) **Remaining**: $5.00 **Days left**: 5 **Projected EOM**: $110 (OVER BUDGET) **Actions Required**: 1. Pause non-critical verifications 2. Use Haiku exclusively 3. Request budget increase OR 4. Defer work to next month **Auto-throttling**: Enabled - Only critical operations allowed - All optional layers disabled - Ensemble verification disabled ``` **Budget Exceeded**: ```markdown ❌ BUDGET EXCEEDED **Current**: $102.50 / $100.00 (102.5%) **Overage**: $2.50 **Operations BLOCKED** until: 1. Budget increased OR 2. Next month (resets automatically) **Emergency Override**: Requires approval ``` --- ## ROI Calculation Framework ### Value Metrics **Quantifiable Value**: - Time saved (hours) - Cost saved (developer time avoided) - Quality improvement (fewer bugs in production) - Faster time-to-market (days) **Formula**: ``` ROI = ((Value Delivered - AI Costs) / AI Costs) × 100% ``` **Example**: ``` Feature without AI: 40 hours × $100/hour = $4,000 Feature with AI: 11.3 hours × $100/hour + $2.50 AI = $1,132.50 Value Delivered = $4,000 - $1,130 = $2,870 AI Costs = $2.50 ROI = ($2,870 / $2.50) × 100% = 114,800% ``` ### Monthly Reporting **Template**: ```markdown # AI ROI Report - January 2025 ## Summary - **AI Spending**: $87.50 - **Value Delivered**: $15,600 (156 hours saved × $100/hour) - **ROI**: 17,728% - **Payback**: Immediate ## Details ### Features Delivered (8) 1. User authentication - 11.3h (was 40h), ROI: 114,800% 2. Payment integration - 8.5h (was 32h), ROI: 108,235% [... more ...] ### Cost Breakdown - Verification: $62.30 (71%) - Highest cost, highest value - Research: $18.40 (21%) - Occasional, high impact - Testing: $6.80 (8%) - Mostly automated, low cost ### Savings - Time: 156 hours saved - Cost: $15,512.50 net savings - Quality: 4 bugs prevented (saved ~20 hours) ### Recommendations - ✅ ROI is excellent (17,728%) - Consider increasing budget (high returns) - Current spending optimal ``` --- ## Quick Reference ### Cost Operations | Operation | Purpose | Time | Automation | |-----------|---------|------|------------| | **Track** | Monitor token usage | Automatic | 100% | | **Calculate** | Compute costs | Automatic | 100% | | **Enforce** | Budget caps | Automatic | 100% | | **Optimize** | Model selection | Semi-auto | 70% | | **Cache** | Avoid re-compute | Automatic | 100% | | **Measure ROI** | Value analysis | Manual | 30% | | **Predict** | Cost estimation | Automatic | 90% | ### Cost Optimization Strategies | Strategy | Savings | Trade-off | Recommended For | |----------|---------|-----------|-----------------| | **Model selection** | 10-35% | Some quality loss | All features | | **Caching** | 90% | Stale results risk | Unchanged code | | **Ensemble optimization** | 60% | Lower confidence | Non-critical | | **Layer skipping** | 80% | Less thorough | Minor changes | ### Budget Thresholds - **< 80%**: Normal operation - **80-95%**: Warning, optimize - **95-100%**: Critical, throttle - **> 100%**: Block operations --- **agent-cost-optimizer ensures cost-effective AI operations through real-time tracking, budget enforcement, model optimization, and ROI measurement - preventing budget overruns while maximizing value delivered.** For cost reports, see examples/. For optimization strategies, see Cost Optimization Strategies section.