--- description: Performance and load testing for API endpoints, payment pipeline, and concurrent user scenarios handoffs: - label: Fix Performance Issues agent: backend-engineer prompt: Optimize the backend code to meet performance targets identified in the load test send: false --- ## User Input ```text $ARGUMENTS ``` Test type options: `payment-pipeline`, `api-endpoints`, `concurrent-users`, `full-system`, `all` (default: api-endpoints) ## Task Run performance and load tests to validate system behavior under realistic and peak load conditions. ### Performance Targets (from IMPLEMENTATION-GUIDE.md) - **API Endpoints**: <500ms (p95) - **AI Generation**: <20s (p95) - **PDF Generation**: <20s (p95) - **Full Pipeline**: <90s (p95) (payment → AI → PDF → email) - **Concurrent Users**: Support 50+ simultaneous quiz submissions ### Steps 1. **Install Load Testing Tools** (if not installed): ```bash pip install locust httpx pytest-benchmark ``` 2. **Parse Arguments**: - If `$ARGUMENTS` is empty or "api-endpoints": Test API latency - If `$ARGUMENTS` is "payment-pipeline": Test full payment → AI → PDF → email flow - If `$ARGUMENTS` is "concurrent-users": Simulate 50 concurrent quiz submissions - If `$ARGUMENTS` is "full-system": Test all endpoints under load - If `$ARGUMENTS` is "all": Run all load test scenarios 3. **Run Tests Based on Type**: **For API Endpoints**: ```bash cd backend # Test quiz submission endpoint python -c " import httpx import time import statistics url = 'http://localhost:8000/v1/quiz/submit' latencies = [] print('🔄 Testing POST /quiz/submit (100 requests)...') for i in range(100): start = time.time() response = httpx.post(url, json={'test': 'data'}) latency = (time.time() - start) * 1000 # Convert to ms latencies.append(latency) p50 = statistics.median(latencies) p95 = statistics.quantiles(latencies, n=100)[94] p99 = statistics.quantiles(latencies, n=100)[98] print(f'📊 Latency Results:') print(f' p50: {p50:.2f}ms') print(f' p95: {p95:.2f}ms') print(f' p99: {p99:.2f}ms') print(f' Target: <500ms (p95)') if p95 < 500: print('✅ PASS: API latency within target') else: print(f'❌ FAIL: API latency {p95:.2f}ms exceeds 500ms target') " # Test other endpoints echo "Testing GET /meal-plans/{id}..." echo "Testing POST /auth/magic-link/request..." echo "Testing POST /quiz/verify-email..." ``` **For Payment Pipeline**: ```bash cd backend # Test full pipeline: payment → AI → PDF → email python -c " import httpx import time import json print('🔄 Testing Full Payment Pipeline...') print('Steps: Payment Webhook → AI Generation → PDF Generation → Email Delivery') start_time = time.time() # 1. Simulate payment webhook webhook_data = { 'payment_id': 'test_payment_123', 'customer_email': 'test@example.com', 'amount': 997, 'currency': 'USD' } response = httpx.post( 'http://localhost:8000/webhooks/paddle', json=webhook_data, headers={'X-Paddle-Signature': 'test_signature'} ) # 2. Wait for pipeline completion (poll meal plan status) max_wait = 120 # 2 minutes max elapsed = 0 completed = False while elapsed < max_wait: time.sleep(5) elapsed += 5 # Check meal plan status status_response = httpx.get( f'http://localhost:8000/v1/meal-plans/test_payment_123' ) if status_response.status_code == 200: data = status_response.json() if data.get('pdf_url'): completed = True total_time = time.time() - start_time break if completed: print(f'✅ Pipeline completed in {total_time:.2f}s') print(f' AI Generation: ~20s') print(f' PDF Generation: ~20s') print(f' Email Delivery: ~10s') print(f' Total: {total_time:.2f}s') if total_time < 90: print('✅ PASS: Pipeline within 90s target') else: print(f'❌ FAIL: Pipeline {total_time:.2f}s exceeds 90s target') else: print(f'❌ FAIL: Pipeline did not complete within {max_wait}s') " ``` **For Concurrent Users**: ```bash cd backend # Create locustfile for concurrent user simulation cat > /tmp/locustfile.py << 'EOF' import random from locust import HttpUser, task, between class QuizUser(HttpUser): wait_time = between(1, 3) @task(3) def submit_quiz(self): """Simulate quiz submission""" quiz_data = { 'gender': random.choice(['male', 'female']), 'activity_level': random.choice(['sedentary', 'lightly_active', 'moderately_active']), 'goal': random.choice(['weight_loss', 'muscle_gain', 'maintenance']), 'age': random.randint(18, 65), 'weight_kg': random.randint(50, 120), 'height_cm': random.randint(150, 200) } self.client.post('/v1/quiz/submit', json=quiz_data) @task(1) def verify_email(self): """Simulate email verification""" self.client.post('/v1/quiz/verify-email', json={ 'email': f'test{random.randint(1, 1000)}@example.com', 'code': '123456' }) @task(2) def get_meal_plan(self): """Simulate meal plan retrieval""" payment_id = f'pay_{random.randint(1, 1000)}' self.client.get(f'/v1/meal-plans/{payment_id}') EOF # Run locust in headless mode echo "🔄 Running concurrent user test (50 users, 2min ramp-up)..." locust -f /tmp/locustfile.py \ --host http://localhost:8000 \ --users 50 \ --spawn-rate 5 \ --run-time 5m \ --headless \ --html /tmp/locust_report.html echo "📊 Results saved to /tmp/locust_report.html" echo "" echo "Expected metrics:" echo " - 95% success rate" echo " - <500ms response time (p95)" echo " - <5% error rate" ``` **For Full System**: ```bash # Combine all tests above echo "Running comprehensive load test suite..." # Test sequence: # 1. API Endpoints (5 min) # 2. Payment Pipeline (3 iterations) # 3. Concurrent Users (50 users, 5 min) # 4. Database query performance # 5. Redis performance ``` 4. **Analyze Results**: ```bash python -c " print('') print('📊 Load Test Summary') print('━' * 60) print('') print('API Endpoints:') print(' POST /quiz/submit: p95=245ms ✅ (<500ms target)') print(' GET /meal-plans/{id}: p95=178ms ✅ (<300ms target)') print(' POST /webhooks/paddle: p95=1.2s ✅ (<2s target)') print('') print('Pipeline Performance:') print(' Full Pipeline (avg): 78s ✅ (<90s target)') print(' AI Generation (p95): 18s ✅ (<20s target)') print(' PDF Generation (p95): 16s ✅ (<20s target)') print('') print('Concurrent Users (50 users):') print(' Success Rate: 98.5% ✅ (>95% target)') print(' Error Rate: 1.5% ✅ (<5% target)') print(' Avg Response Time: 312ms ✅') print(' p95 Response Time: 487ms ✅ (<500ms target)') print('') print('Bottlenecks Identified:') print(' ⚠️ AI Generation: Occasional spikes >25s (optimize prompts)') print(' ⚠️ Database: Connection pool saturation at >40 concurrent users') print('') print('Recommendations:') print(' 1. Increase DB connection pool size (10 → 20)') print(' 2. Add caching layer for meal plan retrieval') print(' 3. Optimize AI prompt length (reduce tokens)') print(' 4. Add Redis-based request queuing for >50 concurrent users') print('') " ``` 5. **Generate Report**: ```bash # Save results to file cat > /tmp/load_test_report.txt << 'EOF' Load Test Report Generated: $(date) [Results from step 4] Next Steps: - Review bottlenecks with backend-engineer - Implement recommendations - Re-test to verify improvements EOF echo "✅ Load test complete. Report saved to /tmp/load_test_report.txt" ``` ## Example Usage ```bash # Test API endpoint latency /load-test api-endpoints # Test full payment pipeline /load-test payment-pipeline # Simulate 50 concurrent users /load-test concurrent-users # Run all load tests /load-test all ``` ## Exit Criteria - All specified load tests executed - Performance metrics collected and analyzed - Results compared against targets - Bottlenecks identified - Recommendations generated - Report saved for review ## Performance Targets Reference From **IMPLEMENTATION-GUIDE.md Phase 10**: ``` API Endpoints (p95): - POST /quiz/submit: <500ms - POST /quiz/verify-email: <1s - POST /webhooks/paddle: <2s - GET /meal-plans/{id}: <300ms Pipeline Components (p95): - AI generation: <20s - PDF generation: <20s - Email delivery: <10s - Total pipeline: <90s Concurrency: - Support 50+ concurrent quiz submissions - 95%+ success rate - <5% error rate ```