# Sockethub Performance & Stress Testing Comprehensive load testing suite for Sockethub server using Artillery. ## Quick Start ```bash # 1. Start dependencies (Redis + Prosody for XMPP tests) bun run docker:start:deps # 2. Start Sockethub locally bun run start # 3. Generate baseline (first time only) bun run stress:baseline # 4. Run tests bun run stress:performance # Performance tests (~10 min) bun run stress:stress # Stress tests (~15 min) bun run stress:soak # Soak test (30 min) bun run stress:all # All tests (~60 min) bun run stress:ci # CI smoke test (1-2 min) # 5. View results bun run stress:report --latest ``` ## What We Test ### Performance Tests (Baseline Metrics) **Goal**: Measure normal performance with realistic load - **Connection Baseline**: 100 connections over 2 minutes - **Message Throughput**: 50 clients × 10 msg/sec sustained - **Multi-Client Broadcast**: 20 clients in shared XMPP room **Platform**: Dummy (fast baseline), XMPP (real protocol) ### Stress Tests (Breaking Points) **Goal**: Find maximum capacity before system fails - **Connection Storm**: Ramp 0→1000 connections in 30 seconds - **Message Flood**: Ramp 1→100 msg/sec with XMPP protocol - **Feed Processing**: 50 clients requesting feeds simultaneously - **Spike Test**: Baseline → 10x spike → baseline recovery **Platform**: XMPP (realistic overhead), Feed (different pattern) ### Soak Test (Memory Leaks) **Goal**: Detect resource leaks over time - **Long Duration**: 30 minutes sustained load - **Memory Tracking**: Heap growth monitoring - **Platform Stability**: Process cleanup validation **Platform**: Dummy (fast) + XMPP (real-world) ### CI Smoke Test **Goal**: Quick crash detection (not performance measurement) - **Duration**: 1-2 minutes - **Purpose**: Verify server doesn't crash under basic load - **Metrics**: Ignored (CI hardware is unreliable) ## Architecture ``` Artillery (load generator) ↓ @sockethub/client (real client library) ↓ Socket.IO connections + ActivityStreams Sockethub Server (your code) ↓ Redis + Platforms (XMPP, Feed, etc.) ``` **What we test:** Full stack - client + server capacity under load **Why @sockethub/client:** - Ensures correct ActivityStreams message formatting - Tests the actual client library users will use - Proper error handling and validation **Testing local vs Docker:** - **Local**: `docker:start:deps` + `bun run start` (tests your working code) - **Docker**: `docker:start` (tests Dockerized build, different use case) ## Baselines ### How Baselines Work 1. **System Profile**: Unique ID per hardware (CPUs, RAM, hostname) 2. **Generation**: 3 warmup runs + 5 baseline runs → median 3. **Storage**: `baselines/{hostname}-{cpus}c-{memory}gb.json` 4. **Comparison**: ±15% tolerance for pass/fail ### When to Regenerate - First time running tests - After Sockethub version upgrade - After major refactors - Weekly trend tracking ```bash # Force regenerate baseline rm stress-tests/baselines/*.json bun run stress:baseline ``` ## Results Interpretation ### Good Performance ✓ - P95 latency within 15% of baseline - Error rate < 0.1% - Memory/CPU stable over time - Redis queue depth < 100 ### Warning Signs ⚠️ - P95 latency 15-30% above baseline - Error rate 0.1-0.5% - Gradual memory growth - Queue depth trending up ### Bad Performance ❌ - P99 latency > 2x baseline - Error rate > 1% - Memory leak (continuous growth) - Platform process crashes - Sockethub validation errors (malformed ActivityStreams) ## Error Handling Tests automatically detect and report: **Sockethub Errors (validation, processing):** - Credentials errors (authentication failures) - Echo/message errors (Dummy platform rejections) - XMPP errors (protocol failures) - Feed errors (fetch failures) **Test Failures:** - Tests fail if >5% of users encounter Sockethub errors - Tests fail if >50% Artillery connection errors - Tests fail if all connections fail **Error Output:** ``` Sockethub errors: 15 (3.2%) - Credentials errors: 5 - Echo errors: 10 ``` All errors are logged in real-time so you can see exactly what went wrong. ## Example Output ``` ╔══════════════════════════════════════════════════════════╗ ║ SOCKETHUB PERFORMANCE & STRESS TEST SUITE ║ ╚══════════════════════════════════════════════════════════╝ System: dev-machine (8 CPUs, 16GB RAM) Baseline: ✓ Found (2026-01-20, v5.0.0-alpha.10) [1/3] Performance: message-throughput Platform: dummy Status: ✓ PASS [2/3] Stress: connection-storm Platform: dummy Status: ✓ PASS Results: 3/3 PASSED Report: stress-tests/reports/2026-01-23-full-suite.json ``` ## CI Integration **Recommended:** Don't run full tests in CI (too slow, unreliable hardware) ### Option 1: No CI (recommended) - Run tests manually before releases - Track trends locally ### Option 2: Smoke Test Only ```yaml # .github/workflows/stress-test.yml stress-smoke: runs-on: ubuntu-latest steps: - run: bun run docker:start - run: bun run start & - run: bun run stress:ci # 1-2 min, crash detection only ``` **CI Mode Behavior:** - Ignores performance metrics (unreliable) - Only checks: error rate > 10% = fail - Purpose: Detect crashes, not measure speed ## File Structure ``` stress-tests/ ├── artillery/ │ ├── scenarios/ │ │ ├── performance/ # Baseline tests │ │ ├── stress/ # Capacity tests │ │ ├── soak/ # Memory leak tests │ │ └── ci/ # Smoke tests │ └── processors/ │ ├── activitystreams-validator.js │ └── metrics-collector.js ├── bun/ │ ├── system-profiler.ts │ ├── baseline-generator.ts │ └── baseline-comparator.ts ├── baselines/ # System-specific baselines (gitignored) ├── reports/ # Test results (gitignored) ├── config.ts ├── runner.ts ├── reporter.ts └── types.ts ``` ## Advanced Usage ### Run Single Scenario ```bash bunx artillery run stress-tests/artillery/scenarios/performance/message-throughput.yml ``` ### Custom Sockethub URL Edit `stress-tests/config.ts`: ```typescript export const DEFAULT_SOCKETHUB_URL = 'http://custom-host:10550'; ``` ### Adjust Tolerance Edit `stress-tests/config.ts`: ```typescript export const TOLERANCE_PCT = 20; // ±20% instead of ±15% ``` ## Troubleshooting ### "No baseline found" ```bash bun run stress:baseline ``` ### "Connection refused" Ensure dependencies and Sockethub are running: ```bash bun run docker:start:deps # Start Redis + Prosody bun run start # Start Sockethub locally ``` ### High error rates Check Redis is running: ```bash docker ps | grep redis ``` ### Artillery not found ```bash bun add -d artillery ``` ## Best Practices 1. **Test local code** - Use `docker:start:deps` + `bun run start`, not Docker Sockethub 2. **Run on dedicated hardware** (not laptop with throttling) 3. **Clear Redis between runs** for consistency 4. **Disable debug logging** in production mode 5. **Run full suite before releases** 6. **Track trends over time** (weekly runs) 7. **Document intentional tradeoffs** that affect performance ## Limitations - Tests server capacity (uses @sockethub/client for proper ActivityStreams formatting) - Baseline requires consistent hardware - CI results are unreliable (variable performance) - Long tests (soak) are expensive ## Next Steps - Add historical trend tracking - Implement HTML report generation - Add platform-specific metrics (XMPP connection pools, Feed cache hits) - Create Docker Compose profile for stress testing - Add profiling integration (flame graphs, heap snapshots)