--- name: full-stack-debugger description: This skill should be used when debugging full-stack issues that span UI, backend, and database layers. It provides a systematic workflow to detect errors, analyze root causes, apply fixes iteratively, and verify solutions through automated server restarts and browser-based testing. Ideal for scenarios like failing schedulers, import errors, database issues, or API payload problems where issues originate in backend code but manifest in the UI. --- # Full Stack Debugger ## Overview The Full Stack Debugger enables systematic debugging of issues across the entire application stack (UI/Frontend, Backend/API, Database/State). It combines browser testing, log analysis, code examination, and automated server restart/verification to iteratively identify and fix issues one at a time until the system is fully operational. This skill uses a proven workflow: **Detection → Analysis → Fix → Restart → Verification → Iteration** to systematically resolve issues that developers encounter during development and testing. ## When to Use This Skill Trigger this skill when observing: - Error states in the UI (dashboard, buttons failing, status showing errors) - Repeated failures in backend logs (task execution failures, import errors, database errors) - Unexpected database state (rows showing failed status when they should succeed) - API endpoints returning errors or unexpected responses - Services failing to initialize or process tasks - Cascading failures across multiple components ## Debugging Workflow ### Phase 1: Detection Detect errors from multiple sources: **Browser UI Detection:** - Navigate to the affected page/feature in the browser - Check for error messages, red warning states, or disabled functionality - Read console error messages using DevTools - Note the specific UI state and what action triggered the error **Backend Log Detection:** - Query recent error logs using `tail -200 /path/to/logs/errors.log` - Search for error patterns related to the issue using `grep` - Note error timestamps, error messages, and stack traces - Look for repeated errors (indicates systemic issue) **Database State Detection:** - Query the database directly using sqlite3 - Check status of recent tasks, transactions, or records - Look for failed, incomplete, or error states - Note which records are affected and what their states are **Example:** When debugging a scheduler failure: 1. Navigate to System Health dashboard 2. Observe scheduler showing "0 done" or "X failed" 3. Check `/logs/errors.log` for error messages 4. Query `queue_tasks` table to see failed task records ### Phase 2: Analysis Analyze root causes by reading code and logs: **Code Analysis:** - Read the error file/module indicated in error stack traces - Check imports - look for missing `from X import Y` statements - Check class names - verify instantiation matches actual class names - Look for syntax errors - unmatched quotes, unclosed parentheses - Check function signatures - ensure payloads match expected parameters - Read reference documentation (`references/common_errors.md`) for error patterns **Log Analysis:** - Extract error messages from logs - Look for patterns like `'optional'` (missing import), `unterminated string` (syntax error), `'attribute'` (wrong class name) - Trace error propagation backward to find the originating issue - Check timestamps - multiple errors at same time indicate batch failure **API/Payload Analysis:** - Check what payload the API is sending to task handlers - Read the task handler code to see what fields it expects - Compare actual payload vs expected payload - Look for missing required fields **Example:** When debugging "name 'Optional' is not defined": 1. Find the file mentioned in error (`analysis_executor.py`) 2. Read the imports section 3. Notice `Optional` is used but not imported 4. Check line 14: `from typing import Dict, List, Any` - missing `Optional` 5. Fix: Add `Optional` to the import statement ### Phase 3: Fix (One Issue at a Time) Apply fixes one issue per iteration: **Before Fixing:** - Verify this is the first/next issue to fix - Read the relevant code section carefully - Use the fix patterns from `references/fix_templates.md` **Common Fix Patterns:** - **Missing imports:** Add to import statement (e.g., `from typing import Optional`) - **Wrong class name:** Update import and instantiation to match actual class - **Missing docstring quotes:** Add opening `"""` to docstring - **Wrong payload fields:** Add missing required fields to payload dictionary - **Syntax errors:** Fix unmatched quotes, parentheses, brackets **After Fixing:** - Read back the changed code to verify syntax - Check the edit was correct (line numbers, indentation) - Only fix ONE issue, even if multiple exist - don't cascade fixes - Document what was changed in a clear comment **Example Fix:** ```python # BEFORE from typing import Dict, List, Any # AFTER from typing import Dict, List, Any, Optional ``` ### Phase 4: Restart (Automated) Restart the backend server after each fix: ```bash # Kill existing processes lsof -ti:8000 | xargs kill -9 2>/dev/null # Clear Python bytecode cache find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null find . -type f -name "*.pyc" -delete 2>/dev/null # Restart backend sleep 3 && python -m src.main --command web > /tmp/backend_restart.log 2>&1 & sleep 10 # Wait for startup # Verify health curl -m 5 http://localhost:8000/api/health ``` ### Phase 5: Verification Verify the fix worked through multiple checks: **Health Check:** - Call `/api/health` endpoint - Verify `"status": "healthy"` - If still failing, check logs for new errors **Browser Verification:** - Navigate to the affected UI page - Trigger the action that previously failed - Verify the error is gone - Check for new errors in console **Database Verification:** - Query the affected records/tasks - Verify status changed from failed/error to success/completed - Check that metrics updated (e.g., scheduler shows "1 done" instead of "0 done") **Log Verification:** - Check recent logs for the same error - Verify no new errors appeared - Look for success messages or "completed" status **Example:** - Scheduler should show "1 done" instead of "0 done" - Task record should show status="completed" instead of "failed" - No error messages in logs - WebSocket shows healthy status in UI ### Phase 6: Iteration If issues remain, repeat the cycle: 1. **Continue if more issues exist:** - Check logs for remaining errors - If yes, return to Phase 2 (Analysis) - Fix the next issue (Phase 3) - Restart (Phase 4) - Verify (Phase 5) 2. **Stop when all issues fixed:** - All schedulers show completed execution counts - UI shows no error states - Logs show no error patterns - Tasks/records show success status - Full verification complete ## Common Error Patterns See `references/common_errors.md` for patterns to recognize: - Python syntax errors (unterminated strings, missing quotes) - Import errors (`name 'X' is not defined`, `cannot import name 'Y'`) - Class/attribute errors (`'dict' object has no attribute 'symbol'`) - Type errors (passing wrong data type) - Payload/configuration errors (missing required fields) ## Fix Templates See `references/fix_templates.md` for ready-to-use fix patterns: - How to add missing imports - How to fix class name mismatches - How to fix docstring syntax - How to add missing payload fields - How to fix type errors ## Tools Used - **Playwright Browser Tools:** Navigate UI, verify changes - **Read/Grep Tools:** Examine code and logs - **Bash:** Server restart, cache clearing, health checks - **Edit Tool:** Apply code fixes - **Database Queries:** Verify task/record state ## MCP Tools Integration Use robo-trader-dev MCP tools for 95%+ token-efficient debugging: | Task | MCP Tool | Token Savings | Usage | |------|----------|---------------|-------| | Analyze error logs | `mcp__robo-trader-dev__analyze_logs` | 98% | Pattern detection with time windows | | System health check | `mcp__robo-trader-dev__check_system_health` | 97% | Database, queues, API, disk status | | Diagnose DB locks | `mcp__robo-trader-dev__diagnose_database_locks` | 95% | Correlate logs with code patterns | | Queue monitoring | `mcp__robo-trader-dev__queue_status` | 96% | Real-time queue backlog analysis | | Coordinator status | `mcp__robo-trader-dev__coordinator_status` | 94% | Init status, error details | | Error pattern fix | `mcp__robo-trader-dev__suggest_fix` | 90% | Known pattern matching with examples | | Read code files | `mcp__robo-trader-dev__smart_file_read` | 85% | Progressive context (summary/targeted/full) | | Find related files | `mcp__robo-trader-dev__find_related_files` | 88% | Import/git/similarity analysis | **Example debugging workflow**: ```python # 1. Detect errors (MCP instead of tail/grep) mcp__robo-trader-dev__analyze_logs(patterns=["ERROR", "TIMEOUT"], time_window="1h") # 2. Check system health (MCP instead of curl loops) mcp__robo-trader-dev__check_system_health(components=["database", "queues", "api_endpoints"]) # 3. Diagnose specific issue (MCP instead of sqlite3 + code reading) mcp__robo-trader-dev__diagnose_database_locks(time_window="24h", include_code_references=True) # 4. Get fix suggestions (MCP instead of manual pattern matching) mcp__robo-trader-dev__suggest_fix(error_message="name 'Optional' is not defined", context_file="src/services/analyzer.py") ``` **Integration with robo-trader architecture**: - Queue operations: Use `queue_status` to monitor PORTFOLIO_SYNC, DATA_FETCHER, AI_ANALYSIS - Coordinator debugging: Use `coordinator_status` for BroadcastCoordinator, AIChatCoordinator init issues - Database access: Use `query_portfolio` or `diagnose_database_locks` instead of direct sqlite3 connections ## Key Principles 1. **One issue at a time** - Fix one problem per iteration to prevent cascading failures 2. **Verify immediately** - Always restart and verify after each fix 3. **Multi-layer detection** - Check UI, logs, and database for clues 4. **Iterative refinement** - Continue until all issues resolved 5. **Automated restart** - Always use clean restart (kill + cache clear + restart) 6. **Browser verification** - Always test in actual UI, not just logs