---
name: diagnostics-runner
description: Run comprehensive system diagnostics including dependency checks, test suite execution, git status, and pipeline health verification. Use when troubleshooting issues, verifying system readiness, or preparing for a release.
---

# Diagnostics Runner Skill

Comprehensive system health checks and diagnostics for the VideoChunking project.

## What This Skill Does

This skill provides multi-level diagnostics to ensure system health:

1. **Dependency Verification**: Check all required packages and tools are installed
2. **System Health**: Verify FFmpeg, Ollama, and other components are functional
3. **Test Suite Execution**: Run pytest with coverage analysis
4. **Configuration Validation**: Check party configs, settings, and data files
5. **Git Status Review**: Report repository state and uncommitted changes
6. **Data Integrity**: Verify session data and knowledge base files
7. **Performance Benchmarks**: Test critical operations for performance issues
8. **Cleanup Recommendations**: Identify temporary files and optimization opportunities

## Diagnostic Levels

### Quick Check (30 seconds)

- FFmpeg availability
- Ollama status
- Critical dependencies
- Git branch and status

### Standard Check (2 minutes)

- All dependencies with versions
- System health (FFmpeg, Ollama, PyAnnote)
- Quick test run (no full suite)
- Configuration file validation
- Recent session data integrity

### Comprehensive Check (5-10 minutes)

- Full dependency audit
- Complete test suite with coverage
- All configuration validation
- All session data verification
- Performance benchmarks
- Disk space and resource analysis
- Detailed recommendations

## Usage

### Quick Diagnostics

User: "Check if the system is ready to process"
User: "Is everything working?"
User: "Quick health check"

### Standard Diagnostics

User: "Run diagnostics"
User: "Check system health"
User: "Verify the pipeline is healthy"

### Comprehensive Diagnostics

User: "Run full diagnostics"
User: "Complete system check before release"
User: "Thorough health check and tests"

## Command Reference

```bash
# Quick health check
python cli.py health

# MCP tool: Check pipeline health
# Via MCP: check_pipeline_health

# Run test suite
pytest --cov=src --cov-report=term

# MCP tool: Analyze test coverage
# Via MCP: analyze_test_coverage

# MCP tool: Run full diagnostics
# Via MCP: run_diagnostics_suite

# Validate specific configuration
python cli.py validate-config --party default

# MCP tool: Validate party config
# Via MCP: validate_party_config
```

## MCP Tool Integration

This skill orchestrates multiple MCP tools:

### check_pipeline_health

Returns:

```json
{
  "ffmpeg": true/false,
  "ollama": true/false,
  "ollama_models": "list of available models",
  "pyannote_models": true/false,
  "whisper": true/false,
  "dependencies": [
    {"package": "faster-whisper", "installed": true, "version": "1.2.0"},
    {"package": "pyannote.audio", "installed": true, "version": "4.0.1"},
    ...
  ]
}
```

### run_diagnostics_suite

Returns:

```json
{
  "timestamp": "2024-11-03T21:45:00",
  "python_version": "Python 3.10.6",
  "git_status": "modified files...",
  "test_status": "45 tests collected",
  "dependencies": "package status..."
}
```

### analyze_test_coverage

Runs full pytest with coverage and returns detailed report.

### validate_party_config

Validates party configuration files:

```json
{
  "configs": [
    {
      "file": "party_default.json",
      "valid": true,
      "player_count": 4,
      "character_count": 4,
      "errors": [],
      "warnings": []
    }
  ]
}
```

### list_available_models

Lists Ollama models for IC/OOC classification.

### list_processed_sessions

Shows recently processed sessions to verify data integrity.

## Diagnostic Components

### 1. Dependency Verification

Checks required Python packages:

```
✅ faster-whisper v1.2.0
✅ pyannote.audio v4.0.1
✅ gradio v5.49.1
✅ torch v2.9.0
✅ groq v0.32.0
✅ click v8.1.7
✅ rich v13.7.0
❌ opencv-python NOT INSTALLED
```

### 2. System Health

Verifies external tools:

```
FFmpeg:
  ✅ Installed: v6.0
  ✅ Accessible: f:/Repos/VideoChunking/ffmpeg/bin/ffmpeg.exe
  ✅ Features: Audio encoding, video encoding, filters

Ollama:
  ✅ Running: localhost:11434
  ✅ Models: mistral (active), llama3 (available)
  ✅ Response time: 124ms

PyAnnote:
  ✅ Models downloaded: yes
  ✅ Segmentation model: pyannote/segmentation
  ✅ Embedding model: pyannote/embedding
```

### 3. Test Suite Execution

Runs pytest and reports:

```
Test Results:
  ✅ Passed: 42
  ❌ Failed: 3
  ⚠️ Skipped: 2

Failed Tests:
  tests/test_classifier.py::test_ic_ooc_accuracy
  tests/test_diarization.py::test_speaker_count
  tests/test_knowledge_extractor.py::test_npc_extraction

Coverage: 78% (target: 80%)
  src/audio_processor.py: 95%
  src/transcriber.py: 92%
  src/classifier.py: 65% ⚠️ LOW
  src/knowledge_extractor.py: 58% ⚠️ LOW
```

### 4. Configuration Validation

Checks configuration files:

```
Party Configurations:
  ✅ party_default.json: Valid (4 players, 4 characters, 1 DM)
  ✅ party_oneshot.json: Valid (5 players, 5 characters, 1 DM)
  ❌ party_custom.json: INVALID - Missing 'dm' field

Settings:
  ✅ src/config.py: Valid
  ✅ .env.example: Valid
  ⚠️ .env: Not found (expected for production)
```

### 5. Git Status Review

Reports repository state:

```
Git Repository:
  Branch: main
  Ahead of remote: 3 commits
  Behind remote: 0 commits

Uncommitted Changes:
  Modified: src/classifier.py
  Modified: tests/test_classifier.py
  Untracked: temp_output/

Recommendations:
  ⚠️ Commit changes before processing new sessions
  ⚠️ Push commits to backup work
  ℹ️ .coverage file can be added to .gitignore
```

### 6. Data Integrity

Validates data files:

```
Campaign Knowledge:
  ✅ File exists: data/campaign_knowledge.json
  ✅ Valid JSON: yes
  ✅ Entity counts: 47 NPCs, 23 locations, 15 quests
  ✅ Last updated: 2024-11-02 18:45:23

Party Configurations:
  ✅ Files: 2 valid, 1 invalid
  ⚠️ party_custom.json has validation errors

Recent Sessions:
  ✅ session_010: Complete (all output files present)
  ✅ session_011: Complete (all output files present)
  ⚠️ session_012: Incomplete (missing knowledge extraction)

Recommendations:
  ⚠️ Re-run knowledge extraction for session_012
  ⚠️ Fix validation errors in party_custom.json
```

### 7. Performance Benchmarks

Tests critical operations:

```
Performance Benchmarks:
  Audio extraction (10s video): 1.2s ✅
  Transcription (1min audio): 8.4s ✅
  Diarization (1min audio): 12.1s ✅
  Classification (100 segments): 3.7s ✅
  Knowledge extraction: 5.2s ✅

Memory Usage:
  Idle: 450MB
  Transcription: 2.1GB ✅
  Diarization: 1.8GB ✅
  Peak: 2.8GB ✅ (within 16GB available)
```

### 8. Cleanup Recommendations

Identifies optimization opportunities:

```
Disk Space:
  Total: 512GB
  Used: 387GB
  Available: 125GB

Temporary Files:
  output/temp/: 2.3GB (45 files)
  .pytest_cache/: 12MB
  __pycache__/: 8MB

Recommendations:
  ℹ️ Clean temporary output files: rm -rf output/temp/
  ℹ️ Remove pytest cache: rm -rf .pytest_cache/
  ℹ️ Clear Python cache: find . -type d -name __pycache__ -exec rm -rf {} +
  ⚠️ Low disk space: Consider archiving old session outputs
```

## Output Formats

### Summary View (Default)

Concise overview suitable for quick checks:

```
System Health: ✅ HEALTHY
Dependencies: ✅ ALL INSTALLED
Tests: ⚠️ 3 FAILED (42 passed, 78% coverage)
Configuration: ⚠️ 1 INVALID (2 valid)
Data: ✅ INTACT
Git: ⚠️ UNCOMMITTED CHANGES

Action Required:
  1. Fix 3 failing tests
  2. Validate party_custom.json
  3. Commit pending changes
```

### Detailed View

Expanded report with specifics for each component.
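
Both text views can be treated as two renderings of the same per-component results, with the detailed view adding indented specifics under each summary line. The following is a minimal sketch of that idea, not the project's actual renderer: the `render_summary` and `render_detailed` helpers, the `label` and `details` fields, and the status-to-icon mapping are all assumptions made for illustration.

```python
# Hypothetical helpers: names and input shape are assumptions for illustration.
ICONS = {"ok": "✅", "warning": "⚠️", "error": "❌"}


def render_summary(components):
    """Return one line per component: name, status icon, short label."""
    lines = []
    for name, info in components.items():
        icon = ICONS.get(info["status"], "?")  # unknown statuses render as "?"
        lines.append(f"{name}: {icon} {info['label']}")
    return "\n".join(lines)


def render_detailed(components):
    """Return each summary line followed by its indented detail lines."""
    lines = []
    for name, info in components.items():
        icon = ICONS.get(info["status"], "?")
        lines.append(f"{name}: {icon} {info['label']}")
        for detail in info.get("details", []):
            lines.append(f"  - {detail}")
    return "\n".join(lines)


# Sample input reusing values from the summary view example.
sample = {
    "Tests": {
        "status": "warning",
        "label": "3 FAILED (42 passed, 78% coverage)",
        "details": ["tests/test_classifier.py::test_ic_ooc_accuracy"],
    },
    "Data": {"status": "ok", "label": "INTACT"},
}

print(render_summary(sample))
print(render_detailed(sample))
```

Keeping both renderers on one input structure means a new component only has to be described once to appear in every view.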
### JSON Export

Machine-readable format:

```json
{
  "timestamp": "2024-11-03T21:45:00Z",
  "overall_health": "warning",
  "components": {
    "dependencies": {"status": "ok", "details": {...}},
    "system_health": {"status": "ok", "details": {...}},
    "tests": {"status": "warning", "failed": 3, "passed": 42},
    "configuration": {"status": "warning", "invalid": 1},
    "data": {"status": "ok", "details": {...}},
    "git": {"status": "warning", "uncommitted": 2}
  },
  "recommendations": [...]
}
```

## Automated Diagnostics

### Pre-Commit Hook

Run quick diagnostics before commits:

```bash
#!/bin/sh
# .git/hooks/pre-commit (must be executable: chmod +x .git/hooks/pre-commit)
set -e  # abort the commit if any check fails
pytest tests/ --tb=short
python cli.py health --quick
```

### CI/CD Integration

Run comprehensive diagnostics in CI:

```yaml
# .github/workflows/diagnostics.yml
- name: Run diagnostics
  run: python cli.py diagnostics --comprehensive --json > report.json
```

### Scheduled Health Checks

Periodic system verification:

```bash
# Cron job or Windows Task Scheduler
0 9 * * * cd /path/to/project && python cli.py health --email-report
```

## Troubleshooting Guide

Based on diagnostic results:

### FFmpeg Not Found

```
Symptoms: ffmpeg=false in health check
Solutions:
1. Install FFmpeg
2. Add FFmpeg to system PATH
3. Update config with FFmpeg path
4. Run: ffmpeg -version to verify
```

### Ollama Not Running

```
Symptoms: ollama=false in health check
Solutions:
1. Start Ollama: ollama serve
2. Check port 11434 is available
3. Verify model is pulled: ollama list
4. Test: curl http://localhost:11434/api/tags
```

### Tests Failing

```
Symptoms: test_status shows failures
Solutions:
1. Read failure messages carefully
2. Run specific failed test: pytest tests/test_X.py -v
3. Check if data dependencies exist
4. Review recent code changes
5. Update test fixtures if needed
```

### Low Coverage

```
Symptoms: coverage <80%
Solutions:
1. Identify uncovered code: pytest --cov-report=html
2. Open htmlcov/index.html in browser
3. Write tests for uncovered functions
4. Add edge case tests
```

### Invalid Configuration

```
Symptoms: Configuration validation errors
Solutions:
1. Review error messages
2. Compare against the JSON schema
3. Check for missing required fields
4. Verify JSON syntax is valid
5. Use JSON validator tool
```

### Data Integrity Issues

```
Symptoms: Corrupt or missing session data
Solutions:
1. Re-process affected sessions
2. Restore from backups if available
3. Manually correct JSON files
4. Run data migration scripts
```

## Best Practices

1. **Regular Checks**: Run diagnostics weekly or before major processing batches
2. **Before Releases**: Always run comprehensive diagnostics before tagging releases
3. **After Updates**: Check system health after dependency updates
4. **Monitor Trends**: Track test coverage and performance over time
5. **Document Issues**: Log diagnostic results when reporting bugs
6. **Automate**: Set up pre-commit hooks and CI checks
7. **Review Recommendations**: Act on cleanup and optimization suggestions

## Integration with Other Skills

- **test-pipeline**: Focuses specifically on test execution and coverage
- **debug-ffmpeg**: Deep dive into FFmpeg-specific issues
- **party-validator**: Detailed party configuration validation
- **session-processor**: Use diagnostics before starting session processing
- **campaign-analyzer**: Verify knowledge base integrity

## Example Workflows

### Pre-Processing Workflow

```
User: "I'm about to process several new sessions. Is the system ready?"

Assistant uses diagnostics-runner:
1. Runs check_pipeline_health MCP tool
2. Validates party configurations
3. Checks recent session data integrity
4. Verifies sufficient disk space
5. Reviews git status for uncommitted changes
6. Reports: "System healthy, ready to process"
```

### Troubleshooting Workflow

```
User: "Session processing failed. Help me figure out why."

Assistant uses diagnostics-runner:
1. Runs comprehensive diagnostics
2. Identifies: Ollama not running
3. Provides solution: Start Ollama service
4. Verifies: Re-runs health check
5. Confirms: "Ollama now running, ready to retry"
```

### Release Preparation Workflow

```
User: "Preparing for v2.0 release. Run full diagnostics."

Assistant uses diagnostics-runner:
1. Runs comprehensive diagnostics
2. Executes full test suite with coverage
3. Validates all configurations
4. Checks git status
5. Generates JSON report
6. Provides release readiness assessment
```

## Advanced Features

### Custom Diagnostic Scripts

```python
# custom_diagnostics.py
from src.diagnostics import DiagnosticsRunner


def send_to_datadog(results):
    """Placeholder (not part of the project): forward results to your monitoring system."""
    print(f"Would send {len(results)} diagnostic fields to monitoring")


runner = DiagnosticsRunner()
results = runner.run_all()

# Custom checks
if results['disk_space_gb'] < 50:
    print("WARNING: Low disk space")

# Send to monitoring system
send_to_datadog(results)
```

### Diagnostic Dashboard

Create a monitoring dashboard:

```bash
# Generate HTML report
python cli.py diagnostics --html > diagnostics.html

# Serve with Python
python -m http.server 8000

# Open: http://localhost:8000/diagnostics.html
```

### Integration with Monitoring

```bash
# Export metrics
python cli.py diagnostics --prometheus > metrics.txt

# Scrape with Prometheus for alerting
```
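
The exact output of `--prometheus` is not shown here, so the following is only a sketch of how a JSON export could be mapped to the Prometheus text exposition format. It assumes the `components` shape from the JSON export example; the metric name `diagnostics_component_status`, the `to_prometheus` helper, and the numeric status encoding are hypothetical choices, not the project's actual exporter.

```python
import json

# Assumed encoding of status strings as gauge values (not defined by the project).
STATUS_VALUES = {"ok": 0, "warning": 1, "error": 2}


def to_prometheus(report):
    """Convert a diagnostics report dict into Prometheus text exposition lines."""
    lines = [
        "# HELP diagnostics_component_status Component status (0=ok, 1=warning, 2=error)",
        "# TYPE diagnostics_component_status gauge",
    ]
    # Sort components so the output is stable between runs.
    for name, info in sorted(report["components"].items()):
        # Unknown status strings are treated as errors (value 2).
        value = STATUS_VALUES.get(info["status"], 2)
        lines.append(f'diagnostics_component_status{{component="{name}"}} {value}')
    return "\n".join(lines) + "\n"


# Sample report trimmed from the JSON export example.
report = json.loads(
    '{"overall_health": "warning", "components": {'
    '"tests": {"status": "warning", "failed": 3, "passed": 42}, '
    '"git": {"status": "warning", "uncommitted": 2}, '
    '"data": {"status": "ok"}}}'
)
print(to_prometheus(report), end="")
```

A single gauge labeled by component keeps alert rules simple, for example alerting whenever any `diagnostics_component_status` value is above 0.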