---
name: log-analyze
description: Parse and analyze system and application logs. Use when the user says "find errors in logs", "analyze logs", "check journalctl", "what's in the logs", "debug from logs", or asks to investigate log files.
allowed-tools: Bash, Read, Grep
---

# Log Analysis

Parse and analyze logs to identify errors, patterns, and issues.

## Instructions

1. Identify log source (journalctl, file, application)
2. Establish time range of interest
3. Filter for relevant entries (errors, specific service)
4. Identify patterns and root causes
5. Summarize findings with evidence

## Common log sources

```bash
# Systemd journal (replace <service> with the unit name)
journalctl -u <service> --since "1 hour ago"
journalctl -p err --since today
journalctl -b  # current boot

# System logs
/var/log/syslog
/var/log/messages
/var/log/auth.log

# Application logs
/var/log/nginx/error.log
/var/log/apache2/error.log
/var/log/postgresql/
```

## Journalctl patterns

```bash
# Errors only
journalctl -p err -b

# Specific service within a time window
journalctl -u nginx --since "2024-01-01" --until "2024-01-02"

# Follow live
journalctl -f -u myapp

# Kernel messages
journalctl -k

# JSON output for parsing
journalctl -o json -u myapp | jq .

# Disk usage
journalctl --disk-usage
```

## Analysis patterns

```bash
# Count errors by type
grep -oP 'ERROR: \K[^:]+' app.log | sort | uniq -c | sort -rn

# Find IPs with most errors
grep "error" access.log | grep -oP '\d+\.\d+\.\d+\.\d+' | sort | uniq -c | sort -rn

# Time distribution of errors (per hour; assumes chronologically ordered lines)
grep "ERROR" app.log | grep -oP '^\d{4}-\d{2}-\d{2} \d{2}' | uniq -c

# Errors around a specific time
grep -A5 -B5 "15:30:" error.log
```

## Common error patterns

| Pattern               | Indicates                     |
| --------------------- | ----------------------------- |
| OOM, "Killed"         | Out of memory                 |
| ENOSPC                | Disk full                     |
| ECONNREFUSED          | Service not running/listening |
| ETIMEDOUT             | Network/firewall issue        |
| Permission denied     | File permissions or SELinux   |
| "too many open files" | ulimit exhausted              |

## Output format

```
## Summary
[Brief description of what was found]

## Errors Found
- [timestamp] [error message] (occurred N times)

## Root Cause Analysis
[Explanation of likely cause]

## Recommendations
1. [Action to fix]
2. [Preventive measure]
```

## Rules

- MUST establish time range before analyzing
- MUST quantify error frequency (not just "found errors")
- MUST provide specific log excerpts as evidence
- Never expose sensitive data from logs (passwords, tokens)
- Always check for time correlation between errors
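
The rules on quantifying frequency, checking time correlation, and hiding sensitive data can be sketched as two small shell helpers. This is only an illustration under stated assumptions: the function names `errors_per_hour` and `redact_secrets` are hypothetical, log lines are assumed to start with a `YYYY-MM-DD HH:MM:SS` timestamp, and secrets are assumed to appear as `password=`, `token=`, or `secret=` pairs. Real logs will often need different patterns.

```bash
#!/usr/bin/env bash
# Sketch only: assumes lines look like "2024-01-01 15:30:02 ERROR: db: timeout".

# Count ERROR lines per hour bucket ("YYYY-MM-DD HH"), reading from stdin.
# Output: "<count> <date> <hour>", sorted chronologically.
errors_per_hour() {
  awk '/ERROR/ { counts[substr($0, 1, 13)]++ }
       END { for (h in counts) print counts[h], h }' | sort -k2,3
}

# Mask the values of password=/token=/secret= pairs before quoting excerpts.
# The key names are an assumption; extend the alternation for other formats.
redact_secrets() {
  sed -E 's/((password|token|secret)=)[^[:space:]]+/\1[REDACTED]/g'
}
```

Possible usage: `errors_per_hour < app.log` gives per-hour counts to report, and `grep "ERROR" app.log | redact_secrets` produces excerpts that are safer to quote as evidence. Unlike the `uniq -c` pipeline, the `awk` version does not depend on the input being in chronological order.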