--- name: pg-io-deep-dive description: Deep-dive investigation of PostgreSQL I/O patterns using targeted analysis types (buffer pool, tables, indexes, temp files, checkpoints). Covers pg_stat_io for PG16+, correlates I/O patterns with slow queries, and guides diagnosis of cache misses, temp file spills, and checkpoint storms. --- # PostgreSQL I/O Deep-Dive This skill guides a thorough investigation of PostgreSQL I/O performance using the full capabilities of the `analyze_disk_io_patterns` tool, correlating I/O data with query workload and configuration. ## When to Use This Skill Use this skill when the user: - Reports high disk I/O or slow disk performance - Sees queries with high `blks_read` or low cache hit ratios - Reports `temp_blks_written` in slow queries (work_mem too low) - Sees frequent checkpoint warnings in PostgreSQL logs - Has "IO" wait events from `analyze_wait_events` - Asks "why is my database doing so much disk I/O?" - Is sizing storage or evaluating SSD vs HDD performance ## MCP Resources Available - `pgtuner://health/cache` -- Quick cache hit ratio check (lightweight, no full health scan) - `pgtuner://settings/memory` -- Current memory configuration - `pgtuner://settings/checkpoint` -- Current checkpoint configuration - `pgtuner://table/{schema}/{table_name}/stats` -- Table-level I/O statistics ## Related MCP Prompt This skill extends the I/O analysis portions of the **`health_check`** and **`diagnose_slow_queries`** MCP Prompts. ## Prerequisites - The pgtuner-mcp server must be connected with a valid `DATABASE_URI` - PostgreSQL 12+ required. PostgreSQL 16+ recommended for `pg_stat_io` metrics. ## Agent Decision Tree: Where to Start ``` Is the I/O problem... | +--> GENERAL (overall slow, high iowait)? | --> Start with Step 1 (Buffer Pool) then Step 5 (Checkpoints) | +--> QUERY-SPECIFIC (one slow query with high blks_read)? | --> Start with Step 2 (Table I/O) then Step 3 (Index I/O) | +--> TEMP-FILE RELATED (sort/hash spilling to disk)? | --> Start with Step 4 (Temp Files) | +--> CHECKPOINT RELATED (log warnings about checkpoints)? | --> Start with Step 5 (Checkpoints) | +--> UNKNOWN? --> Start with Step 1 (Buffer Pool) for overall picture ``` ## I/O Investigation Workflow ### Step 1: Buffer Pool Analysis ``` Tool: analyze_disk_io_patterns Parameters: analysis_type: "buffer_pool" top_n: 20 ``` **What this reveals:** - Overall buffer cache hit ratio (shared buffers effectiveness) - Buffer allocation rate and eviction patterns - Backend writes vs checkpointer writes **Interpret results:** | Metric | Healthy | Warning | Critical | |--------|---------|---------|----------| | Cache hit ratio | > 99% | 95-99% | < 95% | | Backend writes > 0 | Rare | Occasional | Frequent (checkpointer not keeping up) | **IF cache hit ratio < 99%:** ``` Tool: review_settings Parameters: category: "memory" ``` Check if `shared_buffers` is undersized for the working set. **IF backend writes > 0:** Checkpointer/bgwriter is falling behind. Check `bgwriter_lru_maxpages` and checkpoint frequency. ### Step 2: Table I/O Hotspots ``` Tool: analyze_disk_io_patterns Parameters: analysis_type: "tables" schema_name: "public" top_n: 20 min_size_gb: 0.01 ``` **What this reveals:** - Per-table breakdown of `heap_blks_hit` vs `heap_blks_read` - Tables causing the most physical disk reads - Tables with poor cache hit ratios **Agent reasoning:** - Tables with high `heap_blks_read` relative to `heap_blks_hit`: Data is not fitting in shared_buffers - Very large tables with mostly sequential access: Consider partitioning or BRIN indexes - Tables with high `idx_blks_read`: Index is too large for buffer cache **Cross-reference with table stats:** For the top I/O-heavy tables: ``` Tool: get_table_stats Parameters: table_name: "" schema_name: "public" include_indexes: true order_by: "seq_scans" ``` IF `seq_scan` count is very high AND the table is large, the table likely needs better indexes to avoid full scans. ### Step 3: Index I/O Patterns ``` Tool: analyze_disk_io_patterns Parameters: analysis_type: "indexes" top_n: 20 min_size_gb: 0.01 ``` **What this reveals:** - Per-index physical read rates - Indexes that are too large for the buffer cache - Indexes that are read-heavy but rarely hit in cache **Agent reasoning:** - Bloated indexes have more pages to read. Cross-check: ``` Tool: analyze_index_bloat Parameters: table_name: "" schema_name: "public" min_index_size_gb: 0.01 min_bloat_percent: 20 ``` - Indexes on rarely-accessed columns waste buffer cache space. Cross-check with `find_unused_indexes`. ### Step 4: Temp File Analysis ``` Tool: analyze_disk_io_patterns Parameters: analysis_type: "temp_files" top_n: 20 ``` **What this reveals:** - Total temp files created and their aggregate size - Whether sorts and hash joins are spilling to disk **Correlate with slow queries:** ``` Tool: get_slow_queries Parameters: limit: 20 min_calls: 1 order_by: "mean_time" ``` Look for queries with `temp_blks_read > 0` or `temp_blks_written > 0` in the results. These queries are the ones spilling to disk. **Then analyze the worst offender:** ``` Tool: analyze_query Parameters: query: "" analyze: true buffers: true settings: true format: "text" ``` The `settings: true` parameter reveals if a session-level `SET work_mem` is in effect. The `buffers: true` shows temp buffer usage in the plan. **Look for in the plan:** - `Sort Method: external merge Disk: XXkB` -> `work_mem` too low for this sort - `Hash Batches: 16` -> Hash table spilling, increase `work_mem` - `Buffers: temp read=XX written=XX` -> Direct confirmation of temp file usage **Fix:** ```sql -- Session-level (test first): SET work_mem = '256MB'; -- Then re-run the query to verify temp files are eliminated. -- Global (if many queries affected): ALTER SYSTEM SET work_mem = '128MB'; SELECT pg_reload_conf(); ``` ### Step 5: Checkpoint I/O Analysis ``` Tool: analyze_disk_io_patterns Parameters: analysis_type: "checkpoints" top_n: 10 ``` **What this reveals:** - Timed checkpoints vs requested checkpoints ratio - Checkpoint write time and sync time - Buffers written per checkpoint **Interpret:** | Metric | Healthy | Problem | |--------|---------|---------| | Requested checkpoints | < 10% of total | > 20% (max_wal_size too small) | | Checkpoint write time | < 30 seconds | > 60 seconds (disk too slow or too much dirty data) | | Sync time | < 5 seconds | > 10 seconds (fsync bottleneck) | **IF requested checkpoints > 20%:** ``` Tool: review_settings Parameters: category: "checkpoint" ``` Recommend increasing `max_wal_size` (e.g., from 1GB to 4GB or 8GB). ### Step 6: pg_stat_io Analysis (PostgreSQL 16+ Only) The `analyze_disk_io_patterns` tool with `analysis_type: "all"` automatically includes `pg_stat_io` data when running on PostgreSQL 16+. This provides: - I/O statistics broken down by backend type (client backend, autovacuum, checkpointer, etc.) - Read/write/extend/fsync counts per object type (relation, temp relation, WAL) - Hit rates per context (normal, vacuum, bulkread, bulkwrite) **Agent decision:** If the PostgreSQL version is < 16, the `pg_stat_io` section will show a message saying it's not available. Skip this section and rely on the other analysis types. **What to look for in pg_stat_io:** - High `reads` in `bulkread` context: Sequential scans pulling large data volumes - High `extends` on temp relations: Confirm temp file spill issues (Step 4) - Autovacuum backend with high reads: Vacuum is scanning cold data, confirm with `monitor_vacuum_progress` ## Output Format ### I/O Analysis Report **Buffer Pool:** - Cache hit ratio: XX.X% - Status: Healthy / Warning / Critical - Recommendation: (if needed) **Table I/O Hotspots:** | Table | Size | Reads | Hits | Hit Ratio | Issue | |-------|------|-------|------|-----------|-------| | orders | 12GB | 450K | 2.1M | 82.4% | Low hit ratio -- undersized shared_buffers or missing index | **Temp File Impact:** - Total temp files since reset: X - Total temp bytes: X GB - Affected queries: list with `work_mem` recommendations **Checkpoint Performance:** - Requested: X% / Timed: X% - Avg write time: X sec - Recommendation: (if needed) **Action Items (prioritized):** 1. ... 2. ... ## Iterative Verification After applying any I/O-related fix: 1. **After increasing `shared_buffers`** (requires restart): Wait for cache to warm up (30+ minutes under normal workload), then re-run `analysis_type: "buffer_pool"`. 2. **After increasing `work_mem`**: Re-run the offending query with `analyze_query` (buffers: true) to confirm no more temp files. 3. **After increasing `max_wal_size`**: Monitor checkpoint frequency over 24 hours, then re-run `analysis_type: "checkpoints"`. ## When to Stop and Ask the User - **If cache hit ratio is fine (> 99%) but queries are still slow**: The problem is not I/O. Redirect to `pg-slow-query-diagnosis` skill. - **If temp file spill is caused by a single massive query**: Ask the user if increasing `work_mem` globally is acceptable, or suggest session-level `SET` for that specific query. - **If the server is on HDD and I/O is saturated**: Ask "Is migrating to SSD an option? Most I/O tuning has limited effect on saturated spinning disks." ## PostgreSQL Version Notes | Feature | Version | Notes | |---------|---------|-------| | `pg_stat_io` view | PG16+ | Detailed per-backend I/O statistics | | `wal_compression = lz4` | PG15+ | Reduces WAL I/O | | `BUFFERS` in EXPLAIN | PG9.4+ | Universally available | | `checkpoint_flush_after` | PG9.6+ | Limit OS page cache dirty page accumulation | ## Production Safety - All `analyze_disk_io_patterns` calls are read-only. No data is modified. - `analyze_query` with `analyze: true` executes the query. Use caution with write queries. - `review_settings` is read-only. - I/O statistics are cumulative since server start or last reset. Short uptime means less meaningful data.