---
name: postgres-optimization
description: Unconventional PostgreSQL optimization techniques
license: MIT
tier: 2
allowed-tools:
  - read_file
  - write_file
  - run_terminal_cmd
  - grep
related: [debugging, plan-then-execute, robust-first]
tags: [moollm, database, postgresql, performance, optimization, indexing]
inputs:
  query:
    type: string
    required: false
    description: "Query to optimize"
  table:
    type: string
    required: false
    description: "Table to analyze"
outputs:
  - OPTIMIZATION.md
  - EXPLAIN-ANALYSIS.txt
credits:
  source:
    title: "Unconventional PostgreSQL Optimizations"
    author: "Haki Benita"
    url: "https://hakibenita.com/postgresql-unconventional-optimizations"
---

# 🐘 PostgreSQL Optimization

> **"Beyond 'just add an index' — creative solutions for real performance problems."**

Unconventional optimization techniques for PostgreSQL that go beyond standard DBA playbooks.

## Purpose

When conventional approaches fall short — query rewrites, adding indexes, VACUUM, ANALYZE — these techniques offer creative solutions:

- Eliminate impossible query scans with constraint exclusion
- Reduce index size with function-based indexes
- Enforce uniqueness with hash indexes instead of B-Trees

## When to Use

- Ad-hoc query environments where users make mistakes
- Large indexes approaching table size
- Uniqueness constraints on large text values (URLs, documents)
- Timestamp columns queried at coarser granularity

---

## Technique 1: Constraint Exclusion

### The Problem

Check constraints prevent invalid data, but PostgreSQL doesn't use them to optimize queries by default.

```sql
CREATE TABLE users (
    id INT PRIMARY KEY,
    username TEXT NOT NULL,
    plan TEXT NOT NULL,
    CONSTRAINT plan_check CHECK (plan IN ('free', 'pro'))
);
```

An analyst writes:

```sql
SELECT * FROM users WHERE plan = 'Pro';  -- Note: capital P
```

Despite the check constraint making this condition impossible, PostgreSQL scans the entire table.

### The Solution

```sql
SET constraint_exclusion TO 'on';
```

With constraint exclusion enabled:

```sql
EXPLAIN ANALYZE SELECT * FROM users WHERE plan = 'Pro';
```

```
Result  (cost=0.00..0.00 rows=0 width=0)
  One-Time Filter: false
Execution Time: 0.008 ms
```

PostgreSQL recognizes the condition contradicts the constraint and skips the scan entirely.

### When to Enable

| Environment | Recommendation |
|-------------|----------------|
| OLTP production | Leave as 'partition' (default) |
| BI / Data Warehouse | Set to 'on' |
| Ad-hoc query tools | Set to 'on' |
| Reporting databases | Set to 'on' |

### Tradeoffs

- **Benefit**: Eliminates impossible query scans
- **Cost**: Extra planning overhead evaluating constraints against conditions
- **Default**: 'partition' — only used for partition pruning

---

## Technique 2: Function-Based Indexes for Lower Cardinality

### The Problem

You have a sales table with timestamps:

```sql
CREATE TABLE sale (
    id INT PRIMARY KEY,
    sold_at TIMESTAMPTZ NOT NULL,
    charged INT NOT NULL
);
```

Analysts query by day:

```sql
SELECT date_trunc('day', sold_at AT TIME ZONE 'UTC'), SUM(charged)
FROM sale
WHERE sold_at BETWEEN '2025-01-01 UTC' AND '2025-02-01 UTC'
GROUP BY 1;
```

You add a B-Tree index on `sold_at` — 214 MB for a 160 MB table. The index is almost half the table size!

### The Solution

Index only what queries need:

```sql
CREATE INDEX sale_sold_at_date_ix 
ON sale((date_trunc('day', sold_at AT TIME ZONE 'UTC'))::date);
```

| Index | Size |
|-------|------|
| `sale_sold_at_ix` (full timestamp) | 214 MB |
| `sale_sold_at_date_ix` (date only) | 66 MB |

The function-based index is **3x smaller** because:
- Dates are 4 bytes vs 8 bytes for timestamptz
- Fewer distinct values enable deduplication

### The Discipline Problem

Function-based indexes require exact expression match:

```sql
-- Uses the index ✓
WHERE date_trunc('day', sold_at AT TIME ZONE 'UTC')::date 
      BETWEEN '2025-01-01' AND '2025-01-31'

-- Does NOT use the index ✗
WHERE (sold_at AT TIME ZONE 'UTC')::date 
      BETWEEN '2025-01-01' AND '2025-01-31'
```

### Solution: Virtual Generated Columns (PostgreSQL 18+)

```sql
ALTER TABLE sale ADD sold_at_date DATE
GENERATED ALWAYS AS (date_trunc('day', sold_at AT TIME ZONE 'UTC'));
```

Now queries use the virtual column:

```sql
SELECT sold_at_date, SUM(charged)
FROM sale
WHERE sold_at_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY 1;
```

**Benefits:**
- Smaller index
- Faster queries
- No discipline required — column guarantees correct expression
- No ambiguity about timezones

**Limitation:** PostgreSQL 18 doesn't support indexes directly on virtual columns (yet).

---

## Technique 3: Hash Index for Uniqueness

### The Problem

You have a table with large URLs:

```sql
CREATE TABLE urls (
    id INT PRIMARY KEY,
    url TEXT NOT NULL,
    data JSON
);
```

You add a unique B-Tree index:

```sql
CREATE UNIQUE INDEX urls_url_unique_ix ON urls(url);
```

| Size |
|------|
| Table: 160 MB |
| B-Tree index: 154 MB |

The index is almost as large as the table because B-Tree stores actual values in leaf blocks.

### The Solution

Use an exclusion constraint with a hash index:

```sql
ALTER TABLE urls 
ADD CONSTRAINT urls_url_unique_hash 
EXCLUDE USING HASH (url WITH =);
```

| Index | Size |
|-------|------|
| B-Tree | 154 MB |
| Hash | 32 MB |

The hash index is **5x smaller** because it stores hash values, not the actual URLs.

### Uniqueness Is Enforced

```sql
INSERT INTO urls (id, url) VALUES (1000002, 'https://example.com');
-- ERROR: conflicting key value violates exclusion constraint
```

### Queries Still Fast

```sql
EXPLAIN ANALYZE SELECT * FROM urls WHERE url = 'https://example.com';
```

```
Index Scan using urls_url_unique_hash on urls
Execution Time: 0.022 ms  -- Faster than B-Tree's 0.046 ms!
```

### Limitations

| Feature | B-Tree Unique | Hash Exclusion |
|---------|--------------|----------------|
| Foreign key reference | ✓ | ✗ |
| `ON CONFLICT (column)` | ✓ | ✗ |
| `ON CONFLICT ON CONSTRAINT` | ✓ | ✓ (DO NOTHING only) |
| `ON CONFLICT DO UPDATE` | ✓ | ✗ |
| `MERGE` | ✓ | ✓ |

### Workaround: Use MERGE

Instead of `INSERT ... ON CONFLICT DO UPDATE`:

```sql
MERGE INTO urls t
USING (VALUES (1000004, 'https://example.com')) AS s(id, url)
ON t.url = s.url
WHEN MATCHED THEN UPDATE SET id = s.id
WHEN NOT MATCHED THEN INSERT (id, url) VALUES (s.id, s.url);
```

---

## Quick Reference

### Diagnostic Queries

**Check index sizes:**
```sql
\di+ table_*
```

**Compare index to table size:**
```sql
SELECT 
    relname AS name,
    pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class 
WHERE relname LIKE 'your_table%'
ORDER BY pg_relation_size(oid) DESC;
```

**Check constraint_exclusion setting:**
```sql
SHOW constraint_exclusion;
```

### Decision Tree

```
Is the query scanning impossibly?
├── Yes → Enable constraint_exclusion
└── No
    ↓
Is index nearly as large as table?
├── Yes, timestamp column → Function-based index on date
├── Yes, large text column → Hash exclusion constraint
└── No → Standard B-Tree is fine
```

---

## Commands

| Command | Action |
|---------|--------|
| `ANALYZE [table]` | Analyze query performance |
| `CHECK-CONSTRAINTS` | Evaluate constraint exclusion opportunity |
| `LOWER-CARDINALITY` | Find function-based index opportunities |
| `HASH-UNIQUE` | Evaluate hash index for large values |
| `COMPARE-INDEXES` | Compare index sizes and performance |

---

## Integration

| Direction | Skill | Relationship |
|-----------|-------|--------------|
| ← | [debugging](../debugging/) | Query debugging leads here |
| → | [plan-then-execute](../plan-then-execute/) | Systematic optimization |