PostgreSQL Query Performance: Indexes, Read/Write Trade-offs, and Advanced Concepts
A solid schema alone is not enough. As queries grow, planning, statistics, and indexes decide whether the database feels fast or fragile. This post covers PostgreSQL index types, when they help, and the read vs. write trade-off in practical terms.
Query Engine in Brief
Before executing a query, PostgreSQL estimates costs for several paths (heuristics + table statistics). Common access methods:
- Seq Scan: Reads the whole table — fine for small tables or when indexes would not pay off.
- Index Scan / Index Only Scan: Reaches rows via the index, or satisfies the query from the index alone.
- Bitmap Index Scan: Often used to combine multiple index predicates.
To see real behaviour, use EXPLAIN (ANALYZE, BUFFERS). Large gaps between estimated and actual rows usually point to stale statistics or plan quality issues.
B-tree: The Default Workhorse
Plain CREATE INDEX builds a B-tree. It fits equality (=) and range predicates (<, >, BETWEEN, ORDER BY).
Where it shines:
- Primary keys and uniqueness
- Filters on timestamps or surrogate IDs
- Join keys on the smaller side of a join
Caveat: Very low selectivity (e.g. WHERE status = 'active' when 90% of rows are active) may still favour a seq scan — the planner is often right.
Composite Indexes
Column order matters. For an index on (a, b, c):
WHERE a = ?— usableWHERE a = ? AND b = ?— usableWHERE b = ?— usually not without a leading predicate ona
Put the most selective and most frequently filtered column first; align with ORDER BY when it reduces sort cost.
Partial Indexes
Index only a subset of rows:
CREATE INDEX idx_orders_open ON orders (created_at)
WHERE status IN ('pending', 'processing');
Benefits: Smaller structure, less disk, cheaper maintenance, targeted reads for hot subsets.
Requirement: Query predicates must match the index predicate; otherwise the planner may ignore the index.
Covering Indexes and INCLUDE
When the query needs few columns, an Index Only Scan may be possible. Since PostgreSQL 11:
CREATE INDEX idx_users_email ON users (email) INCLUDE (display_name, id);
INCLUDE columns are not part of the key ordering; they live in leaf entries — wider keys without full sort participation.
Advanced Types: GIN, GiST, BRIN (Cheat Sheet)
| Type | Typical use |
|---|---|
| GIN | jsonb, full-text, arrays — high fan-out structures |
| GiST | geometry, some full-text and range types |
| BRIN | very large, naturally ordered tables (e.g. time series) — tiny footprint |
BRIN is coarse; it assumes physical correlation with the indexed column.
Read-Side Impact
- Few rows returned: indexes usually reduce latency (random heap fetches still add cost).
- Many rows returned: a sequential scan (sometimes parallel) can be cheaper.
- Join pipelines: the right index on the right relation keeps nested-loop cost under control.
Practice: inspect shared hit vs read in EXPLAIN (ANALYZE, BUFFERS); test with a cold cache when relevant.
Write-Side Impact
Every extra index means:
- INSERT: one write per index.
- UPDATE: updates to indexed columns touch those indexes.
- DELETE: removal from every relevant index.
“Index everything” hurts write throughput and increases pressure on VACUUM / autovacuum and bloat.
HOT (Heap-Only Tuple) Updates
If updated columns are not indexed and the new row version fits on the same page, PostgreSQL can sometimes use HOT and skip index updates — lowering write amplification. Fewer unnecessary indexes improve the odds.
Statistics and Maintenance
ANALYZE(or autovacuum’s analyze phase): essential for row estimates.- Volatile tables: stale stats → wrong plans; watch estimate vs actual in
EXPLAIN. - Index bloat: after heavy churn, plan
REINDEX CONCURRENTLY(where supported) or maintenance windows.
Common Mistakes
- Duplicate indexes that differ only slightly
- Partial index predicates that do not match application queries
- Hidden N+1 patterns from ORMs — indexes do not fix multiplicative round-trips
- Long locks when creating indexes inside the wrong transaction pattern
- Shipping index changes without
EXPLAIN ANALYZEon representative workloads
Summary Trade-off
| Reads | Writes / maintenance |
|---|---|
| Less full scans, lower latency | Each index is extra write path |
| Index-only scans cut I/O | More indexes → less HOT opportunity, more vacuum work |
Indexes behave like a subscription: each one helps some reads while taxing writes and housekeeping. The right set comes from measurement — EXPLAIN ANALYZE, production-like traffic, and tail latency (p95/p99).
Happy querying.