PostgreSQL Query Performance: Indexes, Read/Write Trade-offs, and Advanced Concepts

A solid schema alone is not enough. As queries grow, planning, statistics, and indexes decide whether the database feels fast or fragile. This post covers PostgreSQL index types, when they help, and the read vs. write trade-off in practical terms.

Query Engine in Brief

Before executing a query, PostgreSQL estimates costs for several paths (heuristics + table statistics). Common access methods:

Seq Scan: Reads the whole table — fine for small tables or when indexes would not pay off.
Index Scan / Index Only Scan: Reaches rows via the index, or satisfies the query from the index alone.
Bitmap Index Scan: Often used to combine multiple index predicates.

To see real behaviour, use EXPLAIN (ANALYZE, BUFFERS). Large gaps between estimated and actual rows usually point to stale statistics or plan quality issues.

B-tree: The Default Workhorse

Plain CREATE INDEX builds a B-tree. It fits equality (=) and range predicates (<, >, BETWEEN, ORDER BY).

Where it shines:

Primary keys and uniqueness
Filters on timestamps or surrogate IDs
Join keys on the smaller side of a join

Caveat: Very low selectivity (e.g. WHERE status = 'active' when 90% of rows are active) may still favour a seq scan — the planner is often right.

Composite Indexes

Column order matters. For an index on (a, b, c):

WHERE a = ? — usable
WHERE a = ? AND b = ? — usable
WHERE b = ? — usually not without a leading predicate on a

Put the most selective and most frequently filtered column first; align with ORDER BY when it reduces sort cost.

Partial Indexes

Index only a subset of rows:

CREATE INDEX idx_orders_open ON orders (created_at)
WHERE status IN ('pending', 'processing');

Benefits: Smaller structure, less disk, cheaper maintenance, targeted reads for hot subsets.

Requirement: Query predicates must match the index predicate; otherwise the planner may ignore the index.

Covering Indexes and INCLUDE

When the query needs few columns, an Index Only Scan may be possible. Since PostgreSQL 11:

CREATE INDEX idx_users_email ON users (email) INCLUDE (display_name, id);

INCLUDE columns are not part of the key ordering; they live in leaf entries — wider keys without full sort participation.

Advanced Types: GIN, GiST, BRIN (Cheat Sheet)

Type	Typical use
GIN	`jsonb`, full-text, arrays — high fan-out structures
GiST	geometry, some full-text and range types
BRIN	very large, naturally ordered tables (e.g. time series) — tiny footprint

BRIN is coarse; it assumes physical correlation with the indexed column.

Read-Side Impact

Few rows returned: indexes usually reduce latency (random heap fetches still add cost).
Many rows returned: a sequential scan (sometimes parallel) can be cheaper.
Join pipelines: the right index on the right relation keeps nested-loop cost under control.

Practice: inspect shared hit vs read in EXPLAIN (ANALYZE, BUFFERS); test with a cold cache when relevant.

Write-Side Impact

Every extra index means:

INSERT: one write per index.
UPDATE: updates to indexed columns touch those indexes.
DELETE: removal from every relevant index.

“Index everything” hurts write throughput and increases pressure on VACUUM / autovacuum and bloat.

HOT (Heap-Only Tuple) Updates

If updated columns are not indexed and the new row version fits on the same page, PostgreSQL can sometimes use HOT and skip index updates — lowering write amplification. Fewer unnecessary indexes improve the odds.

Statistics and Maintenance

ANALYZE (or autovacuum’s analyze phase): essential for row estimates.
Volatile tables: stale stats → wrong plans; watch estimate vs actual in EXPLAIN.
Index bloat: after heavy churn, plan REINDEX CONCURRENTLY (where supported) or maintenance windows.

Common Mistakes

Duplicate indexes that differ only slightly
Partial index predicates that do not match application queries
Hidden N+1 patterns from ORMs — indexes do not fix multiplicative round-trips
Long locks when creating indexes inside the wrong transaction pattern
Shipping index changes without EXPLAIN ANALYZE on representative workloads

Summary Trade-off

Reads	Writes / maintenance
Less full scans, lower latency	Each index is extra write path
Index-only scans cut I/O	More indexes → less HOT opportunity, more vacuum work

Indexes behave like a subscription: each one helps some reads while taxing writes and housekeeping. The right set comes from measurement — EXPLAIN ANALYZE, production-like traffic, and tail latency (p95/p99).

Happy querying.