Mert Tosun
← Posts
PostgreSQL Partitioning: A Deep Technical Guide

PostgreSQL Partitioning: A Deep Technical Guide

Blog AuthorDatabase

As large tables grow, you usually hit the same pain points: slower queries, heavy vacuum cycles, index bloat, and rising operational cost. PostgreSQL partitioning is not a magic fix, but when designed correctly, it is a major scalability lever.

This post walks through partitioning from architecture to production operations.

What Is Partitioning?

Partitioning splits one logical table into multiple physical child tables (partitions). Your application still queries one parent table, while PostgreSQL can target only relevant partitions during planning and execution.

When Should You Consider It?

Partitioning is a strong candidate when:

  • Table size grows continuously (events, logs, orders, telemetry)
  • Most queries are time-window based
  • You frequently archive or remove old data
  • Write throughput is increasing and maintenance windows are shrinking

Diagram: Logical Table, Physical Partitions

                    +----------------------+
                    |      orders          |  (parent table)
                    +----------+-----------+
                               |
        +----------------------+----------------------+
        |                      |                      |
+-------v-------+      +-------v-------+      +-------v-------+
| orders_2026_01|      | orders_2026_02|      | orders_2026_03|
| RANGE: Jan    |      | RANGE: Feb    |      | RANGE: Mar    |
+-------+-------+      +-------+-------+      +-------+-------+
        |                      |                      |
  local index(es)        local index(es)        local index(es)

In this model, a query on orders can prune unrelated monthly partitions and read much less data.

Partitioning Strategies

1) RANGE Partitioning

Most common for date/time workloads.

CREATE TABLE orders (
  id           bigserial,
  customer_id  bigint NOT NULL,
  order_date   date   NOT NULL,
  amount       numeric(12,2) NOT NULL
) PARTITION BY RANGE (order_date);

CREATE TABLE orders_2026_01 PARTITION OF orders
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');

CREATE TABLE orders_2026_02 PARTITION OF orders
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

2) LIST Partitioning

Useful for low-cardinality categorical keys (region, tenant tier, status).

CREATE TABLE tickets (
  id          bigserial,
  region      text NOT NULL,
  created_at  timestamptz NOT NULL
) PARTITION BY LIST (region);

CREATE TABLE tickets_eu PARTITION OF tickets FOR VALUES IN ('eu');
CREATE TABLE tickets_us PARTITION OF tickets FOR VALUES IN ('us');

3) HASH Partitioning

Good for balancing write distribution when temporal retention is not the main concern.

CREATE TABLE events (
  id          bigserial,
  user_id     bigint NOT NULL,
  payload     jsonb  NOT NULL,
  created_at  timestamptz NOT NULL
) PARTITION BY HASH (user_id);

CREATE TABLE events_p0 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE events_p1 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE events_p2 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE events_p3 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 3);

Why Partition Pruning Matters Most

The primary performance gain comes from pruning. PostgreSQL must skip irrelevant partitions early.

Good pattern:

SELECT count(*)
FROM orders
WHERE order_date >= DATE '2026-02-01'
  AND order_date <  DATE '2026-03-01';

Risky pattern:

-- Function wrapping may hurt pruning
SELECT count(*)
FROM orders
WHERE date_trunc('month', order_date) = DATE '2026-02-01';

Rule of thumb: keep partition-key predicates sargable and avoid wrapping the key in functions.

Index Design: Think Per Partition

Indexes are effectively partition-local, which is often beneficial:

  • Smaller index structures
  • Localized maintenance impact
  • Different tuning on hot vs cold partitions
CREATE INDEX ON orders_2026_02 (customer_id, order_date);
CREATE INDEX ON orders_2026_02 (order_date);

You can also define indexes at the parent level, but high-scale systems often still tune active partitions explicitly.

Constraints and Uniqueness Reality

For partitioned tables, uniqueness constraints typically need the partition key included to guarantee correctness across partitions.

A common practical model:

  • Surrogate primary key (id)
  • Business-level uniqueness designed with partition key awareness

Operational Excellence (Where It Really Pays Off)

Partitioning is not just schema design; it is an operations model.

1) Pre-create Future Partitions

Use scheduled jobs to create upcoming weekly/monthly partitions before data arrives.

2) Retention by Dropping Partitions

Dropping an old partition is far cheaper than mass deletes:

DROP TABLE orders_2024_01;

3) Targeted Maintenance

Tune autovacuum and maintenance differently for hot and cold partitions.

Migration Strategy for Live Systems

Avoid a risky one-shot migration on a huge table. Prefer staged migration:

  1. Create the new partitioned parent
  2. Route new writes there
  3. Backfill historical data in batches
  4. Validate with counts/checksums
  5. Perform controlled cutover (rename/switch)

For very large datasets, consider dual-write or CDC-assisted migration patterns.

Observability and Validation

Validate outcomes with data, not assumptions:

  • Compare plans via EXPLAIN (ANALYZE, BUFFERS)
  • Measure logical/physical reads
  • Track P95/P99 query latency
  • Review vacuum and bloat trends

If pruning is not happening, expected gains will not materialize.

Common Pitfalls

  • Choosing a partition key that does not match query patterns
  • Creating too many partitions and increasing planner overhead
  • Skipping automation for partition creation and retention
  • Assuming partitioning automatically speeds up every workload

Conclusion

PostgreSQL partitioning can dramatically improve both performance and operations for large, time-evolving datasets. But success depends on a full system approach: key selection, pruning-friendly query design, and disciplined automation.

Treat partitioning as data lifecycle architecture, not just a DDL change.