Introduction
Seq Scan on orders. Cost: 0.00..45832.00. Rows: 1200000. That's your problem, right there, in one line.
Most PostgreSQL performance issues come from three places: missing indexes, stale statistics, and tables that have outgrown their autovacuum thresholds. Not bad query structure. Not wrong join types. Missing indexes and bloated tables. The fixes are boring. Finding them is the skill.
So we start with EXPLAIN ANALYZE -- the diagnostic. Then indexes -- the fix. Then monitoring and maintenance -- the thing that keeps it fixed.
Understanding EXPLAIN ANALYZE
Without this, you're guessing. Every optimization conversation starts here.
EXPLAIN shows the planner's estimated plan without executing. EXPLAIN ANALYZE actually runs the query and reports real times and row counts. You almost always want the latter.
```sql
EXPLAIN ANALYZE
SELECT o.id, o.total_amount, o.created_at, p.name AS product_name
FROM orders o
JOIN order_items oi ON oi.order_id = o.id
JOIN products p ON p.id = oi.product_id
WHERE o.customer_id = 1042
  AND o.created_at >= '2026-01-01'
ORDER BY o.created_at DESC
LIMIT 20;
```

The output is a tree. Read it bottom-up. Here's what matters at each node:
Seq Scan on a large table with a WHERE clause. Red flag. Every row is being read. You want Index Scan.
The "actual time" numbers are milliseconds. The first number is the time to produce the first row, the second is the total. Compare estimated rows to actual rows -- a big gap means stale statistics. Run ANALYZE. In nested loop joins, check the "loops" count. The inner operation runs once per outer row, so multiply actual time by loops for the real cost. Sort Method says "external merge Disk"? The sort spilled to disk because work_mem is too small.
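Here's a hypothetical plan fragment with all the numbers invented for illustration -- the shape is what matters:

```
->  Index Scan using idx_orders_customer_date on orders o
      (cost=0.43..152.80 rows=38 width=24)
      (actual time=0.041..0.563 rows=20 loops=1)
      Index Cond: ((customer_id = 1042) AND (created_at >= '2026-01-01'))
```

Estimated 38 rows, got 20. A small gap like that is healthy. An estimate of 38 against an actual of 400,000 is the signal to run ANALYZE.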
Add BUFFERS -- as in EXPLAIN (ANALYZE, BUFFERS) -- to see how many pages came from shared buffers versus disk. High disk reads mean your data isn't cached.
Never run EXPLAIN ANALYZE on DELETE or UPDATE in production without wrapping in a transaction and rolling back. It actually executes the query.
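A safe pattern for that, sketched against the orders table from the example above:

```sql
BEGIN;

-- EXPLAIN ANALYZE executes the DELETE for real;
-- the ROLLBACK undoes it
EXPLAIN ANALYZE
DELETE FROM orders
WHERE created_at < '2020-01-01';

ROLLBACK;
```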
Index Types and When to Use Them
Biggest lever you have. But the wrong index type is barely better than no index at all.
B-tree Indexes
The default. Equality, range, sorting. Right choice for the vast majority of columns.
```sql
-- Simple B-tree index on a frequently filtered column
CREATE INDEX idx_orders_customer_id
    ON orders (customer_id);

-- Composite index for queries that filter by customer AND date range
CREATE INDEX idx_orders_customer_date
    ON orders (customer_id, created_at DESC);

-- Partial index: only index rows that matter for your query
CREATE INDEX idx_orders_active_customer
    ON orders (customer_id, created_at DESC)
    WHERE status = 'active';
```

Column order in composite indexes matters. An index on (customer_id, created_at, status) works for filtering by customer_id alone, or customer_id + created_at. It does nothing for a query that only filters on status. Leading prefix or nothing.
Partial Indexes
Criminally underused. A partial index covers only a subset of rows. Million-order table, but most queries filter status = 'pending' (maybe 50,000 rows)? A partial index gives you a tiny, focused index instead of one covering the entire table. Faster to scan, faster to maintain.
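A sketch of that pending-orders case, assuming the status column from the earlier examples:

```sql
-- Index only the ~50,000 pending orders, not all million rows
CREATE INDEX idx_orders_pending
    ON orders (created_at)
    WHERE status = 'pending';

-- The planner uses it only when the query's WHERE clause
-- implies status = 'pending'
SELECT id, total_amount
FROM orders
WHERE status = 'pending'
  AND created_at >= '2026-02-01';
```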
GIN Indexes
For columns with multiple values: arrays, JSONB, full-text search vectors. Querying JSONB? GIN. Full-text search? Also GIN.
```sql
-- GIN index for JSONB containment queries
CREATE INDEX idx_products_metadata
    ON products USING gin (metadata jsonb_path_ops);

-- Now this query can use the index
SELECT * FROM products
WHERE metadata @> '{"category": "electronics", "in_stock": true}';

-- GIN index for full-text search
CREATE INDEX idx_articles_search
    ON articles USING gin (to_tsvector('english', title || ' ' || body));
```

GiST Indexes
Geometric data, range types, PostGIS geospatial queries, nearest-neighbor searches. If you're not doing those things, skip GiST.
Over-indexing is a bigger problem than under-indexing. Dozens of indexes on a single table, many with zero scans over the past quarter. Meanwhile, write performance is terrible because PostgreSQL maintains every one of them on each INSERT, UPDATE, and DELETE. Check pg_stat_user_indexes. Drop anything with an idx_scan count of zero. Free performance.
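A query along these lines surfaces the candidates:

```sql
-- Indexes with zero scans since statistics were last reset
SELECT schemaname,
       relname AS table_name,
       indexrelname AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```

Before dropping anything, confirm the index isn't backing a UNIQUE or primary key constraint, and remember the counters reset when statistics do -- zero scans over a quarter, not zero scans since yesterday.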
Join Optimization Strategies
Three join algorithms: nested loop (good when outer set is small, inner is indexed), hash join (good for equi-joins, bad when the hash table spills to disk), merge join (good when both sides come pre-sorted). PostgreSQL picks for you. Usually it picks well.
When it doesn't:
```sql
-- This query joins multiple tables; ensure the most
-- restrictive filter is applied first
SELECT
    c.name AS customer_name,
    p.name AS product_name,
    o.total_amount,
    s.tracking_number
FROM orders o
JOIN customers c ON c.id = o.customer_id
JOIN order_items oi ON oi.order_id = o.id
JOIN products p ON p.id = oi.product_id
LEFT JOIN shipments s ON s.order_id = o.id
WHERE o.created_at >= '2026-02-01'
  AND c.region = 'us-west'
  AND p.category = 'electronics'
ORDER BY o.created_at DESC
LIMIT 50;
```

Beyond about 8 joins, the planner can't evaluate every ordering. It falls back to heuristics. Sometimes it misses.
If joins are slow, check in this order. Are statistics current? Run ANALYZE. Does any column have a skewed distribution? Increase default_statistics_target for that column. And -- this is surprisingly common -- are foreign key columns actually indexed? Missing FK indexes quietly destroy join performance. Seen it on every project I've audited.
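A rough catalog query for that last check -- this only verifies that some index starts with the FK's first column, which is the leading-prefix match joins need:

```sql
-- Foreign key constraints with no index starting on the
-- first referencing column
SELECT c.conrelid::regclass AS table_name,
       c.conname AS fk_name
FROM pg_constraint c
WHERE c.contype = 'f'
  AND NOT EXISTS (
      SELECT 1
      FROM pg_index i
      WHERE i.indrelid = c.conrelid
        AND i.indkey[0] = c.conkey[1]
  );
```

It's deliberately approximate (multi-column FKs deserve a closer look), but it catches the common case of a plain FK column with no index at all.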
Query Rewriting Techniques
Sometimes the fix is rewriting the query, not adding an index.
CTEs: Optimization Fences vs Inline
Since PostgreSQL 12, simple non-recursive CTEs are automatically inlined. They optimize like subqueries. Before 12, they were always materialized -- an optimization fence the planner couldn't see through. If you need the old behavior, say MATERIALIZED explicitly:
```sql
-- Inlined CTE (PostgreSQL 12+): planner can optimize across the boundary
WITH recent_orders AS (
    SELECT id, customer_id, total_amount, created_at
    FROM orders
    WHERE created_at >= now() - INTERVAL '30 days'
)
SELECT c.name, sum(ro.total_amount) AS monthly_spend
FROM recent_orders ro
JOIN customers c ON c.id = ro.customer_id
WHERE c.region = 'eu-central'
GROUP BY c.name
ORDER BY monthly_spend DESC;

-- Materialized CTE: forces separate evaluation
WITH heavy_calculation AS MATERIALIZED (
    SELECT product_id, avg(unit_price) AS avg_price
    FROM order_items
    GROUP BY product_id
)
SELECT p.name, hc.avg_price
FROM heavy_calculation hc
JOIN products p ON p.id = hc.product_id
WHERE hc.avg_price > 100;
```

EXISTS vs IN
EXISTS short-circuits on first match. IN builds the full result set. For large subqueries, EXISTS wins.
But the real issue is NOT IN vs NOT EXISTS. NOT IN has a NULL gotcha: if the subquery returns any NULL, every comparison against that NULL is unknown, so NOT IN returns no rows at all. Not a performance bug. A correctness bug. Use NOT EXISTS.
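Side by side, using the example schema from earlier (assume orders.customer_id is nullable):

```sql
-- Customers with no orders.
-- NOT IN: if orders.customer_id is ever NULL, this returns zero rows
SELECT id, name
FROM customers
WHERE id NOT IN (SELECT customer_id FROM orders);

-- NOT EXISTS: correct regardless of NULLs, and usually faster
SELECT c.id, c.name
FROM customers c
WHERE NOT EXISTS (
    SELECT 1 FROM orders o WHERE o.customer_id = c.id
);
```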
Avoiding Correlated Subqueries
Correlated subquery runs once per outer row. Million-row outer query means a million subquery executions. Rewrite as a join. Or a lateral join if you need to reference outer columns.
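A sketch of the lateral form, assuming the composite (customer_id, created_at DESC) index from earlier -- the classic "latest row per group" pattern:

```sql
-- Correlated subquery: one scalar subquery execution per customer
SELECT c.name,
       (SELECT max(o.created_at)
        FROM orders o
        WHERE o.customer_id = c.id) AS last_order
FROM customers c;

-- LATERAL rewrite: each lookup becomes a single index probe
-- (ORDER BY ... LIMIT 1 walks the index to the newest row)
SELECT c.name, o.created_at AS last_order, o.total_amount
FROM customers c
LEFT JOIN LATERAL (
    SELECT created_at, total_amount
    FROM orders
    WHERE customer_id = c.id
    ORDER BY created_at DESC
    LIMIT 1
) o ON true;
```

The lateral version also lets you pull extra columns from the matched row (total_amount here), which the scalar subquery can't do without repeating itself.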
Connection Pooling and Configuration
Optimized queries on a badly configured instance will still be slow. The stock defaults are tuned for a machine with well under a gigabyte of RAM. You're not running that machine.
Key Configuration Parameters
shared_buffers: 25% of total RAM. On a 64 GB server, 16 GB. Don't go higher -- the OS file cache needs room too.
work_mem: memory per sort or hash operation before spilling to disk. Default is 4 MB. Way too small. 64-256 MB is reasonable. But this is per-operation, not per-query -- a complex query with multiple sorts could use several multiples. Multiply by expected concurrent connections before committing.
effective_cache_size: allocates nothing. Just tells the planner how much total cache is available so it can decide between index scans and seq scans. Set to 75% of total RAM.
random_page_cost: the default is 4.0, which assumes spinning disks. On SSDs, drop it to 1.1. This one change shifts the planner from sequential scans to index scans on multiple queries at once. Takes seconds. Sometimes avoids the need for new indexes entirely.
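Pulled together, a starting point for a 64 GB SSD-backed server -- these are the values discussed above, not universal constants, so adjust for your workload:

```
# postgresql.conf -- example values for a 64 GB server on SSD storage
shared_buffers = 16GB            # ~25% of RAM
effective_cache_size = 48GB      # ~75% of RAM; planner hint, allocates nothing
work_mem = 64MB                  # per sort/hash operation, not per query
random_page_cost = 1.1           # SSD-appropriate
```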
Connection Pooling with PgBouncer
PostgreSQL spawns a process per connection. 500 concurrent connections means 500 processes. Memory adds up fast.
PgBouncer sits between app and database. 500 app connections, 50 actual PostgreSQL connections. Transaction pooling works for most web apps. Session pooling if you need prepared statements.
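A minimal pgbouncer.ini sketch for that shape -- the database name and file paths are placeholders:

```ini
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction   ; works for most web apps
max_client_conn = 500     ; app-side connections
default_pool_size = 50    ; actual PostgreSQL connections
```

The app connects to port 6432 instead of 5432; nothing else changes. Transaction pooling breaks session-level features like prepared statements, which is when you fall back to session pooling.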
Partitioning Large Tables
At hundreds of millions of rows, even indexed queries slow down. Indexes themselves bloat. VACUUM takes forever. Partitioning splits one logical table into smaller physical ones.
Range partitioning by date. Most common. Best place to start:
```sql
-- Create a partitioned table by date range
CREATE TABLE events (
    id bigserial,
    event_type text NOT NULL,
    payload jsonb,
    created_at timestamptz NOT NULL DEFAULT now()
) PARTITION BY RANGE (created_at);

-- Create monthly partitions
CREATE TABLE events_2026_01 PARTITION OF events
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE events_2026_03 PARTITION OF events
    FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');

-- Create indexes on each partition (inherited automatically in PG 11+)
CREATE INDEX idx_events_created_at ON events (created_at);
CREATE INDEX idx_events_type ON events (event_type);
```

Query for February 2026 data? Only scans events_2026_02. Every other month ignored. That's partition pruning.
Dropping old data becomes instant too. Detach and drop a partition versus deleting millions of rows and dealing with the resulting bloat. Night and day.
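Retiring a month from the events table above looks like this:

```sql
-- Detach first, then drop; CONCURRENTLY (PG 14+) avoids
-- blocking queries against the parent table
ALTER TABLE events DETACH PARTITION events_2026_01 CONCURRENTLY;
DROP TABLE events_2026_01;
```

Detaching also leaves the option of archiving the table elsewhere before dropping it.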
But don't partition just because a table is large. It adds schema complexity. Cross-partition queries can be slower. Consider it past 100 million rows, or when you have a clear time-based retention policy.
Monitoring and Maintenance
Tables grow. Data distributions shift. New queries appear. Without monitoring, performance degrades silently until someone files a ticket.
pg_stat_statements
Install this extension. If you install nothing else, install this. Tracks execution stats for every query: total time, call count, rows returned, buffer usage.
```sql
-- Find the top 10 slowest queries by total execution time
SELECT
    round(total_exec_time::numeric, 2) AS total_time_ms,
    calls,
    round(mean_exec_time::numeric, 2) AS avg_time_ms,
    round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS pct,
    substring(query, 1, 120) AS short_query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```

Run this weekly. Put it in a cron job. It's striking how often one or two queries account for most of the total execution time. Fix those and everything else speeds up -- less CPU contention, less I/O pressure.
VACUUM and Autovacuum
Dead tuples. MVCC means updated and deleted rows stick around until VACUUM cleans them. Let them accumulate and your tables bloat. Queries slow down scanning through garbage.
Autovacuum handles this, but the defaults are conservative. The default autovacuum_vacuum_scale_factor of 0.2 means autovacuum triggers when 20% of rows are dead. On a 100-million-row table? That's 20 million dead rows before cleanup starts. Lower it to 0.01 or 0.005 for large tables.
autovacuum_vacuum_cost_delay throttles autovacuum. Default is 2ms. If autovacuum can't keep up, drop to 0 during off-peak hours. Keep autovacuum_analyze_threshold aggressive for tables where data distribution changes frequently.
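These settings can be tightened per table rather than globally, which is usually the right move for one or two large, hot tables:

```sql
-- Vacuum the big table at ~1% dead rows instead of the 20% default
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_analyze_scale_factor = 0.005
);
```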
Monitor bloat with pgstattuple. When it's severe, regular VACUUM won't reclaim space. VACUUM FULL locks the table. pg_repack does it online.
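Checking one table with pgstattuple looks like this -- note it reads the whole table, so run it off-peak:

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- dead_tuple_percent and free_percent are the bloat signals
SELECT pg_size_pretty(table_len) AS table_size,
       dead_tuple_percent,
       free_percent
FROM pgstattuple('orders');
```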
Keeping Statistics Fresh
Stale statistics produce bad plans. Run ANALYZE manually after bulk loads, schema changes, anything that significantly shifts data distribution. If EXPLAIN ANALYZE shows estimated rows wildly different from actual rows, this is almost certainly the cause.
Performance work is never done. But pg_stat_statements tells you where to look, and that's most of the battle.