Reading EXPLAIN plans without losing your mind

Mar 20268 min

Performance

Reading EXPLAIN plans without losing your mind

Postgres EXPLAIN ANALYZE looks like an alien language until you learn the three things to read first. After that, every slow query becomes a five-minute conversation, not a forty-minute mystery.

Most performance problems I've fixed in production weren't fixed by clever code. They were fixed by reading the execution plan. EXPLAIN ANALYZE is the most useful single tool in the Postgres ecosystem - and most engineers I've met don't read its output past the first line.

What you're actually looking at

An EXPLAIN ANALYZE output is a tree, printed inside-out. The deepest indented node runs first; its result feeds the parent above it; that parent's result feeds its parent; and so on up to the root, which is the final result Postgres returns to the client.

Every node carries two sets of numbers. The first - `cost=` - is the planner's estimate before the query ran. The second - `actual time=` - is what really happened. The interesting bugs live in the gap between those two.

Three things to read first

1. Is there a Seq Scan over a big table?

A Seq Scan reads every row. If the table has a million rows and the WHERE filters down to a thousand, you almost certainly want an index. If the planner chose Seq Scan despite the index existing, your `WHERE` clause probably can't use the index - typically because of a function on the column (`WHERE LOWER(email) = ...`), an implicit cast, or selectivity the planner mis-estimated.

2. Are the row estimates wildly off?

Look at `rows=` (estimated) vs `actual rows=`. If the planner expected 50 rows and got back 50,000, every join above that point will be wrong-sized - usually a Nested Loop where you wanted a Hash Join. Fix: run `ANALYZE table_name` to refresh statistics, or bump `default_statistics_target` for that column. Stale or coarse stats mislead the planner into picking the wrong algorithm.

3. Where's the time going?

Each node reports `actual time=A..B rows=N loops=L`. The total work for that node is roughly `(B - A) × L`. A node that runs 10,000 loops at 0.5ms each is your bottleneck even if each individual call looks fast. This is the classic N+1 query in disguise - fix it by pulling the inner work into a join or a CTE.

Patterns that show up over and over

Nested Loop where you wanted Hash Join - almost always a row-estimate problem (run ANALYZE).
Index Scan when you wanted Bitmap Index Scan - the planner thinks the index returns few rows; if it returns many, Bitmap is faster.
Sort using disk - `Sort Method: external merge Disk:` in the output. Bump `work_mem` for the session, or restructure the query so the sort step is smaller.
Filter inside Index Scan - the index isn't selective enough; you need a composite index covering all WHERE columns.

How I work through a slow query

Run `EXPLAIN (ANALYZE, BUFFERS) SELECT ...` - the BUFFERS flag adds disk I/O numbers.
Read the plan inside-out. Find the deepest expensive node.
Check if estimated rows ≈ actual rows. If not, run ANALYZE and re-plan.
If the deep node is a Seq Scan over a big table, build the index that would let it do an Index Scan instead.
Re-run, compare, validate the win is real (not just one cold-cache vs warm-cache lucky run).
If the new index didn't move p99, drop it before you ship. Indexes have a write cost - keep only the ones that earn their keep.

What this gets you

I've shipped query optimizations that took p99 latency from 50 seconds to under 4 - a 92% improvement, 5× peak capacity recovered, zero rollbacks. The interesting part is none of it was clever. It was reading the plan, finding the Seq Scan, building the right composite index, validating before/after, and being willing to delete the indexes that didn't help.

EXPLAIN ANALYZE is talking to you. It's worth learning the language.

TWITTER LINKEDIN

Mitul Jagad

Reading EXPLAIN plans without losing your mind