PostgreSQL is built to let lots of things happen at once: reads while writes are happening, multiple users running queries, background maintenance cleaning up old rows, migrations rolling out without downtime. Most of the time it delivers on that promise because its concurrency model (MVCC) avoids blocking in many common cases.
But every Postgres operator eventually runs into a moment that feels like someone yanked the power cord on concurrency: queries pile up, the app starts timing out, dashboards show CPU is oddly calm, and yet nothing is “moving.” In reality the system is usually not crashed. It’s waiting. One session is holding (or waiting for) a lock that forces other sessions to queue behind it, and that queue can spread until the database appears frozen.
1. Introduction
“Stop-the-world” isn’t an official PostgreSQL term. It’s a useful mental label for incidents where a single lock or blocking chain effectively pauses progress for a large part of the workload. These events matter because:
- The blast radius is huge. A lock taken on one table can stall unrelated requests if your app workflow touches that table frequently, or if connection pools fill up with waiting sessions.
- They’re easy to trigger accidentally. Routine operations like certain schema changes, `TRUNCATE`, or an overlooked long-running transaction can create conditions where other sessions must wait.
- They’re hard to diagnose under pressure. The database looks “idle” at the resource level, while the real problem is invisible unless you know where to look (`pg_stat_activity`, `pg_locks`, lock wait events).
- They turn small mistakes into outages. One migration run at the wrong time can turn into cascading timeouts, retries, and even more load, making recovery slower.
The goal of this article is to make those moments predictable. We’ll map the kinds of locks that can cause system-wide stalls, show you how lock queues form and spread, and give you practical ways to detect, prevent, and safely execute the operations that are most likely to cause “stop-the-world” behavior in production.
2. What People Mean by “Stop-the-World” in a Database Context
“Stop-the-world” is borrowed from other areas of computing (like garbage collection) where the system briefly pauses normal work. In databases, it’s not usually a literal pause of the entire server process. Instead, it’s a concurrency pause that happens when enough sessions end up waiting on the same chokepoint that the workload behaves like it’s frozen.
In PostgreSQL, that chokepoint is most often a lock (or a chain of locks) that forces other sessions to wait.
It’s not “the database is down”
During a stop-the-world-style event, Postgres is typically still:
- accepting connections,
- running some background work,
- responding to trivial queries that don’t touch the blocked objects.
The problem is that your important queries can’t make progress, because they need a lock that can’t be granted yet.
What it looks like from the outside
Common “everything stopped” symptoms include:
- Requests timing out across many endpoints, even ones that are normally fast.
- A sudden spike in the number of active sessions, many of them “doing nothing” except waiting.
- Low CPU utilization (because waiting doesn’t burn CPU) paired with high app error rates.
- A backlog in the connection pool: threads are stuck waiting for a DB connection that is itself waiting on a lock.
The mechanics: how a single blocker becomes “world-stopping”
A typical pattern is:
- One session takes a strong lock (or tries to) on a table or other object. Example: a DDL statement that requires an `ACCESS EXCLUSIVE` lock.
- Other sessions queue behind it, waiting for locks that conflict. These waiting sessions stop completing and sit in your connection pool.
- The queue spreads outward as more application requests arrive and also block. You get a connection pileup, then timeouts, retries, and even more pressure.
- The real blocker might not even be the DDL. Often, the DDL is waiting behind a long-running transaction that holds a weaker lock. Because Postgres queues lock requests, a single “strong lock request” can cause otherwise compatible lock requests to queue behind it as well, amplifying the impact.
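To make the pattern concrete, here is a minimal sketch using a hypothetical `orders` table; each “session” is a separate connection, shown here only as commented SQL:

```sql
-- Session A: long-running transaction holding a weak lock
BEGIN;
SELECT count(*) FROM orders;               -- takes ACCESS SHARE, kept until COMMIT
-- (transaction left open)

-- Session B: migration requests a strong lock and waits behind A
ALTER TABLE orders ADD COLUMN note text;   -- wants ACCESS EXCLUSIVE, blocks

-- Session C: ordinary traffic now queues behind B's pending request
SELECT * FROM orders WHERE id = 1;         -- waits as well
```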
So the “stop-the-world” feeling can be caused by either:
- A session holding a lock that blocks many others, or
- A session waiting for a lock in a way that causes everyone else to line up behind it (lock queue effects).
What qualifies as “stop-the-world” vs normal contention
Normal contention is localized: one hot row, one hot index, one busy table, and most other work continues.
A “stop-the-world” episode is characterized by:
- High fan-out blocking: one blocker affects many sessions.
- Lock queue amplification: a few lock waits turn into many.
- Application-level stall: pools saturate, timeouts rise, and the app effectively stops serving.
Key takeaway
When people say “stop-the-world lock” in PostgreSQL, they usually mean:
- “A lock situation (often involving `ACCESS EXCLUSIVE` or a lock queue behind it) that makes the database effectively unusable for a broad slice of traffic.”
The rest of this article focuses on turning that vague fear into concrete knowledge: which operations are risky, how to spot the true blocker, and how to run changes without triggering a production-wide stall.
3. PostgreSQL Locking Basics You Need for the Rest of This Article
PostgreSQL’s locking system is how it keeps data correct when multiple sessions do things at the same time. If you’re going to reason about “stop-the-world” incidents, you only need a handful of core ideas: what a lock protects, how long it lasts, how conflicts create waiting, and why waiting can cascade.
Locks exist to protect specific resources
In Postgres, a “lock” is not one single thing. It’s a family of mechanisms that protect different kinds of resources, such as:
- Tables and other relations (tables, indexes, materialized views, sequences)
- Rows (the specific tuples you update/delete)
- Transactions and virtual transactions
- Internal shared-memory structures (via lightweight locks)
For “everything is stuck” events, the most common culprits are locks on relations (especially tables) and the way lock requests queue.
Two big categories you’ll hear about
Row-level locks
These happen as part of UPDATE, DELETE, and SELECT ... FOR UPDATE/SHARE. They are usually narrow in scope (one row or set of rows), so they more often cause localized blocking.
Table-level (relation) locks
Every query takes some lock level on the relations it touches. Many of these are compatible (so they don’t block each other), but some operations need strong locks that block most concurrent access. These are the ones that feel “stop-the-world.”
Lock compatibility is the whole game
A lock only causes waiting when there is a conflict. Postgres defines lock modes (like ACCESS SHARE, ROW EXCLUSIVE, ACCESS EXCLUSIVE) and a compatibility matrix that determines which combinations can coexist.
The practical takeaway:
- Many sessions can hold “weak” locks together.
- “Strong” locks conflict with many other modes.
- When a strong lock is involved, waiting can spread fast.
Locks are held for different durations
Lock duration depends on what kind of lock it is and what statement you ran, but two rules of thumb explain most surprises:
- Locks taken by DDL are often held until the end of the transaction. If you run schema changes inside an explicit transaction block and keep it open, you keep the lock longer than you think.
- Long transactions make everything worse. Even if a transaction is “idle,” if it’s still open it may hold locks (and it can also prevent cleanup via MVCC, which indirectly increases pressure).
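A minimal sketch of how to spot such transactions using `pg_stat_activity` (the limit and column choices are just one reasonable starting point):

```sql
-- Oldest open transactions, including sessions that are "idle in transaction"
SELECT pid,
       state,
       xact_start,
       now() - xact_start AS xact_age,
       left(query, 60)    AS last_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 10;
```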
Postgres uses MVCC, so reads usually don’t block writes
Multi-Version Concurrency Control (MVCC) is why plain SELECT can run while other sessions are UPDATE-ing the same table. Readers see a consistent snapshot; writers create new row versions.
But MVCC doesn’t eliminate all blocking:
- Certain reads explicitly request locks (`SELECT ... FOR UPDATE`).
- Writes can block each other when they touch the same rows.
- Many schema/maintenance operations still require strong relation locks.
Waiting isn’t just “slow”—it creates queues
A common misconception is: “If a query waits on a lock, only that query is affected.”
In real systems, waiting causes secondary failures:
- connection pools fill with waiting sessions,
- threads block upstream,
- retries amplify load,
- more sessions arrive and join the wait queue.
So a single blocked statement can become an application-wide outage even if the database is technically functioning.
The mental model you’ll use going forward
For the rest of this article, keep this simple loop in mind:
- A statement needs a lock on some object.
- If the lock conflicts with existing locks, it waits.
- Waiters accumulate, and the wait spreads via your app and pool.
- The “stop-the-world” moment ends only when the blocking lock is released (commit/rollback, statement finishes, session is canceled/terminated).
Once you have this model, the next sections (lock modes, ACCESS EXCLUSIVE, DDL pitfalls, diagnosis with pg_stat_activity/pg_locks) become much easier to follow.
4. A Quick Tour of PostgreSQL Lock Types
PostgreSQL uses multiple locking mechanisms at different layers. When people talk about “locks,” they’re often mixing these together, so it helps to name the main families and what they’re for. The sections later in the article will zoom in on the ones most likely to create stop-the-world behavior.
Relation locks (table/index locks)
These are locks on relations: tables, indexes, materialized views, sequences, etc. Almost every SQL statement takes at least a weak relation lock on the objects it touches.
- Examples: `ACCESS SHARE`, `ROW EXCLUSIVE`, `ACCESS EXCLUSIVE`
- Where you see them: `pg_locks` with `locktype = 'relation'`
- Why they matter: strong relation locks (especially `ACCESS EXCLUSIVE`) can block huge parts of your workload.
Row locks (tuple locks)
These protect specific rows. They arise from:
- `UPDATE`, `DELETE`
- `SELECT ... FOR UPDATE`, `FOR NO KEY UPDATE`, `FOR SHARE`, `FOR KEY SHARE`
Row locks tend to cause localized contention (hot rows) rather than “the whole DB froze,” but they can still contribute to large incidents when a hot table is involved or when many sessions pile up behind the same row(s).
- Where you see them: you don’t get a neat “row id” in `pg_locks` the way you might expect; tuple locks are represented indirectly (and often show up via waits like `tuple`/`transactionid`, depending on circumstances).
Transaction-related locks
Postgres also has locks that are effectively “wait for another transaction to finish.”
Two common ones you’ll encounter during diagnosis:
- `transactionid` locks: a session is waiting for a particular transaction ID to commit/roll back (very common with row-level conflicts).
- `virtualxid` locks: related to a session’s virtual transaction; these show up in some DDL/maintenance interactions and can surprise you when a seemingly harmless session blocks a stronger operation.
These are important because sometimes the visible blocked query is waiting on a lock that’s actually “owned” by a transaction you didn’t expect to matter.
Advisory locks
Advisory locks are application-defined locks. Postgres enforces them, but Postgres doesn’t take them automatically; your code does.
- Functions: `pg_advisory_lock`, `pg_try_advisory_lock`, and transaction-scoped variants
- Use cases: ensuring a singleton job runs once, preventing concurrent migrations, serializing access to a resource
- Risk: you can create your own “stop-the-world” event if you take a global advisory lock and forget to release it (or if the session hangs around).
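A minimal sketch of the “singleton job” use case, with a hypothetical lock key of 42:

```sql
-- Try to take an exclusive advisory lock without waiting.
-- Returns true if we got it, false if another session already holds it.
SELECT pg_try_advisory_lock(42) AS got_lock;

-- ... run the job only if got_lock was true ...

-- Session-level advisory locks survive COMMIT, so release explicitly.
SELECT pg_advisory_unlock(42);
```

Transaction-scoped variants such as `pg_advisory_xact_lock` release automatically at commit/rollback, which avoids the “forgot to release it” failure mode described above.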
Lightweight locks (LWLocks) and other internal locks
Postgres has internal locks used to coordinate access to shared memory structures (buffer manager, WAL, lock manager structures, etc.). These are not SQL-level table/row locks and typically aren’t what you’ll see in pg_locks as relation locks.
- Why they matter: under extreme contention or I/O pressure, waits on these internals can look like a “freeze,” but they’re a different class of problem than DDL-induced blocking.
- Where you see them: wait event reporting (`pg_stat_activity.wait_event_type` / `wait_event`).
Predicate locks (Serializable Snapshot Isolation)
If you run at the SERIALIZABLE isolation level, Postgres uses predicate locking (implemented as SIREAD locks) to detect dangerous read/write patterns.
- These don’t behave like typical blocking locks most of the time.
- They’re primarily for correctness (and can lead to serialization failures rather than long blocking).
- Where you see them: `pg_locks` with `locktype = 'predicate'` in serializable workloads.
Extension: object and metadata locks
Some operations also take locks on non-table objects:
- databases (`locktype = 'database'`)
- pages or tuples (less commonly surfaced in the way people expect)
- extensions, schemas, or system catalogs (often indirectly, but crucial during DDL)
These can matter because “stop-the-world” incidents are frequently triggered by metadata-changing statements that must coordinate globally.
What to remember for stop-the-world scenarios
If your goal is to prevent “everything is stuck” moments, focus mostly on:
- Relation locks (especially `ACCESS EXCLUSIVE`)
- Lock queues and lock waits (how one waiter can cause pileups)
- Long-running transactions (they keep locks and block cleanup)
- DDL and maintenance commands (they’re disproportionately likely to need strong locks)
Next up, we’ll break down lock modes in plain language so you can predict which statements block which workloads.
5. Lock Modes Explained: From ACCESS SHARE to ACCESS EXCLUSIVE
PostgreSQL relation (table/index) locks come in a set of modes. Every statement that touches a table takes one of these modes (sometimes more than one object per statement). The mode determines what else can run concurrently.
You do not need to memorize every edge case, but you do need a feel for the spectrum: most queries take weak locks that happily coexist, while a few operations require locks strong enough to stall almost everything.
The spectrum at a glance
From “least blocking” to “most blocking,” the commonly encountered relation lock modes are:
1. `ACCESS SHARE`
2. `ROW SHARE`
3. `ROW EXCLUSIVE`
4. `SHARE UPDATE EXCLUSIVE`
5. `SHARE`
6. `SHARE ROW EXCLUSIVE`
7. `EXCLUSIVE`
8. `ACCESS EXCLUSIVE`
Below is what each one generally means in practice, plus typical statements that take it.
ACCESS SHARE
What it’s for: ordinary reads.
Typically taken by: SELECT (plain selects, not FOR UPDATE variants).
Concurrency impact:
- Compatible with other `ACCESS SHARE` locks and most write locks.
- Does not block normal `INSERT`/`UPDATE`/`DELETE`.
Why it matters:
Large numbers of readers usually aren’t the “stop-the-world” problem—but long-running reads can still get in the way of certain DDL that needs stronger locks, or they can keep transactions open and indirectly cause pain.
ROW SHARE
What it’s for: reads that intend to lock rows.
Typically taken by: SELECT ... FOR SHARE / FOR KEY SHARE (and related row-locking selects).
Concurrency impact:
- Stronger than `ACCESS SHARE`.
- Still allows a lot of concurrency, but it participates in conflicts with some stronger DDL-type locks.
ROW EXCLUSIVE
What it’s for: typical writes.
Typically taken by: INSERT, UPDATE, DELETE, MERGE.
Concurrency impact:
- Compatible with reads (`ACCESS SHARE`) and with other writers’ `ROW EXCLUSIVE` locks.
- Conflicts with the more DDL-ish “share/exclusive” modes above it.
Why it matters:
Most OLTP workloads live here. If you’re seeing widespread blocking and almost everything is waiting, it’s rarely because of ROW EXCLUSIVE alone—it’s usually because something stronger entered the picture.
SHARE UPDATE EXCLUSIVE
What it’s for: operations that need stability of the relation definition and coordination with other maintenance.
Often taken by: VACUUM (including autovacuum), ANALYZE, CREATE INDEX CONCURRENTLY (phases vary), and other maintenance-style commands.
Concurrency impact:
- Allows reads and writes to continue.
- Conflicts with some other schema/maintenance operations (and can queue behind or block them).
Why it matters:
This mode is usually “production-friendly,” but it can become part of a bigger blocking chain when combined with DDL or long transactions.
SHARE
What it’s for: stronger coordination than maintenance, often to protect against concurrent writes that would invalidate an operation’s assumptions.
Sometimes taken by: certain DDL/maintenance steps (exact statements vary by version and operation).
Concurrency impact:
- Allows reads.
- Tends to block writers (`ROW EXCLUSIVE`) or other operations that need to modify the table.
SHARE ROW EXCLUSIVE
What it’s for: “I need to do something substantial, and I don’t want concurrent schema-ish things or concurrent writes stepping on it.”
Sometimes taken by: some ALTER TABLE variants and other heavy operations.
Concurrency impact:
- More blocking than `SHARE`.
- Starts to feel “big” in production because it conflicts with common write locks.
EXCLUSIVE
What it’s for: very strong operations that need near-complete control, but still not quite the “stop all access” level.
Sometimes taken by: certain ALTER TABLE operations (operation-dependent).
Concurrency impact:
- Blocks a wide set of other operations.
- Still not as absolute as `ACCESS EXCLUSIVE`.
ACCESS EXCLUSIVE
What it’s for: operations that must have the table to themselves.
Commonly taken by: many ALTER TABLE operations, DROP TABLE, TRUNCATE, some REINDEX/rewrite operations, and other commands that fundamentally change or replace the relation.
Concurrency impact (practically):
- Conflicts with everything on that relation, including plain `SELECT`.
- If it’s granted, it can make a hot table feel like it “went offline.”
- If it’s waiting to be granted, it can still cause a pileup due to lock queueing effects.
Why it’s the headline “stop-the-world” lock:
If your application depends on the table (directly or through foreign keys, views, or ORM behavior), an ACCESS EXCLUSIVE lock request is one of the fastest ways to turn normal traffic into a wall of waiting sessions.
Two crucial behaviors that make this worse than it sounds
Lock waits form queues
If session A is waiting for a strong lock, and session B arrives later requesting a lock that would normally be compatible with the currently-held locks, B may still end up waiting because it’s behind A in the queue. This is one reason a single “big” lock request can make everything line up behind it.
Locks often live until transaction end
Especially with DDL, the lock may be held until COMMIT/ROLLBACK. That means even a quick operation can cause a long freeze if the transaction is left open (for example, a migration framework that wraps multiple steps in one transaction).
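A minimal illustration of that rule, using an explicit `LOCK TABLE` on a hypothetical table:

```sql
BEGIN;
LOCK TABLE orders IN ACCESS EXCLUSIVE MODE;
-- From here until COMMIT/ROLLBACK, even plain SELECTs on orders must wait,
-- so any slow work done inside this transaction extends the freeze.
COMMIT;  -- the lock is released only here
```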
Practical takeaway
Most day-to-day queries use ACCESS SHARE (reads) or ROW EXCLUSIVE (writes) and coexist nicely. Stop-the-world moments nearly always involve:
- an `ACCESS EXCLUSIVE` lock being held or requested on a busy table, and/or
- a lock queue forming behind that request,
- made worse by a long-running or idle transaction.
Next, we’ll connect these modes to real-life blocking by looking at relation locks vs row locks and what actually blocks what in practice.
6. Relation Locks vs Row Locks: What Can Block What
When PostgreSQL “freezes,” it’s tempting to think “some row is locked.” But the most disruptive incidents are usually about relation locks (table/index-level) and lock queues, not a single locked row. Understanding the difference makes diagnosis much faster.
Relation locks: the traffic lights on tables and indexes
Relation locks apply to an entire object (table, index, materialized view, sequence). They are taken in a specific lock mode (like ACCESS SHARE or ACCESS EXCLUSIVE) and conflicts are decided by the lock compatibility matrix.
Common examples:
- Plain `SELECT` takes an `ACCESS SHARE` lock on the table.
- `INSERT`/`UPDATE`/`DELETE` take `ROW EXCLUSIVE`.
- Many `ALTER TABLE` operations take `ACCESS EXCLUSIVE`.
What relation locks block best:
- DDL blocking DML: `ALTER TABLE ...` (often `ACCESS EXCLUSIVE`) blocks reads and writes.
- DDL blocking reads: a granted `ACCESS EXCLUSIVE` blocks even plain `SELECT`.
- Maintenance coordination: some operations (VACUUM/ANALYZE, concurrent index builds) use mid-strength locks that mostly allow traffic but can conflict with other schema/maintenance tasks.
Why relation locks cause “stop-the-world”:
- The blocked unit is a whole hot table (or an index the hot table needs).
- The lock request can create a queue that causes other sessions to wait even if they would otherwise be compatible.
You can observe relation locks directly in pg_locks (locktype = 'relation'), and identify the object via relation::regclass.
Row locks: conflicts on specific rows
Row locks (tuple locks) happen when statements need to change or explicitly lock rows:
- `UPDATE`/`DELETE` lock the rows they modify.
- `SELECT ... FOR UPDATE/SHARE` locks the rows it selects (depending on the clause).
Row locking conflicts are usually about two sessions touching the same rows, not about the whole table.
What row locks block best:
- Writer vs writer on the same row: two `UPDATE`s hitting the same row → one waits.
- Locker vs writer: `SELECT ... FOR UPDATE` in one session can block an `UPDATE` in another session on the same rows.
- Long transactions amplify row lock pain: if a transaction updates rows and stays open, it holds those row locks longer.
How row lock waits show up:
Row lock waits often present as waiting on a transaction to finish, because the second session must wait until the first transaction commits/rolls back to know whether the row version is visible/updatable. In practice, you may see waits tied to transactionid (or related wait events) more than a neat “row lock entry.”
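A minimal two-session sketch of that behavior (table and id are hypothetical):

```sql
-- Session 1
BEGIN;
SELECT * FROM accounts WHERE id = 42 FOR UPDATE;  -- row lock on id 42
-- (transaction intentionally left open)

-- Session 2
UPDATE accounts SET balance = balance - 10 WHERE id = 42;
-- waits; in pg_locks this usually surfaces as an ungranted
-- transactionid lock on session 1's transaction
```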
The key difference in “blast radius”
- Row lock contention is usually surgical: it affects sessions that touch the same row(s). The database keeps serving other work.
- Relation lock contention is often systemic: it can block every query that touches the table, which can quickly become “everything is stuck” in an application.
That’s why stop-the-world stories are disproportionately about ACCESS EXCLUSIVE (relation lock), not “a locked row.”
What can block what: common patterns you’ll actually see
1) DDL blocks everything on a table (classic stop-the-world)
- Session A: `ALTER TABLE big_table ...` → requests `ACCESS EXCLUSIVE`
- Sessions B..N: normal app queries (SELECT/INSERT/UPDATE) now wait

If the table is central, your app appears down.
2) Long-running read blocks DDL (then the queue blocks everyone else)
- Session A: long `SELECT` (holds `ACCESS SHARE`)
- Session B: `ALTER TABLE ...` requests `ACCESS EXCLUSIVE` and waits for A to finish
- Session C: arrives later wanting a normal lock (e.g., `ROW EXCLUSIVE` for an update)

Even though C is compatible with A, it can end up queued behind B’s pending `ACCESS EXCLUSIVE` request, so C waits too.
This is a common “it was just a SELECT, why did migrations take down prod?” incident.
3) Row lock pileups look like timeouts, not “global freeze”
- Session A: updates a popular row and keeps the transaction open
- Sessions B..N: try to update the same row → all wait on A
- Other parts of the system may still work fine
It’s painful, but the impact is usually limited to the code paths that touch those rows.
Practical rules of thumb
- If plain SELECTs are waiting, suspect a relation lock (often a DDL lock like `ACCESS EXCLUSIVE`) rather than a row lock.
- If only updates to specific entities are timing out (e.g., “users table updates fail for one customer”), suspect row lock contention.
- If many sessions are waiting and the blocker is “idle in transaction,” you may have a transaction holding locks open (row locks, relation locks, or both).
Next, we’ll look at the internal lock layers—lightweight locks vs heavyweight locks—and how to interpret what you see in Postgres’ monitoring views when you’re in the middle of an incident.
7. Lightweight Locks: The Hidden Locks Behind Core Operations
When most people say “Postgres locks,” they mean the SQL-visible ones: table locks, row locks, and the waits you can inspect via pg_locks. But PostgreSQL also relies heavily on lightweight locks (LWLocks)—short-lived internal locks used to coordinate access to shared in-memory structures.
LWLocks don’t usually show up as “blocking on a table,” yet under load they can be the reason your database feels stuck even when there’s no obvious ACCESS EXCLUSIVE drama.
What LWLocks are (and what they’re not)
LWLocks are:
- Internal synchronization primitives used by the Postgres server
- Held for very short periods in normal operation (microseconds to milliseconds)
- Designed to protect shared memory structures where correctness matters
LWLocks are not:
- the same as relation locks (`ACCESS SHARE`, `ACCESS EXCLUSIVE`, etc.)
- row locks from `UPDATE` / `SELECT ... FOR UPDATE`
- advisory locks
So if you’re hunting a “stop-the-world lock” and you only look at pg_locks, you can miss an LWLock bottleneck entirely.
What kinds of things LWLocks protect
LWLocks cover a wide range of core internals. Examples of areas that require LWLocks include:
- Buffer manager and buffer mapping (finding/managing cached data pages)
- WAL (Write-Ahead Log) coordination and related shared structures
- Lock manager internal data structures (yes—locks have locks)
- Replication and snapshot-related shared state
- System catalog caches and shared metadata structures
The exact LWLock names and granularity can vary across Postgres versions, but the theme is consistent: they protect shared state that many backends may contend for.
How LWLock contention feels in production
LWLock contention often presents differently from classic DDL blocking:
- CPU might be moderate to high (spinning, context switching, contention overhead)
- Query latency increases broadly, not just for one table
- Many sessions appear “active,” but they’re frequently waiting on internal events
- You don’t necessarily see a clean “one blocker, many victims” chain
In other words, it can look like “Postgres is slow everywhere” rather than “one statement is blocked.”
How to observe LWLock waits
The most practical place to see LWLock pain is wait event reporting:
- `pg_stat_activity.wait_event_type`
- `pg_stat_activity.wait_event`
If you see many sessions with:
- `wait_event_type = 'LWLock'` (older releases reported `'LWLockNamed'`/`'LWLockTranche'`)
- similar `wait_event` values across many sessions
…that’s a signal the system is contending on an internal shared resource.
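A minimal sketch for getting that signal quickly:

```sql
-- Rough distribution of what active sessions are currently waiting on
SELECT wait_event_type, wait_event, count(*) AS sessions
FROM pg_stat_activity
WHERE state = 'active'
GROUP BY wait_event_type, wait_event
ORDER BY sessions DESC;
```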
What you won’t get from this alone is a simple “kill this PID and you’re done.” LWLock contention is usually systemic: workload pattern, I/O pressure, checkpoint behavior, too many active backends, or a hot internal structure.
LWLocks vs “stop-the-world locks”
It’s useful to separate two incident archetypes:
A) Lock-queue freeze (SQL-level, usually DDL-related)
- Many sessions waiting on relation locks
- Clear blocker in `pg_locks` / `pg_stat_activity`
- Often fixed by releasing/canceling the blocker or waiting for a long transaction to end
B) LWLock contention (internal bottleneck)
- Many sessions waiting on LWLock-related wait events
- No single SQL statement is “the blocker”
- Fix usually involves reducing contention: tuning, schema/workload changes, connection management, checkpoint/WAL tuning, or scaling resources
Both can feel like “stop-the-world,” but they’re diagnosed and fixed differently.
Common drivers of LWLock contention
Without overfitting to any one case, LWLock hotspots are often associated with:
- Too many concurrent active sessions (especially without a pooler)
- High write rates that stress WAL and buffer management
- Checkpoint or bgwriter pressure causing heavy buffer churn
- Hot relations/indexes that concentrate access patterns
- Catalog-heavy workloads (lots of DDL, many temporary objects, frequent plan invalidations)
The important point: LWLock problems are frequently about contention and throughput, not about a lock mode like ACCESS EXCLUSIVE.
What to do when you suspect LWLocks
If your incident looks systemic and wait events point to LWLocks:
- Confirm it’s not classic blocking. Check whether there’s an obvious relation-lock queue (many sessions waiting on locks with a single root blocker).
- Look for a pattern in wait events. Are most sessions waiting on the same LWLock tranche/event? That narrows the class of bottleneck.
- Reduce concurrency and smooth spikes. Often the fastest mitigation is to reduce the number of concurrently active queries (pooler limits, app throttling, temporarily pausing heavy jobs).
- Then do the real fix. The durable fix depends on the specific contention point: tuning checkpoints, WAL settings, pooling strategy, schema/index changes, query rewrites, or hardware/I/O improvements.
LWLocks are the “hidden gears” that keep Postgres correct and fast. Most days you’ll never notice them—but when they become a bottleneck, understanding that they’re internal contention (not “a table got locked”) is the difference between flailing and fixing.
8. Heavyweight Locks: The Ones You See in pg_locks
When you’re diagnosing a “stop-the-world” episode, the most actionable evidence usually comes from heavyweight locks—the SQL-level locks managed by Postgres’ lock manager. These are the locks that:
- control concurrency for tables, rows (indirectly), transactions, and various objects,
- can block other sessions for long periods,
- and are visible in the `pg_locks` system view.
If you can find a clear blocking chain, it almost always involves these locks.
What counts as a “heavyweight lock”
“Heavyweight” here doesn’t mean “slow.” It means:
- the lock is tracked in the lock manager,
- it can be waited on,
- and it’s exposed to you for inspection.
Relation locks like ACCESS SHARE and ACCESS EXCLUSIVE are heavyweight locks. Advisory locks are also heavyweight (because they’re managed by the lock manager and visible in pg_locks).
Why pg_locks is so useful in incidents
pg_locks tells you, for each session/process:
- what it is trying to lock (`locktype` + object identifiers)
- which lock mode it holds or wants (`mode`)
- whether the lock is granted or waiting (`granted = true/false`)
That’s the core of answering two questions under pressure:
- Who is blocked?
- Who is blocking them?
The most important columns in pg_locks
You’ll see many columns, but these are the ones you’ll use constantly:
- `pid`: backend process ID holding or waiting on the lock
- `locktype`: what kind of resource is locked (examples below)
- `mode`: lock mode (e.g., `AccessShareLock`, `AccessExclusiveLock`, etc.)
- `granted`: `true` if held, `false` if waiting
- `relation`: OID of the locked relation (when `locktype = 'relation'`)
- `database`: database OID (useful for some locktypes)
- `transactionid` / `virtualxid`: transaction-level lock identifiers
- `classid` / `objid` / `objsubid`: used for some object locks
Tip: the lock mode names in pg_locks.mode are the internal names, but they map directly to the modes discussed earlier:
- `AccessShareLock` ↔ `ACCESS SHARE`
- `RowExclusiveLock` ↔ `ROW EXCLUSIVE`
- `AccessExclusiveLock` ↔ `ACCESS EXCLUSIVE`
…and so on.
Common locktype values you’ll actually encounter
relation
Locks on tables, indexes, materialized views, sequences, etc.
- This is the main one for DDL/DML blocking analysis.
- You can resolve the name with `relation::regclass`.
Stop-the-world pattern: a waiting or granted AccessExclusiveLock on a hot table.
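A minimal sketch for checking a specific table (the table name is hypothetical):

```sql
-- All relation locks held or requested on one table
SELECT pid, mode, granted
FROM pg_locks
WHERE locktype = 'relation'
  AND relation = 'orders'::regclass;
```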
transactionid
A backend waiting for a specific transaction to finish.
- Common with row-level conflicts: “I need to update this row, but I must wait for the transaction that last touched it to commit/rollback.”
- This often means the “blocker” is a long transaction, possibly “idle in transaction.”
virtualxid
A lock on a virtual transaction ID.
- Often appears when certain operations need to ensure no conflicting snapshots/transactions are active.
- Can be involved in DDL waiting patterns that are confusing if you only look for relation locks.
advisory
Advisory locks taken by application code.
- Great tool when used intentionally.
- Dangerous when a global advisory lock becomes a single point of failure.
Other types you may see
Depending on workload and features, you may run into:
- `database` (locks at database scope)
- `extend` (related to extending relation files)
- `object` (miscellaneous catalog objects)
- `page` / `tuple` (less commonly used in day-to-day diagnosis; row-level behavior is more often inferred via `transactionid` waits)
How to interpret “granted = false”
A row in pg_locks with granted = false means:
- the backend has requested a lock,
- it conflicts with an already-granted lock,
- and it is waiting in the queue.
In incidents, it’s worth remembering a subtle but important point:
- The session that’s waiting might not be the root cause. The root cause is often the session holding a lock (or holding a transaction open) that prevents the requested lock from being granted.
Turning pg_locks into something human-readable
Raw pg_locks is hard to read because you need to join it to other views to see query text, usernames, and relation names.
Here’s a practical “what’s blocked and by whom?” query that focuses on relation locks and shows the table names. (This won’t cover every possible locktype, but it’s very effective for classic stop-the-world events.)
```sql
SELECT
  a.pid AS waiting_pid,
  a.usename AS waiting_user,
  a.application_name AS waiting_app,
  a.state AS waiting_state,
  a.wait_event_type,
  a.wait_event,
  a.query AS waiting_query,
  now() - a.query_start AS waiting_for,
  l.mode AS waiting_lock_mode,
  l.relation::regclass AS locked_relation,
  b.pid AS blocking_pid,
  b.usename AS blocking_user,
  b.application_name AS blocking_app,
  b.state AS blocking_state,
  b.query AS blocking_query,
  now() - b.xact_start AS blocking_xact_age
FROM pg_stat_activity a
JOIN pg_locks l
  ON l.pid = a.pid
 AND l.granted = false
 AND l.locktype = 'relation'
JOIN pg_locks bl
  ON bl.locktype = l.locktype
 AND bl.relation = l.relation
 AND bl.granted = true
JOIN pg_stat_activity b
  ON b.pid = bl.pid
 AND b.pid <> a.pid          -- exclude the waiter itself (e.g., lock upgrades)
ORDER BY waiting_for DESC;
```
Notes:
- This highlights relation-level waits, which is where most stop-the-world DDL incidents live.
- It doesn’t compute full lock compatibility (Postgres has its own rules), but in practice it quickly surfaces the likely blocker(s).
- If you see many waiting PIDs on the same `locked_relation`, you’ve found your hotspot.
What heavyweight locks tell you during a “freeze”
Heavyweight locks are where you get your most operationally useful answers:
- Which table is the choke point?
- Is someone holding (or requesting) `AccessExclusiveLock`?
- Is the real issue a long transaction holding locks open?
- Did an advisory lock serialize more than you intended?
Next, we’ll take this from “I can see locks” to “I understand why the whole system queued up” by digging into lock queueing and the “one waiter can stall many” effect.
9. Lock Queuing and Why One Waiter Can Stall Many Others
The most confusing part of PostgreSQL locking—especially during “stop-the-world” incidents—is that the session causing the outage is often not the one holding a lock. It can be the one waiting for a strong lock, because PostgreSQL must preserve correctness and fairness by queueing lock requests.
This is how you get the classic situation: “A migration is waiting for a lock, but while it waits, the whole app starts timing out.”
The core idea: locks are granted in a queue
For any given lockable object (like a table), Postgres maintains a queue of lock requests. When a backend requests a lock:
- If it is compatible with the locks currently granted and granting it wouldn’t violate the queue ordering rules, it can be granted.
- Otherwise, it waits.
That queue ordering rule is the amplifier: once a conflicting request is waiting, later requests can end up waiting too, even if they would have been compatible with the locks that are currently granted.
A concrete scenario: the “pending ACCESS EXCLUSIVE” pileup
Imagine a busy table orders.
- Session A is running a long query:
```sql
SELECT * FROM orders WHERE created_at >= now() - interval '30 days';
```
It takes ACCESS SHARE on orders and keeps it for the duration of the query (and possibly longer if inside a long transaction).
- Session B starts a migration:
```sql
ALTER TABLE orders ADD COLUMN foo text;
```
This commonly requires ACCESS EXCLUSIVE on orders. It cannot be granted while A holds ACCESS SHARE, so B waits.
- Session C arrives and wants to do a normal write:
```sql
UPDATE orders SET status = 'paid' WHERE id = 123;
```
Normally, C’s ROW EXCLUSIVE is compatible with A’s ACCESS SHARE, so you’d expect it to proceed.
But because B is already waiting for ACCESS EXCLUSIVE, Postgres may prevent C from “cutting in line” if granting C would prolong B’s wait. The result: C queues behind B.
Now you’ve got:
- A: granted `ACCESS SHARE` (the original long reader)
- B: waiting for `ACCESS EXCLUSIVE` (the “big request”)
- C, D, E…: normal traffic also waiting because they’re stuck behind the pending big request
From the application’s perspective, it looks like the database suddenly stopped serving orders entirely.
Why Postgres behaves this way
Two design goals collide here:
- Correctness: conflicts must be respected.
- Fairness / starvation avoidance: a strong lock request should eventually run; otherwise, a busy table with constant writes could starve DDL forever.
If Postgres kept granting an endless stream of compatible “small” locks while a strong lock waits, the strong lock might never get a chance to acquire the table. Queueing prevents that.
The tradeoff: you can get a “stop-the-world” feeling when a strong lock request appears.
“The blocker” vs “the head-of-line waiter”
In these incidents, it helps to separate roles:
- Root blocker: the session holding a conflicting lock (often a long-running query or long transaction). In the example, that’s session A.
- Head-of-line waiter: the session waiting for a strong lock that causes others to queue behind it. In the example, that’s session B.
- Victims: everyone else who now can’t get locks quickly. In the example, session C and friends.
Operationally:
- Killing B (the waiter) can “unfreeze” the line immediately, because C and others can proceed with their compatible locks again.
- But the underlying issue (A being long-running) still exists; you just avoided the queue amplification.
That’s why “cancel the migration” often restores service faster than “wait for it.”
How to recognize lock-queue amplification
Signs you’re seeing queueing behavior (not just normal blocking):
- Many sessions are waiting on the same relation.
- The waiting lock mode of the head-of-line session is very strong (often `AccessExclusiveLock`).
- The root blocker might be a completely different query than the one you expect.
- You may see many sessions waiting even though, pairwise, their requested locks would be compatible with the currently held locks.
In monitoring terms:
- `pg_stat_activity` shows lots of sessions in `wait_event_type = 'Lock'`.
- `pg_locks` shows a mix of granted weak locks and one pending strong lock request.
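On PostgreSQL 9.6 and newer, `pg_blocking_pids()` gives a quick view of the queue; a minimal sketch:

```sql
-- For each session waiting on a heavyweight lock, which PIDs are blocking it?
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       state,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';
```

A PID that appears in many `blocked_by` arrays but is not itself waiting is a good root-blocker candidate; a waiting PID that blocks others is likely the head-of-line waiter.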
Practical implications for production
This queueing behavior drives a lot of “rules of thumb” you’ll see from experienced Postgres operators:
- Avoid strong locks on hot tables during peak traffic.
- Prefer concurrent/online schema changes when possible (`CREATE INDEX CONCURRENTLY`, etc.).
- Keep transactions short, especially around DDL and migrations.
- Watch for long-running reads (analytics queries, report jobs) that inadvertently block migrations.
What this section should change in your thinking
A stop-the-world incident is not always “someone held an exclusive lock.” It’s often:
- a long-running session holding a weak lock,
- plus a strong lock request that starts waiting,
- plus Postgres’ lock queue fairness turning that wait into a traffic jam.
Next, we’ll ground this with the compatibility rules (which lock modes conflict) so you can predict when a queued request is likely to trigger a pileup.
10. Compatibility Matrix: Which Locks Conflict in Practice
PostgreSQL decides whether a lock can be granted by checking compatibility: “Does the requested lock mode conflict with any lock modes currently granted on this same object?”
You don’t need to memorize the full official matrix to operate Postgres safely, but you do need a practical version of it—especially around the modes that show up in production: reads (ACCESS SHARE), writes (ROW EXCLUSIVE), maintenance (SHARE UPDATE EXCLUSIVE), and the “stop-the-world” lock (ACCESS EXCLUSIVE).
First: the mental model
For a given table:
- If two lock modes are compatible, they can be held at the same time.
- If they conflict, the later request waits until the conflicting lock is released.
- Separately from pure compatibility, queueing can still make compatible requests wait if a conflicting request is already queued ahead of them (covered in the previous section). Compatibility explains “can these coexist,” while queueing explains “why are they waiting anyway?”
Practical compatibility rules you’ll use most
Reads don’t normally block writes (and vice versa)
- `ACCESS SHARE` (plain `SELECT`)
  - Compatible with: `ACCESS SHARE`, `ROW EXCLUSIVE`, and most other common modes
  - Conflicts with: `ACCESS EXCLUSIVE` (and some high-strength modes used by certain DDL)
This is why normal reads and writes can run together under MVCC.
Writes can usually run together
- `ROW EXCLUSIVE` (typical `INSERT`/`UPDATE`/`DELETE`)
  - Compatible with: `ACCESS SHARE`, `SHARE UPDATE EXCLUSIVE`, and other `ROW EXCLUSIVE` locks
  - Conflicts with: the stronger “share/exclusive” modes involved in DDL and heavy maintenance (`SHARE`, `SHARE ROW EXCLUSIVE`, `EXCLUSIVE`, `ACCESS EXCLUSIVE`)
This is why OLTP workloads with lots of concurrent writes can work fine—until DDL enters the picture.
Maintenance tries to be friendly, but it’s not free
- `SHARE UPDATE EXCLUSIVE` (often VACUUM/ANALYZE-ish)
  - Usually compatible with: reads and writes (`ACCESS SHARE`, `ROW EXCLUSIVE`)
  - Conflicts with: itself, many schema-change operations, and other “strong” modes (and it can be blocked by them)
That’s why VACUUM typically doesn’t block your application, but certain migrations can interfere with it (and vice versa).
ACCESS EXCLUSIVE conflicts with everything
- `ACCESS EXCLUSIVE`
  - Conflicts with: all other modes, including `ACCESS SHARE`
  - Effect: once granted, nobody else can read or write that table until it’s released
This is the lock mode that most closely matches “stop-the-world” behavior for a single relation.
A “useful in practice” mini-matrix
Below is a simplified compatibility table for the modes you’ll see most often. “✓” means compatible; “✗” means conflict (so one side must wait). This is not the full official matrix, but it matches the operational intuition you need for incident response and safe changes.
| Requested \ Held | ACCESS SHARE | ROW EXCLUSIVE | SHARE UPDATE EXCLUSIVE | ACCESS EXCLUSIVE |
|---|---|---|---|---|
| ACCESS SHARE | ✓ | ✓ | ✓ | ✗ |
| ROW EXCLUSIVE | ✓ | ✓ | ✓ | ✗ |
| SHARE UPDATE EXCLUSIVE | ✓ | ✓ | ✗ (conflicts with itself) | ✗ |
| ACCESS EXCLUSIVE | ✗ | ✗ | ✗ | ✗ |
Two important notes:
- The mode a given command takes can depend on the exact subcommand and phase (for example, some “concurrent” operations change lock levels during different phases).
- Even when the matrix says “✓”, lock queueing can still cause waiting if there’s a conflicting request ahead in the queue.
Where people get surprised
Surprise #1: “Why did my ALTER TABLE block simple SELECTs?”
Because many ALTER TABLE operations require ACCESS EXCLUSIVE, and ACCESS EXCLUSIVE conflicts with ACCESS SHARE.
Surprise #2: “Why are writes blocked if the only thing running is a read?”
They usually aren’t—unless a strong lock request is queued behind that read, and now other sessions are queuing behind the strong request.
Surprise #3: “Why did a ‘concurrent’ command still block something?”
“Concurrent” usually means “minimizes blocking of normal reads/writes,” not “never takes locks.” Many concurrent operations still take locks briefly (or take different locks at different phases).
How to apply this during an incident
When you see a stall:
- Identify the hot relation(s) most waiters have in common.
- Look for any session holding or waiting for `AccessExclusiveLock`.
- If no `AccessExclusiveLock` is present, look for other strong modes (`ExclusiveLock`, `ShareRowExclusiveLock`, etc.) and check whether they conflict with typical workload locks (`AccessShareLock`, `RowExclusiveLock`).
- Remember the queue: the “wrong” session might be the head-of-line waiter, not the root blocker.
Reference note
PostgreSQL’s full lock compatibility matrix and lock mode definitions are documented in the official manual under explicit locking / table-level locks. For exact behavior by mode and for less common lock types, rely on the docs for your Postgres version.
Citations: PostgreSQL Documentation – Explicit Locking and Table-Level Locks (Lock Modes and Conflict Table) https://www.postgresql.org/docs/current/explicit-locking.html
11. The Real “Stop-the-World” Lock: ACCESS EXCLUSIVE and What It Blocks
If you had to pick one lock mode that most often creates the “everything just froze” feeling in PostgreSQL, it’s ACCESS EXCLUSIVE.
It is the strongest relation lock mode. When it’s granted on a table, it effectively makes that table unavailable to other sessions for the duration of the lock. When it’s waiting to be granted, it can still trigger a pileup because of lock queueing (the “head-of-line waiter” effect).
What ACCESS EXCLUSIVE actually blocks
ACCESS EXCLUSIVE conflicts with all other relation lock modes, including the weakest one:
- Blocks plain `SELECT` (`ACCESS SHARE`)
- Blocks `INSERT`/`UPDATE`/`DELETE` (`ROW EXCLUSIVE`)
- Blocks row-locking reads (`ROW SHARE`)
- Blocks maintenance modes (`SHARE UPDATE EXCLUSIVE`)
- Blocks other DDL and schema changes (all the stronger modes)
Operationally: if a hot table is central to your app, an ACCESS EXCLUSIVE lock on it can feel like the app lost its database, even though the server is up.
Citations: PostgreSQL documentation, Table-Level Locks (lock modes and conflict behavior)
https://www.postgresql.org/docs/current/explicit-locking.html
Two ways it causes outages
1) It’s granted (hard block)
If a session successfully acquires ACCESS EXCLUSIVE on a busy table:
- every other session that touches that table will wait,
- connection pools fill with blocked sessions,
- upstream timeouts and retries amplify load.
This is the obvious “stop-the-world” pattern: one session holds the table hostage until it commits/rolls back.
2) It’s waiting (queue amplification)
Even if ACCESS EXCLUSIVE is not granted yet, its presence in the lock queue can be damaging.
A common chain is:
- Session A holds a weak lock for a long time (often just `ACCESS SHARE` from a long `SELECT`).
- Session B requests `ACCESS EXCLUSIVE` (DDL) and must wait for A to finish.
- Sessions C, D, E… arrive needing normal locks (reads/writes). Even though those locks might be compatible with A, they can end up queued behind B’s pending exclusive request to avoid starving B.
Result: it looks like “DDL took down the app,” but the root blocker might be a long-running read or an open transaction.
This behavior is part of PostgreSQL’s lock manager fairness rules and is a big reason production migrations can cause surprise outages.
What kinds of statements require ACCESS EXCLUSIVE
Many commands that change the physical structure or metadata of a table require ACCESS EXCLUSIVE, especially when they need to guarantee nobody can concurrently read/write the relation while it’s being rewritten or its definition is changing.
Common examples include:
- Many `ALTER TABLE` variants (especially those that rewrite the table or change constraints in certain ways)
- `TRUNCATE`
- `DROP TABLE` / `DROP INDEX` (on the object being dropped)
- Some forms of `REINDEX` (non-concurrent)
- Operations that rewrite a table (often the “dangerous” ones)
Exactly which statements take ACCESS EXCLUSIVE and for how long can vary by PostgreSQL version and the specific subcommand. The safe operational assumption is: if it’s DDL on a hot table, assume it might request ACCESS EXCLUSIVE until proven otherwise.
Citations: PostgreSQL documentation, ALTER TABLE, TRUNCATE, and lock mode reference
https://www.postgresql.org/docs/current/sql-altertable.html
https://www.postgresql.org/docs/current/sql-truncate.html
https://www.postgresql.org/docs/current/explicit-locking.html
How long is it held?
A key operational gotcha: the lock is generally held for at least the duration of the statement, and often until the end of the transaction.
So these two are very different in impact:
```sql
ALTER TABLE orders ADD COLUMN foo text;
-- autocommit: lock held briefly (still risky on a hot table)
```
vs
```sql
BEGIN;
ALTER TABLE orders ADD COLUMN foo text;
-- ... do other stuff, wait for an app deploy, etc. ...
COMMIT;
-- lock can be held far longer than the DDL itself needed
```
Even if the DDL step is “fast,” holding the transaction open extends the lock’s blast radius.
Citations: PostgreSQL documentation, Explicit Locking and transactional behavior of DDL
https://www.postgresql.org/docs/current/explicit-locking.html
Why this lock is the one to respect
ACCESS EXCLUSIVE is not “bad”—Postgres needs it to safely perform certain changes. The problem is running it on hot objects at the wrong time or letting transactions linger.
In practical terms, whenever you see a stop-the-world incident, you should immediately ask:
- Is there a session holding `AccessExclusiveLock`?
- Is there a session waiting for `AccessExclusiveLock` that has become the head-of-line waiter and caused a queue?
- What transaction is preventing it from being granted? (A query sketch for the first two questions follows below.)
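A minimal sketch that answers the first two questions by joining `pg_locks` to `pg_stat_activity`:

```sql
-- Sessions holding or waiting for ACCESS EXCLUSIVE, and on which relation
SELECT l.pid,
       l.granted,
       l.relation::regclass AS relation,
       a.state,
       now() - a.xact_start AS xact_age,
       left(a.query, 60)    AS query
FROM pg_locks l
JOIN pg_stat_activity a USING (pid)
WHERE l.mode = 'AccessExclusiveLock';
```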
Next, we’ll enumerate the most common operations that take ACCESS EXCLUSIVE, so you can recognize risky commands before you run them in production.
12. Common Operations That Take ACCESS EXCLUSIVE Locks
ACCESS EXCLUSIVE is PostgreSQL’s strongest table-level lock mode. Many operations that change table structure or replace its underlying storage need it to guarantee that no other session is reading from or writing to the table while the change happens.
Exactly which statements take ACCESS EXCLUSIVE can vary by PostgreSQL version and by the specific subcommand, but the following list covers the most common real-world triggers you should treat as “production risky until proven otherwise.”
Citations: PostgreSQL docs on lock modes and SQL commands (see links in each subsection).
ALTER TABLE (many subcommands)
Many ALTER TABLE operations request ACCESS EXCLUSIVE, especially those that change physical layout, rewrite the table, or require strong metadata stability.
Common examples that are frequently ACCESS EXCLUSIVE in practice:
- Adding/dropping columns (dropping is often cheap but still needs strong locking for catalog changes)
- Changing a column type (often forces a rewrite, depending on the change)
- Setting a column `NOT NULL` (often requires validation; may hold strong locks while it checks)
- Adding/dropping constraints in ways that require validation or catalog updates
- Enabling/disabling triggers (impactful and typically needs strong coordination)
Reference: ALTER TABLE
https://www.postgresql.org/docs/current/sql-altertable.html
Lock mode reference:
https://www.postgresql.org/docs/current/explicit-locking.html
Operational tip: even when an ALTER TABLE is “fast,” it can still block hard because it must acquire the lock first. The wait to get the lock is often the outage.
TRUNCATE
TRUNCATE is designed to be fast by deallocating or resetting storage rather than deleting row-by-row, but that speed comes with a big locking requirement: it needs to ensure no concurrent transaction can see a partial state.
- `TRUNCATE` takes an `ACCESS EXCLUSIVE` lock on the table(s) it truncates.
- If you truncate multiple tables in one statement, it locks all of them.
Reference: TRUNCATE
https://www.postgresql.org/docs/current/sql-truncate.html
Lock mode reference:
https://www.postgresql.org/docs/current/explicit-locking.html
DROP TABLE, DROP INDEX, DROP MATERIALIZED VIEW (on the object being dropped)
Dropping an object must prevent concurrent use of it and update catalogs safely, so it generally requires a very strong lock on that object.
Reference: DROP TABLE / DROP INDEX
https://www.postgresql.org/docs/current/sql-droptable.html
https://www.postgresql.org/docs/current/sql-dropindex.html
Lock mode reference:
https://www.postgresql.org/docs/current/explicit-locking.html
Non-concurrent index maintenance
When you rebuild an index without the concurrent option, PostgreSQL generally needs strong locks that block normal access patterns.
- `REINDEX INDEX ...` (non-concurrent) can block operations on that index and may require strong locks on the underlying table depending on the exact command and version.
- `CREATE INDEX` (non-concurrent) does not block reads, but it takes a `SHARE` lock that blocks writes for the duration of the build, and it still takes locks you must respect. The important operational point is: if you need an online build, you typically want the `CONCURRENTLY` variants where available.
Reference: REINDEX / CREATE INDEX
https://www.postgresql.org/docs/current/sql-reindex.html
https://www.postgresql.org/docs/current/sql-createindex.html
Lock mode reference:
https://www.postgresql.org/docs/current/explicit-locking.html
(Note: locking details differ by “concurrent vs non-concurrent” and by phase; later sections on concurrent index operations cover this in depth.)
Table rewrite operations
Some commands effectively create a new copy of the table data and swap it in. Those operations almost always need ACCESS EXCLUSIVE for at least part of their execution because concurrent readers/writers can’t safely interact with a moving target.
Examples include:
- `VACUUM FULL`
- `CLUSTER`
- Some `ALTER TABLE` subcommands that rewrite the heap

References:
- VACUUM (incl. VACUUM FULL): https://www.postgresql.org/docs/current/sql-vacuum.html
- CLUSTER: https://www.postgresql.org/docs/current/sql-cluster.html
Lock mode reference: https://www.postgresql.org/docs/current/explicit-locking.html
Partition maintenance that changes attachment/detachment (often strong)
Partitioning can reduce blast radius, but certain partition operations still require strong locks:
- Attaching/detaching partitions
- Moving data between partitions in ways that require metadata changes
Depending on the exact operation and version, these can require very strong locks (often on the partitioned table and/or partitions).
Reference: partitioning and ALTER TABLE ... ATTACH PARTITION
https://www.postgresql.org/docs/current/ddl-partitioning.html
https://www.postgresql.org/docs/current/sql-altertable.html
The “hidden” trigger: DDL inside a long transaction
A frequent outage pattern isn’t “the DDL took a long time.” It’s:
- the DDL needed `ACCESS EXCLUSIVE`,
- it waited (queue amplification),
- and once it got the lock, it was held longer than expected because the transaction stayed open.
This is why migration frameworks that wrap many steps in a single transaction can be dangerous on hot tables.
Reference: Locking + transactional DDL behavior
https://www.postgresql.org/docs/current/explicit-locking.html
Checklist: treat these as ACCESS EXCLUSIVE until you verify
If you’re operating a busy production database, assume you need a plan (low-traffic window, online alternative, lock/statement timeouts, rollback strategy) before running:
- `ALTER TABLE ...` (anything non-trivial)
- `TRUNCATE`
- `DROP ...`
- `VACUUM FULL`
- `CLUSTER`
- non-concurrent `REINDEX` or heavy rebuild operations
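One common guardrail from that checklist is a short `lock_timeout`, so the DDL gives up quickly instead of sitting at the head of the lock queue; a minimal sketch (the timeout values are only examples):

```sql
BEGIN;
SET LOCAL lock_timeout = '3s';         -- abort if the lock isn't granted quickly
SET LOCAL statement_timeout = '5min';  -- cap the DDL itself
ALTER TABLE orders ADD COLUMN note text;
COMMIT;
-- If this fails with a lock_timeout error, retry later rather than block traffic.
```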
Next, we’ll drill into DDL locking more systematically—what different schema changes do, why some are safe and others rewrite tables, and how to run the risky ones without triggering a lock queue pileup.
13. DDL and Locking: CREATE, ALTER, DROP and Their Impact
DDL (Data Definition Language) is where PostgreSQL most often shifts from “high concurrency” to “everyone wait.” That’s not because Postgres is fragile—it’s because changing schema safely requires strong guarantees about what other sessions can see and do while the metadata is being changed.
This section focuses on the practical reality: CREATE is often the least disruptive, ALTER is the most likely to hurt you, and DROP can be deceptively dangerous depending on what’s being dropped and how widely it’s referenced.
References for exact lock modes and command behavior (varies by version and subcommand):
- Lock modes / conflict rules: https://www.postgresql.org/docs/current/explicit-locking.html
- `CREATE TABLE`: https://www.postgresql.org/docs/current/sql-createtable.html
- `CREATE INDEX`: https://www.postgresql.org/docs/current/sql-createindex.html
- `ALTER TABLE`: https://www.postgresql.org/docs/current/sql-altertable.html
- `DROP TABLE`: https://www.postgresql.org/docs/current/sql-droptable.html
- `DROP INDEX`: https://www.postgresql.org/docs/current/sql-dropindex.html
Why DDL needs locks at all
DDL changes system catalogs (metadata) that every query relies on: table definitions, column types, constraints, indexes, dependencies. If those catalogs changed mid-query without coordination, you could get inconsistent plans, incorrect results, or crashes.
So Postgres uses locks to ensure:
- other sessions don’t concurrently change the same objects in incompatible ways,
- queries either see the “old” schema or the “new” schema in a consistent way,
- table rewrites or file swaps happen safely.
CREATE: Usually low blast radius, with one big exception
CREATE TABLE
Creating a brand-new table is usually safe from a concurrency perspective because nothing depends on it yet. It still takes locks, but it rarely blocks application traffic unless you’re doing unusual catalog-heavy patterns (like creating/dropping many objects rapidly).
CREATE INDEX (non-concurrent)
A plain CREATE INDEX is often less disruptive than people fear, but it is not "free":
- It takes a SHARE lock on the table, so concurrent reads continue but writes (INSERT/UPDATE/DELETE) are blocked while the index builds.
- It also prevents conflicting schema changes for the duration.
- It can still be a performance event (I/O, CPU) even when the locking impact is acceptable.
If you need to keep write traffic flowing, that write-blocking is the "big exception." For production OLTP tables, the safer default is often CREATE INDEX CONCURRENTLY (covered later).
Reference: CREATE INDEX (including CONCURRENTLY notes)
https://www.postgresql.org/docs/current/sql-createindex.html
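To make the contrast concrete, here is a hedged sketch (table and index names are hypothetical):
```sql
-- Plain build: readers continue, but writes to "orders" are blocked
-- (SHARE lock) until the index is finished.
CREATE INDEX idx_orders_created_at ON orders (created_at);

-- Concurrent build: writes continue, but it takes longer, cannot run inside
-- a transaction block, and can leave an INVALID index behind if it fails.
CREATE INDEX CONCURRENTLY idx_orders_created_at ON orders (created_at);
```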
CREATE VIEW, CREATE FUNCTION, etc.
These usually have limited direct locking impact on hot tables unless they touch them in a way that forces dependency checks or compilation steps. The bigger operational risk tends to be deployment mistakes (bad plans, incompatible changes), not stop-the-world locking.
ALTER: The most common cause of stop-the-world incidents
ALTER TABLE is the DDL family most likely to require ACCESS EXCLUSIVE and trigger lock queue pileups.
The impact depends on what kind of alteration you’re doing:
Metadata-only ALTERs (often fast, but they still need strong coordination)
Examples: renaming a column, changing comments, some constraint changes.
These can be quick to execute, but:
- they can still require strong locks to ensure no concurrent session uses an inconsistent definition,
- the biggest risk is waiting to acquire the lock on a hot table.
Table-rewriting ALTERs (can be catastrophic if unplanned)
Some ALTERs force PostgreSQL to rewrite the entire table (copy data to a new physical structure). Common triggers:
- many column type changes,
- adding a column with a non-null default in older versions or certain patterns,
- changing storage parameters that require rewriting,
- other changes that affect the physical layout.
Table rewrite = longer lock hold time and huge I/O, which is the perfect recipe for a stop-the-world episode.
Reference: ALTER TABLE details and which subcommands rewrite tables
https://www.postgresql.org/docs/current/sql-altertable.html
Constraint validation and foreign keys
Some constraint operations aren’t just metadata flips—they require scanning data to prove validity. That can mean:
- longer-running operations,
- stronger locks held longer,
- or strategies like adding constraints as NOT VALID and validating later (often used to reduce blocking).
(We cover FKs and “NOT VALID” patterns in a dedicated section later.)
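As a preview of that pattern, the general shape is a two-step change. The table, column, and constraint names below are hypothetical, and the exact lock behavior is discussed in the dedicated section:
```sql
-- Step 1: add the foreign key without validating existing rows.
-- The catalog change itself is quick, so locks are held only briefly.
ALTER TABLE orders
  ADD CONSTRAINT orders_customer_id_fkey
  FOREIGN KEY (customer_id) REFERENCES customers (id)
  NOT VALID;

-- Step 2: validate later. The scan can be long, but VALIDATE CONSTRAINT
-- takes a weaker lock, so normal reads and writes can continue.
ALTER TABLE orders VALIDATE CONSTRAINT orders_customer_id_fkey;
```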
DROP: Often quick, but can block in surprising ways
Dropping an object needs strong locking because it removes catalog entries and dependencies.
DROP TABLE
- Requires preventing concurrent access to the table being dropped.
- Can cascade to dependent objects (CASCADE), potentially widening the lock footprint.
Reference:
https://www.postgresql.org/docs/current/sql-droptable.html
DROP INDEX
Dropping an index is sometimes thought of as harmless. But it can still block operations that depend on that index’s metadata stability, and it can have big performance implications immediately after (plans change, workload shifts).
Reference:
https://www.postgresql.org/docs/current/sql-dropindex.html
Dependency surprises
A DROP might lock or invalidate more than you expect because of dependencies:
- views, functions, triggers,
- foreign keys,
- materialized views,
- ORMs and prepared statements that assume the old schema.
This is less “lock mode trivia” and more “blast radius awareness.”
The operational reality: DDL pain is often about lock acquisition, not execution
Even “fast DDL” can take your app down if:
- the table is hot,
- a long transaction is holding a weak lock,
- your DDL requests a strong lock (often ACCESS EXCLUSIVE),
- lock queueing causes everyone else to line up behind the DDL request.
So the risky part is frequently the wait to get the lock, not the milliseconds of catalog change once it’s acquired.
Safe patterns you should associate with DDL
When you run CREATE/ALTER/DROP in production, the difference between a routine deploy and an outage is usually process:
- Use lock timeouts so DDL fails fast rather than queueing forever.
- Use statement timeouts to avoid long-running surprises.
- Prefer concurrent variants where available (CREATE INDEX CONCURRENTLY).
- Keep transactions short; don't leave migration transactions open.
- Schedule high-risk ALTERs (especially table rewrites) intentionally.
(Those mitigation techniques get concrete in later sections.)
Takeaway
- CREATE is often low-lock-risk but can still be resource-heavy (indexes).
- ALTER TABLE is the #1 source of stop-the-world behavior because it commonly requires ACCESS EXCLUSIVE and may rewrite data.
- DROP can be quick but still requires strong locking and can have unexpected dependency blast radius.
Next, we’ll go deeper into the specific DDL operations most likely to cause widespread blocking—starting with TRUNCATE and why it’s fast but disruptive.
14. TRUNCATE Locking: Fast Operation, Big Blast Radius
TRUNCATE is one of those PostgreSQL commands that’s simultaneously a performance gift and an operational foot-gun. It’s extremely fast compared to DELETE FROM big_table, but the tradeoff is that it needs very strong locking to be safe.
Why TRUNCATE is so fast
DELETE removes rows one by one, generating lots of row versions, WAL, and vacuum work. TRUNCATE, on the other hand, is closer to “throw away the table’s contents efficiently”:
- it doesn’t scan and delete row-by-row,
- it resets the relation to empty (implementation details vary, but the important part is it’s not doing per-row work).
This is why it’s often used for staging tables, ETL pipelines, and periodic refresh workflows.
Reference: PostgreSQL TRUNCATE docs
https://www.postgresql.org/docs/current/sql-truncate.html
The lock it takes: ACCESS EXCLUSIVE
The cost of that speed is concurrency: TRUNCATE takes an ACCESS EXCLUSIVE lock on each table it truncates.
That means:
- it blocks all reads (SELECT) and writes (INSERT/UPDATE/DELETE) on that table,
- it conflicts with every other table lock mode.
If the table is touched by production traffic, TRUNCATE can instantly create a “stop-the-world” experience.
References:
- TRUNCATE locking notes: https://www.postgresql.org/docs/current/sql-truncate.html
- Lock modes and conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
Why the blast radius can be bigger than “just that table”
There are a few common ways TRUNCATE spreads beyond what you intended:
1) Truncating multiple tables in one command
```sql
TRUNCATE a, b, c;
```
You’re now requesting ACCESS EXCLUSIVE on all of them. If any one of those tables is busy, the statement waits—and while it waits, it can become head-of-line and cause lock queue pileups on whichever table(s) it’s waiting for.
2) Foreign keys: TRUNCATE ... CASCADE
If other tables reference the target via foreign keys, Postgres won’t let you truncate safely without handling those references.
- TRUNCATE ... CASCADE automatically truncates referencing tables too.
- That means more tables locked ACCESS EXCLUSIVE.
- It's very easy to accidentally lock a large portion of your schema.
Reference: TRUNCATE and CASCADE behavior
https://www.postgresql.org/docs/current/sql-truncate.html
3) Sequences and identity columns: RESTART IDENTITY
When you do:
```sql
TRUNCATE my_table RESTART IDENTITY;
```
Postgres resets associated sequences. That’s usually what you want for staging tables, but it’s another reminder that TRUNCATE is a “reset state” operation, not just “delete data.”
Reference: RESTART IDENTITY
https://www.postgresql.org/docs/current/sql-truncate.html
The most common production failure mode
The outage pattern is rarely “TRUNCATE ran for a long time.” It’s usually:
- The app is actively using the table.
- Someone runs TRUNCATE.
- TRUNCATE requests ACCESS EXCLUSIVE and either:
  - gets it and instantly blocks everyone, or
  - waits to get it, becomes head-of-line, and triggers queue amplification.
Either way, the app sees timeouts and connection pileups.
Safer alternatives and patterns
Depending on what you’re trying to do, you can often avoid truncating a hot table:
- For “clear and refill” workflows: use a new table and swap references (or partition swapping), rather than truncating the live table.
- For batch pipelines: truncate only isolated staging tables not touched by OLTP traffic.
- For large deletes in production: a DELETE in controlled batches may be slower, but it can be less disruptive because it doesn't require ACCESS EXCLUSIVE (it uses row-level locks and ROW EXCLUSIVE on the table); see the batching sketch below.
There isn’t a one-size-fits-all answer; the key is recognizing that TRUNCATE is structurally disruptive even if it’s computationally cheap.
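For illustration, a batched cleanup might look like the sketch below. The table, predicate, and batch size are hypothetical; the surrounding loop would live in application code or a scheduled job.
```sql
-- Delete old rows in small batches so each transaction holds row locks briefly,
-- avoids ACCESS EXCLUSIVE entirely, and lets vacuum keep up between batches.
WITH doomed AS (
  SELECT ctid
  FROM events
  WHERE created_at < now() - interval '90 days'
  LIMIT 5000
)
DELETE FROM events e
USING doomed d
WHERE e.ctid = d.ctid;
-- Repeat until it deletes 0 rows, pausing briefly between batches.
```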
Takeaway
TRUNCATE is fast because it avoids row-by-row work, but that speed comes from taking the strongest lock PostgreSQL has on the table: ACCESS EXCLUSIVE. On a hot table, that’s not a maintenance command—it’s an outage trigger unless you plan it carefully.
Next, we’ll look at VACUUM and autovacuum—operations that are designed to be online-friendly—and how their lock behavior differs from table-rewriting commands like VACUUM FULL.
15. VACUUM, Autovacuum, and Lock Behavior
VACUUM is PostgreSQL’s garbage collector for MVCC: it cleans up dead row versions so tables don’t bloat forever, and it helps keep query performance stable. Because it has to run all the time on busy systems, PostgreSQL designs regular VACUUM (and autovacuum) to be as non-blocking as possible.
But “non-blocking” doesn’t mean “no locks,” and it doesn’t mean vacuum can’t get stuck or contribute to production pain.
References:
- VACUUM command docs: https://www.postgresql.org/docs/current/sql-vacuum.html
- Autovacuum behavior and tuning: https://www.postgresql.org/docs/current/runtime-config-autovacuum.html
- Lock modes and conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
What regular VACUUM does (and doesn’t do)
A standard VACUUM:
- removes dead row versions so the space can be reused within the table,
- updates visibility information (visibility map),
- may update statistics if ANALYZE is included (VACUUM (ANALYZE)).
It does not rewrite the whole table and it does not require exclusive access to the table the way VACUUM FULL does (next section).
Reference: https://www.postgresql.org/docs/current/sql-vacuum.html
What locks does VACUUM take?
Regular VACUUM takes a lock mode that is designed to allow normal traffic to continue. In practical terms:
- It does not block normal SELECT, INSERT, UPDATE, or DELETE on the table.
- It can conflict with some DDL or table-rewrite operations that want very strong locks.
Under the hood, regular VACUUM takes a SHARE UPDATE EXCLUSIVE lock on the table, which allows normal reads and writes but prevents another vacuum (and most DDL) from running on the same table at the same time. It also uses short-lived internal locks while it works through pages. The important operational point is that it's intended to be "online."
References: lock mode overview and conflict rules
https://www.postgresql.org/docs/current/explicit-locking.html
Vacuum’s concurrency notes: https://www.postgresql.org/docs/current/sql-vacuum.html
Autovacuum: same locking model, different operational risks
Autovacuum is just PostgreSQL automatically running vacuum/analyze in background workers. Lock behavior is similar to manual vacuum, but autovacuum adds two common operational dynamics:
1) It competes for resources
Even if it doesn’t block your queries via relation locks, it can still affect latency through:
- extra I/O,
- buffer cache churn,
- CPU usage,
- WAL generation (depending on workload).
So you can have “everything feels slower” without seeing obvious lock waits.
Reference: https://www.postgresql.org/docs/current/runtime-config-autovacuum.html
2) It can get starved or prevented from finishing
Autovacuum can be disrupted by things that keep old row versions “pinned” as still potentially visible:
- long-running transactions
- sessions stuck idle in transaction
- lagging or abandoned replication slots (in some setups) holding back cleanup
The result is classic bloat growth: vacuum runs but can’t reclaim enough, tables swell, indexes bloat, and performance degrades over time.
Reference: vacuum and MVCC horizon concepts are discussed across vacuum docs and MVCC/transactions docs
https://www.postgresql.org/docs/current/sql-vacuum.html
How VACUUM interacts with stop-the-world incidents
Regular vacuum is rarely the direct cause of a lock-based “stop-the-world” outage. More often, it’s involved indirectly:
- A migration requests ACCESS EXCLUSIVE and waits.
- Vacuum (or other sessions) continues taking locks compatible with current activity.
- The lock queue behind the migration grows, and then suddenly everything is waiting (queue amplification).
Vacuum itself is usually not the villain, but it’s part of the lock ecosystem.
The two vacuum situations that do deserve extra attention
1) Anti-wraparound vacuum
To prevent transaction ID wraparound (a correctness issue), PostgreSQL will force more aggressive vacuuming when needed. These vacuums can be more urgent and more expensive.
- The intent is safety, not convenience.
- Unlike ordinary autovacuum, which cancels itself when it would block a conflicting lock request, anti-wraparound vacuum is not auto-cancelled, so it can hold its lock in front of DDL until it finishes.
- If you’re near wraparound, the system may spend significant effort vacuuming, which can cause major performance impact even if it isn’t “blocking” in the classic lock sense.
Reference: vacuum and wraparound notes
https://www.postgresql.org/docs/current/sql-vacuum.html
2) VACUUM on heavily contended tables
On extremely hot tables, vacuum may struggle to make progress efficiently (lots of churn). Again, this is more about throughput and contention than about ACCESS EXCLUSIVE, but the user-visible symptom can still be “the database feels jammed.”
What to check when you suspect vacuum-related pain
If you’re diagnosing a slowdown or weird waiting:
- Look at pg_stat_activity wait events.
  If you see a lot of wait_event_type = 'Lock', that's more likely DDL / relation locking than vacuum itself.
  If you see I/O-related waits or LWLock waits, vacuum may be contributing via resource pressure.
- Identify long-running transactions.
  These are the #1 reason vacuum can't reclaim space and bloat grows.
- Inspect vacuum progress views when relevant.
  PostgreSQL provides progress reporting for vacuum (pg_stat_progress_vacuum) in supported versions, which helps you see whether vacuum is stuck or slow.
Reference: vacuum progress reporting is described in the monitoring docs:
https://www.postgresql.org/docs/current/progress-reporting.html
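For example, a quick look at currently running vacuums and their phase (a sketch; the available columns vary a little by version):
```sql
SELECT
  p.pid,
  p.relid::regclass    AS table_name,
  p.phase,
  now() - a.xact_start AS running_for,
  a.wait_event_type,
  a.wait_event
FROM pg_stat_progress_vacuum p
JOIN pg_stat_activity a ON a.pid = p.pid;
```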
Takeaway
- Regular VACUUM and autovacuum are designed to be online-friendly: they take locks that generally allow reads and writes to proceed.
- Their biggest risks are resource contention and being blocked from effective cleanup by long-lived transactions, not classic stop-the-world locking.
- The truly disruptive vacuum variant is VACUUM FULL, which rewrites the table and requires strong locks (covered next).
16. VACUUM FULL: Why It’s So Disruptive
VACUUM FULL is the “nuclear option” of vacuuming. It aggressively compacts a table by rewriting it, which can reclaim disk space back to the operating system. That’s sometimes exactly what you need—but it’s also why it’s one of the most disruptive maintenance commands you can run on a busy system.
References:
- VACUUM / VACUUM FULL docs: https://www.postgresql.org/docs/current/sql-vacuum.html
- Lock modes and conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
What VACUUM FULL does differently from regular VACUUM
Regular VACUUM marks dead tuples as reusable inside the table and updates visibility metadata, but it generally does not shrink the table file on disk.
VACUUM FULL instead:
- creates a new copy of the table containing only live rows,
- compacts it (removing bloat),
- and then swaps it in place of the old one.
This rewrite is what allows it to return disk space to the OS, but it also requires much stronger coordination.
Reference: https://www.postgresql.org/docs/current/sql-vacuum.html
The locking reason it’s disruptive: it needs exclusive access
Because VACUUM FULL is effectively replacing the underlying storage of the table, PostgreSQL must ensure no one is concurrently reading from or writing to that table during the critical parts of the operation.
In practice, VACUUM FULL takes a very strong table lock (documented as requiring an ACCESS EXCLUSIVE-level lock on the table), which:
- blocks plain SELECT,
- blocks INSERT/UPDATE/DELETE,
- blocks most maintenance operations,
- and can create lock-queue pileups if it has to wait.
References:
- Vacuum full behavior: https://www.postgresql.org/docs/current/sql-vacuum.html
- Lock conflict rules: https://www.postgresql.org/docs/current/explicit-locking.html
Why it can “stop the world” even if it’s waiting
If VACUUM FULL can’t immediately acquire its exclusive lock (because the table is busy), it will wait. As covered earlier, a pending strong lock request can become a head-of-line waiter and cause other sessions to queue behind it.
So VACUUM FULL can hurt you in two ways:
- Granted: the table becomes unusable to other sessions until it finishes.
- Waiting: it can still cause broad stalls due to lock queueing rules on that table.
The other reason it’s disruptive: it’s expensive work
Even if you had perfect lock timing, VACUUM FULL is a heavyweight operation:
- It reads and writes the entire table (big I/O).
- It can generate significant WAL (relevant for replication lag and disk usage).
- It can evict useful pages from cache (buffer churn).
- It can take a long time on large relations.
That means it’s not just a locking event; it’s often a performance event for the whole instance.
Reference: https://www.postgresql.org/docs/current/sql-vacuum.html
When people reach for it (and what to consider first)
Common reasons teams run VACUUM FULL:
- severe table bloat after large deletes/updates,
- reclaiming disk space urgently,
- fixing extreme fragmentation.
Before using it, consider less disruptive options depending on your goal:
- If you mainly need query performance, regular vacuum plus proper autovacuum tuning (and possibly pg_repack as a third-party option) may be enough.
- If you need online compaction, tools/strategies that avoid long ACCESS EXCLUSIVE holds are often preferred (e.g., rebuild-and-swap approaches).
- If you need disk space returned to the OS, VACUUM FULL does that, but it's best treated like a maintenance-window operation.
I can’t guarantee which alternative fits your environment without specifics, but the principle is: don’t treat VACUUM FULL as routine maintenance on hot tables.
Practical safety guardrails
If you must run VACUUM FULL in a production-like environment:
- Run it during a low-traffic window.
- Set lock_timeout so it fails fast rather than sitting in the queue and causing pileups.
- Set statement_timeout to avoid an unexpectedly long stall.
- Ensure you have sufficient disk headroom during the rewrite (you may need extra space for the new copy while the old still exists); a quick size check is sketched below.
- Watch replication lag if you stream WAL to replicas.
(We’ll cover timeouts and incident playbooks in later sections.)
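A quick size check helps estimate both the likely runtime and the temporary disk requirement, since the old and new copies of the table coexist during the rewrite (table name is hypothetical):
```sql
SELECT
  pg_size_pretty(pg_table_size('big_table'))          AS heap_size,
  pg_size_pretty(pg_indexes_size('big_table'))        AS index_size,
  pg_size_pretty(pg_total_relation_size('big_table')) AS total_size;
```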
Takeaway
VACUUM FULL is disruptive because it rewrites the table and therefore needs exclusive access (strong locking), and it also does heavy I/O work that can impact the whole system. Treat it as a last-resort maintenance operation for busy databases unless you have a plan to contain the blast radius.
17. REINDEX and Concurrency: When Index Maintenance Blocks Workloads
Index maintenance is one of the easiest ways to accidentally introduce “stop-the-world” behavior, because indexes sit on the critical path of reads and writes. Rebuilding them can require locks that block normal traffic—unless you use the concurrent variants and plan for their tradeoffs.
References:
- REINDEX docs (including CONCURRENTLY): https://www.postgresql.org/docs/current/sql-reindex.html
- Lock modes overview: https://www.postgresql.org/docs/current/explicit-locking.html
- CREATE INDEX (for comparison and lock behavior): https://www.postgresql.org/docs/current/sql-createindex.html
Why you run REINDEX in the first place
Common reasons include:
- fixing index bloat or poor clustering,
- recovering from index corruption (rare, but it happens),
- after certain bulk operations where an index becomes inefficient,
- as part of maintenance on heavily updated tables.
REINDEX rebuilds the index structure from the underlying table (or rebuilds all indexes on a table/database depending on the command variant).
Reference: https://www.postgresql.org/docs/current/sql-reindex.html
The concurrency question: does it block reads or writes?
It depends mainly on whether you use CONCURRENTLY.
REINDEX without CONCURRENTLY (the blocking default)
A plain REINDEX is generally a “maintenance window” operation for hot objects because it can take strong locks that block normal use of the index (and sometimes effectively block the table’s workload depending on what queries need that index).
Typical impact patterns:
- Writes can be blocked if they need to update the index.
- Queries that depend on that index can stall or switch to worse plans (and some operations can’t proceed safely without the index being in a stable state).
- On busy tables, the time to acquire the needed locks can create queues and amplify into an outage.
While the exact lock modes and affected operations vary by the specific REINDEX form (INDEX vs TABLE vs DATABASE) and PostgreSQL version, the practical operational assumption is:
- Non-concurrent REINDEX is potentially workload-blocking and should be treated like high-risk DDL.
Reference: https://www.postgresql.org/docs/current/sql-reindex.html
REINDEX CONCURRENTLY (online-friendly, slower, more complex)
REINDEX CONCURRENTLY is designed to rebuild indexes while allowing normal reads and writes to continue as much as possible. It does this by building a new index alongside the old and swapping them, similar in spirit to CREATE INDEX CONCURRENTLY.
Key characteristics:
- Much lower chance of “stop-the-world” blocking for application traffic.
- Usually takes weaker locks for longer and may take short stronger locks briefly during swap/validation steps.
- Uses more time and resources overall than non-concurrent reindexing.
- Has restrictions (for example, you can’t run it inside a transaction block).
Reference: https://www.postgresql.org/docs/current/sql-reindex.html
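A minimal sketch of the online-friendly form (the index name is hypothetical; remember it cannot run inside a transaction block):
```sql
-- Bound how long we're willing to wait during the brief lock-taking steps.
SET lock_timeout = '5s';

-- Rebuild online; reads and writes continue for most of the operation.
REINDEX INDEX CONCURRENTLY idx_orders_created_at;

-- If it fails partway, it can leave an invalid index behind that you should
-- drop (or retry) once you've investigated.
```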
The stop-the-world failure modes with REINDEX
1) Reindexing a critical index without concurrency
If you REINDEX INDEX on an index that every transaction touches (common for primary keys and hot foreign key indexes), even brief blocking can cascade:
- transactions queue,
- connection pools saturate,
- timeouts and retries amplify load.
This can look exactly like a global database outage even though the issue is localized to one table/index.
2) Waiting for locks can be worse than holding them
Just like with ACCESS EXCLUSIVE DDL, a reindex command can become the head-of-line waiter:
- it requests a lock it can’t immediately acquire,
- it waits,
- and other sessions pile up behind it due to queue ordering.
This is why setting lock_timeout before running maintenance is so effective: you’d rather fail fast than become the reason everyone queues.
Lock timeout reference: PostgreSQL runtime settings (lock_timeout)
https://www.postgresql.org/docs/current/runtime-config-client.html
3) Doing it at the wrong scope
REINDEX TABLE, REINDEX SCHEMA, and REINDEX DATABASE expand the blast radius quickly:
- you may rebuild many indexes you didn’t intend to touch,
- the probability of hitting a hot index (and a hot lock) goes way up,
- it becomes hard to reason about total impact.
Reference: https://www.postgresql.org/docs/current/sql-reindex.html
Choosing between REINDEX and alternatives
Depending on the goal:
- If you're trying to reduce bloat on a single index and you need uptime, prefer REINDEX CONCURRENTLY when it's available and fits your constraints.
- If you can tolerate downtime or are in a maintenance window, non-concurrent REINDEX may be acceptable and faster.
- If you need a broader "rebuild and compact table + indexes" effect, commands like CLUSTER or VACUUM FULL also exist, but they typically have even worse locking characteristics (covered in other sections).
Operational safety checklist for index rebuilds
Before running any REINDEX on a production system:
- Identify whether the target index is on a hot table / hot path.
- Prefer REINDEX CONCURRENTLY for hot objects.
- Set lock_timeout to prevent lock-queue pileups.
- Set statement_timeout (or run from a controlled session) to bound worst-case impact.
- Monitor pg_stat_activity and pg_locks during the operation.
- Avoid massive-scope commands (REINDEX DATABASE) unless you truly mean it and can take the risk.
Takeaway
Index rebuilds can block workloads because indexes are involved in nearly every write and many reads. Non-concurrent REINDEX can be outage-inducing on hot objects, mostly due to strong locks and lock queue amplification. When uptime matters, REINDEX CONCURRENTLY is typically the safer tool—at the cost of longer runtime and higher overall resource usage.
18. CLUSTER and Table Rewrites: Understanding the Downtime Risk
Some PostgreSQL operations aren’t just “metadata changes.” They physically rewrite a table: they build a new copy of its data in a different order or format, then swap it in. These operations can be great for performance and space efficiency—but they are also among the most downtime-prone commands in Postgres because a rewrite typically requires strong locks for correctness.
CLUSTER is the poster child here.
References:
- CLUSTER docs: https://www.postgresql.org/docs/current/sql-cluster.html
- Lock modes / conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
- VACUUM FULL (another rewrite): https://www.postgresql.org/docs/current/sql-vacuum.html
What CLUSTER does
CLUSTER reorganizes a table’s physical order to match an index. The classic use case is improving locality:
- if you often query ranges on (created_at) or (customer_id), clustering by an index on that column can make relevant rows sit closer together on disk,
- which can reduce random I/O and improve cache efficiency.
But PostgreSQL tables do not stay clustered automatically. As inserts/updates happen, the physical order drifts again.
Reference: https://www.postgresql.org/docs/current/sql-cluster.html
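To illustrate, the command itself and a catalog check of which index a table was last clustered on look like this (names are hypothetical):
```sql
-- Reorder the table to match an existing index (strong lock; see below):
CLUSTER orders USING idx_orders_created_at;

-- See which tables currently have a "clustered" index recorded in the catalog:
SELECT i.indrelid::regclass   AS table_name,
       i.indexrelid::regclass AS clustered_index
FROM pg_index i
WHERE i.indisclustered;
```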
Why CLUSTER is risky: it rewrites the whole table
Operationally, CLUSTER is similar to “build a new table and swap it in”:
- Read the original table.
- Write a new reordered copy.
- Swap the new copy into place (and rebuild/adjust indexes as needed).
This implies:
- big I/O (read + write the entire table, often plus index work),
- significant WAL generation (affects replicas and disk),
- buffer churn (evicts useful cache),
- and, crucially, strong locking while the swap happens (and often for the duration, depending on specifics).
Locking behavior: why it can cause real downtime
Because CLUSTER needs a stable target to rewrite and swap safely, it requires a strong table lock (commonly ACCESS EXCLUSIVE on the table). That means:
- plain SELECT is blocked,
- all writes are blocked,
- other maintenance/DDL is blocked,
- and if it has to wait to acquire the lock, it can trigger lock-queue pileups.
Even if the rewrite itself is “fast” for a small table, the attempt to acquire the lock on a hot table can cause a production-wide jam.
References:
- CLUSTER command docs: https://www.postgresql.org/docs/current/sql-cluster.html
- Lock conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
Table rewrites are more than just CLUSTER
The broader category “table rewrite operations” includes:
- VACUUM FULL
- some ALTER TABLE subcommands (especially those that change physical layout or require rewriting rows)
- some "rebuild and swap" maintenance patterns
The common thread is: rewriting means exclusive coordination.
Reference:
- VACUUM FULL: https://www.postgresql.org/docs/current/sql-vacuum.html
- ALTER TABLE: https://www.postgresql.org/docs/current/sql-altertable.html
The real downtime risk: time scales with table size
A rewrite operation’s lock hold time tends to scale with:
- table size (GB/TB matters),
- I/O throughput,
- WAL bandwidth and checkpoint behavior,
- CPU (sorting / index scans),
- background load (other queries fighting for cache and I/O).
So on large tables, these are not “a few seconds” operations—they can become minutes or hours, which is why they belong in planned windows unless you have an online alternative.
I can’t predict your runtime without details, but the scaling behavior is the reason these commands are treated with caution.
Safer approaches when you want the benefits
If your goal is “better locality” or “less bloat” but you can’t take downtime, common strategies include:
- Partitioning to reduce the amount of data any one operation touches (smaller rewrites).
- Online rebuild tools/approaches (e.g., build-and-swap patterns; some teams use third-party tooling such as pg_repack to rewrite with less blocking, accepting operational complexity).
- Index-only improvements (sometimes creating a better index or changing query patterns yields more than clustering).
(These topics are covered in later sections like partitioning and safe migration playbooks.)
Practical guardrails if you must run CLUSTER
If you decide CLUSTER is worth it:
- Run it during a low-traffic/maintenance window.
- Use lock_timeout so it fails fast instead of sitting in the lock queue and stalling others.
- Monitor replication lag (WAL) and disk headroom.
- Consider doing it table-by-table, starting with the smallest/highest-impact targets.
Runtime settings reference (lock_timeout, statement_timeout):
https://www.postgresql.org/docs/current/runtime-config-client.html
Takeaway
CLUSTER is powerful because it rewrites the table into an order that can improve real workloads. The price is that it’s fundamentally a table rewrite, which typically implies strong locks and downtime risk that grows with table size. Treat CLUSTER (and other rewrite operations) as planned, high-blast-radius maintenance unless you have an online strategy.
19. Long-Running Transactions: The Silent Cause of “Everything Is Stuck”
If ACCESS EXCLUSIVE is the obvious “stop-the-world” lock, long-running transactions are the subtle villain that makes those incidents both more likely and harder to recover from.
They often don’t look scary: maybe it’s “just a SELECT,” or a session that’s “idle.” But an open transaction can quietly hold locks, keep snapshots alive, and block maintenance or DDL in ways that cascade into an app-wide stall.
References:
- Locking / table-level locks: https://www.postgresql.org/docs/current/explicit-locking.html
- MVCC and transaction behavior: https://www.postgresql.org/docs/current/mvcc.html
- pg_stat_activity monitoring: https://www.postgresql.org/docs/current/monitoring-stats.html
What “long-running transaction” really means
It’s not about one slow statement. It’s about a transaction that stays open for a long time—seconds, minutes, hours—whether it’s actively running queries or doing nothing.
Examples:
- An application begins a transaction, runs a query, then waits for user input before committing.
- A background worker starts a transaction and keeps it open across many operations.
- A reporting job runs a long SELECT inside an explicit transaction and forgets to commit.
- A session is idle in transaction (we cover that explicitly next), meaning it ran something and then stopped sending commands without ending the transaction.
A long-running transaction can exist even when the connection appears “quiet.”
How a long transaction causes “everything is stuck”
There are three big mechanisms.
1) It holds locks longer than you think
Even a plain SELECT takes a relation lock (ACCESS SHARE) while it runs. More importantly, many operations hold locks until transaction end, not just statement end—especially DDL.
So if a transaction:
- takes row locks via UPDATE / SELECT ... FOR UPDATE, or
- takes relation locks via DDL, or
- does a mixture of reads and writes and stays open,
it can become the root blocker for other sessions that need conflicting locks.
Reference: transactional nature of locking and DDL is discussed in the locking docs
https://www.postgresql.org/docs/current/explicit-locking.html
2) It blocks DDL by keeping old snapshots alive
Postgres uses MVCC snapshots to decide what data is visible. A long-running transaction keeps an old snapshot around.
That matters because many schema changes and maintenance tasks need to ensure there aren’t transactions that could observe inconsistent states or depend on older metadata. So a seemingly harmless long-running transaction can:
- delay ALTER TABLE acquiring the strong lock it needs,
- cause DDL to wait,
- and then trigger lock queue amplification when the DDL request becomes head-of-line.
Reference: MVCC snapshots and transaction visibility
https://www.postgresql.org/docs/current/mvcc.html
3) It prevents vacuum cleanup, leading to bloat and compounding pain
A transaction that stays open can prevent vacuum from reclaiming dead tuples, because those old row versions might still be visible to that old snapshot.
That doesn’t always cause an immediate “freeze,” but over time it causes:
- table and index bloat,
- slower queries,
- higher I/O,
- more frequent and heavier vacuuming,
- and a system that’s easier to tip into a crisis.
Reference: vacuum behavior and MVCC interaction
https://www.postgresql.org/docs/current/sql-vacuum.html
Why long transactions create misleading incidents
In a typical outage story, the timeline looks like this:
- Someone runs a migration needing ACCESS EXCLUSIVE.
- It waits.
- The app starts stalling.
- Everyone blames the migration.
But the real root blocker is often:
- a long-running transaction started earlier (sometimes hours),
- possibly from a report tool, a psql session, or a misbehaving app instance,
- holding locks or snapshots that prevent the migration from progressing.
So the migration becomes the head-of-line waiter, and the long transaction is the silent root cause.
How to spot them quickly
The fastest way during an incident is pg_stat_activity. You’re looking for:
- old xact_start (transaction age),
- state = 'idle in transaction',
- long-running queries (now() - query_start),
- sessions tied to unexpected applications/users.
pg_stat_activity reference:
https://www.postgresql.org/docs/current/monitoring-stats.html
A practical query to list the oldest open transactions:
```sql
SELECT
  pid,
  usename,
  application_name,
  client_addr,
  state,
  now() - xact_start  AS xact_age,
  now() - query_start AS query_age,
  wait_event_type,
  wait_event,
  left(query, 200)    AS query_preview
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start ASC
LIMIT 50;
```
This doesn’t prove a session is the blocker, but it quickly surfaces “usual suspects.”
Operational takeaway
Long-running transactions are dangerous because they:
- extend the lifetime of locks and snapshots,
- make DDL lock acquisition unpredictable,
- enable lock-queue pileups,
- and gradually degrade the system through bloat.
If you want fewer stop-the-world incidents, reducing long-lived transactions (especially “idle in transaction”) is one of the highest-leverage changes you can make.
Next, we’ll zoom in on the worst offender in this category: idle in transaction, why it happens, and how to prevent it.
20. Idle in Transaction: How One Session Can Freeze Progress
“Idle in transaction” is one of the most common, most avoidable causes of PostgreSQL lock misery. It’s not a special mode Postgres chooses; it’s what happens when a client:
- starts a transaction (BEGIN happens explicitly or implicitly),
- runs one or more statements,
- then stops sending commands,
- and crucially: does not COMMIT or ROLLBACK.
At that point, PostgreSQL reports the session as idle in transaction.
References:
- pg_stat_activity and session states: https://www.postgresql.org/docs/current/monitoring-stats.html
- MVCC and transaction visibility: https://www.postgresql.org/docs/current/mvcc.html
- Locking behavior: https://www.postgresql.org/docs/current/explicit-locking.html
- idle_in_transaction_session_timeout: https://www.postgresql.org/docs/current/runtime-config-client.html
Why “idle in transaction” is uniquely dangerous
An idle session that is not in a transaction is usually harmless. An idle session in a transaction can be a production incident waiting to happen because it can do three things at once:
1) Hold locks indefinitely
Depending on what it already executed, an idle-in-transaction session may be holding:
- row locks (e.g., it ran an UPDATE or SELECT ... FOR UPDATE)
- relation locks (e.g., it ran DDL, or it's mid-migration)
- other lock types that block unrelated work
And because the transaction never ends, those locks can stick around until someone intervenes.
Reference: https://www.postgresql.org/docs/current/explicit-locking.html
2) Block DDL and trigger lock-queue pileups
Even if it’s “only” holding a weak relation lock from a query, it can still prevent a later ACCESS EXCLUSIVE request from being granted. Then that ACCESS EXCLUSIVE request becomes head-of-line, and suddenly normal traffic starts queueing behind it.
This is the classic pattern where one forgotten session causes the app to “freeze” during a migration.
3) Prevent vacuum from reclaiming dead tuples
An open transaction holds an old snapshot. While it remains open, vacuum may be unable to remove dead tuples that are still potentially visible to that snapshot. Over time that causes:
- bloat,
- slower queries,
- heavier vacuum work,
- and higher I/O.
Reference: MVCC and vacuum behavior
https://www.postgresql.org/docs/current/mvcc.html
https://www.postgresql.org/docs/current/sql-vacuum.html
How it happens in real life
Common causes include:
- Application code paths that start a transaction and then do network I/O (calling another service, waiting for a file upload, waiting for user input).
- Connection pool misuse, where a transaction is started and the connection is returned to the pool without committing (buggy code or exception paths).
- Interactive sessions (psql, BI tools) where someone runs BEGIN; SELECT ...; and then goes to lunch.
- ORM defaults that open transactions automatically, plus a slow/failed request path that doesn't close them.
Postgres isn’t “being picky” here—your client asked it to keep the transaction open, so it does.
How to identify idle-in-transaction sessions fast
pg_stat_activity includes both the state and transaction timing fields. The telltales are:
- state = 'idle in transaction' (or 'idle in transaction (aborted)')
- an old xact_start
- often a query that is the last statement executed, not necessarily the statement causing current blocking
A practical query:
```sql
SELECT
  pid,
  usename,
  application_name,
  client_addr,
  now() - xact_start   AS xact_age,
  now() - state_change AS idle_in_xact_for,
  left(query, 200)     AS last_query
FROM pg_stat_activity
WHERE state LIKE 'idle in transaction%'
ORDER BY xact_start ASC;
```
Reference: session state fields in pg_stat_activity
https://www.postgresql.org/docs/current/monitoring-stats.html
How one idle transaction “freezes progress”: a typical chain
- Session A: BEGIN; UPDATE accounts SET ... WHERE id=...; (row locks are now held)
- Session A goes idle—transaction remains open.
- Session B: tries to update the same row → waits on A (localized pain).
- Later, Session C: migration tries ALTER TABLE accounts ... → requests ACCESS EXCLUSIVE and waits.
- Sessions D..N: normal traffic arrives; because a strong lock request is queued, they can start queuing too.
- The app looks “down.”
One session, one forgotten transaction—big blast radius.
Prevention: make “idle in transaction” impossible to ignore
You want both application discipline and database guardrails.
Application-level
- Keep transactions short and tightly scoped.
- Never wrap user think-time or network calls inside a DB transaction.
- Ensure exception paths always ROLLBACK.
- Use poolers correctly: return connections only after the transaction is closed.
Database-level guardrail: idle_in_transaction_session_timeout
PostgreSQL can automatically terminate sessions that stay idle in a transaction longer than a configured threshold:
- Setting: idle_in_transaction_session_timeout
- Behavior: if a session is idle in a transaction past the timeout, the backend is terminated (transaction rolled back).
This is one of the most effective “stop-the-world prevention” settings because it turns “infinite lock hostage” into “bounded damage.”
Reference:
https://www.postgresql.org/docs/current/runtime-config-client.html
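A sketch of applying the guardrail at different scopes; the timeout values and the app_user role are illustrative, so pick thresholds that match your application's real transaction patterns:
```sql
-- Per session (e.g., from a migration or maintenance connection):
SET idle_in_transaction_session_timeout = '60s';

-- Per role, so a specific application user always gets the guardrail:
ALTER ROLE app_user SET idle_in_transaction_session_timeout = '60s';

-- Instance-wide default (takes effect after a configuration reload):
ALTER SYSTEM SET idle_in_transaction_session_timeout = '5min';
SELECT pg_reload_conf();
```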
Incident response: what you do when you find one
Operationally you have two escalating actions:
- Cancel the current query (pg_cancel_backend(pid)) — often not enough if it's idle (there may be no running query).
- Terminate the session (pg_terminate_backend(pid)) — ends the transaction and releases locks, but will roll back work.
Use termination carefully, but in a lock-induced outage it’s frequently the fastest way to restore service once you’ve identified the true root blocker.
Reference: server signaling functions
https://www.postgresql.org/docs/current/functions-admin.html
Takeaway
Idle-in-transaction sessions are dangerous because they can hold locks and snapshots indefinitely, blocking DDL and vacuum and triggering lock-queue pileups. The fix is mostly cultural (short transactions) plus one strong safety net: set idle_in_transaction_session_timeout so a single stuck session can’t freeze your database for hours.
21. Lock Escalation Myth in PostgreSQL: What Actually Happens Instead
A lot of database folks come to PostgreSQL expecting “lock escalation”: the idea that if you lock enough rows, the database upgrades that into a table lock automatically to reduce overhead. That’s a real behavior in some systems.
PostgreSQL is different: it does not automatically escalate row locks into a table lock in the classic sense. When Postgres appears to “escalate,” what you’re usually seeing is one of a few other mechanisms: normal table locks that every statement takes, long transactions holding locks, lock queue amplification, or maintenance/DDL operations that require strong table locks.
References:
- Lock modes and relation locks: https://www.postgresql.org/docs/current/explicit-locking.html
- MVCC and concurrency model: https://www.postgresql.org/docs/current/mvcc.html
What people mean by “lock escalation”
In the typical “escalation” story:
- many row locks are acquired,
- the DB decides tracking so many locks is expensive,
- it replaces them with a single table-level lock,
- which suddenly blocks everything.
This can sound like what’s happening in Postgres when an app gets slow after large updates. But that slowdown is almost never because Postgres escalated locks.
What PostgreSQL actually does
1) PostgreSQL always takes a relation lock for a statement anyway
Even if you’re updating rows, Postgres also takes a relation-level lock on the table (commonly ROW EXCLUSIVE for INSERT/UPDATE/DELETE), because it must prevent conflicting schema changes while the statement runs.
That relation lock:
- is normal,
- is compatible with many other operations (including reads and other writes),
- and is not an escalation—it’s simply part of statement execution.
Reference: lock modes and what statements take them
https://www.postgresql.org/docs/current/explicit-locking.html
2) Row-level contention can look like a table-level freeze
If lots of transactions contend on the same set of rows (hot keys), you can get:
- long wait chains,
- many sessions blocked,
- timeouts and pool exhaustion.
From the application’s point of view, it looks like “the table is locked,” but the actual issue is row-level conflicts and transactions waiting on each other (often visible as transactionid waits).
This is not escalation; it’s just many sessions colliding on the same data.
3) Lock queueing can make compatible locks wait
This is the sneaky one. Suppose:
- a long SELECT holds ACCESS SHARE,
- then a migration requests ACCESS EXCLUSIVE and waits,
Suddenly reads and writes are stalled, and it feels like a giant table lock appeared “out of nowhere.”
Again, not escalation. It’s queue fairness plus a strong lock request sitting in line.
(We covered this mechanism in the lock-queuing section.)
4) DDL and table rewrites legitimately require strong locks
When you run operations like:
- many ALTER TABLE subcommands,
- TRUNCATE,
- VACUUM FULL,
- CLUSTER,
- non-concurrent index rebuilds in some forms,
Postgres may request ACCESS EXCLUSIVE (or similarly strong modes). That’s not escalation; it’s the correct lock mode for the operation.
References:
- ALTER TABLE: https://www.postgresql.org/docs/current/sql-altertable.html
- TRUNCATE: https://www.postgresql.org/docs/current/sql-truncate.html
- VACUUM FULL: https://www.postgresql.org/docs/current/sql-vacuum.html
- CLUSTER: https://www.postgresql.org/docs/current/sql-cluster.html
So why does “it escalated” feel true?
Because the symptom is similar:
- lots of sessions waiting,
- throughput collapses,
- and it happens suddenly.
But the cause is usually one of these:
- a long transaction holding locks open,
- a hot-row contention storm,
- a queued ACCESS EXCLUSIVE request causing head-of-line blocking,
- a rewrite/DDL command that legitimately blocks.
Once you stop looking for escalation, you start looking for the real root blocker, which is much easier to fix.
Practical takeaway for operators
When someone says “Postgres escalated the lock,” translate it into a checklist:
- Is there a session holding or waiting for AccessExclusiveLock on the relation? (pg_locks)
- Is there a long-running or idle-in-transaction session? (pg_stat_activity.xact_start, state)
- Are many sessions waiting on transactionid (row contention)? (pg_locks.locktype = 'transactionid' and wait events)
- Did a migration or maintenance job start right before the stall?
That’s where the real answers usually are.
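The first two checks can be combined into one quick query. This is a sketch that only looks at relation-level ACCESS EXCLUSIVE entries, both granted and still waiting:
```sql
SELECT
  l.pid,
  l.relation::regclass AS relation,
  l.mode,
  l.granted,
  a.state,
  now() - a.xact_start AS xact_age,
  left(a.query, 120)   AS query_preview
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.locktype = 'relation'
  AND l.mode = 'AccessExclusiveLock'
ORDER BY l.granted, xact_age DESC;
```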
Takeaway
PostgreSQL doesn’t do classic lock escalation from rows to tables. When you see “it feels escalated,” it’s almost always normal relation locks + queueing + long transactions, or a DDL/maintenance operation that truly needs a strong lock. Understanding that distinction helps you debug faster and choose safer operational patterns.
22. Blocking Chains: How to Identify the “Root Blocker”
When PostgreSQL feels stuck, the most important diagnostic skill is finding the root blocker: the session that, if it ended (commit/rollback/terminate), would cause the whole backlog to drain.
In real incidents you’ll usually see a blocking chain (or a blocking tree), not a single pair of blocker/victim:
- Session C is waiting on B
- Session B is waiting on A
- Session A is running (or idle in transaction) holding the lock everyone needs
Your job is to identify session A.
References:
- Locking / pg_locks: https://www.postgresql.org/docs/current/view-pg-locks.html
- Monitoring with pg_stat_activity: https://www.postgresql.org/docs/current/monitoring-stats.html
- pg_blocking_pids() function: https://www.postgresql.org/docs/current/functions-info.html
- Lock modes: https://www.postgresql.org/docs/current/explicit-locking.html
Key concepts: blocker, victim, and root blocker
- Victim (waiter): a session that cannot proceed because it’s waiting for a lock.
- Blocker: a session currently holding a conflicting lock the victim needs.
- Root blocker: the blocker that is not itself waiting on anyone else (or is the earliest cause in the chain). Removing the root blocker breaks the whole chain.
A common trap: the “obvious bad query” is a head-of-line waiter (like a migration waiting for ACCESS EXCLUSIVE) that is causing queue amplification. That query might be causing the symptom, but the root cause is often a different session holding a lock or holding an open transaction.
Fastest path in practice: use pg_blocking_pids()
PostgreSQL provides a built-in helper that tells you which PIDs are blocking a given PID: pg_blocking_pids(pid).
Start from the victims (sessions waiting) and walk upstream until you find sessions that are not blocked by anyone.
Find currently waiting sessions and their blockers
```sql
SELECT
  a.pid                   AS waiting_pid,
  a.usename               AS waiting_user,
  a.application_name      AS waiting_app,
  now() - a.query_start   AS waiting_for,
  a.wait_event_type,
  a.wait_event,
  left(a.query, 200)      AS waiting_query,
  pg_blocking_pids(a.pid) AS blocking_pids
FROM pg_stat_activity a
WHERE a.wait_event_type = 'Lock'
ORDER BY waiting_for DESC;
```
Interpretation:
- blocking_pids is an array; often it has 1 PID, sometimes more.
- If it's empty but the session is waiting, you may be dealing with a wait that's not a heavyweight lock (or the snapshot is changing quickly). Still, for classic lock incidents, this is very effective.
Reference: pg_blocking_pids()
https://www.postgresql.org/docs/current/functions-info.html
Build the blocking tree to find the root blocker(s)
This recursive query expands blockers until it reaches the top of each chain, producing a readable “who blocks whom” tree.
```sql
WITH RECURSIVE chain AS (
  -- Start with sessions that are waiting on a lock
  SELECT
    a.pid,
    a.usename,
    a.application_name,
    a.state,
    a.wait_event_type,
    a.wait_event,
    a.query,
    a.query_start,
    a.xact_start,
    pg_blocking_pids(a.pid) AS blockers,
    0 AS depth,
    ARRAY[a.pid] AS path
  FROM pg_stat_activity a
  WHERE a.wait_event_type = 'Lock'

  UNION ALL

  -- Walk to the blockers
  SELECT
    b.pid,
    b.usename,
    b.application_name,
    b.state,
    b.wait_event_type,
    b.wait_event,
    b.query,
    b.query_start,
    b.xact_start,
    pg_blocking_pids(b.pid) AS blockers,
    c.depth + 1 AS depth,
    c.path || b.pid
  FROM chain c
  JOIN LATERAL unnest(c.blockers) AS bp(pid) ON true
  JOIN pg_stat_activity b ON b.pid = bp.pid
  WHERE NOT b.pid = ANY (c.path)  -- prevent loops
)
SELECT
  depth,
  pid,
  usename,
  application_name,
  state,
  now() - COALESCE(xact_start, query_start) AS age,
  wait_event_type,
  wait_event,
  left(query, 200) AS query_preview,
  blockers
FROM chain
ORDER BY depth DESC, age DESC;
```
How to use it:
- The highest depth entries are closer to the root blockers.
- Root blockers often have wait_event_type NULL (they're running) or are idle in transaction.
- Look for very old xact_start and suspicious application_name (BI tool, migration runner, ad-hoc psql).
Validate what object they’re fighting over
Once you have candidate blocker PIDs, confirm what they’re locking.
A practical relation-lock focused view:
```sql
SELECT
  a.pid,
  a.usename,
  a.application_name,
  a.state,
  now() - a.xact_start AS xact_age,
  l.locktype,
  l.mode,
  l.granted,
  l.relation::regclass AS relation,
  left(a.query, 200)   AS query_preview
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.locktype = 'relation'  -- includes both granted and still-waiting entries
ORDER BY relation, l.granted DESC, l.mode;
```
This helps you spot the classic stop-the-world signature:
- one session holding a weak lock for a long time,
- another waiting for AccessExclusiveLock,
- and many victims behind it.
Reference: pg_locks
https://www.postgresql.org/docs/current/view-pg-locks.html
Common root blockers (what they look like)
In real production incidents, root blockers often fall into a few buckets:
- Idle in transaction
  - state = 'idle in transaction'
  - old xact_start
  - query text shows the last statement, not the one causing current pain
- Long-running read
  - SELECT running for minutes/hours
  - often from analytics/BI/reporting
- Migration/DDL transaction left open
  - DDL completed quickly but the transaction wasn't committed
  - AccessExclusiveLock might be held longer than expected
- Advisory lock "singleton" gone wrong
  - app took a global advisory lock and got stuck
What to do once you find the root blocker
From least to most disruptive:
- Ask the app/job owner to end it cleanly (commit/rollback/stop the job) if you can coordinate quickly.
- Cancel the query: SELECT pg_cancel_backend(<pid>);
  Useful when it's actively running.
- Terminate the session: SELECT pg_terminate_backend(<pid>);
  Ends the transaction and releases locks (rolls back). Often the fastest way to restore service.
Admin function reference:
https://www.postgresql.org/docs/current/functions-admin.html
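When the root blockers are a class of sessions rather than a single PID, you can target them in one statement. The sketch below terminates only idle-in-transaction sessions older than an illustrative threshold; run the same query without the pg_terminate_backend call first so you can review exactly what you're about to kill:
```sql
SELECT
  pid,
  usename,
  application_name,
  now() - xact_start        AS xact_age,
  pg_terminate_backend(pid) AS terminated
FROM pg_stat_activity
WHERE state LIKE 'idle in transaction%'
  AND xact_start < now() - interval '15 minutes';
```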
Takeaway
Stop-the-world lock incidents are usually blocking chains. The winning move is to stop guessing and systematically:
- list the waiters,
- follow pg_blocking_pids() upstream,
- find the root blocker(s),
- confirm the locked object in pg_locks,
- then choose the least risky intervention (cancel → terminate).
pg_locks, - then choose the least risky intervention (cancel → terminate).
Next, we’ll distinguish blocking chains from deadlocks, which look similar at first but behave differently and require different responses.
23. Deadlocks vs “Stop-the-World” Blocking: How to Tell the Difference
Both deadlocks and stop-the-world-style blocking can look like “everything is waiting,” but they’re fundamentally different problems:
- Stop-the-world blocking is usually a queue: one or a few sessions are blocking many others, and progress resumes when the blocker releases a lock.
- Deadlock is a cycle: sessions are waiting on each other in a loop, so none of them can ever make progress unless the database breaks the cycle.
PostgreSQL can automatically detect deadlocks and resolve them by aborting one participant. It does not automatically “fix” stop-the-world blocking, because that’s often a legitimate long wait rather than a logical impossibility.
References:
- Explicit locking and deadlocks: https://www.postgresql.org/docs/current/explicit-locking.html
- Error codes / deadlock error: https://www.postgresql.org/docs/current/errcodes-appendix.html
- Server log messages / deadlock reporting: https://www.postgresql.org/docs/current/runtime-config-logging.html
- pg_stat_activity / wait events: https://www.postgresql.org/docs/current/monitoring-stats.html
What a deadlock is (in practical terms)
A deadlock happens when there’s a cycle like:
- Session A holds lock X and waits for lock Y
- Session B holds lock Y and waits for lock X
Neither can proceed, ever.
Postgres detects this (after deadlock_timeout) and raises an error:
ERROR: deadlock detected
One transaction gets canceled/aborted, releasing its locks so the others can proceed.
Reference: deadlock discussion and parameters
https://www.postgresql.org/docs/current/explicit-locking.html
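A minimal way to see the cycle form is two interleaved sessions updating the same two rows in opposite order (hypothetical accounts table):
```sql
-- Session A:
BEGIN;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;  -- locks row 1

-- Session B:
BEGIN;
UPDATE accounts SET balance = balance - 10 WHERE id = 2;  -- locks row 2

-- Session A (blocks: row 2 is held by B):
UPDATE accounts SET balance = balance + 10 WHERE id = 2;

-- Session B (blocks: row 1 is held by A -> a cycle):
UPDATE accounts SET balance = balance + 10 WHERE id = 1;

-- After deadlock_timeout, PostgreSQL aborts one of the two transactions with
-- ERROR: deadlock detected; the survivor proceeds.
```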
What “stop-the-world” blocking is
Stop-the-world blocking is usually:
- a long-running transaction holding a lock (or holding a snapshot),
- plus a strong lock request queued behind it,
- plus many sessions queued behind the strong lock request.
There’s no cycle. There’s just a chokepoint and a queue. If the root blocker ends, the queue drains.
This can last indefinitely if the blocker never finishes (for example, “idle in transaction”), but it’s not logically impossible the way a deadlock is.
How to tell them apart in the moment
1) Does it eventually resolve by itself?
- Deadlock: Postgres will resolve it by aborting one participant, usually within seconds (depending on deadlock_timeout).
- Stop-the-world blocking: It can continue indefinitely until the blocker commits/rolls back or is terminated.
Caveat: both can cause timeouts at the application layer before the DB resolves anything, so don’t rely on app errors alone.
2) Look for the deadlock error and log entry
Deadlocks generate a specific SQLSTATE and a detailed log message (if logging captures it). You’ll see:
deadlock detected- a description of the processes and the locks involved
Reference: deadlock logging and error code
- SQLSTATE 40P01: https://www.postgresql.org/docs/current/errcodes-appendix.html
- Logging config: https://www.postgresql.org/docs/current/runtime-config-logging.html
3) Shape of the wait graph: cycle vs tree
If you graph “who waits on whom”:
- Deadlock: cycle (A → B → A, or longer cycles).
- Blocking incident: tree/chain with one or few roots (many → one).
In Postgres terms, if you use pg_blocking_pids() and recursively expand blockers (as in the previous section):
- cycles are a deadlock signature,
- a single root blocker with lots of descendants is stop-the-world blocking.
Reference: pg_blocking_pids()
https://www.postgresql.org/docs/current/functions-info.html
4) Wait events and lock types can hint, but aren’t definitive
Both scenarios often show wait_event_type = 'Lock'.
The distinction is not “Lock wait vs not”—it’s the relationship pattern:
- Deadlock: multiple sessions each waiting and holding locks needed by the others.
- Blocking: most sessions are waiting, but one session is making progress (or is idle holding locks).
You can often spot stop-the-world blocking because the root blocker is:
- running a long query, or
- idle in transaction, or
- holding AccessExclusiveLock or waiting for it and causing queueing.
Why deadlocks can look less “world-stopping” than blocking
Deadlocks usually involve a small number of sessions (2–5 is common) and Postgres resolves them. So they often show up as:
- sporadic transaction failures,
- retries succeeding,
- a few error spikes.
Stop-the-world blocking is more likely to involve dozens/hundreds of sessions and cause widespread timeouts, because it jams a critical table or saturates the pool.
That said, a deadlock inside a hot path can still cause lots of user-visible errors even if it’s quickly resolved.
Response differences: what you do about each
If it’s a deadlock
- Expect Postgres to abort one transaction.
- Fix is usually application-level ordering: take locks in a consistent order, keep transactions short, avoid “lock A then lock B” in one code path and “lock B then lock A” in another.
- Retries are often appropriate for deadlock errors.
Reference: deadlock prevention guidance
https://www.postgresql.org/docs/current/explicit-locking.html
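As an illustration (table and column names are hypothetical), the classic fix is to touch rows in a deterministic order, so two transactions working on the same pair of rows queue instead of deadlocking:
BEGIN;
-- lock both rows in a consistent order (by id) before updating them
SELECT id
FROM accounts
WHERE id IN (17, 42)
ORDER BY id
FOR UPDATE;

UPDATE accounts SET balance = balance - 100 WHERE id = 42;
UPDATE accounts SET balance = balance + 100 WHERE id = 17;
COMMIT;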
If it’s stop-the-world blocking
- Find the root blocker and end it (commit/rollback/terminate) or cancel the head-of-line strong lock request to let compatible traffic proceed.
- Prevent recurrence with timeouts (lock_timeout, statement_timeout, idle_in_transaction_session_timeout), safer migration patterns, and reducing long transactions.
Takeaway
Deadlocks are cycles that Postgres can detect and break by aborting one participant; they show up as deadlock detected errors. Stop-the-world incidents are queues caused by strong lock requests, long transactions, and lock-queue amplification; they can persist indefinitely until you remove the root blocker or the head-of-line waiter.
24. Symptoms Checklist: What You’ll See When the System “Freezes”
A PostgreSQL “freeze” is usually not a crash. It’s the system doing exactly what it was designed to do: protect correctness by making sessions wait. The hard part is that lock-driven waiting often looks like a total outage from the application side.
This checklist is what you typically observe when you’re in a stop-the-world-style lock incident, plus what each symptom usually implies.
References:
- pg_stat_activity and wait events: https://www.postgresql.org/docs/current/monitoring-stats.html
- pg_locks view: https://www.postgresql.org/docs/current/view-pg-locks.html
- Lock modes and conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
Application-level symptoms
□ Sudden spike in request timeouts across many endpoints
- Often indicates pool saturation: threads are waiting on DB connections that are themselves waiting on locks.
- Especially common if most requests touch the same hot table.
□ Retry storms / cascading failures
- Clients retry timeouts, increasing concurrency, which increases lock queue depth and makes the stall worse.
- Watch for a sharp increase in request volume without a corresponding increase in completed requests.
□ Connection pool exhaustion
- Pool at max, many app threads blocked waiting for a free connection.
- From Postgres, you’ll see lots of backends “active” but not progressing.
Database-level “it looks weirdly calm” symptoms
□ CPU is not pegged (sometimes it’s surprisingly low)
- Lock waits don’t burn CPU.
- This is a classic clue that the bottleneck is waiting, not compute.
□ I/O is not obviously saturated
- In pure lock incidents, disk may look normal.
- (Contrast: table rewrites, vacuum full, cluster, massive sort operations—those often do show I/O pressure.)
What you’ll see in pg_stat_activity
□ Many sessions with wait_event_type = 'Lock'
This is the most direct “lock incident” indicator.
- state may be active (a query is running but currently waiting)
- wait_event may indicate the specific lock class being waited on (implementation detail; don’t overinterpret the exact string)
Reference: wait event fields
https://www.postgresql.org/docs/current/monitoring-stats.html
□ A small number of sessions running for a long time
Look for:
- very old query_start (long-running statement),
- very old xact_start (long-running transaction),
- state = 'idle in transaction' (particularly dangerous).
□ A migration/DDL statement sitting “active” but not progressing
Often an ALTER TABLE, TRUNCATE, DROP, REINDEX, VACUUM FULL, CLUSTER, etc. It might be:
- the head-of-line waiter (waiting for ACCESS EXCLUSIVE), or
- the holder (already has the exclusive lock).
What you’ll see in pg_locks
□ A pile of waiting locks (granted = false)
If the freeze is lock-driven, pg_locks will show many entries where granted = false.
Reference: pg_locks columns and meaning
https://www.postgresql.org/docs/current/view-pg-locks.html
□ Many waits on the same relation
A telltale sign is many waiting locks tied to one table:
- locktype = 'relation'
- relation::regclass is the same for most waiters
That’s your chokepoint table.
□ One session holding or waiting for AccessExclusiveLock
Common patterns:
- one backend holds AccessExclusiveLock on a hot table (hard freeze), or
- one backend is waiting for AccessExclusiveLock and many others are queued behind it (queue amplification).
Reference: lock modes and conflicts
https://www.postgresql.org/docs/current/explicit-locking.html
□ Lots of transactionid waits
If you see many waits with:
locktype = 'transactionid'
…that’s often a sign of heavy row-level contention or many sessions waiting for one long transaction to commit/rollback.
This tends to be more “hot-row storm” than “DDL stop-the-world,” but in high-traffic systems it can still feel like an outage.
“Smell tests” that narrow the cause quickly
□ SELECTs are timing out too
If plain reads are timing out on a particular table, suspect:
- a strong relation lock (ACCESS EXCLUSIVE or similar), or
- lock queueing behind a pending exclusive request.
□ Only certain writes/time-critical paths are failing
If only specific entity updates fail (e.g., one customer or one order ID), suspect:
- row-level contention,
- long transactions holding row locks,
- or a deadlock pattern (which usually resolves with errors).
□ The database is reachable, but everything is slow
If you can connect and run SELECT 1; quickly, but application queries hang, suspect:
- relation lock on a hot object,
- lock queueing,
- or severe internal contention (LWLocks / I/O) rather than a full outage.
A quick “freeze triage” checklist
If you want a tight checklist to run mentally:
- □ Are many sessions waiting with wait_event_type = 'Lock'? (pg_stat_activity)
- □ Is there a head-of-line ACCESS EXCLUSIVE request? (pg_locks on relation)
- □ Is there an old open transaction / idle in transaction session? (pg_stat_activity.xact_start, state)
- □ Is the pain localized to one relation? (relation::regclass)
- □ Is it actually a deadlock (errors) rather than blocking (waiting)? (logs + deadlock errors)
Takeaway
A lock-driven “freeze” is characterized by lots of lock waits, queueing, and often one hot relation with either a held or pending strong lock. The fastest path out is usually identifying the root blocker (often a long transaction or idle-in-transaction session) and understanding whether a queued ACCESS EXCLUSIVE request is amplifying the stall.
25. Measuring Impact: Latency, Throughput, and Connection Pileups
Lock incidents are sneaky because “the database is waiting” doesn’t always look like “the database is busy.” To manage an incident (and to justify preventive work later), you need a way to quantify impact in three dimensions:
- Latency: how long requests take
- Throughput: how many requests complete per second
- Connection pileups: how many sessions are stuck waiting, and where
References:
- Monitoring statistics (pg_stat_activity, pg_stat_database): https://www.postgresql.org/docs/current/monitoring-stats.html
- Wait events: https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-STATS-VIEWS-TABLE
- pg_locks: https://www.postgresql.org/docs/current/view-pg-locks.html
Latency: “how long are we waiting?”
In a lock freeze, the single most useful latency metric is time spent waiting for locks, because that translates directly into user-facing response time.
What to look at:
- Application request latency (p50/p95/p99)
- p99 usually explodes first in lock incidents.
- Database query latency
- slow query logs help if they include lock waits (often the query isn’t “slow,” it’s “blocked”).
Inside Postgres, use pg_stat_activity to spot who has been waiting the longest:
SELECT
pid,
usename,
application_name,
now() - query_start AS query_age,
wait_event_type,
wait_event,
left(query, 200) AS query_preview
FROM pg_stat_activity
WHERE wait_event_type IS NOT NULL
ORDER BY query_age DESC;
Interpretation:
- Large query_age + wait_event_type = 'Lock' is “pure lock wait time.”
- If query_age is large but wait events aren’t lock-related, you may have I/O or internal contention instead.
Reference: pg_stat_activity fields
https://www.postgresql.org/docs/current/monitoring-stats.html
Throughput: “are we still completing work?”
A system can have thousands of “active” sessions and still complete almost nothing if most are waiting.
What to look at:
- Completed requests per second at the app/load balancer
- Transactions committed/rolled back in Postgres
From Postgres, a quick sense of throughput comes from pg_stat_database:
SELECT
datname,
xact_commit,
xact_rollback,
blks_read,
blks_hit,
tup_returned,
tup_fetched,
tup_inserted,
tup_updated,
tup_deleted
FROM pg_stat_database
WHERE datname = current_database();
This is cumulative. For incident measurement you typically sample it twice (e.g., 10–30 seconds apart) and compute deltas to get rates. The key signal in a lock freeze is:
- request volume remains high,
- but commit rate drops sharply (or stalls),
- rollbacks/timeouts may rise.
Reference: pg_stat_database
https://www.postgresql.org/docs/current/monitoring-stats.html
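A minimal sketch of the two-sample approach (run it twice, roughly 30 seconds apart, and do the subtraction yourself or in your monitoring tooling):
SELECT now() AS sampled_at,
       xact_commit,
       xact_rollback
FROM pg_stat_database
WHERE datname = current_database();
-- run the same query again ~30s later, then:
-- commits/sec ≈ (xact_commit_second - xact_commit_first) / seconds_between_samples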
Connection pileups: “where are we stuck?”
Lock incidents become outages because waiting sessions accumulate faster than they drain. That’s the connection pileup.
What pileups look like
- Many sessions in pg_stat_activity with wait_event_type = 'Lock'
- Connection pool is maxed out
- New requests can’t even get a DB connection
- Sometimes max_connections pressure shows up too (more on that if you’re running without a pooler)
Measure pileup size and shape
Count sessions by state and wait type:
SELECT
state,
wait_event_type,
count(*) AS sessions
FROM pg_stat_activity
WHERE datname = current_database()
GROUP BY state, wait_event_type
ORDER BY sessions DESC;
This tells you whether you have:
- a lock wait storm (wait_event_type = 'Lock'),
- a CPU/I/O issue (other wait event types),
- or simply too many idle connections.
Reference: monitoring stats and wait events
https://www.postgresql.org/docs/current/monitoring-stats.html
Identify the hot relation causing the pileup
If it’s a relation-lock incident, group by relation:
SELECT
l.relation::regclass AS relation,
l.mode,
l.granted,
count(*) AS cnt
FROM pg_locks l
WHERE l.locktype = 'relation'
GROUP BY 1, 2, 3
ORDER BY cnt DESC, relation;
Typical stop-the-world signature:
- many granted = false rows on a single relation,
- often with someone waiting for (or holding) AccessExclusiveLock.
Reference: pg_locks
https://www.postgresql.org/docs/current/view-pg-locks.html
Translating DB symptoms into user impact
A useful mental mapping:
- Latency spikes happen immediately when sessions start waiting.
- Throughput collapse happens when the backlog saturates the pool and commits slow down.
- Pileup growth happens when arrival rate > completion rate (classic queueing).
So during the incident, track:
- p95/p99 request latency,
- commit rate (or completed request rate),
- number of waiting sessions (and on what).
That combination tells you if the situation is:
- stabilizing (pileup shrinking),
- steady-state (pileup flat, still bad),
- or runaway (pileup growing, timeouts rising).
Takeaway
Impact measurement in lock incidents is about confirming the “waiting outage” pattern:
- Latency: wait times dominate query duration (pg_stat_activity.query_age + lock waits).
- Throughput: commits per second drop (sample pg_stat_database deltas).
- Pileups: waiting sessions explode and concentrate on a relation (pg_locks grouped by relation).
Once you can quantify those three, the next step is observability: knowing exactly which views, joins, and queries to use to find the root blocker quickly.
26. Observability Primer: The Must-Know Catalogs and Views
When a lock incident hits, you don’t want to “browse around.” You want a small set of Postgres views you trust, and you want to know what question each one answers.
This section is a cheat sheet for the must-know catalogs and views for lock and “stop-the-world” diagnosis.
References:
- Statistics and monitoring views: https://www.postgresql.org/docs/current/monitoring-stats.html
- pg_locks: https://www.postgresql.org/docs/current/view-pg-locks.html
- System catalogs overview: https://www.postgresql.org/docs/current/catalogs.html
- Progress reporting views: https://www.postgresql.org/docs/current/progress-reporting.html
pg_stat_activity: what each session is doing (and waiting on)
This is your primary incident dashboard.
Use it to answer:
- Who is currently running a query?
- Who is waiting, and on what (wait_event_type, wait_event)?
- Which sessions have long-lived transactions (xact_start)?
- Who is idle in transaction?
- What is the query text and which app/user is it from?
Key columns you’ll use constantly:
- pid, usename, application_name, client_addr
- state, state_change
- query, query_start
- xact_start
- wait_event_type, wait_event
Reference: https://www.postgresql.org/docs/current/monitoring-stats.html
pg_locks: what locks are held and what locks are waiting
This is the ground truth for heavyweight locks.
Use it to answer:
- Which locks are granted vs waiting (granted)?
- Which lock mode is involved (mode)?
- What object is locked (locktype, relation, transactionid, etc.)?
- Is there a pending AccessExclusiveLock on a hot table?
Pro tip:
- For relation locks, resolve names with relation::regclass.
Reference: https://www.postgresql.org/docs/current/view-pg-locks.html
pg_blocking_pids(pid): who is blocking this session right now
This is the fastest way to move from “lots of waiters” to “who do I talk to / kill?”
Use it to answer:
- For this waiting PID, which PIDs are directly blocking it?
It’s especially useful combined with pg_stat_activity and a recursive query to find root blockers.
Reference: https://www.postgresql.org/docs/current/functions-info.html
pg_stat_database: high-level health and throughput counters
This view is about “is the system making progress?”
Use it to answer:
- Are commits happening?
- Are rollbacks spiking?
- Are we reading from disk vs cache?
- Is activity concentrated in one database?
It’s cumulative, so you typically sample it twice to get rates.
Reference: https://www.postgresql.org/docs/current/monitoring-stats.html
Progress reporting views: “is the big operation still moving?”
When the problem involves long maintenance or DDL-like operations, progress views help distinguish “blocked” from “slow but progressing.”
Common ones:
- pg_stat_progress_vacuum (vacuum progress)
- pg_stat_progress_create_index (index build progress)
- pg_stat_progress_cluster (cluster progress)
- pg_stat_progress_copy (COPY progress)
Use them to answer:
- Is the operation advancing or stuck waiting?
- Which phase is it in?
- How much work remains?
Reference: https://www.postgresql.org/docs/current/progress-reporting.html
Catalogs for resolving OIDs to names
Lock rows often contain OIDs. You need to translate those quickly.
Most common conversions:
- relation::regclass (fastest for relation OIDs)
- pg_class (tables, indexes, relkinds)
- pg_namespace (schemas)
- pg_database (databases)
- pg_roles / pg_authid (roles; pg_roles is safer for most users)
References:
- Catalogs overview: https://www.postgresql.org/docs/current/catalogs.html
- pg_class: https://www.postgresql.org/docs/current/catalog-pg-class.html
- pg_namespace: https://www.postgresql.org/docs/current/catalog-pg-namespace.html
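For example (the OID below is a placeholder; substitute the value you saw in pg_locks.relation), either of these gets you from OID to a readable name:
-- quickest: cast the OID to regclass
SELECT 16384::regclass;

-- or resolve the schema-qualified name explicitly via the catalogs
SELECT n.nspname AS schema_name,
       c.relname AS relation_name,
       c.relkind
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.oid = 16384;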
pg_stat_statements: who is generating pain over time
This isn’t a “find the blocker right now” view, but it’s crucial for prevention and postmortems.
Use it to answer:
- Which query patterns are slow or frequent?
- Which statements contribute most total time?
- Are there recurrent queries that often get stuck behind locks?
It requires enabling the extension and configuration, so it may not be available in every environment.
Reference: pg_stat_statements docs
https://www.postgresql.org/docs/current/pgstatstatements.html
Wait events: the “what are we waiting for” taxonomy
Wait events are not a lock list. They’re a classification of why a backend isn’t running on CPU right now.
Use wait events to distinguish:
- lock waits (wait_event_type = 'Lock') vs
- internal contention (LWLock / LWLockTranche) vs
- I/O waits vs
- client/network waits.
This matters because the fix path is different.
Reference: wait event reporting in monitoring stats
https://www.postgresql.org/docs/current/monitoring-stats.html
A minimal “incident toolkit” query set
If you only memorize three things, make them these:
- Who’s waiting and on what? (pg_stat_activity)
- What lock is involved and on which relation? (pg_locks with relation::regclass)
- Who is blocking whom? (pg_blocking_pids())
Everything else is supporting detail.
Takeaway
PostgreSQL gives you enough observability to solve most lock incidents quickly, but only if you know where to look. pg_stat_activity tells you who is waiting, pg_locks tells you what they’re waiting on, and pg_blocking_pids() tells you who’s responsible. Add progress views for long operations and catalog lookups for OID resolution, and you have a complete “stop-the-world” debugging toolkit.
27. Using pg_stat_activity to Spot Blockers and Victims
When a PostgreSQL system “freezes,” pg_stat_activity is the fastest place to separate:
- victims: sessions waiting and piling up
- suspects: sessions running long or holding transactions open
- likely blockers: sessions that victims say are blocking them (via pg_blocking_pids())
pg_stat_activity won’t tell you everything about locks by itself (that’s pg_locks), but it gives you the operational context you need: user, app, client address, query text, and timing.
References:
- pg_stat_activity fields and session states: https://www.postgresql.org/docs/current/monitoring-stats.html
- Wait events: https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-STATS-WAIT-EVENT-TABLE
- pg_blocking_pids(): https://www.postgresql.org/docs/current/functions-info.html
The columns that matter most in an incident
In pg_stat_activity, these columns are your “blockers vs victims” toolkit:
- pid: the session identifier you’ll use everywhere else
- state: active, idle, idle in transaction, etc.
- query, query_start: what it’s doing (or last did) and how long it’s been running
- xact_start: how long the transaction has been open (critical)
- wait_event_type, wait_event: what it’s waiting for right now
- usename, application_name, client_addr: who/what to blame (or contact)
Reference: https://www.postgresql.org/docs/current/monitoring-stats.html
Step 1: Find the victims (the waiters)
Victims are sessions that are currently waiting. In lock incidents, the most common signal is:
wait_event_type = 'Lock'
Query:
SELECT
pid,
usename,
application_name,
client_addr,
state,
now() - query_start AS query_age,
now() - xact_start AS xact_age,
wait_event_type,
wait_event,
left(query, 200) AS query_preview
FROM pg_stat_activity
WHERE datname = current_database()
AND wait_event_type IS NOT NULL
ORDER BY query_age DESC;
How to interpret:
- Lots of rows with wait_event_type = 'Lock' → lock storm.
- If wait_event_type is mostly LWLock* or I/O-related, you may have internal contention rather than classic blocking.
Step 2: Attach “who is blocking me?” to each victim
pg_stat_activity doesn’t have a direct “blocked by” column, but Postgres provides pg_blocking_pids(pid).
SELECT
a.pid AS waiting_pid,
a.usename,
a.application_name,
now() - a.query_start AS waiting_for,
left(a.query, 200) AS waiting_query,
pg_blocking_pids(a.pid) AS blocking_pids
FROM pg_stat_activity a
WHERE a.datname = current_database()
AND a.wait_event_type = 'Lock'
ORDER BY waiting_for DESC;
Interpretation:
- If most victims list the same blocker PID(s), you’re close to the root cause.
- If victims show different blockers, you may have multiple hot spots or a more complex chain.
Reference: https://www.postgresql.org/docs/current/functions-info.html
Step 3: Find the “classic” blockers in pg_stat_activity
Even without pg_locks, pg_stat_activity often makes the blocker obvious because blockers are usually:
A) Long-running transactions
Look for old xact_start:
SELECT
pid,
usename,
application_name,
client_addr,
state,
now() - xact_start AS xact_age,
now() - query_start AS query_age,
wait_event_type,
wait_event,
left(query, 200) AS query_preview
FROM pg_stat_activity
WHERE datname = current_database()
AND xact_start IS NOT NULL
ORDER BY xact_start ASC
LIMIT 50;
Red flags:
- very large xact_age (minutes/hours)
- state = 'idle in transaction' (often the real villain)
B) Idle in transaction
This is the “held locks forever” pattern:
SELECT
pid,
usename,
application_name,
client_addr,
now() - xact_start AS xact_age,
now() - state_change AS idle_in_xact_for,
left(query, 200) AS last_query
FROM pg_stat_activity
WHERE datname = current_database()
AND state LIKE 'idle in transaction%'
ORDER BY xact_start ASC;
If you find one of these during a lock storm, treat it as a prime suspect.
C) The head-of-line DDL waiter
Migrations often show up as a single DDL statement that’s been “running” for a long time, but it’s actually waiting.
Look for:
- state = 'active'
- wait_event_type = 'Lock'
- query text includes ALTER TABLE, TRUNCATE, DROP, REINDEX, VACUUM FULL, CLUSTER, etc.
This session may be the symptom amplifier rather than the root blocker.
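A quick way to surface such sessions is a heuristic query like the sketch below; the regular expression is only an illustration and will miss DDL wrapped in functions or generated by tooling:
SELECT pid,
       usename,
       application_name,
       now() - query_start AS waiting_for,
       left(query, 200) AS query_preview
FROM pg_stat_activity
WHERE datname = current_database()
  AND state = 'active'
  AND wait_event_type = 'Lock'
  -- crude pattern match for common DDL/maintenance keywords
  AND query ~* '^\s*(alter|truncate|drop|reindex|vacuum|cluster|create)\y'
ORDER BY waiting_for DESC;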
Step 4: Label sessions as blocker vs victim (quick triage rule)
A useful incident triage heuristic:
- Victim: wait_event_type = 'Lock' and pg_blocking_pids(pid) is not empty
- Likely blocker: appears inside other sessions’ blocking_pids, and often has:
  - no lock wait event (it’s running or idle),
  - an old xact_start,
  - or state = 'idle in transaction'
You can pull the blocker details directly:
WITH waiters AS (
SELECT
pid AS waiting_pid,
unnest(pg_blocking_pids(pid)) AS blocking_pid
FROM pg_stat_activity
WHERE datname = current_database()
AND wait_event_type = 'Lock'
)
SELECT
b.pid,
b.usename,
b.application_name,
b.client_addr,
b.state,
now() - b.xact_start AS xact_age,
now() - b.query_start AS query_age,
left(b.query, 200) AS query_preview,
count(*) AS blocked_sessions
FROM waiters w
JOIN pg_stat_activity b ON b.pid = w.blocking_pid
GROUP BY 1,2,3,4,5,6,7,8
ORDER BY blocked_sessions DESC, xact_age DESC;
This gives you a prioritized “top blockers” list.
What pg_stat_activity can’t tell you (and when to switch to pg_locks)
pg_stat_activity will not reliably tell you:
- which relation (table) is the hotspot,
- what lock mode is involved (AccessExclusiveLock vs others),
- or the full lock compatibility picture.
Once you have candidate blocker PIDs, the next step is usually to join to pg_locks to confirm:
- the locked object,
- the lock mode,
- granted vs waiting.
(That’s the next section.)
Takeaway
Use pg_stat_activity to quickly:
- identify victims (wait_event_type = 'Lock'),
- attach blockers (pg_blocking_pids()),
- find the usual root causes (old xact_start, idle in transaction, long-running reads),
Then use pg_locks to confirm exactly what’s being locked and why.
28. Reading pg_locks Without Losing Your Mind
pg_locks is the most accurate view of heavyweight locking in PostgreSQL—and also one of the easiest to misread. It’s low-level, full of OIDs, and a single incident can produce thousands of rows.
The trick is to stop thinking “I will understand every lock” and instead use pg_locks to answer three questions:
- What is waiting? (granted = false)
- What object is the bottleneck? (usually a relation)
- Is there a strong lock involved? (often AccessExclusiveLock)
References:
- pg_locks view: https://www.postgresql.org/docs/current/view-pg-locks.html
- Lock modes and conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
- System catalogs / OIDs: https://www.postgresql.org/docs/current/catalogs.html
The columns that matter (ignore the rest at first)
For lock incidents, start with:
- pid: who holds/wants the lock
- locktype: what kind of thing is being locked
- mode: lock mode (internal names like AccessExclusiveLock)
- granted: true = held, false = waiting
- object identifiers:
  - relation (for relation locks)
  - transactionid / virtualxid (for transaction waits)
  - classid / objid / objsubid (misc object locks)
Reference: https://www.postgresql.org/docs/current/view-pg-locks.html
Step 1: Filter to the waits (the incident surface area)
Most of the noise in pg_locks is granted locks that are normal and harmless. During a “freeze,” the signal is in the waits:
SELECT
pid,
locktype,
mode,
relation::regclass AS relation,
transactionid,
virtualxid
FROM pg_locks
WHERE NOT granted
ORDER BY locktype, relation, mode;
If this returns many rows, you have real contention. If it returns almost none, your problem is probably not heavyweight locks (maybe LWLocks, I/O, or client-side issues).
Step 2: Identify the hotspot relation (most common stop-the-world pattern)
In stop-the-world DDL incidents, most waits will be locktype = 'relation', concentrated on one or a few relations.
Group them:
SELECT
relation::regclass AS relation,
mode,
count(*) AS waiting
FROM pg_locks
WHERE locktype = 'relation'
AND NOT granted
GROUP BY 1, 2
ORDER BY waiting DESC, relation;
What you’re looking for:
- a single table with a large number of waiters, and/or
- a waiting AccessExclusiveLock request on a hot table.
Step 3: Learn the 5 locktypes you’ll actually use
pg_locks.locktype can be many things, but in practice, most incident work involves these:
relation
Locks on tables/indexes/etc. The most actionable for “stop-the-world.”
- Resolve with relation::regclass.
transactionid
Usually means: “I need to know whether transaction X commits/rolls back before I can proceed.”
Common with row-level conflicts and hot-row storms.
virtualxid
Often shows up when operations need to coordinate with active transactions/snapshots. Can be involved in DDL waiting patterns.
advisory
App-controlled locks. Great when intentional; nasty when a session holds one forever.
object
Misc catalog objects. Important sometimes, but start with relation locks unless you have evidence otherwise.
Reference: pg_locks lock types
https://www.postgresql.org/docs/current/view-pg-locks.html
Step 4: Interpret lock mode names in pg_locks
In SQL you say ACCESS EXCLUSIVE. In pg_locks.mode you’ll see:
- AccessShareLock
- RowExclusiveLock
- ShareUpdateExclusiveLock
- AccessExclusiveLock
- etc.
Operationally:
- If you see AccessExclusiveLock involved on a hot table, treat it as a high-likelihood stop-the-world trigger.
- If you see lots of RowExclusiveLock waits, you may be looking at DDL conflict or queue amplification (writes usually don’t block each other at the table-lock level).
Reference: lock modes
https://www.postgresql.org/docs/current/explicit-locking.html
Step 5: Always join to pg_stat_activity for “who/what/why”
A lock row without the query text is half the story. Join on pid:
SELECT
a.pid,
a.usename,
a.application_name,
a.client_addr,
a.state,
a.wait_event_type,
a.wait_event,
now() - a.xact_start AS xact_age,
now() - a.query_start AS query_age,
l.locktype,
l.mode,
l.granted,
l.relation::regclass AS relation,
left(a.query, 200) AS query_preview
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE a.datname = current_database()
ORDER BY l.granted ASC, query_age DESC NULLS LAST;
Now you can see:
- which app/user holds the lock,
- whether it’s idle in transaction,
- and how long it’s been open.
Reference: pg_stat_activity
https://www.postgresql.org/docs/current/monitoring-stats.html
Common “gotchas” that make people misdiagnose pg_locks
Gotcha 1: “There are tons of locks, so that must be the problem”
No. A healthy system has lots of granted locks. The problem is waiters.
Gotcha 2: “The DDL statement is the blocker”
Often the DDL is waiting (not granted). The root blocker is commonly a long transaction holding a weaker lock.
Use:
- granted to distinguish holder vs waiter
- pg_blocking_pids() to find the real blocker
Reference: https://www.postgresql.org/docs/current/functions-info.html
Gotcha 3: “I don’t see row locks, so it can’t be row contention”
Row-level waits often show up as transactionid waits rather than neat tuple identifiers. You may see many transactionid waiters tied back to one long-running transaction.
Gotcha 4: “Everything is waiting on one table, but I don’t see AccessExclusiveLock”
It might be:
- another strong mode (ExclusiveLock, ShareRowExclusiveLock),
- an object-level lock,
- or queueing behind a pending strong request.
A simple workflow that keeps you sane
When you’re under pressure, do this in order:
- Count waiters: are there many NOT granted locks?
- Find the hotspot relation: group waiting relation locks by relation::regclass.
- Find the strong request: look for AccessExclusiveLock waiting/held on that relation.
- Identify the blocker PID(s): join to pg_stat_activity and use pg_blocking_pids().
- Decide action: cancel the head-of-line waiter, terminate the root blocker, or fix the app behavior.
Takeaway
Reading pg_locks is manageable if you focus on waits, not totals; relations, not every locktype; and lock modes that matter operationally (AccessExclusiveLock especially). Combine it with pg_stat_activity for context, and you can move from “Postgres is frozen” to “this PID on this table is causing it” quickly.
29. Joining pg_locks and pg_stat_activity: A Practical Blocking Query
When you’re in the middle of a lock incident, you want one query that answers, in one screen:
- who is waiting,
- what they’re waiting on (which relation + which lock mode),
- who is blocking them,
- and what the blocker is doing (query text, transaction age, idle-in-transaction status).
There isn’t a single “perfect” query for every locktype, but the one below is a very practical starting point for the most common stop-the-world scenario: relation lock contention (hot table + strong lock request + queue pileup).
References:
- pg_locks: https://www.postgresql.org/docs/current/view-pg-locks.html
- pg_stat_activity: https://www.postgresql.org/docs/current/monitoring-stats.html
- Lock modes: https://www.postgresql.org/docs/current/explicit-locking.html
- pg_blocking_pids(): https://www.postgresql.org/docs/current/functions-info.html
Practical “show me blockers and victims” query (relation locks)
This version uses pg_blocking_pids() to avoid re-implementing lock compatibility rules, then enriches with relation lock details from pg_locks.
WITH waiting AS (
SELECT
a.pid AS waiting_pid,
a.usename AS waiting_user,
a.application_name AS waiting_app,
a.client_addr AS waiting_client,
a.state AS waiting_state,
a.wait_event_type,
a.wait_event,
now() - a.query_start AS waiting_query_age,
now() - a.xact_start AS waiting_xact_age,
left(a.query, 200) AS waiting_query,
unnest(pg_blocking_pids(a.pid)) AS blocking_pid
FROM pg_stat_activity a
WHERE a.datname = current_database()
AND a.wait_event_type = 'Lock'
),
waiting_rel_locks AS (
SELECT
l.pid,
l.mode,
l.relation,
l.relation::regclass AS relation_name
FROM pg_locks l
WHERE l.locktype = 'relation'
AND l.granted = false
)
SELECT
w.waiting_pid,
w.waiting_user,
w.waiting_app,
w.waiting_client,
w.waiting_state,
w.waiting_query_age,
w.wait_event_type,
w.wait_event,
rl.mode AS waiting_lock_mode,
rl.relation_name AS locked_relation,
w.blocking_pid,
b.usename AS blocking_user,
b.application_name AS blocking_app,
b.client_addr AS blocking_client,
b.state AS blocking_state,
now() - b.xact_start AS blocking_xact_age,
now() - b.query_start AS blocking_query_age,
left(b.query, 200) AS blocking_query
FROM waiting w
LEFT JOIN waiting_rel_locks rl
ON rl.pid = w.waiting_pid
LEFT JOIN pg_stat_activity b
ON b.pid = w.blocking_pid
ORDER BY
w.waiting_query_age DESC,
locked_relation NULLS LAST,
w.waiting_pid,
w.blocking_pid;
What this gives you:
- A row per (waiting PID, blocking PID) pair.
- The relation (when the waiting is on a relation lock) plus the waiting lock mode.
- Full context about the blocker (transaction age, last query, state).
Why pg_blocking_pids() is helpful:
- It uses PostgreSQL’s actual lock conflict rules, including edge cases, so you don’t need to infer blockers by “same relation” alone.
Reference: pg_blocking_pids()
https://www.postgresql.org/docs/current/functions-info.html
How to use the output during an incident
1) Find the hotspot relation quickly
Sort/scan for locked_relation that appears repeatedly. If dozens/hundreds of waiters point at the same relation, that’s your blast radius.
2) Identify the root blocker vs head-of-line waiter
- If the blocker is idle in transaction with a huge blocking_xact_age, it’s often the root cause.
- If the blocker is itself waiting on locks (check its wait_event_type in a separate query), you may be in a chain.
- If many sessions are waiting and you see one DDL query waiting for a strong lock, that DDL may be the head-of-line waiter amplifying the incident.
3) Decide intervention
Common quick wins:
- cancel/stop the head-of-line migration to remove queue amplification, or
- terminate the root blocker that’s holding locks open.
(Your incident policy decides which is acceptable.)
Two small upgrades that often help
Upgrade A: collapse to “top blockers”
If you want a prioritized list:
WITH w AS (
SELECT unnest(pg_blocking_pids(pid)) AS blocking_pid
FROM pg_stat_activity
WHERE datname = current_database()
AND wait_event_type = 'Lock'
)
SELECT
b.pid,
b.usename,
b.application_name,
b.client_addr,
b.state,
now() - b.xact_start AS xact_age,
left(b.query, 200) AS query_preview,
count(*) AS blocked_sessions
FROM w
JOIN pg_stat_activity b ON b.pid = w.blocking_pid
GROUP BY 1,2,3,4,5,6,7
ORDER BY blocked_sessions DESC, xact_age DESC;
Upgrade B: show the blocker’s relation locks too
Sometimes you want to confirm the blocker is holding (not just blocking via transactionid semantics). Add a join to pg_locks for the blocker’s granted relation locks.
Reference for join logic: pg_locks
https://www.postgresql.org/docs/current/view-pg-locks.html
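A minimal sketch (12345 is a placeholder for the blocking PID you identified above):
SELECT l.pid,
       l.mode,
       l.granted,
       l.relation::regclass AS relation
FROM pg_locks l
WHERE l.pid = 12345            -- placeholder: the suspected blocker's PID
  AND l.locktype = 'relation'
  AND l.granted
ORDER BY relation;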
Limitations to be aware of (so you don’t misread it)
- This is optimized for lock waits (wait_event_type = 'Lock'). If the system is “frozen” due to LWLocks or I/O, you need a different approach (wait events, not pg_locks).
- The relation info here is pulled from the waiting session’s relation lock request; some waits (especially row/transactionid waits) won’t have a relation row to display.
Takeaway
Join pg_stat_activity (context + wait state) with pg_locks (lock details) and use pg_blocking_pids() (true blocking relationships) to get a high-signal incident view: victims → locked relation → blocking PID → blocker transaction/query. This is usually enough to identify the root blocker and stop the pileup fast.
30. Lock Timeouts: Preventing Infinite Waiting
A big reason lock incidents turn into outages is not that locks exist—it’s that sessions can wait forever by default. One DDL statement that sits in the lock queue for minutes can become head-of-line and trigger a pileup, saturating your pool and taking the app down.
lock_timeout is the simplest guardrail against that: it makes lock acquisition fail fast instead of waiting indefinitely.
References:
- lock_timeout runtime parameter: https://www.postgresql.org/docs/current/runtime-config-client.html
- Locking and conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
What lock_timeout does
lock_timeout sets a maximum amount of time a statement will wait to acquire a lock. If the lock cannot be acquired within that time, PostgreSQL cancels the statement with an error.
Key properties:
- It applies to waiting for locks, not executing once you have them.
- It’s measured per statement while it is blocked on locks.
- It helps prevent queue amplification because the head-of-line waiter will give up instead of collecting victims behind it.
Reference: https://www.postgresql.org/docs/current/runtime-config-client.html
Why this matters specifically for stop-the-world incidents
Most production “freeze” stories have this shape:
- A hot table is being used normally.
- A migration or maintenance command requests a strong lock (ACCESS EXCLUSIVE or similar).
- It waits.
- Other sessions start queuing behind it.
- The pool fills, timeouts explode.
With lock_timeout, step 3 turns into:
- “migration fails quickly,” not “migration becomes the reason everyone queues.”
That’s often the difference between:
- an annoying failed deploy, and
- a cascading outage.
How to set it (session, transaction, or role/database level)
Session-level (common for migration scripts)
SET lock_timeout = '2s';
Transaction-level (only for the current transaction)
BEGIN;
SET LOCAL lock_timeout = '2s';
-- your DDL here
COMMIT;
SET LOCAL is nice for migrations because it doesn’t leak into other transactions on the same connection.
Reference: GUC scoping rules and runtime settings
https://www.postgresql.org/docs/current/runtime-config-client.html
Role/database defaults (use carefully)
You can set a default for a user or database so “dangerous clients” can’t wait forever:
ALTER ROLE app_user SET lock_timeout = '1s';
-- or
ALTER DATABASE mydb SET lock_timeout = '1s';
This is powerful but can surprise applications that aren’t written to handle lock-timeout errors cleanly.
What error you’ll get (and what your app should do)
When lock_timeout triggers, Postgres raises an error like:
ERROR: canceling statement due to lock timeout
Apps should generally treat this as a retryable failure for idempotent operations, with backoff/jitter. For migrations, it usually means “abort and try again during a quieter window.”
(Exact SQLSTATE is documented, but the key is to handle it intentionally rather than letting it bubble up as a generic 500.)
Reference: error behavior tied to lock timeout setting
https://www.postgresql.org/docs/current/runtime-config-client.html
Choosing a value: what works in practice
A good operational stance is:
- Short for DDL/migrations on hot tables (e.g., 1–5 seconds): fail fast, avoid pileups.
- Longer (or default) for batch jobs where waiting is acceptable and you control concurrency.
There is no universal best value, but the rule is: the closer an operation is to “might request ACCESS EXCLUSIVE on a hot table,” the more you want a short lock_timeout.
lock_timeout vs statement_timeout
These are complementary:
- lock_timeout limits waiting to acquire locks.
- statement_timeout limits total statement runtime (including waiting + execution).
In stop-the-world scenarios, lock_timeout is often the more important safety valve because the worst incidents come from lock queueing, not from long CPU execution.
Reference: runtime client settings
https://www.postgresql.org/docs/current/runtime-config-client.html
A safe migration pattern using lock timeout
A common pattern is:
BEGIN;
SET LOCAL lock_timeout = '2s';
SET LOCAL statement_timeout = '30s';
-- DDL that might block
ALTER TABLE orders ADD COLUMN foo text;
COMMIT;
Outcome:
- If the table is busy and the lock isn’t available quickly, the migration fails fast.
- If it does acquire the lock, the statement has a bound on how long it can run.
This doesn’t make risky DDL “safe,” but it makes its failure mode much less catastrophic.
Takeaway
Lock timeouts turn “infinite waiting” into “bounded damage.” Set lock_timeout (especially for migrations and maintenance commands) so a single strong lock request can’t sit in the queue long enough to stall your entire application.
31. Statement Timeouts: Putting an Upper Bound on Damage
If lock_timeout prevents a statement from waiting forever to get a lock, statement_timeout prevents a statement from running (or waiting) forever in total. It’s your “hard ceiling” safety valve: no single SQL statement should be allowed to consume unlimited time and tie up a connection indefinitely.
References:
- statement_timeout runtime parameter: https://www.postgresql.org/docs/current/runtime-config-client.html
- Locking concepts: https://www.postgresql.org/docs/current/explicit-locking.html
What statement_timeout does
statement_timeout sets the maximum time any single statement may run. If it exceeds the threshold, PostgreSQL cancels it.
Important details:
- The timer includes time spent waiting (locks, I/O, etc.) plus actual execution.
- It applies per statement (not per transaction).
- When triggered, you’ll see an error like:
ERROR: canceling statement due to statement timeout
Reference: https://www.postgresql.org/docs/current/runtime-config-client.html
Why it matters in stop-the-world incidents
Lock-related outages often involve “stuck” statements that hold onto connections:
- victims waiting on locks,
- DDL waiting to acquire ACCESS EXCLUSIVE,
- maintenance commands running far longer than expected.
Even if you set lock_timeout, you can still be harmed by statements that do acquire the lock but then run too long, or by long-running reads that indirectly block DDL. statement_timeout bounds both.
This is especially important if your application uses a connection pool: every stuck statement consumes a pool slot.
How to set it (session, transaction, role/database)
Session-level
SET statement_timeout = '30s';
Transaction-scoped (recommended for migrations)
BEGIN;
SET LOCAL statement_timeout = '30s';
-- your changes here
COMMIT;
Defaults for an app role
ALTER ROLE app_user SET statement_timeout = '5s';
Be cautious with role/database defaults: they can break legitimate long reports or maintenance tasks unless those tasks use a different role or override the setting.
Reference: https://www.postgresql.org/docs/current/runtime-config-client.html
statement_timeout vs lock_timeout: how to use both
They solve different failure modes:
- lock_timeout: “Don’t wait too long to acquire locks.” Stops queue amplification and lock pileups.
- statement_timeout: “Don’t let a statement run too long overall.” Caps total damage from runaway queries, slow plans, heavy I/O, or unexpectedly slow maintenance.
For production changes (especially DDL), using both is common:
BEGIN;
SET LOCAL lock_timeout = '2s';
SET LOCAL statement_timeout = '30s';
ALTER TABLE orders ADD COLUMN foo text;
COMMIT;
If this fails, it fails predictably, without turning into a multi-minute outage.
Picking values that won’t backfire
There’s no single correct value, but the safe approach is to pick timeouts based on intent:
- Interactive OLTP queries: low seconds (e.g., 1–5s) often makes sense if your app can retry.
- Background jobs: longer timeouts or explicit overrides, because they may legitimately scan lots of data.
- Migrations/DDL: short lock_timeout, moderate statement_timeout, and run in controlled steps.
If your app doesn’t handle statement timeout errors, turning this on globally can cause new failures. The best rollout is usually:
- start with migration tooling and batch jobs you control,
- then consider app-wide defaults with proper retry/backoff.
Reference: runtime configuration discussion
https://www.postgresql.org/docs/current/runtime-config-client.html
What happens when a statement times out
When statement_timeout triggers:
- the current statement is canceled,
- the transaction is left in an aborted state if you were inside an explicit transaction block (you must ROLLBACK),
This is a common gotcha: timeouts are protective, but only if your client cleans up by rolling back promptly.
Reference: transactional behavior and statement cancellation
https://www.postgresql.org/docs/current/runtime-config-client.html
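A minimal demonstration you can run safely in a scratch session (pg_sleep stands in for a slow statement):
BEGIN;
SET LOCAL statement_timeout = '1s';
SELECT pg_sleep(10);   -- canceled after ~1s: "canceling statement due to statement timeout"
-- the transaction is now aborted; further statements fail with
-- "current transaction is aborted, commands ignored until end of transaction block"
ROLLBACK;              -- ends the transaction and releases any locks it still held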
Takeaway
statement_timeout is the “upper bound on damage” setting: it prevents runaway or unexpectedly slow statements from tying up connections indefinitely. Combined with lock_timeout, it gives you predictable failure modes for risky operations—especially migrations—so a single statement can’t quietly turn into a stop-the-world incident.
32. DDL Safely in Production: Patterns That Reduce Lock Pain
Running DDL on a busy PostgreSQL database is less about memorizing lock modes and more about adopting patterns that keep the “blast radius” small when locks are contested. The goal is to make schema changes:
- short-lived (don’t hold strong locks long),
- retryable (fail fast instead of queueing),
- incremental (avoid table rewrites and full validations in one step),
- observable (you can tell if it’s blocked and back out safely).
References:
- Locking and lock modes: https://www.postgresql.org/docs/current/explicit-locking.html
- ALTER TABLE: https://www.postgresql.org/docs/current/sql-altertable.html
- CREATE INDEX / CONCURRENTLY: https://www.postgresql.org/docs/current/sql-createindex.html
- Runtime timeouts (lock_timeout, statement_timeout): https://www.postgresql.org/docs/current/runtime-config-client.html
Pattern 1: Fail fast on lock contention (don’t become head-of-line)
The most important production DDL habit is: never wait forever for a lock.
In migration sessions, set timeouts:
BEGIN;
SET LOCAL lock_timeout = '2s';
SET LOCAL statement_timeout = '30s';
-- DDL here
COMMIT;
- lock_timeout prevents queue amplification: if you can’t get the lock quickly, you abort rather than jamming the table.
- statement_timeout ensures the statement doesn’t run unbounded after it does get the lock.
Reference: https://www.postgresql.org/docs/current/runtime-config-client.html
Pattern 2: Keep the transaction scope tiny
DDL is transactional in PostgreSQL, which is great for correctness—but it also means locks can be held until COMMIT.
Rules that prevent accidental lock hostage situations:
- Don’t bundle many unrelated DDL steps into one long transaction on hot tables.
- Don’t do “DDL + data backfill + deploy + wait” in one transaction.
- Avoid leaving a migration session open after the DDL runs.
If you use a migration tool that wraps everything in a single transaction by default, be extra cautious with operations that request strong locks.
Reference: transactional DDL and locking behavior
https://www.postgresql.org/docs/current/explicit-locking.html
Pattern 3: Prefer “online” variants when available
Some operations have online-friendly forms specifically to reduce lock pain:
- CREATE INDEX CONCURRENTLY
- REINDEX CONCURRENTLY (where supported)
- Adding constraints in a way that avoids full blocking (e.g., NOT VALID then VALIDATE CONSTRAINT for certain constraints)
These aren’t magic—they still take locks—but they’re designed to avoid long ACCESS EXCLUSIVE holds.
References:
- CREATE INDEX CONCURRENTLY: https://www.postgresql.org/docs/current/sql-createindex.html
- REINDEX CONCURRENTLY: https://www.postgresql.org/docs/current/sql-reindex.html
- Constraint operations: https://www.postgresql.org/docs/current/sql-altertable.html
Pattern 4: Avoid table rewrites whenever possible
The operations that hurt most are the ones that rewrite the entire table:
- VACUUM FULL
- CLUSTER
- some ALTER TABLE changes (especially certain type changes, defaults, storage changes)
For production systems, the question to ask isn’t “will it work?” but “does it rewrite the table, and how long will it hold an exclusive lock?”
If you can’t avoid a rewrite, treat it as a maintenance-window change or use an online rebuild approach.
Reference: ALTER TABLE rewrite behavior varies by subcommand; see docs
https://www.postgresql.org/docs/current/sql-altertable.html
Pattern 5: Use expand/backfill/contract instead of “big bang” changes
For application-facing schema changes, the safest pattern is almost always:
- Expand: add new nullable columns / new tables / new indexes (online-friendly)
- Backfill: populate data in batches (outside of strong DDL locks)
- Contract: enforce constraints, switch reads/writes, remove old columns later
This pattern reduces lock pain because the slow part (backfill) is not done under an exclusive schema lock.
(We’ll lay out this playbook in detail later, but it’s worth calling out as the default mindset.)
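A minimal sketch of the backfill step (table and column names are illustrative; the loop itself usually lives in application or migration code):
-- assumes the "expand" step already added a nullable column orders.legacy_status_copy
UPDATE orders
SET legacy_status_copy = status
WHERE id IN (
    SELECT id
    FROM orders
    WHERE legacy_status_copy IS NULL
    ORDER BY id
    LIMIT 1000
);
-- repeat from a loop, pausing between batches, so each statement only holds
-- row locks briefly instead of touching the whole table in one long transaction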
Pattern 6: Separate “validation” steps from “metadata” steps
Some operations look like metadata but require scanning data:
- adding a NOT NULL constraint,
- validating CHECK constraints.
When possible, do:
- add the constraint as NOT VALID,
This shifts risk from “instant lock outage” to “controlled background work,” usually with less blocking.
Reference: constraint validation options in ALTER TABLE
https://www.postgresql.org/docs/current/sql-altertable.html
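A minimal sketch for a foreign key (names are illustrative):
-- step 1: add the constraint without scanning existing rows
ALTER TABLE orders
  ADD CONSTRAINT orders_customer_id_fkey
  FOREIGN KEY (customer_id) REFERENCES customers (id)
  NOT VALID;

-- step 2: validate later, during a quieter window; this scans the table
-- but takes a weaker lock than a validating ADD CONSTRAINT would
ALTER TABLE orders
  VALIDATE CONSTRAINT orders_customer_id_fkey;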
Pattern 7: Reduce the blast radius with partitioning or smaller units
If a single table is enormous and hot, any strong lock on it is scary. Partitioning can:
- limit operations to a smaller partition,
- make maintenance and index builds more targeted,
- reduce the number of concurrent sessions impacted by a single lock.
Partitioning isn’t free (it changes query planning and operational complexity), but it’s a powerful tool to avoid “one table locks the world.”
Reference: partitioning overview
https://www.postgresql.org/docs/current/ddl-partitioning.html
Pattern 8: Make DDL observable and abortable
Before and during production DDL:
- watch pg_stat_activity for lock waits and long transactions,
- watch pg_locks for a pending AccessExclusiveLock,
- use progress views for long operations (pg_stat_progress_create_index, etc.),
References:
- Monitoring stats: https://www.postgresql.org/docs/current/monitoring-stats.html
- Progress reporting: https://www.postgresql.org/docs/current/progress-reporting.html
A practical “safe DDL” checklist
Before you run DDL on a hot table:
- □ Does it require ACCESS EXCLUSIVE (or rewrite the table)?
- □ Do I have lock_timeout set so it won’t queue forever?
- □ Is the change deployable via expand/backfill/contract?
- □ Do I have a rollback plan and a way to stop quickly?
Takeaway
Safe production DDL is about engineering the failure mode. Use timeouts to fail fast, keep transactions short, prefer concurrent/online operations, avoid rewrites, and roll schema changes out in phases (expand/backfill/contract). Those patterns turn lock behavior from “surprise outage” into “predictable, manageable risk.”
33. Using CREATE INDEX CONCURRENTLY to Avoid Blocking Writes
Creating an index on a busy table is a classic way to cause pain: a standard CREATE INDEX takes a SHARE lock on the table, blocking writes for the entire duration of the build, and even when it doesn’t “stop the world,” it can create enough contention to trigger timeouts and pileups.
CREATE INDEX CONCURRENTLY is PostgreSQL’s built-in, production-friendly way to build an index while allowing normal reads and writes to keep flowing.
References:
- CREATE INDEX / CONCURRENTLY: https://www.postgresql.org/docs/current/sql-createindex.html
- Lock modes / conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
- Progress view for index builds: https://www.postgresql.org/docs/current/progress-reporting.html
What “concurrently” really means
CREATE INDEX CONCURRENTLY builds the index in a way that avoids holding locks that would block normal DML for long periods.
In practical terms:
- Your application can keep doing INSERT/UPDATE/DELETE while the index is being built.
- The index build takes longer and does more work than a normal CREATE INDEX.
Reference: https://www.postgresql.org/docs/current/sql-createindex.html
The big rule: you can’t run it inside a transaction block
This is the #1 operational gotcha:
BEGIN;
CREATE INDEX CONCURRENTLY idx_orders_created_at ON orders(created_at);
COMMIT;
This will fail. PostgreSQL requires CREATE INDEX CONCURRENTLY to run outside an explicit transaction block.
So your migration tooling must support “non-transactional” migrations for this step.
Reference: CREATE INDEX restrictions
https://www.postgresql.org/docs/current/sql-createindex.html
How it avoids blocking writes (high-level)
A normal index build wants a more exclusive relationship with the table to ensure it sees a consistent set of rows and can safely update index state.
The concurrent build instead:
- does an initial build while allowing writes,
- then performs additional passes to catch rows that changed during the build,
- and finally performs a short “swap/validation” step to make the index usable.
You don’t need every internal detail to operate it safely; what matters is the tradeoff:
- less blocking
- more time + more work
Reference: https://www.postgresql.org/docs/current/sql-createindex.html
What it does (and still doesn’t) block
Operationally, CREATE INDEX CONCURRENTLY is designed to avoid long write blocks, but you should still expect:
- brief lock steps where it coordinates with other DDL on the same table
- conflicts with some schema changes or other concurrent index operations
So “concurrently” is best read as:
- “won’t take the table down for normal DML,” not
- “never waits for locks.”
Reference: lock mode conflicts and index build notes
https://www.postgresql.org/docs/current/explicit-locking.html
https://www.postgresql.org/docs/current/sql-createindex.html
Practical usage patterns
Create the index
CREATE INDEX CONCURRENTLY idx_orders_created_at
ON orders (created_at);
Combine with IF NOT EXISTS (when appropriate)
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_created_at
ON orders (created_at);
Whether you can use IF NOT EXISTS depends on Postgres version and your migration style; it’s supported for CREATE INDEX in modern versions, but always check your target version behavior.
Reference: CREATE INDEX syntax
https://www.postgresql.org/docs/current/sql-createindex.html
Partial indexes are still a great fit
CREATE INDEX CONCURRENTLY idx_orders_open_recent
ON orders (created_at)
WHERE status = 'open';
Partial indexes often provide big wins with smaller build time and lower write overhead.
Reference: partial indexes (covered under indexing docs)
https://www.postgresql.org/docs/current/indexes-partial.html
Safety guardrails for production
Even though it’s online-friendly, you still want guardrails:
- Set a reasonable statement_timeout in the session running the migration (so it doesn’t run for hours unexpectedly).
- Consider lock_timeout to avoid it sitting behind conflicting DDL and becoming part of a queue pileup.
- Monitor progress via pg_stat_progress_create_index.
Example session setup:
SET lock_timeout = '2s';
SET statement_timeout = '30min';
CREATE INDEX CONCURRENTLY idx_orders_created_at
ON orders (created_at);
References:
- timeouts: https://www.postgresql.org/docs/current/runtime-config-client.html
- progress reporting: https://www.postgresql.org/docs/current/progress-reporting.html
Monitoring a concurrent index build
Use the progress view to see whether it’s moving and what phase it’s in:
SELECT *
FROM pg_stat_progress_create_index;
On a busy system, this is extremely helpful for distinguishing:
- “it’s blocked waiting on a lock” vs
- “it’s working, just slow.”
Reference: progress reporting
https://www.postgresql.org/docs/current/progress-reporting.html
Failure modes you should expect (and plan for)
- If the statement is canceled or fails mid-way, you can end up with an invalid index that must be dropped or rebuilt.
- It may take significantly longer than a normal index build, especially under heavy write load.
- Running multiple concurrent index builds on the same table can amplify resource usage and increase the chance of contention.
Reference: CREATE INDEX notes on concurrent builds and invalid indexes
https://www.postgresql.org/docs/current/sql-createindex.html
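A minimal sketch for spotting leftovers from a failed build (the indisvalid flag marks indexes that are not usable):
SELECT c.relname AS index_name,
       t.relname AS table_name
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_class t ON t.oid = i.indrelid
WHERE NOT i.indisvalid;
-- a failed CREATE INDEX CONCURRENTLY typically leaves such an index behind;
-- drop it (e.g., DROP INDEX CONCURRENTLY ...) and re-run the build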
Takeaway
CREATE INDEX CONCURRENTLY is the default choice for adding indexes to hot production tables because it avoids long write-blocking locks. The tradeoff is longer runtime and added complexity (notably: it can’t run inside a transaction block, and you should monitor/guard it with timeouts). Used correctly, it’s one of the highest-leverage tools for avoiding lock-driven outages during migrations.
34. Using REINDEX CONCURRENTLY When You Can
When an index needs rebuilding (bloat, corruption recovery, performance issues), the operational question is almost always: “Can we do this without blocking production traffic?”
If your PostgreSQL version and the specific index support it, REINDEX CONCURRENTLY is usually the best answer. It rebuilds indexes in an online-friendly way, similar in spirit to CREATE INDEX CONCURRENTLY, aiming to avoid long blocking of normal reads/writes.
References:
- REINDEX (including CONCURRENTLY): https://www.postgresql.org/docs/current/sql-reindex.html
- Locking / lock modes: https://www.postgresql.org/docs/current/explicit-locking.html
- Runtime timeouts: https://www.postgresql.org/docs/current/runtime-config-client.html
What REINDEX CONCURRENTLY is for
Use it when:
- the table is hot and you can’t afford a blocking reindex,
- you need to rebuild an index to reduce bloat or restore performance,
- you want a safer alternative to REINDEX (non-concurrent), VACUUM FULL, or CLUSTER for index-specific problems.
It rebuilds the index while keeping the old index usable until a final swap step.
Reference: https://www.postgresql.org/docs/current/sql-reindex.html
The big restriction: not inside a transaction block
Just like CREATE INDEX CONCURRENTLY, you generally cannot run REINDEX CONCURRENTLY inside an explicit transaction block:
BEGIN;
REINDEX INDEX CONCURRENTLY idx_orders_created_at;
COMMIT;
This will fail. Plan migrations accordingly (a separate non-transactional migration step, or a migration tool feature for “no transaction”).
Reference: REINDEX restrictions
https://www.postgresql.org/docs/current/sql-reindex.html
What it buys you (and what it doesn’t)
Benefits
- Minimizes disruption to normal reads/writes.
- Reduces the risk of stop-the-world lock incidents on hot tables.
- Often the safest way to rebuild a critical index in production.
Tradeoffs
- Takes longer than non-concurrent reindexing.
- Uses more resources while it runs (extra work, extra I/O).
- Still takes locks at various phases; it’s “online-friendly,” not “lock-free.”
- Can be blocked by conflicting DDL on the same table/index, and can wait for locks.
References:
- REINDEX notes: https://www.postgresql.org/docs/current/sql-reindex.html
- Locking concepts: https://www.postgresql.org/docs/current/explicit-locking.html
Typical usage
Reindex a single index concurrently:
REINDEX INDEX CONCURRENTLY idx_orders_created_at;
Reindex all indexes on a table concurrently:
REINDEX TABLE CONCURRENTLY orders;
Whether you should use TABLE scope depends on blast radius: it’s often safer to start with the single worst index and expand only if needed.
Reference: https://www.postgresql.org/docs/current/sql-reindex.html
Guardrails: avoid becoming the head-of-line waiter
Even concurrent maintenance can cause trouble if it waits too long on locks and triggers queueing effects. Use timeouts:
SET lock_timeout = '2s';
SET statement_timeout = '30min';
REINDEX INDEX CONCURRENTLY idx_orders_created_at;
- lock_timeout ensures you don't sit in the queue forever if someone is doing DDL.
- statement_timeout bounds the operation's total time.
Reference: runtime settings
https://www.postgresql.org/docs/current/runtime-config-client.html
What to check before you run it
A quick pre-flight checklist:
- Are there long-running transactions or idle in transaction sessions?
  Those can block key phases and make "online" operations painful (see the query sketch after this list).
- Is there other DDL scheduled on the same table?
  Avoid overlapping schema changes with concurrent reindexing.
- Is the index truly the problem?
  If the underlying issue is table bloat or bad query patterns, reindexing may not help much.
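A minimal pre-flight query for the first check (the 5-minute threshold is just an illustrative cutoff):
SELECT pid,
       state,
       now() - xact_start AS xact_age,
       left(query, 80) AS query_preview
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '5 minutes'
ORDER BY xact_age DESC;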
Failure handling you should plan for
If a concurrent reindex is interrupted, you may end up with an index state that requires cleanup (for example, rebuilding again). Operationally:
- be prepared to retry during a quiet window,
- confirm index validity afterwards (your operational tooling may check this; PostgreSQL also tracks index validity internally).
Exact details and edge cases depend on version and circumstances; treat concurrent reindex as safer, not risk-free.
Reference: REINDEX documentation notes
https://www.postgresql.org/docs/current/sql-reindex.html
Takeaway
Use REINDEX CONCURRENTLY whenever you need to rebuild indexes on hot production tables and your environment supports it. It’s designed to avoid the “maintenance took down prod” pattern by keeping reads/writes flowing, at the cost of longer runtime and more complexity (notably: no transaction block, and you still need timeouts and monitoring).
35. Adding Columns Safely: Fast Defaults and Avoiding Table Rewrites
“Add a column” sounds harmless, but in PostgreSQL it can range from a near-instant metadata change to a table-rewriting, outage-prone operation—depending on how you do it and which version/feature behavior you’re relying on.
The safe goal in production is usually:
- avoid a full table rewrite,
- avoid holding strong locks for long,
- make the change deployable with expand/backfill/contract.
References:
- ALTER TABLE (adding columns, defaults, not null): https://www.postgresql.org/docs/current/sql-altertable.html
- Locking basics / lock modes: https://www.postgresql.org/docs/current/explicit-locking.html
- Runtime timeouts: https://www.postgresql.org/docs/current/runtime-config-client.html
The baseline: adding a nullable column with no default
This is typically the safest and fastest form:
ALTER TABLE orders ADD COLUMN foo text;
Why it’s usually safe:
- It’s primarily a catalog (metadata) change.
- PostgreSQL doesn’t need to rewrite every existing row just to “have a new column that is NULL for old rows.”
What can still hurt:
- It still takes a strong table lock (ACCESS EXCLUSIVE) to make the metadata change. On a hot table, the risk is usually waiting to acquire that lock, not the execution time once it's acquired.
So still treat it like production DDL:
BEGIN;
SET LOCAL lock_timeout = '2s';
SET LOCAL statement_timeout = '30s';
ALTER TABLE orders ADD COLUMN foo text;
COMMIT;
References:
- ALTER TABLE: https://www.postgresql.org/docs/current/sql-altertable.html
- lock_timeout / statement_timeout: https://www.postgresql.org/docs/current/runtime-config-client.html
The foot-gun: adding a column with a default (historically rewrite-prone)
The risky pattern is:
ALTER TABLE orders ADD COLUMN foo text DEFAULT 'x';
Historically, PostgreSQL implemented this by rewriting the entire table to physically store the default value in every existing row—meaning:
- big I/O,
- long runtime,
- and often long-lived strong locks (stop-the-world potential).
PostgreSQL 11 and later include an optimization often called "fast defaults," where constant (non-volatile) defaults are stored in metadata and treated as if they exist for old rows without rewriting the table. However, behavior depends on version and the exact default expression.
Because this is version- and expression-dependent, the safest operational stance is:
- assume ADD COLUMN ... DEFAULT ... might rewrite unless you've verified your target Postgres version and the default expression qualifies for the fast path.
Reference: ALTER TABLE behavior and notes are version-specific; always confirm for your Postgres version
https://www.postgresql.org/docs/current/sql-altertable.html
The safe production pattern: add it nullable, backfill, then enforce
If you need a non-null column with a default value, the most robust pattern is:
Step 1: Add column as nullable, no default (fast metadata change)
ALTER TABLE orders ADD COLUMN foo text;
Step 2: Backfill existing rows in batches (no ACCESS EXCLUSIVE lock)
Do this in controlled chunks to avoid long transactions and hot-row storms:
UPDATE orders
SET foo = 'x'
WHERE foo IS NULL
AND id >= 100000
AND id < 110000;
Repeat in batches with application-side loop logic and sleep/backoff as needed.
Step 3: Add a default for new rows
Once old rows are backfilled, set a default for inserts going forward:
ALTER TABLE orders ALTER COLUMN foo SET DEFAULT 'x';
Step 4: Enforce NOT NULL (after you’re confident backfill is complete)
ALTER TABLE orders ALTER COLUMN foo SET NOT NULL;
This pattern reduces the highest-risk part (rewriting and long exclusive lock holds) by moving the expensive work into ordinary updates you can throttle.
Reference: ALTER TABLE for defaults and NOT NULL
https://www.postgresql.org/docs/current/sql-altertable.html
Avoiding the “NOT NULL validation outage”
Setting NOT NULL can require validating that all rows comply, which can take time on large tables. Two practical ways to reduce risk:
- Ensure the backfill is truly complete before you attempt it.
- Run it with tight timeouts so it fails fast rather than waiting and queueing:
BEGIN;
SET LOCAL lock_timeout = '2s';
SET LOCAL statement_timeout = '30s';
ALTER TABLE orders ALTER COLUMN foo SET NOT NULL;
COMMIT;
Even if it’s “fast” on your table, treat it as potentially disruptive until proven otherwise.
Reference: lock timeouts and DDL locking
https://www.postgresql.org/docs/current/runtime-config-client.html
https://www.postgresql.org/docs/current/explicit-locking.html
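If the validation scan itself is the concern, one widely used pattern on PostgreSQL 12 and later is to prove non-nullness with a CHECK constraint first, so SET NOT NULL can skip its own full-table scan. A sketch (the constraint name orders_foo_not_null is just an example):
-- quick metadata-only step; enforces the rule for new writes only
ALTER TABLE orders
  ADD CONSTRAINT orders_foo_not_null CHECK (foo IS NOT NULL) NOT VALID;

-- scans the table, but without blocking normal reads/writes for the duration
ALTER TABLE orders VALIDATE CONSTRAINT orders_foo_not_null;

-- can now use the validated constraint as proof and avoid its own scan
ALTER TABLE orders ALTER COLUMN foo SET NOT NULL;

-- the check constraint is now redundant and can be dropped
ALTER TABLE orders DROP CONSTRAINT orders_foo_not_null;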
Special case: adding a column with a volatile/default expression
Even on versions with fast defaults, not every default gets the cheap path. Volatile defaults like random() or clock_timestamp() force per-row evaluation (and a table rewrite), while stable ones like now() can take the metadata-only path but are evaluated once at ALTER time, which may not be the semantics you expect for existing rows.
Rule of thumb:
- If the default isn’t a simple constant, treat it as higher risk and prefer the expand/backfill approach.
Reference: expression semantics and ALTER TABLE notes
https://www.postgresql.org/docs/current/sql-altertable.html
Takeaway
The safest way to add columns in production is usually:
- ADD the column nullable with no default
- BACKFILL in batches
- SET DEFAULT for future rows
- ENFORCE NOT NULL last
This avoids table rewrites in the common case, reduces lock hold time, and turns a potentially stop-the-world migration into a controlled, observable rollout.
36. Changing Column Types: What Forces a Rewrite and What Doesn’t
ALTER TABLE ... ALTER COLUMN ... TYPE ... is one of the most dangerous “simple” migrations in PostgreSQL. Sometimes it’s almost instant. Other times it rewrites the entire table, holds strong locks for a long time, and becomes a stop-the-world event.
The difference comes down to whether PostgreSQL must touch every row to convert or re-encode values.
References:
- ALTER TABLE / type changes / USING: https://www.postgresql.org/docs/current/sql-altertable.html
- Locking and lock modes: https://www.postgresql.org/docs/current/explicit-locking.html
The big idea: metadata-only vs table rewrite
A type change can fall into two categories:
Metadata-only (usually fast)
Postgres updates the column’s type metadata, but does not need to rewrite each tuple.
This is possible when the on-disk representation is compatible or the conversion is essentially a “relabeling” that doesn’t require per-row transformation.
Table rewrite (potentially catastrophic)
Postgres must scan every row, compute a new representation, and write a new copy of the table (or otherwise rewrite the affected tuples). This is slow, I/O-heavy, and typically requires strong locks for the duration.
Reference: ALTER TABLE notes on type changes and table rewrites
https://www.postgresql.org/docs/current/sql-altertable.html
What usually forces a rewrite
Treat these as “rewrite likely” unless you’ve verified otherwise:
- Conversions that require computing a new value per row
  e.g., text → integer, jsonb → typed columns, timestamp → timestamptz with a USING expression, etc.
- Changes that alter the physical storage format
  For example, changing between types with different binary representations.
- Any type change using a non-trivial USING clause
  If you need:
  ALTER TABLE t ALTER COLUMN c TYPE newtype USING some_expression(c);
  ...you should assume it will touch every row.
- Type changes on large, hot tables
  Even if the conversion is "simple," if it's not metadata-only, the runtime scales with table size.
What sometimes does not force a rewrite
There are type changes that can be cheap because they don’t require rewriting stored values (or can be performed without visiting all rows). Common examples people rely on:
- Increasing the length limit of a varchar(n) or numeric(p,s) in certain cases.
- Changing varchar(n) to text (often compatible in practice).
However, exact behavior depends on PostgreSQL version and details, and the safe operational posture is:
- Don’t assume a type change is metadata-only just because it “feels compatible.”
Confirm with your version’s documentation and test on a production-like copy of the table.
Reference: ALTER TABLE type change behavior and restrictions
https://www.postgresql.org/docs/current/sql-altertable.html
Locking: why type changes are “stop-the-world” prone
Even if the conversion itself is fast, ALTER TABLE ... TYPE typically requires a strong lock on the table (often ACCESS EXCLUSIVE) because:
- the table’s definition is changing,
- concurrent queries must not see mixed type semantics,
- indexes and constraints may need updates.
So the risk is both:
- lock acquisition (can cause queue amplification), and
- lock hold time (if it rewrites, it can hold the lock for a long time).
References:
- Lock mode conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
- ALTER TABLE locking implications: https://www.postgresql.org/docs/current/sql-altertable.html
Safer alternatives to “big bang” type changes
If the table is hot or large, the safest approach is usually to avoid in-place conversion.
Option A: Expand/backfill/contract with a new column
- Add a new column with the desired type:
  ALTER TABLE t ADD COLUMN c_new newtype;
- Backfill in batches:
  UPDATE t
  SET c_new = transform(c)
  WHERE c_new IS NULL
    AND id BETWEEN ...;
- Update the application to read from c_new (or dual-read).
- Dual-write for a period (write to both).
- Validate and swap (rename columns, drop old later).
This turns one scary rewrite into controlled DML you can throttle.
Option B: Build a new table and swap
For very large transformations, it can be safer to:
- create a new table with the correct schema,
- copy/transform data (possibly with logical replication or triggers),
- cut over with a controlled swap.
This is more complex but can avoid long exclusive locks.
Practical guardrails if you must do an in-place type change
If you decide to run ALTER COLUMN TYPE directly:
- Set lock_timeout so it fails fast rather than sitting in the queue.
- Set statement_timeout so it can't run forever if it rewrites.
- Run it during a low-traffic window if there's any chance it rewrites.
Example:
BEGIN;
SET LOCAL lock_timeout = '2s';
SET LOCAL statement_timeout = '5min';
ALTER TABLE t
ALTER COLUMN c TYPE newtype
USING c::newtype;
COMMIT;
Reference: timeouts
https://www.postgresql.org/docs/current/runtime-config-client.html
How to know for sure
The only reliable way to know if your exact change will rewrite (for your Postgres version, column, data distribution, indexes, and constraints) is to:
- read the version-specific docs for the exact subcommand, and
- test it on a production-sized staging environment.
No one can predict rewrite behavior perfectly without those details, but the risk-management approach stays the same: assume a rewrite unless verified, and prefer phased migrations for hot tables.
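One practical way to verify on a staging copy is to compare the table's relfilenode before and after the ALTER; if the on-disk file identifier changed, the table was rewritten. A minimal sketch, using the t example from above:
SELECT relfilenode FROM pg_class WHERE relname = 't';
-- run the ALTER COLUMN ... TYPE on the staging copy
SELECT relfilenode FROM pg_class WHERE relname = 't';
-- a changed relfilenode means the ALTER rewrote the table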
Takeaway
Changing column types is risky because many conversions require a full table rewrite and strong locks (ACCESS EXCLUSIVE). If you can’t prove the change is metadata-only for your version and case, treat it as a rewrite and use safer patterns like expand/backfill/contract. This is one of the highest-leverage places to avoid accidental stop-the-world outages.
37. Dropping Columns and Constraints: Lock Implications
Dropping things feels like it should be easy: remove a column, remove a constraint, done. In PostgreSQL, these operations are often fast in terms of CPU/I/O because they’re mostly catalog changes, but they can still be high impact because they require strong locks and can trigger dependency cascades.
The practical rule: drops are usually “quick to execute, dangerous to schedule.”
References:
- ALTER TABLE (DROP COLUMN / DROP CONSTRAINT): https://www.postgresql.org/docs/current/sql-altertable.html
- Lock modes and conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
Why drops can be disruptive even when they’re fast
Two reasons:
- Strong lock requirement
  Most ALTER TABLE operations, including many drops, require strong relation locks (commonly ACCESS EXCLUSIVE) to ensure nobody is concurrently using a schema that is being changed.
- Lock acquisition is the real risk
  On a hot table, the DDL might spend 99.9% of its wall-clock time waiting to acquire the lock. While it waits, it can become head-of-line and cause lock queue pileups (the stop-the-world symptom).
Reference: lock behavior and conflict rules
https://www.postgresql.org/docs/current/explicit-locking.html
Dropping columns
ALTER TABLE ... DROP COLUMN
Example:
ALTER TABLE orders DROP COLUMN legacy_code;
What happens:
- PostgreSQL removes the column from the table’s definition (catalog metadata).
- This does not rewrite the table to physically remove the field from every row; the column is marked as dropped in the catalog, and its storage is reclaimed over time as rows are updated or the table is eventually rewritten. It does, however, require catalog updates and dependency cleanup.
Why it still hurts:
- It typically needs a strong lock on the table while the catalog change is made.
- If the table is hot, acquiring that lock is the outage risk.
References:
https://www.postgresql.org/docs/current/sql-altertable.html
https://www.postgresql.org/docs/current/explicit-locking.html
DROP COLUMN ... CASCADE
If you do:
ALTER TABLE orders DROP COLUMN legacy_code CASCADE;
You may remove dependent objects too (views, triggers, constraints, etc.). That expands the blast radius:
- more catalog work,
- more dependencies changed,
- more surprising downstream breakage.
Even if locking remains focused on the table, the operational impact can be much larger.
Reference: CASCADE behavior in ALTER TABLE
https://www.postgresql.org/docs/current/sql-altertable.html
Dropping constraints
ALTER TABLE ... DROP CONSTRAINT
Example:
ALTER TABLE orders DROP CONSTRAINT orders_status_check;
Constraints are part of the table’s definition, and dropping them requires strong coordination:
- queries planning/using the constraint must see a consistent schema,
- other sessions must not simultaneously alter the same metadata.
So again: usually fast, but often requires a strong lock and can queue.
Reference:
https://www.postgresql.org/docs/current/sql-altertable.html
Special note: dropping uniqueness / primary keys
Dropping constraints that are backed by indexes (like PRIMARY KEY or UNIQUE) can have bigger effects than “just schema”:
- The supporting index is dropped along with the constraint (constraint-backed indexes are owned by the constraint).
- Query plans can change immediately.
- Writes might get faster, but reads and join performance can tank if the index was crucial.
So the “impact” is often performance and correctness semantics as much as locking.
Dependency and blast radius pitfalls
Dropping schema objects often triggers surprises through dependencies:
- views and materialized views,
- foreign keys (both referencing and referenced sides),
- triggers,
- generated columns or expressions,
- application prepared statements expecting the old schema.
From a lock perspective, the risk is still usually “strong lock on a hot table,” but from an outage perspective the risk is “we broke code paths immediately.”
Reference: dependency behavior is discussed throughout DDL docs; CASCADE is a major lever
https://www.postgresql.org/docs/current/sql-altertable.html
How to drop safely in production
These patterns reduce the chances of lock pain and surprise breakage:
1) Make it an explicit “contract” step
Don’t drop immediately after adding the replacement. Instead:
- stop reads/writes to the old column in application code,
- deploy and observe,
- then drop in a later release.
This reduces the chance you need an emergency rollback that depends on the old schema.
2) Use timeouts so drops don’t become head-of-line
BEGIN;
SET LOCAL lock_timeout = '2s';
SET LOCAL statement_timeout = '30s';
ALTER TABLE orders DROP COLUMN legacy_code;
COMMIT;
If the table is busy, it fails fast instead of queueing and amplifying a stall.
Reference: timeouts
https://www.postgresql.org/docs/current/runtime-config-client.html
3) Avoid CASCADE unless you truly want the cascade
Prefer explicit drops of dependent objects so you can predict impact. CASCADE is convenient, but it’s a classic way to widen the change beyond what you intended.
Takeaway
Dropping columns and constraints is often quick in execution but can be high-risk in production because it typically requires strong locks and can trigger lock queue pileups on hot tables. Treat drops as planned “contract” steps, use timeouts to fail fast, and be cautious with CASCADE due to dependency blast radius.
38. Foreign Keys and Locking: Validations, NOT VALID Constraints, and Backfills
Foreign keys are essential for data integrity, but they’re also one of the easiest ways to accidentally introduce painful locking and long-running table scans during a migration. The good news: PostgreSQL gives you tools to make FK rollouts much safer, especially NOT VALID constraints and separate validation.
This section focuses on how FK creation/validation interacts with locking and how to roll them out with minimal production impact.
References:
- ALTER TABLE (ADD CONSTRAINT, NOT VALID, VALIDATE CONSTRAINT): https://www.postgresql.org/docs/current/sql-altertable.html
- Locking / conflict rules: https://www.postgresql.org/docs/current/explicit-locking.html
- CREATE TABLE / FK basics: https://www.postgresql.org/docs/current/ddl-constraints.html
Why foreign keys are operationally “heavy”
A foreign key has two tables:
- referencing table (child): has the FK column(s)
- referenced table (parent): has the primary key/unique key being referenced
Adding a foreign key isn’t just metadata. PostgreSQL must ensure:
- existing rows in the child all have matching parent rows (unless the FK allows NULLs and the values are NULL), and
- future changes keep that property true.
That first part—checking existing data—is the operational hazard: it can require scanning a large child table and probing the parent index many times.
Reference: FK behavior and constraint validation
https://www.postgresql.org/docs/current/ddl-constraints.html
Locking: what gets locked, and why it matters
At a high level, FK operations require table locks to prevent schema changes and ensure the validation sees a stable definition.
Practical implications:
- The child table is where you feel most impact because validation touches its rows.
- The parent table matters too because inserts/updates/deletes must be coordinated with referential checks, and the parent must have a suitable unique index/constraint.
Exact lock modes vary by operation and PostgreSQL version, but the key operational takeaway is:
- FK creation/validation can take long enough that even “non-stop-the-world” lock modes become a production issue, and
- if anything requests ACCESS EXCLUSIVE on either table while validation is happening (or queued), you can get lock queue pileups.
Reference: lock modes and DDL locking model
https://www.postgresql.org/docs/current/explicit-locking.html
FK DDL reference: https://www.postgresql.org/docs/current/sql-altertable.html
The safe rollout pattern: ADD CONSTRAINT NOT VALID, then VALIDATE
PostgreSQL allows you to add a foreign key without immediately scanning and validating existing rows:
Step 1: Add the FK as NOT VALID
ALTER TABLE child
ADD CONSTRAINT child_parent_fk
FOREIGN KEY (parent_id)
REFERENCES parent(id)
NOT VALID;
What this does:
- Adds the constraint metadata.
- Enforces the FK for new rows (and updates that touch the FK columns) going forward.
- Does not scan/validate all existing child rows yet.
Reference: NOT VALID constraints
https://www.postgresql.org/docs/current/sql-altertable.html
Step 2: Backfill / fix existing data
Now you can clean up any orphan rows safely, using controlled batch jobs. Common approaches:
- delete or archive orphan rows,
- update them to a valid parent,
- insert missing parent rows (if that’s correct),
- or set FK column to NULL if semantics allow it.
Example orphan finder:
SELECT c.*
FROM child c
LEFT JOIN parent p ON p.id = c.parent_id
WHERE c.parent_id IS NOT NULL
AND p.id IS NULL
LIMIT 100;
Then fix in batches.
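A sketch of one batched fix, assuming the child table has an id primary key and that NULLing the FK column is acceptable for orphans (swap in whichever of the approaches above fits your semantics):
WITH orphans AS (
  SELECT c.id
  FROM child c
  LEFT JOIN parent p ON p.id = c.parent_id
  WHERE c.parent_id IS NOT NULL
    AND p.id IS NULL
  LIMIT 1000
)
UPDATE child
SET parent_id = NULL
FROM orphans
WHERE child.id = orphans.id;
Run it in a loop until it affects zero rows, with a short pause between batches.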
Step 3: Validate the constraint later
ALTER TABLE child
VALIDATE CONSTRAINT child_parent_fk;
Validation scans the table to ensure all existing rows comply. It still costs time and I/O, but on modern versions it holds a weaker lock that lets normal reads and writes continue while it runs, and the critical improvement is that you can schedule it deliberately; since you've already cleaned up the data, it should complete predictably.
Reference: VALIDATE CONSTRAINT
https://www.postgresql.org/docs/current/sql-altertable.html
Why this pattern reduces lock pain
Compared to “add FK and validate immediately,” the NOT VALID approach:
- decouples “start enforcing correctness for new data” from “scan the entire history,”
- lets you do the expensive cleanup as throttled DML instead of one giant DDL event,
- reduces the chance of a long-running validation being the trigger for lock queues at peak traffic.
It doesn’t make FK changes zero-risk, but it converts “surprise table scan during deploy” into “planned validation step.”
Indexing: the hidden prerequisite for safe FK rollout
A major operational mistake is adding an FK without the right indexes.
- The parent side must have a PRIMARY KEY or UNIQUE constraint/index on the referenced columns (that's required).
- The child side should have an index on the FK column(s) in most real systems, because otherwise:
  - deletes/updates on the parent may require scanning the child table to check for referencing rows,
  - which can create huge performance spikes and long-running locks.
This is not just performance trivia; it affects how long FK-related operations hold locks and how likely they are to trigger pileups.
Reference: FK constraint requirements and general indexing guidance
https://www.postgresql.org/docs/current/ddl-constraints.html
(Postgres docs discuss the parent uniqueness requirement explicitly; the “index the child FK columns” guidance is a widely followed operational best practice because of how referential checks are performed.)
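If the child-side index is missing, it can usually be added online using the pattern from earlier (the index name here is just an example):
CREATE INDEX CONCURRENTLY idx_child_parent_id
ON child (parent_id);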
Guardrails: timeouts and sequencing
When adding or validating FKs in production:
- Set lock_timeout so the operation fails fast instead of waiting and amplifying a queue.
- Consider a moderate statement_timeout for validation so it can't run unbounded.
- Prefer running VALIDATE CONSTRAINT during a low-traffic window on very large tables.
Timeout settings reference:
https://www.postgresql.org/docs/current/runtime-config-client.html
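Putting these guardrails together, a sketch of the validation step (timeout values are illustrative; unlike CREATE INDEX CONCURRENTLY, VALIDATE CONSTRAINT can run inside a transaction block):
BEGIN;
SET LOCAL lock_timeout = '2s';
SET LOCAL statement_timeout = '15min';
ALTER TABLE child VALIDATE CONSTRAINT child_parent_fk;
COMMIT;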
Common failure modes (and how to avoid them)
Failure mode 1: Long transaction blocks validation
A long-running transaction (especially idle in transaction) can make validation take longer or cause lock acquisition delays. Always check for old xact_start sessions before running validation.
Reference: pg_stat_activity
https://www.postgresql.org/docs/current/monitoring-stats.html
Failure mode 2: Orphans discovered mid-deploy
If you add the FK without NOT VALID, the statement can fail after doing a lot of work—or worse, it can sit scanning while holding locks. Using NOT VALID moves the “do we have orphans?” discovery earlier and makes it manageable.
Failure mode 3: Parent updates/deletes become slow (missing child index)
If you validate and ship an FK without an index on child FK columns, you may create a performance regression that feels like random lockups later when parent rows are updated/deleted.
Takeaway
Foreign keys are correctness-critical but operationally tricky because validation can require large scans and prolonged locking interactions. The production-safe pattern is:
- ADD FK … NOT VALID
- backfill/fix existing data in batches
- VALIDATE CONSTRAINT in a planned step
- ensure indexes (especially on child FK columns) so ongoing enforcement doesn’t create future lock pain.
Done this way, FK rollouts stop being “deploy roulette” and become predictable, observable operations.
39. Safe Migration Playbook: Expand/Backfill/Contract
The most reliable way to avoid stop-the-world lock incidents during schema changes is to stop treating migrations as “one big DDL event” and instead roll them out in phases that are:
- backward compatible,
- retryable,
- and reversible.
The canonical pattern is Expand → Backfill → Contract. It trades “one risky, blocking change” for “several small, controlled changes.”
References:
- Locking / lock modes: https://www.postgresql.org/docs/current/explicit-locking.html
- ALTER TABLE (columns, constraints, validation): https://www.postgresql.org/docs/current/sql-altertable.html
- CREATE INDEX CONCURRENTLY: https://www.postgresql.org/docs/current/sql-createindex.html
- Timeouts (lock_timeout, statement_timeout): https://www.postgresql.org/docs/current/runtime-config-client.html
Why this playbook works (in locking terms)
Stop-the-world incidents usually happen when you combine:
- strong locks (often ACCESS EXCLUSIVE),
- long-running operations (table rewrites, full validations),
- long transactions,
- lock queue amplification.
Expand/backfill/contract reduces those risk multipliers by:
- making “strong-lock” steps short,
- doing expensive work (moving data) as throttled DML instead of a rewrite,
- and postponing irreversible steps until you’ve proven the system is stable.
Phase 1: Expand
Goal: introduce new schema elements in a way that is compatible with the old application code.
Typical Expand actions (low risk when done carefully):
- Add a new nullable column (no default):
  ALTER TABLE t ADD COLUMN new_col text;
- Add a new table, or new enum values (version-dependent risk)
- Add an index concurrently (for hot tables):
  CREATE INDEX CONCURRENTLY idx_t_new_col ON t(new_col);
- Add constraints as NOT VALID (when supported):
  ALTER TABLE t ADD CONSTRAINT t_new_col_check CHECK (new_col <> '') NOT VALID;
Operational guardrails:
- Use a short lock_timeout so Expand steps fail fast instead of queueing.
- Keep Expand transactions tiny.
References:
- ALTER TABLE: https://www.postgresql.org/docs/current/sql-altertable.html
- CREATE INDEX CONCURRENTLY: https://www.postgresql.org/docs/current/sql-createindex.html
Phase 2: Backfill
Goal: populate or transform existing data without relying on long exclusive locks.
Backfill is intentionally DML-heavy and time-flexible. You do it in batches so you can control load and avoid long transactions.
Example batch backfill:
UPDATE t
SET new_col = old_col
WHERE new_col IS NULL
AND id >= 100000 AND id < 110000;
Best practices for backfill:
- Small batches (bounded time per transaction).
- Commit frequently (avoid long transactions).
- Throttle between batches to reduce I/O and lock pressure.
- Monitor dead tuples/bloat and autovacuum behavior.
Backfill is where many teams accidentally create a different kind of incident (hot-row contention or massive write amplification). The key is pacing and observability.
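One way to watch dead tuples and autovacuum during a backfill is pg_stat_user_tables; a minimal sketch for the example table t:
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 't';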
References: MVCC/locking concepts
https://www.postgresql.org/docs/current/mvcc.html
Phase 3: Contract
Goal: make the new schema the source of truth and remove the old schema, but only after you’ve proven correctness and compatibility.
Typical Contract actions:
- Switch application reads to new column/table.
- Switch writes to new column/table (or stop dual-writing).
- Validate constraints added as NOT VALID:
  ALTER TABLE t VALIDATE CONSTRAINT t_new_col_check;
- Enforce NOT NULL (after backfill is complete):
  ALTER TABLE t ALTER COLUMN new_col SET NOT NULL;
- Drop old columns, old constraints, or old indexes later:
  ALTER TABLE t DROP COLUMN old_col;
Contract is where you re-introduce stronger locking risk (drops, NOT NULL enforcement, validation). The difference is:
- by now, the data is ready, so operations are faster and more predictable,
- you can schedule the small “strong lock” steps carefully,
- and rollback is easier because you’ve already run both schemas for a while.
References: ALTER TABLE, validation
https://www.postgresql.org/docs/current/sql-altertable.html
A concrete example: change column type safely
Problem: change customer_id from text to uuid on a hot table.
Expand
ALTER TABLE orders ADD COLUMN customer_id_uuid uuid;
CREATE INDEX CONCURRENTLY idx_orders_customer_id_uuid
ON orders(customer_id_uuid);
Backfill
UPDATE orders
SET customer_id_uuid = customer_id::uuid
WHERE customer_id_uuid IS NULL
AND id BETWEEN ...; -- batch window
Deploy app changes:
- dual-write: write both customer_id and customer_id_uuid
- gradually switch reads to customer_id_uuid
Contract
ALTER TABLE orders ALTER COLUMN customer_id_uuid SET NOT NULL;
-- later
ALTER TABLE orders DROP COLUMN customer_id;
ALTER TABLE orders RENAME COLUMN customer_id_uuid TO customer_id;
This avoids the in-place ALTER COLUMN TYPE rewrite-risk path.
Guardrails that make this playbook reliable
No matter the schema change, these practices keep the phases safe:
- Fail fast: lock_timeout for DDL steps.
- Bound runtime: statement_timeout for riskier operations.
- Keep transactions short: especially during backfill.
- Separate concerns: schema first, data later, enforcement last.
- Observe continuously: watch pg_stat_activity, pg_locks, and progress views.
Timeout reference:
https://www.postgresql.org/docs/current/runtime-config-client.html
Takeaway
Expand/backfill/contract is the default “production-safe” migration strategy because it minimizes long exclusive locks and turns risky schema changes into small, controllable steps. It doesn’t eliminate locking—but it makes locking predictable, limits queue amplification, and gives you safe rollback points.
40. Partitioning and Locks: How Partitions Can Limit Blast Radius
Partitioning is often sold as a performance feature, but for stop-the-world prevention it has an equally important benefit: it can reduce lock blast radius. If your workload naturally divides into slices (time, tenant, region, etc.), partitions let many operations target a smaller physical object, so a lock (even a strong one) affects less of your traffic.
Partitioning does not eliminate locking, and some partition maintenance operations still require strong locks. But it can turn “locking the world” into “locking one shard.”
References:
- Partitioning overview: https://www.postgresql.org/docs/current/ddl-partitioning.html
- ALTER TABLE partition operations (ATTACH/DETACH): https://www.postgresql.org/docs/current/sql-altertable.html
- Lock modes and conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
How partitioning changes the “unit of locking”
Without partitioning, a hot table is a single choke point. Many disruptive operations need strong locks on that one object.
With partitioning, you have:
- a partitioned parent table (mostly routing/metadata)
- many child partitions (where the actual rows live)
Many DML operations (inserts/updates/queries) touch only the specific partition(s) relevant to the rows. That means:
- row locks are naturally localized,
- relation locks may also localize to the partition being accessed (depending on the operation),
- and some maintenance can be done partition-by-partition.
Reference: partitioning concepts
https://www.postgresql.org/docs/current/ddl-partitioning.html
Ways partitions can limit lock blast radius
1) Maintenance can be done one partition at a time
Instead of:
- rebuilding a huge index on one monolithic table,
you can:
- rebuild indexes on one partition, then move to the next.
Even if an operation needs a strong lock, it impacts only a fraction of the table’s total traffic.
Examples:
- REINDEX (CONCURRENTLY) a single partition's index
- VACUUM a single partition
- CREATE INDEX CONCURRENTLY on a single partition (or on the parent in newer patterns that propagate)
Caveat: behavior varies by version and by whether you create indexes on the parent or per partition. Always validate in your version.
References:
- partitioning: https://www.postgresql.org/docs/current/ddl-partitioning.html
- index creation: https://www.postgresql.org/docs/current/sql-createindex.html
2) Data lifecycle operations become safer
For time-based data, partitions enable “drop old data” as metadata operations:
- detach/drop an old partition rather than deleting millions of rows.
This can be dramatically faster and can avoid long row-locking delete storms. But it can still involve strong locks for attach/detach/drop steps—so it’s safer, not free.
Reference: partition management concepts
https://www.postgresql.org/docs/current/ddl-partitioning.html
3) Hot spots are contained
In multi-tenant systems, a single tenant can become a hot spot. Partitioning by tenant can isolate:
- lock contention,
- vacuum pressure,
- index bloat,
- and even some kinds of DDL/maintenance scheduling.
This is only worth it if tenant distribution and query patterns make it viable; otherwise you can create too many partitions and hurt planning/maintenance.
The locking reality: partition operations can still be “big locks”
Some partition maintenance operations can request strong locks, sometimes on:
- the partitioned parent,
- the partition being attached/detached,
- and potentially related objects.
Examples that often require careful scheduling:
- ALTER TABLE parent ATTACH PARTITION ...
- ALTER TABLE parent DETACH PARTITION ...
- dropping partitions
- changing constraints/indexes that propagate or require parent-level coordination
If your parent table is extremely hot, even short-lived strong locks on the parent can cause a noticeable stall.
Reference: partition operations are ALTER TABLE subcommands
https://www.postgresql.org/docs/current/sql-altertable.html
Lock modes overview: https://www.postgresql.org/docs/current/explicit-locking.html
A practical “blast radius” example
Imagine an events table partitioned by day:
- Without partitioning: a lock on events blocks most event reads/writes.
- With daily partitions: a lock on events_2025_12_21 blocks only traffic touching today's partition; older partitions are unaffected, and many queries (especially historical ones) continue.
Even more importantly, if you’re doing maintenance on an old partition (say last month), you can often do it with negligible impact on “today’s hot path.”
Tradeoffs and pitfalls (so you don’t partition just for locks)
Partitioning adds complexity:
- more objects to manage (tables, indexes),
- more planning considerations for queries,
- operational work to create future partitions and manage retention,
- potential performance issues if you have too many partitions or poor pruning.
So the decision should be driven by clear needs: retention management, query pruning, and/or lock blast radius reduction.
Reference: partitioning caveats and pruning
https://www.postgresql.org/docs/current/ddl-partitioning.html
Operational guidance: using partitioning to reduce stop-the-world risk
If your main goal is “avoid one table locking everything”:
- Partition along a dimension that aligns with most queries (time is the most common).
- Keep partitions “small enough” that maintenance and rewrites are tolerable per partition.
- Run strong-lock operations (even ACCESS EXCLUSIVE-requiring ones) on a single partition at a time.
- Use lock_timeout on partition attach/detach/drop steps to avoid queue amplification (see the sketch below).
Timeout reference:
https://www.postgresql.org/docs/current/runtime-config-client.html
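A sketch of a guarded retention step, assuming a daily-partitioned events table like the example above (the partition name is illustrative; on PostgreSQL 14+ there is also DETACH PARTITION ... CONCURRENTLY, which takes a weaker lock but cannot run inside a transaction block):
SET lock_timeout = '2s';
ALTER TABLE events DETACH PARTITION events_2025_11_01;
-- once detached and archived, drop the old partition in a later step
DROP TABLE events_2025_11_01;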
Takeaway
Partitioning can limit lock blast radius by changing the unit of contention from “one giant table” to “many smaller partitions.” That makes maintenance and even some disruptive operations safer because you can target one partition at a time. But partition management itself can require strong locks—so treat attach/detach/drop as planned operations with timeouts and careful scheduling.
41. Advisory Locks: When to Use Them and When Not To
Advisory locks are PostgreSQL’s “bring your own mutex” feature. They let your application (or ops tooling) coordinate work using locks that PostgreSQL enforces, but PostgreSQL does not take automatically. Used well, advisory locks prevent whole classes of race conditions. Used poorly, they create self-inflicted stop-the-world incidents.
References:
- Advisory lock functions: https://www.postgresql.org/docs/current/functions-admin.html
- pg_locks (advisory lock visibility): https://www.postgresql.org/docs/current/view-pg-locks.html
- Locking overview: https://www.postgresql.org/docs/current/explicit-locking.html
What advisory locks are
Advisory locks are:
- explicit: you request them via functions (pg_advisory_lock, pg_try_advisory_lock, etc.)
- named by integers: either one bigint key or two int keys
- tracked by Postgres like other heavyweight locks (they show up in pg_locks)
- independent of tables/rows: they don't lock data; they lock a key you choose
They’re called “advisory” because Postgres won’t automatically tie them to data correctness—you must design the protocol.
Reference: https://www.postgresql.org/docs/current/functions-admin.html
Two flavors: session-level vs transaction-level
This distinction matters a lot operationally.
Session-level advisory locks
- Functions: pg_advisory_lock, pg_try_advisory_lock, pg_advisory_unlock
- Held until you explicitly unlock or the session ends.
Risk: if your app crashes, loses network, or forgets to unlock, you can block other workers indefinitely (until the connection is dropped).
Transaction-level advisory locks
- Functions: pg_advisory_xact_lock, pg_try_advisory_xact_lock
- Held until the transaction ends (commit/rollback).
These are usually safer because they can’t outlive the transaction. If you’re choosing between the two, transaction-level is often the better default.
Reference: advisory lock function list
https://www.postgresql.org/docs/current/functions-admin.html
When advisory locks are a good idea
1) Singleton jobs / leader election (lightweight coordination)
Examples:
- “Only one instance of the cron job runs at a time.”
- “Only one worker does the daily aggregation.”
Pattern:
SELECT pg_try_advisory_lock(42) AS got_lock;
If got_lock is false, exit quickly.
Why it’s good:
- Simple
- No extra coordination service
- Lock is released if the DB session dies
2) Serialize a critical section that spans multiple tables
If you need a cross-table invariant and don’t want to invent a “lock table” row, advisory locks can be a clean way to guarantee only one worker executes a section at once.
3) Prevent concurrent migrations
Many teams use an advisory lock key like hashtext('migrations') to ensure only one migrator runs at a time—especially important if you have multiple deploy pipelines.
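A minimal sketch of that pattern, assuming the migration runs inside a transaction so the lock is released automatically at commit or rollback (the key derivation mirrors the hashtext example above):
BEGIN;
SELECT pg_try_advisory_xact_lock(hashtext('migrations')::bigint) AS got_lock;
-- if got_lock is false, roll back and let the other migrator finish
-- ...migration statements go here...
COMMIT;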
When advisory locks are a bad idea
1) Using them as a “global DB mutex” in the hot path
If you do something like:
- every request tries to take the same advisory lock key
…you’ve created a single-threaded bottleneck. Under load, this looks exactly like a stop-the-world event: huge waits, pool exhaustion, timeouts.
If you must coordinate per-entity, use a keyed lock (e.g., one key per account id), not one global key.
2) Holding session-level locks across network calls or user think-time
This is the advisory-lock version of “idle in transaction”:
- worker takes lock
- does HTTP call / waits for user / waits on queue
- forgets to release promptly
Result: other workers block behind it.
If the critical section includes any unpredictable wait, avoid session-level advisory locks or use strict timeouts and try-style acquisition.
3) Using advisory locks to “fix” data consistency issues
If you’re using advisory locks to paper over missing constraints, missing unique indexes, or flawed transaction design, you’ll eventually lose. Advisory locks can help coordinate workflows, but they’re not a substitute for database constraints and correct transaction logic.
Best practices that keep advisory locks safe
Prefer pg_try_* over blocking locks in most application flows
Blocking lock acquisition can create pileups:
SELECT pg_try_advisory_xact_lock(12345);
If false, back off and retry later.
Use transaction-level locks unless you truly need session scope
Transaction-level locks naturally bound the blast radius.
Design your lock key scheme carefully
- Pick a stable namespace (e.g., one bigint per resource).
- Avoid collisions across features.
- Consider using the two-key form: (feature_id, entity_id) for readability.
Reference: function signatures
https://www.postgresql.org/docs/current/functions-admin.html
Add timeouts and observability
Advisory locks show in pg_locks with locktype = 'advisory'. You can detect stuck holders and measure waiters.
Reference: pg_locks
https://www.postgresql.org/docs/current/view-pg-locks.html
How to troubleshoot advisory lock stalls
If you suspect advisory locks are the bottleneck:
- Find advisory locks and who holds them:
SELECT
l.pid,
a.usename,
a.application_name,
a.client_addr,
l.mode,
l.granted,
now() - a.xact_start AS xact_age,
left(a.query, 200) AS query_preview
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.locktype = 'advisory'
ORDER BY l.granted DESC, xact_age DESC NULLS LAST;
- Identify whether you have:
- one “global” key everyone wants, or
- many keys (per-entity contention), or
- leaked session locks (holders idle for a long time).
References:
- pg_locks: https://www.postgresql.org/docs/current/view-pg-locks.html
- pg_stat_activity: https://www.postgresql.org/docs/current/monitoring-stats.html
Takeaway
Advisory locks are great for coordinating work (singleton jobs, migration serialization, cross-table critical sections) when used with scoped keys, transaction-level locks, and try + backoff patterns. They’re dangerous when used as a global mutex or held across unpredictable waits, because they can create self-inflicted stop-the-world behavior.
42. Application-Level Tactics: Retry Logic, Jitter, and Backpressure
Database locking is only half the story. The other half is what your application does when it hits a lock wait. Most “stop-the-world” incidents become outages because applications respond to timeouts and transient lock failures by doing the worst possible thing: piling on.
The goal at the application layer is to convert “lock contention” into “controlled slowdown,” using:
- retry logic (for the right errors),
- jittered backoff (so retries don’t synchronize),
- backpressure (so you don’t overload the database and your pool).
References:
- PostgreSQL timeouts (lock_timeout, statement_timeout): https://www.postgresql.org/docs/current/runtime-config-client.html
- Advisory lock functions (try-style patterns): https://www.postgresql.org/docs/current/functions-admin.html
Why naive retries make lock incidents worse
During a lock freeze:
- many requests are already waiting,
- the pool is near saturation,
- latency is spiking.
If every request times out and instantly retries, you increase concurrency precisely when the system is least able to handle it. This creates:
- more waiting sessions,
- longer queues,
- more timeouts,
- and a “retry storm” that can keep the system unstable even after the root blocker is gone.
So the problem is not “retries are bad,” it’s “unbounded, synchronized retries are bad.”
Retry logic: what to retry and what not to retry
Retryable categories (common)
These are often safe to retry if the operation is idempotent (or you have an idempotency key):
- Lock timeout: statement canceled due to lock_timeout
- Statement timeout: statement canceled due to statement_timeout (sometimes retryable, sometimes a sign of a real performance issue)
- Deadlock detected: Postgres aborts one participant; retry is the normal pattern
- Serialization failures at SERIALIZABLE (retry is required by design)
Postgres documents SQLSTATE codes; in practice you usually match on SQLSTATE rather than parsing text.
Reference: SQLSTATE appendix (deadlock_detected 40P01, serialization_failure 40001, lock_not_available 55P03, query_canceled 57014)
https://www.postgresql.org/docs/current/errcodes-appendix.html
Usually not retryable (or only with human-level caution)
- Syntax/permission errors
- Constraint violations (unless your business logic expects it)
- “relation does not exist” / schema mismatch (deploy ordering issue)
- Out-of-disk / corruption / admin shutdown type errors
A helpful rule: retry only when the failure is plausibly transient and retrying won’t amplify harm.
Backoff and jitter: how to retry without stampeding
Exponential backoff
Instead of retrying immediately, wait longer after each failure:
- attempt 1: wait ~50ms
- attempt 2: wait ~100ms
- attempt 3: wait ~200ms
- cap at some maximum (e.g., 1–2 seconds)
Add jitter (randomness)
If every worker uses the same backoff schedule, they wake up together and collide again. Jitter desynchronizes retries.
Two common approaches:
- full jitter: pick a random value between 0 and the current backoff cap
- equal jitter: wait half the cap plus a random half
Either is fine; the important point is “not synchronized.”
Practical example pseudocode:
base = 50ms
cap = 2000ms
for attempt in 1..N:
    try operation
    if success: return
    sleep random(0, min(cap, base * 2^(attempt-1)))
Backpressure: stop feeding the queue
Retries handle individual failures; backpressure protects the system.
1) Bound concurrency to the database
Your app should have a hard upper bound on concurrently in-flight DB work. Usually that’s your connection pool size, but also consider:
- limiting the number of worker threads that can simultaneously execute DB-heavy code paths,
- using bulkheads: separate pools for “critical path” and “background jobs.”
The aim is: a lock incident in a background job should not starve your user-facing traffic.
2) Fail fast when the pool is exhausted
If a request can’t get a DB connection quickly, it’s often better to:
- return a fast error (or degraded response),
- rather than wait, time out, and retry.
This prevents “threads stuck waiting for pool” from taking down the whole app server.
3) Adaptive throttling
If you detect high DB latency or lock timeout rates, dynamically reduce request rate for heavy endpoints (queue, shed load, or degrade). This is especially valuable during migrations or partial outages.
Align app behavior with database timeouts
A healthy pattern is:
- Postgres lock_timeout is short for contentious operations, so DDL fails fast.
- App request timeout is slightly larger than the normal DB query budget.
- App retries are bounded and jittered.
If your application timeout is shorter than lock_timeout or statement_timeout, you can end up in a weird state:
- the app gives up and retries,
- but the original statement is still waiting in Postgres,
- multiplying load.
So you want your timeout layers to be coherent.
Reference: timeouts
https://www.postgresql.org/docs/current/runtime-config-client.html
Use try-style locks for singleton workflows
If you use advisory locks or other "only one worker should do this" patterns, prefer try-lock + backoff:
- pg_try_advisory_lock / pg_try_advisory_xact_lock
This avoids having dozens of workers blocked on the same lock key.
Reference: advisory lock functions
https://www.postgresql.org/docs/current/functions-admin.html
Practical checklist for production apps
- Retry only retryable errors (use SQLSTATE).
- Make retries bounded (max attempts, max total time).
- Add jitter to every retry.
- Cap concurrency per DB (pool + bulkheads).
- Fail fast if the pool is exhausted.
- Use idempotency keys for write retries.
- Avoid long transactions in request flows (no user think-time / network calls inside a transaction).
Takeaway
Locking incidents become outages when applications amplify them. Good application behavior—bounded retries with jitter, plus backpressure and sensible timeouts—turns “database waiting” from a cascading failure into a controlled degradation that recovers quickly once the root blocker is gone.
43. Connection Poolers and Lock Storms: What PgBouncer Changes
Connection poolers (most famously PgBouncer) are often deployed to protect PostgreSQL from having to manage thousands of client connections. They can dramatically improve stability and throughput. But they also change the shape of lock incidents—sometimes for the better, sometimes in surprising ways.
This section focuses on how poolers interact with lock storms and “stop-the-world” events, and what to configure so the pooler doesn’t become an outage amplifier.
References:
- PostgreSQL connection/auth/timeouts settings: https://www.postgresql.org/docs/current/runtime-config-client.html
- pg_stat_activity and monitoring: https://www.postgresql.org/docs/current/monitoring-stats.html
- PgBouncer documentation (pool modes, transaction pooling, etc.): https://www.pgbouncer.org/ (external)
(PgBouncer behavior is defined by its own docs; Postgres docs don’t fully describe it.)
What a pooler changes at a high level
Without a pooler, every application connection is usually a PostgreSQL backend process. During a lock storm:
- requests hang,
- apps open more connections (or keep them stuck),
- Postgres ends up with too many backends,
- memory grows, context switching increases, and recovery gets harder.
With a pooler:
- the app can have many client connections,
- but the database sees a smaller, controlled number of server connections.
That means a lock storm is less likely to become “we hit max_connections and everything fell over.”
The big concept: poolers cap concurrency (this is usually good)
Lock storms are queueing problems: too many requests pile up behind a blocker.
A pooler helps by limiting how many queries can be simultaneously waiting inside Postgres. Instead of 2,000 app threads all waiting on locks in the database, you might have:
- 100 server connections waiting in Postgres,
- and 1,900 client connections queued in PgBouncer.
That can be a major stability win because Postgres isn’t forced to context-switch and manage lock state for thousands of waiting backends.
But the queue doesn’t disappear: it moves
A pooler doesn’t fix blocking—it relocates the backlog:
- without a pooler: queue is mostly inside Postgres
- with a pooler: queue is partly inside PgBouncer
Operational implication:
- Your app may still time out, but Postgres metrics might look “less insane.”
- You must monitor PgBouncer too, or you’ll miss the pileup.
Pool modes: how PgBouncer changes transaction behavior
Session pooling
- One client connection maps to one server connection for the life of the client session.
Pros:
- Works with session features: temp tables, prepared statements (server-side), session variables, etc.
- Least surprising.
Cons:
- Less efficient at sharing server connections.
- During lock storms, server connections can still get “stuck” per client.
Transaction pooling
- A server connection is assigned to a client only for the duration of a transaction, then returned to the pool.
Pros:
- Greatly improves server connection reuse.
- Often the best mode for OLTP workloads.
Cons (very important):
- You cannot rely on session state being preserved (session-level prepared statements, SET variables, temp tables, advisory session locks, etc.).
- If your app assumes session affinity, transaction pooling can cause correctness bugs.
Statement pooling
- Even more aggressive reuse (per statement). Rarely used for typical apps because it breaks too many assumptions.
PgBouncer reference: pool modes
https://www.pgbouncer.org/
Lock storms under PgBouncer: what gets better
1) You’re less likely to melt Postgres with connection count
By capping server connections, you reduce:
- per-backend memory overhead,
- CPU spent on managing thousands of idle/waiting processes,
- the “death spiral” where timeouts cause more connection attempts.
2) You can protect critical traffic with separate pools
A powerful pattern is to run multiple logical pools / users:
- a small, protected pool for migrations/maintenance,
- a primary pool for OLTP traffic,
- a background pool for batch jobs.
This prevents a batch backfill from consuming all server connections during a lock incident.
(Exact implementation depends on your PgBouncer config.)
Lock storms under PgBouncer: what can get worse or more confusing
1) More client timeouts while Postgres looks “fine”
Because the queue is now in PgBouncer, clients may time out waiting for a server connection even though Postgres isn’t at max connections. This is still an outage, just with different symptoms.
2) “Idle in transaction” is still deadly
PgBouncer doesn’t magically prevent long transactions. In session pooling, an idle-in-transaction client can tie up a server connection and hold locks. In transaction pooling, a client that holds a transaction open still holds the server connection for that transaction duration.
So the classic root causes remain: long transactions + strong lock requests.
Reference: idle_in_transaction_session_timeout
https://www.postgresql.org/docs/current/runtime-config-client.html
3) Prepared statements and session state assumptions can break
If you switch to transaction pooling but your application uses:
- server-side prepared statements,
- temp tables,
- session-level advisory locks,
- session-local SET values,
you can get functional bugs that are hard to trace because they appear only under load or failover.
Operational advice: settings and tactics that help during lock storms
Use database timeouts anyway
Even with PgBouncer, set:
- lock_timeout for DDL/migrations
- statement_timeout for runaway queries
- idle_in_transaction_session_timeout to kill “lock hostage” sessions
Reference:
https://www.postgresql.org/docs/current/runtime-config-client.html
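Because session-level SET from the application is unreliable under transaction pooling, one hedged approach is to attach the timeouts to roles (or the database) so every pooled connection inherits them. A minimal sketch, with hypothetical role names and illustrative values:
SQL
-- Role names are hypothetical; tune the values to your workload.
ALTER ROLE app_user SET statement_timeout = '30s';
ALTER ROLE app_user SET idle_in_transaction_session_timeout = '60s';
ALTER ROLE migrator SET lock_timeout = '2s';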
Size pools intentionally
- Too many server connections → you lose the main benefit.
- Too few server connections → you can create artificial bottlenecks and longer client queues.
A good configuration is workload-specific, but the principle is: use PgBouncer to cap and shape concurrency, not to maximize it.
Monitor both sides
- Postgres: pg_stat_activity, pg_locks, commit rate
- PgBouncer: client queue length, wait times, pool saturation
(PgBouncer exposes its own stats via its admin console; see its docs.)
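As a minimal sketch of the PgBouncer side, its admin console (connect to the special pgbouncer database) exposes the client queue directly; the exact columns depend on your PgBouncer version:
SQL
-- Run against the special "pgbouncer" admin database, not your application database.
SHOW POOLS;   -- per-pool client/server counts, including clients waiting for a server
SHOW STATS;   -- cumulative counters, including average wait times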
Takeaway
PgBouncer doesn’t prevent locks, but it changes lock storms from “Postgres has 2,000 stuck backends” to “a controlled number of Postgres backends are stuck and the rest are queued in the pooler.” That usually improves stability and recovery, but it also makes incidents easier to misdiagnose unless you monitor PgBouncer. Choose the right pool mode (session vs transaction), avoid session-state assumptions in transaction pooling, and keep database-level timeouts in place to prevent lock queue amplification.
44. High Availability Considerations: Replication, Failover, and DDL
High availability (HA) changes the operational meaning of “stop-the-world.” A lock incident might be isolated to the primary, but its effects can propagate through replication lag, failover behavior, and read-scaling patterns. DDL in particular has sharp edges in HA setups because schema changes are both lock-heavy and replication-visible.
This section covers the main HA interactions you need to think about: streaming replication lag, failover timing/consistency, and how to run DDL without turning replicas (or failovers) into a second incident.
References:
- High availability / replication overview: https://www.postgresql.org/docs/current/high-availability.html
- WAL / streaming replication concepts: https://www.postgresql.org/docs/current/wal-intro.html
- Locking and DDL: https://www.postgresql.org/docs/current/explicit-locking.html
- ALTER TABLE and transactional DDL: https://www.postgresql.org/docs/current/sql-altertable.html
Replication reality: locks don’t replicate, but their effects do
Locks are local to a server. A lock queue on the primary does not “replicate” to a standby.
But the work that causes the lock incident often has downstream effects:
- DDL and big maintenance operations can generate lots of WAL.
- Heavy write bursts (including backfills and rewrites) can increase replication lag.
- If your app reads from replicas, lag can become an availability problem: stale reads, inconsistent behavior, or read timeouts.
Reference: replication and WAL
https://www.postgresql.org/docs/current/high-availability.html
https://www.postgresql.org/docs/current/wal-intro.html
How DDL behaves with streaming replication
Most DDL is transactional: it commits on the primary and then the commit record replicates. Standbys apply WAL in order, so:
- the schema change appears on the standby after it replays the commit,
- but only once it reaches that WAL position.
This creates a few practical hazards:
1) Schema drift during lag
If you deploy app code expecting the new schema immediately, but replicas lag, you can get:
- “column does not exist” errors on read replicas,
- or ORM query failures if reads go to standbys.
This is why migration sequencing matters more with replicas.
Reference: transactional behavior and HA consistency topics
https://www.postgresql.org/docs/current/high-availability.html
https://www.postgresql.org/docs/current/sql-altertable.html
2) Rewrites and large backfills increase lag
Operations that rewrite tables or generate heavy WAL (large updates, index builds, VACUUM FULL, CLUSTER) can push replicas behind. Even if the primary survives the lock risk, your read capacity might collapse because replicas can’t keep up.
Reference: WAL generation concepts
https://www.postgresql.org/docs/current/wal-intro.html
Failover considerations: what happens to locks and in-flight DDL?
During failover:
- existing locks on the old primary disappear because that server is no longer the primary.
- in-flight transactions (including DDL) on the old primary are either:
- committed and replicated (if they had committed and been replayed), or
- lost/rolled back (if they never committed or didn’t reach the standby).
Practical implications:
1) “Did the migration happen or not?”
After failover, you can end up in an uncertain state if:
- the migration client saw an error/disconnect mid-flight,
- and you don’t know whether the DDL committed.
Because DDL is transactional, the truth is in the new primary’s catalogs: check schema state rather than assuming success/failure from the client’s perspective.
Reference: HA and transactional guarantees
https://www.postgresql.org/docs/current/high-availability.html
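For example, a quick way to verify whether a column-adding migration actually committed is to query the catalogs on the new primary; a minimal sketch with hypothetical table/column names:
SQL
-- Did the migration's column land on the new primary?
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'orders'
  AND column_name = 'shipped_at';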
2) Prepared statements / cached plans across failover
Applications often cache prepared statements or schemas. After failover + DDL changes, apps can see plan invalidation issues or errors until they reconnect or refresh. This is an app/driver concern, but it’s commonly triggered by DDL rollouts in HA environments.
(Postgres handles plan invalidation internally on one server; across failover you’re effectively connecting to a different server.)
Read replicas and “stop-the-world”: the gotcha with locking + stale reads
If you do “expand/backfill/contract” migrations with read replicas:
- Expand (add new nullable columns, new tables, new indexes) is usually safe.
- During lag, some readers might still be on the old schema.
- If your app uses replicas for reads, you must ensure your code path is compatible with both schemas until replicas catch up.
This is why expand/backfill/contract is not just about locks—it’s also about replication-safe compatibility.
Reference: partitioned deployment / rolling upgrades ideas map back to HA docs and transactional DDL
https://www.postgresql.org/docs/current/high-availability.html
Operational patterns that reduce HA pain during DDL
1) Prefer backward-compatible schema changes
Expand/backfill/contract is especially valuable in HA because it tolerates replication lag and rolling deploys.
Reference: ALTER TABLE and constraint validation options
https://www.postgresql.org/docs/current/sql-altertable.html
2) Avoid rewrite-heavy operations on the primary during peak
Even if you can tolerate a lock, you might not tolerate the replication lag.
Treat these as high risk for HA:
- VACUUM FULL
- CLUSTER
- large backfills without throttling
- non-concurrent index rebuilds
- large ALTER TABLE ... TYPE rewrites
References:
https://www.postgresql.org/docs/current/sql-vacuum.html
https://www.postgresql.org/docs/current/sql-cluster.html
https://www.postgresql.org/docs/current/sql-altertable.html
3) Use timeouts to keep DDL from becoming a long lock queue
HA doesn’t prevent queue amplification on the primary. Use:
- lock_timeout (fail fast)
- statement_timeout (cap runtime)
Reference: https://www.postgresql.org/docs/current/runtime-config-client.html
4) Plan cutovers with replication state in mind
If your app uses replicas:
- don’t switch reads/writes to the new schema until replicas have replayed the DDL commit (or route that code path to the primary temporarily).
- in some setups, teams gate deploy phases on replication lag thresholds.
(Postgres provides replication status views; exact choice depends on your HA tooling.)
Reference: replication monitoring views (in HA docs and monitoring stats)
https://www.postgresql.org/docs/current/high-availability.html
https://www.postgresql.org/docs/current/monitoring-stats.html
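As a minimal sketch of such a gate, run on the primary (pg_stat_replication is the standard view; column availability such as replay_lag varies by version):
SQL
-- Per-standby replay lag as seen from the primary.
SELECT application_name,
       client_addr,
       state,
       replay_lag
FROM pg_stat_replication;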
Takeaway
In HA systems, lock incidents are only part of the risk. DDL and heavy maintenance can also create replication lag, schema visibility gaps between primary and replicas, and uncertainty during failover about whether a migration committed. The safest approach is to use backward-compatible migrations (expand/backfill/contract), avoid rewrite-heavy operations on hot primaries, and keep strict timeouts so DDL never becomes a long-lived head-of-line blocker.
45. Scheduling Risky Operations: Maintenance Windows vs Online Techniques
Not every PostgreSQL operation can be made “online,” and not every operation needs a midnight maintenance window. The art is knowing which changes are inherently downtime-prone (strong locks, table rewrites) versus which can be engineered to be low impact (concurrent builds, phased migrations).
This section gives you a scheduling framework: when to insist on a maintenance window, when to use online techniques, and how to avoid becoming the head-of-line lock waiter either way.
References:
- Lock modes and conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
- ALTER TABLE: https://www.postgresql.org/docs/current/sql-altertable.html
- CREATE INDEX CONCURRENTLY: https://www.postgresql.org/docs/current/sql-createindex.html
- REINDEX CONCURRENTLY: https://www.postgresql.org/docs/current/sql-reindex.html
- VACUUM FULL: https://www.postgresql.org/docs/current/sql-vacuum.html
- CLUSTER: https://www.postgresql.org/docs/current/sql-cluster.html
- Timeouts: https://www.postgresql.org/docs/current/runtime-config-client.html
The decision: “online” vs “window” is about failure mode
Ask two questions:
- Does this operation require a strong lock for a meaningful duration? If yes, it’s downtime-prone.
- Can we redesign it into smaller steps that hold strong locks only briefly? If yes, you can usually do it online.
If the answer to (2) is “no,” you either accept a maintenance window or accept outage risk.
Operations that usually belong in a maintenance window
These commonly cause stop-the-world behavior because they rewrite tables or need ACCESS EXCLUSIVE in ways that can last:
- VACUUM FULL (table rewrite)
  Reference: https://www.postgresql.org/docs/current/sql-vacuum.html
- CLUSTER (table rewrite)
  Reference: https://www.postgresql.org/docs/current/sql-cluster.html
- Many ALTER TABLE ... ALTER COLUMN TYPE ... (often rewrites)
  Reference: https://www.postgresql.org/docs/current/sql-altertable.html
- Large-scale schema changes that touch a central hot table and can’t be phased
- Non-concurrent index rebuild patterns when CONCURRENTLY is not possible/available
Even in a window, still set lock_timeout and statement_timeout so a “window change” doesn’t become an all-night incident.
Reference: timeouts
https://www.postgresql.org/docs/current/runtime-config-client.html
Operations that are good candidates for online techniques
These are commonly doable during business hours if you do them correctly:
- CREATE INDEX CONCURRENTLY
  Reference: https://www.postgresql.org/docs/current/sql-createindex.html
- REINDEX CONCURRENTLY (when supported)
  Reference: https://www.postgresql.org/docs/current/sql-reindex.html
- Adding nullable columns (no default), then backfilling in batches
  Reference: https://www.postgresql.org/docs/current/sql-altertable.html
- Adding constraints as NOT VALID, then validating later (see the sketch after this list)
  Reference: https://www.postgresql.org/docs/current/sql-altertable.html
- Expand/backfill/contract migrations (built from the primitives above)
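As a sketch of the NOT VALID pattern referenced above (table and constraint names are hypothetical):
SQL
-- Step 1: add the constraint without scanning existing rows (brief lock, no long hold).
ALTER TABLE orders
  ADD CONSTRAINT orders_amount_positive CHECK (amount > 0) NOT VALID;

-- Step 2: validate later; this scans the table but takes a weaker lock.
ALTER TABLE orders VALIDATE CONSTRAINT orders_amount_positive;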
Online doesn’t mean “no risk.” It means “bounded and observable risk,” where the worst case is typically “it fails fast and you try later,” not “it locked the table for 20 minutes.”
The “gray zone”: depends on table hotness and your tooling
Some operations can be either windowed or online depending on traffic, table size, and appetite for risk:
- ALTER TABLE ... ADD COLUMN ... DEFAULT ...
  Could be fast with “fast defaults” in some versions and cases, but can also be rewrite-prone or lock-prone. Verify on your version and table; otherwise use expand/backfill/contract.
- DROP COLUMN / DROP CONSTRAINT
  Often quick but still needs strong locking; safe if you can acquire the lock quickly, risky if the table is extremely hot.
- Partition attach/detach operations
  Can be quick but can still require strong coordination locks.
References:
- ALTER TABLE: https://www.postgresql.org/docs/current/sql-altertable.html
- Partitioning: https://www.postgresql.org/docs/current/ddl-partitioning.html
Scheduling guidance that works in real systems
1) Prefer “online” when you can bound the lock wait
If you can run the operation with:
- a short lock_timeout (so it won’t queue),
- and a rollback/retry plan,
then you can often schedule it during normal hours, because the worst case is a quick failure.
Example pattern:
SQL
BEGIN;
SET LOCAL lock_timeout = '2s';
SET LOCAL statement_timeout = '60s';
-- small DDL step
COMMIT;
Reference: https://www.postgresql.org/docs/current/runtime-config-client.html
2) Use maintenance windows when the lock hold time is inherently long
If the operation:
- must rewrite a very large table,
- will generate huge I/O/WAL,
- or must hold strong locks for minutes,
then a window is the honest choice. Trying to “online” it usually just means “we risk a surprise outage during peak.”
3) Separate risk: do the dangerous part when traffic is low
Even within “online” techniques, you can schedule the riskiest micro-steps (like constraint validation or the final cutover/rename/drop) during a lower-traffic period, while doing the long backfill earlier in the day with throttling.
4) Don’t overlap risky operations
Avoid running:
- backfills,
- index builds,
- vacuum-heavy maintenance,
- and schema changes
at the same time on the same hot tables. Overlap increases contention and can push you into LWLock/I/O bottlenecks even if locks are “fine.”
A practical pre-flight checklist before you schedule
- □ Is the target table on a critical path and continuously written?
- □ Do you have long-running transactions or “idle in transaction” sessions regularly?
- □ Can you use CONCURRENTLY or NOT VALID to reduce blocking?
- □ Have you set lock_timeout so you won’t become head-of-line?
- □ What’s the rollback plan if it fails halfway?
- □ For HA setups: can your replicas absorb the WAL (lag risk)?
References:
- Locking: https://www.postgresql.org/docs/current/explicit-locking.html
- HA: https://www.postgresql.org/docs/current/high-availability.html
Takeaway
Maintenance windows are for operations that are inherently exclusive or rewrite-heavy. Online techniques are for operations you can decompose into short-lock steps plus throttled background work. The safest scheduling strategy is to choose the approach that gives you a predictable failure mode—then enforce that predictability with timeouts, phased rollouts, and “no overlap” discipline.
46. Incident Response: Step-by-Step When the Database Appears “Hung”
When PostgreSQL “hangs,” it’s usually not dead. It’s waiting—on locks, I/O, internal contention, or downstream dependencies. The fastest recoveries come from running a consistent playbook that answers, in order:
- Is it locks or something else?
- If it’s locks, what is the hot object and who is the root blocker?
- What is the least risky intervention that restores throughput?
References:
- Monitoring stats / wait events: https://www.postgresql.org/docs/current/monitoring-stats.html
- pg_locks: https://www.postgresql.org/docs/current/view-pg-locks.html
- pg_blocking_pids(): https://www.postgresql.org/docs/current/functions-info.html
- Admin functions (pg_cancel_backend, pg_terminate_backend): https://www.postgresql.org/docs/current/functions-admin.html
- Locking overview: https://www.postgresql.org/docs/current/explicit-locking.html
Step 0: Stabilize the system so you can diagnose
Before you dive into SQL:
- Pause/slow any known deploys, migrations, or batch jobs if you control them.
- If your application is retrying aggressively, reduce retry rate or shed load if possible (to stop the pileup from growing).
This isn’t “fixing the lock,” it’s preventing amplification so your actions have effect.
Step 1: Confirm whether you have a lock incident
Run:
SQL
SELECT
wait_event_type,
count(*) AS sessions
FROM pg_stat_activity
WHERE datname = current_database()
GROUP BY wait_event_type
ORDER BY sessions DESC;
Interpretation:
- Lots of wait_event_type = 'Lock' → classic lock storm.
- Mostly LWLock* → internal contention.
- Mostly I/O waits → storage/throughput issue.
- Mostly Client → clients not sending/receiving (less likely “DB hung”).
Reference: wait events
https://www.postgresql.org/docs/current/monitoring-stats.html
If it’s not “Lock,” your lock playbook may not help; you need the wait-event-specific path. If it is “Lock,” continue.
Step 2: Identify the worst waiters (what’s piling up)
SQL
SELECT
pid,
usename,
application_name,
state,
now() - query_start AS query_age,
wait_event_type,
wait_event,
left(query, 200) AS query_preview
FROM pg_stat_activity
WHERE datname = current_database()
AND wait_event_type = 'Lock'
ORDER BY query_age DESC
LIMIT 50;
This shows:
- how long sessions have been stuck,
- what kinds of queries are waiting (often points to the hot table or DDL).
Step 3: Find the hot relation (the blast radius)
If it’s relation-lock driven, pg_locks will show many waiters on one table:
SQL
SELECT
relation::regclass AS relation,
mode,
count(*) AS waiting
FROM pg_locks
WHERE locktype = 'relation'
AND NOT granted
GROUP BY 1, 2
ORDER BY waiting DESC, relation;
If you see one relation dominating, that’s your choke point.
Reference: pg_locks
https://www.postgresql.org/docs/current/view-pg-locks.html
Step 4: Identify the root blocker(s)
Use pg_blocking_pids() to jump from victims to blockers:
SQL
WITH waiters AS (
SELECT
pid AS waiting_pid,
unnest(pg_blocking_pids(pid)) AS blocking_pid
FROM pg_stat_activity
WHERE datname = current_database()
AND wait_event_type = 'Lock'
)
SELECT
b.pid,
b.usename,
b.application_name,
b.client_addr,
b.state,
now() - b.xact_start AS xact_age,
now() - b.query_start AS query_age,
left(b.query, 200) AS query_preview,
count(*) AS blocked_sessions
FROM waiters w
JOIN pg_stat_activity b ON b.pid = w.blocking_pid
GROUP BY 1,2,3,4,5,6,7,8
ORDER BY blocked_sessions DESC, xact_age DESC NULLS LAST;
What to look for:
- high blocked_sessions
- old xact_age
- state = 'idle in transaction' (very common root blocker)
Reference: pg_blocking_pids()
https://www.postgresql.org/docs/current/functions-info.html
Step 5: Determine if you’re seeing “head-of-line DDL” amplification
Now check if there’s a strong lock request queued (common amplifier):
SQL
SELECT
a.pid,
a.usename,
a.application_name,
now() - a.query_start AS waiting_for,
l.mode,
l.relation::regclass AS relation,
left(a.query, 200) AS query_preview
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.locktype = 'relation'
AND NOT l.granted
ORDER BY waiting_for DESC;
If you see a long-waiting AccessExclusiveLock request on a hot table, you likely have:
- a root blocker (long transaction) plus
- a head-of-line waiter (DDL) causing queue amplification.
Reference: lock modes
https://www.postgresql.org/docs/current/explicit-locking.html
Step 6: Choose the least risky intervention
Your options, from least to most disruptive:
A) Stop the amplifier (cancel the head-of-line DDL)
If a migration is waiting for AccessExclusiveLock and causing pileups, canceling it can immediately let compatible reads/writes proceed.
SQL
SELECT pg_cancel_backend(<ddl_pid>);
Reference: admin functions
https://www.postgresql.org/docs/current/functions-admin.html
B) End the root blocker (terminate the session holding locks)
If the root blocker is idle in transaction or running an unacceptably long query, you may need:
SQL
SELECT pg_terminate_backend(<blocking_pid>);
This rolls back its transaction and releases locks, usually restoring service quickly.
Reference: https://www.postgresql.org/docs/current/functions-admin.html
C) Reduce load while you work
If you can throttle background jobs, pause deploys, or shed non-critical traffic, do it—especially if you’re seeing multiple blockers or broad contention.
Step 7: Verify recovery (don’t stop at “we killed the blocker”)
After intervention, confirm the system is draining:
- lock wait count drops:
SQL
SELECT count(*)
FROM pg_stat_activity
WHERE datname = current_database()
  AND wait_event_type = 'Lock';
- throughput returns (sample pg_stat_database deltas):
SQL
SELECT xact_commit, xact_rollback
FROM pg_stat_database
WHERE datname = current_database();
- pool returns to normal and p95/p99 latency drops (app metrics).
Reference: monitoring stats
https://www.postgresql.org/docs/current/monitoring-stats.html
Step 8: Capture evidence for the postmortem
Before everything disappears, capture:
- top blockers and their queries (pg_stat_activity)
- relevant lock snapshots (pg_locks)
- whether idle in transaction was involved
- the command that triggered the queue (often DDL)
- timestamps and affected relations
This is what lets you fix the root cause instead of repeating the incident.
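A minimal sketch for capturing that evidence before it disappears (snapshot table names are hypothetical; run it while the incident is still visible):
SQL
-- Persist point-in-time copies of the activity and lock views for the postmortem.
CREATE TABLE incident_activity_snapshot AS
SELECT now() AS captured_at, * FROM pg_stat_activity;

CREATE TABLE incident_locks_snapshot AS
SELECT now() AS captured_at, * FROM pg_locks;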
Takeaway
A “hung” Postgres is usually “waiting Postgres.” The fastest incident response is a disciplined sequence: confirm lock waits, find the hotspot relation, identify blockers via pg_blocking_pids(), remove the head-of-line amplifier if needed, terminate the true root blocker when necessary, and verify that waiters and latency are draining.
47. What to Kill First: Canceling Queries vs Terminating Sessions
When a lock incident is actively causing timeouts, you often have to intervene. In PostgreSQL you have two primary “stop it” tools:
- Cancel the current query: pg_cancel_backend(pid)
- Terminate the session: pg_terminate_backend(pid)
They are not interchangeable. The right choice depends on whether you’re dealing with:
- a long-running query that’s safe to stop, or
- a transaction that is holding locks/snapshots and won’t go away unless the session ends.
References:
- Admin functions (pg_cancel_backend, pg_terminate_backend): https://www.postgresql.org/docs/current/functions-admin.html
- pg_stat_activity session state: https://www.postgresql.org/docs/current/monitoring-stats.html
- Locking overview: https://www.postgresql.org/docs/current/explicit-locking.html
The difference in one sentence
- Cancel stops the currently running statement but keeps the session alive.
- Terminate kills the session, rolling back any open transaction and releasing its locks.
Canceling queries: pg_cancel_backend(pid)
When cancel is the right first move
Cancel is ideal when the problem is an actively running statement that you want to stop with minimal collateral damage, for example:
- a runaway SELECT doing a huge scan and holding resources,
- a long-running report query that is blocking DDL (or contributing to contention),
- a DDL statement that is waiting and acting as head-of-line (canceling it can remove queue amplification),
- a batch update that you’d rather retry later in smaller chunks.
How to do it:
SQL
SELECT pg_cancel_backend(<pid>);
What happens:
- Postgres sends an interrupt to that backend.
- If the backend is in a cancelable state, the current statement ends with an error (commonly “canceling statement due to user request”).
Important caveats:
- If the session is inside an explicit transaction, canceling a statement typically leaves the transaction in an aborted state; the client must ROLLBACK before it can do more work.
- Canceling does not necessarily release locks if the session remains in a transaction holding them. Some locks (especially those acquired by DDL or row locks held by an open transaction) may remain until transaction end.
Reference: admin functions and cancellation behavior
https://www.postgresql.org/docs/current/functions-admin.html
Terminating sessions: pg_terminate_backend(pid)
When termination is the right move
Termination is the “release the locks now” tool. It’s appropriate when the root blocker is:
- idle in transaction (classic lock hostage)
- holding row locks from an unfinished write transaction
- holding a strong relation lock (e.g., DDL transaction left open)
- a session that won’t respond to cancel quickly enough to restore service
How to do it:
SQL
SELECT pg_terminate_backend(<pid>);
What happens:
- The backend process is terminated.
- Any open transaction is rolled back.
- Locks held by that transaction/session are released.
This is often the fastest way to unstick a system, but it has higher risk:
- rolled-back work (writes undone),
- potential partial application workflows,
- possible user-visible errors for whatever was using that session.
Reference: admin functions
https://www.postgresql.org/docs/current/functions-admin.html
A practical “what to kill first” decision tree
1) If there’s a queued ACCESS EXCLUSIVE DDL acting as head-of-line
- Cancel the DDL waiter first (often lowest collateral damage).
- This can immediately allow compatible traffic (reads/writes) to proceed again.
Why: the DDL may not have changed anything yet; it’s just waiting and amplifying the queue.
Then, separately, fix the root cause (often a long transaction) before retrying the migration.
2) If the root blocker is idle in transaction
- Terminate the session.
- Canceling often does nothing because there may be no running statement to cancel, and the transaction (and its locks) will remain.
Use pg_stat_activity.state and xact_start to confirm.
Reference: session states
https://www.postgresql.org/docs/current/monitoring-stats.html
3) If the root blocker is a long-running query (active)
- Try cancel first.
- If it doesn’t stop quickly enough (or keeps coming back), terminate.
4) If you’re unsure whether a session holds locks past statement end
- Prefer terminate for true “lock hostage” situations (especially when you’ve identified it as the root blocker via pg_blocking_pids()).
- Prefer cancel when you’re addressing the head-of-line waiter or a clear runaway statement and want to minimize rollback impact.
Reference: lock behavior and transaction scope
https://www.postgresql.org/docs/current/explicit-locking.html
How to avoid making things worse
Before you kill anything, confirm you’re targeting the right PID(s):
- Use pg_blocking_pids() to identify blockers from victims.
  Reference: https://www.postgresql.org/docs/current/functions-info.html
- Check pg_stat_activity for:
  - state (idle in transaction vs active),
  - xact_start age,
  - query text and application name.
- If possible, kill the smallest blast radius first:
- cancel head-of-line DDL,
- terminate a single idle-in-transaction blocker,
- avoid mass-killing many victims (it often just creates retry storms).
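As a sketch that combines these checks into one pass (the suggested_action column is just a heuristic to guide the cancel-vs-terminate decision, not a rule):
SQL
-- One row per blocker: how many sessions it blocks and whether it is idle in transaction.
SELECT b.pid,
       b.state,
       now() - b.xact_start AS xact_age,
       count(*) AS blocked_sessions,
       CASE WHEN b.state = 'idle in transaction' THEN 'terminate'
            ELSE 'try cancel first' END AS suggested_action,
       left(b.query, 100) AS last_query
FROM pg_stat_activity w
CROSS JOIN LATERAL unnest(pg_blocking_pids(w.pid)) AS blocker(pid)
JOIN pg_stat_activity b ON b.pid = blocker.pid
WHERE w.wait_event_type = 'Lock'
GROUP BY b.pid, b.state, b.xact_start, b.query
ORDER BY blocked_sessions DESC;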
Takeaway
In lock incidents, “what to kill first” is about minimizing collateral damage while restoring throughput:
- Cancel (pg_cancel_backend) is best for stopping a currently running statement, especially a head-of-line DDL waiter or runaway query.
- Terminate (pg_terminate_backend) is best for ending lock-hostage sessions (especially idle in transaction) because it rolls back the transaction and releases locks immediately.
Use pg_blocking_pids() + pg_stat_activity to be sure you’re acting on the root blocker, not just the loudest victim.
48. Post-Incident Review: How to Prevent the Next Lock Freeze
The goal of a post-incident review isn’t to write a story about what happened. It’s to extract a small set of changes that make the next lock event:
- less likely to occur, and
- less damaging if it does occur.
For lock freezes, the highest-leverage fixes usually live in four places:
- transaction discipline,
- safer migrations and DDL practices,
- timeouts/guardrails,
- observability and runbooks.
References:
- Locking overview: https://www.postgresql.org/docs/current/explicit-locking.html
- Monitoring stats: https://www.postgresql.org/docs/current/monitoring-stats.html
- Runtime timeouts (lock_timeout, statement_timeout, idle_in_transaction_session_timeout): https://www.postgresql.org/docs/current/runtime-config-client.html
- Admin functions and advisory locks: https://www.postgresql.org/docs/current/functions-admin.html
1) Reconstruct the blocking chain (root blocker vs head-of-line waiter)
For lock incidents, the most important postmortem artifact is:
- Which PID was the root blocker?
- Which statement was the head-of-line waiter (often DDL)?
- Which relation(s) were the hotspots?
If you didn’t capture it live, it’s worth adding “snapshot locks and activity” to the incident runbook so you always have evidence next time (e.g., dumps of pg_stat_activity, pg_locks, and a blocker summary query).
Reference: pg_stat_activity / pg_locks
https://www.postgresql.org/docs/current/monitoring-stats.html
https://www.postgresql.org/docs/current/view-pg-locks.html
2) Fix the #1 root cause: long-lived and idle-in-transaction sessions
If the root blocker was:
- a long-running transaction,
- or idle in transaction,
then the prevention work is usually clear:
- make transactions shorter in the app,
- ensure exception paths always rollback,
- don’t keep transactions open across network calls or user think-time,
- set a defensive database timeout:
SQL
ALTER DATABASE mydb SET idle_in_transaction_session_timeout = '60s';
This converts “infinite lock hostage” into “bounded damage.”
Reference: idle_in_transaction_session_timeout
https://www.postgresql.org/docs/current/runtime-config-client.html
3) Make migrations fail fast instead of queueing
If a migration/DDL step was involved (common), standardize a migration policy:
- Every migration session sets:
  - lock_timeout (short)
  - statement_timeout (reasonable)
- Prefer online primitives:
  - CREATE INDEX CONCURRENTLY
  - REINDEX CONCURRENTLY (when supported)
  - NOT VALID constraints + later validation
  - expand/backfill/contract
A concrete standard:
SQL
BEGIN;
SET LOCAL lock_timeout = '2s';
SET LOCAL statement_timeout = '60s';
-- small DDL step
COMMIT;
References:
- timeouts: https://www.postgresql.org/docs/current/runtime-config-client.html
- CREATE INDEX CONCURRENTLY: https://www.postgresql.org/docs/current/sql-createindex.html
- constraints: https://www.postgresql.org/docs/current/sql-altertable.html
4) Add backpressure and sane retry behavior in the application
If the incident involved retry storms or pool exhaustion, the fix is often in the app:
- treat lock timeouts and deadlocks as retryable,
- use bounded exponential backoff with jitter,
- cap concurrency and isolate background work into separate pools,
- fail fast when the pool is exhausted rather than letting threads pile up.
Postgres reference for SQLSTATE codes (deadlock, serialization, etc.):
https://www.postgresql.org/docs/current/errcodes-appendix.html
5) Reduce the blast radius of “strong lock” operations
If the hotspot table is central and huge, consider structural changes:
- partitioning (so maintenance and some locks are per-partition),
- moving heavy reporting workloads off the primary,
- redesigning workflows to avoid large rewrites.
Reference: partitioning overview
https://www.postgresql.org/docs/current/ddl-partitioning.html
6) Observability upgrades that pay off next time
Make sure you can answer these quickly during the next incident:
- How many sessions are waiting on locks right now?
- Which relation is the hotspot?
- Who are the top blockers, and how old are their transactions?
- Is there a queued AccessExclusiveLock?
Practical upgrades:
- ensure you collect pg_stat_activity and lock snapshots during incidents,
- enable and use pg_stat_statements for recurring “slow/blocked query” patterns (if feasible),
- log slow statements (and consider logging lock waits via log_lock_waits).
Reference: runtime logging settings (including lock wait logging)
https://www.postgresql.org/docs/current/runtime-config-logging.html
- pg_stat_statements: https://www.postgresql.org/docs/current/pgstatstatements.html
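A hedged sketch of enabling lock-wait logging cluster-wide (thresholds are illustrative; lock waits are logged once they exceed deadlock_timeout, which defaults to 1s):
SQL
-- Log lock waits longer than deadlock_timeout to the server log.
ALTER SYSTEM SET log_lock_waits = on;
-- Also log statements that run longer than 5 seconds (value illustrative).
ALTER SYSTEM SET log_min_duration_statement = '5s';
SELECT pg_reload_conf();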
7) Turn the incident into a checklist change
The most effective postmortems end with concrete, testable changes like:
- “All migrations must set lock_timeout and run non-transactional steps for concurrent index builds.”
- “We enforce idle_in_transaction_session_timeout in production.”
- “We added pool isolation: background jobs can’t consume OLTP connections.”
- “We added an on-call query pack to identify blockers in <2 minutes.”
This is how you reduce recurrence, not by hoping people remember.
Takeaway
To prevent the next lock freeze, focus on the systemic amplifiers: long-lived transactions (especially idle in transaction), migrations that wait forever for locks, and application retry storms that turn blocking into outages. Add guardrails (timeouts), adopt online migration patterns (expand/backfill/contract, concurrently, NOT VALID), and improve observability and runbooks so diagnosis and recovery are fast and repeatable.
49. Checklist: Pre-Flight Questions Before Running DDL
Use this checklist before you run any DDL on a production (or production-like) PostgreSQL system—especially on hot tables. It’s designed to prevent the two most common failure modes:
- lock queue amplification (your DDL waits and stalls everyone), and
- long lock hold (your DDL gets the lock and blocks traffic longer than expected).
References:
- Lock modes and conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
- ALTER TABLE: https://www.postgresql.org/docs/current/sql-altertable.html
- Timeouts (lock_timeout, statement_timeout): https://www.postgresql.org/docs/current/runtime-config-client.html
- Monitoring (pg_stat_activity, pg_locks): https://www.postgresql.org/docs/current/monitoring-stats.html and https://www.postgresql.org/docs/current/view-pg-locks.html
Scope and risk
- □ What exact objects will this touch (tables, indexes, partitions, referenced tables via FKs, views)?
- □ Is the target relation on a critical path (high QPS, heavy writes, core endpoints)?
- □ Does this DDL rewrite the table or do a full scan/validation (high downtime risk)?
  - Common high-risk: ALTER COLUMN TYPE, VACUUM FULL, CLUSTER, some constraint validations
    Reference: https://www.postgresql.org/docs/current/sql-altertable.html
Locking expectations
- □ What lock mode is likely required (especially ACCESS EXCLUSIVE)?
  - Treat unknown DDL as “might request ACCESS EXCLUSIVE” until verified
    Reference: https://www.postgresql.org/docs/current/explicit-locking.html
- □ If it has to wait for the lock, will it become head-of-line and stall normal traffic?
- □ Are there long-running transactions or idle in transaction sessions right now?
  - Quick check: pg_stat_activity.xact_start, state
    Reference: https://www.postgresql.org/docs/current/monitoring-stats.html
Safety guardrails
- □ Are you setting a short lock_timeout so this fails fast instead of queueing?
- □ Are you setting statement_timeout to cap worst-case runtime?
statement_timeoutto cap worst-case runtime? - □ Are you running the smallest possible transaction scope (no lingering open transaction)?
- DDL locks are often held until transaction end
Reference: https://www.postgresql.org/docs/current/explicit-locking.html
Online alternatives and phased rollouts
- □ Can you use an online alternative? (see the sketch after this checklist block)
  - CREATE INDEX CONCURRENTLY instead of CREATE INDEX
    Reference: https://www.postgresql.org/docs/current/sql-createindex.html
  - REINDEX CONCURRENTLY instead of REINDEX (when supported)
    Reference: https://www.postgresql.org/docs/current/sql-reindex.html
  - Add constraints as NOT VALID, validate later
    Reference: https://www.postgresql.org/docs/current/sql-altertable.html
- □ Can you do expand/backfill/contract instead of a big-bang change?
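For reference, a minimal sketch of the concurrent-index primitive mentioned above (index and table names are hypothetical):
SQL
-- CREATE INDEX CONCURRENTLY must run outside a transaction block.
SET lock_timeout = '5s';
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);
-- If it fails partway, it leaves an INVALID index that should be dropped before retrying:
-- DROP INDEX CONCURRENTLY IF EXISTS idx_orders_customer_id;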
Operational timing and coordination
- □ Is this being run during an appropriate traffic window (or with a clear abort plan)?
- □ Are other risky operations running at the same time (backfills, large batch jobs, heavy maintenance)?
- □ In HA setups, have you considered replication lag and replica schema visibility?
Observability and rollback
- □ Do you know how you will monitor progress and blocking while it runs?
- pg_stat_activity, pg_locks, progress views where applicable
References:
https://www.postgresql.org/docs/current/monitoring-stats.html
https://www.postgresql.org/docs/current/view-pg-locks.html
https://www.postgresql.org/docs/current/progress-reporting.html
- □ Do you have a clear rollback/abort plan?
- If it times out or blocks, do you cancel it, terminate it, or wait?
- □ If this partially succeeds (e.g., concurrent index build interrupted), do you know cleanup steps?
Application readiness
- □ Is the app compatible with the new schema (and the old schema during rollout)?
- □ Are retries/backoff sane so transient lock timeouts don’t cause a retry storm?
- □ Are you sure no code relies on the thing you’re about to drop?
Minimal “pre-flight” SQL snippet (optional)
If you want a tiny set of checks you can run right before DDL:
SQL
-- 1) Look for long/idle transactions (common root blockers)
SELECT pid, usename, application_name, state,
now() - xact_start AS xact_age,
left(query, 200) AS query_preview
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start ASC
LIMIT 20;
-- 2) Confirm whether there is already lock contention
SELECT count(*) AS lock_waiters
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';
References:
https://www.postgresql.org/docs/current/monitoring-stats.html
Takeaway
If you can answer these pre-flight questions—especially around lock mode, table rewrites, timeouts, and long transactions—you’ll avoid the majority of production DDL lock incidents. The checklist is intentionally conservative: it’s cheaper to abort a migration than to recover from a lock-driven outage.
50. FAQ: Common Misconceptions About PostgreSQL “Stop-the-World” Locks
Below are the misconceptions that most often cause teams to get surprised by lock incidents in PostgreSQL—and the reality behind each one.
References (general):
- Lock modes and conflicts: https://www.postgresql.org/docs/current/explicit-locking.html
- Monitoring locks: https://www.postgresql.org/docs/current/view-pg-locks.html
- Session activity / wait events: https://www.postgresql.org/docs/current/monitoring-stats.html
- ALTER TABLE: https://www.postgresql.org/docs/current/sql-altertable.html
“Postgres has lock escalation like SQL Server / Oracle”
No. PostgreSQL does not do classic automatic lock escalation from many row locks to a table lock. When it feels like escalation, it’s usually one of:
- a strong table lock request (often DDL) entering the queue,
- long transactions holding locks open,
- lock queue amplification,
- or heavy row contention causing widespread waits.
Reference: locking behavior
https://www.postgresql.org/docs/current/explicit-locking.html
“MVCC means reads never block writes”
MVCC means plain reads (SELECT) usually don’t block writes (INSERT/UPDATE/DELETE) on the same rows. But reads still take relation locks (ACCESS SHARE) and can participate in conflicts with DDL and table-rewrite operations.
Also, reads that explicitly lock rows (SELECT ... FOR UPDATE/SHARE) absolutely can block writers.
Reference: MVCC overview and explicit locking
https://www.postgresql.org/docs/current/mvcc.html
https://www.postgresql.org/docs/current/explicit-locking.html
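A minimal two-session illustration of the second point (table name hypothetical):
SQL
-- Session A: takes a row lock and holds it until COMMIT or ROLLBACK.
BEGIN;
SELECT * FROM accounts WHERE id = 42 FOR UPDATE;

-- Session B: this UPDATE now waits for session A's transaction to finish.
UPDATE accounts SET balance = balance - 10 WHERE id = 42;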
“If I’m not running DDL, I can’t cause a stop-the-world event”
You can. The most common non-DDL trigger is a long-running transaction, especially one that becomes idle in transaction. It can:
- hold row locks indefinitely,
- block DDL lock acquisition,
- prevent vacuum cleanup,
- and enable queue pileups when a strong lock request shows up later.
Reference: monitoring sessions and transaction behavior
https://www.postgresql.org/docs/current/monitoring-stats.html
https://www.postgresql.org/docs/current/mvcc.html
“The query holding ACCESS EXCLUSIVE is always the culprit”
Sometimes, but not always.
Often the real problem is:
- a session holding a weaker lock for a long time (root blocker), plus
- a DDL statement waiting for ACCESS EXCLUSIVE (head-of-line waiter), plus
Canceling the waiting DDL may immediately restore traffic, even though it wasn’t “holding” the lock yet.
Reference: lock conflict rules and lock wait visibility
https://www.postgresql.org/docs/current/explicit-locking.html
https://www.postgresql.org/docs/current/view-pg-locks.html
“If I don’t see many locks in pg_locks, it’s not a lock problem”
Not necessarily.
- You need to look specifically for waiters (granted = false), not total lock rows.
- Some contention shows up as transactionid waits rather than obvious relation locks.
- And some “hangs” aren’t heavyweight locks at all: they’re LWLocks, I/O waits, or client/network waits, which won’t appear the same way in pg_locks.
References:
https://www.postgresql.org/docs/current/view-pg-locks.html
https://www.postgresql.org/docs/current/monitoring-stats.html
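A quick way to focus on waiters only (a sketch, not a full diagnosis):
SQL
-- Show only ungranted lock requests, grouped by lock type and mode.
SELECT locktype, mode, count(*) AS waiters
FROM pg_locks
WHERE NOT granted
GROUP BY locktype, mode
ORDER BY waiters DESC;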
“CREATE INDEX CONCURRENTLY means ‘no locks’”
No. It means “minimizes blocking of normal reads/writes,” not “lock-free.”
Concurrent builds still take locks at different phases and can still wait behind conflicting DDL. They are usually much safer for production, but you still want timeouts and monitoring.
Reference: CREATE INDEX CONCURRENTLY
https://www.postgresql.org/docs/current/sql-createindex.html
“Adding a column is always safe”
Adding a nullable column with no default is usually a fast metadata change. But:
- it can still require a strong lock to change the table definition,
- and adding a column with a default or enforcing NOT NULL can be much riskier depending on version and the operation.
Reference: ALTER TABLE
https://www.postgresql.org/docs/current/sql-altertable.html
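A hedged illustration of the difference (table and column names are hypothetical; the second statement’s cost depends on your PostgreSQL version):
SQL
-- Usually a fast, metadata-only change (still needs a brief ACCESS EXCLUSIVE lock):
ALTER TABLE orders ADD COLUMN notes text;

-- Behavior varies by version: older releases rewrote the table for a column with a
-- default; newer releases treat constant defaults as metadata-only "fast defaults".
ALTER TABLE orders ADD COLUMN status text NOT NULL DEFAULT 'new';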
“TRUNCATE is just a fast DELETE”
It’s fast, but it’s not “just a delete.”
TRUNCATE takes an ACCESS EXCLUSIVE lock on the table(s), which blocks reads and writes. With CASCADE, it can lock and truncate additional referencing tables—expanding blast radius quickly.
Reference: TRUNCATE
https://www.postgresql.org/docs/current/sql-truncate.html
“If something is blocked, Postgres will eventually resolve it like a deadlock”
Deadlocks are cycles and Postgres detects them and aborts a participant. Lock blocking is usually a queue, not a cycle, and can last indefinitely if the blocker doesn’t finish.
So “wait long enough” is not a strategy if the blocker is idle in transaction or a migration is queued behind a long-running query.
Reference: deadlocks and explicit locking
https://www.postgresql.org/docs/current/explicit-locking.html
“Killing random waiting queries will fix it”
Usually it won’t. Killing victims often creates a retry storm and doesn’t remove the root cause.
The effective fix is typically:
- cancel the head-of-line DDL waiter (if it’s amplifying), and/or
- terminate the root blocker holding locks or holding a transaction open.
Reference: admin functions
https://www.postgresql.org/docs/current/functions-admin.html
“If CPU is low, Postgres must be fine”
Low CPU during an outage is a classic sign of waiting. Lock waits don’t burn CPU. The database can be “healthy” in resource metrics while being completely unusable for your workload.
Use wait events and lock views to confirm what you’re waiting on.
Reference: pg_stat_activity wait events
https://www.postgresql.org/docs/current/monitoring-stats.html
“Setting timeouts is dangerous; it’ll just cause errors”
Timeouts do cause errors—and that’s the point. They convert “infinite waiting that takes down the whole system” into “bounded failures that your app/migration can retry safely.”
The tradeoff is that your application must handle these errors intentionally (retries with backoff/jitter for idempotent work, and clean rollbacks).
Reference: timeouts (lock_timeout, statement_timeout, idle_in_transaction_session_timeout)
https://www.postgresql.org/docs/current/runtime-config-client.html
Takeaway
Most PostgreSQL “stop-the-world” lock incidents come from misconceptions: assuming MVCC eliminates blocking, assuming Postgres escalates locks, assuming the waiting DDL is always the culprit, or assuming resource metrics reflect lock health. The reliable approach is to observe waits (pg_stat_activity), confirm lock types/modes (pg_locks), identify root blockers (pg_blocking_pids()), and enforce guardrails (timeouts + safe migration patterns).
51. Conclusion: Designing for Zero-Surprise Locking in PostgreSQL
PostgreSQL isn’t “bad at concurrency.” Most of the time it’s excellent at it—MVCC lets reads and writes coexist in ways that feel almost magical. The outages happen when a small number of predictable conditions line up:
- a hot relation that everyone touches,
- a long-running or idle-in-transaction session,
- a strong lock request (often ACCESS EXCLUSIVE) that enters the queue,
None of that is random. The goal of “zero-surprise locking” is to make these events boring: visible early, bounded by guardrails, and avoidable through standard practices.
References:
- Lock modes and behavior: https://www.postgresql.org/docs/current/explicit-locking.html
- Monitoring (pg_stat_activity, wait events): https://www.postgresql.org/docs/current/monitoring-stats.html
- pg_locks: https://www.postgresql.org/docs/current/view-pg-locks.html
- Timeouts: https://www.postgresql.org/docs/current/runtime-config-client.html
The core design principles
If you remember only a few principles, make them these:
- Keep transactions short.
  Long transactions, especially idle in transaction, are the silent enablers of lock freezes. They hold locks and snapshots open and make every DDL operation unpredictable.
  Reference: https://www.postgresql.org/docs/current/mvcc.html
- Assume hot-table DDL is dangerous until proven otherwise.
  Many schema changes need strong locks, and the wait to acquire them is often the real outage.
  Reference: https://www.postgresql.org/docs/current/sql-altertable.html
- Engineer failure modes with timeouts.
  lock_timeout prevents queue amplification; statement_timeout bounds total damage; idle_in_transaction_session_timeout prevents indefinite lock hostages.
  Reference: https://www.postgresql.org/docs/current/runtime-config-client.html
- Use online and phased techniques by default.
  Prefer CREATE INDEX CONCURRENTLY, REINDEX CONCURRENTLY, NOT VALID constraints + later validation, and expand/backfill/contract migrations.
References:
- https://www.postgresql.org/docs/current/sql-createindex.html
- https://www.postgresql.org/docs/current/sql-reindex.html
- https://www.postgresql.org/docs/current/sql-altertable.html
- Make the application part of the solution.
Bounded retries with jitter and real backpressure prevent lock incidents from turning into retry storms and pool exhaustion.
What “zero-surprise” looks like operationally
In a mature setup:
- migrations either succeed quickly or fail quickly (never “hang for 20 minutes”),
- one bad session can’t hold locks indefinitely,
- on-call can identify the root blocker in minutes using pg_stat_activity, pg_locks, and pg_blocking_pids(),
- and post-incident reviews produce concrete guardrails, not folklore.
References:
- pg_blocking_pids(): https://www.postgresql.org/docs/current/functions-info.html
- Admin tooling (pg_cancel_backend, pg_terminate_backend): https://www.postgresql.org/docs/current/functions-admin.html
The final takeaway
Stop-the-world moments in PostgreSQL are rarely mysterious. They’re the result of understandable lock rules interacting with real-world deployment and traffic patterns. If you design around those rules—short transactions, safe migrations, strict timeouts, good observability, and sane app backpressure—you can run PostgreSQL at high concurrency with very few “surprise freezes,” and when they do happen, you’ll have a predictable way to unwind them.
52. Further Reading: Official Docs and Deep Dives
Official PostgreSQL documentation
- Explicit Locking (lock modes, conflicts, lock behavior)
  https://www.postgresql.org/docs/current/explicit-locking.html
- The pg_locks system view (how to interpret lock rows)
  https://www.postgresql.org/docs/current/view-pg-locks.html
- Monitoring database activity (pg_stat_activity, wait events, stats views)
  https://www.postgresql.org/docs/current/monitoring-stats.html
- MVCC and transaction isolation (why long transactions hurt)
  https://www.postgresql.org/docs/current/mvcc.html
- ALTER TABLE reference (subcommands, constraints, validation, rewrite-related behavior)
  https://www.postgresql.org/docs/current/sql-altertable.html
- TRUNCATE reference
  https://www.postgresql.org/docs/current/sql-truncate.html
- VACUUM and VACUUM FULL reference
  https://www.postgresql.org/docs/current/sql-vacuum.html
- CREATE INDEX and CREATE INDEX CONCURRENTLY reference
  https://www.postgresql.org/docs/current/sql-createindex.html
- REINDEX and REINDEX CONCURRENTLY reference
  https://www.postgresql.org/docs/current/sql-reindex.html
- CLUSTER reference
  https://www.postgresql.org/docs/current/sql-cluster.html
- Progress reporting views (pg_stat_progress_*)
  https://www.postgresql.org/docs/current/progress-reporting.html
- Runtime configuration: client connection settings and timeouts (lock_timeout, statement_timeout, idle_in_transaction_session_timeout)
  https://www.postgresql.org/docs/current/runtime-config-client.html
- Admin functions for incident response (pg_cancel_backend, pg_terminate_backend, advisory locks)
  https://www.postgresql.org/docs/current/functions-admin.html
- Error codes / SQLSTATE reference (deadlocks, serialization failures, etc.)
  https://www.postgresql.org/docs/current/errcodes-appendix.html
- High availability and replication overview (DDL + replicas considerations)
  https://www.postgresql.org/docs/current/high-availability.html
- Partitioning (pruning, partition management, operational caveats)
  https://www.postgresql.org/docs/current/ddl-partitioning.html
- Constraints (including foreign keys) background documentation
  https://www.postgresql.org/docs/current/ddl-constraints.html
- pg_stat_statements extension (workload analysis and recurring offenders)
  https://www.postgresql.org/docs/current/pgstatstatements.html
Deep dives and ecosystem resources
- PgBouncer documentation (pool modes, behavior, stats, operational guidance)
  https://www.pgbouncer.org/
- PostgreSQL Wiki (assorted operational notes; quality varies by page)
  https://wiki.postgresql.org/
- PostgreSQL source code (ultimate reference for lock internals; advanced)
https://github.com/postgres/postgres
Suggested “study path” if you want mastery
- Read Explicit Locking and understand ACCESS EXCLUSIVE + lock conflicts.
  https://www.postgresql.org/docs/current/explicit-locking.html
- Learn incident inspection with pg_stat_activity and pg_locks.
  https://www.postgresql.org/docs/current/monitoring-stats.html
  https://www.postgresql.org/docs/current/view-pg-locks.html
- Learn migration-safe primitives: CONCURRENTLY, NOT VALID, and timeouts.
https://www.postgresql.org/docs/current/sql-createindex.html
https://www.postgresql.org/docs/current/sql-altertable.html
https://www.postgresql.org/docs/current/runtime-config-client.html