Are your database read operations unexpectedly slowing down as your workload scales? Many organizations running PostgreSQL-based systems encounter performance bottlenecks that aren’t immediately obvious. When many concurrent read operations access tables with numerous partitions or indexes, they can even exhaust PostgreSQL’s fast path locking mechanism, forcing the system to use shared memory locks. The switch from fast path to shared memory locking creates lightweight lock (LWLock) contention in the lock manager, affecting database performance even during read-only operations.
In this post, we explore how read-heavy workloads can trigger LWLock contention by exceeding fast path locking limits, an issue that can occur in any system based on the PostgreSQL engine and its locking mechanism. For demonstration purposes, we use Amazon Aurora PostgreSQL-Compatible Edition in this post. Through practical experiments, we demonstrate how partition scanning, index usage, and complex joins affect locking behavior. We also show how to identify when your workload is transitioning to slow path locking and implement specific techniques to optimize query design and schema structure for better performance.
As your read-heavy workload scales, database administrators (DBAs) need to monitor and address LWLock contention to keep your database healthy. Similarly, developers must avoid patterns that trigger excessive contention. When your database transitions from fast path to slow path locking, throughput can drop significantly; our tests in this post show up to a 34 percent performance difference. You can identify this throughput drop by monitoring wait events such as LWLock:LockManager and by examining the pg_locks view to see whether the fast path slots of the backend processes running the workload are exhausted. To address these bottlenecks, you need strategies such as effective partition pruning, careful index management, and simplified join patterns that keep your workload within PostgreSQL's limit of 16 fast path slots per backend process.
Understanding LWLock contention
LWLock is a synchronization primitive in PostgreSQL used to coordinate access to shared memory structures. Unlike heavyweight locks (used for user-driven database object-level locking), LWLocks are lightweight and optimized for high performance, providing low overhead while managing concurrent access to shared data.
LWLock contention occurs when multiple processes compete to acquire the same LWLock protecting a lock data structure in shared memory, leading to delays. This contention typically happens when many backend processes need to access a heavily shared resource, such as:
- Buffer Manager – Protects shared buffers during read/write operations
- Lock Manager – Coordinates access to lock-related data structures
- WAL management – Synchronizes writes to the write-ahead log (WAL)
LWLock contention can increase as the number of concurrent database connections grows. This is especially true in the high-throughput environments that Aurora PostgreSQL-Compatible Edition supports, particularly workloads that involve heavy parallelism, tables with many partitions, or tables with many indexes.
While executing a SQL query on a table, PostgreSQL tries to acquire a lock on the table and its associated indexes. If this is a partitioned table, a lock is acquired on each table partition the SQL query accesses. To make locking faster and more efficient, fast path locking is used for less restrictive weak locks (such as AccessShareLock, RowShareLock, or RowExclusiveLock) when the system can quickly confirm there's no conflict with other locks.
Fast path locking: Operational mechanics
In PostgreSQL, the fast path locking mechanism lets nonconflicting operations bypass the shared memory lock hash table and its associated LWLocks. Fast path locking is designed for the most common use case: frequent, concurrent queries that acquire weak locks as long as no conflicting strong locks exist on the same relation.
Here’s how fast path locking works:
- Per-session cache – Each backend process allocates up to 16 slots (FP_LOCK_SLOTS_PER_BACKEND = 16 by default) in its private PGPROC structure to store fast path locks.
- Quick fast path eligibility check – When you run SELECT * FROM my_table;, PostgreSQL grabs a tiny per-backend LWLock (MyProc->fpInfoLock) to check whether the fast path locking mechanism can be used for the current SQL query. It verifies that:
  - The lock mode is a qualifying weak mode.
  - No other session holds a conflicting lock.
  - The backend hasn't already used all 16 slots of its local backend memory array.
- Local grant – If the fast path eligibility check passes, FastPathGrantRelationLock() stores the lock in the backend's local cache. No LWLock guarding the shared memory lock hash table is acquired, and the function returns immediately with success.
In practice, this means the first 16 unique tables (or indexes) a transaction touches incur almost zero lock manager overhead.
The fast path cache is small, and when you exceed 16 locks or request a stronger lock mode, PostgreSQL must fall back to the slow path:
- Lock requests beyond the 16 fast path locks a backend already holds are recorded in the shared memory lock table; similarly, when a conflicting strong lock is requested, existing fast path locks for that relation are migrated there by FastPathTransferRelationLocks().
- The lock tag (which includes the relation OID and lock mode) is hashed to one of 16 lock partitions of the shared memory lock hash table.
- PostgreSQL then acquires the partition LWLock (LWLockAcquire(partitionLock, LW_EXCLUSIVE)), updates the shared hash table, and releases the LWLock.
From that point on, additional lock acquisitions, whether on tables or indexes, go through the lock manager, producing LWLock:LockManager wait events under concurrency.
By understanding the transition from fast path optimization to slow path contention, you can design queries and schemas that stay within the fast path limits, avoiding the lock manager bottleneck altogether.
Solution overview
The following sections detail LWLock contention through three controlled experiments:
- Observe locks on a partitioned table
- Observe locks on a nonpartitioned table with multiple unused or unnecessary indexes
- Observe locking behavior with a multi-join query
Each controlled experiment demonstrates schema setup, workload execution, lock monitoring through PostgreSQL system views, and analysis of concurrency impacts using pgbench. All controlled experiments were conducted on Aurora PostgreSQL-Compatible (compatible with PostgreSQL 16.6) by using db.r7g.4xlarge instances.
Prerequisites
Before you begin, make sure you have the following:
- An AWS account with access to Aurora PostgreSQL-Compatible database
- Access to the AWS Management Console
- An Amazon Elastic Compute Cloud (Amazon EC2) instance with connectivity to the Aurora PostgreSQL instance
- A PostgreSQL client (such as psql) installed on the Amazon EC2 instance
- pgbench installed on the Amazon EC2 instance
- AWS Identity and Access Management (IAM) permissions to create and manage Aurora PostgreSQL clusters
Controlled experiment 1: Observe locks on a partitioned table
In this experiment, we investigate PostgreSQL’s locking behavior when working with partitioned tables, focusing on three key test scenarios:
- First, we query all partitions of an orders table to observe how PostgreSQL handles locks across the entire partition set
- Then, we examine a more targeted approach using partition pruning to access specific partitions
- Finally, we stress test both approaches under high concurrency using pgbench to simulate real-world workloads
Through these queries, we demonstrate how PostgreSQL's fast path lock optimization works, what happens when fast path slots are exhausted, and how partition pruning can significantly improve performance under concurrent workloads. You use a partitioned orders table where data is partitioned by month using the order_ts timestamp column.
This experiment will reveal important insights about:
- How PostgreSQL manages locks even during read-only operations
- The impact of fast path vs. slow path locks
- How partition pruning can reduce lock contention
- Performance implications for high-concurrency environments
Schema preparation
Using your preferred PostgreSQL client (such as psql) connected to your Aurora PostgreSQL instance from your EC2 instance, create a partitioned orders table with 12 monthly child partitions. Run the following SQL code:
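The original SQL isn't included in this section; the following is a minimal sketch of such a schema. The table name (orders), partition key (order_ts), schema name (experiment_1), and the specific monthly ranges are assumptions that match the descriptions used later in this post.

```sql
-- Minimal sketch (assumed names and layout): a partitioned orders table
-- with 12 monthly range partitions on order_ts.
CREATE SCHEMA IF NOT EXISTS experiment_1;
SET search_path TO experiment_1;

CREATE TABLE orders (
    order_id    bigint GENERATED ALWAYS AS IDENTITY,
    customer_id bigint NOT NULL,
    amount      numeric(10,2),
    order_ts    timestamp NOT NULL,
    PRIMARY KEY (order_id, order_ts)
) PARTITION BY RANGE (order_ts);

-- One child partition per month of 2024 (12 in total).
DO $$
DECLARE
    m int;
BEGIN
    FOR m IN 1..12 LOOP
        EXECUTE format(
            'CREATE TABLE orders_2024_%s PARTITION OF orders
               FOR VALUES FROM (%L) TO (%L)',
            to_char(make_date(2024, m, 1), 'MM'),
            make_date(2024, m, 1),
            make_date(2024, m, 1) + interval '1 month');
    END LOOP;
END $$;
```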
Test 1: Query all partitions of the orders table to observe locking behavior
Now, start a transaction and query all partitions (unpruned partitions) of the partitioned orders table. When we query without partition pruning, PostgreSQL must access every partition, significantly increasing lock overhead. To begin this test, open a new connection to the Aurora PostgreSQL database, and run the following commands (we'll call this session 1):
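The exact statements aren't shown here; assuming the orders table sketched above, session 1 could look like the following:

```sql
-- Session 1: open a transaction that scans every partition and keep it open.
BEGIN;
SET search_path TO experiment_1;
SELECT count(*) FROM orders;  -- no filter on order_ts, so all 12 partitions are locked
-- Do not COMMIT, ROLLBACK, or END yet; we want to inspect the locks it holds.
```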
The SQL statements initiate a transaction that scans all 12 partitions. Keep the transaction open to preserve the locks by not executing COMMIT, ROLLBACK, or END.
While the transaction in session 1 remains open, open a second session (we'll call this session 2) and run the following SQL query to check the state of locks in the database:
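The lock inspection query isn't reproduced here; a typical sketch using the pg_locks and pg_class catalog views (the column selection is an assumption) is:

```sql
-- Session 2: list relation-level locks held by other backends,
-- showing which were granted through the fast path.
SELECT l.pid,
       c.relname,
       l.locktype,
       l.mode,
       l.granted,
       l.fastpath
FROM pg_locks l
JOIN pg_class c ON c.oid = l.relation
WHERE l.locktype = 'relation'
  AND l.pid <> pg_backend_pid()
ORDER BY l.fastpath DESC, c.relname;
```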
Notice the fastpath column, with values t (true) and f (false), in the preceding output: the last 10 rows show f, and the total number of rows returned is 26. The value f means that the 16 fast path slots have been exhausted and the remaining partition and index AccessShareLock entries have been migrated to the shared memory lock hash table (the slow path). When the transaction completes, these locks are released.
Test 2: Query specific partitions of the orders table by using partition pruning to observe change in locking behavior
In session 1 of the previous test, run a query that uses the partition pruning approach inside a new transaction, as shown below. If you continue to touch more partitions, additional fast path locks are acquired.
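A sketch of such a transaction, assuming the orders table and monthly ranges above, follows; the explicit date range lets the planner prune to a single partition:

```sql
-- Session 1: new transaction that touches only the March 2024 partition.
BEGIN;
SET search_path TO experiment_1;
SELECT count(*)
FROM orders
WHERE order_ts >= DATE '2024-03-01'
  AND order_ts <  DATE '2024-04-01';
-- Leave the transaction open again so the locks can be inspected from session 2.
```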
While the transaction in session 1 remains open, check the state of locks in the database by running the same pg_locks query from the previous test in session 2:
The output shows that all the locks are fast path locks, denoted by the value t (true) in the fastpath column, removing the need to acquire slow path locks.
This demonstrates fast path optimization but doesn't explain scenarios involving concurrency, where each backend either exhausts its fast path slots or remains within the 16-slot limit. Let's dive deep into this specific scenario. We'll use pgbench to simulate a multiuser workload.
Test 3.1: Query all partitions of the orders table under high concurrency to observe change in locking behavior
To simulate a high-concurrency read workload that accesses unpruned partitions, use the following pgbench command. This command continuously issues SELECT count(*) FROM orders queries across multiple threads. In this test, we evaluate how PostgreSQL's fast path lock optimization behaves under high concurrency when transactions exhaust fast path slots, forcing lock acquisition through the main lock manager (triggering LWLock:LockManager waits).
In pgbench, the -c and -j options control the concurrency and parallelism of the benchmark workload. The -c option specifies the number of concurrent clients, meaning how many simulated user sessions or database connections are active at the same time. This number determines the level of load applied to the PostgreSQL database. The -j option defines the number of worker threads that pgbench uses to manage these client connections. Each thread handles a subset of the total clients, and the workload is evenly distributed across threads, allowing pgbench to better use multicore systems and avoid bottlenecks on the client side.
To run the pgbench command without entering credentials at runtime, you can set the following environment variables: PGHOST for the Aurora cluster endpoint, PGPORT for the port number (such as 5432), PGDATABASE for the database name (such as postgres), PGUSER for the database user, and PGPASSWORD for the database user password.
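For example, with placeholder values (replace them with your own endpoint and credentials):

```bash
export PGHOST=<your-aurora-cluster-endpoint>
export PGPORT=5432
export PGDATABASE=postgres
export PGUSER=<your-db-user>
export PGPASSWORD=<your-db-password>
```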
The pgbench command simulates 100 concurrent clients (-c 100) running the transaction defined in transaction.sql for 15 minutes, or 900 seconds (-T 900).
The transaction.sql file contains the following SQL, along with setting the search path to the experiment_1 schema:
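The file isn't reproduced here; a minimal sketch consistent with that description would be:

```sql
-- transaction.sql: scan all partitions on every pgbench transaction.
SET search_path TO experiment_1;
SELECT count(*) FROM orders;
```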
Execute the following pgbench command from your session 1 terminal from the previous test. This command completes the test in 15 minutes. While this command is running, you can monitor database wait events from Amazon CloudWatch Database Insights.
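The exact command isn't shown; a representative invocation, assuming the environment variables above, a transaction.sql file in the current directory, and 16 worker threads, is:

```bash
pgbench -n -c 100 -j 16 -T 900 -f transaction.sql
```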
In this workload, the average transactions per second (tps) achieved was 46,672, and 42 million transactions were processed in 15 minutes of test time.
The following screenshot from CloudWatch Database Insights shows the database experiencing high load, with active sessions exceeding the instance’s CPU capacity and significant lock contention.
The preceding screenshot shows that the Aurora PostgreSQL 16.6 cluster running on a db.r7g.4xlarge instance has elevated database load driven by both CPU utilization and lock contention. The average active sessions (AAS) metric was sustained at approximately 21, higher than the instance's 16 vCPU capacity. Although 66 percent of the load was attributable to CPU utilization, a significant 34 percent was spent waiting on LWLock:LockManager, indicating contention for internal PostgreSQL lock structures.
Next, we'll use the partition pruning method and evaluate the performance.
Test 3.2: Query specific partitions of the orders table by using partition pruning under high concurrency
To contrast with our previous test using unpruned partitions, we now demonstrate how partition pruning can significantly improve performance. For this workload, the transaction.sql file contains a PL/pgSQL block instead of a plain SQL query, because PL/pgSQL enables efficient partition pruning in PostgreSQL when dealing with partitioned tables and runtime-generated values. You could also use plain SQL, as shown in the following query, but the filter on order_ts would be derived from a randomly generated date within a common table expression (CTE), which means the query planner, which creates the execution plan, can't determine the order_ts value at planning time. As a result, PostgreSQL must consider all partitions, leading to unnecessary locking and scanning of every partition. By switching to a PL/pgSQL block that computes the random date and constructs the query dynamically by using EXECUTE, the actual date value is injected directly into the SQL string. This turns the filter into a constant from the query planner's perspective, enabling effective partition pruning and making sure only the relevant partition is accessed and locked.
The following is the CTE-based SQL query described above, which can lock all partitions:
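A sketch of that CTE-based query follows (the random-date logic is an assumption); because the filter value is computed inside the query, the planner can't prune partitions at planning time:

```sql
-- Random date computed inside the query: the planner sees a non-constant
-- filter on order_ts and must plan for (and lock) every partition.
SET search_path TO experiment_1;

WITH random_date AS (
    SELECT DATE '2024-01-01' + (floor(random() * 365))::int AS d
)
SELECT count(*)
FROM orders, random_date
WHERE order_ts >= random_date.d
  AND order_ts <  random_date.d + 1;
```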
Use the following SQL, which takes the PL/pgSQL approach, for effective partition pruning as described above:
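A sketch of the PL/pgSQL version follows (again, the random-date logic is an assumption); EXECUTE injects the computed date as a constant, so only the matching partition is planned, scanned, and locked:

```sql
SET search_path TO experiment_1;

DO $$
DECLARE
    d   date;
    cnt bigint;
BEGIN
    -- Compute the random date first, then embed it as a constant in the SQL text.
    d := DATE '2024-01-01' + (floor(random() * 365))::int;
    EXECUTE format(
        'SELECT count(*) FROM orders WHERE order_ts >= %L AND order_ts < %L',
        d, d + 1)
    INTO cnt;
END $$;
```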
Follow the same steps as in the previous test and run the pgbench command from the session 1 terminal. This command will complete the test in 15 minutes, and you can monitor database wait events from CloudWatch Database Insights while the command is executing.
For this run, the workload achieved an average of 59,255 transactions per second (tps), and approximately 53.3 million transactions were processed in the 15 minutes of test time. In the absence of lock contention, the system processed roughly 11 million additional transactions compared with the previous test.
The following screenshot from CloudWatch Database Insights shows improved database performance after implementing partition pruning, with stable load patterns and without any lock contention.
With the introduction of partition pruning in the preceding workload, the performance of the Aurora PostgreSQL 16.6 cluster has significantly improved. Previously, the workload was characterized by high LWLock:LockManager wait events, consuming nearly 34 percent of database load alongside CPU utilization.
In contrast, the current workload shows a well-balanced profile: average active sessions (AAS) hover comfortably under the max vCPU threshold, and waits are minimal, with only a small fraction of Timeout:SpinDelay observed. CPU is now the dominant contributor to load (as expected in well-optimized OLTP systems), and lock contention has been greatly reduced. This implies that partition pruning successfully reduced the number of locks acquired, allowing each session to access only the relevant partition and stay within its per-session fast path slots, drastically improving concurrency.
Controlled experiment 2: Observe locks on a nonpartitioned table with multiple unused or unnecessary indexes
In this experiment, we examine how PostgreSQL handles locking behavior on a nonpartitioned table with multiple B-tree indexes. We use a nonpartitioned table here to keep the focus on the impact of unused or unnecessary indexes. Using an items table that represents an ecommerce or inventory system, we'll explore two key test scenarios:
- First, we perform a simple query using index-only scan to observe:
- How PostgreSQL manages locks across multiple unused or unnecessary indexes
- What happens when fast path slots are exhausted
- The impact of having 20 B-tree indexes on lock acquisition
- Then, we stress test the system under high concurrency to demonstrate:
- How excessive indexes affect lock manager behavior
- The performance impact of lock contention with multiple indexes
- The relationship between index count and LWLock:LockManager waits
Through these tests, we reveal important insights about index-related lock overhead and provide practical guidance for index management in high-concurrency environments.
Schema preparation
Use the following SQL code to create the items table schema and its associated indexes in your Aurora PostgreSQL database:
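The full definition isn't reproduced here; the following abbreviated sketch shows the shape described (the real table has 26 columns and 20 B-tree indexes), with column and index names as assumptions apart from idx_items_name, which is referenced later in this experiment:

```sql
-- Abbreviated sketch: the real table has 26 columns and 20 B-tree indexes.
CREATE TABLE items (
    item_id     bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    sku         text NOT NULL,
    name        text NOT NULL,
    category    text,
    subcategory text,
    brand       text,
    price       numeric(10,2),
    inventory   int,
    created_at  timestamp DEFAULT now()
    -- ...additional descriptive columns up to 26 in total
);

CREATE INDEX idx_items_name          ON items (name);
CREATE INDEX idx_items_sku           ON items (sku);
CREATE INDEX idx_items_category      ON items (category);
CREATE INDEX idx_items_subcategory   ON items (subcategory);
CREATE INDEX idx_items_brand         ON items (brand);
CREATE INDEX idx_items_price         ON items (price);
CREATE INDEX idx_items_inventory     ON items (inventory);
CREATE INDEX idx_items_cat_sub_price ON items (category, subcategory, price);
-- ...and so on, up to 20 indexes in total
```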
The preceding SQL code creates an ecommerce items table with 26 columns for product details (such as SKU, price, and inventory) and 20 B-tree indexes on commonly queried columns and column combinations.
Test 1: Query a nonpartitioned table with multiple indexes
To begin our examination of how excess indexes impact PostgreSQL's locking behavior, let's perform our first test by querying the items table. We'll run a simple name lookup query to observe how PostgreSQL handles lock management across all 20 indexes, even though most of them are unnecessary for this query. This will help us understand the baseline locking overhead created by maintaining excessive indexes. Start a transaction and query specific columns of the items table by using the index access path:
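A sketch of session 1 follows (the lookup value is a placeholder); selecting only the indexed name column lets the planner choose an index-only scan on idx_items_name:

```sql
-- Session 1: open a transaction and perform a name lookup, then keep it open.
BEGIN;
SELECT name
FROM items
WHERE name = 'sample-item-001';
-- Do not COMMIT or ROLLBACK yet; inspect the locks from another session.
```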
For the preceding SQL query, the query planner chose the index-only scan path because we select only the name column from the items table. Now, in another session, observe the locking behavior by running the pg_locks query from Controlled experiment 1.
After executing the query in the other session, notice in the output that the table's indexes acquired an AccessShareLock in addition to the items table, its primary key, and the index (idx_items_name) the planner used for query execution. The total number of rows returned is 22, the fast path slots are exhausted, and six locks are moved to the shared memory lock table. If we had more indexes on this table, those indexes would require an AccessShareLock as well and would be placed in the shared memory lock table because the fast path slots are exhausted. During high-concurrency workloads on this table, performance would degrade because of contention on the shared memory lock table.
Test 2: Query a nonpartitioned table with multiple indexes under high concurrency
To understand how these unnecessary indexes affect performance at scale, let's stress test our nonpartitioned items table under high concurrency. Like our earlier experiment with partitioned tables (in Controlled experiment 1), we use pgbench to simulate multiple concurrent users accessing the table simultaneously.
Using the same pgbench command from our first experiment, run the following from your session 1 terminal. This test will run for five minutes. While the test is running, you can monitor database wait events from CloudWatch Database Insights.
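A representative command, assuming the same environment variables and client and thread counts as before but a 5-minute duration (-T 300), is:

```bash
pgbench -n -c 100 -j 16 -T 300 -f transaction.sql
```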
The transaction.sql file contains the same name lookup SQL query on the items table as the previous test.
The following screenshot from CloudWatch Database Insights illustrates how excessive indexes on the items table create significant LWLock:LockManager waits, resulting in increased database load and CPU utilization.
The LWLock:LockManager waits observed here are primarily driven by the excessive number of indexes on the items table. Even with no data, PostgreSQL incurs overhead during query planning and execution because it must examine all 20 indexes, acquire related locks, and access catalog metadata. Due to the high concurrency, a large number of locks were involved; database sessions exhausted their fast path locks, forcing backend processes to fall back to the main lock manager, which introduced additional contention. This resulted in increased CPU usage caused by repeated catalog scans and elevated database load from lock acquisition overhead. Reducing the number of unnecessary indexes would not only decrease query planning complexity but also help preserve fast path locking, improving system efficiency under high-concurrency workloads.
Controlled experiment 3: Observe locking behavior with a multi-join query
In this experiment, we investigate how PostgreSQL manages locks when executing complex queries across multiple related tables. We’ll use a realistic ecommerce database schema to explore two key test scenarios:
- First, we examine a single multi-join query that:
- Retrieves a user’s cart contents, item prices, order status, and payment details in a single read-only operation.
- Connects six interrelated tables (users, carts, cart_items, items, orders, and payments).
- Demonstrates how PostgreSQL handles locks across multiple tables and their indexes.
This query provides an opportunity to observe how multiple table joins influence the cumulative lock footprint. Because each table has its own primary key and foreign key indexes, PostgreSQL can potentially use fast path locks for these simple reads, avoiding the overhead of acquiring entries in the shared memory lock table.
- Then, we stress test the system under high concurrency to:
- Observe how multiple sessions executing complex joins affect lock management.
- Measure the performance impact of lock acquisition across multiple indexed tables.
- Demonstrate how join complexity influences both CPU utilization and lock contention.
Through these tests, we reveal important insights about:
- Lock management in complex, interconnected table structures
- The relationship between table joins, indexes, and lock overhead
- Performance considerations for multi-join queries in high-concurrency environments
Schema preparation
Use the following SQL code to create the ecommerce schema, which includes six interconnected tables (users, carts, cart_items, items, orders, and payments) with their associated indexes and foreign key constraints. The schema is intentionally kept comprehensive to simulate a real-world ecommerce database, including multiple indexes per table and proper referential integrity constraints:
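The full schema isn't reproduced here; the following abbreviated sketch shows the six tables with representative indexes and foreign keys (column names, data types, and index names are assumptions). Create these in their own schema so the orders table doesn't clash with the one from Controlled experiment 1.

```sql
CREATE TABLE users (
    user_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email   text UNIQUE NOT NULL,
    name    text
);

CREATE TABLE items (
    item_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    sku     text UNIQUE NOT NULL,
    name    text NOT NULL,
    price   numeric(10,2)
);
CREATE INDEX idx_items_name ON items (name);

CREATE TABLE carts (
    cart_id    bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id    bigint NOT NULL REFERENCES users (user_id),
    created_at timestamp DEFAULT now()
);
CREATE INDEX idx_carts_user_id ON carts (user_id);

CREATE TABLE cart_items (
    cart_item_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    cart_id      bigint NOT NULL REFERENCES carts (cart_id),
    item_id      bigint NOT NULL REFERENCES items (item_id),
    quantity     int NOT NULL DEFAULT 1
);
CREATE INDEX idx_cart_items_cart_id ON cart_items (cart_id);
CREATE INDEX idx_cart_items_item_id ON cart_items (item_id);

CREATE TABLE orders (
    order_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id  bigint NOT NULL REFERENCES users (user_id),
    cart_id  bigint NOT NULL REFERENCES carts (cart_id),
    status   text NOT NULL,
    order_ts timestamp DEFAULT now()
);
CREATE INDEX idx_orders_user_id ON orders (user_id);
CREATE INDEX idx_orders_cart_id ON orders (cart_id);

CREATE TABLE payments (
    payment_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    order_id   bigint NOT NULL REFERENCES orders (order_id),
    amount     numeric(10,2),
    method     text,
    paid_at    timestamp
);
CREATE INDEX idx_payments_order_id ON payments (order_id);
-- Each table in the real schema carries additional columns and several more indexes.
```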
Test 1: Execute a multi-join query across multiple indexed tables
To begin our examination of how PostgreSQL handles locks during complex join operations, let’s execute our first test using a multi-join query that retrieves a user’s complete shopping cart information. This query demonstrates how PostgreSQL manages locks across multiple tables and their associated indexes. In session 1, start a transaction and execute this query:
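A sketch of such a query, assuming the schema above (the user_id value is a placeholder), is:

```sql
-- Session 1: read a user's cart contents, item prices, order status, and payment details.
BEGIN;
SELECT u.user_id,
       i.name,
       i.price,
       ci.quantity,
       o.status,
       p.amount,
       p.method
FROM users u
JOIN carts c         ON c.user_id  = u.user_id
JOIN cart_items ci   ON ci.cart_id = c.cart_id
JOIN items i         ON i.item_id  = ci.item_id
LEFT JOIN orders o   ON o.cart_id  = c.cart_id
LEFT JOIN payments p ON p.order_id = o.order_id
WHERE u.user_id = 1;
-- Keep the transaction open to inspect the locks from session 2.
```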
While keeping the transaction above open in session 1, run the following query in session 2 to examine how PostgreSQL manages locks across all the tables and indexes involved in our multi-join query:
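You can reuse the pg_locks query from Controlled experiment 1; the following sketch adds a summary of fast path versus slow path counts:

```sql
-- Per-relation lock detail for the other backend(s).
SELECT l.pid, c.relname, l.mode, l.fastpath
FROM pg_locks l
JOIN pg_class c ON c.oid = l.relation
WHERE l.locktype = 'relation'
  AND l.pid <> pg_backend_pid()
ORDER BY l.fastpath DESC, c.relname;

-- Summary: how many locks were granted via the fast path vs. the shared lock table.
SELECT fastpath, count(*)
FROM pg_locks
WHERE locktype = 'relation'
  AND pid <> pg_backend_pid()
GROUP BY fastpath;
```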
In the preceding output, we can see that PostgreSQL acquired a total of 39 locks. Looking at the fastpath column, 23 rows show f (false), indicating these locks had to use the slower path through the shared memory lock table instead of the fast path. This demonstrates that even a seemingly simple nested join query can encounter significant lock contention when fast path slots are exhausted.
Test 2: Execute a multi-join query under high concurrency
To understand how the complex joins in the preceding SQL query perform at scale, let's stress test our ecommerce schema under high concurrency. Like our previous experiments, we use pgbench to simulate multiple concurrent users executing our multi-join query simultaneously.
Using the same pgbench command from our earlier experiments (in Controlled experiments 1 and 2), run the following from your session 1 terminal. This test will run for five minutes:
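A representative command, matching the earlier experiments but with a 5-minute duration (-T 300), is:

```bash
pgbench -n -c 100 -j 16 -T 300 -f transaction.sql
```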
The transaction.sql file contains the same multi-join SQL query as in the previous test. While the test is running, you can monitor database wait events from CloudWatch Database Insights.
The following screenshot from CloudWatch Database Insights demonstrates how multi-join queries across heavily indexed tables create a dual performance impact, with LWLock:LockManager contention consuming 20 percent of active sessions and CPU utilization approaching saturation at 80 percent.
The database load graph reveals significant LWLock:LockManager contention (20 percent of AAS) and high CPU utilization (80 percent) because of multi-join queries operating across tables with several indexes. Each join forces PostgreSQL to obtain AccessShareLock locks on indexes, exhausting fast path locks and falling back to the slower main lock manager. Repeated catalog scans during query planning drive CPU utilization near saturation. This lock manager bottleneck stems from the combined overhead of maintaining locks across multiple indexed tables during join planning. Reducing redundant indexes and simplifying join patterns will alleviate both the lock contention and the CPU pressure visible in the workload.
Key mitigation strategies for managing PostgreSQL lock contention
Consider the following key mitigation strategies for reducing PostgreSQL lock contention:
- Optimized access to partitioned tables:
  - Enable partition pruning – Use explicit date ranges (WHERE order_ts BETWEEN X AND Y) instead of full-table scans. Learn more about partition pruning in the PostgreSQL documentation.
  - Avoid dynamic SQL without constants – Replace CTEs with PL/pgSQL blocks (as in this post's first controlled experiment) to force pruning.
  - Limit partition count – Reduce partitions where possible (for example, consider using quarterly partitions instead of monthly) to stay within fast path limits.
- Index rationalization:
  - Audit and remove unused indexes – Use pg_stat_user_indexes to identify low-usage indexes.
  - Consolidate redundant indexes – Replace single-column indexes with composite indexes (for example, (category, subcategory, price) instead of three separate indexes).
  - Avoid overindexing – Cap indexes per table (for example, less than or equal to 10) unless critical for OLTP.
- Schema design adjustments:
  - Avoid unnecessary partitioning – Only partition tables exceeding 100 GB or with clear access patterns. For detailed guidance on when and how to implement table partitioning, see Improve performance and manageability of large PostgreSQL tables by migrating to partitioned tables on Amazon Aurora and Amazon RDS.
  - Use covering indexes – Add INCLUDE columns to avoid table access (reduces relation locks). Learn more about implementing covering indexes and index-only scans in the PostgreSQL documentation.
  - Normalize high-concurrency tables – Split wide tables (for example, items) to reduce index sprawl.
- Query optimization:
  - Simplify joins – Break multi-join queries into materialized views or staged queries. For implementing materialized views, see the PostgreSQL documentation.
  - Batch small reads – Combine small lookups (for example, IN (...) clauses) to reduce lock frequency.
- PostgreSQL tuning:
  - Adjust max_locks_per_transaction – Increase it (for example, from 256 to 512) if partitioning is unavoidable and excess locks are moved to the shared memory lock hash table (monitor memory usage).
  - Monitor fast path usage – Track pg_locks to identify slot exhaustion, as shown in the sketch after this list.
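As a starting point, the following sketch counts fast path and slow path relation locks per backend process; a persistently high slow path count indicates sessions exceeding their 16 fast path slots:

```sql
SELECT pid,
       count(*) FILTER (WHERE fastpath)     AS fastpath_locks,
       count(*) FILTER (WHERE NOT fastpath) AS slowpath_locks
FROM pg_locks
WHERE locktype = 'relation'
GROUP BY pid
ORDER BY slowpath_locks DESC;
```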
Clean up
To avoid incurring future charges related to implementing the solution in this blog post, delete the resources you created:
- Delete the test schemas and tables you created in the controlled experiments.
- If you created a dedicated Aurora PostgreSQL cluster for testing, delete it.
- Remove associated snapshots, if you no longer need them.
Conclusion
LWLock contention in PostgreSQL read-heavy workloads stems from exceeding fast path locking limits, which is triggered by unpruned partitions, redundant indexes, and complex joins. The controlled experiments in this post demonstrated the following:
- Partition pruning reduced lock overhead, improving throughput from roughly 46,700 tps to 59,300 tps by confining locks to fast path slots.
- Each unused index added lock pressure, forcing slow path fallback even in empty tables.
- Multi-join queries amplify contention, with 60 percent of locks spilling to slow path in tested scenarios.
By prioritizing partition-aware queries, rigorous index hygiene, and join simplification, teams can maintain fast path efficiency and provide linear read scalability on Amazon Aurora PostgreSQL and Amazon RDS for PostgreSQL. As databases grow, these optimizations become foundational to harnessing the elasticity of AWS without lock bottlenecks.
For more information about performance tuning and designing scalable PostgreSQL workloads on Aurora, see Essential concepts for Aurora PostgreSQL tuning.