PostgreSQL Performance Tuning for Application Developers
A developer-focused guide to PostgreSQL performance tuning, covering configuration settings, pg_stat_statements, vacuum tuning, partitioning, WAL optimization, and application-level batch operations.
Overview
PostgreSQL ships with conservative default settings designed to run on minimal hardware. If you deploy an application against those defaults on a server with 16 GB of RAM and fast SSDs, you are leaving enormous performance on the table. This guide covers the configuration knobs, diagnostic tools, and application-level patterns that matter most for production Node.js workloads — the things I have tuned repeatedly across dozens of projects over the last decade.
Prerequisites
- PostgreSQL 14 or later (15+ preferred for some features discussed here)
- Node.js 18+ with the pg driver installed
- Superuser or rds_superuser access to modify postgresql.conf (or parameter groups on managed services)
- Basic SQL proficiency and familiarity with EXPLAIN output
- A staging environment to test configuration changes before production
Key postgresql.conf Settings for Application Workloads
The default postgresql.conf is built for a machine with 512 MB of RAM. The first thing you should do after provisioning a database server is adjust these core memory and planner settings.
shared_buffers
This is PostgreSQL's internal page cache. The default is typically 128MB, which is absurdly low for any production workload.
# Check current value
psql -c "SHOW shared_buffers;"
# shared_buffers
# ----------------
# 128MB
Rule of thumb: Set shared_buffers to 25% of total system RAM. On a 64 GB server, that is 16GB. On a 16 GB server, 4GB. Going beyond 25% rarely helps because PostgreSQL also relies on the operating system's filesystem cache.
# postgresql.conf
shared_buffers = 4GB
After changing shared_buffers, you must restart PostgreSQL — a reload is not sufficient.
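If you prefer not to edit postgresql.conf by hand, ALTER SYSTEM writes the same change to postgresql.auto.conf; the restart requirement is the same. A minimal sketch:
-- Persist the setting without editing postgresql.conf directly
ALTER SYSTEM SET shared_buffers = '4GB';
-- A reload is not enough for shared_buffers; restart the server, for example:
--   sudo systemctl restart postgresql
-- Then confirm the new value took effect
SHOW shared_buffers;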
work_mem
This controls how much memory a single sort or hash operation can use before spilling to disk. The default is 4MB. When you see Sort Method: external merge Disk in an EXPLAIN ANALYZE output, your work_mem is too low for that query.
# postgresql.conf
work_mem = 64MB
Be careful here. This is per operation, per connection. A complex query with five hash joins across 100 connections could theoretically allocate 5 * 100 * 64MB = 32GB. I usually set the global default conservatively — maybe 32MB to 64MB — and then raise it at the session level for specific batch operations:
SET work_mem = '256MB';
-- Run your heavy analytical query here
RESET work_mem;
effective_cache_size
This is not an allocation — it is a hint to the query planner about how much memory is available for caching between shared_buffers and the OS page cache. Set it to roughly 75% of total RAM.
# postgresql.conf
effective_cache_size = 12GB    # on a 16GB server
When this value is too low, the planner avoids index scans in favor of sequential scans because it assumes disk access is expensive. Setting it correctly encourages the planner to use indexes.
maintenance_work_mem
Controls memory for maintenance operations: VACUUM, CREATE INDEX, ALTER TABLE ADD FOREIGN KEY. The default 64MB is painfully slow for vacuuming large tables.
# postgresql.conf
maintenance_work_mem = 1GB
On a 16 GB server I typically set this to 1GB. It only matters during maintenance operations, so there is no risk of sustained high memory use.
A Starter Configuration Block
Here is what I typically set on a dedicated 16 GB PostgreSQL server:
# postgresql.conf — 16GB dedicated database server
shared_buffers = 4GB
effective_cache_size = 12GB
work_mem = 64MB
maintenance_work_mem = 1GB
random_page_cost = 1.1 # SSD storage (default 4.0 assumes spinning disk)
effective_io_concurrency = 200 # SSD storage (default 1)
min_wal_size = 1GB
max_wal_size = 4GB
checkpoint_completion_target = 0.9
wal_buffers = 64MB
default_statistics_target = 100
The random_page_cost change is critical if you are running on SSDs. The default value of 4.0 tells the planner that random I/O is four times more expensive than sequential I/O — true for spinning disks, wildly wrong for NVMe. Setting it to 1.1 dramatically improves index usage on SSD-backed servers.
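After applying a block like this, confirm what is actually in effect — a typo or a missed restart is easy to overlook. A quick check against pg_settings:
-- Verify the live values and where each one came from
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'effective_cache_size',
               'random_page_cost', 'max_wal_size');
-- source reads 'configuration file' for values picked up from postgresql.conf;
-- 'default' means your change has not been applied yet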
Connection Tuning
max_connections
The default is 100, and many teams crank this to 500 or 1000 thinking it will handle more traffic. It will not. Each PostgreSQL connection is a separate backend process that consumes roughly 5-10 MB of RAM. At 500 connections on a 16 GB server, you are burning 2.5-5 GB just on connection overhead.
# postgresql.conf
max_connections = 100 # Keep it low
Instead of raising max_connections, use a connection pooler like PgBouncer:
# pgbouncer.ini
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp
[pgbouncer]
listen_port = 6432
pool_mode = transaction
default_pool_size = 20
max_client_conn = 300
In transaction pooling mode, PgBouncer multiplexes 300 application connections across 20 real PostgreSQL connections. This is how you scale to thousands of concurrent users without melting your database.
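To confirm the multiplexing is actually happening, connect to PgBouncer's admin console (the virtual pgbouncer database) and inspect the pools. A sketch, assuming the listen_port above and a user listed in admin_users or stats_users in pgbouncer.ini (pgbouncer_admin below is a placeholder):
# Inspect PgBouncer's pools through its admin console
psql -h 127.0.0.1 -p 6432 -U pgbouncer_admin pgbouncer -c "SHOW POOLS;"
# cl_active / cl_waiting = client connections; sv_active / sv_idle = the real
# server connections PgBouncer holds against PostgreSQL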
statement_timeout and idle_in_transaction_session_timeout
These two settings prevent runaway queries and abandoned transactions from holding locks indefinitely.
# postgresql.conf
statement_timeout = 30000 # 30 seconds
idle_in_transaction_session_timeout = 60000 # 60 seconds
A common production disaster: an application opens a transaction, does some work, then makes an HTTP call that hangs. The transaction holds locks for minutes or hours. idle_in_transaction_session_timeout kills these sessions automatically.
-- You can also set per-connection from your application
SET statement_timeout = '10s';
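You can also attach these timeouts to a specific role instead of the whole server, so batch or migration roles keep longer limits. A sketch, assuming a dedicated app_user role for the application:
-- Timeouts scoped to the application's role only (app_user is a placeholder)
ALTER ROLE app_user SET statement_timeout = '30s';
ALTER ROLE app_user SET idle_in_transaction_session_timeout = '60s';
-- New connections made by that role pick these up automatically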
Query Performance Analysis with pg_stat_statements
pg_stat_statements is the single most valuable performance tool in PostgreSQL. It records execution statistics for every distinct query your application runs.
Enabling It
# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000
pg_stat_statements.track = all
Restart PostgreSQL, then create the extension:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
Finding Your Slowest Queries
-- Top 10 queries by total execution time
SELECT
round(total_exec_time::numeric, 2) AS total_time_ms,
calls,
round(mean_exec_time::numeric, 2) AS mean_time_ms,
round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS percent_total,
query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
Typical output:
total_time_ms | calls | mean_time_ms | percent_total | query
---------------+---------+--------------+---------------+-----------------------------------------------------
4823017.32 | 182451 | 26.43 | 38.21 | SELECT * FROM orders WHERE customer_id = $1 AND ...
1987234.11 | 94201 | 21.09 | 15.74 | UPDATE inventory SET quantity = quantity - $1 ...
892341.55 | 1203847 | 0.74 | 7.07 | SELECT id, name FROM products WHERE id = $1
654892.01 | 3201 | 204.59 | 5.19 | SELECT o.*, c.name FROM orders o JOIN customers ...
That first query is consuming 38% of your total database time. Fix that one query and you free up a third of your server capacity.
Finding I/O-Heavy Queries
-- Queries causing the most disk reads
SELECT
query,
calls,
shared_blks_hit,
shared_blks_read,
round(100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0), 2) AS cache_hit_pct
FROM pg_stat_statements
WHERE shared_blks_read > 1000
ORDER BY shared_blks_read DESC
LIMIT 10;
If cache_hit_pct drops below 95% for a frequent query, you either need more shared_buffers, a better index, or you are scanning too much data.
Resetting Statistics
SELECT pg_stat_statements_reset();
I reset stats after every deployment or configuration change so I get clean measurements.
Identifying Slow Queries and Lock Contention
Finding Active Locks
-- Show all queries waiting on locks
SELECT
blocked.pid AS blocked_pid,
blocked.query AS blocked_query,
blocking.pid AS blocking_pid,
blocking.query AS blocking_query,
now() - blocked.query_start AS blocked_duration
FROM pg_stat_activity blocked
JOIN pg_locks bl ON bl.pid = blocked.pid AND NOT bl.granted
JOIN pg_locks gl ON gl.locktype = bl.locktype
AND gl.database IS NOT DISTINCT FROM bl.database
AND gl.relation IS NOT DISTINCT FROM bl.relation
AND gl.page IS NOT DISTINCT FROM bl.page
AND gl.tuple IS NOT DISTINCT FROM bl.tuple
AND gl.transactionid IS NOT DISTINCT FROM bl.transactionid
AND gl.classid IS NOT DISTINCT FROM bl.classid
AND gl.objid IS NOT DISTINCT FROM bl.objid
AND gl.objsubid IS NOT DISTINCT FROM bl.objsubid
AND gl.pid != bl.pid
AND gl.granted
JOIN pg_stat_activity blocking ON blocking.pid = gl.pid
WHERE blocked.wait_event_type = 'Lock';
Killing Long-Running Queries
-- Cancel the query (graceful)
SELECT pg_cancel_backend(12345);
-- Terminate the connection (forceful)
SELECT pg_terminate_backend(12345);
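To sweep up everything that has been running too long rather than hunting PIDs one at a time, you can drive pg_cancel_backend from pg_stat_activity. A sketch; review what it matches before running it against production:
-- Cancel every active query that has been running for more than 10 minutes
SELECT pid, now() - query_start AS duration, pg_cancel_backend(pid)
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '10 minutes'
  AND pid <> pg_backend_pid();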
Vacuum and Autovacuum Tuning
PostgreSQL uses MVCC (Multi-Version Concurrency Control). Every UPDATE and DELETE creates dead tuples — old row versions that are no longer visible to any transaction. VACUUM reclaims that space. If autovacuum falls behind, tables bloat, indexes grow, and queries slow to a crawl.
Checking Table Bloat
-- Dead tuples and last vacuum time
SELECT
schemaname,
relname,
n_live_tup,
n_dead_tup,
round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 2) AS dead_pct,
last_vacuum,
last_autovacuum,
last_analyze,
last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;
If dead_pct is above 10-20%, autovacuum is not keeping up.
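When a table has already fallen badly behind, a manual vacuum catches it up immediately instead of waiting for the next autovacuum pass:
-- Reclaim dead tuples and refresh statistics for one table right now
VACUUM (VERBOSE, ANALYZE) orders;
-- Note: plain VACUUM marks space reusable but does not shrink the file on disk.
-- VACUUM FULL does, but it rewrites the table under an ACCESS EXCLUSIVE lock,
-- so reserve it for maintenance windows.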
Tuning Autovacuum
# postgresql.conf
autovacuum_max_workers = 4 # default 3
autovacuum_naptime = 30s # default 1min — check more often
autovacuum_vacuum_threshold = 50 # default 50
autovacuum_vacuum_scale_factor = 0.05 # default 0.2 — vacuum at 5% dead rows
autovacuum_analyze_threshold = 50 # default 50
autovacuum_analyze_scale_factor = 0.02 # default 0.1
autovacuum_vacuum_cost_limit = 1000 # default 200 — let vacuum work faster
The autovacuum_vacuum_scale_factor is the key setting. The default 0.2 means autovacuum kicks in when 20% of a table's rows are dead. On a 100-million-row table, that is 20 million dead rows before vacuuming starts. By that point, the table is massively bloated.
For high-churn tables, set per-table overrides:
ALTER TABLE orders SET (
autovacuum_vacuum_scale_factor = 0.01,
autovacuum_analyze_scale_factor = 0.005,
autovacuum_vacuum_cost_limit = 2000
);
Table Statistics and ANALYZE
The query planner makes decisions based on table statistics — row counts, value distributions, most common values, histogram buckets. Stale statistics lead to bad plans.
-- Force statistics refresh
ANALYZE orders;
-- Increase statistics granularity for a specific column
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders;
-- Check current statistics
SELECT
attname,
n_distinct,
most_common_vals,
most_common_freqs
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'status';
The default_statistics_target (default 100) controls how many histogram buckets are computed. For columns with highly skewed distributions — like a status column where 90% of rows are 'completed' — increasing it to 500 or 1000 helps the planner generate better estimates.
Partitioning Strategies for Large Tables
When a table grows beyond tens of millions of rows, sequential operations become expensive even with proper indexing. Partitioning splits a table into smaller physical chunks while presenting a single logical table to your application.
Range Partitioning by Date
The most common pattern for time-series or event data:
CREATE TABLE events (
id BIGSERIAL,
event_type TEXT NOT NULL,
payload JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (created_at);
-- Monthly partitions
CREATE TABLE events_2025_01 PARTITION OF events
FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
CREATE TABLE events_2025_02 PARTITION OF events
FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');
CREATE TABLE events_2025_03 PARTITION OF events
FOR VALUES FROM ('2025-03-01') TO ('2025-04-01');
-- Indexes are created per partition here (since PostgreSQL 11 you can also
-- create the index once on the parent and it cascades to every partition)
CREATE INDEX idx_events_2025_01_type ON events_2025_01 (event_type);
CREATE INDEX idx_events_2025_02_type ON events_2025_02 (event_type);
CREATE INDEX idx_events_2025_03_type ON events_2025_03 (event_type);
When you query with a WHERE created_at >= '2025-02-01' filter, PostgreSQL prunes irrelevant partitions entirely. Instead of scanning 500 million rows, it scans only the partitions that match.
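You can verify pruning with a plain EXPLAIN; only the matching partitions from the example above should appear in the plan:
EXPLAIN (COSTS OFF)
SELECT count(*) FROM events WHERE created_at >= '2025-02-01';
-- The plan should reference only events_2025_02 and events_2025_03;
-- events_2025_01 is pruned and never touched.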
Automating Partition Creation
-- Helper to auto-create monthly partitions. Note: with declarative
-- partitioning, a trigger cannot create the missing partition for you
-- (the insert fails during tuple routing before any trigger fires), so
-- run this ahead of time from a scheduled job instead.
CREATE OR REPLACE FUNCTION create_monthly_partition(target_month TIMESTAMPTZ)
RETURNS void AS $$
DECLARE
    partition_name TEXT;
    start_date DATE;
    end_date DATE;
BEGIN
    start_date := date_trunc('month', target_month);
    end_date := start_date + INTERVAL '1 month';
    partition_name := 'events_' || to_char(start_date, 'YYYY_MM');
    IF NOT EXISTS (
        SELECT 1 FROM pg_class WHERE relname = partition_name
    ) THEN
        EXECUTE format(
            'CREATE TABLE %I PARTITION OF events FOR VALUES FROM (%L) TO (%L)',
            partition_name, start_date, end_date
        );
        EXECUTE format(
            'CREATE INDEX %I ON %I (event_type)',
            'idx_' || partition_name || '_type', partition_name
        );
    END IF;
END;
$$ LANGUAGE plpgsql;
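Call the helper ahead of demand from a scheduled job. If the pg_cron extension happens to be installed, a sketch (any external scheduler that can run a SQL statement works just as well):
-- Create next month's partition on the 25th of every month at midnight
SELECT cron.schedule(
  'create-next-events-partition',
  '0 0 25 * *',
  $$SELECT create_monthly_partition(now() + interval '1 month')$$
);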
Dropping Old Data
One of the biggest wins with partitioning is dropping old data:
-- Instead of DELETE FROM events WHERE created_at < '2024-01-01' (which creates millions of dead tuples)
DROP TABLE events_2023_01;
DROP TABLE events_2023_02;
-- Instant, no vacuum needed
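If you need to archive the data first, or want to avoid even the brief lock on the parent that dropping a partition takes, detach it and deal with it separately. DETACH ... CONCURRENTLY requires PostgreSQL 14+ and cannot run inside a transaction block:
-- Turn the partition into a standalone table without blocking concurrent queries
ALTER TABLE events DETACH PARTITION events_2023_01 CONCURRENTLY;
-- Archive it (pg_dump, copy to cold storage, etc.), then drop it
DROP TABLE events_2023_01;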
Using EXPLAIN ANALYZE Effectively
Everyone knows EXPLAIN ANALYZE. Most people read only the top-level timing. Here is how to actually use it.
Reading the Full Output
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
AND o.created_at > NOW() - INTERVAL '7 days';
Output:
Nested Loop (cost=0.85..1247.32 rows=42 width=52) (actual time=0.089..145.231 rows=8521 loops=1)
Buffers: shared hit=24891 read=3201
-> Index Scan using idx_orders_status_created on orders o (cost=0.43..892.10 rows=42 width=24) (actual time=0.051..98.432 rows=8521 loops=1)
Index Cond: ((status = 'pending') AND (created_at > ...))
Buffers: shared hit=18234 read=2890
-> Index Scan using customers_pkey on customers c (cost=0.42..8.44 rows=1 width=36) (actual time=0.004..0.005 rows=1 loops=8521)
Index Cond: (id = o.customer_id)
Buffers: shared hit=6657 read=311
Planning Time: 0.412 ms
Execution Time: 147.891 ms
Key things to look for:
- rows=42 estimated vs rows=8521 actual — the planner estimated 42 rows but got 8,521. A 200x misestimate like this means stale statistics or missing stats. Run ANALYZE orders.
- Buffers: shared read=3201 — those 3,201 blocks came from disk. If this number is high relative to shared hit, you need more shared_buffers or a better index to reduce the scan width.
- loops=8521 — the inner index scan on customers executed 8,521 times. Each one is fast (0.005ms), but at scale this adds up.
Spotting Sequential Scans on Large Tables
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_email = 'user@example.com';
-- Bad output:
-- Seq Scan on orders (cost=0.00..412893.00 rows=1 width=128) (actual time=2847.123..2847.124 rows=1 loops=1)
-- Filter: (customer_email = 'user@example.com')
-- Rows Removed by Filter: 12000000
-- Buffers: shared hit=89012 read=123881
Scanning 12 million rows to find one. The fix:
CREATE INDEX CONCURRENTLY idx_orders_email ON orders (customer_email);
Always use CONCURRENTLY in production — it avoids locking the table during index creation.
The SETTINGS Option (PostgreSQL 12+)
EXPLAIN (ANALYZE, BUFFERS, SETTINGS)
SELECT ...
This shows non-default configuration settings that affected the plan, making it easier to understand why a plan was chosen.
Write-Ahead Log (WAL) Tuning for Write-Heavy Workloads
The WAL is PostgreSQL's durability mechanism. Every data change is written to the WAL before being applied to data files. For write-heavy workloads — high-volume inserts, frequent updates — WAL settings have a massive impact on throughput.
Key WAL Settings
# postgresql.conf
wal_level = replica # default — supports replication
max_wal_size = 4GB # default 1GB — larger reduces checkpoint frequency
min_wal_size = 1GB # default 80MB
checkpoint_completion_target = 0.9 # default 0.9 — spread checkpoint I/O
wal_buffers = 64MB # default -1 (auto, usually 16MB)
wal_compression = on                # reduces WAL volume; PostgreSQL 15+ also supports lz4/zstd
Understanding Checkpoints
PostgreSQL periodically flushes dirty pages from memory to disk — this is a checkpoint. During a checkpoint, write performance can dip significantly. You want checkpoints to happen infrequently and to be spread over time.
-- Check checkpoint frequency
SELECT * FROM pg_stat_bgwriter;
If checkpoints_req (requested checkpoints — triggered by hitting max_wal_size) is high relative to checkpoints_timed, your max_wal_size is too low. Increase it.
Synchronous Commit Tradeoffs
For workloads where you can tolerate losing the last few hundred milliseconds of transactions on a crash (log tables, analytics events):
-- Per-session: don't wait for WAL flush
SET synchronous_commit = off;
-- Insert millions of rows much faster
INSERT INTO event_log ...
This does not risk data corruption — only data loss of recently committed transactions if the server crashes before the WAL is flushed. For most logging and analytics use cases, that is an acceptable tradeoff.
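If you only want the relaxed durability for one bulk load rather than the whole session, scope it to a single transaction with SET LOCAL; the setting reverts automatically at commit. A sketch (staging_events and the column list are placeholders):
BEGIN;
SET LOCAL synchronous_commit = off;
-- staging_events stands in for wherever your bulk data comes from
INSERT INTO event_log (event_type, payload)
SELECT event_type, payload FROM staging_events;
COMMIT;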
Monitoring with pg_stat_user_tables and pg_stat_activity
Table-Level Monitoring
-- Tables with poor index usage
SELECT
schemaname,
relname,
seq_scan,
idx_scan,
CASE WHEN seq_scan + idx_scan > 0
THEN round(100.0 * idx_scan / (seq_scan + idx_scan), 2)
ELSE 0
END AS idx_scan_pct,
n_tup_ins,
n_tup_upd,
n_tup_del
FROM pg_stat_user_tables
WHERE seq_scan > 100
ORDER BY seq_scan DESC;
If a table with millions of rows has idx_scan_pct below 90%, you are missing indexes.
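The inverse check is also worth running occasionally: indexes that are never scanned cost write throughput and disk for no benefit. A sketch (keep anything backing a primary key or unique constraint):
-- Indexes with little or no usage, largest first
SELECT schemaname, relname, indexrelname, idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC, pg_relation_size(indexrelid) DESC
LIMIT 10;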
Active Session Monitoring
-- Long-running queries
SELECT
pid,
now() - query_start AS duration,
state,
wait_event_type,
wait_event,
left(query, 100) AS query
FROM pg_stat_activity
WHERE state != 'idle'
AND query NOT LIKE '%pg_stat_activity%'
ORDER BY query_start ASC;
Cache Hit Ratio
-- Overall cache hit ratio — should be > 99%
SELECT
sum(blks_hit) AS hits,
sum(blks_read) AS reads,
round(100.0 * sum(blks_hit) / nullif(sum(blks_hit) + sum(blks_read), 0), 2) AS hit_ratio
FROM pg_stat_database
WHERE datname = current_database();
If your cache hit ratio drops below 99%, you need more RAM or are scanning too much data.
Application-Level Optimizations
Batch Inserts
Inserting rows one at a time is the slowest way to load data:
// BAD — 10,000 round trips
var pg = require('pg');
var pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });
function insertOneByOne(rows, callback) {
var completed = 0;
rows.forEach(function(row) {
pool.query(
'INSERT INTO events (type, payload) VALUES ($1, $2)',
[row.type, row.payload],
function(err) {
if (err) return callback(err);
completed++;
if (completed === rows.length) callback(null);
}
);
});
}
// 10,000 rows: ~8,200ms
Use multi-row INSERT with parameterized values:
// GOOD — single round trip with multi-row VALUES
var pg = require('pg');
var pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });
function batchInsert(rows, callback) {
var values = [];
var params = [];
var paramIndex = 1;
rows.forEach(function(row) {
values.push('($' + paramIndex + ', $' + (paramIndex + 1) + ')');
params.push(row.type, row.payload);
paramIndex += 2;
});
var sql = 'INSERT INTO events (type, payload) VALUES ' + values.join(', ');
pool.query(sql, params, function(err, result) {
if (err) return callback(err);
callback(null, result.rowCount);
});
}
// 10,000 rows: ~320ms (25x faster)
The COPY Protocol
For truly large data loads (hundreds of thousands to millions of rows), COPY is the fastest option. It streams all rows in a single command, avoiding per-row parsing, planning, and network round trips.
var pg = require('pg');
var { from: copyFrom } = require('pg-copy-streams');
var pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });
function bulkLoad(rows, callback) {
pool.connect(function(err, client, release) {
if (err) return callback(err);
var stream = client.query(copyFrom(
"COPY events (type, payload) FROM STDIN WITH (FORMAT csv)"
));
var buffer = '';
rows.forEach(function(row) {
// Escape CSV values properly
var escapedPayload = '"' + row.payload.replace(/"/g, '""') + '"';
buffer += row.type + ',' + escapedPayload + '\n';
});
stream.write(buffer);
stream.on('finish', function() {
release();
callback(null, rows.length);
});
stream.on('error', function(err) {
release();
callback(err);
});
stream.end();
});
}
// 10,000 rows: ~45ms (180x faster than one-by-one)
Prepared Statements
The pg driver supports prepared statements, which let PostgreSQL parse and plan a query once, then execute it many times with different parameters:
var pg = require('pg');
var pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });
// The pg driver auto-prepares queries with a name
function findOrdersByCustomer(customerId, callback) {
var query = {
name: 'find-orders-by-customer',
text: 'SELECT id, total, status FROM orders WHERE customer_id = $1 ORDER BY created_at DESC LIMIT 50',
values: [customerId]
};
pool.query(query, function(err, result) {
if (err) return callback(err);
callback(null, result.rows);
});
}
// First call: ~2.1ms (parse + plan + execute)
// Subsequent calls: ~0.8ms (execute only)
Note that prepared statements do not work with PgBouncer in transaction pooling mode unless you use protocol-level prepared statements (PgBouncer 1.21+).
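If you are on PgBouncer 1.21 or newer, the pooler can track protocol-level prepared statements across its server connections; the limit below is an illustrative value:
# pgbouncer.ini — allow prepared statements in transaction pooling mode (PgBouncer 1.21+)
[pgbouncer]
max_prepared_statements = 200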
Complete Working Example
Here is a Node.js application that connects to PostgreSQL, analyzes performance bottlenecks, and demonstrates the before/after impact of tuning.
// perf-analyzer.js
var pg = require('pg');
var pool = new pg.Pool({
connectionString: process.env.DATABASE_URL,
max: 20,
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 5000
});
// Step 1: Check overall database health
function checkDatabaseHealth(callback) {
console.log('=== Database Health Check ===\n');
var queries = {
cacheHitRatio: `
SELECT round(100.0 * sum(blks_hit) / nullif(sum(blks_hit) + sum(blks_read), 0), 2) AS hit_ratio
FROM pg_stat_database WHERE datname = current_database()
`,
tableStats: `
SELECT relname,
seq_scan, idx_scan,
n_live_tup, n_dead_tup,
last_autovacuum
FROM pg_stat_user_tables
WHERE n_live_tup > 1000
ORDER BY seq_scan DESC LIMIT 10
`,
longRunningQueries: `
SELECT pid, now() - query_start AS duration, left(query, 80) AS query
FROM pg_stat_activity
WHERE state = 'active' AND query_start < now() - interval '5 seconds'
ORDER BY query_start ASC
`
};
var results = {};
var keys = Object.keys(queries);
var completed = 0;
keys.forEach(function(key) {
pool.query(queries[key], function(err, result) {
if (err) {
console.error('Error running ' + key + ':', err.message);
results[key] = null;
} else {
results[key] = result.rows;
}
completed++;
if (completed === keys.length) {
printHealthReport(results);
callback(null, results);
}
});
});
}
function printHealthReport(results) {
if (results.cacheHitRatio && results.cacheHitRatio[0]) {
var ratio = results.cacheHitRatio[0].hit_ratio;
console.log('Cache Hit Ratio: ' + ratio + '%' + (ratio < 99 ? ' [WARNING: Below 99%]' : ' [OK]'));
}
console.log('\nTop tables by sequential scans:');
if (results.tableStats) {
results.tableStats.forEach(function(table) {
// bigint columns come back as strings from node-postgres; convert before doing math
var seqScan = Number(table.seq_scan);
var idxScan = Number(table.idx_scan);
var liveTup = Number(table.n_live_tup);
var deadTup = Number(table.n_dead_tup);
var totalScans = seqScan + idxScan;
var idxPct = totalScans > 0 ? (100 * idxScan / totalScans).toFixed(1) : 'N/A';
var deadPct = liveTup > 0 ? (100 * deadTup / liveTup).toFixed(1) : '0';
console.log(
' ' + table.relname +
' — seq_scan: ' + table.seq_scan +
', idx_scan: ' + table.idx_scan +
' (' + idxPct + '% indexed)' +
', dead_tuples: ' + deadPct + '%' +
(table.last_autovacuum ? ', last_vacuum: ' + table.last_autovacuum : ', [NEVER VACUUMED]')
);
});
}
if (results.longRunningQueries && results.longRunningQueries.length > 0) {
console.log('\n[ALERT] Long-running queries:');
results.longRunningQueries.forEach(function(q) {
console.log(' PID ' + q.pid + ' (' + q.duration + '): ' + q.query);
});
}
}
// Step 2: Analyze slow queries via pg_stat_statements
function analyzeSlowQueries(callback) {
console.log('\n=== Slow Query Analysis (pg_stat_statements) ===\n');
var sql = `
SELECT
round(total_exec_time::numeric, 2) AS total_ms,
calls,
round(mean_exec_time::numeric, 2) AS mean_ms,
round(max_exec_time::numeric, 2) AS max_ms,
round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS pct,
rows,
left(query, 100) AS query
FROM pg_stat_statements
WHERE calls > 10
ORDER BY total_exec_time DESC
LIMIT 10
`;
pool.query(sql, function(err, result) {
if (err) {
if (err.message.indexOf('pg_stat_statements') > -1) {
console.log('pg_stat_statements not enabled. Enable it in postgresql.conf:');
console.log(" shared_preload_libraries = 'pg_stat_statements'");
return callback(null, []);
}
return callback(err);
}
result.rows.forEach(function(row, i) {
console.log((i + 1) + '. [' + row.pct + '% of total time]');
console.log(' Total: ' + row.total_ms + 'ms | Calls: ' + row.calls +
' | Mean: ' + row.mean_ms + 'ms | Max: ' + row.max_ms + 'ms');
console.log(' Query: ' + row.query);
console.log('');
});
callback(null, result.rows);
});
}
// Step 3: Run EXPLAIN ANALYZE on a specific query
function explainQuery(sql, params, callback) {
console.log('\n=== EXPLAIN ANALYZE ===\n');
var explainSql = 'EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) ' + sql;
pool.query(explainSql, params, function(err, result) {
if (err) return callback(err);
var plan = result.rows[0]['QUERY PLAN'][0];
var planNode = plan.Plan;
console.log('Planning Time: ' + plan['Planning Time'] + 'ms');
console.log('Execution Time: ' + plan['Execution Time'] + 'ms');
console.log('Node Type: ' + planNode['Node Type']);
console.log('Actual Rows: ' + planNode['Actual Rows']);
console.log('Planned Rows: ' + planNode['Plan Rows']);
var rowEstimateRatio = planNode['Actual Rows'] / Math.max(planNode['Plan Rows'], 1);
if (rowEstimateRatio > 10 || rowEstimateRatio < 0.1) {
console.log('[WARNING] Row estimate is off by ' + rowEstimateRatio.toFixed(1) +
'x — run ANALYZE on the table');
}
if (planNode['Shared Read Blocks'] && planNode['Shared Read Blocks'] > 1000) {
console.log('[WARNING] ' + planNode['Shared Read Blocks'] +
' blocks read from disk — consider increasing shared_buffers');
}
callback(null, plan);
});
}
// Step 4: Benchmark batch insert performance
function benchmarkInserts(callback) {
console.log('\n=== Insert Benchmark ===\n');
var setupSql = `
CREATE TABLE IF NOT EXISTS bench_events (
id SERIAL PRIMARY KEY,
event_type TEXT NOT NULL,
payload JSONB,
created_at TIMESTAMPTZ DEFAULT NOW()
)
`;
pool.query(setupSql, function(err) {
if (err) return callback(err);
var rows = [];
for (var i = 0; i < 5000; i++) {
rows.push({
type: 'benchmark_event',
payload: JSON.stringify({ index: i, data: 'test payload ' + i })
});
}
// Method 1: Individual inserts
var startOne = Date.now();
var completed = 0;
var errors = 0;
rows.forEach(function(row) {
pool.query(
'INSERT INTO bench_events (event_type, payload) VALUES ($1, $2)',
[row.type, row.payload],
function(err) {
if (err) errors++;
completed++;
if (completed === rows.length) {
var elapsed1 = Date.now() - startOne;
console.log('Individual inserts (5000 rows): ' + elapsed1 + 'ms');
// Method 2: Multi-row batch
var values = [];
var params = [];
var idx = 1;
rows.forEach(function(r) {
values.push('($' + idx + ', $' + (idx + 1) + ')');
params.push(r.type, r.payload);
idx += 2;
});
var batchSql = 'INSERT INTO bench_events (event_type, payload) VALUES ' + values.join(', ');
var startBatch = Date.now();
pool.query(batchSql, params, function(err) {
if (err) return callback(err);
var elapsed2 = Date.now() - startBatch;
console.log('Batch insert (5000 rows): ' + elapsed2 + 'ms');
console.log('Speedup: ' + (elapsed1 / elapsed2).toFixed(1) + 'x');
// Cleanup
pool.query('DROP TABLE bench_events', function() {
callback(null, { individual: elapsed1, batch: elapsed2 });
});
});
}
}
);
});
});
}
// Main execution
function main() {
console.log('PostgreSQL Performance Analyzer');
console.log('================================\n');
checkDatabaseHealth(function(err) {
if (err) {
console.error('Health check failed:', err.message);
process.exit(1);
}
analyzeSlowQueries(function(err, slowQueries) {
if (err) {
console.error('Slow query analysis failed:', err.message);
}
// Example: explain a common query pattern
explainQuery(
'SELECT * FROM pg_stat_user_tables WHERE n_live_tup > $1',
[0],
function(err) {
if (err) console.error('EXPLAIN failed:', err.message);
benchmarkInserts(function(err, benchResults) {
if (err) console.error('Benchmark failed:', err.message);
console.log('\n=== Analysis Complete ===');
pool.end();
});
}
);
});
});
}
main();
Run it:
$ npm install pg pg-copy-streams
$ DATABASE_URL=postgresql://user:pass@localhost:5432/mydb node perf-analyzer.js
PostgreSQL Performance Analyzer
================================
=== Database Health Check ===
Cache Hit Ratio: 99.71% [OK]
Top tables by sequential scans:
orders — seq_scan: 48201, idx_scan: 982341 (95.3% indexed), dead_tuples: 3.2%, last_vacuum: 2026-02-07 14:32:11
events — seq_scan: 12092, idx_scan: 2341891 (99.5% indexed), dead_tuples: 0.8%, last_vacuum: 2026-02-08 01:15:44
customers — seq_scan: 8921, idx_scan: 412093 (97.9% indexed), dead_tuples: 1.1%, last_vacuum: 2026-02-07 22:08:33
=== Slow Query Analysis (pg_stat_statements) ===
1. [34.21% of total time]
Total: 2841023.12ms | Calls: 98201 | Mean: 28.93ms | Max: 1204.32ms
Query: SELECT * FROM orders WHERE customer_id = $1 AND status = $2 ORDER BY created_at DESC
2. [18.43% of total time]
Total: 1530891.44ms | Calls: 204812 | Mean: 7.47ms | Max: 892.11ms
Query: UPDATE inventory SET quantity = quantity - $1 WHERE product_id = $2 AND quantity >= $1
=== Insert Benchmark ===
Individual inserts (5000 rows): 4218ms
Batch insert (5000 rows): 187ms
Speedup: 22.6x
=== Analysis Complete ===
Common Issues & Troubleshooting
1. "FATAL: sorry, too many clients already"
Error: FATAL: sorry, too many clients already
at Parser.parseErrorMessage (/app/node_modules/pg-protocol/dist/parser.js:287:98)
You have hit max_connections. This almost always means you are not using connection pooling, or your pool size is too large. Check how many connections you actually have:
SELECT count(*) FROM pg_stat_activity;
SELECT count(*), state FROM pg_stat_activity GROUP BY state;
Fix: Use PgBouncer or reduce your application pool size. If you have 10 Node.js instances each with max: 20, that is 200 connections. Your max_connections needs to accommodate that plus superuser connections.
2. "canceling statement due to statement timeout"
Error: canceling statement due to statement timeout
at Parser.parseErrorMessage (/app/node_modules/pg-protocol/dist/parser.js:287:98)
A query exceeded statement_timeout. This is usually a good thing — the timeout is protecting you from a runaway query. But if legitimate queries are timing out:
-- Check what the timeout is set to
SHOW statement_timeout;
-- Raise it for a specific session if needed
SET statement_timeout = '120s';
Alternatively, optimize the query so it finishes within the timeout. Use EXPLAIN ANALYZE to understand why it is slow.
3. "deadlock detected"
Error: deadlock detected
Detail: Process 18432 waits for ShareLock on transaction 984312; blocked by process 18445.
Process 18445 waits for ShareLock on transaction 984310; blocked by process 18432.
Hint: See server log for query details.
Two transactions are waiting on each other. PostgreSQL automatically detects this and kills one. The fix is to ensure your application always acquires locks in a consistent order:
// BAD — can deadlock if two requests update the same items in different order
function transferInventory(fromId, toId, quantity, callback) {
pool.query('BEGIN', function(err) {
pool.query('UPDATE inventory SET qty = qty - $1 WHERE id = $2', [quantity, fromId], function(err) {
pool.query('UPDATE inventory SET qty = qty + $1 WHERE id = $2', [quantity, toId], function(err) {
pool.query('COMMIT', callback);
});
});
});
}
// GOOD — always lock in ascending ID order
function transferInventory(fromId, toId, quantity, callback) {
var first = Math.min(fromId, toId);
var second = Math.max(fromId, toId);
pool.query('BEGIN', function(err) {
pool.query('SELECT * FROM inventory WHERE id IN ($1, $2) ORDER BY id FOR UPDATE', [first, second], function(err) {
pool.query('UPDATE inventory SET qty = qty - $1 WHERE id = $2', [quantity, fromId], function(err) {
pool.query('UPDATE inventory SET qty = qty + $1 WHERE id = $2', [quantity, toId], function(err) {
pool.query('COMMIT', callback);
});
});
});
});
}
4. "could not extend file ... No space left on device"
ERROR: could not extend file "base/16384/298471": No space left on device
HINT: Check free disk space.
Your disk is full, often because autovacuum could not keep up with bloat, or WAL files accumulated. Emergency steps:
# Check disk usage
df -h /var/lib/postgresql
# Check WAL directory size
du -sh /var/lib/postgresql/15/main/pg_wal/
# If WAL is consuming too much space, force a checkpoint
psql -c "CHECKPOINT;"
# Check for bloated tables
psql -c "SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) FROM pg_class WHERE relkind = 'r' ORDER BY pg_total_relation_size(oid) DESC LIMIT 10;"
5. "could not serialize access due to concurrent update"
Error: could not serialize access due to concurrent update
This happens under REPEATABLE READ or SERIALIZABLE isolation when a row your transaction is trying to update (or lock with SELECT ... FOR UPDATE) has been concurrently modified and committed. If you see this, your application must be prepared to retry:
function queryWithRetry(sql, params, maxRetries, callback) {
var attempts = 0;
function attempt() {
attempts++;
pool.query(sql, params, function(err, result) {
if (err && err.code === '40001' && attempts < maxRetries) {
// Serialization failure — retry after short delay
var delay = Math.pow(2, attempts) * 10 + Math.random() * 50;
console.log('Serialization conflict, retrying in ' + Math.round(delay) + 'ms (attempt ' + attempts + ')');
setTimeout(attempt, delay);
} else if (err) {
callback(err);
} else {
callback(null, result);
}
});
}
attempt();
}
Best Practices
- Measure before tuning. Enable pg_stat_statements on day one. Never guess which queries are slow — let the data tell you. The query you think is the bottleneck rarely is.
- Use connection pooling in every environment. PgBouncer in transaction mode, or at minimum, the pg driver's built-in pool with a sane max value (10-20 per Node.js process, not 100). Every idle connection consumes kernel and PostgreSQL resources.
- Run ANALYZE after bulk data operations. After importing thousands of rows, table statistics are stale. The planner will generate terrible plans until you run ANALYZE. Automate this in your import scripts.
- Set random_page_cost = 1.1 on SSD storage. This single change often eliminates the most common class of bad query plans — unnecessary sequential scans on indexed tables. The default value of 4.0 assumes spinning disks.
- Monitor dead tuple ratios continuously. If any table exceeds 10% dead tuples, autovacuum is not keeping up. Tune per-table autovacuum_vacuum_scale_factor downward for high-churn tables. A bloated table does not just waste space — it makes every query scan more pages.
- Use EXPLAIN (ANALYZE, BUFFERS) during development, not just EXPLAIN. Without ANALYZE, you see the planner's estimates, which can be wildly wrong. Without BUFFERS, you do not know whether data came from cache or disk. Both details are essential for diagnosing performance problems.
- Batch your writes. Multi-row INSERT statements are 20-50x faster than individual inserts. COPY is another 5-10x faster than multi-row inserts. If you are inserting more than a few hundred rows, batch them. The network round-trip cost of individual statements dominates at scale.
- Use partial indexes for common query patterns. If 95% of your queries filter on WHERE status = 'active', a partial index is smaller and faster: CREATE INDEX idx_orders_active ON orders (customer_id) WHERE status = 'active'.
- Never build indexes on production tables without CONCURRENTLY. Standard CREATE INDEX takes a SHARE lock that blocks all writes (reads still work) until it finishes. On a large table, that can be minutes of blocked writes. CREATE INDEX CONCURRENTLY avoids this at the cost of taking longer. Be equally careful with ALTER TABLE, whose heavier forms take an ACCESS EXCLUSIVE lock that blocks everything.
- Set idle_in_transaction_session_timeout. Abandoned transactions are a silent killer in production. They hold locks, prevent autovacuum from cleaning dead tuples, and can cause cascading performance degradation. A 60-second timeout catches these before they cause real damage.
References
- PostgreSQL Documentation: Server Configuration — The authoritative reference for every postgresql.conf setting.
- PostgreSQL Documentation: pg_stat_statements — Extension documentation with full column reference.
- PostgreSQL Documentation: EXPLAIN — All EXPLAIN options and output format details.
- PostgreSQL Documentation: Routine Vacuuming — Deep dive into vacuum, autovacuum, and bloat management.
- PostgreSQL Documentation: Table Partitioning — Declarative partitioning syntax and strategies.
- PgTune — Online calculator for initial postgresql.conf settings based on hardware specs.
- PgBouncer Documentation — Connection pooler configuration and pooling mode reference.
- node-postgres (pg) Documentation — Node.js PostgreSQL driver with pool configuration and prepared statement support.
