Lesson 18: Performance Tuning
Optimize the PostgreSQL configuration, deploy connection pooling (PgBouncer) and load balancing (HAProxy), and scale reads with replicas.
Objectives
After this lesson, you will be able to:
- Optimize PostgreSQL configuration for an HA cluster
- Setup connection pooling với PgBouncer
- Implement load balancing với HAProxy
- Scale reads với multiple replicas
- Tune queries và indexes
- Monitor và troubleshoot performance issues
1. PostgreSQL Configuration Tuning
1.1. Memory settings
shared_buffers
-- Recommended: 25% of total RAM
-- Example for 16GB RAM server:
ALTER SYSTEM SET shared_buffers = '4GB';
-- Check current:
SHOW shared_buffers;
effective_cache_size
-- Recommended: 50-75% of total RAM
-- Tells planner how much memory available for caching
ALTER SYSTEM SET effective_cache_size = '12GB';
work_mem
-- Per-operation memory (sorting, hashing)
-- Careful: per query per operation!
-- Example: 10 concurrent queries × 5 operations = 50 × work_mem
ALTER SYSTEM SET work_mem = '64MB';
-- For specific query:
SET work_mem = '256MB';
SELECT ...;
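For one-off heavy queries it is safer to raise work_mem only inside a transaction with SET LOCAL, so the larger value cannot leak into the rest of the session (a minimal sketch; the aggregate over orders is illustrative):
-- Raise work_mem for a single report without touching the global setting
BEGIN;
SET LOCAL work_mem = '512MB';  -- reverts automatically at COMMIT/ROLLBACK
SELECT user_id, count(*), max(order_date) FROM orders GROUP BY user_id;
COMMIT;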
maintenance_work_mem
-- For VACUUM, CREATE INDEX, ALTER TABLE
ALTER SYSTEM SET maintenance_work_mem = '1GB';
1.2. Checkpoint tuning
-- How often to checkpoint (time-based)
ALTER SYSTEM SET checkpoint_timeout = '15min'; -- Default: 5min
-- Maximum size of WAL between checkpoints
ALTER SYSTEM SET max_wal_size = '4GB'; -- Default: 1GB
ALTER SYSTEM SET min_wal_size = '1GB';
-- Spread checkpoint I/O over time (0.9 = spread writes over 90% of checkpoint_timeout)
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
-- Warn if checkpoints happen too frequently
ALTER SYSTEM SET checkpoint_warning = '5min';
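To see whether checkpoints are triggered by the timer or by WAL pressure, compare the counters in pg_stat_bgwriter (on PostgreSQL 17+ these moved to pg_stat_checkpointer):
-- checkpoints_req growing faster than checkpoints_timed usually means
-- max_wal_size is still too small for the write workload
SELECT checkpoints_timed, checkpoints_req, buffers_checkpoint, stats_reset
FROM pg_stat_bgwriter;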
1.3. WAL settings
-- WAL buffers (auto-tuned to 1/32 of shared_buffers, max 16MB)
ALTER SYSTEM SET wal_buffers = '16MB';
-- WAL writer delay
ALTER SYSTEM SET wal_writer_delay = '200ms'; -- Default: 200ms
-- Commit delay (group commit optimization)
ALTER SYSTEM SET commit_delay = 0; -- Microseconds, 0 = disabled
ALTER SYSTEM SET commit_siblings = 5; -- Minimum concurrent transactions
1.4. Query planner
-- Random page cost (lower for SSD)
ALTER SYSTEM SET random_page_cost = 1.1; -- Default: 4.0 (HDD)
-- Enable parallel query
ALTER SYSTEM SET max_parallel_workers_per_gather = 4;
ALTER SYSTEM SET max_parallel_workers = 8;
ALTER SYSTEM SET parallel_tuple_cost = 0.1;
ALTER SYSTEM SET parallel_setup_cost = 1000;
-- Join optimization
ALTER SYSTEM SET enable_hashjoin = on;
ALTER SYSTEM SET enable_mergejoin = on;
ALTER SYSTEM SET enable_nestloop = on;
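To confirm the planner actually produces a parallel plan after these changes, inspect the plan of a large scan (a sketch; orders is the sample table used later in this lesson):
EXPLAIN (COSTS OFF)
SELECT count(*) FROM orders;
-- Expect Gather / Parallel Seq Scan nodes if the table is large enough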
1.5. Connection settings
-- Maximum connections (balance with work_mem)
ALTER SYSTEM SET max_connections = 200;
-- Superuser reserved connections
ALTER SYSTEM SET superuser_reserved_connections = 5;
-- Statement timeout (prevent runaway queries)
ALTER SYSTEM SET statement_timeout = '30min';
-- Lock timeout
ALTER SYSTEM SET lock_timeout = '10s';
-- Idle in transaction timeout
ALTER SYSTEM SET idle_in_transaction_session_timeout = '5min';
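Timeouts can also be scoped per role instead of cluster-wide, which helps when a reporting user legitimately needs longer statements than the web application (a sketch; the reporting role is illustrative):
-- Stricter timeout for the web app, looser for reporting
ALTER ROLE app_user SET statement_timeout = '30s';
ALTER ROLE reporting SET statement_timeout = '2h';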
1.6. Autovacuum tuning
-- Enable autovacuum
ALTER SYSTEM SET autovacuum = on;
-- Number of autovacuum workers
ALTER SYSTEM SET autovacuum_max_workers = 4;
-- Delay between runs
ALTER SYSTEM SET autovacuum_naptime = '1min';
-- Vacuum threshold
ALTER SYSTEM SET autovacuum_vacuum_threshold = 50;
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.1; -- 10% of table
-- Analyze threshold
ALTER SYSTEM SET autovacuum_analyze_threshold = 50;
ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.05; -- 5% of table
-- Vacuum cost delay (throttling)
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms';
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 400;
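To verify autovacuum keeps up, watch dead-tuple counts and the last run per table:
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;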
1.7. Logging for performance
-- Log slow queries
ALTER SYSTEM SET log_min_duration_statement = '1000'; -- 1 second
-- Log checkpoints (monitoring)
ALTER SYSTEM SET log_checkpoints = on;
-- Log connections/disconnections
ALTER SYSTEM SET log_connections = off;
ALTER SYSTEM SET log_disconnections = off;
-- Log lock waits
ALTER SYSTEM SET log_lock_waits = on;
ALTER SYSTEM SET deadlock_timeout = '1s';
-- Log temp files larger than this (a bare number is interpreted in kB)
ALTER SYSTEM SET log_temp_files = '10MB';
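Temp-file spills are also summarized per database in pg_stat_database, a quick way to decide whether work_mem deserves attention:
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_spilled
FROM pg_stat_database
ORDER BY temp_bytes DESC;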
1.8. Apply configuration
-- Reload configuration (no restart needed for most)
SELECT pg_reload_conf();
-- Check what requires restart:
SELECT name, setting, pending_restart
FROM pg_settings
WHERE pending_restart = true;
-- Restart if needed:
sudo systemctl restart patroni
2. Connection Pooling with PgBouncer
2.1. Why connection pooling?
Problem without pooling:
Application: 1000 concurrent users
Each user: 1 PostgreSQL connection
PostgreSQL: 1000 connections = HIGH overhead
Each connection = ~10MB RAM + fork overhead
1000 connections = ~10GB RAM wasted!
Solution with PgBouncer:
Application: 1000 concurrent users → PgBouncer
PgBouncer: Pool of 50 connections → PostgreSQL
PostgreSQL: 50 connections = LOW overhead
50 connections = ~500MB RAM ✅
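How large should the server-side pool be? One widely cited starting point (workload-dependent, treat as a rough guess):
pool_size ≈ (2 × CPU cores) + number of active disks
Example: 16 cores + 4 data disks → default_pool_size ≈ 36
Start near that value, watch cl_waiting in SHOW POOLS, and only grow the pool if clients queue while the database still has idle CPU.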
2.2. Install PgBouncer
# Install
sudo apt-get install -y pgbouncer
# Create config directory
sudo mkdir -p /etc/pgbouncer
# Create log directory
sudo mkdir -p /var/log/pgbouncer
sudo chown postgres:postgres /var/log/pgbouncer
2.3. Configure PgBouncer
# /etc/pgbouncer/pgbouncer.ini
[databases]
myapp = host=localhost port=5432 dbname=myapp
postgres = host=localhost port=5432 dbname=postgres
[pgbouncer]
# Listen address
listen_addr = *
listen_port = 6432
# Authentication
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
# Admin
admin_users = postgres
stats_users = monitoring
# Pool settings
# session | transaction | statement
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
min_pool_size = 10
reserve_pool_size = 5
reserve_pool_timeout = 3
# Connection limits per user/database
max_db_connections = 50
max_user_connections = 50
# Timeouts
server_idle_timeout = 600
server_lifetime = 3600
server_connect_timeout = 15
query_timeout = 0
query_wait_timeout = 120
# Logging
log_connections = 1
log_disconnections = 1
log_pooler_errors = 1
logfile = /var/log/pgbouncer/pgbouncer.log
# Additional
ignore_startup_parameters = extra_float_digits
Pool modes explained:
session mode:
- Connection assigned to client for entire session
- Most compatible
- Least efficient pooling
transaction mode: ✅ RECOMMENDED
- Connection returned to pool after transaction
- Good balance of compatibility and efficiency
- Some features don't work (temp tables, prepared statements)
statement mode:
- Connection returned after each statement
- Most efficient
- Least compatible (no multi-statement transactions)
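A quick way to see the transaction-mode limitation: session-level state such as prepared statements may not survive across transactions, because the next transaction can be served by a different server connection (a sketch, run through port 6432):
PREPARE get_user (int) AS SELECT * FROM users WHERE id = $1;
-- A later transaction may land on another backend connection
-- where this statement was never prepared:
EXECUTE get_user(1);  -- can fail: prepared statement "get_user" does not exist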
2.4. User authentication
# Create userlist
sudo tee /etc/pgbouncer/userlist.txt <<EOF
"app_user" "md5hashed_password"
"postgres" "md5hashed_password"
EOF
# Generate MD5 hash (userlist format is "md5" + md5(password + username)):
echo -n "passwordusername" | md5sum
# Example entry: "app_user" "md5abc123..."
# Or use PostgreSQL to generate:
sudo -u postgres psql -c "SELECT 'md5' || md5('password' || 'app_user');"
sudo chmod 600 /etc/pgbouncer/userlist.txt
sudo chown postgres:postgres /etc/pgbouncer/userlist.txt
2.5. Start PgBouncer
# Edit systemd service
sudo tee /etc/systemd/system/pgbouncer.service <<EOF
[Unit]
Description=PgBouncer connection pooler
After=network.target
[Service]
Type=forking
User=postgres
ExecStart=/usr/sbin/pgbouncer -d /etc/pgbouncer/pgbouncer.ini
ExecReload=/bin/kill -HUP \$MAINPID
KillSignal=SIGINT
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# Start
sudo systemctl daemon-reload
sudo systemctl start pgbouncer
sudo systemctl enable pgbouncer
# Verify
sudo systemctl status pgbouncer
2.6. Test connection
# Connect through PgBouncer
psql -h localhost -p 6432 -U app_user -d myapp
# Check PgBouncer stats
psql -h localhost -p 6432 -U postgres pgbouncer -c "SHOW POOLS;"
# database | user | cl_active | cl_waiting | sv_active | sv_idle | sv_used
# ----------+----------+-----------+------------+-----------+---------+---------
# myapp | app_user | 10 | 0 | 5 | 5 | 0
# postgres | postgres | 0 | 0 | 0 | 2 | 0
# cl_active: Active client connections
# sv_active: Active server connections
# sv_idle: Idle server connections in pool
2.7. Application configuration
# Python example
import psycopg2
# OLD: Direct connection
# conn = psycopg2.connect(
# host="10.0.1.11",
# port=5432,
# database="myapp",
# user="app_user",
# password="password"
# )
# NEW: Through PgBouncer ✅
conn = psycopg2.connect(
host="10.0.1.11", # PgBouncer host
port=6432, # PgBouncer port (not 5432!)
database="myapp",
user="app_user",
password="password"
)
2.8. Monitor PgBouncer
# Admin console
psql -h localhost -p 6432 -U postgres pgbouncer
# Useful commands:
SHOW POOLS;
SHOW DATABASES;
SHOW CLIENTS;
SHOW SERVERS;
SHOW STATS;
SHOW CONFIG;
# Reload config without restart
RELOAD;
# Pause all connections
PAUSE;
# Resume
RESUME;
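PAUSE and RESUME also work per database, which is handy for short maintenance windows: clients queue in PgBouncer instead of failing (a sketch from the admin console):
PAUSE myapp;   -- stop handing out server connections for this database
-- ... perform maintenance on the backend ...
RESUME myapp;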
3. Load Balancing with HAProxy
3.1. HAProxy architecture
Application Servers
↓
HAProxy (VIP: 10.0.1.100:5432)
↓
├─→ node1 (Primary - Write) :5432
├─→ node2 (Replica - Read) :5432
└─→ node3 (Replica - Read) :5432
Write traffic → Primary only
Read traffic → Round-robin across replicas
3.2. Install HAProxy
sudo apt-get install -y haproxy
# Verify version
haproxy -v
3.3. Configure HAProxy
# /etc/haproxy/haproxy.cfg
sudo tee /etc/haproxy/haproxy.cfg <<'EOF'
global
log /dev/log local0
log /dev/log local1 notice
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin
stats timeout 30s
user haproxy
group haproxy
daemon
defaults
log global
mode tcp
option tcplog
option dontlognull
timeout connect 5000
timeout client 50000
timeout server 50000
# Stats page
listen stats
mode http
bind *:7000
stats enable
stats uri /
stats refresh 10s
stats admin if TRUE
# Frontend for write (primary)
frontend postgres_write
bind *:5000
mode tcp
default_backend postgres_primary
# Backend for primary (writes)
backend postgres_primary
mode tcp
option httpchk
http-check expect status 200
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server node1 10.0.1.11:5432 check port 8008 check-ssl verify none
server node2 10.0.1.12:5432 check port 8008 check-ssl verify none backup
server node3 10.0.1.13:5432 check port 8008 check-ssl verify none backup
# Frontend for read (replicas)
frontend postgres_read
bind *:5001
mode tcp
default_backend postgres_replicas
# Backend for replicas (reads)
backend postgres_replicas
mode tcp
balance roundrobin
option httpchk
http-check expect status 200
http-check send meth GET uri /replica
default-server inter 3s fall 3 rise 2
server node2 10.0.1.12:5432 check port 8008 check-ssl verify none
server node3 10.0.1.13:5432 check port 8008 check-ssl verify none
server node1 10.0.1.11:5432 check port 8008 check-ssl verify none backup
EOF
Configuration explained:
Port 5000: Write traffic → Primary node
- Health check: Patroni REST API port 8008
- check-ssl verify none assumes the Patroni REST API is served over HTTPS; drop check-ssl if it uses plain HTTP
- If primary fails, backup (replica) can take over
- Backup = only used if primary down
Port 5001: Read traffic → Replicas (round-robin)
- Health check: /replica endpoint
- Primary as backup (if all replicas down)
- Load balanced across healthy replicas
Port 7000: HAProxy stats page
3.4. Patroni REST API endpoints for health checks
# Check if node is leader
curl http://10.0.1.11:8008/leader
# Returns 200 if leader, 503 if not
# Check if node is replica
curl http://10.0.1.12:8008/replica
# Returns 200 if replica, 503 if not
# Check if node is running (any role)
curl http://10.0.1.11:8008/health
# Returns 200 if running
# Primary endpoint (alias for /leader; returns 200 only on the current leader)
curl http://10.0.1.11:8008/master
3.5. Start HAProxy
# Test configuration
sudo haproxy -c -f /etc/haproxy/haproxy.cfg
# Start
sudo systemctl restart haproxy
sudo systemctl enable haproxy
# Check status
sudo systemctl status haproxy
# View logs
sudo journalctl -u haproxy -f
3.6. Test load balancing
# Test write endpoint (should connect to primary)
psql -h localhost -p 5000 -U app_user -d myapp -c "SELECT pg_is_in_recovery();"
# pg_is_in_recovery
# ------------------
# f ← false = PRIMARY ✅
# Test read endpoint (should connect to replica)
psql -h localhost -p 5001 -U app_user -d myapp -c "SELECT pg_is_in_recovery();"
# pg_is_in_recovery
# ------------------
# t ← true = REPLICA ✅
# Multiple reads should round-robin:
for i in {1..10}; do
psql -h localhost -p 5001 -U app_user -d myapp -c "SELECT inet_server_addr();" -t
done
# Should see different IPs rotating
3.7. Application usage
# Application code with read/write split
# Write connection (primary only)
write_conn = psycopg2.connect(
host="haproxy-host",
port=5000, # Write port
database="myapp",
user="app_user"
)
# Read connection (replicas)
read_conn = psycopg2.connect(
host="haproxy-host",
port=5001, # Read port
database="myapp",
user="app_user"
)
# Writes
write_conn.cursor().execute("INSERT INTO users ...")
write_conn.commit()
# Reads (load balanced)
cursor = read_conn.cursor()
cursor.execute("SELECT * FROM users WHERE ...")
results = cursor.fetchall()
3.8. Monitor HAProxy
# Access stats page
# http://haproxy-host:7000/
# Shows:
# - Backend status (UP/DOWN)
# - Current connections
# - Requests per second
# - Health check results
# - Traffic distribution
4. Read Scaling Strategies
4.1. Add more read replicas
# Add 4th node as read replica
# On node4:
# Install PostgreSQL + Patroni (same as before)
# Configure patroni.yml with tags:
tags:
  nofailover: true       # Never promote this node to primary
  noloadbalance: false   # Keep it in read load balancing
# Start Patroni
sudo systemctl start patroni
# Verify joined cluster
patronictl list postgres
Before (3 nodes):
Write: 100% → Primary
Read: 50% → Replica1, 50% → Replica2
After (4 nodes):
Write: 100% → Primary
Read: 33% → Replica1, 33% → Replica2, 33% → Replica3 ✅
4.2. Cascading replication
# For geographically distributed replicas
# node4 (remote datacenter) replicates from node2 instead of primary
# In node4's patroni.yml, use the replicatefrom tag so node4
# streams from node2 instead of the primary:
tags:
  replicatefrom: node2
Topology:
Primary (node1)
↓
├─→ Replica (node2)
│ ↓
│ └─→ Replica (node4 - cascading) ← Reduces load on primary
└─→ Replica (node3)
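To confirm the topology, query pg_stat_replication on each sending node: the primary should list node2 and node3, while node2 should list node4 (a sketch):
SELECT application_name, client_addr, state, sync_state, replay_lsn
FROM pg_stat_replication;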
4.3. Application-level read routing
# Smart routing based on query type (a sketch using psycopg2's built-in pool)
from psycopg2 import pool

class DatabaseRouter:
    def __init__(self):
        # Separate pools for the HAProxy write (5000) and read (5001) ports
        self.write_pool = pool.SimpleConnectionPool(
            1, 10, host='haproxy', port=5000, dbname='myapp', user='app_user')
        self.read_pool = pool.SimpleConnectionPool(
            1, 10, host='haproxy', port=5001, dbname='myapp', user='app_user')

    def execute(self, query, params=None):
        # Read-only statements go to the replicas, everything else to the primary
        is_read = query.lstrip().upper().startswith(('SELECT', 'WITH'))
        src = self.read_pool if is_read else self.write_pool
        conn = src.getconn()
        try:
            with conn.cursor() as cur:
                cur.execute(query, params)
                return cur.fetchall() if cur.description else conn.commit()
        finally:
            src.putconn(conn)
4.4. Monitoring read distribution
-- On each replica, check query load
SELECT count(*)
FROM pg_stat_activity
WHERE state = 'active';
-- Track queries per replica
SELECT pg_stat_statements.query,
calls,
total_exec_time,
mean_exec_time
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 20;
5. Query Optimization
5.1. Enable pg_stat_statements
-- pg_stat_statements must be in shared_preload_libraries (restart required)
ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';
-- Restart required
sudo systemctl restart patroni
-- Create extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- View top queries by time
SELECT query,
calls,
total_exec_time,
mean_exec_time,
max_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
5.2. Identify slow queries
-- Currently running slow queries
SELECT pid,
now() - query_start AS duration,
state,
wait_event,
query
FROM pg_stat_activity
WHERE state = 'active'
AND now() - query_start > interval '10 seconds'
ORDER BY duration DESC;
-- Queries with high mean time
SELECT query,
calls,
mean_exec_time / 1000 AS mean_time_seconds,
(total_exec_time / 1000 / 3600) AS total_hours
FROM pg_stat_statements
WHERE mean_exec_time > 1000 -- > 1 second
ORDER BY mean_exec_time DESC
LIMIT 20;
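After a tuning change, resetting the counters makes before/after comparisons cleaner:
-- Clear pg_stat_statements so the next measurement starts fresh
SELECT pg_stat_statements_reset();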
5.3. EXPLAIN ANALYZE
-- Analyze query execution plan
EXPLAIN ANALYZE
SELECT u.*, o.order_date
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01';
-- Look for:
-- ❌ Sequential Scan (should be Index Scan)
-- ❌ High cost
-- ❌ High actual time
-- ❌ Rows mismatch (estimated vs actual)
5.4. Create indexes
-- Index for WHERE clause
CREATE INDEX CONCURRENTLY idx_users_created_at
ON users(created_at);
-- Index for JOIN
CREATE INDEX CONCURRENTLY idx_orders_user_id
ON orders(user_id);
-- Composite index
CREATE INDEX CONCURRENTLY idx_orders_user_date
ON orders(user_id, order_date);
-- Partial index (for filtered queries)
CREATE INDEX CONCURRENTLY idx_active_users
ON users(created_at)
WHERE status = 'active';
-- CONCURRENTLY = builds without blocking writes ✅
5.5. Index maintenance
-- Find unused indexes
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexname NOT LIKE '%_pkey'
ORDER BY pg_relation_size(indexrelid) DESC;
-- Find duplicate indexes
SELECT pg_size_pretty(SUM(pg_relation_size(idx))::BIGINT) AS size,
(array_agg(idx))[1] AS idx1,
(array_agg(idx))[2] AS idx2
FROM (
SELECT indexrelid::regclass AS idx,
indrelid,
(indcollation, indclass, indkey, indexprs, indpred) AS key
FROM pg_index
) sub
GROUP BY indrelid, key
HAVING COUNT(*) > 1
ORDER BY SUM(pg_relation_size(idx)) DESC;
-- Rebuild bloated indexes
REINDEX INDEX CONCURRENTLY idx_name;
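Index size is a useful first filter when deciding what to reindex; the largest indexes are usually where bloat hurts most:
SELECT schemaname, relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 10;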
6. Best Practices
✅ DO
- Start with conservative settings - Tune incrementally
- Monitor before and after - Measure impact of changes
- Use connection pooling - Essential for web applications
- Separate read and write traffic - Scale reads independently
- Create appropriate indexes - Based on query patterns
- Regular VACUUM - Keep table statistics updated
- Use EXPLAIN ANALYZE - Understand query execution
- Set statement_timeout - Prevent runaway queries
- Monitor pool saturation - Scale PgBouncer if needed
- Test configuration changes - In staging first
❌ DON'T
- Don't over-allocate work_mem - It multiplies across connections and operations!
- Don't create too many indexes - Slow down writes
- Don't ignore autovacuum - Will cause bloat
- Don't skip connection pooling - Connection overhead hurts
- Don't use session pooling - Transaction mode better
- Don't forget to analyze - Stale statistics = bad plans
- Don't tune blindly - Understand what you're changing
- Don't set shared_buffers too high - Much above 25% of RAM mostly double-buffers the OS cache
7. Lab Exercises
Lab 1: PostgreSQL tuning
Tasks:
- Benchmark current performance with pgbench
- Tune memory settings (shared_buffers, work_mem)
- Tune checkpoint settings
- Re-run pgbench and compare results
- Document improvements
Lab 2: Setup PgBouncer
Tasks:
- Install PgBouncer on primary node
- Configure transaction pooling
- Update application to use PgBouncer
- Monitor connection counts (before/after)
- Load test and measure improvement
Lab 3: HAProxy load balancing
Tasks:
- Install and configure HAProxy
- Setup write and read endpoints
- Test routing (write→primary, read→replicas)
- Simulate failover, verify HAProxy adapts
- Monitor traffic distribution
Lab 4: Query optimization
Tasks:
- Enable pg_stat_statements
- Run sample workload
- Identify top 10 slowest queries
- Use EXPLAIN ANALYZE to understand plans
- Create indexes to optimize
- Measure improvement
8. Summary
Performance Tuning Checklist
- Tune shared_buffers (25% RAM)
- Set effective_cache_size (50-75% RAM)
- Adjust work_mem carefully
- Optimize checkpoints
- Lower random_page_cost for SSD
- Enable pg_stat_statements
- Setup PgBouncer connection pooling
- Configure HAProxy load balancing
- Create indexes based on queries
- Monitor and iterate
Key Concepts
✅ Connection Pooling - Reduces connection overhead dramatically
✅ Load Balancing - Distributes read traffic across replicas
✅ Read Scaling - Add replicas to handle read load
✅ Query Optimization - Indexes + EXPLAIN ANALYZE
✅ Configuration Tuning - Balance memory, I/O, and CPU
Next Steps
Lesson 19 will cover Logging and Troubleshooting:
- PostgreSQL log analysis
- Patroni log interpretation
- etcd troubleshooting
- Common issues and solutions
- Debug techniques and tools