Lesson 9: Bootstrapping the Patroni Cluster

Bootstrapping a Patroni cluster for the first time: starting Patroni on 3 nodes, verifying cluster status and replication with patronictl, and testing basic failover.

XDEV ASIA

Objectives

After this lesson, you will be able to:

  • Understand the Patroni cluster bootstrap process
  • Start Patroni for the first time on 3 nodes
  • Check cluster status with patronictl
  • Verify that replication is working
  • Troubleshoot common issues
  • Test basic failover

1. Pre-Bootstrap Checklist

1.1. Verify prerequisites

Before starting Patroni, verify that all components are ready:

# ✅ etcd cluster healthy
etcdctl endpoint health --cluster
# All endpoints should be healthy

# ✅ PostgreSQL installed but NOT running
systemctl status postgresql
# Should be: inactive (dead)

# ✅ Patroni installed
patroni --version
# Should show: patroni 3.2.0+

# ✅ Config file exists and is valid
sudo -u postgres cat /etc/patroni/patroni.yml
python3 -c "import yaml; yaml.safe_load(open('/etc/patroni/patroni.yml'))"

# ✅ Data directory exists with correct permissions
ls -ld /var/lib/postgresql/18/data
# Owner: postgres:postgres, Permissions: drwx------

# ✅ Firewall rules
sudo ufw status | grep -E "(5432|8008)"
# Ports 5432, 8008 should be allowed
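
The checklist above can be wrapped in a small preflight harness so each item prints PASS or FAIL. This is a sketch: the `check` helper is illustrative, not part of Patroni, and the two sample items reuse paths from this series.

```shell
# check runs one verification command and reports PASS/FAIL (illustrative helper).
check() {   # usage: check "description" command [args...]
  desc=$1; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $desc"
  else
    echo "FAIL: $desc"
  fi
}

# Sample items from the checklist above:
check "patroni.yml is valid YAML" \
  python3 -c "import yaml; yaml.safe_load(open('/etc/patroni/patroni.yml'))"
check "data directory exists" test -d /var/lib/postgresql/18/data
```

The same pattern extends to the etcd, firewall, and service checks; a nonzero exit from any probed command is reported without aborting the whole script.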

1.2. Network connectivity test

Verify connectivity between the nodes:

# Test PostgreSQL port
nc -zv 10.0.1.11 5432
nc -zv 10.0.1.12 5432
nc -zv 10.0.1.13 5432

# Test Patroni REST API port
nc -zv 10.0.1.11 8008
nc -zv 10.0.1.12 8008
nc -zv 10.0.1.13 8008

# Test etcd port
nc -zv 10.0.1.11 2379
nc -zv 10.0.1.12 2379
nc -zv 10.0.1.13 2379

1.3. Clean data directories

If the data directory is not empty, clear it for a fresh start:

# WARNING: only do this on the very first bootstrap
sudo systemctl stop patroni
sudo rm -rf /var/lib/postgresql/18/data/*
sudo chown postgres:postgres /var/lib/postgresql/18/data

2. Understanding Bootstrap Process

2.1. Bootstrap flow

Step 1: Start Patroni on Node 1
   ↓
Node 1 checks DCS: No cluster exists
   ↓
Node 1 acquires initialize key
   ↓
Node 1 runs initdb
   ↓
Node 1 starts PostgreSQL as PRIMARY
   ↓
Node 1 creates replication user
   ↓
Node 1 stores cluster config in DCS
   ↓
Node 1 acquires leader lock

Step 2: Start Patroni on Node 2
   ↓
Node 2 checks DCS: Cluster exists
   ↓
Node 2 sees Node 1 is leader
   ↓
Node 2 runs pg_basebackup from Node 1
   ↓
Node 2 starts PostgreSQL as REPLICA
   ↓
Node 2 connects to Node 1 for replication

Step 3: Start Patroni on Node 3
   ↓
Node 3 checks DCS: Cluster exists
   ↓
Node 3 sees Node 1 is leader
   ↓
Node 3 runs pg_basebackup from Node 1
   ↓
Node 3 starts PostgreSQL as REPLICA
   ↓
Node 3 connects to Node 1 for replication

Final State:
┌─────────┐         ┌─────────┐         ┌─────────┐
│ Node 1  │────────→│ Node 2  │         │ Node 3  │
│ PRIMARY │         │ REPLICA │←────────│ REPLICA │
└─────────┘         └─────────┘         └─────────┘
   Leader              Streaming           Streaming

2.2. Race condition prevention

Patroni uses the DCS to prevent multiple nodes from initializing the cluster:

# In etcd
/service/postgres/initialize: "node1"  # First node acquires this
/service/postgres/leader: {...}        # Leader lock

If 2 nodes start simultaneously:

  • The faster node acquires the /initialize key
  • The other node sees the key already exists → waits and clones from the leader
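
The initialize key is an atomic "create if absent" (a compare-and-swap in etcd). The race can be illustrated in plain shell, using `noclobber` on a temp file as a stand-in for the DCS — an analogy only, not Patroni's actual etcd transaction:

```shell
# Analogy for the /initialize race: atomic create-if-absent.
# The temp-file path and function name are illustrative only.
KEY=/tmp/patroni_demo_initialize
rm -f "$KEY"   # clean slate for the demo

acquire_init_key() {
  # noclobber (set -C) makes the redirect fail if the file already exists,
  # so only one caller can ever create it — like etcd's create-if-absent.
  if ( set -C; echo "$1" > "$KEY" ) 2>/dev/null; then
    echo "$1: acquired initialize key -> runs initdb"
  else
    echo "$1: key held by $(cat "$KEY") -> waits and clones from leader"
  fi
}

acquire_init_key node1   # first caller wins
acquire_init_key node2   # sees the key exists
```

Whichever node runs the create first becomes the bootstrapper; every later node takes the "clone from leader" path, which is exactly the flow shown in section 2.1.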

3. Bootstrap Cluster - Step by Step

3.1. Start Patroni on Node 1

Terminal on Node 1:

# Start Patroni service
sudo systemctl start patroni

# Watch logs
sudo journalctl -u patroni -f

Expected logs:

INFO: No initialize key found in DCS
INFO: Trying to bootstrap a new cluster
INFO: Acquiring initialize key
INFO: Initializing a new cluster
INFO: Running initdb: /usr/lib/postgresql/18/bin/initdb ...
INFO: postmaster pid: 12345
INFO: PostgreSQL started
INFO: Running post_bootstrap script
INFO: Creating replication user
INFO: Lock owner: node1; I am node1
INFO: Leader election acquired
INFO: I am the leader with the lock

Verify Node 1:

# Check Patroni status
sudo systemctl status patroni
# Should be: active (running)

# Check PostgreSQL is running
ps aux | grep postgres
# Should see multiple postgres processes

# Check if it's PRIMARY
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
# pg_is_in_recovery
# ------------------
#  f                  ← false = PRIMARY

3.2. Verify in etcd

# Check the leader key — its value is the current leader's member name
etcdctl get /service/postgres/leader --print-value-only
# Output: node1

# Check a member key — it holds that node's status as JSON
etcdctl get /service/postgres/members/node1 --print-value-only | jq

# Output:
# {
#   "role": "master",
#   "state": "running",
#   "conn_url": "postgres://10.0.1.11:5432/postgres",
#   "api_url": "http://10.0.1.11:8008/patroni",
#   "xlog_location": 50331648,
#   "timeline": 1
# }

# List all member keys
etcdctl get /service/postgres/members/ --prefix --keys-only
# Should show node1

3.3. Start Patroni on Node 2

Terminal on Node 2:

# Start Patroni
sudo systemctl start patroni

# Watch logs
sudo journalctl -u patroni -f

Expected logs:

INFO: Cluster already initialized
INFO: Found leader: node1
INFO: Trying to clone from leader
INFO: Running: pg_basebackup -D /var/lib/postgresql/18/data ...
INFO: Basebackup completed
INFO: Starting PostgreSQL
INFO: postmaster pid: 12346
INFO: Configuring standby mode
INFO: Following new leader: node1
INFO: Replication established

Verify Node 2:

# Check if it's REPLICA
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
# pg_is_in_recovery
# ------------------
#  t                  ← true = REPLICA

# Check replication status
sudo -u postgres psql -c "SELECT * FROM pg_stat_wal_receiver;" -x

3.4. Start Patroni on Node 3

Terminal on Node 3:

# Start Patroni
sudo systemctl start patroni

# Watch logs
sudo journalctl -u patroni -f

Expected logs: Similar to Node 2.

Verify Node 3:

# Check replica status
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
# Should return: t (true)

4. Verify Cluster Status

4.1. Using patronictl

# List cluster members
patronictl -c /etc/patroni/patroni.yml list

# Output:
# + Cluster: postgres (7001234567890123456) ----+----+-----------+
# | Member | Host          | Role    | State   | TL | Lag in MB |
# +--------+---------------+---------+---------+----+-----------+
# | node1  | 10.0.1.11:5432| Leader  | running |  1 |           |
# | node2  | 10.0.1.12:5432| Replica | running |  1 |         0 |
# | node3  | 10.0.1.13:5432| Replica | running |  1 |         0 |
# +--------+---------------+---------+---------+----+-----------+

Column meanings:

  • Member: Node name
  • Host: Connection address
  • Role: Leader (primary) or Replica
  • State: running, streaming, in archive recovery
  • TL: Timeline (should be same for all)
  • Lag in MB: Replication lag
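
Scripts can assert on this output via `patronictl list -f json` — for example, that the cluster has exactly one Leader. The JSON below is a hand-written sample shaped like that output, not captured from a live cluster:

```shell
# Assert exactly one Leader in `patronictl list -f json` output.
# In a real check, replace the sample with:
#   sample=$(patronictl -c /etc/patroni/patroni.yml list -f json)
sample='[
 {"Member": "node1", "Role": "Leader",  "State": "running",   "TL": 1},
 {"Member": "node2", "Role": "Replica", "State": "streaming", "TL": 1},
 {"Member": "node3", "Role": "Replica", "State": "streaming", "TL": 1}
]'
leaders=$(printf '%s' "$sample" | grep -o '"Role": "Leader"' | wc -l)
if [ "$leaders" -eq 1 ]; then
  echo "OK: exactly one leader"
else
  echo "ALERT: $leaders leaders"
fi
```

Zero leaders means the cluster is leaderless (an election is pending or auto-failover is paused); anything other than one should page someone.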

4.2. Check topology

patronictl -c /etc/patroni/patroni.yml topology postgres

# Output shows replication tree

4.3. Using REST API

# Check node1 (primary)
curl -s http://10.0.1.11:8008/ | jq

# Output:
# {
#   "state": "running",
#   "postmaster_start_time": "2024-11-24 10:30:15.123+00",
#   "role": "master",
#   "server_version": 180000,
#   "cluster_unlocked": false,
#   "xlog": {
#     "location": 50331648
#   },
#   "timeline": 1,
#   "database_system_identifier": "7001234567890123456"
# }

# Check node2 (replica)
curl -s http://10.0.1.12:8008/ | jq

# Check node3 (replica)
curl -s http://10.0.1.13:8008/ | jq

4.4. Check replication from PostgreSQL

On primary (node1):

sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;" -x

Output:

-[ RECORD 1 ]----+------------------------------
pid              | 12350
usesysid         | 16384
usename          | replicator
application_name | node2
client_addr      | 10.0.1.12
client_hostname  | 
client_port      | 45678
backend_start    | 2024-11-24 10:31:00.123+00
backend_xmin     | 
state            | streaming
sent_lsn         | 0/3000000
write_lsn        | 0/3000000
flush_lsn        | 0/3000000
replay_lsn       | 0/3000000
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_state       | async
sync_priority    | 0
reply_time       | 2024-11-24 10:35:00.456+00

-[ RECORD 2 ]----+------------------------------
pid              | 12351
usesysid         | 16384
usename          | replicator
application_name | node3
...

On replicas (node2, node3):

sudo -u postgres psql -c "SELECT status, written_lsn, flushed_lsn, latest_end_lsn FROM pg_stat_wal_receiver;" -x

4.5. Verify replication lag

# On primary
sudo -u postgres psql -c "
SELECT 
  application_name,
  client_addr,
  state,
  pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes,
  replay_lag
FROM pg_stat_replication;
"

# Output:
# application_name | client_addr | state     | lag_bytes | replay_lag
# -----------------+-------------+-----------+-----------+------------
# node2            | 10.0.1.12   | streaming |         0 | 
# node3            | 10.0.1.13   | streaming |         0 | 
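
pg_wal_lsn_diff() subtracts the byte positions that two LSNs encode: an LSN such as 0/3000000 is <high 32 bits>/<low 32 bits> in hex, so its byte position is high × 2^32 + low. The same arithmetic in shell (`lsn_to_bytes` is an illustrative helper, not a PostgreSQL function):

```shell
# lsn_to_bytes: byte position encoded by an LSN of the form <hi hex>/<lo hex>.
lsn_to_bytes() {
  hi=${1%%/*}
  lo=${1##*/}
  echo $(( 0x$hi * 4294967296 + 0x$lo ))   # high * 2^32 + low
}

lsn_to_bytes 0/3000000        # 50331648 bytes
# Lag = primary's current LSN minus the replica's replay LSN:
echo $(( $(lsn_to_bytes 0/3000A00) - $(lsn_to_bytes 0/3000000) ))   # 2560 bytes behind
```

This is also why the "Lag in MB" column of patronictl and the "xlog_location" in the member JSON agree: both are derived from the same byte positions.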

5. Test Basic Operations

5.1. Create test database and table

On the primary (node1):

# Create database
sudo -u postgres psql -h 10.0.1.11 -c "CREATE DATABASE testdb;"

# Create table with data
sudo -u postgres psql -h 10.0.1.11 -d testdb << EOF
CREATE TABLE test_table (
  id SERIAL PRIMARY KEY,
  data TEXT,
  created_at TIMESTAMP DEFAULT NOW()
);

INSERT INTO test_table (data) 
SELECT 'Test data ' || i 
FROM generate_series(1, 1000) AS i;
EOF

5.2. Verify replication

On replica (node2 or node3):

# Check data replicated
sudo -u postgres psql -h 10.0.1.12 -d testdb -c "SELECT COUNT(*) FROM test_table;"
# Should return: 1000

# Try to write (should fail on replica)
sudo -u postgres psql -h 10.0.1.12 -d testdb -c "INSERT INTO test_table (data) VALUES ('test');"
# ERROR:  cannot execute INSERT in a read-only transaction

5.3. Test continuous replication

Terminal 1 (primary - node1):

# Insert data continuously
while true; do
  sudo -u postgres psql -h 10.0.1.11 -d testdb -c \
    "INSERT INTO test_table (data) VALUES ('Data at ' || NOW());"
  sleep 1
done

Terminal 2 (replica - node2):

# Watch count increase
watch -n 1 "sudo -u postgres psql -h 10.0.1.12 -d testdb -t -c 'SELECT COUNT(*) FROM test_table;'"

Data should increase every second → Replication working!

6. Common Bootstrap Issues

6.1. Issue: Patroni won't start

Symptoms:

sudo systemctl status patroni
# Failed to start

Check logs:

sudo journalctl -u patroni -n 50 --no-pager

Common causes & solutions:

A. Config file syntax error

ERROR: Error parsing config file

Solution:

# Validate YAML
python3 -c "import yaml; yaml.safe_load(open('/etc/patroni/patroni.yml'))"

# Common issues:
# - Mixed tabs and spaces (use spaces only)
# - Incorrect indentation
# - Missing quotes around special characters

B. Cannot connect to etcd

ERROR: Failed to connect to etcd

Solution:

# Check etcd is running
etcdctl endpoint health

# Check etcd endpoints in patroni.yml
grep "hosts:" /etc/patroni/patroni.yml

# Test connectivity
curl http://10.0.1.11:2379/version

C. Permission denied on data directory

ERROR: data directory has wrong ownership

Solution:

sudo chown -R postgres:postgres /var/lib/postgresql/18/data
sudo chmod 700 /var/lib/postgresql/18/data

D. Port already in use

ERROR: could not bind IPv4 address "0.0.0.0": Address already in use

Solution:

# Check what's using port 5432
sudo lsof -i :5432

# Stop PostgreSQL if running
sudo systemctl stop postgresql

# Kill process if needed
sudo pkill -9 postgres

6.2. Issue: Cluster won't initialize

Symptoms: Patroni starts but does not initialize the cluster.

Check logs:

sudo journalctl -u patroni -f

Common causes:

A. Data directory not empty

INFO: Data directory is not empty

Solution:

# Backup old data if needed
sudo mv /var/lib/postgresql/18/data /var/lib/postgresql/18/data.bak

# Create fresh directory
sudo mkdir -p /var/lib/postgresql/18/data
sudo chown postgres:postgres /var/lib/postgresql/18/data
sudo chmod 700 /var/lib/postgresql/18/data

# Restart Patroni
sudo systemctl restart patroni

B. Initialize key stuck in etcd

INFO: Another node is initializing

Solution:

# Check initialize key
etcdctl get /service/postgres/initialize

# If stuck, delete it
etcdctl del /service/postgres/initialize

# Restart Patroni
sudo systemctl restart patroni

6.3. Issue: Replica cannot clone from primary

Symptoms: Node 2 or Node 3 cannot complete pg_basebackup.

Check logs:

sudo journalctl -u patroni -n 100 | grep -i basebackup

Common causes:

A. Network connectivity

ERROR: could not connect to server

Solution:

# Test connectivity
telnet 10.0.1.11 5432

# Check firewall
sudo ufw status
sudo ufw allow from 10.0.1.0/24 to any port 5432

B. Authentication failed

ERROR: FATAL: password authentication failed for user "replicator"

Solution:

# Verify replication user exists on primary
sudo -u postgres psql -h 10.0.1.11 -c "\du replicator"

# Check pg_hba.conf allows replication
sudo -u postgres psql -h 10.0.1.11 -c "SHOW hba_file;"
# Then check the file

# Verify password matches in patroni.yml
grep -A2 "replication:" /etc/patroni/patroni.yml

C. Insufficient space

ERROR: No space left on device

Solution:

# Check disk space
df -h /var/lib/postgresql

# Clean up if needed
sudo du -sh /var/lib/postgresql/* | sort -h

6.4. Issue: Nodes have different timelines

Symptoms:

patronictl list
# node1: TL=1
# node2: TL=2  ← Different!

Solution:

# Reinitialize diverged node
patronictl reinit postgres node2

# Or manually
sudo systemctl stop patroni
sudo rm -rf /var/lib/postgresql/18/data/*
sudo systemctl start patroni

7. Enable Auto-start on Boot

# Enable Patroni service
sudo systemctl enable patroni

# Verify
systemctl is-enabled patroni
# Output: enabled

# Test reboot (optional)
sudo reboot

# After reboot, check cluster
patronictl list

8. Basic Cluster Management

8.1. Restart a node

# Graceful restart
patronictl restart postgres node2

# Force restart
patronictl restart postgres node2 --force

8.2. Reload configuration

# Reload Patroni config (non-PostgreSQL settings)
sudo systemctl reload patroni

# Reload PostgreSQL config
patronictl reload postgres node1

8.3. Pause/Resume auto-failover

# Pause (disable auto-failover)
patronictl pause postgres

# Resume (enable auto-failover)
patronictl resume postgres

8.4. Show configuration

# Show current DCS configuration
patronictl show-config postgres

9. Test Automatic Failover (Optional)

WARNING: Only test this in a non-production environment!

9.1. Simulate primary failure

# On node1 (current primary)
sudo systemctl stop patroni

# Or kill PostgreSQL
sudo pkill -9 postgres

9.2. Watch cluster failover

# On node2 or node3
watch -n 1 "patronictl list"

# Timeline:
# T+0s: node1 is Leader
# T+10s: node1 not responding
# T+30s: Leader lock expires
# T+35s: node2 or node3 becomes Leader
# T+40s: Cluster operational with new Leader
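
This timeline follows from Patroni's timing parameters. With the defaults ttl=30 and loop_wait=10, a rough worst-case window before a new leader appears can be sketched as:

```shell
# Rough worst-case failover window from Patroni's timing knobs
# (values shown are the Patroni defaults; adjust to your DCS config).
ttl=30        # leader lock lifetime in seconds
loop_wait=10  # seconds between HA-loop iterations
# The lock can survive up to ttl seconds after its last renewal, and a
# candidate only notices the expired lock on its next loop iteration:
worst_case=$(( ttl + loop_wait ))
echo "new leader elected within ~${worst_case}s"
```

Lowering ttl and loop_wait shortens failover but increases DCS traffic and the risk of spurious demotions on a flaky network.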

9.3. Verify new primary

patronictl list

# New output:
# + Cluster: postgres ----+----+-----------+
# | Member | Host    | Role    | State   | TL | Lag in MB |
# +--------+---------+---------+---------+----+-----------+
# | node1  | 10.0.1.11| Replica | stopped |  1 |           |
# | node2  | 10.0.1.12| Leader  | running |  2 |           |  ← New primary
# | node3  | 10.0.1.13| Replica | running |  2 |         0 |
# +--------+---------+---------+---------+----+-----------+

Note: Timeline increased from 1 → 2 (indicates failover occurred).

9.4. Rejoin old primary

# Start node1 again
sudo systemctl start patroni

# Patroni rejoins node1 as a replica (running pg_rewind first if enabled and needed)
patronictl list

# Output:
# | node1  | 10.0.1.11| Replica | running |  2 |         0 |  ← Rejoined
# | node2  | 10.0.1.12| Leader  | running |  2 |           |
# | node3  | 10.0.1.13| Replica | running |  2 |         0 |

10. Lab Exercise

Lab 1: Bootstrap and verify

Tasks:

1. Start Patroni on all 3 nodes, in order
2. Verify the cluster with patronictl list
3. Check replication status
4. Create a test database and verify that the data replicates

Lab 2: Test replication lag

Tasks:

1. Insert 10,000 rows on the primary
2. Measure replication lag on the replicas
3. Monitor pg_stat_replication

Lab 3: Simulate node failure

Tasks:

1. Stop the primary node
2. Watch the automatic failover
3. Verify that a new primary is elected
4. Rejoin the old primary
5. Verify that all nodes are healthy

11. Summary

Key Takeaways

✅ Bootstrap: First node initializes, others clone

✅ Leader election: Automatic, DCS-based

✅ Replication: Automatic setup via pg_basebackup

✅ patronictl: Primary management tool

✅ Monitoring: Check via patronictl, REST API, pg_stat_replication

✅ Failover: Automatic when the primary fails

Post-Bootstrap Checklist

  •  All 3 nodes showing in patronictl list
  •  1 Leader, 2 Replicas
  •  All nodes same Timeline
  •  Replication lag = 0 MB
  •  Test data replicates to all nodes
  •  REST API responding on all nodes
  •  Patroni enabled for auto-start
  •  etcd cluster healthy

Current architecture

✅ 3 VMs prepared (Lesson 4)
✅ PostgreSQL 18 installed (Lesson 5)
✅ etcd cluster running (Lesson 6)
✅ Patroni installed (Lesson 7)
✅ Patroni configured (Lesson 8)
✅ Cluster bootstrapped (Lesson 9)

Next: Advanced replication management

Preparing for Lesson 10

Lesson 10 will dive deeper into Replication Management:

  • Synchronous vs Asynchronous replication
  • Configure sync mode
  • Monitor replication lag
  • Handle replication issues
