Lesson 12: Patroni REST API
Use the Patroni REST API endpoints, master patronictl commands, and automate cluster management through the CLI and the API.
Objectives
After this lesson, you will be able to:
- Understand the Patroni REST API and its endpoints
- Use the REST API for health checks
- Integrate with load balancers (HAProxy, Nginx)
- Query cluster status and configuration
- Implement custom monitoring
- Secure REST API endpoints
1. REST API Overview
1.1. What is the REST API?
Patroni exposes an HTTP REST API on each node for:
- 🔍 Health checks: Load balancers check node health
- 📊 Monitoring: External systems query cluster state
- ⚙️ Management: Read configuration and cluster topology
- 🔄 Automation: Integration with CI/CD and orchestration tools
1.2. API Configuration
In patroni.yml:
restapi:
  listen: 0.0.0.0:8008              # Listen address and port
  connect_address: 10.0.1.11:8008   # Advertised address
  # Optional: Basic authentication
  # authentication:
  #   username: admin
  #   password: secret_password
  # Optional: SSL/TLS
  # certfile: /etc/patroni/certs/server.crt
  # keyfile: /etc/patroni/certs/server.key
  # cafile: /etc/patroni/certs/ca.crt
Default port: 8008
1.3. API Endpoints Overview
| Endpoint | Method | Purpose | Use Case |
|---|---|---|---|
| / | GET | Same check as /primary | LB primary routing |
| /primary or /master | GET | Check if node is primary | LB primary routing |
| /replica | GET | Check if node is a running replica | LB read routing |
| /read-write | GET | Check if writable (same check as /primary) | LB write routing |
| /read-only or /standby | GET | Check if node can serve reads (running replica, or the primary) | LB read routing |
| /synchronous | GET | Check if synchronous replica | Sync replica detection |
| /asynchronous | GET | Check if asynchronous replica | Async replica detection |
| /health | GET | Check that PostgreSQL is up and running | Monitoring |
| /patroni | GET | Detailed cluster and node info | Advanced monitoring |
| /config | GET | Cluster configuration from DCS | Config inspection |
| /cluster | GET | All cluster members info | Topology view |
| /history | GET | Failover history | Audit log |
2. Health Check Endpoints
2.1. Basic health check: GET /
Purpose: Patroni evaluates GET / exactly like GET /primary, so it returns 200 only on the primary. Every health check response, whatever its status code, carries a JSON document describing the node.
curl -s http://10.0.1.11:8008/
# Response on PRIMARY:
# HTTP 200 OK
# {
#   "state": "running",
#   "postmaster_start_time": "2024-11-25 10:30:00.123456+00:00",
#   "role": "master",
#   "server_version": 180000,
#   "cluster_unlocked": false,
#   "xlog": {
#     "location": 67108864
#   },
#   "timeline": 1,
#   "database_system_identifier": "7001234567890123456",
#   "patroni": {
#     "version": "3.2.0",
#     "scope": "postgres"
#   }
# }
# Response on REPLICA:
# HTTP 503 Service Unavailable (the JSON body is still returned)
# {
#   "state": "running",
#   "postmaster_start_time": "2024-11-25 10:31:15.789012+00:00",
#   "role": "replica",
#   "server_version": 180000,
#   "cluster_unlocked": false,
#   "xlog": {
#     "received_location": 67108864,
#     "replayed_location": 67108864
#   },
#   "timeline": 1,
#   "database_system_identifier": "7001234567890123456",
#   "patroni": {
#     "version": "3.2.0",
#     "scope": "postgres"
#   }
# }
# Note: Patroni 3.x reports the leader role as "master"; Patroni 4.x reports "primary".
Response codes:
- 200 OK: Node is running as the primary and holds the leader lock
- 503 Service Unavailable: Node is not the primary (it is a replica, or PostgreSQL is down). To check only that PostgreSQL is running, use /health.
2.2. Primary check: GET /primary or /master
Purpose: Check if node is current primary/leader.
curl -s http://10.0.1.11:8008/primary
# On PRIMARY:
# HTTP 200 OK
# {
# "state": "running",
# "role": "master",
# "xlog": {
# "location": 67108864
# }
# }
# On REPLICA:
# HTTP 503 Service Unavailable
# (the body still contains the node's JSON status, e.g. "role": "replica")
Use case: Load balancer health check for write traffic routing.
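Because a non-primary node answers /primary with 503 plus a JSON body, an HTTP client has to treat 503 as a valid answer rather than as a failure. The following is a minimal sketch using only the Python standard library; the helper names `parse_check` and `check_endpoint` are illustrative and not part of Patroni, and the node address is the one used throughout this lesson.

```python
import json
import urllib.error
import urllib.request


def parse_check(status_code, body):
    """Turn a raw health-check result into (ok, info_dict)."""
    try:
        info = json.loads(body) if body else {}
    except ValueError:
        # Body was not JSON (e.g. a proxy error page); keep only the status.
        info = {}
    return status_code == 200, info


def check_endpoint(base_url, endpoint="/primary", timeout=3.0):
    """GET a Patroni health endpoint; a 503 is an answer, not an error."""
    try:
        with urllib.request.urlopen(base_url + endpoint, timeout=timeout) as resp:
            return parse_check(resp.status, resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, but the response body is still readable.
        return parse_check(err.code, err.read().decode("utf-8"))
    except (urllib.error.URLError, OSError):
        # Node unreachable or timed out.
        return False, {}
```

Calling `check_endpoint("http://10.0.1.11:8008", "/primary")` returns a pair such as `(True, {...})` on the primary, and `(False, {...})` with the replica's status document on a replica.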
2.3. Replica check: GET /replica
Purpose: Check if node is replica (standby).
curl -s http://10.0.1.12:8008/replica
# On REPLICA:
# HTTP 200 OK
# {
# "state": "running",
# "role": "replica",
# "xlog": {
# "received_location": 67108864,
# "replayed_location": 67108864
# }
# }
# On PRIMARY:
# HTTP 503 Service Unavailable
Use case: Load balancer health check for read traffic routing.
2.4. Read-write check: GET /read-write
Purpose: Check if the node accepts writes. Patroni evaluates /read-write exactly like /primary.
curl -s http://10.0.1.11:8008/read-write
# Returns 200 if:
# - Node is running as primary
# - Node holds the leader lock
2.5. Read-only check: GET /read-only or /standby
Purpose: Check if node can serve read-only queries.
curl -s http://10.0.1.12:8008/read-only
# Returns 200 if:
# - Node is a running replica without the noloadbalance tag
# - or node is the primary (/read-only also includes the primary; use /replica to exclude it)
Advanced: Lag tolerance (the lag parameter is documented for /replica and /asynchronous):
# Check replica with max 1MB lag tolerance
curl -s "http://10.0.1.12:8008/replica?lag=1048576"
# Returns 503 if lag > 1MB
2.6. Synchronous replica check: GET /synchronous
Purpose: Check if node is synchronous replica.
curl -s http://10.0.1.12:8008/synchronous
# Returns 200 if:
# - Node is replica
# - sync_state = 'sync' (from pg_stat_replication)
2.7. Asynchronous replica check: GET /asynchronous
Purpose: Check if node is asynchronous replica.
curl -s http://10.0.1.13:8008/asynchronous
# Returns 200 if:
# - Node is replica
# - sync_state != 'sync'
2.8. Health endpoint: GET /health
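Purpose: Check that PostgreSQL is up and running, regardless of role. Returns 200 on any node whose PostgreSQL is running and 503 otherwise; the body is the node's JSON status.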
curl -s http://10.0.1.11:8008/health | jq
# Response:
# {
# "state": "running",
# "role": "master",
# "server_version": 180000,
# "cluster_unlocked": false,
# "timeline": 1,
# "database_system_identifier": "7001234567890123456",
# "postmaster_start_time": "2024-11-25 10:30:00.123456+00:00",
# "patroni": {
# "version": "3.2.0",
# "scope": "postgres",
# "name": "node1"
# },
# "replication": [
# {
# "usename": "replicator",
# "application_name": "node2",
# "client_addr": "10.0.1.12",
# "state": "streaming",
# "sync_state": "sync",
# "sync_priority": 1
# },
# {
# "usename": "replicator",
# "application_name": "node3",
# "client_addr": "10.0.1.13",
# "state": "streaming",
# "sync_state": "async",
# "sync_priority": 0
# }
# ]
# }
3. Cluster Information Endpoints
3.1. Detailed node info: GET /patroni
Purpose: Comprehensive node and cluster information.
curl -s http://10.0.1.11:8008/patroni | jq
# Response (truncated):
# {
# "state": "running",
# "postmaster_start_time": "2024-11-25 10:30:00.123456+00:00",
# "role": "master",
# "server_version": 180000,
# "xlog": {
# "location": 67108864
# },
# "timeline": 1,
# "cluster_unlocked": false,
# "database_system_identifier": "7001234567890123456",
# "patroni": {
# "version": "3.2.0",
# "scope": "postgres",
# "name": "node1"
# },
# "dcs": {
# "last_seen": 1700912345,
# "ttl": 30
# },
# "tags": {
# "nofailover": false,
# "noloadbalance": false,
# "clonefrom": false,
# "nosync": false
# },
# "pending_restart": false,
# "replication": [...],
# "timeline_history": [...]
# }
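The `xlog` fields in these documents are byte positions in the WAL, so an approximate replay lag can be derived by comparing the primary's `location` with a replica's `replayed_location`. A minimal sketch, assuming the two /patroni documents have already been fetched and parsed; the function names are illustrative, and because the documents are fetched at slightly different moments the result is only an estimate.

```python
def wal_position(info):
    """Return a node's WAL position in bytes from a parsed /patroni document."""
    xlog = info.get("xlog", {})
    if "location" in xlog:
        # Primary: current write position.
        return xlog["location"]
    # Replica: last replayed position (None if the field is absent).
    return xlog.get("replayed_location")


def replay_lag_bytes(primary_info, replica_info):
    """Approximate replay lag in bytes, or None if a position is missing."""
    primary_pos = wal_position(primary_info)
    replica_pos = wal_position(replica_info)
    if primary_pos is None or replica_pos is None:
        return None
    # Clamp at zero: the replica document may be newer than the primary one.
    return max(0, primary_pos - replica_pos)


primary = {"role": "master", "xlog": {"location": 67108864}}
replica = {"role": "replica",
           "xlog": {"received_location": 67108864, "replayed_location": 66060288}}
print(replay_lag_bytes(primary, replica))  # 1048576 bytes, i.e. 1 MB behind
```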
3.2. Cluster configuration: GET /config
Purpose: Get cluster-wide configuration from DCS.
curl -s http://10.0.1.11:8008/config | jq
# Response:
# {
# "ttl": 30,
# "loop_wait": 10,
# "retry_timeout": 10,
# "maximum_lag_on_failover": 1048576,
# "synchronous_mode": true,
# "synchronous_mode_strict": false,
# "postgresql": {
# "parameters": {
# "max_connections": 100,
# "shared_buffers": "256MB",
# "wal_level": "replica",
# "max_wal_senders": 10,
# "max_replication_slots": 10,
# "hot_standby": "on"
# },
# "use_pg_rewind": true,
# "use_slots": true
# }
# }
3.3. Cluster members: GET /cluster
Purpose: Get information about all cluster members.
curl -s http://10.0.1.11:8008/cluster | jq
# Response:
# {
# "members": [
# {
# "name": "node1",
# "role": "leader",
# "state": "running",
# "api_url": "http://10.0.1.11:8008/patroni",
# "host": "10.0.1.11",
# "port": 5432,
# "timeline": 1,
# "lag": 0
# },
# {
# "name": "node2",
# "role": "sync_standby",
# "state": "running",
# "api_url": "http://10.0.1.12:8008/patroni",
# "host": "10.0.1.12",
# "port": 5432,
# "timeline": 1,
# "lag": 0
# },
# {
# "name": "node3",
# "role": "replica",
# "state": "running",
# "api_url": "http://10.0.1.13:8008/patroni",
# "host": "10.0.1.13",
# "port": 5432,
# "timeline": 1,
# "lag": 0
# }
# ],
# "scope": "postgres"
# }
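A /cluster document like the one above is convenient for a one-shot topology check because a single request covers every member. The sketch below summarizes it; `summarize_cluster` is a hypothetical helper, the default threshold simply mirrors the `maximum_lag_on_failover` value from the /config example, and treating a non-numeric `lag` as lagging is an assumption made for safety.

```python
def summarize_cluster(cluster, max_lag_bytes=1048576):
    """Summarize a parsed /cluster document: leader, replicas, lagging members."""
    members = cluster.get("members", [])
    leaders = [m["name"] for m in members if m.get("role") == "leader"]
    replicas = [m["name"] for m in members if m.get("role") != "leader"]
    lagging = []
    for member in members:
        if member.get("role") == "leader":
            continue
        lag = member.get("lag")
        # Anything that is not an integer (missing, null, a text marker)
        # is treated as lagging rather than silently ignored.
        if not isinstance(lag, int) or lag > max_lag_bytes:
            lagging.append(member["name"])
    return {
        # Exactly one leader is expected; anything else is reported as None.
        "leader": leaders[0] if len(leaders) == 1 else None,
        "replicas": replicas,
        "lagging": lagging,
    }


sample = {
    "members": [
        {"name": "node1", "role": "leader", "state": "running", "lag": 0},
        {"name": "node2", "role": "sync_standby", "state": "running", "lag": 0},
        {"name": "node3", "role": "replica", "state": "running", "lag": 2097152},
    ],
    "scope": "postgres",
}
print(summarize_cluster(sample))
```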
3.4. Failover history: GET /history
Purpose: Get cluster failover/switchover history.
curl -s http://10.0.1.11:8008/history | jq
# Response:
# [
# [
# 1, // Timeline
# 67108864, // LSN
# "no recovery target specified",
# "2024-11-25T10:30:00+00:00"
# ],
# [
# 2,
# 134217728,
# "no recovery target specified",
# "2024-11-25T11:45:30+00:00"
# ]
# ]
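Each history entry is a positional list, and the LSN is given as a plain byte offset. The sketch below turns the rows into dictionaries and renders the LSN in the conventional high/low hexadecimal form; the function names are illustrative, and the field order is taken from the annotated example above.

```python
def format_lsn(lsn):
    """Render a byte offset as an LSN string (high 32 bits / low 32 bits, hex)."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


def parse_history(history):
    """Convert /history rows into dictionaries with named fields."""
    entries = []
    for row in history:
        entries.append({
            "timeline": row[0],
            "lsn": format_lsn(row[1]),
            "reason": row[2],
            # The timestamp is read defensively in case a row lacks it.
            "timestamp": row[3] if len(row) > 3 else None,
        })
    return entries


history = [
    [1, 67108864, "no recovery target specified", "2024-11-25T10:30:00+00:00"],
    [2, 134217728, "no recovery target specified", "2024-11-25T11:45:30+00:00"],
]
for entry in parse_history(history):
    print(entry["timeline"], entry["lsn"], entry["timestamp"])
```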
4. Load Balancer Integration
4.1. HAProxy configuration
haproxy.cfg:
global
    log /dev/log local0
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000

# Stats page
listen stats
    bind *:7000
    stats enable
    stats uri /stats
    stats refresh 10s
    stats auth admin:password

# Primary/Write endpoint
listen postgres-primary
    bind *:5000
    mode tcp
    option tcplog
    # Health check via Patroni REST API. An HTTP check matches on the status
    # code, so it works whether Patroni answers with HTTP/1.0 or HTTP/1.1.
    option httpchk GET /primary
    http-check expect status 200
    # shutdown-sessions drops clients still connected to a demoted primary
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 10.0.1.11:5432 check port 8008
    server node2 10.0.1.12:5432 check port 8008
    server node3 10.0.1.13:5432 check port 8008

# Replicas/Read-only endpoint
listen postgres-replicas
    bind *:5001
    mode tcp
    option tcplog
    balance roundrobin
    # Health check via Patroni REST API (HTTP status check on port 8008)
    option httpchk GET /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2
    server node1 10.0.1.11:5432 check port 8008
    server node2 10.0.1.12:5432 check port 8008
    server node3 10.0.1.13:5432 check port 8008

# Read-write endpoint (primary only)
listen postgres-read-write
    bind *:5002
    mode tcp
    option tcplog
    # Health check via Patroni REST API (HTTP status check on port 8008)
    option httpchk GET /read-write
    http-check expect status 200
    default-server inter 3s fall 3 rise 2
    server node1 10.0.1.11:5432 check port 8008
    server node2 10.0.1.12:5432 check port 8008
    server node3 10.0.1.13:5432 check port 8008

# Read-only endpoint (replicas, plus the primary)
listen postgres-read-only
    bind *:5003
    mode tcp
    option tcplog
    balance leastconn
    # Health check via Patroni REST API (HTTP status check on port 8008)
    option httpchk GET /read-only
    http-check expect status 200
    default-server inter 3s fall 3 rise 2
    server node1 10.0.1.11:5432 check port 8008
    server node2 10.0.1.12:5432 check port 8008
    server node3 10.0.1.13:5432 check port 8008
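The `default-server inter 3s fall 3 rise 2` setting determines how quickly HAProxy reacts to a role change. The arithmetic can be sketched as below; `detection_window` is a hypothetical helper, and the figures are rough bounds that ignore check timeouts and the time Patroni itself needs to promote a new primary.

```python
def detection_window(inter_s, fall, rise):
    """Rough reaction times (seconds) for HAProxy health-check settings.

    inter_s: seconds between checks
    fall:    consecutive failed checks before a server is marked down
    rise:    consecutive successful checks before it is marked up again
    """
    return {
        "mark_down_after_s": inter_s * fall,
        "mark_up_after_s": inter_s * rise,
    }


# Values from "default-server inter 3s fall 3 rise 2"
print(detection_window(3, 3, 2))
```

With these values, the old primary leaves the write pool after about 9 seconds of failed checks and the new primary joins after about 6 seconds of successful ones. Lowering `inter` shortens both windows at the cost of more requests against port 8008.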
Install and start HAProxy:
# Install
sudo apt install -y haproxy
# Configure
sudo nano /etc/haproxy/haproxy.cfg
# (paste config above)
# Validate config
sudo haproxy -c -f /etc/haproxy/haproxy.cfg
# Start
sudo systemctl restart haproxy
sudo systemctl enable haproxy
# Check status
sudo systemctl status haproxy
Test HAProxy:
# Connect to primary (port 5000)
psql -h haproxy_host -p 5000 -U app_user -d myapp -c "SELECT pg_is_in_recovery();"
# Should return: f (false = primary)
# Connect to replica (port 5001)
psql -h haproxy_host -p 5001 -U app_user -d myapp -c "SELECT pg_is_in_recovery();"
# Should return: t (true = replica)
# View HAProxy stats (the stats page is protected by "stats auth")
curl -u admin:password http://haproxy_host:7000/stats
# Or open in browser: http://haproxy_host:7000/stats
4.2. Nginx (with stream module)
nginx.conf:
stream {
    # Upstream for primary
    upstream postgres_primary {
        least_conn;
        server 10.0.1.11:5432 max_fails=3 fail_timeout=10s;
        server 10.0.1.12:5432 max_fails=3 fail_timeout=10s backup;
        server 10.0.1.13:5432 max_fails=3 fail_timeout=10s backup;
    }

    # Upstream for replicas
    upstream postgres_replicas {
        least_conn;
        server 10.0.1.11:5432 max_fails=3 fail_timeout=10s;
        server 10.0.1.12:5432 max_fails=3 fail_timeout=10s;
        server 10.0.1.13:5432 max_fails=3 fail_timeout=10s;
    }

    # Primary endpoint
    server {
        listen 5000;
        proxy_pass postgres_primary;
        proxy_connect_timeout 5s;
        proxy_timeout 300s;
    }

    # Replicas endpoint
    server {
        listen 5001;
        proxy_pass postgres_replicas;
        proxy_connect_timeout 5s;
        proxy_timeout 300s;
    }
}
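Note: The open-source Nginx stream module only performs passive TCP checks and cannot query the Patroni REST API, so on its own it cannot tell the primary from a replica. Keep the upstream lists in sync with an external script that polls the REST API, or use HAProxy instead.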
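One possible shape for such an external script is sketched below: it picks the node whose /primary check returned 200 and renders an upstream block for it. Both function names are hypothetical, the status codes are assumed to have been collected beforehand (for example with curl), and writing the result to an include file and reloading Nginx are left out.

```python
def pick_primary(status_by_host):
    """Return the single host whose /primary check returned 200, else None."""
    primaries = [host for host, code in status_by_host.items() if code == 200]
    # Zero or several candidates means the state is ambiguous: change nothing.
    return primaries[0] if len(primaries) == 1 else None


def render_primary_upstream(primary_host, port=5432):
    """Render an Nginx stream upstream block that contains only the primary."""
    return (
        "upstream postgres_primary {\n"
        f"    server {primary_host}:{port} max_fails=3 fail_timeout=10s;\n"
        "}\n"
    )


statuses = {"10.0.1.11": 503, "10.0.1.12": 200, "10.0.1.13": 503}
primary = pick_primary(statuses)
if primary is not None:
    print(render_primary_upstream(primary))
```

Returning `None` when the answer is ambiguous keeps the previous configuration in place instead of pointing write traffic at a guess.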
4.3. Health check script for external LB
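Wrapper script for load balancers or schedulers that run an external command (managed cloud load balancers can usually send their HTTP health check straight to port 8008 instead):
#!/bin/bash
# /usr/local/bin/patroni_health_check.sh
# Usage: patroni_health_check.sh NODE_IP [PORT] [ENDPOINT]
NODE_IP="${1:?usage: $0 NODE_IP [PORT] [ENDPOINT]}"
PORT="${2:-8008}"
ENDPOINT="${3:-/primary}" # or /replica
# No "set -e" here: curl exits non-zero when the node is unreachable, and that
# case must still reach the "Unhealthy" branch below (curl then prints 000).
RESPONSE=$(curl -s -o /dev/null --max-time 5 -w "%{http_code}" "http://${NODE_IP}:${PORT}${ENDPOINT}")
if [ "$RESPONSE" = "200" ]; then
    echo "Healthy"
    exit 0
else
    echo "Unhealthy (HTTP $RESPONSE)"
    exit 1
fi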
Usage:
# Check if node is primary
./patroni_health_check.sh 10.0.1.11 8008 /primary
# Check if node is replica
./patroni_health_check.sh 10.0.1.12 8008 /replica
5. Monitoring Integration
5.1. Prometheus exporter
Use postgres_exporter with custom queries:
# Install postgres_exporter
wget https://github.com/prometheus-community/postgres_exporter/releases/download/v0.15.0/postgres_exporter-0.15.0.linux-amd64.tar.gz
tar -xzf postgres_exporter-0.15.0.linux-amd64.tar.gz
sudo mv postgres_exporter-0.15.0.linux-amd64/postgres_exporter /usr/local/bin/
# Create systemd service
sudo tee /etc/systemd/system/postgres_exporter.service > /dev/null << EOF
[Unit]
Description=PostgreSQL Exporter
After=network.target
[Service]
Type=simple
User=postgres
Environment="DATA_SOURCE_NAME=postgresql://exporter:password@localhost:5432/postgres?sslmode=disable"
ExecStart=/usr/local/bin/postgres_exporter
Restart=always
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl start postgres_exporter
sudo systemctl enable postgres_exporter
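Custom query for Patroni metrics. The exporter loads this file only if ExecStart passes --extend.query-path=/etc/postgres_exporter/queries.yaml (recent postgres_exporter releases mark this flag as deprecated). The resulting metric is named patroni_info_value:
# /etc/postgres_exporter/queries.yaml
patroni_info:
  query: |
    SELECT
      CASE WHEN pg_is_in_recovery() THEN 'replica' ELSE 'primary' END as role,
      1 as value
  metrics:
    - role:
        usage: "LABEL"
        description: "PostgreSQL role"
    - value:
        usage: "GAUGE"
        description: "Node role indicator"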
5.2. Custom monitoring script
Python script using REST API:
#!/usr/bin/env python3
# /usr/local/bin/patroni_monitor.py
import json
import sys

import requests

NODES = [
    "http://10.0.1.11:8008",
    "http://10.0.1.12:8008",
    "http://10.0.1.13:8008",
]

# Patroni 3.x reports the leader as "master"; Patroni 4.x reports "primary".
LEADER_ROLES = ("master", "primary")


def check_cluster():
    """Query /patroni on every node and collect one status record per node."""
    results = []
    for node_url in NODES:
        try:
            response = requests.get(f"{node_url}/patroni", timeout=5)
            data = response.json()
            xlog = data.get("xlog", {})
            results.append({
                "node": data["patroni"]["name"],
                "role": data["role"],
                "state": data["state"],
                "timeline": data.get("timeline"),
                # WAL position in bytes: "location" on the primary,
                # "replayed_location" on a replica. A position, not a lag.
                "wal_position": xlog.get("location", xlog.get("replayed_location", 0)),
            })
        except Exception as e:
            print(f"Error checking {node_url}: {e}", file=sys.stderr)
            results.append({
                "node": node_url,
                "role": "unknown",
                "state": "unreachable",
                "error": str(e),
            })
    return results


def main():
    cluster_status = check_cluster()
    print(json.dumps(cluster_status, indent=2))
    # Check that we have exactly one leader
    leaders = [n for n in cluster_status if n.get("role") in LEADER_ROLES]
    if len(leaders) != 1:
        print(f"ERROR: Expected 1 leader, found {len(leaders)}", file=sys.stderr)
        sys.exit(1)
    # Check that all nodes are reachable
    unreachable = [n for n in cluster_status if n.get("state") == "unreachable"]
    if unreachable:
        print(f"WARNING: {len(unreachable)} nodes unreachable", file=sys.stderr)
        sys.exit(1)
    print("Cluster is healthy")
    sys.exit(0)


if __name__ == "__main__":
    main()
Run monitoring:
python3 /usr/local/bin/patroni_monitor.py
# Output:
# [
#   {
#     "node": "node1",
#     "role": "master",
#     "state": "running",
#     "timeline": 1,
#     "wal_position": 67108864
#   },
#   {
#     "node": "node2",
#     "role": "replica",
#     "state": "running",
#     "timeline": 1,
#     "wal_position": 67108864
#   },
#   {
#     "node": "node3",
#     "role": "replica",
#     "state": "running",
#     "timeline": 1,
#     "wal_position": 67108864
#   }
# ]
# Cluster is healthy
5.3. Grafana dashboard query examples
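PromQL queries (patroni_info_value comes from the custom query in 5.1; the other metric names are placeholders, so substitute the names your exporters actually expose):
# Node role
patroni_info_value{role="primary"}
# Replication lag
pg_stat_replication_replay_lag_seconds
# Timeline
patroni_timeline
# Number of replicas
count(patroni_info_value{role="replica"})
# Synchronous replica status
patroni_sync_state{sync_state="sync"}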
6. Secure REST API
6.1. Enable authentication
In patroni.yml:
restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.1.11:8008
  # Basic authentication
  authentication:
    username: admin
    password: secure_password_here
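Access with authentication (Patroni requires the credentials only on the unsafe methods POST, PUT, PATCH and DELETE; GET endpoints keep answering without them):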
# Using curl
curl -u admin:secure_password_here http://10.0.1.11:8008/patroni
# Or with header
curl -H "Authorization: Basic $(echo -n admin:secure_password_here | base64)" \
http://10.0.1.11:8008/patroni
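Basic authentication matters most for requests that change the cluster, such as PATCH /config. The sketch below builds such a request with the Python standard library without sending it; the helper names are illustrative, and the credentials and the `loop_wait` value are the example values used in this lesson.

```python
import base64
import json
import urllib.request


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_config_patch(base_url, changes, username, password):
    """Build (but do not send) an authenticated PATCH /config request."""
    return urllib.request.Request(
        url=f"{base_url}/config",
        data=json.dumps(changes).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": basic_auth_header(username, password),
            "Content-Type": "application/json",
        },
    )


request = build_config_patch(
    "http://10.0.1.11:8008", {"loop_wait": 10}, "admin", "secure_password_here"
)
print(request.get_method(), request.full_url)
```

Passing the object to `urllib.request.urlopen(request)` would send it. Over plain HTTP the credentials travel in easily decoded form, which is one reason to combine authentication with TLS.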
6.2. Enable SSL/TLS
Generate certificates:
# Create CA
openssl genrsa -out ca.key 4096
openssl req -new -x509 -days 3650 -key ca.key -out ca.crt \
-subj "/CN=Patroni-CA"
# Create server certificate
openssl genrsa -out server.key 4096
openssl req -new -key server.key -out server.csr \
-subj "/CN=node1.example.com"
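# Sign with CA. The subjectAltName must list every name or IP that clients
# connect to, otherwise verified HTTPS requests to https://10.0.1.11:8008 fail.
openssl x509 -req -days 365 -in server.csr -CA ca.crt -CAkey ca.key \
    -set_serial 01 -out server.crt \
    -extfile <(printf "subjectAltName=DNS:node1.example.com,IP:10.0.1.11")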
# Set permissions
sudo chown postgres:postgres server.key server.crt ca.crt
sudo chmod 600 server.key
Configure in patroni.yml:
restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.1.11:8008
  certfile: /etc/patroni/certs/server.crt
  keyfile: /etc/patroni/certs/server.key
  cafile: /etc/patroni/certs/ca.crt
  # Optional: Require client certificates
  # verify_client: required
  authentication:
    username: admin
    password: secure_password_here
Access with HTTPS:
# -k skips certificate verification; use it for quick tests only
curl -k -u admin:secure_password_here https://10.0.1.11:8008/patroni
# Or with CA certificate
curl --cacert /etc/patroni/certs/ca.crt \
-u admin:secure_password_here \
https://10.0.1.11:8008/patroni
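The same two modes (verify against the CA, or skip verification for a quick test) can be expressed in Python with the standard `ssl` module. A minimal sketch; `make_tls_context` is a hypothetical helper, and the CA path in the usage note is the one from the configuration above.

```python
import ssl


def make_tls_context(cafile=None, verify=True):
    """Create a client-side TLS context for calling the REST API.

    cafile: path to the CA certificate (system trust store if None)
    verify: set to False only for quick tests, like "curl -k"
    """
    context = ssl.create_default_context(cafile=cafile)
    if not verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

For example, `urllib.request.urlopen("https://10.0.1.11:8008/patroni", context=make_tls_context("/etc/patroni/certs/ca.crt"))` verifies the server certificate against the lesson's CA. Hostname checking stays enabled, so the certificate must cover the address being called.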
6.3. Firewall rules
# Allow REST API only from specific IPs
sudo ufw allow from 10.0.1.0/24 to any port 8008
sudo ufw allow from <load_balancer_ip> to any port 8008
sudo ufw allow from <monitoring_server_ip> to any port 8008
# Deny from everywhere else
sudo ufw deny 8008
7. Advanced REST API Usage
7.1. Scripted failover check
#!/bin/bash
# Check if failover is safe
CLUSTER_URL="http://10.0.1.11:8008/cluster"
# Get cluster info
CLUSTER_DATA=$(curl -s --max-time 5 "$CLUSTER_URL")
# Count healthy replicas. Newer Patroni releases may report a healthy replica
# as "streaming" instead of "running", so accept both states.
HEALTHY_REPLICAS=$(echo "$CLUSTER_DATA" | jq '[.members[] | select(.role != "leader" and (.state == "running" or .state == "streaming"))] | length')
if [ "${HEALTHY_REPLICAS:-0}" -ge 1 ]; then
    echo "Safe to failover: $HEALTHY_REPLICAS healthy replicas"
    exit 0
else
    echo "NOT safe to failover: only ${HEALTHY_REPLICAS:-0} healthy replicas"
    exit 1
fi
7.2. Get primary endpoint dynamically
#!/bin/bash
# Get current primary IP:port
get_primary() {
    for NODE in 10.0.1.11 10.0.1.12 10.0.1.13; do
        RESPONSE=$(curl -s -o /dev/null --max-time 3 -w "%{http_code}" "http://${NODE}:8008/primary")
        if [ "$RESPONSE" = "200" ]; then
            echo "${NODE}:5432"
            return 0
        fi
    done
    echo "No primary found" >&2
    return 1
}
# Stop here if no node answers 200 on /primary
PRIMARY=$(get_primary) || exit 1
echo "Current primary: $PRIMARY"
# Use in connection string (split host and port with parameter expansion)
psql "host=${PRIMARY%%:*} port=${PRIMARY##*:} user=app_user dbname=myapp"
7.3. Monitor replication lag
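#!/bin/bash
# Alert if replication lag > threshold
THRESHOLD_MB=100
THRESHOLD_BYTES=$((THRESHOLD_MB * 1024 * 1024))
# /cluster reports each member's lag in bytes, so one request to any node is enough
curl -s --max-time 5 "http://10.0.1.11:8008/cluster" \
    | jq -r '.members[] | select(.role != "leader") | "\(.name) \(.lag)"' \
    | while read -r NAME LAG; do
        # Skip members whose lag is missing or not a number
        case "$LAG" in
            ''|*[!0-9]*) echo "WARNING: Node $NAME lag is unknown ($LAG)"; continue ;;
        esac
        if [ "$LAG" -gt "$THRESHOLD_BYTES" ]; then
            echo "ALERT: Node $NAME replication lag > ${THRESHOLD_MB}MB"
            # Send notification
        fi
    done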
8. Lab Exercises
Lab 1: Explore REST API endpoints
Tasks:
- Query all endpoints on each node
- Compare responses between primary and replicas
- Identify which endpoint returns 200 on primary vs replica
# Test script
for ENDPOINT in / /primary /replica /read-write /read-only /health /patroni; do
echo "=== $ENDPOINT ==="
for NODE in 10.0.1.11 10.0.1.12 10.0.1.13; do
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" "http://${NODE}:8008${ENDPOINT}")
echo " Node $NODE: $HTTP_CODE"
done
done
Lab 2: Setup HAProxy
Tasks:
- Install HAProxy
- Configure with Patroni health checks
- Test write traffic goes to primary only
- Test read traffic distributed to replicas
- Trigger failover, verify HAProxy redirects automatically
Lab 3: Create monitoring dashboard
Tasks:
- Write Python script to query all nodes
- Display cluster topology
- Show replication lag
- Highlight current primary
- Run every 5 seconds
Lab 4: Secure REST API
Tasks:
- Enable basic authentication
- Generate SSL certificates
- Configure HTTPS
- Update curl commands to use auth + SSL
- Configure firewall rules
9. Troubleshooting REST API
9.1. REST API not responding
Check:
# 1. Verify Patroni is running
sudo systemctl status patroni
# 2. Check if port is listening
sudo netstat -tlnp | grep 8008
# 3. Check firewall
sudo ufw status | grep 8008
# 4. Test locally
curl http://localhost:8008/
# 5. Check logs
sudo journalctl -u patroni -n 50 | grep -i rest
9.2. Wrong HTTP codes returned
Debug:
# Get detailed response
curl -v http://10.0.1.11:8008/primary
# Check PostgreSQL status
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
# Check Patroni sees correct role
patronictl list
9.3. SSL/TLS errors
Check:
# Verify certificate
openssl x509 -in /etc/patroni/certs/server.crt -text -noout
# Check certificate matches key
openssl x509 -modulus -noout -in server.crt | md5sum
openssl rsa -modulus -noout -in server.key | md5sum
# Should match
# Test SSL connection
openssl s_client -connect 10.0.1.11:8008 -CAfile ca.crt
10. Summary
Key Endpoints Summary
| Endpoint | Returns 200 When | Use Case |
|---|---|---|
| /primary | Node is primary | LB write routing |
| /replica | Node is a running replica | LB read routing |
| /read-write | Node accepts writes (same check as /primary) | Write endpoint |
| /read-only | Node is a running replica or the primary | Read endpoint |
| /health | PostgreSQL is up and running | Detailed monitoring |
| /patroni | Always (detailed info) | Advanced monitoring |
| /cluster | Always (all members) | Topology view |
Integration Checklist
- REST API accessible from all nodes
- HAProxy configured with health checks
- Monitoring system queries REST API
- Authentication enabled
- SSL/TLS configured (production)
- Firewall rules configured
- Health check scripts tested
Current architecture
✅ 3 VMs prepared (Lesson 4)
✅ PostgreSQL 18 installed (Lesson 5)
✅ etcd cluster running (Lesson 6)
✅ Patroni installed (Lesson 7)
✅ Patroni configured (Lesson 8)
✅ Cluster bootstrapped (Lesson 9)
✅ Replication configured (Lesson 10)
✅ Callbacks implemented (Lesson 11)
✅ REST API integrated (Lesson 12)
Next: Failover management
Preparing for Lesson 13
Lesson 13 covers failover and switchover:
- Automatic failover process
- Manual switchover
- Failover scenarios and testing
- DCS role in leader election
- Minimize downtime strategies