Lesson 12: Patroni REST API

Objectives

After this lesson, you will be able to:

Understand the Patroni REST API and its endpoints
Use the REST API for health checks
Integrate with load balancers (HAProxy, Nginx)
Query cluster status and configuration
Implement custom monitoring
Secure the REST API endpoints

1. REST API Overview

1.1. What is the REST API?

Patroni exposes an HTTP REST API on every node for:

🔍 Health checks: Load balancers check node health
📊 Monitoring: External systems query cluster state
⚙️ Management: Read configuration, cluster topology
🔄 Automation: Integration with CI/CD and orchestration tools

1.2. API Configuration

In patroni.yml:

restapi:
  listen: 0.0.0.0:8008        # Listen address and port
  connect_address: 10.0.1.11:8008  # Advertised address
  
  # Optional: Basic authentication
  # authentication:
  #   username: admin
  #   password: secret_password
  
  # Optional: SSL/TLS
  # certfile: /etc/patroni/certs/server.crt
  # keyfile: /etc/patroni/certs/server.key
  # cafile: /etc/patroni/certs/ca.crt

Default port: 8008
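
A client needs the advertised address rather than the listen address. The following is a minimal sketch, assuming the `restapi` section has already been parsed into a dictionary; the helper name `restapi_base_url` is hypothetical and not part of Patroni.

```python
def restapi_base_url(restapi: dict) -> str:
    """Build the base URL a client would call from a parsed `restapi` section."""
    # Assumption: HTTPS is in use whenever a certificate file is configured
    scheme = "https" if restapi.get("certfile") else "http"
    # connect_address is the advertised address; fall back to listen if absent
    address = restapi.get("connect_address") or restapi["listen"]
    return f"{scheme}://{address}"


config = {"listen": "0.0.0.0:8008", "connect_address": "10.0.1.11:8008"}
print(restapi_base_url(config))  # http://10.0.1.11:8008
```

Falling back to `listen` is only useful when it holds a routable address; `0.0.0.0` is not something a remote client can connect to.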

1.3. API Endpoints Overview

Endpoint	Method	Purpose	Use Case
`/`	GET	Same check as `/primary`	Default LB health check
`/primary` (legacy alias: `/master`)	GET	Check if node is primary	LB primary routing
`/replica`	GET	Check if node is replica	LB read routing
`/read-write`	GET	Check if writable (primary)	LB write routing
`/read-only`	GET	Check if node can serve reads (replica or primary)	LB read routing
`/synchronous`	GET	Check if synchronous replica	Sync replica detection
`/asynchronous`	GET	Check if asynchronous replica	Async replica detection
`/health`	GET	Check if PostgreSQL is up and running	Monitoring
`/patroni`	GET	Detailed cluster and node info	Advanced monitoring
`/config`	GET	Cluster configuration from DCS	Config inspection
`/cluster`	GET	All cluster members info	Topology view
`/history`	GET	Failover history	Audit log
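
Because the role checks answer with a status code, a client mostly needs the code rather than the body. The sketch below uses only the Python standard library; the function name `probe` and the convention of returning 0 for an unreachable node are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 3.0, opener=urllib.request.urlopen) -> int:
    """Return the HTTP status code for url, or 0 if no HTTP response arrived."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; for a health check the code is the answer
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, ...
        return 0
```

For example, `probe("http://10.0.1.11:8008/primary")` would be expected to return 200 on the primary and 503 on a replica. The `opener` parameter exists only so the function can be exercised without a running cluster.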

2. Health Check Endpoints

2.1. Root endpoint: GET /

Purpose: Quick primary check. Patroni handles / the same way as /primary, and the response body carries the basic node status.

curl -s http://10.0.1.11:8008/

# Response on PRIMARY:
# HTTP 200 OK
# {
#   "state": "running",
#   "postmaster_start_time": "2024-11-25 10:30:00.123456+00:00",
#   "role": "master",
#   "server_version": 180000,
#   "cluster_unlocked": false,
#   "xlog": {
#     "location": 67108864
#   },
#   "timeline": 1,
#   "database_system_identifier": "7001234567890123456",
#   "patroni": {
#     "version": "3.2.0",
#     "scope": "postgres"
#   }
# }

# Response on REPLICA (similar body, but the status code signals "not primary"):
# HTTP 503 Service Unavailable
# {
#   "state": "running",
#   "postmaster_start_time": "2024-11-25 10:31:15.789012+00:00",
#   "role": "replica",
#   "server_version": 180000,
#   "cluster_unlocked": false,
#   "xlog": {
#     "received_location": 67108864,
#     "replayed_location": 67108864
#   },
#   "timeline": 1,
#   "database_system_identifier": "7001234567890123456",
#   "patroni": {
#     "version": "3.2.0",
#     "scope": "postgres"
#   }
# }

Response codes:

200 OK: Node is the primary and holds the leader lock
503 Service Unavailable: Node is not the primary (replica, PostgreSQL down, etc.). For a role-independent check, use /health.

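2.2. Primary check: GET /primary (legacy alias: /master)

Purpose: Check if node is the current primary/leader. The sample output in this lesson reports the role as "master", as Patroni 3.x does; newer releases may report "primary" instead.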

curl -s http://10.0.1.11:8008/primary

# On PRIMARY:
# HTTP 200 OK
# {
#   "state": "running",
#   "role": "master",
#   "xlog": {
#     "location": 67108864
#   }
# }

# On REPLICA:
# HTTP 503 Service Unavailable
# (the body still contains the node status JSON, with "role": "replica")

Use case: Load balancer health check for write traffic routing.

2.3. Replica check: GET /replica

Purpose: Check if node is replica (standby).

curl -s http://10.0.1.12:8008/replica

# On REPLICA:
# HTTP 200 OK
# {
#   "state": "running",
#   "role": "replica",
#   "xlog": {
#     "received_location": 67108864,
#     "replayed_location": 67108864
#   }
# }

# On PRIMARY:
# HTTP 503 Service Unavailable

Use case: Load balancer health check for read traffic routing.
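
The status codes of /primary and /replica together are enough to classify a node. A minimal sketch, assuming the codes were collected beforehand and that 0 stands for "no HTTP response" (a convention chosen here, not something Patroni defines):

```python
def classify_node(primary_code: int, replica_code: int) -> str:
    """Map the status codes of /primary and /replica to a role label."""
    if primary_code == 200 and replica_code == 200:
        # Should not happen; worth alerting on if it does
        return "inconsistent"
    if primary_code == 200:
        return "primary"
    if replica_code == 200:
        return "replica"
    if primary_code == 0 and replica_code == 0:
        return "unreachable"
    # Patroni answered, but the node passes neither check
    return "unavailable"


print(classify_node(200, 503))  # primary
print(classify_node(503, 200))  # replica
```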

2.4. Read-write check: GET /read-write

Purpose: Check if node accepts writes. Patroni treats this as the same check as /primary.

curl -s http://10.0.1.11:8008/read-write

# Returns 200 if:
# - Node is running as the primary
# - Node holds the leader lock

2.5. Read-only check: GET /read-only

Purpose: Check if node can serve read-only queries. Unlike /replica, this check also passes on the primary.

curl -s http://10.0.1.12:8008/read-only

# Returns 200 if:
# - Node is a running replica, or
# - Node is the primary

Advanced: Lag tolerance (documented for /replica):

# Check replica with max 1MB lag tolerance (value in bytes)
curl -s "http://10.0.1.12:8008/replica?lag=1048576"

# Returns 503 if lag > 1MB
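
The lag tolerance is an ordinary query parameter given in bytes. The following sketch builds such a URL for the /replica check; the helper name and its defaults are illustrative assumptions.

```python
from urllib.parse import urlencode


def replica_check_url(host, max_lag_bytes=None, port=8008):
    """Build a /replica health-check URL, optionally with a lag tolerance."""
    url = f"http://{host}:{port}/replica"
    if max_lag_bytes is not None:
        # The tolerance is expressed in bytes of WAL
        url += "?" + urlencode({"lag": max_lag_bytes})
    return url


print(replica_check_url("10.0.1.12", 1024 * 1024))
# http://10.0.1.12:8008/replica?lag=1048576
```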

2.6. Synchronous replica check: GET /synchronous

Purpose: Check if node is synchronous replica.

curl -s http://10.0.1.12:8008/synchronous

# Returns 200 if:
# - Node is replica
# - sync_state = 'sync' (from pg_stat_replication)

2.7. Asynchronous replica check: GET /asynchronous

Purpose: Check if node is asynchronous replica.

curl -s http://10.0.1.13:8008/asynchronous

# Returns 200 if:
# - Node is replica
# - sync_state != 'sync'

2.8. Health endpoint: GET /health

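Purpose: Check that PostgreSQL is up and running, regardless of the node's role (200 if running, 503 otherwise). The response body carries the node status.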

curl -s http://10.0.1.11:8008/health | jq

# Response:
# {
#   "state": "running",
#   "role": "master",
#   "server_version": 180000,
#   "cluster_unlocked": false,
#   "timeline": 1,
#   "database_system_identifier": "7001234567890123456",
#   "postmaster_start_time": "2024-11-25 10:30:00.123456+00:00",
#   "patroni": {
#     "version": "3.2.0",
#     "scope": "postgres",
#     "name": "node1"
#   },
#   "replication": [
#     {
#       "usename": "replicator",
#       "application_name": "node2",
#       "client_addr": "10.0.1.12",
#       "state": "streaming",
#       "sync_state": "sync",
#       "sync_priority": 1
#     },
#     {
#       "usename": "replicator",
#       "application_name": "node3",
#       "client_addr": "10.0.1.13",
#       "state": "streaming",
#       "sync_state": "async",
#       "sync_priority": 0
#     }
#   ]
# }
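
The "replication" list in a response like the one above can be grouped by sync_state to see which standbys are synchronous. A minimal sketch, assuming the JSON has already been decoded into a dictionary; the function name is hypothetical.

```python
def standbys_by_sync_state(status: dict) -> dict:
    """Group standby names from the 'replication' list by their sync_state."""
    groups = {}
    # Replicas have no 'replication' key, so default to an empty list
    for entry in status.get("replication", []):
        groups.setdefault(entry["sync_state"], []).append(entry["application_name"])
    return groups


sample = {
    "role": "master",
    "replication": [
        {"application_name": "node2", "state": "streaming", "sync_state": "sync"},
        {"application_name": "node3", "state": "streaming", "sync_state": "async"},
    ],
}
print(standbys_by_sync_state(sample))  # {'sync': ['node2'], 'async': ['node3']}
```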

3. Cluster Information Endpoints

3.1. Detailed node info: GET /patroni

Purpose: Comprehensive node and cluster information.

curl -s http://10.0.1.11:8008/patroni | jq

# Response (truncated):
# {
#   "state": "running",
#   "postmaster_start_time": "2024-11-25 10:30:00.123456+00:00",
#   "role": "master",
#   "server_version": 180000,
#   "xlog": {
#     "location": 67108864
#   },
#   "timeline": 1,
#   "cluster_unlocked": false,
#   "database_system_identifier": "7001234567890123456",
#   "patroni": {
#     "version": "3.2.0",
#     "scope": "postgres",
#     "name": "node1"
#   },
#   "dcs_last_seen": 1700912345,
#   "tags": {
#     "nofailover": false,
#     "noloadbalance": false,
#     "clonefrom": false,
#     "nosync": false
#   },
#   "pending_restart": false,
#   "replication": [...],
#   "timeline_history": [...]
# }

3.2. Cluster configuration: GET /config

Purpose: Get cluster-wide configuration from DCS.

curl -s http://10.0.1.11:8008/config | jq

# Response:
# {
#   "ttl": 30,
#   "loop_wait": 10,
#   "retry_timeout": 10,
#   "maximum_lag_on_failover": 1048576,
#   "synchronous_mode": true,
#   "synchronous_mode_strict": false,
#   "postgresql": {
#     "parameters": {
#       "max_connections": 100,
#       "shared_buffers": "256MB",
#       "wal_level": "replica",
#       "max_wal_senders": 10,
#       "max_replication_slots": 10,
#       "hot_standby": "on"
#     },
#     "use_pg_rewind": true,
#     "use_slots": true
#   }
# }
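
One practical use of /config is a sanity check of the timing parameters. The Patroni documentation states the rule loop_wait + 2 * retry_timeout <= ttl; the sketch below applies it to a decoded /config response. The function name is an assumption for illustration.

```python
def timing_rule_ok(config: dict) -> bool:
    """Check the rule loop_wait + 2 * retry_timeout <= ttl."""
    return config["loop_wait"] + 2 * config["retry_timeout"] <= config["ttl"]


sample_config = {"ttl": 30, "loop_wait": 10, "retry_timeout": 10}
print(timing_rule_ok(sample_config))  # True: 10 + 2 * 10 <= 30
```

With the sample values above the rule holds exactly, so raising loop_wait or retry_timeout without also raising ttl would violate it.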

3.3. Cluster members: GET /cluster

Purpose: Get information about all cluster members.

curl -s http://10.0.1.11:8008/cluster | jq

# Response:
# {
#   "members": [
#     {
#       "name": "node1",
#       "role": "leader",
#       "state": "running",
#       "api_url": "http://10.0.1.11:8008/patroni",
#       "host": "10.0.1.11",
#       "port": 5432,
#       "timeline": 1,
#       "lag": 0
#     },
#     {
#       "name": "node2",
#       "role": "sync_standby",
#       "state": "running",
#       "api_url": "http://10.0.1.12:8008/patroni",
#       "host": "10.0.1.12",
#       "port": 5432,
#       "timeline": 1,
#       "lag": 0
#     },
#     {
#       "name": "node3",
#       "role": "replica",
#       "state": "running",
#       "api_url": "http://10.0.1.13:8008/patroni",
#       "host": "10.0.1.13",
#       "port": 5432,
#       "timeline": 1,
#       "lag": 0
#     }
#   ],
#   "scope": "postgres"
# }
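
A response of this shape can be reduced to a short topology summary. The sketch assumes the JSON is already decoded and that any member whose role is not "leader" counts as a standby; the function name is hypothetical.

```python
def summarize_cluster(cluster: dict) -> dict:
    """Reduce a /cluster response to leader, standbys and member count."""
    members = cluster.get("members", [])
    leaders = [m["name"] for m in members if m.get("role") == "leader"]
    standbys = [m["name"] for m in members if m.get("role") != "leader"]
    return {
        "scope": cluster.get("scope"),
        # None signals "no leader" or "more than one leader"
        "leader": leaders[0] if len(leaders) == 1 else None,
        "standbys": standbys,
        "member_count": len(members),
    }


sample_cluster = {
    "scope": "postgres",
    "members": [
        {"name": "node1", "role": "leader", "state": "running"},
        {"name": "node2", "role": "sync_standby", "state": "running"},
        {"name": "node3", "role": "replica", "state": "running"},
    ],
}
print(summarize_cluster(sample_cluster))
```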

3.4. Failover history: GET /history

Purpose: Get cluster failover/switchover history.

curl -s http://10.0.1.11:8008/history | jq

# Response:
# [
#   [
#     1,  // Timeline
#     67108864,  // LSN
#     "no recovery target specified",
#     "2024-11-25T10:30:00+00:00"
#   ],
#   [
#     2,
#     134217728,
#     "no recovery target specified",
#     "2024-11-25T11:45:30+00:00"
#   ]
# ]
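
The positional entries are easier to work with as named fields, and the numeric LSN can be rendered in PostgreSQL's usual high/low hexadecimal notation. A minimal sketch, assuming the four-field layout shown above (any extra trailing fields are ignored); the function names are hypothetical.

```python
def format_lsn(lsn: int) -> str:
    """Render a numeric LSN as high-32-bits/low-32-bits in hexadecimal."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


def parse_history(entries):
    """Turn positional /history entries into dictionaries."""
    keys = ("timeline", "lsn", "reason", "timestamp")
    parsed = []
    for entry in entries:
        record = dict(zip(keys, entry))  # zip drops fields beyond the four keys
        record["lsn_text"] = format_lsn(record["lsn"])
        parsed.append(record)
    return parsed


history = [
    [1, 67108864, "no recovery target specified", "2024-11-25T10:30:00+00:00"],
    [2, 134217728, "no recovery target specified", "2024-11-25T11:45:30+00:00"],
]
for record in parse_history(history):
    print(record["timeline"], record["lsn_text"], record["timestamp"])
```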

4. Load Balancer Integration

4.1. HAProxy configuration

haproxy.cfg:

global
    log /dev/log local0
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000

# Stats page
listen stats
    bind *:7000
    stats enable
    stats uri /stats
    stats refresh 10s
    stats auth admin:password

# Primary/Write endpoint
listen postgres-primary
    bind *:5000
    mode tcp
    option tcplog
    # Health check via Patroni REST API (HTTP check sent to port 8008)
    option httpchk GET /primary
    http-check expect status 200
    
    default-server inter 3s fall 3 rise 2
    
    server node1 10.0.1.11:5432 check port 8008
    server node2 10.0.1.12:5432 check port 8008
    server node3 10.0.1.13:5432 check port 8008

# Replicas/Read-only endpoint
listen postgres-replicas
    bind *:5001
    mode tcp
    option tcplog
    balance roundrobin
    
    # Health check via Patroni REST API (HTTP check sent to port 8008)
    option httpchk GET /replica
    http-check expect status 200
    
    default-server inter 3s fall 3 rise 2
    
    server node1 10.0.1.11:5432 check port 8008
    server node2 10.0.1.12:5432 check port 8008
    server node3 10.0.1.13:5432 check port 8008

# Read-write endpoint (primary only)
listen postgres-read-write
    bind *:5002
    mode tcp
    option tcplog
    # Health check via Patroni REST API (HTTP check sent to port 8008)
    option httpchk GET /read-write
    http-check expect status 200
    
    default-server inter 3s fall 3 rise 2
    
    server node1 10.0.1.11:5432 check port 8008
    server node2 10.0.1.12:5432 check port 8008
    server node3 10.0.1.13:5432 check port 8008

# Read-only endpoint (replicas, plus the primary)
listen postgres-read-only
    bind *:5003
    mode tcp
    option tcplog
    balance leastconn
    
    # Health check via Patroni REST API (HTTP check sent to port 8008)
    option httpchk GET /read-only
    http-check expect status 200
    
    default-server inter 3s fall 3 rise 2
    
    server node1 10.0.1.11:5432 check port 8008
    server node2 10.0.1.12:5432 check port 8008
    server node3 10.0.1.13:5432 check port 8008
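
The `inter 3s fall 3 rise 2` settings mean a server changes state only after several consecutive check results disagree with its current state. The sketch below is a simplified model of that counting, not HAProxy's actual implementation; the function name is hypothetical.

```python
def simulate_checks(results, fall=3, rise=2, initially_up=True):
    """Return the up/down state after each check result (True = check passed)."""
    up = initially_up
    streak = 0  # consecutive results that contradict the current state
    states = []
    for ok in results:
        if ok == up:
            streak = 0
        else:
            streak += 1
            # 'fall' failures take a server down, 'rise' successes bring it back
            if streak >= (fall if up else rise):
                up = not up
                streak = 0
        states.append(up)
    return states


print(simulate_checks([True, False, False, False, True, True]))
# [True, True, True, False, False, True]
```

With checks every 3 seconds, this model suggests a demoted primary stays in rotation for roughly three check intervals before being marked down, which is the window to keep in mind when tuning `inter` and `fall`.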

Install and start HAProxy:

# Install
sudo apt install -y haproxy

# Configure
sudo nano /etc/haproxy/haproxy.cfg
# (paste config above)

# Validate config
sudo haproxy -c -f /etc/haproxy/haproxy.cfg

# Start
sudo systemctl restart haproxy
sudo systemctl enable haproxy

# Check status
sudo systemctl status haproxy

Test HAProxy:

# Connect to primary (port 5000)
psql -h haproxy_host -p 5000 -U app_user -d myapp -c "SELECT pg_is_in_recovery();"
# Should return: f (false = primary)

# Connect to replica (port 5001)
psql -h haproxy_host -p 5001 -U app_user -d myapp -c "SELECT pg_is_in_recovery();"
# Should return: t (true = replica)

# View HAProxy stats (credentials from "stats auth" in haproxy.cfg)
curl -u admin:password http://haproxy_host:7000/stats
# Or open in browser: http://haproxy_host:7000/stats

4.2. Nginx (with stream module)

nginx.conf:

stream {
    # Upstream for primary
    upstream postgres_primary {
        least_conn;
        server 10.0.1.11:5432 max_fails=3 fail_timeout=10s;
        server 10.0.1.12:5432 max_fails=3 fail_timeout=10s backup;
        server 10.0.1.13:5432 max_fails=3 fail_timeout=10s backup;
    }
    
    # Upstream for replicas
    upstream postgres_replicas {
        least_conn;
        server 10.0.1.11:5432 max_fails=3 fail_timeout=10s;
        server 10.0.1.12:5432 max_fails=3 fail_timeout=10s;
        server 10.0.1.13:5432 max_fails=3 fail_timeout=10s;
    }
    
    # Primary endpoint
    server {
        listen 5000;
        proxy_pass postgres_primary;
        proxy_connect_timeout 5s;
        proxy_timeout 300s;
    }
    
    # Replicas endpoint
    server {
        listen 5001;
        proxy_pass postgres_replicas;
        proxy_connect_timeout 5s;
        proxy_timeout 300s;
    }
}

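Note: The open-source Nginx stream module offers only passive checks (max_fails/fail_timeout) and cannot query the Patroni REST API, so the configuration above cannot tell the primary from a replica. Use an external script that keeps the upstream lists in sync with Patroni, or use HAProxy instead.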
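
One possible shape for such an external script is to regenerate the upstream blocks from the roles reported by Patroni. The sketch below covers only the rendering step and assumes the roles were already collected (for example from the /primary and /replica checks); writing the file and reloading Nginx are left out, and the function names are hypothetical.

```python
def render_upstream(name, servers, port=5432):
    """Render one Nginx stream upstream block for the given hosts."""
    if not servers:
        # Nginx rejects an upstream block that contains no servers
        raise ValueError(f"no servers available for upstream {name}")
    lines = [f"upstream {name} {{"]
    lines += [f"    server {host}:{port} max_fails=3 fail_timeout=10s;" for host in servers]
    lines.append("}")
    return "\n".join(lines)


def render_from_roles(roles):
    """roles maps host -> 'primary' or 'replica'; other values are skipped."""
    primaries = [host for host, role in roles.items() if role == "primary"]
    replicas = [host for host, role in roles.items() if role == "replica"]
    return "\n\n".join([
        render_upstream("postgres_primary", primaries),
        render_upstream("postgres_replicas", replicas),
    ])


print(render_from_roles({
    "10.0.1.11": "primary",
    "10.0.1.12": "replica",
    "10.0.1.13": "replica",
}))
```

Raising an error when a list is empty is a deliberate choice here: keeping the previous configuration is usually safer than loading one that Nginx would refuse.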

4.3. Health check script for external LB

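Script for setups where the health check runs as a command (cloud load balancers such as AWS NLB or GCP LB can usually call the Patroni endpoint over HTTP directly):

#!/bin/bash
# /usr/local/bin/patroni_health_check.sh

NODE_IP="$1"
PORT="${2:-8008}"
ENDPOINT="${3:-/primary}"  # or /replica

if [ -z "$NODE_IP" ]; then
    echo "Usage: $0 <node_ip> [port] [endpoint]" >&2
    exit 2
fi

# No "set -e" here: when the node is unreachable curl exits non-zero and
# prints 000 as the status code, which is reported as unhealthy below
RESPONSE=$(curl -s -o /dev/null --max-time 5 -w "%{http_code}" "http://${NODE_IP}:${PORT}${ENDPOINT}")

if [ "$RESPONSE" = "200" ]; then
    echo "Healthy"
    exit 0
else
    echo "Unhealthy (HTTP $RESPONSE)"
    exit 1
fi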

Usage:

# Check if node is primary
./patroni_health_check.sh 10.0.1.11 8008 /primary

# Check if node is replica
./patroni_health_check.sh 10.0.1.12 8008 /replica

5. Monitoring Integration

5.1. Prometheus exporter

Use postgres_exporter with custom queries:

# Install postgres_exporter
wget https://github.com/prometheus-community/postgres_exporter/releases/download/v0.15.0/postgres_exporter-0.15.0.linux-amd64.tar.gz
tar -xzf postgres_exporter-0.15.0.linux-amd64.tar.gz
sudo mv postgres_exporter-0.15.0.linux-amd64/postgres_exporter /usr/local/bin/

# Create systemd service
sudo tee /etc/systemd/system/postgres_exporter.service > /dev/null << EOF
[Unit]
Description=PostgreSQL Exporter
After=network.target

[Service]
Type=simple
User=postgres
Environment="DATA_SOURCE_NAME=postgresql://exporter:password@localhost:5432/postgres?sslmode=disable"
ExecStart=/usr/local/bin/postgres_exporter
Restart=always

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl start postgres_exporter
sudo systemctl enable postgres_exporter

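Custom query for Patroni metrics (postgres_exporter only loads this file when started with --extend.query-path=/etc/postgres_exporter/queries.yaml, a flag that recent releases mark as deprecated):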

# /etc/postgres_exporter/queries.yaml

patroni_info:
  query: |
    SELECT 
      CASE WHEN pg_is_in_recovery() THEN 'replica' ELSE 'primary' END as role,
      1 as value
  metrics:
    - role:
        usage: "LABEL"
        description: "PostgreSQL role"
    - value:
        usage: "GAUGE"
        description: "Node role indicator"

5.2. Custom monitoring script

Python script using REST API:

#!/usr/bin/env python3
# /usr/local/bin/patroni_monitor.py

import json
import sys

import requests

NODES = [
    "http://10.0.1.11:8008",
    "http://10.0.1.12:8008",
    "http://10.0.1.13:8008"
]

# Accept both spellings: some Patroni releases report "master", others "primary"
PRIMARY_ROLES = ("master", "primary")

def check_cluster():
    results = []

    for node_url in NODES:
        try:
            response = requests.get(f"{node_url}/patroni", timeout=5)
            data = response.json()
            xlog = data.get("xlog", {})

            results.append({
                "node": data["patroni"]["name"],
                "role": data["role"],
                "state": data["state"],
                "timeline": data.get("timeline"),
                # WAL position in bytes: "location" on the primary,
                # "replayed_location" on a replica; converted to lag below
                "position": xlog.get("location", xlog.get("replayed_location", 0))
            })
        except Exception as e:
            print(f"Error checking {node_url}: {e}", file=sys.stderr)
            results.append({
                "node": node_url,
                "role": "unknown",
                "state": "unreachable",
                "error": str(e)
            })

    # Lag = how many bytes a node is behind the primary's current WAL position
    primary_positions = [n["position"] for n in results if n.get("role") in PRIMARY_ROLES]
    for node in results:
        position = node.pop("position", None)
        if position is not None:
            node["lag"] = max(primary_positions[0] - position, 0) if primary_positions else None

    return results

def main():
    cluster_status = check_cluster()

    print(json.dumps(cluster_status, indent=2))

    # Check if we have exactly one leader
    leaders = [n for n in cluster_status if n.get("role") in PRIMARY_ROLES]

    if len(leaders) != 1:
        print(f"ERROR: Expected 1 leader, found {len(leaders)}", file=sys.stderr)
        sys.exit(1)

    # Check all nodes reachable
    unreachable = [n for n in cluster_status if n.get("state") == "unreachable"]

    if unreachable:
        print(f"WARNING: {len(unreachable)} nodes unreachable", file=sys.stderr)
        sys.exit(1)

    print("Cluster is healthy")
    sys.exit(0)

if __name__ == "__main__":
    main()

Run monitoring:

python3 /usr/local/bin/patroni_monitor.py

# Output:
# [
#   {
#     "node": "node1",
#     "role": "master",
#     "state": "running",
#     "timeline": 1,
#     "lag": 0
#   },
#   {
#     "node": "node2",
#     "role": "replica",
#     "state": "running",
#     "timeline": 1,
#     "lag": 0
#   },
#   {
#     "node": "node3",
#     "role": "replica",
#     "state": "running",
#     "timeline": 1,
#     "lag": 0
#   }
# ]
# Cluster is healthy

5.3. Grafana dashboard query examples

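PromQL queries (metric names depend on what Prometheus scrapes; the patroni_postgres_timeline and patroni_sync_standby examples assume Prometheus also scrapes Patroni's own /metrics endpoint on port 8008):

# Node role (from the custom query above; postgres_exporter names the metric <query>_<column>)
patroni_info_value{role="primary"}

# Replication lag in seconds (postgres_exporter; the exact name varies by exporter version)
pg_replication_lag_seconds

# Timeline
patroni_postgres_timeline

# Number of replicas
count(patroni_info_value{role="replica"})

# Synchronous replica status (1 on a synchronous standby)
patroni_sync_standby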

6. Secure REST API

6.1. Enable authentication

In patroni.yml:

restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.1.11:8008
  
  # Basic authentication
  authentication:
    username: admin
    password: secure_password_here

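Access with authentication (Patroni asks for these credentials only on unsafe requests such as PATCH /config or POST /restart; read-only GET endpoints stay open, so load balancer health checks keep working without them):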

# Using curl
curl -u admin:secure_password_here http://10.0.1.11:8008/patroni

# Or with header
curl -H "Authorization: Basic $(echo -n admin:secure_password_here | base64)" \
  http://10.0.1.11:8008/patroni
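
The same Authorization header can be built in Python with the standard library. A minimal sketch; the function name is an illustrative assumption and the credentials are the placeholder values used above.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Return an HTTP Basic Authorization header for the given credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("admin", "secure_password_here")
print(headers["Authorization"].split()[0])  # Basic
```

The resulting dictionary can be passed as `headers=` to `urllib.request.Request` or to `requests.get`. Base64 is an encoding, not encryption, so this is only safe over HTTPS.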

6.2. Enable SSL/TLS

Generate certificates:

# Create CA
openssl genrsa -out ca.key 4096
openssl req -new -x509 -days 3650 -key ca.key -out ca.crt \
  -subj "/CN=Patroni-CA"

# Create server certificate
openssl genrsa -out server.key 4096
openssl req -new -key server.key -out server.csr \
  -subj "/CN=node1.example.com"

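# Sign with CA (the SAN entries let clients verify the certificate
# whether they connect by hostname or by IP address)
openssl x509 -req -days 365 -in server.csr -CA ca.crt -CAkey ca.key \
  -set_serial 01 -out server.crt \
  -extfile <(printf "subjectAltName=DNS:node1.example.com,IP:10.0.1.11")

# Install where patroni.yml expects the files, then set permissions
sudo mkdir -p /etc/patroni/certs
sudo cp server.key server.crt ca.crt /etc/patroni/certs/
sudo chown postgres:postgres /etc/patroni/certs/server.key /etc/patroni/certs/server.crt /etc/patroni/certs/ca.crt
sudo chmod 600 /etc/patroni/certs/server.key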

Configure in patroni.yml:

restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.1.11:8008
  
  certfile: /etc/patroni/certs/server.crt
  keyfile: /etc/patroni/certs/server.key
  cafile: /etc/patroni/certs/ca.crt
  
  # Optional: Require client certificates
  # verify_client: required
  
  authentication:
    username: admin
    password: secure_password_here

Access with HTTPS:

# Quick test only: -k skips certificate verification
curl -k -u admin:secure_password_here https://10.0.1.11:8008/patroni

# Preferred: verify the server against the CA certificate
curl --cacert /etc/patroni/certs/ca.crt \
  -u admin:secure_password_here \
  https://10.0.1.11:8008/patroni
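
A Python client can verify the server in the same way as `curl --cacert`. The sketch below uses only the standard library; the function names are hypothetical, and the CA path shown in the usage note is the one from this lesson.

```python
import ssl
import urllib.request


def build_tls_context(cafile=None):
    """Create a verifying TLS context; cafile=None uses the system trust store."""
    # create_default_context enables certificate and hostname verification
    return ssl.create_default_context(cafile=cafile)


def fetch_status(url, cafile=None, timeout=5.0):
    """Fetch a REST API URL over HTTPS and return the raw response body."""
    context = build_tls_context(cafile)
    with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
        return response.read()
```

A call such as `fetch_status("https://10.0.1.11:8008/patroni", cafile="/etc/patroni/certs/ca.crt")` succeeds only if the server certificate is signed by that CA and its names cover the address used in the URL.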

6.3. Firewall rules

# Allow REST API only from specific IPs
sudo ufw allow from 10.0.1.0/24 to any port 8008
sudo ufw allow from <load_balancer_ip> to any port 8008
sudo ufw allow from <monitoring_server_ip> to any port 8008

# Deny from everywhere else
sudo ufw deny 8008

7. Advanced REST API Usage

7.1. Scripted failover check

#!/bin/bash
# Check if failover is safe

CLUSTER_URL="http://10.0.1.11:8008/cluster"

# Get cluster info
CLUSTER_DATA=$(curl -s "$CLUSTER_URL")

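# Count healthy replicas (depending on the Patroni version, a healthy replica
# reports its state as "running" or "streaming")
HEALTHY_REPLICAS=$(echo "$CLUSTER_DATA" | jq '[.members[] | select(.role != "leader" and (.state == "running" or .state == "streaming"))] | length')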

if [ "$HEALTHY_REPLICAS" -ge 1 ]; then
    echo "Safe to failover: $HEALTHY_REPLICAS healthy replicas"
    exit 0
else
    echo "NOT safe to failover: only $HEALTHY_REPLICAS healthy replicas"
    exit 1
fi

7.2. Get primary endpoint dynamically

#!/bin/bash
# Get current primary IP:port

get_primary() {
    for NODE in 10.0.1.11 10.0.1.12 10.0.1.13; do
        RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" "http://${NODE}:8008/primary")
        if [ "$RESPONSE" = "200" ]; then
            echo "${NODE}:5432"
            return 0
        fi
    done
    echo "No primary found" >&2
    return 1
}

# Stop here if no node answered 200 on /primary
PRIMARY=$(get_primary) || exit 1
echo "Current primary: $PRIMARY"

# Use in connection string
psql "host=$(echo $PRIMARY | cut -d: -f1) port=5432 user=app_user dbname=myapp"

7.3. Monitor replication lag

#!/bin/bash
# Alert if replication lag > threshold

THRESHOLD_MB=100
THRESHOLD_BYTES=$((THRESHOLD_MB * 1024 * 1024))

# /cluster reports a per-member "lag" in bytes for every non-leader member
curl -s "http://10.0.1.11:8008/cluster" |
  jq -r '.members[] | select(.role != "leader") | "\(.name) \(.lag // "unknown")"' |
  while read -r NAME LAG; do
    if [ "$LAG" = "unknown" ] || [ "$LAG" -gt "$THRESHOLD_BYTES" ]; then
        echo "ALERT: Node $NAME replication lag ($LAG bytes) > ${THRESHOLD_MB}MB"
        # Send notification
    fi
done
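
The same threshold logic can be expressed over a decoded /cluster response, which makes it easy to test without a running cluster. A minimal sketch, assuming each non-leader member carries a "lag" value in bytes and that a missing or non-numeric lag should also raise an alert; the function name is hypothetical.

```python
def lagging_members(cluster: dict, threshold_mb: int = 100):
    """Return (name, lag) pairs for standbys whose lag exceeds the threshold."""
    threshold_bytes = threshold_mb * 1024 * 1024
    flagged = []
    for member in cluster.get("members", []):
        if member.get("role") == "leader":
            continue
        lag = member.get("lag")
        # Treat a missing or non-numeric lag as a problem worth reporting
        if not isinstance(lag, int) or lag > threshold_bytes:
            flagged.append((member["name"], lag))
    return flagged


sample = {"members": [
    {"name": "node1", "role": "leader"},
    {"name": "node2", "role": "sync_standby", "lag": 0},
    {"name": "node3", "role": "replica", "lag": 200 * 1024 * 1024},
]}
print(lagging_members(sample))  # [('node3', 209715200)]
```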

8. Lab Exercises

Lab 1: Explore REST API endpoints

Tasks:

Query all endpoints on each node
Compare responses between primary and replicas
Identify which endpoint returns 200 on primary vs replica

# Test script
for ENDPOINT in / /primary /replica /read-write /read-only /health /patroni; do
    echo "=== $ENDPOINT ==="
    for NODE in 10.0.1.11 10.0.1.12 10.0.1.13; do
        HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" "http://${NODE}:8008${ENDPOINT}")
        echo "  Node $NODE: $HTTP_CODE"
    done
done

Lab 2: Setup HAProxy

Tasks:

Install HAProxy
Configure with Patroni health checks
Test write traffic goes to primary only
Test read traffic distributed to replicas
Trigger failover, verify HAProxy redirects automatically

Lab 3: Create monitoring dashboard

Tasks:

Write Python script to query all nodes
Display cluster topology
Show replication lag
Highlight current primary
Run every 5 seconds

Lab 4: Secure REST API

Tasks:

Enable basic authentication
Generate SSL certificates
Configure HTTPS
Update curl commands to use auth + SSL
Configure firewall rules

9. Troubleshooting REST API

9.1. REST API not responding

Check:

# 1. Verify Patroni is running
sudo systemctl status patroni

# 2. Check if port is listening
sudo netstat -tlnp | grep 8008

# 3. Check firewall
sudo ufw status | grep 8008

# 4. Test locally
curl http://localhost:8008/

# 5. Check logs
sudo journalctl -u patroni -n 50 | grep -i rest

9.2. Wrong HTTP codes returned

Debug:

# Get detailed response
curl -v http://10.0.1.11:8008/primary

# Check PostgreSQL status
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"

# Check Patroni sees correct role
patronictl list

9.3. SSL/TLS errors

Check:

# Verify certificate
openssl x509 -in /etc/patroni/certs/server.crt -text -noout

# Check certificate matches key
openssl x509 -modulus -noout -in server.crt | md5sum
openssl rsa -modulus -noout -in server.key | md5sum
# Should match

# Test SSL connection
openssl s_client -connect 10.0.1.11:8008 -CAfile ca.crt

10. Summary

Key Endpoints Summary

Endpoint	Returns 200 When	Use Case
`/primary`	Node is primary	LB write routing
`/replica`	Node is replica	LB read routing
`/read-write`	Node accepts writes	Write endpoint
`/read-only`	Node is a replica or the primary	Read endpoint
`/health`	PostgreSQL is up and running	Detailed monitoring
`/patroni`	Always (detailed info)	Advanced monitoring
`/cluster`	Always (all members)	Topology view

Integration Checklist

REST API accessible from all nodes
HAProxy configured with health checks
Monitoring system queries REST API
Authentication enabled
SSL/TLS configured (production)
Firewall rules configured
Health check scripts tested

Current architecture

✅ 3 VMs prepared (Lesson 4)
✅ PostgreSQL 18 installed (Lesson 5)
✅ etcd cluster running (Lesson 6)
✅ Patroni installed (Lesson 7)
✅ Patroni configured (Lesson 8)
✅ Cluster bootstrapped (Lesson 9)
✅ Replication configured (Lesson 10)
✅ Callbacks implemented (Lesson 11)
✅ REST API integrated (Lesson 12)

Next: Failover management

Preparing for Lesson 13

Lesson 13 covers failover and switchover:

Automatic failover process
Manual switchover
Failover scenarios and testing
DCS role in leader election
Minimize downtime strategies

DUY TRAN