Bài 23: Patroni Configuration Management

Dynamic configuration changes, DCS-based configuration, sử dụng patronictl edit-config và update config không downtime.

6 min read
XDEV ASIA

Bài 23: Patroni Configuration Management

Mục tiêu

Sau bài học này, bạn sẽ:

  • Manage Patroni configuration dynamically
  • Use patronictl edit-config
  • Understand DCS-stored configuration
  • Perform zero-downtime config changes
  • Validate and rollback configurations

1. Configuration Layers

1.1. Configuration hierarchy

Priority (highest to lowest):
1. PostgreSQL parameters in postgresql.conf (overrides all)
2. DCS configuration (patronictl edit-config)
3. Patroni YAML file (/etc/patroni/patroni.yml)
4. PostgreSQL defaults

Typical workflow:
- Bootstrap config → patroni.yml
- Runtime changes → DCS (patronictl edit-config)
- Local overrides → postgresql.conf (rare, not recommended)

1.2. Configuration scope

# Bootstrap config (patroni.yml) - initial setup only
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    postgresql:
      parameters:
        max_connections: 100
        shared_buffers: 256MB

# Runtime config (DCS) - can be changed anytime
# Stored in etcd/consul/k8s and applied to all nodes

2. View Current Configuration

2.1. Show DCS configuration

patronictl -c /etc/patroni/patroni.yml show-config

# Output:
# loop_wait: 10
# maximum_lag_on_failover: 1048576
# postgresql:
#   parameters:
#     archive_command: 'test ! -f /var/lib/postgresql/wal_archive/%f && cp %p /var/lib/postgresql/wal_archive/%f'
#     archive_mode: 'on'
#     hot_standby: 'on'
#     max_connections: 100
#     max_replication_slots: 10
#     max_wal_senders: 10
#     shared_buffers: 256MB
#     wal_level: replica
#   use_pg_rewind: true
#   use_slots: true
# retry_timeout: 10
# ttl: 30

2.2. Get specific parameter

# Query etcd directly
export ETCDCTL_API=3
etcdctl get /service/postgres-cluster/config --print-value-only | jq .

# Or use patronictl
patronictl -c /etc/patroni/patroni.yml show-config | grep max_connections

2.3. Compare with local config

# Show differences
diff <(patronictl -c /etc/patroni/patroni.yml show-config) \
     <(grep -A 100 "^bootstrap:" /etc/patroni/patroni.yml)

3. Dynamic Configuration Changes

3.1. Edit configuration interactively

# Open editor with current config
patronictl -c /etc/patroni/patroni.yml edit-config

# This opens in $EDITOR (vim/nano)
# Example changes:
# Before:
postgresql:
  parameters:
    max_connections: 100
    shared_buffers: 256MB

# After:
postgresql:
  parameters:
    max_connections: 200  # Changed
    shared_buffers: 512MB  # Changed
    work_mem: 8MB  # Added
# Save and exit
# Patroni will prompt:
# Apply these changes? [y/N]: y
# --- 
# +++ 
# @@ -5,7 +5,8 @@
#  postgresql:
#    parameters:
# -    max_connections: 100
# -    shared_buffers: 256MB
# +    max_connections: 200
# +    shared_buffers: 512MB
# +    work_mem: 8MB
# 
# Configuration changed

3.2. Automatic vs manual restart

PostgreSQL parameters fall into 3 categories:

1. Dynamic (no restart):
   - work_mem, maintenance_work_mem
   - effective_cache_size
   - random_page_cost
   - Apply immediately with pg_reload_conf()

2. Reload required (SIGHUP):
   - max_connections (if increasing)
   - shared_buffers
   - Patroni will reload automatically

3. Restart required:
   - max_connections (if decreasing)
   - shared_buffers (decreasing)
   - wal_level, max_wal_senders
   - Patroni will restart replicas, then switchover and restart leader

3.3. Check pending restart

patronictl -c /etc/patroni/patroni.yml list

# + Cluster: postgres-cluster (7329123456789012345) ---+----+-----------+
# | Member | Host       | Role    | State   | TL | Lag in MB | Pending restart |
# +--------+------------+---------+---------+----+-----------+-----------------+
# | node1  | 10.0.1.11  | Leader  | running |  5 |           | *               |
# | node2  | 10.0.1.12  | Replica | running |  5 |         0 | *               |
# | node3  | 10.0.1.13  | Replica | running |  5 |         0 | *               |
# +--------+------------+---------+---------+----+-----------+-----------------+
#
# * = Pending restart required

3.4. Trigger restart

# Restart specific node
patronictl -c /etc/patroni/patroni.yml restart postgres-cluster node2

# Restart all nodes (one by one)
patronictl -c /etc/patroni/patroni.yml restart postgres-cluster

# Force restart (even if no pending changes)
patronictl -c /etc/patroni/patroni.yml restart postgres-cluster node1 --force

4. Configuration Templates

4.1. Set configuration via command line

# Patch configuration
patronictl -c /etc/patroni/patroni.yml edit-config --apply - <<EOF
postgresql:
  parameters:
    max_connections: 300
    shared_buffers: 1GB
EOF

# Or use --set flag (if supported)
# patronictl edit-config --set postgresql.parameters.max_connections=300

4.2. Save and restore configs

# Export current config
patronictl -c /etc/patroni/patroni.yml show-config > config-backup-$(date +%Y%m%d).yml

# Restore config
patronictl -c /etc/patroni/patroni.yml edit-config --apply config-backup-20241125.yml

4.3. Version control

# Track config changes in git
mkdir -p /opt/patroni/configs
cd /opt/patroni/configs
git init

# Save config
patronictl -c /etc/patroni/patroni.yml show-config > current-config.yml
git add current-config.yml
git commit -m "Increased max_connections to 300"

# View history
git log --oneline
# abc123 Increased max_connections to 300
# def456 Added work_mem parameter
# ghi789 Initial configuration

5. Common Configuration Tasks

5.1. Increase max_connections

patronictl -c /etc/patroni/patroni.yml edit-config
postgresql:
  parameters:
    max_connections: 200  # Change from 100
    
    # May also need to increase:
    shared_buffers: 512MB  # ~25% of RAM
    max_wal_senders: 15    # max_connections / 10
    max_replication_slots: 15

Note: Requires restart if decreasing, reload if increasing within limits.

5.2. Enable query logging

patronictl -c /etc/patroni/patroni.yml edit-config
postgresql:
  parameters:
    log_statement: 'all'  # or 'ddl', 'mod', 'none'
    log_duration: 'on'
    log_min_duration_statement: 1000  # Log queries > 1s

Note: No restart required (dynamic parameter).

5.3. Adjust memory settings

patronictl -c /etc/patroni/patroni.yml edit-config
postgresql:
  parameters:
    shared_buffers: 512MB          # Requires restart
    effective_cache_size: 2GB      # Dynamic
    work_mem: 8MB                  # Dynamic
    maintenance_work_mem: 128MB    # Dynamic

5.4. Tune checkpoint behavior

patronictl -c /etc/patroni/patroni.yml edit-config
postgresql:
  parameters:
    checkpoint_timeout: 15min
    checkpoint_completion_target: 0.9
    max_wal_size: 4GB
    min_wal_size: 1GB

Note: Dynamic or reload, no restart needed.

5.5. Enable pg_stat_statements

patronictl -c /etc/patroni/patroni.yml edit-config
postgresql:
  parameters:
    shared_preload_libraries: 'pg_stat_statements'  # Requires restart!
    pg_stat_statements.track: 'all'
    pg_stat_statements.max: 10000
# After restart, create extension
sudo -u postgres psql -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;"

6. Validation and Testing

6.1. Check parameter values

-- Current settings
SELECT name, setting, unit, context, source
FROM pg_settings
WHERE name IN ('max_connections', 'shared_buffers', 'work_mem');

-- Pending reload
SELECT name, setting, pending_restart
FROM pg_settings
WHERE pending_restart = true;

6.2. Validate configuration

# PostgreSQL validation
sudo -u postgres /usr/lib/postgresql/18/bin/postgres \
  -D /var/lib/postgresql/18/data \
  -C max_connections

# Check for errors
sudo journalctl -u patroni -n 100 --no-pager

6.3. Test configuration change

# 1. Change config
patronictl -c /etc/patroni/patroni.yml edit-config --apply test-config.yml

# 2. Monitor logs
tail -f /var/lib/postgresql/18/data/log/postgresql-*.log

# 3. Check cluster status
watch -n 1 'patronictl -c /etc/patroni/patroni.yml list'

# 4. Verify parameter
psql -h 10.0.1.11 -U postgres -c "SHOW max_connections;"

7. Rollback Procedures

7.1. Immediate rollback

# Restore from backup
patronictl -c /etc/patroni/patroni.yml edit-config --apply config-backup-20241125.yml

# Verify
patronictl -c /etc/patroni/patroni.yml show-config

# Restart if needed
patronictl -c /etc/patroni/patroni.yml restart postgres-cluster

7.2. Emergency recovery

# If DCS is corrupted, reset from local config
# 1. Stop Patroni on all nodes
sudo systemctl stop patroni

# 2. Edit patroni.yml directly
sudo vi /etc/patroni/patroni.yml

# 3. Reinitialize DCS config (on leader only)
patronictl -c /etc/patroni/patroni.yml reinit postgres-cluster node1 --force

# 4. Start Patroni on all nodes
sudo systemctl start patroni

8. Advanced Configuration

8.1. Per-database parameters

-- Set parameter for specific database
ALTER DATABASE myapp SET work_mem = '16MB';

-- Per-user settings
ALTER USER app_user SET statement_timeout = '30s';

Note: These override cluster-wide settings.

8.2. Conditional configuration

# In patroni.yml (local config)
postgresql:
  parameters:
    # Leader-only settings
    synchronous_standby_names: 'node2,node3'
    
  # pg_hba.conf can differ per node
  pg_hba:
    - host replication replicator 10.0.1.0/24 scram-sha-256
    - host all all 10.0.1.0/24 scram-sha-256

8.3. Custom callbacks

# In patroni.yml
postgresql:
  callbacks:
    on_reload: /usr/local/bin/patroni-reload-hook.sh
    on_restart: /usr/local/bin/patroni-restart-hook.sh
    on_role_change: /usr/local/bin/patroni-role-change-hook.sh
#!/bin/bash
# /usr/local/bin/patroni-reload-hook.sh
echo "$(date): PostgreSQL reloaded" >> /var/log/patroni-hooks.log

# Send notification
curl -X POST https://hooks.slack.com/... \
  -d '{"text": "PostgreSQL config reloaded on '$(hostname)'"}'

9. Configuration Best Practices

✅ DO

  1. Use DCS for runtime changes - Consistent across cluster
  2. Version control configs - Track changes in git
  3. Test in staging first - Validate before production
  4. Document changes - Why, what, when
  5. Backup before changes - Easy rollback
  6. Monitor after changes - Watch for issues
  7. Schedule restarts - During maintenance window
  8. Use patronictl edit-config - Not manual etcd changes
  9. Validate parameters - Check pg_settings
  10. Review regularly - Quarterly config audits

❌ DON'T

  1. Don't edit postgresql.conf - Use patronictl instead
  2. Don't change etcd directly - Use Patroni tools
  3. Don't skip backups - Always save before changes
  4. Don't apply untested changes - Test first
  5. Don't ignore pending restart - May not apply correctly
  6. Don't change wal_level lightly - Requires full restart
  7. Don't forget about replicas - Changes apply cluster-wide

10. Configuration Monitoring

10.1. Track configuration drift

# Check if all nodes have same config
for node in node1 node2 node3; do
  echo "=== $node ==="
  ssh $node "sudo -u postgres psql -Atc \"SELECT name, setting FROM pg_settings WHERE name = 'max_connections'\""
done

10.2. Alert on config changes

# Prometheus alert
groups:
  - name: patroni-config
    rules:
      - alert: PatroniConfigChanged
        expr: changes(patroni_config_last_modified[5m]) > 0
        labels:
          severity: info
        annotations:
          summary: "Patroni configuration changed"
      
      - alert: PostgreSQLPendingRestart
        expr: patroni_pending_restart == 1
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "PostgreSQL pending restart for {{ $labels.instance }}"

10.3. Audit log for config changes

# Enable auditd for /etc/patroni/
sudo auditctl -w /etc/patroni/ -p wa -k patroni-config

# View audit logs
sudo ausearch -k patroni-config

# Or use journalctl for Patroni service
sudo journalctl -u patroni --since "1 hour ago" | grep "config"

11. Lab Exercises

Lab 1: Dynamic configuration change

Tasks:

  1. View current configuration
  2. Edit configuration to increase max_connections
  3. Apply changes without restart
  4. Verify new setting
  5. Save configuration backup

Lab 2: Configuration requiring restart

Tasks:

  1. Change shared_buffers parameter
  2. Observe pending restart flag
  3. Perform rolling restart
  4. Verify change applied
  5. Test with pg_settings query

Lab 3: Rollback configuration

Tasks:

  1. Backup current config
  2. Make intentional bad change
  3. Observe cluster behavior
  4. Rollback to backup
  5. Document recovery steps

Lab 4: Configuration automation

Tasks:

  1. Create shell script to apply config
  2. Implement validation checks
  3. Add backup/rollback logic
  4. Test script on cluster
  5. Add to cron for scheduled changes

12. Tổng kết

Configuration Management Flow

1. Backup current config
   ↓
2. Edit configuration (patronictl edit-config)
   ↓
3. Validate changes
   ↓
4. Apply to DCS
   ↓
5. Patroni propagates to all nodes
   ↓
6. Reload or restart as needed
   ↓
7. Verify changes applied
   ↓
8. Monitor cluster health

Key Commands

# View config
patronictl show-config

# Edit config
patronictl edit-config

# Backup config
patronictl show-config > backup.yml

# Restore config
patronictl edit-config --apply backup.yml

# Check pending restart
patronictl list

# Restart node
patronictl restart postgres-cluster node1

Parameter Types

TypeActionExamples
DynamicImmediatework_mem, effective_cache_size
ReloadSIGHUPmax_connections (up), log_statement
RestartFull restartshared_buffers (down), wal_level

Next Steps

Bài 24 sẽ cover Upgrade Strategies:

  • PostgreSQL major version upgrades
  • Patroni version upgrades
  • Zero-downtime upgrade procedures
  • Rollback strategies
  • Testing and validation
configuration-management dynamic-config patronictl dcs zero-downtime lab

Đánh dấu hoàn thành (Bài 23: Patroni Configuration Management)