Lesson 28: Designing an HA Architecture
Gather requirements, create an architecture design document, perform capacity planning, and estimate costs for a production HA system.
Objectives
After this lesson, you will be able to:
- Gather requirements for an HA cluster design
- Create comprehensive architecture documents
- Perform capacity planning calculations
- Estimate infrastructure costs
- Conduct design reviews effectively
1. Requirements Gathering
1.1. Business requirements template
# PostgreSQL HA Requirements
## Business Context
- Application: [E-commerce platform]
- Users: [50M registered, 2M DAU]
- Business hours: [24/7 global]
- Peak traffic: [Black Friday, 10x normal]
## Availability Requirements
- Target uptime: [99.99% = 52 min downtime/year]
- Planned maintenance windows: [None - zero-downtime required]
- Acceptable downtime per incident: [< 5 minutes]
## Performance Requirements
- Expected QPS: [5,000 average, 15,000 peak]
- Query latency target: [p50: 10ms, p95: 50ms, p99: 200ms]
- Write throughput: [1,000 TPS average, 3,000 TPS peak]
## Data Requirements
- Current data size: [2TB]
- Growth rate: [20% per year]
- Retention period: [7 years for compliance]
- Backup frequency: [Daily full, continuous WAL archiving]
## Disaster Recovery
- RPO (Recovery Point Objective): [< 5 minutes]
- RTO (Recovery Time Objective): [< 15 minutes]
- Geographic redundancy: [Required - multi-region]
## Security & Compliance
- Data encryption: [At rest and in transit]
- Audit logging: [All queries must be logged]
- Compliance: [PCI DSS, GDPR, SOC 2]
## Budget Constraints
- Infrastructure budget: [$10K-15K/month]
- Staffing: [1 Senior DBA, 1 Junior DBA]
- Acceptable cost per transaction: [< $0.0001]
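The cost-per-transaction constraint can be sanity-checked against the throughput and budget figures in the template (a quick sketch using the example values above; the 30-day month is an assumption):

```python
# Cross-check cost per transaction against the budget ceiling.
# Figures come from the requirements template; 30-day month is an assumption.
avg_write_tps = 1000                          # average write transactions/sec
monthly_txns = avg_write_tps * 86_400 * 30    # ~2.59 billion transactions/month
monthly_budget_usd = 15_000                   # upper end of the budget range

cost_per_txn = monthly_budget_usd / monthly_txns
print(f"Cost per transaction: ${cost_per_txn:.7f}")
# Comfortably below the $0.0001 target, even at the top of the budget range
```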
1.2. Technical requirements
# Technical Requirements
## Workload Characteristics
- Read/Write ratio: [80% reads, 20% writes]
- Transaction types:
* OLTP (transactional): 70%
* OLAP (analytical): 30%
- Query complexity: [Mix of simple and complex joins]
- Largest tables: [orders: 500M rows, users: 50M rows]
## Integration Requirements
- Application stack: [Node.js, Python, Java]
- Connection pooling: [Required - PgBouncer]
- Load balancing: [HAProxy or cloud LB]
- Monitoring: [Prometheus + Grafana]
- CI/CD: [GitLab CI]
## Operational Requirements
- Deployment method: [Kubernetes preferred, VMs acceptable]
- Backup storage: [S3-compatible object storage]
- Log aggregation: [ELK or Loki]
- Alerting: [PagerDuty integration]
- Documentation: [Confluence/GitHub Wiki]
2. Architecture Design Document
2.1. High-level architecture
                        Internet
                           ↓
                    [CloudFlare CDN]
                           ↓
            [AWS Application Load Balancer]
                           ↓
         ┌─────────────────┴─────────────────┐
         ↓                                   ↓
    [us-east-1a]                        [us-east-1b]
┌─────────────────┐                ┌─────────────────┐
│   Application   │                │   Application   │
│  Servers (ECS)  │                │  Servers (ECS)  │
└────────┬────────┘                └────────┬────────┘
         ↓                                  ↓
     [HAProxy]                          [HAProxy]
  10.0.1.100:5432                    10.0.2.100:5432
         ↓                                  ↓
   ┌─────┴───────────────┬──────────────────┴──────┐
   ↓                     ↓                         ↓
[PostgreSQL Leader] [PostgreSQL Replica]  [PostgreSQL Replica]
 10.0.10.11          10.0.10.12            10.0.20.11
 r6g.4xlarge         r6g.2xlarge           r6g.2xlarge
 16 vCPU, 128GB      8 vCPU, 64GB          8 vCPU, 64GB
 8TB io2 SSD         6TB io2 SSD           6TB io2 SSD
 20K IOPS            10K IOPS              10K IOPS
 Patroni + etcd      Patroni + etcd        Patroni + etcd

Supporting Services:
├─ PgBouncer (connection pooling) - co-located with PostgreSQL
├─ etcd cluster (3 nodes) - co-located with PostgreSQL
├─ Prometheus (monitoring) - separate t3.large
├─ Grafana (visualization) - separate t3.small
├─ Backup storage - S3 bucket (s3://pg-backups)
└─ WAL archive - S3 bucket (s3://pg-wal-archive)

DR Site (us-west-2):
└─ [PostgreSQL Replica] - async replication from us-east-1
   10.100.1.11
   r6g.xlarge (standby, can scale up on DR activation)
2.2. Network design
VPC: 10.0.0.0/16
Subnets:
├─ Public subnet (us-east-1a): 10.0.1.0/24
│ └─ NAT Gateway, Bastion host
├─ Private subnet (us-east-1a): 10.0.10.0/24
│ └─ PostgreSQL leader, replica1, etcd1
├─ Public subnet (us-east-1b): 10.0.2.0/24
│ └─ NAT Gateway
├─ Private subnet (us-east-1b): 10.0.20.0/24
│ └─ PostgreSQL replica2, etcd2
└─ Private subnet (us-east-1c): 10.0.30.0/24
└─ etcd3
Security Groups:
├─ PostgreSQL SG:
│ ├─ Inbound: 5432 from Application SG
│ ├─ Inbound: 8008 from Monitoring SG (Patroni API)
│ ├─ Inbound: 5432 from PostgreSQL SG (replication)
│ └─ Outbound: All
├─ Application SG:
│ ├─ Inbound: 443 from ALB
│ └─ Outbound: 5432 to PostgreSQL SG
└─ Monitoring SG:
├─ Inbound: 9090 from VPN
└─ Outbound: 8008 to PostgreSQL SG
2.3. Data flow diagram
Write Path:
Client → ALB → Application → HAProxy (master) → PostgreSQL Leader
↓
[Synchronous replication]
↓
PostgreSQL Replica1
↓
[Asynchronous replication]
↓
PostgreSQL Replica2 (AZ2)
↓
[Asynchronous replication]
↓
PostgreSQL DR (us-west-2)
Read Path (80% of traffic):
Client → ALB → Application → HAProxy (replicas) → Round-robin:
├─ PostgreSQL Replica1
├─ PostgreSQL Replica2
└─ PostgreSQL Replica (DR, optional)
Backup Path:
PostgreSQL Leader → WAL archiving → S3 (wal-archive)
↓
Daily pg_basebackup → S3 (backups)
↓
Monthly full backup → S3 Glacier
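The backup tiers in this path can be sized roughly from the data-requirements figures (a sketch; the 30% compression ratio for archived full backups is an assumption, not a measured value):

```python
# Rough sizing of the backup tiers in the diagram above.
# Assumption: compressed monthly fulls at ~30% of raw data size.
data_tb = 2.0
wal_per_day_tb = data_tb * 0.10             # ~10% of data size per day

# Hot tier (S3 Standard): latest full backup + 7 days of WAL
hot_tier_tb = data_tb + wal_per_day_tb * 7

# Cold tier (S3 Glacier): monthly compressed fulls, 7-year retention
compressed_full_tb = data_tb * 0.30
cold_tier_tb = compressed_full_tb * 12 * 7  # 84 monthly backups

print(f"Hot tier: ~{hot_tier_tb:.1f}TB, cold tier: ~{cold_tier_tb:.0f}TB")
```

Under these assumptions the cold tier lands near 50TB, which is consistent with the Glacier figure used in the cost estimate later in this lesson.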
3. Capacity Planning
3.1. Compute capacity
# Capacity planning calculator

# Given requirements:
avg_qps = 5000            # queries per second (average)
peak_qps = 15000          # queries per second (peak)
avg_query_time_ms = 10    # average query latency in milliseconds

# Concurrent connections needed (Little's law: L = arrival rate x latency)
avg_connections = avg_qps * avg_query_time_ms / 1000
peak_connections = peak_qps * avg_query_time_ms / 1000
print(f"Average concurrent connections: {avg_connections:.0f}")
# Output: 50
print(f"Peak concurrent connections: {peak_connections:.0f}")
# Output: 150

# With connection pooling (transaction mode):
pooler_multiplier = 10    # each DB connection serves ~10 app connections
app_connections = peak_connections * pooler_multiplier
db_connections = peak_connections
print(f"Application connections: {app_connections:.0f}")
# Output: 1500
print(f"Database connections (with pooler): {db_connections:.0f}")
# Output: 150

# PostgreSQL configuration:
max_connections = 200     # 150 peak + 50 overhead
3.2. Memory capacity
# PostgreSQL memory calculation
total_ram_gb = 128
max_connections = 200     # from the connection sizing above

# Rule of thumb: 25% of RAM for shared_buffers
shared_buffers_gb = total_ram_gb * 0.25
print(f"shared_buffers: {shared_buffers_gb:.0f}GB")
# Output: 32GB

# effective_cache_size: 50-75% of RAM
effective_cache_size_gb = total_ram_gb * 0.75
print(f"effective_cache_size: {effective_cache_size_gb:.0f}GB")
# Output: 96GB

# work_mem per connection
# Formula: (RAM - shared_buffers) / max_connections / 2
available_ram_gb = total_ram_gb - shared_buffers_gb
work_mem_mb = available_ram_gb * 1024 / max_connections / 2
print(f"work_mem: {work_mem_mb:.0f}MB per connection")
# Output: 246MB per connection

# Validate total memory usage
max_memory_usage_gb = shared_buffers_gb + max_connections * work_mem_mb / 1024
print(f"Maximum memory usage: {max_memory_usage_gb:.1f}GB")
# Output: 80.0GB (within 128GB limit ✅)
3.3. Storage capacity
# Storage planning
# Current data size
current_data_tb = 2 # TB
# Growth rate
annual_growth_rate = 0.20 # 20% per year
years_to_plan = 3
# Projected data size
projected_data_tb = current_data_tb * (1 + annual_growth_rate) ** years_to_plan
print(f"Data size in {years_to_plan} years: {projected_data_tb:.2f}TB")
# Output: 3.46TB
# WAL volume
# Estimate: 10% of data size per day
wal_per_day_gb = (current_data_tb * 1024) * 0.10
wal_retention_days = 7
total_wal_gb = wal_per_day_gb * wal_retention_days
print(f"WAL storage needed (7 days): {total_wal_gb:.0f}GB")
# Output: 1434GB ~ 1.4TB
# Backup storage
# Full backup + 7 days of WAL
backup_storage_tb = projected_data_tb + (total_wal_gb / 1024)
print(f"Backup storage needed: {backup_storage_tb:.2f}TB")
# Output: 4.86TB
# Total storage per instance
# Data + WAL + temp + overhead (20%)
storage_per_instance_tb = (projected_data_tb + (total_wal_gb / 1024)) * 1.20
print(f"Storage per instance: {storage_per_instance_tb:.2f}TB")
# Output: 5.83TB ~ 6TB
# Recommend: Provision 8TB for growth buffer
3.4. IOPS calculation
# IOPS requirements

# Given:
write_tps = 3000              # write transactions per second (peak)
ios_per_read_query = 4        # average disk reads per read query
checkpoint_interval_sec = 300 # checkpoint_timeout = 5 minutes

# Write IOPS: one WAL write per transaction, plus data-page writes
# amortized over the checkpoint cycle (rough 1/60 amortization factor)
wal_iops = write_tps
checkpoint_iops = write_tps / (checkpoint_interval_sec / 5)
total_write_iops = wal_iops + checkpoint_iops
print(f"Write IOPS: {total_write_iops:.0f}")
# Output: 3050

# Read IOPS (80% read / 20% write ratio)
total_tps = write_tps / 0.20  # total transactions = write TPS / write share
read_tps = total_tps * 0.80
read_iops = read_tps * ios_per_read_query
print(f"Read IOPS: {read_iops:.0f}")
# Output: 48000

# Total IOPS
total_iops = total_write_iops + read_iops
print(f"Total IOPS required: {total_iops:.0f}")
# Output: 51050

# Recommendation: provision 60K IOPS across the cluster for headroom
# (reads are served by the replicas, so per-volume needs are lower)
# AWS io2 SSD: $0.065/IOPS/month
iops_provisioned = 60000
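The IOPS figures translate directly into monthly EBS cost at the $0.065/IOPS-month rate quoted above (a sketch; verify against current AWS io2 pricing, which varies by region and tier):

```python
# io2 provisioned-IOPS cost at the $0.065/IOPS-month rate used above.
# Assumption: flat rate; actual io2 pricing is tiered above 32K IOPS.
PRICE_PER_IOPS_MONTH = 0.065

for label, iops in [("replica volume (10K)", 10_000),
                    ("leader volume (20K)", 20_000),
                    ("full headroom (60K)", 60_000)]:
    print(f"{label}: ${iops * PRICE_PER_IOPS_MONTH:,.0f}/month")
```

The 10K and 20K rows match the per-volume IOPS line items in the cost estimate of section 4.1.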
4. Cost Estimation
4.1. AWS infrastructure costs
Compute (EC2):
├─ Leader: r6g.4xlarge (16 vCPU, 128GB RAM)
│ └─ $0.672/hour x 730 hours = $490/month
├─ Replica1: r6g.2xlarge (8 vCPU, 64GB RAM)
│ └─ $0.336/hour x 730 hours = $245/month
├─ Replica2: r6g.2xlarge (8 vCPU, 64GB RAM)
│ └─ $0.336/hour x 730 hours = $245/month
├─ DR Replica: r6g.xlarge (4 vCPU, 32GB RAM)
│ └─ $0.168/hour x 730 hours = $123/month
└─ Total compute: $1,103/month
Storage (EBS io2):
├─ Leader: 8TB @ $0.125/GB = $1,000/month
│ └─ IOPS: 20K @ $0.065/IOPS = $1,300/month
├─ Replica1: 6TB @ $0.125/GB = $750/month
│ └─ IOPS: 10K @ $0.065/IOPS = $650/month
├─ Replica2: 6TB @ $0.125/GB = $750/month
│ └─ IOPS: 10K @ $0.065/IOPS = $650/month
├─ DR: 6TB @ $0.125/GB = $750/month
│ └─ IOPS: 5K @ $0.065/IOPS = $325/month
└─ Total storage: $6,175/month
Backup Storage (S3):
├─ S3 Standard (30 days): 5TB @ $0.023/GB = $115/month
├─ S3 Glacier (7 years): 50TB @ $0.004/GB = $200/month
└─ Total backup storage: $315/month
Network:
├─ Data transfer out: 2TB @ $0.09/GB = $180/month
├─ Inter-AZ transfer: 5TB @ $0.01/GB = $50/month
└─ Total network: $230/month
Supporting Services:
├─ Application Load Balancer: $23/month
├─ NAT Gateway (2): $65/month
├─ Monitoring (Prometheus, Grafana): $100/month
└─ Total supporting: $188/month
Grand Total: $8,011/month
Reserved Instance Savings (1-year):
- Compute: $1,103 → $770 (30% savings)
- Revised total: $7,678/month
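As a cross-check, the line items above roll up to the stated totals (a sketch using the estimates exactly as given):

```python
# Roll up the monthly line items from section 4.1.
compute = 490 + 245 + 245 + 123                            # EC2 instances
storage = 1000 + 1300 + 750 + 650 + 750 + 650 + 750 + 325  # EBS io2 (GB + IOPS)
backup  = 115 + 200                                        # S3 Standard + Glacier
network = 180 + 50                                         # egress + inter-AZ
support = 23 + 65 + 100                                    # ALB, NAT, monitoring

grand_total = compute + storage + backup + network + support
ri_total = grand_total - compute + 770                     # 1-year RI on compute

print(f"Grand total: ${grand_total:,}/month")    # $8,011/month
print(f"With 1-year RIs: ${ri_total:,}/month")   # $7,678/month
```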
4.2. Cost optimization opportunities
Potential Savings:
1. Use gp3 instead of io2 for non-critical replicas
Savings: ~$2,000/month
2. Use Spot instances for DR replica
Savings: ~$90/month
3. Lifecycle S3 backups to Glacier faster (7 days vs 30 days)
Savings: ~$50/month
4. Reduce DR replica size (only scale up during DR)
Savings: ~$60/month
5. Use AWS Savings Plans
Additional 10-15% savings: ~$750/month
Optimized Total: ~$4,728/month (41% savings)
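The optimized figure follows from applying the five savings to the RI-adjusted total, with the percentage measured against the $8,011 list price:

```python
# Apply the optimization savings to the RI-adjusted monthly total.
ri_total = 7678                        # from section 4.1 (1-year RIs)
list_total = 8011                      # pre-RI grand total
savings = [2000, 90, 50, 60, 750]      # the five items above

optimized = ri_total - sum(savings)
savings_pct = (1 - optimized / list_total) * 100
print(f"Optimized total: ${optimized:,}/month ({savings_pct:.0f}% savings)")
# Output: Optimized total: $4,728/month (41% savings)
```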
5. Design Review Process
5.1. Design review checklist
☐ Requirements Review
☐ Business requirements documented
☐ Technical requirements clear
☐ Non-functional requirements (performance, security)
☐ Constraints identified (budget, timeline)
☐ Architecture Review
☐ High-level diagram created
☐ Component responsibilities defined
☐ Data flow documented
☐ Network topology validated
☐ Security controls identified
☐ Capacity Planning
☐ Compute resources sized appropriately
☐ Storage capacity calculated
☐ IOPS requirements met
☐ Network bandwidth sufficient
☐ Growth projections considered
☐ High Availability
☐ SPoF (Single Points of Failure) eliminated
☐ Failover mechanisms tested
☐ RTO/RPO targets achievable
☐ DR plan documented
☐ Backup/restore validated
☐ Performance
☐ Query performance tested
☐ Load testing completed
☐ Bottlenecks identified
☐ Tuning recommendations documented
☐ Monitoring and alerting in place
☐ Security
☐ Encryption at rest and in transit
☐ Network segmentation
☐ Access controls (IAM, RBAC)
☐ Audit logging enabled
☐ Compliance requirements met
☐ Operational Readiness
☐ Runbooks created
☐ Monitoring dashboards configured
☐ Alerting rules defined
☐ On-call rotation established
☐ Training completed
☐ Cost Management
☐ Cost estimation completed
☐ Budget approved
☐ Cost optimization opportunities identified
☐ Ongoing cost monitoring plan
5.2. Review meeting agenda
# Architecture Design Review Meeting
Date: 2024-11-25
Duration: 90 minutes
Attendees: DBA team, SRE, Dev team, Product manager, Security
Agenda:
1. Introduction (5 min)
- Project overview
- Review objectives
2. Requirements Review (10 min)
- Business requirements
- Q&A
3. Architecture Presentation (30 min)
- High-level design
- Component details
- Data flow
- Network design
4. Capacity Planning (15 min)
- Compute, storage, IOPS calculations
- Growth projections
5. Cost Estimation (10 min)
- Infrastructure costs
- Optimization opportunities
6. Security & Compliance (10 min)
- Security controls
- Compliance mapping
7. Open Discussion (10 min)
- Concerns and risks
- Alternative approaches
8. Action Items & Next Steps (5 min)
- Assign owners
- Set deadlines
Follow-up:
- Circulate meeting notes within 24 hours
- Address action items within 1 week
- Final approval from stakeholders
6. Risk Assessment
6.1. Risk matrix
| Risk | Likelihood | Impact | Severity | Mitigation |
|---|---|---|---|---|
| Leader node failure | Medium | Low | Medium | Automatic failover with Patroni |
| Datacenter outage | Low | High | Medium | Multi-AZ deployment + DR site |
| Data corruption | Low | High | Medium | PITR backups, checksums enabled |
| Capacity exhaustion | Medium | Medium | Medium | Monitoring + auto-scaling |
| Security breach | Low | Critical | High | Encryption, network segmentation, audit logs |
| Cost overrun | Medium | Medium | Medium | Budget alerts, cost optimization |
| Staff turnover | High | Medium | Medium | Documentation, cross-training |
| Vendor lock-in | Low | Medium | Low | Use open-source tools (Patroni vs RDS) |
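One common convention scores severity as likelihood x impact on a 1-4 ordinal scale. The sketch below implements that rule; note that the matrix above also factors in existing mitigations and judgment, so it will not match this formula row-for-row:

```python
# Minimal severity scoring helper (a sketch of one common convention,
# not the rule used to fill in the matrix above).
LEVELS = {"Low": 1, "Medium": 2, "High": 3, "Critical": 4}

def severity(likelihood: str, impact: str) -> str:
    """Map likelihood x impact to a severity band."""
    score = LEVELS[likelihood] * LEVELS[impact]
    if score >= 8:
        return "High"
    if score >= 3:
        return "Medium"
    return "Low"

print(severity("Medium", "Medium"))  # Medium
print(severity("High", "Critical"))  # High
```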
6.2. Mitigation strategies
1. Leader node failure
- Patroni automatic failover (RTO < 30s)
- Health checks every 10s
- Synchronous replication to 1 replica
- Runbook for manual intervention
2. Datacenter outage
- Multi-AZ deployment (2 AZs in primary region)
- DR site in different region (us-west-2)
- Quarterly DR drills
- Documented failover procedures
3. Data corruption
- pg_checksums enabled
- Daily full backups + continuous WAL archiving
- PITR tested monthly
- Backup retention: 30 days hot, 7 years cold
4. Capacity exhaustion
- Prometheus alerts at 70% CPU/memory/disk
- PgBouncer for connection management
- Read replicas for horizontal scaling
- Annual capacity planning review
5. Security breach
- Encryption at rest (LUKS) and in transit (SSL/TLS)
- Network segmentation (private subnets)
- MFA for all admin access
- Quarterly security audits
- Intrusion detection system (IDS)
6. Cost overrun
- AWS Budgets with alerts at 80%, 100%, 120%
- Monthly cost review meetings
- Reserved instances for predictable workloads
- Automatic shutdown of non-production environments
7. Staff turnover
- Comprehensive documentation (Confluence)
- Runbooks for common tasks
- Cross-training program
- Bus factor > 2 for critical knowledge
7. Lab Exercises
Lab 1: Requirements gathering
Tasks:
- Interview stakeholders (role-play)
- Document business requirements
- Define technical requirements
- Identify constraints
- Create requirements document
Lab 2: Architecture design
Tasks:
- Create high-level architecture diagram
- Design network topology
- Document data flow
- Define security controls
- Present to team for review
Lab 3: Capacity planning
Tasks:
- Calculate compute requirements
- Estimate storage needs
- Determine IOPS requirements
- Plan for 3-year growth
- Document assumptions
Lab 4: Cost estimation
Tasks:
- Price out infrastructure on AWS/GCP/Azure
- Compare managed vs self-hosted options
- Identify cost optimization opportunities
- Create budget proposal
- Present to management
8. Summary
Design Principles
1. Simplicity: Start simple, add complexity as needed
2. Resilience: Eliminate single points of failure
3. Scalability: Plan for 3x growth
4. Security: Defense in depth
5. Observability: Monitor everything
6. Cost-effectiveness: Optimize for cost/performance ratio
7. Maintainability: Document and automate
Key Deliverables
1. Requirements document
2. Architecture diagrams
3. Capacity planning spreadsheet
4. Cost estimation
5. Risk assessment
6. Design review presentation
7. Runbooks and documentation
Next Steps
Lesson 29 will cover deploying a production-ready cluster:
- Complete end-to-end deployment guide
- Production deployment checklist
- Operational runbooks
- Knowledge transfer
- Final assessment