Backup & Disaster Recovery
Master cross-region backup, disaster recovery strategies, and RPO/RTO planning.
Module: AWS Data Engineering • Topic 34 of 65
DR Strategies
Architecture Diagram
DR STRATEGIES (RPO/RTO), ordered from cheapest to most expensive

1. BACKUP & RESTORE
   RPO: Hours | RTO: Hours | Cost: $
   S3 versioning, snapshots, cross-region replication

2. PILOT LIGHT
   RPO: Minutes | RTO: 10-30 min | Cost: $$
   Minimal infrastructure running, scale up on failover

3. WARM STANDBY
   RPO: Seconds | RTO: <5 min | Cost: $$$
   Scaled-down replica, ready to scale

4. MULTI-SITE ACTIVE-ACTIVE
   RPO: Near zero | RTO: Near zero | Cost: $$$$
   Full infrastructure in multiple regions
Cross-Region S3 Replication
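import boto3

s3 = boto3.client('s3')

# Enable versioning on the source bucket. Replication requires versioning
# on both buckets, so 'data-lake-dr' must already have it enabled too.
s3.put_bucket_versioning(
    Bucket='data-lake-primary',
    VersioningConfiguration={'Status': 'Enabled'}
)

# Configure cross-region replication
s3.put_bucket_replication(
    Bucket='data-lake-primary',
    ReplicationConfiguration={
        'Role': 'arn:aws:iam::123456789012:role/S3ReplicationRole',
        'Rules': [
            {
                'Status': 'Enabled',
                'Priority': 1,
                'Filter': {'Prefix': ''},  # empty prefix matches every object
                # Required whenever a rule uses Filter
                'DeleteMarkerReplication': {'Status': 'Disabled'},
                'Destination': {
                    'Bucket': 'arn:aws:s3:::data-lake-dr',
                    'StorageClass': 'STANDARD_IA',
                    # Replication Time Control (RTC), 15-minute target
                    'ReplicationTime': {
                        'Status': 'Enabled',
                        'Time': {'Minutes': 15}
                    },
                    # RTC must be paired with replication metrics
                    'Metrics': {
                        'Status': 'Enabled',
                        'EventThreshold': {'Minutes': 15}
                    }
                }
            }
        ]
    }
)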
Interview Q&A
Q1: What is RPO and RTO?
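Answer: RPO (Recovery Point Objective) is the maximum amount of data loss the business can tolerate, expressed as time: an RPO of 15 minutes means losing at most the last 15 minutes of writes. RTO (Recovery Time Objective) is the maximum acceptable downtime, measured from the start of the outage until service is restored.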
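To make the two definitions concrete, here is a minimal sketch that computes the data-loss window and downtime actually achieved in an incident or drill, so they can be compared against the targets. The function name, the sample timestamps, and the 15/30-minute targets are illustrative assumptions.

```python
from datetime import datetime, timedelta

def achieved_rpo_rto(last_recoverable, outage_start, service_restored):
    """Return (data-loss window, downtime) for one incident.

    last_recoverable: timestamp of the newest data present in the recovery copy
    outage_start:     when the primary became unavailable
    service_restored: when the workload was serving traffic again
    """
    data_loss = outage_start - last_recoverable  # compare against the RPO
    downtime = service_restored - outage_start   # compare against the RTO
    return data_loss, downtime

# Hypothetical incident: last replicated write at 09:50,
# outage at 10:00, service restored at 10:25.
loss, down = achieved_rpo_rto(
    datetime(2024, 1, 1, 9, 50),
    datetime(2024, 1, 1, 10, 0),
    datetime(2024, 1, 1, 10, 25),
)

# Assumed targets for this example: RPO 15 minutes, RTO 30 minutes.
meets_targets = loss <= timedelta(minutes=15) and down <= timedelta(minutes=30)
```

In this example the data-loss window is 10 minutes and the downtime is 25 minutes, so both assumed targets are met.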
Q2: When should you use Pilot Light vs. Warm Standby?
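Answer: Use Pilot Light for cost-sensitive workloads that can tolerate an RTO of roughly 10-30 minutes: only the core components (such as replicated data stores) stay running, and the rest is provisioned on failover. Use Warm Standby for critical systems that need failover within a few minutes: a scaled-down but fully functional copy is always running and only has to scale up.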
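The trade-off can be sketched as a lookup that returns the cheapest strategy whose worst-case RPO and RTO fit within the business tolerances. The tier order follows the DR strategy list earlier in this topic. The numeric ceilings are illustrative assumptions chosen to match the rough ranges given there ("hours" read as up to a day, "minutes" as 15, "seconds" as 1 minute, multi-site treated as 0), not AWS guarantees.

```python
# (name, worst-case RPO in minutes, worst-case RTO in minutes), cheapest first.
# The ceilings are assumptions for illustration only.
TIERS = [
    ("Backup & Restore", 24 * 60, 24 * 60),
    ("Pilot Light", 15, 30),
    ("Warm Standby", 1, 5),
    ("Multi-Site Active-Active", 0, 0),
]

def cheapest_strategy(rpo_minutes, rto_minutes):
    """Return the first (cheapest) tier that satisfies both tolerances."""
    for name, rpo_ceiling, rto_ceiling in TIERS:
        if rpo_ceiling <= rpo_minutes and rto_ceiling <= rto_minutes:
            return name
    # Nothing cheaper fits, so fall back to the most capable tier.
    return TIERS[-1][0]

# A workload that tolerates 60 minutes of data loss and 45 minutes of downtime
choice = cheapest_strategy(rpo_minutes=60, rto_minutes=45)
```

Scanning from cheapest to most expensive reflects the usual guidance: pay only for the level of readiness the RPO/RTO targets actually require.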
Q3: How does S3 cross-region replication work?
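Answer: S3 Cross-Region Replication (CRR) asynchronously copies newly written objects to a destination bucket in another region. Versioning must be enabled on both buckets, and objects that existed before the rule was created are not replicated by it. With S3 Replication Time Control (RTC) enabled, S3 is designed to replicate nearly all new objects within 15 minutes, and that target is backed by an SLA.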
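Because replication is asynchronous, it can help to check whether a given source object has reached the DR bucket. A minimal sketch: `head_object` on the source object returns a `ReplicationStatus` field when the object is covered by a replication rule. The helper names and the `StubS3` class are hypothetical; the stub only lets the example run without AWS credentials, and in practice you would pass `boto3.client('s3')` instead.

```python
def replication_status(s3_client, bucket, key):
    """Return the replication status reported for a source object, or None."""
    response = s3_client.head_object(Bucket=bucket, Key=key)
    # The field is absent when no replication rule applies to the object.
    return response.get("ReplicationStatus")

def is_replicated(s3_client, bucket, key):
    """True once S3 reports the object as fully replicated."""
    return replication_status(s3_client, bucket, key) in ("COMPLETED", "COMPLETE")

class StubS3:
    """Hypothetical offline stand-in for boto3.client('s3')."""
    def __init__(self, statuses):
        self._statuses = statuses

    def head_object(self, Bucket, Key):
        status = self._statuses.get(Key)
        return {"ReplicationStatus": status} if status else {}

# Object keys below are made up for the example.
stub = StubS3({"raw/a.parquet": "COMPLETED", "raw/b.parquet": "PENDING"})
done = is_replicated(stub, "data-lake-primary", "raw/a.parquet")
pending = is_replicated(stub, "data-lake-primary", "raw/b.parquet")
```

An object that stays `PENDING` well past the expected window, or shows `FAILED`, is a signal that the effective RPO is worse than planned.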
Summary
- RPO/RTO: Define acceptable data loss and downtime
- Strategies: Backup/Restore → Pilot Light → Warm Standby → Multi-Site
- S3 Replication: Cross-region for DR, RTC for 15-min SLA
- Redshift: Cross-region snapshot copy
- Testing: Regular DR drills are essential
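The Redshift item in the summary can be sketched with the `enable_snapshot_copy` call of the boto3 Redshift client, which makes a cluster copy its automated snapshots to a second region. The cluster name, destination region, and 7-day retention are assumptions for illustration. The `StubRedshift` class is a hypothetical stand-in so the example runs offline; in practice you would pass `boto3.client('redshift')`.

```python
def enable_cross_region_snapshots(redshift_client, cluster_id, dr_region,
                                  retention_days=7):
    """Ask Redshift to copy automated snapshots of a cluster to dr_region."""
    return redshift_client.enable_snapshot_copy(
        ClusterIdentifier=cluster_id,
        DestinationRegion=dr_region,
        RetentionPeriod=retention_days,  # days to keep copies in the DR region
    )

class StubRedshift:
    """Hypothetical offline stand-in for boto3.client('redshift')."""
    def __init__(self):
        self.calls = []

    def enable_snapshot_copy(self, **kwargs):
        self.calls.append(kwargs)
        return {"Cluster": {"ClusterIdentifier": kwargs["ClusterIdentifier"]}}

# 'analytics-cluster' and 'us-west-2' are made-up values.
client = StubRedshift()
result = enable_cross_region_snapshots(client, "analytics-cluster", "us-west-2")
```

For clusters encrypted with KMS, a snapshot copy grant in the destination region is also needed; that step is omitted from this sketch.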