This presentation explores the following architecture blueprints for achieving High Availability in Amazon Web Services (AWS):
Blueprint 1: How to achieve High Availability across AWS regions?
Blueprint 2: How to achieve High Availability across AWS Availability Zones (AZs)?
Architecture diagrams, explanations, and the positives and negatives of both blueprints are explored in this presentation.
Architecture Blueprints for achieving High Availability in AWS
1. Architecture Blueprints for achieving High Availability in AWS
Harish Ganesan
in.linkedin.com/in/harishganesan
Harish11g.AWS@gmail.com
2. Agenda
A fault tolerant environment has no service interruption but a significantly higher cost, while a highly available environment has minimal service interruption.
• Across Clouds/DC
• Across Regions
• Inside Region
3. Availability Zone = distinct physical location with low-latency network connectivity and independent power, cooling, network and security
4. AWS Building Blocks
AWS building blocks are inherently fault tolerant, highly available, scalable and elastic.
(Diagram: the tiers below, deployed across AZ:1 and AZ:2)
Tier               AWS building block
DNS                Route 53
Load Balancer      ELB
CDN                CloudFront
Web Tier           AMI + EC2
Application Tier   AMI + EC2 / BeanStalk
Cache Tier         ElastiCache
Search Tier        CloudSearch *
Storage Tier       S3
NoSQL Tier         DynamoDB
Database Tier      RDS + Multi-AZ
Monitoring Tier    CloudWatch
6. HA @ LB/Web/App Tier
Pattern 1: DNS + LB + Web/App
(Diagram: Amazon Route 53 (DNS RR) -> Load Balancer -> Web/App Servers)
Pattern 2: DNS + ELB + Web/App
(Diagram: Amazon Route 53 -> Elastic Load Balancer -> Web/App Servers)
• ELB or the load balancer can quickly shift traffic to healthy Web/App EC2 instances
• ELB and Auto Scaling can work across multiple AZs inside a Region
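The traffic-shifting behaviour in Patterns 1 and 2 can be illustrated with a toy round-robin balancer that skips unhealthy servers. This is only a sketch: the `LoadBalancer` class and the server names are hypothetical, and a real ELB relies on its own health checks rather than manual `mark_down` calls.

```python
from itertools import cycle


class LoadBalancer:
    """Toy round-robin balancer that skips servers marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)

    def mark_down(self, server):
        # In practice a failed health check would trigger this.
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def route(self):
        # Look at each server at most once per request.
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy Web/App servers available")


lb = LoadBalancer(["web-az1", "web-az2"])
lb.mark_down("web-az1")
print([lb.route() for _ in range(3)])  # ['web-az2', 'web-az2', 'web-az2']
```

Spreading the server list across two AZs is what lets the same logic survive the loss of a whole zone.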
7. HA and Web Session Synchronization
Pattern 3: Sync using JGroups
(Diagram: Web/App Server cluster synchronization; currently TCP unicast supported)
Pattern 4: Sync using ElastiCache / MemCacheD
(Diagram: Web/App Server -> ElastiCache / MemCacheD)
Pattern 5: Sync using RDS / DB / Amazon DynamoDB
(Diagram: Web/App Server -> Database)
Session synchronization is needed to make the app servers stateless.
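A minimal sketch of the idea behind Patterns 4 and 5: session data lives in a shared store, so any Web/App server can serve any request. The `SessionStore` below is an in-memory stand-in (an assumption for illustration); in the patterns above it would be ElastiCache / MemCacheD, RDS or DynamoDB.

```python
import copy


class SessionStore:
    """In-memory stand-in for a shared cache or database."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return a copy so callers must write changes back explicitly.
        return copy.deepcopy(self._data.get(session_id, {}))

    def put(self, session_id, session):
        self._data[session_id] = copy.deepcopy(session)


class AppServer:
    """Keeps no session state of its own; everything goes to the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        session.setdefault("cart", []).append(item)
        self.store.put(session_id, session)
        return session["cart"]


store = SessionStore()
server_a = AppServer("app-az1", store)
server_b = AppServer("app-az2", store)
server_a.add_to_cart("session-1", "book")
print(server_b.add_to_cart("session-1", "pen"))  # ['book', 'pen']
```

If `app-az1` fails after the first call, the second request still sees the full cart because nothing was held on the failed server.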
8. HA based on Elastic IP
(Diagram: EIP 23.23.174.255 detached from Web/App EC2 A and remapped to Web/App EC2 B)
1. Amazon Elastic IPs are public IPs and are fixed
2. An Elastic IP can be attached to / detached from EC2 instances
3. An Elastic IP can be remapped to a healthy EC2 instance
4. Elastic IP remapping takes ~180 seconds
When EC2 A is down, we can remap the same Elastic IP to EC2 B and reroute the traffic.
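The remap step can be sketched as a small failover routine. `FakeEc2` is a hypothetical in-memory stand-in for the EC2 API, and the instance names are placeholders; a real script would call the EC2 API to check health and reassociate the address, and would have to allow for the remapping delay noted above.

```python
class FakeEc2:
    """Hypothetical in-memory stand-in for the EC2 API."""

    def __init__(self, health, eip_owner):
        self.health = dict(health)        # instance id -> is it healthy?
        self.eip_owner = dict(eip_owner)  # Elastic IP -> instance id

    def is_healthy(self, instance_id):
        return self.health.get(instance_id, False)

    def remap(self, eip, instance_id):
        self.eip_owner[eip] = instance_id


def failover_eip(ec2, eip, standby):
    """Move the Elastic IP to the standby if its current owner is down."""
    current = ec2.eip_owner[eip]
    if ec2.is_healthy(current):
        return current  # nothing to do
    if not ec2.is_healthy(standby):
        raise RuntimeError("standby is not healthy; cannot remap")
    ec2.remap(eip, standby)
    return standby


ec2 = FakeEc2({"ec2-a": False, "ec2-b": True}, {"23.23.174.255": "ec2-a"})
print(failover_eip(ec2, "23.23.174.255", standby="ec2-b"))  # ec2-b
```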
9. HA and impact on Logs
(Diagram: ELB and Amazon CloudFront CDN in front of auto-scaled Web/App EC2 instances, with S3, Glacier and the database behind them)
1. Sync all the logs to S3 periodically
2. Sync all the user-uploaded data (PDFs, images, videos, etc.) to S3
3. S3 replicates data at multiple locations inside a Region
4. Move older archives to Amazon Glacier
5. Amazon S3 and Glacier are very robust services for storage
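As a rough illustration of point 4, the sketch below splits log archives into those that stay in S3 and those old enough to move to Glacier. The 90-day threshold, the function name and the file names are assumptions for the example, not values from the slides.

```python
from datetime import date, timedelta


def split_by_age(log_dates, today, archive_after_days=90):
    """Return (keep_in_s3, move_to_glacier) lists of log names."""
    cutoff = today - timedelta(days=archive_after_days)
    keep, archive = [], []
    for name, written in sorted(log_dates.items()):
        # Anything written before the cutoff is treated as an old archive.
        (archive if written < cutoff else keep).append(name)
    return keep, archive


logs = {
    "web-2012-01.log.gz": date(2012, 1, 31),
    "web-2012-11.log.gz": date(2012, 11, 30),
    "web-2012-12.log.gz": date(2012, 12, 31),
}
keep, archive = split_by_age(logs, today=date(2013, 1, 15))
print(keep)     # ['web-2012-11.log.gz', 'web-2012-12.log.gz']
print(archive)  # ['web-2012-01.log.gz']
```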
10. HA @ Database Tier
Pattern 8: Web/App + Database Replication
(Diagram: Web/App Server -> Master DB -> asynchronous replication -> Slave DB)
Pattern 9: Web/App + Database Cluster / Mirroring / M-M
(Diagram: Web/App Server -> DB Node-1 and DB Node-2, clustering / M-M)
Pattern 10: Web/App + RDS Multi-AZ
(Diagram: Web/App Server -> RDS with synchronous replication)
11. HA Pattern ->Inside Region
(Diagram: US East Region with Availability Zone 1 and Availability Zone 2)
• Clients: Smart Phone, Pad / Tab, PC
• Edge: Amazon Route 53, Amazon CloudFront CDN, Elastic Load Balancer
• In each AZ: Web / App EC2 (Auto Scaling), Cache Nodes, Search Nodes
• Database: RDS MySQL Master, RDS Hot Standby, RDS Read Replicas
• Amazon DynamoDB (Read: 10K, Write: 5K)
• S3, CloudWatch, AWS Management Console
12. Points to note
• Latencies between AZs vary
• Frequently evaluate -> adapt -> automate
• Bigger EC2 instance types have better I/O performance, which affects replication lag
• Data charges apply between AZs
• EBS volumes are AZ-specific -> take snapshots to use them in other AZs
13. Points to note
• RDS hot standby takes ~3 minutes for RDS Master elevation in the event of a failure
• Have RDS Read Replicas in multiple AZs for HA
• Leverage RDS Read Replica elevation
• Deployment & monitoring challenges remain in an auto-scaled environment
15. Work Load
• Reserve your capacity – RI (Reserved Instances)
• Software licenses that depend on MAC address / ENI
• Deployment practices
16. Deployment Challenges and Practices
(Diagram: Amazon Route 53 in front of EC2 instances in USA and Europe; each region has its own AMI (EBS & S3 backed), with Chef and Cloud Formation (CF) used for deployment)
1. AMIs (S3 & EBS backed) have EC2 regional scope. They need to be created if not present in that region
2. EBS has AZ scope
3. Automated deployment using Chef or Puppet (recommended)
4. RightScale templates will ease the complexity
17. Data
• Regulatory impact when data is geographically distributed across continents
• Data synchronization patterns
• Other data challenges
18. Data Synchronization Patterns
Pattern 12: RDS Replication
(Diagram: USA and Europe regions)
1. RDS provides Multi-AZ standby
2. RDS currently does not provide sync across Amazon EC2 regions
19. Data Synchronization Patterns
Pattern 13: MySQL M-S Replication
(Diagram: MySQL servers in public subnets in USA and Europe, linked over SSL)
1. MySQL M-SSS unidirectional replication between regions (secured through SSL)
2. EIP is mandatory
3. Easy and widely used pattern
Pattern 14: MySQL M-S Replication
(Diagram: MySQL servers in private subnets in USA-VPC and Europe-VPC, linked over VPN)
1. IPsec VPN tunnel between the Amazon VPCs in the two EC2 regions
2. Highly secured
VPC = Amazon Virtual Private Cloud is a private, isolated section of the AWS Cloud
where you can launch resources in a virtual network
20. Data Synchronization Patterns
Pattern 15: MySQL M-M Replication
(Diagram: MySQL servers in public subnets in USA and Europe, linked over SSL)
1. MySQL M-M bidirectional replication between regions (secured through SSL)
2. EIP is mandatory
Pattern 16: MySQL M-M Replication
(Diagram: MySQL servers in private subnets in USA-VPC and Europe-VPC, linked over VPN)
1. IPsec VPN tunnel between the Amazon VPCs in the two EC2 regions
2. Elevation is already taken care of
3. RTO/RPO will be better compared to other patterns
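One practical concern with the M-M patterns (15 and 16) is that both masters accept writes, so auto-generated primary keys must not collide. The slides do not cover this; the sketch below shows a common approach, assumed here for illustration, using MySQL's `auto_increment_increment` and `auto_increment_offset` server variables. The helper functions and region names are hypothetical.

```python
def auto_increment_settings(master_names):
    """Give each master the same step and a distinct starting offset."""
    step = len(master_names)
    return {
        name: {"auto_increment_increment": step, "auto_increment_offset": offset}
        for offset, name in enumerate(master_names, start=1)
    }


def generated_ids(settings, count):
    """Ids a master with these settings would hand out on an empty table."""
    step = settings["auto_increment_increment"]
    start = settings["auto_increment_offset"]
    return [start + k * step for k in range(count)]


settings = auto_increment_settings(["usa", "europe"])
print(generated_ids(settings["usa"], 3))     # [1, 3, 5]
print(generated_ids(settings["europe"], 3))  # [2, 4, 6]
```

With two masters, one generates odd ids and the other even ids, so rows inserted in either region replicate to the other without key conflicts.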
21. Data Challenges
• S3 is accessible from another region, but:
  • Latency?
  • Data charges?
• S3 programmatic replication across regions (recommended)
• Distribute media & static content through CloudFront CDN
• Cache replication across regions – not recommended
• Cache warming inside regions suggested
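Programmatic S3 replication across regions comes down to finding which objects the destination bucket lacks or holds in an outdated version. The sketch below assumes each bucket listing is available as a mapping from object key to a checksum; the function name and the sample keys are hypothetical, and the actual listing and copy calls are left out.

```python
def plan_replication(source_objects, destination_objects):
    """Return the keys to copy: missing in the destination or different there."""
    return sorted(
        key
        for key, checksum in source_objects.items()
        if destination_objects.get(key) != checksum
    )


usa_bucket = {"img/logo.png": "a1", "docs/guide.pdf": "b2", "video/intro.mp4": "c3"}
europe_bucket = {"img/logo.png": "a1", "docs/guide.pdf": "old"}
print(plan_replication(usa_bucket, europe_bucket))
# ['docs/guide.pdf', 'video/intro.mp4']
```

Copying only the difference keeps the inter-region data charges mentioned above to what is strictly needed.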
22. Network
• NTP sync the regions involved
• Monitoring – use multiple levels
  • CloudWatch, Nagios, Ganglia, Pingdom
  • NewRelic
• LBR vs DDR (Latency Based Routing vs Directional DNS Routing)
• Uniform ElastiCache cluster names
• Replication through NAT / IPsec
• HA of IPsec / NAT layers
• Avoid EIP/ENI hardcoding in 3rd party services
23. Latency Based Routing vs Directional DNS Routing
Pattern 17: Latency Based Routing (LBR)
(Diagram: Amazon Route 53 -> Active ELB + EC2 in USA and Active ELB + EC2 in Europe)
1. ELB has regional scope
2. LBR is suitable for an Active-Active setup
3. LBR might need bidirectional data sync depending upon the use case
Pattern 18: Directional DNS Routing
(Diagram: Akamai / UltraDNS -> Active ELB + EC2 in USA and Passive ELB + EC2 in Europe)
1. For an Active-Passive setup, Directional DNS is preferred
2. Akamai, UltraDNS can do the job
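The difference between the two setups can be sketched as two small selection functions. This is a simplification under stated assumptions: the latency figures, region names and function names are made up, and real DNS services make these decisions with their own measurements and health checks.

```python
def latency_based(latencies_ms, healthy):
    """Active-Active: pick the healthy region with the lowest latency."""
    candidates = {r: ms for r, ms in latencies_ms.items() if r in healthy}
    if not candidates:
        raise RuntimeError("no healthy region")
    return min(candidates, key=candidates.get)


def active_passive(active, passive, healthy):
    """Active-Passive: use the passive region only when the active one is down."""
    if active in healthy:
        return active
    if passive in healthy:
        return passive
    raise RuntimeError("no healthy region")


client_latency = {"usa": 40, "europe": 120}  # hypothetical measurements
print(latency_based(client_latency, healthy={"usa", "europe"}))  # usa
print(latency_based(client_latency, healthy={"europe"}))         # europe
print(active_passive("usa", "europe", healthy={"europe"}))       # europe
```

In the Active-Active case both regions take writes at the same time, which is why Pattern 17 may need bidirectional data sync.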
24. Replication & HA of NAT/ IPSEC Layers
Pattern 19: Replication through NAT
(Diagram: servers in private subnets in USA-VPC and Europe-VPC)
1. Currently the VPC-VPN NAT EC2 instance can be a SPOF
2. NAT cannot take heavy data load
VPC = Amazon Virtual Private Cloud
Pattern 20: IPsec VPN tunnel with HA
(Diagram: servers in private subnets in USA-VPC and Europe-VPC)
1. HA @ IPsec VPN tunnel layer
2. Active-Active or A-P depending upon RTO, RPO and Data
25. Avoid EIP/ENI hardcoding
(Diagram: Amazon Route 53 -> EC2 in USA (EIP: 23.23.174.255) and EC2 in Europe (EIP: 50.19.82.183); Hardware, FTP client and Others point at the EIPs)
1. EIP and ENI have Amazon EC2 regional scope
2. FTP clients and custom hardware in customers' corporate datacenters that point to an EIP need to be remapped
26. Internet
(Diagram: USA and Europe/MEA users reach the application over the Internet through Amazon Route 53 (LBR / Directional DNS) and CloudFront CDN)
• Each region (USA, Europe): Elastic Load Balancer, Auto Scaling group of Apache EC2, MemCacheD, MySQL-M and MySQL-S across Availability Zone 1A and 1B
• Data sync between AWS regions
• Solr
• Other common services (REST, SOAP)
28. Point to note
• Most of the points / challenges / patterns mentioned in the previous architectures apply to this architecture as well
29. AWS and Corporate Datacenter
Pattern 22
(Diagram: Amazon Route 53 DNS with Active / Passive sites; EC2 in Amazon Web Services (USA) connected through Direct Connect / VPN to a private cloud in the corporate data center (USA))
Pattern 23
(Diagram: Amazon Route 53 DNS with Active / Passive sites; EC2 in Amazon Web Services (USA) connected through Direct Connect / VPN to a private cloud in the corporate data center (USA))
1. CloudStack or Eucalyptus in the DC: better integration, compatible and interoperable
2. AWS Direct Connect: 1 Gbps – 10 Gbps connectivity
3. Direct Connect provides improved RTO/RPO through a private network
4. (or) VPN connectivity between AWS and the DC over the internet
30. Across Public Clouds / DC
Pattern 24
(Diagram: Amazon Route 53 DNS -> EC2 in Amazon Web Services (USA) and VMs in Terremark / Rackspace / Azure / Others (USA))
1. Not a very mature pattern (currently)
2. Not all providers are capable of providing fixed IPs etc.
3. Compatibility challenges exist – VM, NW, CPU, Data
4. No standards – automation scripts and APIs are different
5. Multi-cloud provisioning, unified management, API – RightScale, EnStratus will ease your effort
31. Need help in architecting Highly Available solutions on AWS?
32. Leave it to the experts, we will handle this
Cloud Architecture Consulting
Cloud Application Development
Cloud Migration & Implementation
Cloud Adoption Strategy
“Let's get the job done”