Under the Hood of Amazon Route 53 (ARC408-R1) - AWS re:Invent 2018
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Under the Hood of Route 53
Gavin McCullagh
System Development Engineer
Amazon Route 53
ARC408
Alec Peterson
General Manager
Amazon Route 53
Agenda
Design for high availability: Amazon Route 53 public DNS data plane
Redundancy, redundancy, redundancy
Blast radius reduction
Customer isolation
Constant work (maybe)
Agenda
Goal: A good discussion about design patterns
Questions, discussions, debates are welcome
Really
Definitions
PoP (point of presence): basic data center footprint. Multiple DNS servers, often co-located.
Data plane: the DNS service that answers queries. Consists of many PoPs.
Control plane: the Web API that accepts calls to create and update zones and records.
Blast radius: the scope of impact when a problem occurs.
Eyeball/transit: networks hosting clients vs. interconnecting transit providers.
Availability SLAs
SLA = service level agreement
99.9% SLA allows downtime of:
1 min 26.4 sec per day
43 min 49.7 sec per month
8 hours 45 min 57.0 sec per year
Availability SLAs
99.99% SLA allows downtime of:
8.6 sec per day
4 min 23 sec per month
52 min 35.7 sec per year
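The downtime budgets above follow from simple arithmetic. The sketch below reproduces them, assuming a mean year of 365.2425 days and a month of one twelfth of that (an assumption that happens to match the slide figures); the function name is illustrative only.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.2425  # assumption: mean Gregorian year


def allowed_downtime_seconds(sla_percent: float) -> dict:
    """Return the downtime budget in seconds per day, month, and year."""
    unavailable_fraction = 1 - sla_percent / 100
    per_day = unavailable_fraction * SECONDS_PER_DAY
    per_year = per_day * DAYS_PER_YEAR
    return {"day": per_day, "month": per_year / 12, "year": per_year}


for sla in (99.9, 99.99, 100.0):
    budget = allowed_downtime_seconds(sla)
    print(sla, {period: round(seconds, 1) for period, seconds in budget.items()})
```

A 100% SLA leaves a budget of zero seconds in every period, which is the point of the next slide.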
Availability SLAs
100% SLA
Makes the math 100% easier
Why 100%?
Every 99.99% SLA service depends on DNS
What about DNS caching?
Suppose 300 resolvers cache a record with a TTL of 60 seconds
Some resolver features help
Prefetching: fetch cached records early
Stale caching: use the last-known good answer
Most resolvers don’t do any of this
TTLs: 5 sec (Amazon Simple Storage Service [Amazon S3], Amazon DynamoDB, Amazon Relational Database Service [Amazon RDS]), 1 sec (Amazon Aurora).
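One possible reading of the resolver example, offered as an assumption rather than the speakers' own calculation: if cache expiries are spread evenly across the TTL, some resolvers will always need a fresh answer during even a short outage. A rough sketch of that arithmetic (the function name and the uniform-expiry model are illustrative):

```python
def resolvers_refreshing_during_outage(resolvers: int, ttl_seconds: float,
                                       outage_seconds: float) -> float:
    """Expected number of resolvers whose cached record expires during an outage.

    Assumes cache expiry times are uniformly staggered across the TTL window.
    """
    expected = resolvers * outage_seconds / ttl_seconds
    return min(float(resolvers), expected)


# 300 resolvers, TTL 60 s, 1 s outage: about 5 resolvers come back for an answer.
print(resolvers_refreshing_during_outage(300, 60, 1))
# With a 5 s TTL the same 1 s outage reaches about 60 of them.
print(resolvers_refreshing_during_outage(300, 5, 1))
```

Under this model, caching shrinks the impact of a DNS outage but does not hide it, and short TTLs shrink the protection further.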
Design goals
100% data plane availability
Support all AWS customers
Customer isolation
Low latency
Affordable
Common failures
Things that can fail:
Hosts, switches, routers, power, PoPs
Network paths, transit providers, TLDs
Solution: Disposable, independent PoPs
Uncommon failures
How could a whole data plane fail?
Deployments
Operator makes a global change
Common routing
Common transit
Redundant data planes
A Route 53 delegation set:
gavinmc.com. 172800 IN NS ns-190.awsdns-23.com.
gavinmc.com. 172800 IN NS ns-1084.awsdns-07.org.
gavinmc.com. 172800 IN NS ns-1831.awsdns-36.co.uk.
gavinmc.com. 172800 IN NS ns-634.awsdns-15.net.
DNS resolvers retry against each NS
Each data plane (“stripe”) is one /23 subnet, routed independently
Our stripes are deployed and operated separately
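The retry behavior can be sketched as below. This is a simplified model, not how any particular resolver is implemented: real resolvers typically choose nameservers by measured round-trip time rather than list order, and `make_query` is a hypothetical stand-in for an actual DNS query.

```python
NAMESERVERS = [
    "ns-190.awsdns-23.com.",
    "ns-1084.awsdns-07.org.",
    "ns-1831.awsdns-36.co.uk.",
    "ns-634.awsdns-15.net.",
]


def resolve_with_retry(nameservers, query):
    """Try each nameserver in turn; return (nameserver, answer) on first success."""
    for ns in nameservers:
        try:
            return ns, query(ns)
        except TimeoutError:
            continue  # this stripe is unreachable, move on to the next one
    raise TimeoutError("all nameservers failed")


def make_query(failed_suffixes):
    """Hypothetical query function that times out for the given stripes."""
    def query(ns):
        if any(ns.endswith(suffix) for suffix in failed_suffixes):
            raise TimeoutError(ns)
        return "192.0.2.10"  # documentation address used as a dummy answer
    return query


# The .com stripe is down; the resolver still gets an answer from .org.
print(resolve_with_retry(NAMESERVERS, make_query({".com."})))
```

Because each nameserver in the delegation set sits on a different stripe, a query only fails outright when all four stripes are unavailable at once.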
Result
Loss of one data plane has minimal impact for any customer
PoP black holing
Routing problem
Transit provider congestion event
TLD failure
Questions
How big is your failure domain?
Do you operate isolated, redundant failure domains?
Is this a pattern you use or would consider?
What pros/cons do you see?
Blast radius reduction: anycast
20 PoPs advertise each IP prefix via BGP
Resolvers hit the nearest PoP for each stripe
Reduces blast radius, improves latency
If a PoP fails, we route its traffic elsewhere
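A toy model of the anycast behavior described above: a resolver reaches the nearest PoP that currently advertises a stripe's prefix, and moves to the next nearest when that PoP withdraws. The PoP names and distances are made up, and "nearest" is reduced to a single number here, whereas real BGP path selection depends on routing policy rather than latency alone.

```python
def nearest_pop(distances: dict, advertising: set) -> str:
    """Pick the closest PoP among those currently advertising the prefix."""
    candidates = {pop: d for pop, d in distances.items() if pop in advertising}
    if not candidates:
        raise LookupError("no PoP advertises this prefix")
    return min(candidates, key=candidates.get)


# Hypothetical network distances (ms) from one resolver to three PoPs.
distances = {"PoP-A": 5, "PoP-B": 20, "PoP-C": 80}
advertising = {"PoP-A", "PoP-B", "PoP-C"}

print(nearest_pop(distances, advertising))   # closest PoP serves the query
advertising.discard("PoP-A")                 # PoP-A fails and withdraws its route
print(nearest_pop(distances, advertising))   # traffic shifts to the next nearest
```

Only resolvers that were being routed to the failed PoP are affected, which is what keeps the blast radius geographically contained.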
Figures: COM stripe; NET stripe; ORG stripe; CO.UK stripe; Non-striped anycast; Route 53
Blast radius: deployments
CO.UK NET ORG COM OnePoP Gamma
TST1 FRA53
ATL50 EWR50 JFK1 IAD12
ORD51 SEA4 SFO9 ORD50
PHL50 JFK5 JFK6 ORD54
Results
Data plane failures are typically geo- and stripe-contained
Individual bad clients are geo-contained
Deployment failures are geo- and stripe-contained
Latency trade-offs
Questions?
How do you contain blast radius?
Do you align deployments with blast radius?
Do you partition your service endpoints?
Customer isolation
Prevent customers impacting each other
Trade-offs:
Multi-tenant services offer cost efficiency
Single-tenant gives isolation but is expensive
Blast radius
Figures: horizontal scaling; sharding; shuffle sharding
Shuffle sharding Route 53
Route 53 has 512 nameservers per stripe
Every hosted zone gets one NS on each stripe
Guaranteed maximum overlap of 2 nameservers between any two hosted zones
A Route 53 delegation set:
gavinmc.com. 172800 IN NS ns-190.awsdns-23.com.
gavinmc.com. 172800 IN NS ns-634.awsdns-15.net.
gavinmc.com. 172800 IN NS ns-1084.awsdns-07.org.
gavinmc.com. 172800 IN NS ns-1831.awsdns-36.co.uk.
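A minimal sketch of shuffle sharding with the numbers from this slide: 4 stripes, 512 nameservers per stripe, and at most 2 shared nameservers between any two zones. The rejection-sampling approach and all names are assumptions made for illustration; the slides do not describe how Route 53 actually performs the assignment.

```python
import random

STRIPES = ("com", "net", "org", "co.uk")
NAMESERVERS_PER_STRIPE = 512
MAX_OVERLAP = 2


def overlap(a: tuple, b: tuple) -> int:
    """Count the stripes on which two delegation sets use the same nameserver."""
    return sum(1 for x, y in zip(a, b) if x == y)


def assign_delegation_set(existing: list, rng: random.Random) -> tuple:
    """Pick one nameserver index per stripe.

    Candidates that share more than MAX_OVERLAP nameservers with any
    existing delegation set are rejected and redrawn.
    """
    while True:
        candidate = tuple(rng.randrange(NAMESERVERS_PER_STRIPE) for _ in STRIPES)
        if all(overlap(candidate, other) <= MAX_OVERLAP for other in existing):
            return candidate


rng = random.Random(53)  # fixed seed so the sketch is repeatable
zones = []
for _ in range(500):
    zones.append(assign_delegation_set(zones, rng))

# Number of distinct delegation sets available: 512 ** 4
print(NAMESERVERS_PER_STRIPE ** len(STRIPES))
```

With this constraint, a problem tied to one zone's four nameservers leaves every other zone with at least two unaffected nameservers to retry against.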
Results
Benefits:
High availability and customer isolation at low cost
Rare single-customer impacts are contained to the single customer
Route 53 continually meets 100% availability SLA for customers
Results
Challenges:
Customer experience monitoring can be challenging
Nameservers are easy to confuse or typo
Capacity management can be challenging
Using shuffle sharding
Routing layer, for example, per customer/resource DNS names
For example: Elastic Load Balancing, Amazon CloudFront, unique names
Smart retrying client
Means to withdraw failing endpoints
Multiple redundant endpoints
Route 53 as routing layer
1x DNS name per customer
1x A and/or AAAA record per physical endpoint
Health checks (NB fail open)
WRR combinations of ALIAS to endpoints
Multi-value answers
Example: Weighted round robin
endpoint1.mydomain.com/A 1.2.3.4 (Health Checked)
endpoint2.mydomain.com/A 1.2.3.5 (Health Checked)
endpoint3.mydomain.com/A 1.2.3.6 (Health Checked)
…
customer1.mydomain.com./A WRR(ALIAS endpoint1, ALIAS endpoint2)
customer2.mydomain.com./A WRR(ALIAS endpoint1, ALIAS endpoint3)
customer3.mydomain.com./A WRR(ALIAS endpoint2, ALIAS endpoint3)
…
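The customer-to-endpoint pairs in the example above are exactly the 2-element combinations of the three endpoints. The sketch below generates them and builds weighted alias record sets as plain dictionaries. The dictionary shape is modeled on the ResourceRecordSet structure used by the Route 53 ChangeResourceRecordSets API as I understand it; the hosted zone ID is a placeholder, the helper names are hypothetical, and nothing here calls AWS.

```python
from itertools import combinations

ENDPOINTS = ["endpoint1", "endpoint2", "endpoint3"]
DOMAIN = "mydomain.com."


def shuffle_shards(endpoints, shard_size=2):
    """Enumerate every distinct shard (combination) of endpoints."""
    return list(combinations(endpoints, shard_size))


def weighted_alias_records(customer, shard, hosted_zone_id):
    """Build one equally weighted alias record set per endpoint in the shard."""
    return [
        {
            "Name": f"{customer}.{DOMAIN}",
            "Type": "A",
            "SetIdentifier": f"{customer}-{endpoint}",
            "Weight": 1,
            "AliasTarget": {
                "HostedZoneId": hosted_zone_id,  # placeholder, not a real zone
                "DNSName": f"{endpoint}.{DOMAIN}",
                "EvaluateTargetHealth": True,
            },
        }
        for endpoint in shard
    ]


assignments = {
    f"customer{i}": shard
    for i, shard in enumerate(shuffle_shards(ENDPOINTS), start=1)
}
for customer, shard in assignments.items():
    print(customer, shard)

records = weighted_alias_records("customer1", assignments["customer1"], "Z_EXAMPLE")
```

With more endpoints the number of distinct shards grows quickly, so each customer can get a combination that few or no other customers share.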
Example: Multi-value answers
endpoint1.mydomain.com/A 1.2.3.4 (Health Checked)
endpoint2.mydomain.com/A 1.2.3.5 (Health Checked)
endpoint3.mydomain.com/A 1.2.3.6 (Health Checked)
…
customer1.mydomain.com./A MVA(ALIAS endpoint1, ALIAS endpoint2)
customer2.mydomain.com./A MVA(ALIAS endpoint1, ALIAS endpoint3)
customer3.mydomain.com./A MVA(ALIAS endpoint2, ALIAS endpoint3)
…
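A small sketch of how a health-checked multi-value answer behaves, including the "fail open" note from the earlier slide: unhealthy endpoints are withdrawn from the answer, but if every endpoint looks unhealthy, all of them are returned rather than none. The function and data layout are illustrative assumptions, not Route 53 internals.

```python
def multi_value_answer(records, healthy):
    """Return the IPs of healthy records, failing open if none are healthy.

    records: list of (endpoint_name, ip) pairs for one customer name.
    healthy: mapping of endpoint_name -> bool; unknown endpoints are
             treated as healthy (an assumption of this sketch).
    """
    up = [ip for name, ip in records if healthy.get(name, True)]
    if up:
        return up
    # Fail open: with nothing healthy, answer with everything.
    return [ip for _, ip in records]


customer1 = [("endpoint1", "1.2.3.4"), ("endpoint2", "1.2.3.5")]

print(multi_value_answer(customer1, {"endpoint1": True, "endpoint2": True}))
print(multi_value_answer(customer1, {"endpoint1": False, "endpoint2": True}))
print(multi_value_answer(customer1, {"endpoint1": False, "endpoint2": False}))
```

Failing open means a fault in the health-checking path itself cannot remove every endpoint from DNS at once.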
Questions?
Would you build this?
Do you see pros/cons we’ve missed?
What tools would you look for?
“Bimodal is the practice of managing two separate but coherent styles of work: one focused on predictability; the other on exploration.”
Gartner IT Glossary
“If your system has a mode change once every six months, you should plan for an outage about twice a year.”
Alec Peterson
GM & Plagiarist, Route 53
Non-constant work
Database failure, fail over to standby
DC failure, fail over to remote DC
Dependency fails, fall back to alternate code path
API caller changes pattern
Major storage failure, restore from backups
Anti-patterns
Untested bimodal/fallback paths
Do it all the time or don’t ever do it
Optimizing for 99.99% of cases
If X fails (0.01%), try Y
Accepting unbounded work from clients
Throttling, fail fast
DNS data propagation
Diagram: Route 53 control plane; config data store; PoP1, PoP2, PoP3, PoP4
Route 53 health checks
Diagram: your endpoint, with checkers in us-east-1, us-west-2, eu-west-1, ap-southeast-1, sa-east-1, ap-southeast-2, us-east-2
Route 53 DNS API
ListResourceRecordSets is paginated.
API calls are throttled globally, by customer, and by call type.
If the API is overloaded, fail requests fast.
Work is bounded to a limit we can sustain.
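One common way to implement this kind of layered throttling is a token bucket per scope. The sketch below checks a global bucket, a per-customer bucket, and a per-call-type bucket, and rejects immediately if any of them is empty. The class names, rates, and the explicit `now` parameter are assumptions made for illustration; the slides do not say how Route 53 implements its throttles.

```python
from collections import defaultdict


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = 0.0

    def refill(self, now: float) -> None:
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def has_token(self) -> bool:
        return self.tokens >= 1

    def take(self) -> None:
        self.tokens -= 1


class Throttle:
    """Admit a call only if the global, customer, and call-type buckets all allow it."""

    def __init__(self, global_bucket, make_customer_bucket, make_call_bucket):
        self.global_bucket = global_bucket
        self.customers = defaultdict(make_customer_bucket)
        self.call_types = defaultdict(make_call_bucket)

    def admit(self, customer: str, call_type: str, now: float) -> bool:
        buckets = [self.global_bucket, self.customers[customer], self.call_types[call_type]]
        for bucket in buckets:
            bucket.refill(now)
        if not all(bucket.has_token() for bucket in buckets):
            return False  # fail fast: reject without queueing or consuming tokens
        for bucket in buckets:
            bucket.take()
        return True


throttle = Throttle(
    TokenBucket(rate=100, capacity=100),      # global limit
    lambda: TokenBucket(rate=1, capacity=2),  # per-customer limit
    lambda: TokenBucket(rate=50, capacity=50) # per-call-type limit
)
print([throttle.admit("alice", "ListResourceRecordSets", now=0.0) for _ in range(3)])
```

Rejected calls cost almost nothing to process, so the total work the service accepts stays bounded regardless of how many requests arrive.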
Is this always possible?
Single-master databases
DNS
Nameserver failures == traffic shifts.
Caching/retries cause a surprising query load increase after outages.
Zone transfer is incremental, but falls back to full.
Constant work
Bounded workloads in all operating modes
Prefer redundant work always vs. occasionally increased work
Be wary of optimizing for most but not all workloads. Be wary of caches.
Throttle APIs, bound their work
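To make "redundant work always vs. occasionally increased work" concrete, the toy comparison below contrasts a delta-based push, whose cost tracks how much changed, with a constant-work push that sends a whole fixed-size table every cycle. The table size and change counts are invented for illustration and do not describe Route 53's propagation system.

```python
def delta_push_work(changed_entries: int) -> int:
    """Delta-based push: work is proportional to the number of changes."""
    return changed_entries


def constant_push_work(table_size: int, changed_entries: int) -> int:
    """Constant-work push: the whole table is sent no matter what changed."""
    return table_size


TABLE_SIZE = 10_000          # hypothetical fixed number of entries
QUIET, STORM = 3, 9_000      # hypothetical changes per cycle

for label, changes in (("quiet", QUIET), ("storm", STORM)):
    print(label,
          "delta:", delta_push_work(changes),
          "constant:", constant_push_work(TABLE_SIZE, changes))
```

The delta design is cheaper on a quiet day but its load jumps by a factor of thousands during a change storm, which is exactly when the system is already under stress. The constant-work design costs more on average but behaves identically in both modes, so the storm path is exercised on every cycle.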
Thank you!
Alec Peterson
General Manager
Route 53
Gavin McCullagh
System Development Engineer
Route 53