Amazon Aurora is a MySQL-compatible database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. This session introduces you to Amazon Aurora, explains common use cases for the service, and helps you get started with building your first Amazon Aurora–powered application.
2. Meet Amazon Aurora
Databases reimagined for the cloud
Speed and availability of high-end commercial databases
Simplicity and cost-effectiveness of open source databases
Drop-in compatibility with MySQL
Simple pay as you go pricing
Delivered as a managed service
3. Not much has changed in the last 30 years
Even when you scale it out, you’re still replicating the same stack
[Diagram: each application talks to its own stack of the same four layers—SQL, transactions, caching, logging—over storage; scaling out just duplicates that stack]
4. Reimagining the relational database
What if you were inventing the database today?
You wouldn’t design it the way we did in 1970
You’d build something
that can scale out …
that is self-healing …
that leverages existing AWS services …
5. A service-oriented architecture applied to the database
Moved the logging and storage layer into a multi-tenant, scale-out, database-optimized storage service
Integrated with other AWS services like Amazon EC2, Amazon VPC, Amazon DynamoDB, Amazon SWF, and Amazon Route 53 for control plane operations
Integrated with Amazon S3 for continuous backup with 99.999999999% durability
[Diagram: control plane (Amazon DynamoDB, Amazon SWF, Amazon Route 53) and data plane (SQL, transactions, and caching layers over the logging + storage service, backed by Amazon S3)]
8. Expedia: Online travel marketplace
Real-time business intelligence and analytics on a growing corpus of online travel marketplace data.
The current Microsoft SQL Server–based architecture is too expensive, and performance degrades as data volume grows.
Cassandra with a Solr index requires a large memory footprint and hundreds of nodes, adding cost.
Aurora benefits:
Aurora meets scale and performance
requirements with much lower cost.
25,000 inserts/sec, with peaks up to 70,000. Average response times of 30 ms for writes and 17 ms for reads, with 1 month of data.
World’s leading online travel
company, with a portfolio that
includes 150+ travel sites in 70
countries.
9. Higher performance, lower cost
Safe.com lowered their bill by 40% by switching
from sharded MySQL to a single Aurora instance.
Double Down Interactive (gaming) lowered their
bill by 67% while also achieving better latencies
(most queries ran faster) and lower CPU
utilization.
Aurora benefits:
Due to high performance and large storage
support, sharded MySQL instances can be
consolidated in fewer Aurora instances.
High performance allows for smaller instances.
Automated storage provisioning removes the need
for storage “headroom.”
No additional storage for Read Replicas.
10. “When we ran Alfresco’s workload on Aurora, we were blown away to find that
Aurora was 10x faster than our MySQL environment,” said John Newton,
founder and CTO of Alfresco. “Speed matters in our business, and Aurora has
been faster, cheaper, and considerably easier to use than MySQL.”
Amazon Aurora is fast
11. SQL benchmark results
• MySQL SysBench
• R3.8XL with 32 cores and 244 GB RAM
WRITE PERFORMANCE: four client machines with 1,000 threads each
READ PERFORMANCE: single client with 1,000 threads
12. Consistent performance as table count increases
• Write-only workload
• 1,000 connections
• Query cache (default on for Amazon Aurora, off for MySQL)
• i2.8XL instances for MySQL SSD and RAM
• r3.8XL instances for Aurora and Amazon RDS MySQL
UP TO 11x FASTER
13. Writes scale with connection count
• OLTP Workload
• Variable connection count
• 250 tables
• Query cache (default on for Amazon Aurora, off for MySQL)
UP TO 8x FASTER
14. Consistent performance with growing dataset
UP TO 67x FASTER
• SysBench write-only workload
• Aurora r3.8XL
• Amazon RDS MySQL r3.8XL with 30K IOPS (single AZ)
15. How do we achieve these results?
DO LESS WORK
• Do fewer I/Os
• Minimize network packets
• Cache prior results
• Offload the database engine
BE MORE EFFICIENT
• Process asynchronously
• Reduce latency path
• Use lock-free data structures
• Batch operations together
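The "batch operations together" idea can be sketched in a few lines of Python. This is a generic illustration, not Aurora's actual implementation: grouping pending records into fixed-size batches lets many records share one network round trip.

```python
def batch_records(records, batch_size):
    """Group pending records into batches so each network round trip
    carries several records instead of one, amortizing per-call cost."""
    return [records[i:i + batch_size]
            for i in range(0, len(records), batch_size)]

# Seven pending log records sent in batches of three -> 3 round trips, not 7.
batches = batch_records(list(range(7)), 3)
```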
16. Aurora requires fewer I/Os
TYPES OF WRITES
MYSQL WITH STANDBY: log records, binlog, data, the double-write buffer, and FRM files/metadata are all written, as sequential writes to Amazon Elastic Block Store (EBS) and its EBS mirror on both the primary instance (AZ 1) and the standby instance (AZ 2), with backup to Amazon S3.
AMAZON AURORA: only log records are written, as asynchronous 4/6 quorum distributed writes spanning the primary instance (AZ 1), a replica instance (AZ 2), and storage in AZ 3, with continuous backup to Amazon S3.
18. Amazon Aurora is highly available
Highly available storage
• Six copies of data across three AZs
• Latency-tolerant quorum system for read/write
• Up to 15 replicas with low replication lag
Survivable caches
• Cache remains warm in the event of a database restart
• Lets you resume fully loaded operations much faster
Instant crash recovery
• Underlying storage replays redo records on demand as part of a disk read
• Parallel, distributed, asynchronous
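The 4/6 write quorum can be illustrated with a toy model (an illustration of the quorum arithmetic, not Aurora internals): a write commits once at least 4 of the 6 storage copies acknowledge it, and a read quorum of 3 always overlaps the last successful write because 4 + 3 > 6.

```python
WRITE_QUORUM = 4  # of 6 copies (two per AZ across three AZs)
READ_QUORUM = 3   # 4 + 3 > 6, so any read quorum overlaps the last write

def write_succeeds(acks):
    """A write commits once enough storage copies acknowledge it,
    tolerating up to two slow or failed copies."""
    return acks >= WRITE_QUORUM

def read_overlaps_write(total_copies=6):
    """Quorum overlap: read and write sets must share at least one
    up-to-date copy for reads to see the latest committed data."""
    return WRITE_QUORUM + READ_QUORUM > total_copies
```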
19. Faster, more predictable failover
RDS MYSQL: app running → DB failure → failure detection → DNS propagation → recovery (15–30 sec)
AURORA WITH MARIADB DRIVER: app running → DB failure → failure detection → DNS propagation → recovery (5–20 sec)
20. Amazon Aurora is easy to use
“Amazon Aurora’s new user-friendly monitoring interface made it
easy to diagnose and address issues. Its performance, reliability, and
monitoring really shows Amazon Aurora is an enterprise-grade AWS
database.” —Mohamad Reza, information systems officer at United
Nations
21. Simplify storage management
• Automatic storage scaling up to 64 TB—no performance impact
• Continuous, incremental backups to Amazon S3
• Instantly create user snapshots—no performance impact
• Automatic restriping, mirror repair, hot spot management, encryption
Up to 64 TB of storage—auto-incremented in 10 GB units
22. Simplify monitoring with the AWS Management Console
Amazon CloudWatch metrics for RDS:
• CPU utilization, storage, memory
• 50+ system/OS metrics at 1–60 second granularity
• DB connections, selects per second
• Latency (read and write), cache hit ratio, replica lag
• CloudWatch alarms
Similar to on-premises custom monitoring tools
23. Simplify data security
Encryption to secure data at rest
• AES-256; hardware accelerated
• All blocks on disk and in Amazon S3 are encrypted
• Key management using AWS KMS
SSL to secure data in transit
Network isolation using Amazon VPC by default
No direct access to nodes
Supports industry-standard security and data protection certifications
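As a sketch of turning on encryption at rest, the request for creating an encrypted cluster might look like the following. Parameter names follow the RDS `CreateDBCluster` API; the cluster identifier, credentials, and key ARN are placeholders, and with boto3 you would pass this dict to `rds_client.create_db_cluster(**params)`.

```python
# Build (but don't send) CreateDBCluster parameters for an encrypted
# Aurora cluster. All identifiers and the KMS key ARN are placeholders.
params = {
    "DBClusterIdentifier": "my-aurora-cluster",    # placeholder name
    "Engine": "aurora",
    "MasterUsername": "admin",
    "MasterUserPassword": "use-a-secrets-manager", # never hardcode in practice
    "StorageEncrypted": True,                      # AES-256 encryption at rest
    "KmsKeyId": "arn:aws:kms:us-east-1:123456789012:key/EXAMPLE",
}
```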
25. If you host your databases on-premises, you manage everything yourself: power, HVAC, and networking; rack and stack; server maintenance; OS installation and patches; DB software installs and patches; database backups; scaling; high availability; and app optimization.
26. If you host your databases in Amazon EC2, AWS takes care of the facility and hardware layers (power, HVAC, and networking; rack and stack; server maintenance), while you still manage OS installation and patches, DB software installs and patches, database backups, scaling, high availability, and app optimization.
27. If you choose Amazon RDS, AWS also takes care of OS and DB software installation and patching, database backups, scaling, and high availability; you focus on app optimization.
29. Simple pricing
No licenses
No lock-in
Pay only for what you use
Discounts
44% with a 1-year Reserved Instance
63% with a 3-year Reserved Instance
Instance        vCPU  Mem (GB)  Hourly price
db.r3.large        2     15.25     $0.29
db.r3.xlarge       4     30.5      $0.58
db.r3.2xlarge      8     61        $1.16
db.r3.4xlarge     16    122        $2.32
db.r3.8xlarge     32    244        $4.64
• Storage consumed, up to 64 TB, is $0.10/GB-month
• I/Os consumed are billed at $0.20 per million
• Prices are for the US East (N. Virginia) region
Enterprise grade, open source pricing
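Putting the table together, a rough monthly bill can be estimated as follows (my arithmetic, using the US East prices above and roughly 730 hours per month):

```python
HOURLY = {"db.r3.large": 0.29, "db.r3.xlarge": 0.58, "db.r3.2xlarge": 1.16,
          "db.r3.4xlarge": 2.32, "db.r3.8xlarge": 4.64}

def monthly_cost(instance, storage_gb, millions_of_ios, hours=730):
    """Instance hours + $0.10/GB-month storage + $0.20 per million I/Os."""
    return (HOURLY[instance] * hours
            + 0.10 * storage_gb
            + 0.20 * millions_of_ios)

# db.r3.large, 100 GB of data, 50 million I/Os -> about $231.70/month.
cost = monthly_cost("db.r3.large", 100, 50)
```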
30. Cost of ownership: Aurora vs. MySQL
MySQL configuration hourly cost
• Instances: primary, standby, and two replicas, all r3.8XL at $3.78/hr each
• Storage: four 6 TB volumes (two at 10K PIOPS, two at 5K PIOPS)
Instance cost: $15.12/hr
Storage cost: $8.30/hr
Total cost: $23.42/hr
31. Cost of ownership: Aurora vs. MySQL
Aurora configuration hourly cost
• Instances: primary and two replicas, all r3.8XL at $4.64/hr each
• Storage: a single shared 6 TB volume at $4.43/hr
Instance cost: $13.92/hr
Storage cost: $4.43/hr
Total cost: $18.35/hr
21.6% savings
• No idle standby instance
• Single shared storage volume
• No PIOPS—pay-per-use I/O
• Reduction in overall IOPS
*At a macro level, Aurora saves over 50% in storage cost compared to RDS MySQL.
32. Cost of ownership: Aurora vs. MySQL
Further opportunity for savings
• Instances: primary and two replicas downsized from r3.8XL to r3.4XL at $2.32/hr each
• Storage: a single shared 6 TB volume at $4.43/hr
Instance cost: $6.96/hr
Storage cost: $4.43/hr
Total cost: $11.39/hr
51.3% savings
Storage IOPS assumptions:
1. Average IOPS is 50% of maximum IOPS
2. 50% savings from shipping logs vs. full pages
• Use smaller instance size
• Pay-as-you-go storage
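The savings figures on these cost slides can be reproduced from the per-item prices alone (my arithmetic; the second figure computes to about 51.4%, which the slide rounds to 51.3%):

```python
# Reproduce the hourly totals and savings from the three cost slides.
mysql_total = 4 * 3.78 + 8.30   # primary + standby + 2 replicas, plus storage
aurora_8xl = 3 * 4.64 + 4.43    # primary + 2 replicas (r3.8XL), shared storage
aurora_4xl = 3 * 2.32 + 4.43    # same topology downsized to r3.4XL

savings_8xl = (mysql_total - aurora_8xl) / mysql_total * 100  # ~21.6%
savings_4xl = (mysql_total - aurora_4xl) / mysql_total * 100  # ~51.4%
```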
34. Start your first migration in 10 minutes or less
Keep your apps running during the migration
Replicate within, to, or from Amazon EC2 or RDS
Move data to the same or a different database engine
AWS
Database Migration
Service
35. Keep your apps running during the migration
• Start a replication instance
• Connect to source and target databases
• Select tables, schemas, or databases
• Let AWS Database Migration Service create tables, load data, and keep them in sync
• Switch applications over to the target at your convenience
[Diagram: application users on the customer premises connect over the internet/VPN to AWS, with the AWS Database Migration Service replication instance between source and target]
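In code, the steps above map onto the DMS APIs; here is a sketch of the task-creation request. Parameter names follow the DMS `CreateReplicationTask` API, all ARNs are placeholders, and with boto3 you would pass this dict to `dms_client.create_replication_task(**task)`.

```python
import json

# Placeholder ARNs -- substitute the endpoints and instance you created.
task = {
    "ReplicationTaskIdentifier": "mysql-to-aurora",
    "SourceEndpointArn": "arn:aws:dms:...:endpoint/SOURCE",
    "TargetEndpointArn": "arn:aws:dms:...:endpoint/TARGET",
    "ReplicationInstanceArn": "arn:aws:dms:...:rep/INSTANCE",
    # Full load first, then ongoing change data capture keeps apps running.
    "MigrationType": "full-load-and-cdc",
    # Table mapping rules: include every table in every schema.
    "TableMappings": json.dumps({
        "rules": [{"rule-type": "selection", "rule-id": "1",
                   "rule-name": "1",
                   "object-locator": {"schema-name": "%", "table-name": "%"},
                   "rule-action": "include"}]
    }),
}
```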
36. Migrate from Oracle and SQL Server
Move your tables, views, stored procedures, and
data manipulation language (DML) to MySQL,
MariaDB, and Amazon Aurora
Know exactly where manual edits are needed
Download at aws.amazon.com/dms
AWS
Schema Conversion
Tool
39. Pearson Assessments—core business
• Computer-based assessments
• K–12 students, college prep, clinical, etc.
• Often high-stakes
• A typical platform contains:
o A customer-facing product for taking tests
o A customer-facing product to register students, administer tests, and view reports
o A back-end product to score tests and generate reports
40. Context
• We’ve been using MySQL heavily for 5 years
o Percona’s XtraDB-based products are what we’ve used most frequently
• Deployed first products into AWS in 2012–2013
• In 2012–2013, RDS MySQL was inadequate for our use cases
• Currently manage (lots, but I’m not allowed to say how many) MySQL servers
deployed in AWS, mostly on traditional EC2 instances
• Most product/customer combinations are deployed on separate/isolated
application and database servers. Each deployment only needs to perform “fast
enough” for the product/customer.
• We’ve been testing Aurora since day 1 of the Aurora beta
• We aim for entirely open-source tech stacks whenever possible
41. Database needs and priorities
• ACID
• Durability above all else. We can’t lose data.
• Failover with no data loss (if at all avoidable)
o Better to be down longer and lose no data than to recover quickly and
lose data
• “Fast enough” performance
42. Aurora alternatives: MySQL master/slave
Pros
• Simple
• Well supported/documented
• “Fast enough” for many use cases
Cons
• Replication is asynchronous
• Heavy write traffic on the master can cause
replication lag
• Failover to the slave can result in data loss, so
failover only occurs when the master is
unrecoverable
• The slave can fall out of sync with the master. It’s
rare, but it can happen.
43. Aurora alternatives: Galera-based products
Pros
• Synchronous replication
• Highly available
Cons
• Cost (need three or more servers for minimum footprint)
• Difficult to manage at scale
• Slow writes due to network latency (must commit on all
nodes, preferably in 3+ AWS Availability Zones).
• High volumes of small write transactions can be 50% (or more) slower than master/slave
• Deadlocks if writing to all nodes, so must effectively use 1
node as a read/write node, and the other nodes as read-
only/failover
• Must avoid large transactions, which isn’t always feasible
44. Aurora alternatives: MySQL RDS
Pros
• Don’t have to manage the servers ourselves
• Failover is easier than DIY MySQL master/slave
• Infrastructure changes are easier than DIY MySQL
• RDS keeps servers up-to-date with OS patches and MySQL Community Edition patches
Cons
• Same cons as MySQL master/slave
• 2 TB file size limit
• 6 TB total DB size limit
• Lack of thread-pooling hurts performance on platforms
that require many connections to the DB
45. Aurora alternatives: Other topologies
• MySQL master with a hot standby using DRBD
• MySQL semi-synchronous replication
We ultimately decided that veering too far outside of mainstream MySQL patterns
does more harm than good.
46. Aurora—why we’re moving towards it:
Technology
• Failover is “quick enough” and reliable
o MariaDB driver has excellent failover support
• Performance is “fast enough”
• “Readers” don’t slow down the “writer”
• Technology ensures that failover will not result in data
loss
• Full compatibility with MySQL Community Edition means
that we can back out of Aurora at any time
o This is important for us, as some contracts may not
allow us to use AWS
47. Aurora—why we’re moving towards it:
Logistics
• Don’t have to manage the servers any more
• Don’t have to plan or manage disk space allocation anymore
• Don’t have to develop, maintain, or test automation code for deploying
and managing databases and database backups
• No more babysitting MySQL traditional asynchronous replication
• Adding a new “reader” node is easy
48. Aurora advice
• Use the MariaDB Connector/J driver
• The primary difference between the MariaDB Connector/J driver and the
MySQL Connector/J driver is that MariaDB Connector/J has better
failover support for Aurora
• Test Test Test
• Don’t do your dev, test, and QA work solely against RDS MySQL (or
some other flavor of MySQL). Test your app on Aurora before you
deploy it to production.
• Fail over Aurora regularly in your QA environment. In production, you
want to make sure that a failover doesn’t cause unexpected issues.
• Don’t send all read-only queries out to replica nodes
• Recommended reader use cases: reports, CPU-heavy queries,
asynchronous tasks, analytics
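For reference, selecting the Aurora failover mode in MariaDB Connector/J happens through the JDBC URL; a sketch, where the cluster endpoint and database name are placeholders:

```
jdbc:mariadb:aurora://my-cluster.cluster-abc123.us-east-1.rds.amazonaws.com:3306/mydb
```

The `aurora` mode tells the driver to track which host is the current writer and reconnect quickly after a failover, which is the behavior this slide recommends it for.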
49. Aurora advice: performance
• Don’t enable binary logging (unless you need it)
• Aurora replicas do not require binary logging to function
• Binary logging can degrade performance in Aurora
• Only enable if you are planning on having a traditional MySQL
asynchronous slave attached to an Aurora cluster
• Enable query cache
• Query cache isn’t performance poison on Aurora like it can be in MySQL
Community Edition. Aurora’s default behavior is to have query cache
enabled and many applications will benefit from query cache.
• Aurora won’t fix a bad query
• Untuned SQL will oftentimes consume as much CPU on Aurora as on
MySQL Community Edition
• On writer nodes, CPU can only scale so much …
50. Feature requests
Here’s some stuff that we wish Aurora could do. If enough of us ask Amazon really
nicely...
• Smaller and cheaper instance types for dev and test?
• Transaction isolation on replicas is currently locked at REPEATABLE READ. We
wish we could set a replica to another isolation level, like READ COMMITTED.
• Right now, all replicas are failover candidates. We’d like a replica that we can
mark as “do not fail over to.”
• Aurora clusters have a DNS endpoint that always points to the writer. Can we get
one that points to the reader(s)?
• Copy a snapshot to another region?
• Copy Aurora database cluster parameter groups?
51. Special thanks to…
Our AWS solutions architects, technical account managers, and the Aurora
engineering team.
Thanks!