AWS offers customers a range of different database options. These include Amazon DynamoDB, a fully-managed NoSQL database service that makes it simple and cost-effective to store and retrieve any amount of data as well as Amazon Relational Database Service (RDS), a service that makes it easy to set up, operate, and scale a relational database in the cloud with support for MySQL, Microsoft SQL Server, PostgreSQL, and Oracle Database. In this session you’ll get an overview of AWS database options and how they might help support your application and see how to get started.
2. Today’s agenda
• Why managed database services?
• A non-relational managed database
• A relational managed database
• A managed in-memory cache
• A managed data warehouse
• What to do next
3. If you host your databases on-premises
Power, HVAC, net
Rack & stack
Server maintenance
OS patches
DB s/w patches
Database backups
Scaling
High availability
DB s/w installs
OS installation
you
App optimization
4. If you host your databases on-premises
Power, HVAC, net
Rack & stack
Server maintenance
OS patches
DB s/w patches
Database backups
Scaling
High availability
DB s/w installs
OS installation
you
App optimization
5. If you host your databases in
Amazon EC2
Power, HVAC, net
Rack & stack
Server maintenance
OS patches
DB s/w patches
Database backups
Scaling
High availability
DB s/w installs
OS installation
you
App optimization
6. If you host your databases in
Amazon EC2
OS patches
DB s/w patches
Database backups
Scaling
High availability
DB s/w installs
you
App optimization
Power, HVAC, net
Rack & stack
Server maintenance
OS installation
7. If you choose a managed DB service
Power, HVAC, net
Rack & stack
Server maintenance
OS patches
DB s/w patches
Database backups
App optimization
High availability
DB s/w installs
OS installation
you
Scaling
8. The self-managed vs. AWS-managed
decision
Self-managed database AWS-managed database
You have full responsibility for
upgrades and backup
Upgrades, backup, and failover are
provided as a service
You have full responsibility for security AWS provides high infrastructure
security, certifications; gives you tools
to ensure DB security
Full control over parameters of server,
OS, and database
Database is a managed appliance, so
you can easily automate
Replication is expensive, complex and
requires a lot of engineering
Failover is a packaged service
9. A Managed Service for Each Major DB Type
Amazon
DynamoDB
Key-value
Amazon
RDS
SQL
Amazon
ElastiCache
In-memory
cache
Amazon
Redshift
Data
warehouse
11. DynamoDB: a managed key-value store
• Simple and fast to deploy
• Simple and fast to scale
• To millions of IOPS
• Data is automatically replicated
• Fast, predictable performance
– Backed by SSD storage
• Secondary indexes offer fast lookups
• No cost to get started; pay only for what you consume
DynamoDB
12. Smugmug relies on DynamoDB
• Millions of customers
use Smugmug’s photo and
video sharing service to store
billions of valuable photos and
videos
• DynamoDB lets Smugmug scale
quickly, frees engineers from
database management
“"DynamoDB is a truly
revolutionary product …
I love how DynamoDB
enables us to provision our
desired throughput, and
achieve low latency and
seamless scale, even with
our constantly growing
workloads.”
-Don MacAskill, CEO
19. DynamoDB: What are capacity units?
1 1
1 1
Pay to bearer
on demand
1 write per sec
of up to 1KB
1 1
1 1
Pay to bearer
on demand
1 read per sec
of up to 4KB
Eventually consistent reads at 50% off!
One write capacity unit
One read capacity unit
23. DynamoDB: serverless app architecture
Clients
AWS
Client JS
AWS
Client JS
AWS
Client JS
Login with
Amazon,
Facebook, or
Google
Auth via
Web Identity
Federation
Business logic
Fine-
grained
access
control
24. How DynamoDB billing works
Monthly
bill = GB +
Assumes DB instance accessed only from AWS region
Further details at http://aws.amazon.com/dynamodb/pricing/
≈ 5 GB * $0.25 +
21 * 720 hrs * $0.0065/10 +
35 * 720 hrs * $0.0065/50
≈ $14.36
Storage consumed
(plus 100 bytes per item)
Charge for
write capacity units
per hour
+
Charge for
read capacity units
per hour
26. Amazon RDS: a managed SQL service
• Simple and fast to deploy
• Simple and fast to scale
• AWS handles patching, backups, replication
• Compatible with your applications
– Choose among MySQL, PostgreSQL,
Oracle, SQL Server
• Fast, predictable performance
• No cost to get started; pay only for what you consume
Amazon RDS
27. Flipboard relies on Amazon RDS
• Flipboard is an online
magazine with millions of
users and billions of “flips”
per month
• Uses Amazon RDS and its
Multi-AZ capabilities to store
mission critical user data
"We were able to go from
concept to delivered product
in about six months with just
a handful of engineers."
-Greg Scallan, Chief Architect
28. How Amazon RDS delivers high performance
• Choose Provisioned IOPS storage for high,
predictable performance
– Provision up to 3 TB storage and 30,000 IOPS per database
instance
– Scale IOPS up or down online
• Choose a database instance type with the right
amount of CPU and memory
29. How Amazon RDS backups work
• Automated backups
– Restore your database to a point in time
– Enabled by default
– Choose a retention period, up to 35 days
• Manual snapshots
– Initiated by you
– Persist until you delete them
– Stored in Amazon Simple Storage Service (Amazon S3)
– Build a new database instance from a snapshot when needed
30. Choose Multi-AZ for greater availability, durability
• An Availability Zone is a physically distinct,
independent infrastructure
• With Multi-AZ operation, your database is
synchronously replicated to another zone in the
same AWS region
• Failover occurs automatically in response to the
most important failure scenarios
• Planned maintenance is applied first to backup
31. Choose Read Replicas for greater scalability
• Help your app scale by offloading read traffic to
an automatically maintained read replica
• Create multiple read replicas, load-share traffic
• Easy to set up
Native
MySQL
RDS
32. Choose cross-region snapshot copy for even greater
durability, ease of migration
• Copy a database
snapshot to a
different AWS
region
• Warm standby for
disaster recovery
• Or use it as a base
for migration to a
different region
33. Choose cross-region read replicas for enhanced data
locality, even more ease of migration
• Even faster recovery
in the event of
disaster
• Bring data close to
your customers
• Promote to a master
for easy migration
34. How to scale with Amazon RDS
• Scale up or down with resizable instance types
• Scale your storage up with a few clicks while online
• Offload read traffic to read replicas
• Put a cache in front of Amazon RDS
– Amazon ElastiCache for Memcached or Redis
– Or your favorite cache, self-managed in Amazon EC2
• Amazon RDS takes some of the pain out of sharding
35. NoSQL vs. SQL for a new app: how to choose?
• Want simplest possible
DB management?
• Dataset grows without
bound?
• Need joins, transactions,
frequent table scans?
• Team has SQL skills?
• Dataset growth
manageable?
Amazon DynamoDB Amazon RDS
36. How Amazon RDS billing works
Monthly
bill = GB+ +
Assumes DB instance accessed only from Amazon EC2
Further details at http://aws.amazon.com/rds/pricing/
= 720 hrs * $0.60 + 100 GB * $0.125 + 1,000 IOPS * $0.10
= $544.50
db.m3.xlarge; MySQL;
Oregon; Single-AZ;
On-Demand
100 GB
Provisioned IOPS
Provisioned 1,000
IOPS
38. ElastiCache: resizable in-memory cache
• High performance, resizable in-
memory caching
• Speed your application by
bypassing database access and
disk storage
• Compatible with your existing
applications
– Choose between the popular memcached
and Redis engines
ElastiCache
39. 2U relies on ElastiCache
• 2U, Inc. , is a “School As A Service”
provider that helps universities take
their degrees online.
• To support collaboration and
learning, the company’s technology
platform uses ElastiCache to cache
data that grows exponentially as
students communicate with
instructors and with each other.
• ElastiCache is used to cache news
feeds and data from RDS MySQL.
“ElastiCache helps us
specifically a lot around
our social and
collaborative tools…It just
works. We don’t even
know its there.”
- James Kenigsberg
Chief Technology Officer
40. Use Cases for ElastiCache
• Performance or cost optimization of an
underlying database
• Storage of ephemeral key-value data
• High-performance application patterns
42. How ElastiCache billing works
Monthly
bill = N ×
Further details at http://aws.amazon.com/elasticache/pricing/
= 4 nodes * 720 hrs * $0.31
= $892.80
Standard large; Oregon;
On-Demand
44. Amazon Redshift: a managed data warehouse
• Petabyte-scale columnar
database
• Fast response time
– ~10x that of typical relational stores
• Pricing as low as $1,000 per
TB per year
Amazon Redshift
45. Foursquare relies on Amazon Redshift
• More than 40 million people worldwide
use Foursquare to meet up with
friends, exchange travel tips, and find
money-saving deals
• Foursquare uses AWS to perform
analytics across millions of daily
check-ins, saving licensing fees and
redeploying its dev/ops staff on more
strategic work
“Amazon Redshift offers the
performance we needed
while freeing us from the
licensing costs of our
previous solution.”
--Jon Hoffman
Software Engineer
46. Use Cases for Amazon Redshift
• Reduce costs by extending
DW rather than adding HW
• Migrate completely from
existing DW systems
• Respond faster to business;
provision in minutes
• Improve performance by an
order of magnitude
• Make more data available
for analysis
• Access business data via
standard reporting tools
• Add analytic functionality to
applications
• Scale DW capacity as
demand grows
• Reduce HW & SW costs by
an order of magnitude
Traditional Enterprise DW Companies with Big Data SaaS Companies
48. Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage • With row storage you do
unnecessary I/O
• To get total amount, you have
to read everything
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
49. • With column storage, you
only read the data you need
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
50. analyze compression listing;
Table | Column | Encoding
---------+----------------+----------
listing | listid | delta
listing | sellerid | delta32k
listing | eventid | delta32k
listing | dateid | bytedict
listing | numtickets | bytedict
listing | priceperticket | delta32k
listing | totalprice | mostly32
listing | listtime | raw
Slides not intended for redistribution.
Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
• COPY compresses
automatically
• You can analyze and override
• More performance, less cost
51. Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
10 | 13 | 14 | 26 |…
… | 100 | 245 | 324
375 | 393 | 417…
… 512 | 549 | 623
637 | 712 | 809 …
… | 834 | 921 | 959
10
324
375
623
637
959
• Track the minimum and
maximum value for each block
• Skip over blocks that don’t
contain relevant data
52. Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
DW.HS1.8XL:
• > 2 GB/s scan rate
• Optimized for data processing
• High disk density
DW.HS1.XL:
54. How Amazon Redshift billing works
Monthly
bill = N ×
Further details at http://aws.amazon.com/rds/pricing/
= 4 nodes * 720 hrs * $0.25
= $720
dw2.large; Oregon;
On-Demand
57. Benefits of AWS Database Services
Pay only for what you use
No up-front cost
Fully managed services
AWS handles installs,
patching, restarts
Easy to scale
Grow as you need
Designed for use with
other AWS services
Amazon
EC2
Amazon
S3
Amazon
CloudWatch
Amazon
SNS
Amazon
VPC
AWS
Data Pipeline
58. Pace of Innovation – a Bonus
• Amazon RDS/PostgreSQL, new M3 instances
• Amazon RDS/SQL Server TDE, version upgrade
• Amazon RDS/Oracle TDE, 3TB/30K IOPS, version upgrade
• Cross-region snapshot copy, parallel replica, chained replica
• Multi-AZ SLA, log access, VPC groups, …
Amazon RDS
team launched
25+ features
• ElastiCache: Redis 2.8.6 engine support
• DynamoDB cross-region import/export, fine grained access
control, global secondary indexes
• DynamoDB local, geospatial indexing library
NoSQL team
launched 12+
features
• Amazon Redshift dense compute nodes
• Encryption with HSM support
• Audit logging, Amazon SNS notification, snapshot sharing
• Cross-region Amazon Redshift automatic backups
• Faster resize, improved concurrency, distributed tables…
Amazon
Redshift team
launched 21+
features
59. AWS Marketplace
• Find software to use with
Amazon RDS, Amazon Redshift,
DynamoDB, and ElastiCache
• One-click deployments
• Flexible pricing options
•
http://aws.amazon.com/marketplace
60. Try AWS database services for free
Service Free to new customers every month
for one year
DynamoDB 100 MB of storage
5 units of write capacity
10 units of read capacity
Amazon RDS 750 Micro DB instance hours
20 GB of DB Storage
20 GB for Backups
10 million I/O operations
ElastiCache 750 Micro Cache Node instance hours