The document provides an overview of AWS storage services, including block storage, shared file systems, and object storage. It begins with an introduction to why AWS is chosen for storage and a tour of the global AWS infrastructure. It then covers block storage with Amazon EBS, file storage with Amazon EFS, and object storage with Amazon S3. Specific features of each service are described, such as durability, availability, and pricing, and example use cases are provided for each storage type.
Seton Hall reference - https://na32.salesforce.com/a3l500000000EOWAA2
Each storage option has a unique combination of performance, durability, cost, and interface.
AWS Snowball is a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of the AWS cloud. Using Snowball addresses common challenges with large-scale data transfers, including high network costs, long transfer times, and security concerns. Transferring data with Snowball is simple, fast, and secure, and can be as little as one-fifth the cost of transfer over high-speed Internet.
AWS Snowmobile is new: a secure, exabyte-scale data transfer service used to transfer large amounts of data into and out of AWS. Each Snowmobile can transfer up to 100 PB. When you order a Snowmobile, it comes to your site and AWS personnel connect a removable, high-speed network switch from the Snowmobile to your local network, which makes the Snowmobile appear as a network-attached data store. Once it is connected, secure, high-speed data transfer begins. After your data is transferred to the Snowmobile, it is driven back to AWS, where the data is loaded into the AWS service you select, including S3, Glacier, Redshift, and others. Snowmobile lets customers with large amounts of data migrate to AWS much faster and more easily.
High-level description of EBS: network-based virtual disks, pay for what you provision, built-in redundancy (essentially RAID 10), optimized for random I/O
Network device
Data lifecycle is independent of the EC2 instance lifecycle (see the sketch after this list)
Each volume is like a hard drive on a physical server
Attach multiple volumes to an EC2 instance, but only one EC2 instance per volume
As a virtual disk, ideal for: OS boot devices; POSIX-compliant file systems
As a raw block device, ideal for: databases (e.g., Oracle Automatic Storage Management); other raw block device workloads
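A minimal boto3 sketch (all IDs, the region, and the AZ are placeholders) showing the two points above: the volume is created independently of any instance, and it attaches to exactly one instance at a time.

```python
# A sketch, not production code: create a 100 GiB gp2 volume and attach it
# to a single instance. The volume outlives the instance unless deleted.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",  # must match the instance's AZ
    Size=100,                       # GiB
    VolumeType="gp2",
)
ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])

# One instance per volume: attaching the same volume to a second instance
# would fail while it remains attached here.
ec2.attach_volume(
    VolumeId=volume["VolumeId"],
    InstanceId="i-0123456789abcdef0",  # placeholder
    Device="/dev/sdf",
)
```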
Amazon Web Services gives you reliable, durable backup storage without the up-front capital expenditure and complex capacity-planning burden of on-premises storage. Amazon storage services remove the need for complex and time-consuming capacity planning, ongoing negotiations with multiple hardware and software vendors, specialized training, and maintenance of offsite facilities or transportation of storage media to third-party offsite locations.
STORY BACKGROUND
University of Maryland University College (UMUC) is an open-access university serving working adult students pursuing higher education through on-site and online courses.
When its legacy applications were due for renewal, UMUC turned to AWS to run its analytics platform and several administrative workloads.
By using Amazon Redshift, UMUC has seen a twenty-fold increase in the performance of its analytics platform, allowing it to build more accurate predictive models and dashboards to improve student outcomes.
SOLUTION
[Main use case]. Big Data, Analytics and Business Intelligence (BI)
[Additional use cases]. Storage and Backup; Disaster Recovery & Archiving
[Keywords separated by commas]. Amazon Redshift, analytics, predictive, model, student outcome, university, education, public sector.
[List all AWS Services used by the customer]. Amazon EC2, Amazon RDS for Oracle, and Amazon Redshift
BENEFITS
The university built its new analytics platform on AWS leveraging Amazon Redshift and Amazon RDS for Oracle.
UMUC reports a 2x to 20x improvement in ETL performance for its analytics platform compared to its previous legacy applications
Using AWS enables UMUC engineers to focus on creating new applications instead of managing infrastructure
[Benefits Realized]. Better Performance, Lower Cost, Security
Describe EBS standard volumes as “best effort” and PIOPS (Provisioned IOPS) volumes as providing consistent performance. Mention that the most predictable performance comes from using EBS-Optimized instances to obtain dedicated storage throughput (a launch sketch follows the links below).
The chart on the left shows the expected throughput and maximum expected 16 KB IOPS for various instance sizes.
This table describes the use cases and performance characteristics for each volume type. Source: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html
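A minimal sketch of launching an instance with dedicated storage throughput; the AMI ID is a placeholder, and the instance type must support EBS optimization.

```python
# Sketch: launch an EBS-Optimized instance for dedicated throughput
# between the instance and EBS.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder
    InstanceType="m4.xlarge",
    MinCount=1,
    MaxCount=1,
    EbsOptimized=True,  # dedicated storage throughput
)
```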
http://aws.amazon.com/blogs/aws/enhanced-ebs-throughput/
About 256 KB I/O requests:
16 times as cost-effective as the previous 16 KB I/O size
Using multiple GP2 or PIOPS volumes, you can achieve up to 800 MB/s (a provisioning sketch follows the links below)
Referenced during “Innovation at Scale” by James Hamilton - https://www.youtube.com/watch?v=JIQETrFC_SQ
Also - http://aws.amazon.com/blogs/aws/larger-faster-ebs-ssd-volumes/
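A sketch of provisioning a set of gp2 volumes to stripe together at the OS level (e.g., RAID 0 with mdadm inside the instance); the AZ, size, and volume count are illustrative only.

```python
# Sketch only: create four gp2 volumes intended for an OS-level stripe set.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

volume_ids = [
    ec2.create_volume(
        AvailabilityZone="us-east-1a",  # must match the target instance's AZ
        Size=500,                       # GiB
        VolumeType="gp2",
    )["VolumeId"]
    for _ in range(4)
]
ec2.get_waiter("volume_available").wait(VolumeIds=volume_ids)
# Attach each volume (as in the earlier EBS example), then build the stripe
# in the OS; aggregate throughput scales with volume count up to instance limits.
```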
Describe how EBS snapshots work.
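One way to make the snapshot mechanics concrete is the API call itself; a minimal sketch, with a placeholder volume ID.

```python
# Sketch: create a point-in-time snapshot of a volume and wait for it to
# complete. Snapshots are incremental (only changed blocks are captured)
# and are stored in Amazon S3.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
snapshot = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",  # placeholder
    Description="nightly backup",
)
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot["SnapshotId"]])
```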
Free with your EC2 instance
SAS and SSD options
Size/type based on instance type
Zero network overhead; a local, direct-attached resource
Consistent performance for sequential reads and writes
Volatile: data does not survive instance stop or termination
Amazon EFS is currently (09/22/2015) in preview.
Describe how EFS works.
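A minimal boto3 sketch of the EFS workflow (the subnet and security group IDs are placeholders): create the file system, then create a mount target that instances in that subnet mount over NFSv4.

```python
# Sketch, not production code: create an EFS file system, wait until it is
# available, then expose it in one subnet via a mount target.
import time
import boto3

efs = boto3.client("efs", region_name="us-east-1")

fs = efs.create_file_system(
    CreationToken="demo-fs",           # idempotency token
    PerformanceMode="generalPurpose",
)

# Poll until available (boto3 provides no EFS waiter).
while efs.describe_file_systems(
    FileSystemId=fs["FileSystemId"]
)["FileSystems"][0]["LifeCycleState"] != "available":
    time.sleep(5)

efs.create_mount_target(
    FileSystemId=fs["FileSystemId"],
    SubnetId="subnet-0123456789abcdef0",      # placeholder
    SecurityGroups=["sg-0123456789abcdef0"],  # placeholder
)
```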
Athena detailed slide in Appendix
Highlight the customer architecture and how durability, availability, performance, and scalability relate to application type
Amazon Glacier provides three ways to retrieve your archives to meet varying access time and cost requirements: Expedited, Standard, and Bulk retrievals. Archives requested using Expedited retrievals are typically available within 1 – 5 minutes, allowing you to quickly access your data when occasional urgent requests for a subset of archives are required. With Standard retrievals, archives typically become accessible within 3 – 5 hours. Or you can use Bulk retrievals to cost-effectively access significant portions of your data, even petabytes, for just a quarter-of-a-cent per GB.
Updated pricing as of Dec 23, 2016
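A minimal sketch of requesting a retrieval at a chosen tier; the vault name and archive ID are placeholders.

```python
# Sketch: initiate an archive-retrieval job at a given tier.
# accountId "-" means the account that owns the credentials in use.
import boto3

glacier = boto3.client("glacier", region_name="us-east-1")
job = glacier.initiate_job(
    accountId="-",
    vaultName="my-archive-vault",           # placeholder
    jobParameters={
        "Type": "archive-retrieval",
        "ArchiveId": "EXAMPLE-ARCHIVE-ID",  # placeholder
        "Tier": "Expedited",                # or "Standard" / "Bulk"
    },
)
```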
The AWS Storage Gateway (SGW) is typically deployed in your existing storage environment as a VM.
You connect your existing applications, storage systems, or devices to the SGW. The SGW provides standard storage protocol interfaces so apps can connect to it without changes.
The gateway in turn connects to AWS so you can store data securely and durably in Amazon S3 and Amazon Glacier.
The gateway optimizes data transfer from on-premises to AWS. It also provides low-latency access through a local cache, so your apps can access frequently used data locally. The service is also integrated with CloudWatch, CloudTrail, IAM, and more, so you get an extension of AWS management services locally.
---
“Enable cloud storage on-premises as part of your AWS platform”
Native access
Industry standard protocols for file, block, and tape
Secure and durable storage in Amazon S3 and Glacier
Optimized data transfer from on-premises to AWS
Low-latency access to frequently used data
Integrated with AWS security and management services
The file gateway enables you to store and retrieve objects in Amazon S3 using industry-standard file protocols. Files are stored as objects in your S3 buckets, accessed through a Network File System (NFS) mount point. Ownership, permissions, and timestamps are durably stored in S3 in the user-metadata of the object associated with the file. Once objects are transferred to S3, they can be managed as native S3 objects, and bucket policies such as versioning, lifecycle management, and cross-region replication apply directly to objects stored in your bucket.
Customers use the file interface to migrate file data into S3 for use by object-based workloads, as a cost-effective storage target for traditional backup applications, and as a tier in the cloud for on-premises file storage.
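To illustrate the point above that bucket features apply directly to file-gateway objects, here is a hedged sketch adding a lifecycle rule that transitions them to Glacier; the bucket name and prefix are placeholders.

```python
# Sketch: files written through the file gateway are ordinary S3 objects,
# so a normal lifecycle rule can archive them after 90 days.
import boto3

s3 = boto3.client("s3", region_name="us-east-1")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-file-gateway-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-old-files",
            "Filter": {"Prefix": "backups/"},  # placeholder prefix
            "Status": "Enabled",
            "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
        }]
    },
)
```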
The volume gateway presents your applications with disk volumes using the iSCSI block protocol. Data written to these volumes can be asynchronously backed up as point-in-time snapshots of your volumes, and stored in the cloud as Amazon EBS snapshots. You can set the schedule for when snapshots occur or create them via the AWS Management Console or service API. Snapshots are incremental backups that capture only changed blocks. All snapshot storage is also compressed to minimize your storage charges.
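A minimal sketch of triggering an ad hoc snapshot through the service API mentioned above; the volume ARN is a placeholder.

```python
# Sketch: request a gateway volume snapshot on demand (scheduled snapshots
# are configured separately). The resulting snapshot is an EBS snapshot.
import boto3

sgw = boto3.client("storagegateway", region_name="us-east-1")
sgw.create_snapshot(
    VolumeARN=(
        "arn:aws:storagegateway:us-east-1:123456789012:"
        "gateway/sgw-12345678/volume/vol-12345678"  # placeholder
    ),
    SnapshotDescription="pre-maintenance checkpoint",
)
```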
When connecting with the block interface, you can run the gateway in two modes: cached and stored.
In cached mode, you store your primary data in Amazon S3 and retain your frequently accessed data locally. With this mode, you can achieve substantial cost savings on primary storage, minimizing the need to scale your storage on-premises, while retaining low-latency access to your frequently accessed data. You can configure up to 32 volumes of up to 32 TB each, for a total of 1 PB of storage per gateway.
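A sketch of carving out a cached volume via the API; the gateway ARN, target name, and network interface are placeholders, and exact parameters may vary by SDK version.

```python
# Sketch: create a 1 TiB cached volume. Primary data lives in S3;
# hot data is served from the gateway's local cache.
import boto3

sgw = boto3.client("storagegateway", region_name="us-east-1")
sgw.create_cached_iscsi_volume(
    GatewayARN="arn:aws:storagegateway:us-east-1:123456789012:gateway/sgw-12345678",
    VolumeSizeInBytes=1024 ** 4,     # 1 TiB
    TargetName="app-data",           # becomes part of the iSCSI target IQN
    NetworkInterfaceId="10.0.0.25",  # gateway VM's local IP (placeholder)
    ClientToken="demo-volume-1",     # idempotency token
)
```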
In stored mode, you store your entire data set locally, while performing asynchronous backups of this data in Amazon S3. This mode provides durable and inexpensive offsite backups that you can recover locally or from Amazon EC2.
Example applications such as databases or computational workloads
… where low latency is critical and the working set of data is large, ill-defined, or constantly changing.
On a stored volume gateway you can configure up to 32 volumes
… up to 16 TB each
… for a total of 512 TB per gateway
The tape gateway presents the Storage Gateway to your existing backup application as an industry-standard iSCSI-based virtual tape library (VTL), consisting of a virtual media changer and virtual tape drives. You can continue to use your existing backup applications and workflows while writing to a nearly limitless collection of virtual tapes. Each virtual tape is stored in Amazon S3. When you no longer require immediate or frequent access to data contained on a virtual tape, you can have your backup application archive it from the virtual tape library into Amazon Glacier, further reducing storage costs.
Storage Gateway is currently compatible with most leading backup applications. The VTL interface eliminates large upfront tape automation capital expenses, multi-year maintenance contract commitments and ongoing media costs. You pay only for the capacity you use and scale as your needs grow. The need to transport storage media to offsite facilities and handle tape media manually goes away, and your archives benefit from the design and durability of the AWS cloud platform.
In your VTL you can configure up to 1,500 tapes
… up to 2.5 TB each (LTO-6 size)
… for a total of 1 PB per VTL (a tape-provisioning sketch follows)
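A sketch of provisioning virtual tapes via the API; the gateway ARN and barcode prefix are placeholders.

```python
# Sketch: add ten virtual tapes to the VTL. Tape size is in bytes;
# 2500 GiB approximates the 2.5 TB LTO-6 size noted above.
import boto3

sgw = boto3.client("storagegateway", region_name="us-east-1")
sgw.create_tapes(
    GatewayARN="arn:aws:storagegateway:us-east-1:123456789012:gateway/sgw-12345678",
    TapeSizeInBytes=2500 * 1024 ** 3,
    ClientToken="tape-batch-1",   # idempotency token
    NumTapesToCreate=10,
    TapeBarcodePrefix="DEMO",     # placeholder (1-4 uppercase letters)
)
```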
We see 3 broad categories of hybrid storage where SGW helps customers.
Let’s look in a little more detail at each of these.
(This is our service tenets said in a customer-facing way – our value prop)
Capabilities:
Standard storage protocols integrate with on-premises applications
Transparent local caching for low-latency access to frequently used data
Asynchronous upload to AWS for durable storage of changed data
Efficient data transfer with local buffering and bandwidth management
Direct storage in AWS storage services
Resilient stateless on-premises gateway
Integrated with AWS management and security services
The original Snowball had 50 TB of capacity; AWS Snowball Edge, like the original Snowball, is a petabyte-scale data transfer solution, but transports more data, up to 100 TB, and retains the same embedded cryptography and security as the original Snowball. However, Snowball Edge hosts a file server and an S3-compatible endpoint that allow you to use the NFS protocol, the S3 SDK, or the S3 CLI to transfer data directly to the device without specialized client software. Multiple units may be clustered together, forming a temporary data collection storage tier in your datacenter so you can work as data is generated without managing copies. As storage needs scale up and down, devices can easily be added to or removed from the local cluster and returned to AWS.
What is AWS Import/Export Snowball?
Snowball is a new AWS Import/Export offering that provides a petabyte-scale data transfer service using Amazon-provided storage devices for transport. Previously, customers purchased their own portable storage devices and used those devices to ship their data. With the launch of Snowball, customers are now able to use highly secure, rugged, Amazon-owned network-attached storage (NAS) devices, called Snowballs, to ship their data. Once a device is received and set up, customers can copy up to 50 TB of data from their on-premises file systems to the Snowball via the Snowball client software over a 10 Gbps network interface. Before transfer to the Snowball, all data is encrypted by the client with 256-bit GCM encryption. When customers finish transferring data to the device, they simply ship it back to an AWS facility, where the data is ingested at high speed into Amazon S3.
The Snowball service is driven entirely by the AWS console, like our other services. In the console, a customer accesses the Snowball service under the AWS Import/Export Snowball link. Once there, the customer simply creates a data transfer job, specifying the S3 bucket(s) to use, the KMS encryption keys, and the location the device should be shipped to. Once the device is received, the customer connects the Snowball to power and the network, providing an IP address either manually or via DHCP. From there, data is copied to the Snowball via the client software, a command-line tool loaded on a host in the environment which encrypts all data before it is transferred to the Snowball. Once the data transfer is complete, the customer simply powers down the device, and the return shipping information updates automatically on the E Ink display. Once the device is returned to Amazon, we complete the data transfer from the Snowball to the specified S3 buckets. Throughout this entire process the customer is notified at each step through the console, Amazon SNS, and/or text message.
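The console flow above can also be driven programmatically; a hedged sketch of creating an import job (the bucket ARN, address ID, KMS key, and IAM role are placeholders created beforehand, e.g., the address via create_address).

```python
# Sketch: create a Snowball import job targeting one S3 bucket.
import boto3

snowball = boto3.client("snowball", region_name="us-east-1")
job = snowball.create_job(
    JobType="IMPORT",
    Resources={"S3Resources": [{"BucketArn": "arn:aws:s3:::my-import-bucket"}]},
    AddressId="ADID00000000-0000-0000-0000-000000000000",        # placeholder
    KmsKeyARN="arn:aws:kms:us-east-1:123456789012:key/EXAMPLE",  # placeholder
    RoleARN="arn:aws:iam::123456789012:role/snowball-import",    # placeholder
    ShippingOption="SECOND_DAY",
    SnowballCapacityPreference="T50",
)
print(job["JobId"])
```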
AWS Snowmobile is a secure, exabyte-scale data transfer service used to transfer large amounts of data into and out of AWS. Each Snowmobile can transfer up to 100 PB. When you order a Snowmobile, it comes to your site and AWS personnel connect a removable, high-speed network switch from the Snowmobile to your local network, which makes the Snowmobile appear as a network-attached data store. Once it is connected, secure, high-speed data transfer begins. After your data is transferred to the Snowmobile, it is driven back to AWS, where the data is loaded into the AWS service you select, including S3, Glacier, Redshift, and others.
The team will consider and assess global requests as well
Snapshot benefit: gives a POSIX store multi-AZ redundancy; fail over to an alternate AZ during an AZ-local event; store only the actual blocks used rather than the full allocation during extended offline periods.
Metadata benefit: indexed metadata for rapid searching; more advanced selection than going direct to S3; faster response times for advanced queries compared to S3 list operations.
Caching benefit: reduced latency to objects for applications; reduced cost for subsequent GETs (reads are local).
Edge caching benefit: reduced latency to objects for customers; significant fault tolerance in object availability; serves RTMP streams without a running server (significant EC2 cost savings).
It is all about choice. Pick the technology that delivers the right performance at the right price. AWS allows you to consume one or multiple services as needed and to pay only for the capacity you use.