In this session, Amazon Web Services (AWS) experts share best practices that are helping libraries save money, gain flexibility, and cope with the ever-increasing volume of data they face.
We introduce AWS Cloud services and explore typical library use cases on AWS, with a particular focus on storage and archiving options that provide exceptional durability and cost savings.
7. Why are customers moving to AWS Cloud?
• Variable expense: Replace capital expenditure with variable expense
• Economies of scale: Lower variable expense than companies can achieve themselves
• Elastic capacity: No need to guess capacity requirements and over-provision
• Speed and agility: Infrastructure in minutes, not weeks
• Focus on mission: Not undifferentiated heavy IT lifting
• Global reach: Go global in minutes and reach a global audience
8. AWS Differentiators
• Experience: 8+ years supporting hundreds of thousands of customers across 190 countries
• Innovation: Rapid delivery of new services and features based on customer feedback
• Robust platform: A broad set of services and features to support virtually every use case imaginable
• Simple pricing philosophy: 44 price reductions; expect more reductions in the future
• Global footprint: 10 Regions, 26 Availability Zones, 51 Edge Locations
• Ecosystem: 3000 ISVs and 7000 SIs; 1500 apps in Marketplace
9. DSpace
Open Journal Systems
Open Conference Systems
Thesis and Dissertation Systems
Web Properties – WordPress
DuraCloud Preservation System
• Consortium of higher education institutions in Texas that has provided shared digital library services since 2005
• The mission of the Texas Digital Library (TDL) is to enable each of its member libraries to advance a program of digital initiatives in support of research, scholarship, and learning.
10. Library Use Cases
• Online Public Access Catalogs
– Library Catalogs
– Online databases
• Institutional Repositories
– Online Archive
– Intellectual Output
• Digital Asset Storage
– Protect from Loss and Degradation
– Offsite Storage
– Redundancy and Durability
• Backups
– Offsite
– Redundant
• Development Space
– Disposable Environments
– Start and Stop Frequently
12. Data Ingestion Options
• AWS Direct Connect: Dedicated bandwidth between your site and AWS
• Internet: Transfer data in a secure SSL tunnel over the public Internet
• AWS Import/Export: Physical transfer of media into and out of AWS
13. AWS Ingest Options
Internet / One Common Theme: Parallel Uploads
1. Multipart upload
2. Request rate optimization
3. TCP window scaling
4. TCP selective acknowledgement
AWS has customers that ingest roughly 1 PB per day
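The parallel-upload theme can be sketched roughly as follows. This is an illustration only, not AWS's implementation: `plan_parts`, `upload_part`, and `parallel_upload` are hypothetical names, and `upload_part` merely stands in for a real per-part network call. The 5 MiB minimum part size and 10,000-part cap reflect Amazon S3's documented multipart upload limits.

```python
from concurrent.futures import ThreadPoolExecutor

MIN_PART_SIZE = 5 * 1024 * 1024   # S3 minimum for every part except the last
MAX_PARTS = 10_000                # S3 maximum number of parts per upload


def plan_parts(total_size, part_size=MIN_PART_SIZE):
    """Split total_size bytes into (part_number, offset, length) tuples."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the multipart minimum")
    parts = []
    offset = 0
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((len(parts) + 1, offset, length))
        offset += length
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    return parts


def upload_part(part):
    """Hypothetical stand-in for one network call; returns the bytes 'sent'."""
    part_number, offset, length = part
    return length


def parallel_upload(total_size, part_size=MIN_PART_SIZE, workers=4):
    """Send all parts concurrently and return the total bytes transferred."""
    parts = plan_parts(total_size, part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(upload_part, parts))
```

Because each part is independent, a failed part can be retried on its own, and throughput scales with the number of workers until the link is saturated.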
14. AWS Ingest Options
AWS Direct Connect
• Private connectivity to AWS
– Physical connection – 1 Gbps or 10 Gbps port
• Consistent network performance
• Consider burst models on ingest
• Reduces costs for bandwidth-heavy outbound workloads
US Locations
• CoreSite 32 Avenue of the Americas, NY
• CoreSite One Wilshire & 900 North Alameda, LA
• Equinix DC1 – DC6 & DC10 - DC11, Ashburn, VA
• Equinix SV1 & SV5, San Jose, CA
• Equinix SE2 & SE3, Seattle, WA
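To get a feel for what a 1 Gbps or 10 Gbps port means for ingest, a back-of-the-envelope estimate might look like the sketch below. The function name and the `utilization` parameter are assumptions introduced for illustration; real transfers will be slower because of protocol overhead and contention.

```python
def transfer_hours(terabytes, gbps, utilization=1.0):
    """Hours needed to move `terabytes` (decimal TB) over a `gbps` link."""
    bits = terabytes * 1e12 * 8                    # decimal terabytes to bits
    seconds = bits / (gbps * 1e9 * utilization)    # link speed in bits/second
    return seconds / 3600


for port in (1, 10):
    print(f"10 TB over {port} Gbps: {transfer_hours(10, port):.1f} hours")
```

At full utilization, 10 TB takes roughly 22 hours on a 1 Gbps port and roughly 2 hours on a 10 Gbps port, which helps when weighing a network transfer against shipping physical media.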
15. AWS Ingest Options
AWS Import/Export
• Rapidly move data into and out of AWS
• Portable storage device shipment to AWS
• Supports
– Amazon EBS
– Amazon S3
– Amazon Glacier
• Use cases
– Initial data migration
– Content distribution via portable devices
– Disaster recovery
17. AWS Storage and Archive Options
Amazon Simple Storage Service (S3)
• Highly scalable object storage
• 1 byte to 5 TB in size
• 99.999999999% durability
Amazon Elastic Block Store (EBS)
• High-performance block storage device
• 1 GB to 1 TB in size
• Mount as drives to instances with snapshot/cloning functionalities
• Magnetic and General Purpose SSD
Amazon Glacier
• Long-term object archive
• Extremely low cost per gigabyte
• 99.999999999% durability
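One way to read the 99.999999999% figure is as the chance that a given object survives a year. Under that assumption, and assuming losses are independent, the expected loss rate can be worked out as in the sketch below; the function names are hypothetical.

```python
from decimal import Decimal

DURABILITY = Decimal("0.99999999999")   # 11 nines, as quoted on the slide


def expected_annual_loss(object_count, durability=DURABILITY):
    """Expected number of objects lost per year at the given durability."""
    return Decimal(object_count) * (Decimal(1) - durability)


def years_per_lost_object(object_count, durability=DURABILITY):
    """Average number of years between single-object losses."""
    return Decimal(1) / expected_annual_loss(object_count, durability)


print(years_per_lost_object(10_000))
```

For a collection of 10,000 objects this works out to an expected loss of one object every 10 million years.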
18. AWS Storage and Archive Options
Amazon Elastic Block Store (EBS)
• High I/O block storage for Amazon EC2
• Point-in-time snapshots to Amazon S3
• 99.999999999% durability for snapshots stored in Amazon S3
• Snapshot software is free
• Point-in-time snapshots across regions
19. AWS Storage and Archive Options
Amazon Simple Storage Service (S3)
• Durable and low cost
• Unlimited number of objects and volume
• Back up to Amazon S3 buckets via HTTP/HTTPS
– Create scripts using PowerShell, Perl, Python…
– Numerous solutions for data backup
• Authentication mechanisms ensure data is kept secure
• Reduced redundancy storage (RRS) option
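A scripted backup of the kind mentioned above might look roughly like this in Python. It is a sketch under stated assumptions: `backup_key` and `backup_file` are hypothetical helpers, the date-stamped key layout is just one possible convention, and `s3_client` is expected to be a boto3 S3 client (or any object offering the same `upload_file(Filename, Bucket, Key)` method).

```python
import datetime
import os


def backup_key(path, prefix="backups", today=None):
    """Build a date-stamped object key such as backups/2014-07-11/catalog.db."""
    today = today or datetime.date.today()
    return f"{prefix}/{today.isoformat()}/{os.path.basename(path)}"


def backup_file(s3_client, path, bucket, prefix="backups", today=None):
    """Upload one file and return the key it was stored under."""
    key = backup_key(path, prefix, today)
    # upload_file(Filename, Bucket, Key) is the boto3 S3 client method.
    s3_client.upload_file(path, bucket, key)
    return key
```

With boto3 installed and credentials configured, a call could be `backup_file(boto3.client("s3"), "/data/catalog.db", "example-library-backups")`, where the file path and bucket name are placeholders.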
20. AWS Storage and Archive Options
Amazon S3: Trillions of Total Objects
• Time: Instant access, anytime, anywhere
• Money: Pay for what you store
• Effort: Scales as you grow
• Quality: 99.999999999% durability
21. AWS Storage and Archive Options
Amazon Glacier
• $0.01 per GB/mo, $120 per TB/yr
• 3-5 hour data retrieval latency
• Archives: single file or zipped files
• Vaults: collection of archives
• Infinite archival storage
• 99.999999999% durability
• Immutable, encrypted by default
22. AWS Storage and Archive Options
Object Lifecycle Management: Amazon S3 → Amazon Glacier
• Seamlessly move data from Amazon S3 → Amazon Glacier
• 3-5 hour asynchronous retrieval
• Data lifecycle policies
• Amazon Glacier storage costs $0.01 per GB per month
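A data lifecycle policy of this kind can be expressed as a small configuration document. The sketch below builds one in the shape that current boto3 accepts as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`; the helper name, the `theses/` prefix, the rule ID, and the 90-day threshold are all hypothetical values chosen for illustration.

```python
import json


def glacier_transition_rule(prefix, days, rule_id="archive-to-glacier"):
    """Build one lifecycle rule that moves objects under `prefix` to Glacier."""
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
    }


# Example policy: archive everything under theses/ after 90 days.
lifecycle = {"Rules": [glacier_transition_rule("theses/", 90)]}
print(json.dumps(lifecycle, indent=2))
```

Once such a rule is attached to a bucket, objects matching the prefix move to Glacier without any further scripting, and retrieval then follows the asynchronous 3-5 hour model described above.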
23. Why AWS for Storage and Archiving?
• Protect digital content from fragility
• Protect digital assets from loss and degradation
• Promote learning
• Share research
24. TCO: On-Premises Cost Considerations
1. Primary storage hardware (primary / remote site)
2. Storage growth (cost of upgrades)
3. Storage management software and 3rd party tools
4. Professional services
5. Hardware maintenance
6. Software maintenance
7. Backup software
8. Backup hardware (primary / remote site)
9. Offsite tape storage / vault
10. Archive software
11. Archive hardware
12. Power
13. Cooling
14. Space
15. Labor
16. Cost of capital
17. Training
18. Asset depreciation
19. Migration
20. Decommission / remove
21. Recycle
22. …
25. Storage on AWS
• 10 TB S3 = $3,631.20 / year
• 5 TB S3 + 5 TB Glacier = $2,433.12 / year
• 10 TB Glacier = $1,228.80 / year
Correct as of July 11, 2014
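The yearly figures above can be reproduced with a short calculation. The Glacier rate of $0.01 per GB per month comes from the Glacier slide. The S3 tier rates ($0.0300 per GB for the first 1 TB, $0.0295 per GB for the next 49 TB), the 1,024 GB per TB conversion, and rounding each monthly bill up to the next cent are assumptions that are not stated in the slides; they are used here because they happen to match the published totals.

```python
from decimal import Decimal, ROUND_CEILING

GB_PER_TB = 1024                          # assumed conversion
GLACIER_RATE = Decimal("0.01")            # $ per GB-month, from the Glacier slide
# Assumed S3 tiers (not stated in the slides): (tier size in GB, $ per GB-month)
S3_TIERS = [(1 * GB_PER_TB, Decimal("0.0300")),
            (49 * GB_PER_TB, Decimal("0.0295"))]


def s3_monthly(terabytes):
    """Tiered monthly S3 storage cost for up to 50 TB."""
    remaining = terabytes * GB_PER_TB
    cost = Decimal(0)
    for tier_gb, rate in S3_TIERS:
        used = min(remaining, tier_gb)
        cost += used * rate
        remaining -= used
    if remaining > 0:
        raise ValueError("this sketch only covers the first 50 TB")
    return cost


def glacier_monthly(terabytes):
    """Flat-rate monthly Glacier storage cost."""
    return terabytes * GB_PER_TB * GLACIER_RATE


def yearly(monthly):
    """Round the monthly bill up to the next cent (assumption), then x 12."""
    return monthly.quantize(Decimal("0.01"), rounding=ROUND_CEILING) * 12


print(yearly(s3_monthly(10)))
print(yearly(s3_monthly(5) + glacier_monthly(5)))
print(yearly(glacier_monthly(10)))
```

The "$120 per TB/yr" shorthand on the Glacier slide appears to use 1,000 GB per TB, whereas the yearly totals here line up with 1,024 GB per TB.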