Deep Dive on Amazon S3 & Amazon Glacier Storage Management - STG311 - re:Invent 2017
1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS re:Invent
Deep Dive on Amazon S3 & Amazon Glacier Storage Management
with Special Guest, Alert Logic
Susan Chan, AWS
Sunder Parameswaran, AWS
Paul Fisher, Alert Logic
November 27, 2017
2.
What to Expect from This Session
• Overview of storage management on S3
• Organize your data
• Understand what you have stored
• Act on your storage
• How Alert Logic manages storage at scale
3.
Storage Management on S3
Organize
• Object Tagging
Monitor and Analyze
• S3 Inventory
• AWS CloudTrail
• Amazon CloudWatch
• Storage Class Analysis
Act
• Cross-Region Replication
• Lifecycle Policy
• Event Notification
Security Management
• Default Encryption
• Bucket Permissions Check
• Encryption Status in S3 Inventory
• Trusted Advisor
• Amazon Macie
• AWS KMS
• AWS IAM
4.
Object Tags
• Classify your data
• Tag your objects with key-value pairs
• Use tags to filter objects for S3 Analytics and CloudWatch Request Metrics
• Define access and lifecycle policies based on tags
Analysis | Lifecycle Policies | Access Control
Easily manage and control access for Amazon S3 objects
5.
What is Object Tagging?
Manage data based on the nature of the data instead of where it’s stored
Department=finance
Customer_ID=1234567
Project=x
Classification=Confidential
PHI=true
username=CloudNinja
format=mp4
Media_type=video
type=raw
Organization=corporate
6.
Grant user permissions by tags
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject"
      ],
      "Resource": "arn:aws:s3:::Project-bucket/*",
      "Condition": {"StringEquals": {"s3:ExistingObjectTag/Project": "X"}}
    }
  ]
}
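The tag-based grant on this slide can also be generated programmatically. The following is a minimal sketch, not an official AWS helper: the function name and the bucket name `project-bucket` are hypothetical. It uses the `s3:ExistingObjectTag/<key>` condition key, which S3 evaluates against tags already present on the object for read requests such as `s3:GetObject`.

```python
import json


def tag_scoped_read_policy(bucket: str, tag_key: str, tag_value: str) -> dict:
    """Build an IAM policy that allows s3:GetObject only on objects
    carrying the given tag key and value (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    "StringEquals": {f"s3:ExistingObjectTag/{tag_key}": tag_value}
                },
            }
        ],
    }


# Assumed example values; the tag mirrors Project=x from the tagging slide.
policy = tag_scoped_read_policy("project-bucket", "Project", "X")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Serializing with `json.dumps` avoids hand-editing mistakes such as a missing comma between `Resource` and `Condition`.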
7.
Storage Management on S3
8.
S3 Inventory
Save time | Daily or weekly delivery | Delivery to S3 bucket
• Same set of metadata as the LIST API
• Can add size, last modified date, storage class, ETag, or replication status
Trigger business workflows and applications such as secondary index, garbage collection, data auditing, and offline analytics
Delivery notification
9.
S3 Inventory
• Object-level encryption status
• CSV or ORC output format
• Query with Athena, Redshift Spectrum, or any Hive tools
• Encrypt inventory with SSE-S3 or SSE-KMS
10.
S3 Inventory
Bucket name
Key name
Version ID
IsLatest
Size
Last modified date
ETag
Storage class
Multipart upload flag
Delete marker
Replication status
Encryption Status
11.
"The rich information generated by AWS
through the new object encryption status in S3
Inventory has been instrumental in helping us to
automate and streamline daily reporting on
compliance controls."
– John Andrukonis
Chief Architect, Capital One.
12.
Query S3 Inventory with Amazon Athena
CREATE EXTERNAL TABLE my_inventory_table(
  `bucket` string,
  key string,
  version_id string,
  is_latest boolean,
  is_delete_marker boolean,
  size bigint,
  last_modified_date timestamp,
  e_tag string,
  storage_class string,
  is_multipart_uploaded boolean,
  replication_status string,
  encryption_status string)
PARTITIONED BY (dt string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION 's3://bucketname/inventory/output_destination/hive';
13.
Query S3 Inventory with Amazon Athena
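For small CSV-format inventories, the same kind of question can be answered without Athena. The sketch below is illustrative only: the sample rows are made up, the column order is assumed to match the table definition shown earlier, and `bytes_by_encryption_status` is a hypothetical helper. It totals bytes per encryption status for current, non-deleted object versions.

```python
import csv
import io
from collections import Counter

# Assumed column order, taken from the inventory field list in this deck.
COLUMNS = [
    "bucket", "key", "version_id", "is_latest", "is_delete_marker", "size",
    "last_modified_date", "e_tag", "storage_class", "is_multipart_uploaded",
    "replication_status", "encryption_status",
]

# Made-up sample rows; inventory CSV files carry no header row.
SAMPLE = '''"bucketname","logs/a.gz","v1","true","false","1024","2017-11-01T00:00:00.000Z","etag1","STANDARD","false","","SSE-S3"
"bucketname","logs/b.gz","v2","true","false","2048","2017-11-02T00:00:00.000Z","etag2","STANDARD_IA","false","","NOT-SSE"
"bucketname","logs/c.gz","v3","true","false","4096","2017-11-03T00:00:00.000Z","etag3","STANDARD","false","","SSE-KMS"
'''


def bytes_by_encryption_status(csv_text: str) -> Counter:
    """Sum object sizes per encryption status, skipping old versions and delete markers."""
    totals = Counter()
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=COLUMNS)
    for row in reader:
        if row["is_latest"] != "true" or row["is_delete_marker"] == "true":
            continue
        totals[row["encryption_status"]] += int(row["size"])
    return totals


totals = bytes_by_encryption_status(SAMPLE)
print(dict(totals))
```

A non-zero total under `NOT-SSE` is the kind of signal a daily compliance report would flag.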
14.
Visualize in Amazon QuickSight
15.
Amazon CloudWatch Storage Metrics
Free, daily, bucket-level metrics
• Object count
• Bytes stored
16.
Amazon CloudWatch Request Metrics for S3
Monitor performance and operation
• Generate metrics for data of your choice
• 1-minute CloudWatch metrics
• Alert and alarm on metrics
17.
Amazon CloudWatch Metrics for S3
Charts: HEAD requests, POST requests
18.
Storage Class Analysis
Data-driven storage management for S3
Daily Storage Class Analysis | Export analysis data to your S3 bucket | Filter by bucket, prefix, or object tags
• Monitors access patterns to understand your storage usage
• After 30 days, recommends when to move objects to Standard – Infrequent Access
• Export file includes a daily report of storage, retrieved bytes, and GETs by object age
19.-21.
Storage Class Analysis
22.
Visualizing with Amazon QuickSight
23.
Export Storage Class Analysis
Amazon Redshift | Tableau
24.
API Logging with AWS CloudTrail
Perform security analysis, meet your IT auditing and compliance needs, and take immediate action on object-level activity to improve your security posture
• Log object-level operations (S3 Data Events)
• Log bucket-level operations (Management Events)
Amazon CloudWatch Events
28.
Security Inspection
• Bucket Permissions Check: AWS Trusted Advisor, S3 console
• Object encryption status: S3 Inventory
29.
Bucket Permissions Check
31.
Putting it all together
• Storage Class Analysis
• Monitor and alarm: Amazon CloudWatch
• Logging: AWS CloudTrail
• Security monitoring: S3 Inventory
32.
Storage Management on S3
36.
Lifecycle Policies
Lifecycle rules take action based on object age
Example policy:
• Move all objects older than 30 days to Standard – Infrequent Access
• Move all objects older than 90 days to Amazon Glacier
Create rules to automatically Transition or Expire your storage
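The example policy above can be written as a lifecycle configuration document. This is a minimal sketch: the dictionary follows the shape accepted by the S3 lifecycle configuration API (as used, for example, by boto3's `put_bucket_lifecycle_configuration`), while the function name and rule ID are hypothetical. No AWS call is made here.

```python
import json


def age_based_lifecycle(days_to_ia: int = 30, days_to_glacier: int = 90) -> dict:
    """Build a lifecycle configuration that tiers every object by age."""
    if days_to_glacier <= days_to_ia:
        raise ValueError("Glacier transition must come after the Standard-IA transition")
    return {
        "Rules": [
            {
                "ID": "tier-by-age",          # hypothetical rule name
                "Filter": {"Prefix": ""},     # empty prefix: applies to the whole bucket
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_to_ia, "StorageClass": "STANDARD_IA"},
                    {"Days": days_to_glacier, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


lifecycle = age_based_lifecycle()
print(json.dumps(lifecycle, indent=2))
```

Adding an `"Expiration": {"Days": N}` entry to the same rule would cover the expire half of the slide.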
37.
Cross-Region Replication (CRR)
What is CRR?
Automated, fast, and reliable asynchronous replication of data across AWS Regions
Use cases: Compliance | Lower latency | Security
39.
How does CRR work?
Bucket A (Region A) → Bucket B (Region B)
• All uploads into the source bucket are replicated
• Entire bucket or prefix
• Choose any AWS Region as your target Region
• Secure transfer via SSL
• Exact replicas, including object ACLs and tags
Tip: Lifecycle policies are independent between source and destination
40.
CRR across accounts
Diagram labels: Region A, Region B; Primary Account, Secondary Account
Why?
• Additional protection for your backups, to guard against malicious deletes
Ownership overwrite
• Replicas are owned by the destination bucket owner
• Maintain two distinct and independent stacks of ownership
41.
More with Cross-Region Replication
• Choose any AWS Region as target
• Choose any S3 storage class as target
• Lifecycle policy
• Support for SSE-KMS encrypted objects
• Ownership overwrite for cross-account CRR
• Bi-directional replication
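The CRR options covered so far (prefix scope, target storage class, ownership overwrite) map onto fields of a replication configuration. The sketch below only builds that document; the role ARN, bucket name, and account ID are placeholders, and the helper name is hypothetical. It assumes the prefix-based rule format that was current at the time of this talk. Versioning must be enabled on both buckets before S3 accepts such a configuration.

```python
import json
from typing import Optional


def replication_config(role_arn: str, dest_bucket: str, prefix: str = "",
                       storage_class: str = "STANDARD_IA",
                       dest_account_id: Optional[str] = None) -> dict:
    """Build a CRR configuration; pass dest_account_id to request ownership overwrite."""
    destination = {
        "Bucket": f"arn:aws:s3:::{dest_bucket}",
        "StorageClass": storage_class,
    }
    if dest_account_id is not None:
        # Cross-account case: replicas become owned by the destination account.
        destination["Account"] = dest_account_id
        destination["AccessControlTranslation"] = {"Owner": "Destination"}
    return {
        "Role": role_arn,
        "Rules": [
            {
                "ID": "crr-rule",      # hypothetical rule name
                "Prefix": prefix,      # empty string replicates the entire bucket
                "Status": "Enabled",
                "Destination": destination,
            }
        ],
    }


# Placeholder identifiers, not real resources.
crr = replication_config(
    "arn:aws:iam::111111111111:role/crr-role",
    "my-destination-bucket",
    dest_account_id="222222222222",
)
print(json.dumps(crr, indent=2))
```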
42.
Getting Started—CRR
43.
Source bucket: my_source_bucket
45.
Destination: My_destination_bucket
KMS master key: arn:aws:kms:us-east-2:123456789:/abc12345t234-1234-5678-a12b-a12b34cd567
Ownership overwrite
46.
Cross-Account CRR: Destination Setup
Destination bucket
47.
Cross-Account CRR: Destination Setup
48.
Automate with Trigger-Based Workflow
Amazon S3 event notifications
Events → SNS topic | SQS queue | Lambda function
• Notifications when objects are created (via PUT, POST, COPY, or multipart upload) or deleted
• Filter on prefixes and suffixes
• Trigger workflows with Amazon SNS, Amazon SQS, and AWS Lambda functions
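A notification rule combining the event types and the prefix/suffix filter described above might look like the following. This is a sketch that only builds the configuration document in the shape the S3 notification API accepts; the Lambda ARN, rule ID, prefix, and suffix are made-up example values.

```python
import json


def lambda_notification(function_arn: str, prefix: str, suffix: str) -> dict:
    """Build a bucket notification configuration that invokes a Lambda function
    for created or removed objects whose key matches prefix and suffix."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": "on-object-change",   # hypothetical rule name
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": prefix},
                            {"Name": "suffix", "Value": suffix},
                        ]
                    }
                },
            }
        ]
    }


# Placeholder ARN and key filters.
notification = lambda_notification(
    "arn:aws:lambda:us-east-1:111111111111:function:process-upload",
    prefix="uploads/",
    suffix=".mp4",
)
print(json.dumps(notification, indent=2))
```

SNS and SQS targets use the sibling keys `TopicConfigurations` and `QueueConfigurations` with the same `Events` and `Filter` structure.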
49.
Default Encryption
Automatically encrypts all objects written to your Amazon S3 bucket
• Choose SSE-S3 or SSE-KMS
• Makes it easy to satisfy compliance needs
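The SSE-S3 versus SSE-KMS choice comes down to one field in the bucket's encryption configuration. The following sketch builds that document in the shape the S3 bucket encryption API accepts; the helper name and the key ID are hypothetical, and no AWS call is made.

```python
import json
from typing import Optional


def default_encryption(kms_key_id: Optional[str] = None) -> dict:
    """Return a default-encryption configuration: SSE-S3 when no key is given,
    SSE-KMS with the given key otherwise."""
    if kms_key_id is None:
        rule = {"SSEAlgorithm": "AES256"}  # SSE-S3
    else:
        rule = {"SSEAlgorithm": "aws:kms", "KMSMasterKeyID": kms_key_id}  # SSE-KMS
    return {"Rules": [{"ApplyServerSideEncryptionByDefault": rule}]}


sse_s3 = default_encryption()
sse_kms = default_encryption("example-key-id")  # placeholder key identifier
print(json.dumps(sse_s3))
print(json.dumps(sse_kms))
```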
50.
Amazon Macie
A security service that uses machine learning to automatically
discover, classify, and protect sensitive data in AWS.
• Recognizes sensitive data
• Continuously monitors data access
• Provides dashboards and alerts
51.
Summary: Act on Your Storage
• Lower latency and backup with Cross-Region Replication
• Lower cost with Lifecycle policies
• Manage security with Amazon Macie
• Automatic encryption with Default Encryption
52.
Data Storage @ Alert Logic
Paul Fisher
Technical Fellow
53.
Our Offering
Your Team
Reduce attack surface
Block known bad
Integrate & streamline
Identify suspicious
Contain & remediate
24x7 Monitoring & Validation
Detection Analytics
Vulnerability Assessment
Compliance
Data Inspection: web | log | network
Managed WAF
Reveal actual threats
Prioritize, explain, notify
CONTAINMENT | PREVENTION
54.
Our Business
• 4,100+ customers, 100% subscription business
• Ingesting 2+ PB/month
• Up to 7 years of data retention
• Processing 1.2M messages/second
• Storage growing 110%+ per year
Chart: Monthly Customer Data Collection Forecast (Uncompressed), in petabytes (0 to 40), January 2012 to May 2021. Labeled points: .09 PB, 2 PB, ~30 PB.
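The forecast can be sanity-checked with compound growth. This is a back-of-the-envelope sketch under stated assumptions that are not in the slides: that the 2 PB/month figure corresponds to November 2017, that the chart ends 3.5 years later in May 2021, and that the 110% annual growth rate stays constant.

```python
def project_monthly_volume(current_pb: float, annual_growth: float, years: float) -> float:
    """Project monthly ingest volume with constant compound annual growth.
    annual_growth is a fraction, so 1.10 means +110% per year."""
    return current_pb * (1.0 + annual_growth) ** years


# Assumed inputs: 2 PB/month today, +110%/year, 3.5 years ahead.
projected = project_monthly_volume(2.0, 1.10, 3.5)
print(f"{projected:.1f} PB/month")
```

Under these assumptions the projection lands in the high twenties of petabytes per month, the same order as the ~30 PB label on the chart.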
55.
Foundational Components
Diagram labels:
• Customer Environments and Partner Environments: AWS, Azure, and Traditional, each with appliances and host agents; Cloud Collection; AWS Partner Account; Partner DC
• Data collected: log, ids
• Security Subsystems: Ingestion, Storage & Access; Assets & Config; Correlation & Analytics; Incident Analysis & Workflow; Vulnerability Assessment
• Interfaces and consumers: External APIs; Internal UI/UX & APIs; Support, Customers, Reporting, Analysts, Partners, Ticketing, Monitoring
56.
Ingestion, Storage & Access
Diagram labels: appliance and host agents; AWS Partner Account (log, ids); Collection; Ingestion; Search; Data Access. Legend: Control Flow, Data Flow.
57.
Data Storage Solution
Requirements
Guarantee End-to-end data integrity
Per-Customer Encryption-at-Rest
Per-Customer/Data Type Expiration Policies
Per-Data Type Storage Class Management
Multi-Region Data Availability
Per-Customer Storage and Access Analysis
Per-Customer economics need to be inexpensive
… and it still has to be fast
… and scale with customer growth!
58.
We use these S3 Management Features
S3 Object Tagging
S3 Lifecycle Expiration and Tiering
S3 Cross-Region Replication
S3 Inventory
S3 VPC Endpoints
Glacier Expedited Retrieval
AWS KMS CMK and Data Keys
IAM Cross-Account Roles
59.
Core Data Storage & Retrieval
S3 Object Keys use hash prefix for performance
• logmsgs-001:/X-OGA/15543.2016-03/…
S3 Objects written with two Tags
• Customer identifier (e.g. cid=1234567890)
• Date (e.g. date=2017-06)
AWS KMS used to generate data encryption keys
• Customer Master Key (CMK) for each data type, with automatic rotation enabled
• Data Keys generated per-customer/per-month
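The key-prefix and tagging ideas above can be illustrated as follows. This is not Alert Logic's actual scheme: the hash function, prefix length, key layout, and object name are all assumptions chosen for the example. Only the `cid` and `date` tag names come from the slide. The tag string uses the URL-encoded form that S3 expects for tags supplied at upload time.

```python
import hashlib
from urllib.parse import urlencode


def hashed_key(customer_id: str, month: str, name: str, prefix_len: int = 4) -> str:
    """Build an object key that starts with a short hash so that keys spread
    evenly across the keyspace (illustrative layout, not the speaker's)."""
    digest = hashlib.sha256(f"{customer_id}/{month}".encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{customer_id}.{month}/{name}"


def tagging_string(customer_id: str, month: str) -> str:
    """URL-encoded tag set carrying the two tags named on the slide."""
    return urlencode({"cid": customer_id, "date": month})


key = hashed_key("1234567890", "2017-06", "batch-0001.gz")  # hypothetical object name
tags = tagging_string("1234567890", "2017-06")
print(key)
print(tags)
```

Because the prefix is derived from the customer and month, a reader can recompute it and address the data directly without a listing.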
60.
Core Data Storage & Retrieval
Data Isolation via Cross-Account Service Access
Data account runs no code
Only read-only and read-write IAM Roles defined
Only authorized services can assume these Roles
All deletion is via Lifecycle Policy
61.
Multi-Region Availability
Primary Region in Standard/Standard-IA
• Drops to Standard-IA in 1-3 months via Lifecycle/Tags
Secondary Region in Infrequent Access/Glacier
• S3 Cross-Region Replication (CRR)
• Drops to Glacier in 1 month via Lifecycle/Tags
Multi-region availability in minutes
• Simply redirect requests to secondary
• Use Glacier Expedited Retrieval on demand
Total blended cost ~ $0.017/GB/month
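One way to see how a blended figure of this size can arise is to average storage prices over an object's lifetime in each Region. Everything numeric below is an assumption for illustration, not data from the talk: the per-GB monthly prices, an 84-month (7-year) retention period, data spread evenly across ages, and no request, retrieval, or transfer charges.

```python
# Assumed per-GB monthly storage prices in USD (illustrative, not from the talk).
PRICE = {"STANDARD": 0.023, "STANDARD_IA": 0.0125, "GLACIER": 0.004}


def average_monthly_cost(first_class: str, months_in_first: int,
                         second_class: str, retention_months: int) -> float:
    """Average per-GB monthly cost for data that spends months_in_first in
    first_class and the rest of its retention period in second_class."""
    remaining = retention_months - months_in_first
    total = months_in_first * PRICE[first_class] + remaining * PRICE[second_class]
    return total / retention_months


# Primary copy: Standard for 3 months, then Standard-IA.
primary = average_monthly_cost("STANDARD", 3, "STANDARD_IA", 84)
# Secondary copy: Standard-IA for 1 month, then Glacier.
secondary = average_monthly_cost("STANDARD_IA", 1, "GLACIER", 84)
blended = primary + secondary
print(f"${blended:.4f}/GB/month")
```

Under these assumptions the two copies together come out close to the quoted figure, though the speakers may have derived theirs differently.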
62.
Tags with Lifecycle Expiration Policies
Per Customer Expiration Rule
Uses ‘cid’ and ‘date’ tags as filter
Independent of object create time
63.
Tags with Lifecycle Transition Policies
One Transition Rule per month
Uses ‘date’ tag as filter
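Tag-filtered rules like the ones on these two slides could be expressed as below. This is a sketch, not the speaker's configuration: it assumes date-based actions (an absolute `Date` rather than `Days` since creation) as one way to make a rule independent of object create time, and the retention dates, rule IDs, and helper names are made up. The dictionaries follow the S3 lifecycle rule shape with `Tag` and `And` filters.

```python
from datetime import datetime, timezone


def expiration_rule(cid: str, month: str, expire_on: datetime) -> dict:
    """Per-customer expiration rule, filtered on both the cid and date tags."""
    return {
        "ID": f"expire-{cid}-{month}",
        "Filter": {"And": {"Tags": [
            {"Key": "cid", "Value": cid},
            {"Key": "date", "Value": month},
        ]}},
        "Status": "Enabled",
        "Expiration": {"Date": expire_on},
    }


def transition_rule(month: str, transition_on: datetime,
                    storage_class: str = "STANDARD_IA") -> dict:
    """One transition rule per month, filtered on the date tag only."""
    return {
        "ID": f"tier-{month}",
        "Filter": {"Tag": {"Key": "date", "Value": month}},
        "Status": "Enabled",
        "Transitions": [{"Date": transition_on, "StorageClass": storage_class}],
    }


# Hypothetical dates: tier June 2017 data in September 2017, expire it a year on.
tagged_lifecycle = {
    "Rules": [
        transition_rule("2017-06", datetime(2017, 9, 1, tzinfo=timezone.utc)),
        expiration_rule("1234567890", "2017-06", datetime(2018, 7, 1, tzinfo=timezone.utc)),
    ]
}
print([rule["ID"] for rule in tagged_lifecycle["Rules"]])
```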
64.
Inventory for Bundling Optimization
65.
Analytics for Customer Usage Patterns
66.
Demonstrate Scale of Storage Solution
Scaled workload 100x successfully
140 PB/month of customer data
30k writes/second sustained
Write latency 200 ms at 95th percentile
Read latency 125 ms at 95th percentile
Limited only by resources driving traffic
67.
Recap
Organize your data
• Object tags
Understand your data
• Storage Class Analysis, S3 Inventory, Metrics
Act on your data
• Lifecycle, CRR, Default Encryption, Event Notifications
Monitor and Secure your data
• Macie, Bucket Permissions Check, Trusted Advisor
68.
Learn more…
STG302: Best Practices for Amazon S3 – Mon., 7 p.m. or Tues., 2:30 p.m.
STG401: This Is My Architecture – Storage Lightning Round – Tues., 12:15 p.m.
STG301: Deep Dive on Amazon S3 & Amazon Glacier Infrastructure – Tues., 4 p.m.
STG201: Storage State of the Union – Wed., 11:30 a.m.
STG313: Big Data Breakthroughs – Wed., 12:15 p.m. or 7 p.m.
STG312: Best Practices for Building a Data Lake in Amazon S3 & Amazon Glacier – Thurs., 3:15 p.m.
69.
Thank you!
Amazon S3 Amazon Glacier