SRV413 Deep Dive on Elastic Block Storage (Amazon EBS)

Amazon EBS provides persistent block-level storage volumes for use with Amazon EC2 instances. In this technical session, you will discover how Amazon EBS can take your application deployments on EC2 to the next level. Session attendees will learn about the Amazon EBS features and benefits, how to identify applications that are appropriate for use with Amazon EBS, best practices, and details about its performance and volume types. We discuss how to maximize Amazon EBS performance, with a special emphasis on low-latency, high-throughput applications like transactional and NoSQL databases, and big data analysis frameworks like Hadoop and Kafka. We will also dive deep and discuss Elastic Volumes, our latest EBS feature that allows you to dynamically increase capacity, tune performance, and change the type of EBS volumes on the fly. Throughout, we share tips for success.

  1. 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. August 14, 2017 Deep Dive on Elastic Block Storage (Amazon EBS) SRV413
  2. 2. What to expect from this session • Storage options on AWS • EBS Overview • Volume Types • Deep Dive
  3. 3. The AWS Storage Portfolio • Object: Amazon S3, Amazon Glacier • Block: Amazon EBS (persistent), Amazon EC2 Instance Store (ephemeral) • File: Amazon EFS • Cloud Data Migration: Direct Connect, Snow* data transport family, 3rd Party Connectors, Transfer Acceleration, Storage Gateway, Kinesis Firehose
  4. 4. AWS block storage offerings
  5. 5. AWS block storage offerings • EC2 instance store • EBS SSD-backed volumes: gp2, io1 • EBS HDD-backed volumes: st1, sc1
  6. 6. What is Amazon EC2 instance store? EC2 instances • Local to instance • Non-persistent data store • Data not replicated (by default) • No snapshot support • SSD or HDD Physical Host Instance Store or
  7. 7. What is Amazon Elastic Block Store (EBS)? EC2 Instance Store: • Local to instance • Non-persistent data store • Data not replicated (by default) • No snapshot support • SSD or HDD
 Elastic Block Store: • Persistent block storage volumes • 99.999% availability
 • Automatically replicated within its Availability Zone (AZ)
 • Point-in-time snapshot support
 • Modify volume type as needs change
 • SSD or HDD
  8. 8. The history of EBS • 2006 – EC2 launched with instance storage • 2008 – EBS (Elastic Block Store) launched on magnetic storage • 2012 – EBS io1 (SSD) & EBS-Optimized instances • 2014 – EBS gp2 (SSD) • 2014 – EBS data volume encryption • 2015 – Larger/faster EBS volumes • 2015 – EBS boot volume encryption • 2016 – EBS st1 (HDD) and sc1 (HDD) • 2017 – EBS Elastic Volumes!
  9. 9. AWS block storage offerings • EC2 instance store • EBS SSD-backed volumes: gp2, io1 • EBS HDD-backed volumes: st1, sc1
  10. 10. What is EBS? EBS volume EC2 instance • Block storage as a service • Create, attach volumes through an API • Service accessed over the network
  11. 11. What is EBS? EC2 instance != EBS volume
  12. 12. What is EBS? EC2 instance EBS volume
  13. 13. What is EBS? EC2 instance EC2 instance • Volumes persist independent of EC2
 • Select storage and compute based on your workload
 • Detach and attach between instances within the same AZ Availability Zone AWS Region EBS volume
  14. 14. What is EBS? • Volumes attach to one instance at a time • Many volumes can attach to an instance • Separate boot volume from data volumes [Diagram: an EC2 instance with one EBS boot volume and two EBS data volumes in an Availability Zone within an AWS Region]
  15. 15. What is EBS? EBS volume Availability Zone AWS Region Replica
  16. 16. EBS is designed for: What is EBS? 99.999% service availability 0.1% to 0.2% annual failure rate (AFR)
  17. 17. What is an EBS-optimized instance? Without EBS optimization, a c3.2xlarge shares ~125 MB/s of network bandwidth among Internet, database, S3, and EBS traffic
  18. 18. What is an EBS-optimized instance? With EBS optimization, a c3.2xlarge gets ~125 MB/s dedicated to EBS, while Internet, database, and S3 traffic share a separate ~125 MB/s
  19. 19. What is an EBS-optimized instance? More details: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html • Dedicated network bandwidth for EBS I/O • Enabled by default on c4, d2, m4, r4, p2, and x1 instances • Can be enabled at instance launch or on a running instance • Not an option on some 10 Gbps instance types • (c3.8xlarge, r3.8xlarge, i2.8xlarge)
  20. 20. EBS volume types
  21. 21. EBS volume types • Solid-State Drives (SSD) • Hard Disk Drives (HDD)
  22. 22. EBS volume types • SSD: General Purpose SSD (gp2), Provisioned IOPS SSD (io1) • HDD: Throughput Optimized HDD (st1), Cold HDD (sc1)
  23. 23. Choosing an EBS volume type What is more important to your workload? IOPS or throughput?
  24. 24. Choosing an EBS volume type Don’t know your workload yet?
  25. 25. EBS volume types: General Purpose SSD (gp2) Baseline: 3 IOPS per GiB (minimum 100 IOPS) up to 10,000 IOPS Burst: 3,000 IOPS (for volumes up to 1 TiB) Throughput: 160 MiB/s Latency: Single-digit ms Capacity: 1 GiB to 16 TiB Great for boot volumes, low-latency applications, and bursty databases
  26. 26. Burst and baseline: General Purpose SSD (gp2) [Chart: IOPS vs. volume size. Baseline IOPS scales linearly at 3 IOPS/GiB from the 100 IOPS minimum up to 10,000 IOPS at 3,334 GiB; volumes are burstable to 3,000 IOPS. Labeled points: 100 IOPS minimum, 300 GiB volume, 3,334 GiB.]
  27. 27. Burst bucket: General Purpose SSD (gp2) • Max I/O credit per bucket is 5.4M • You can spend credits at up to 3,000 IOPS • Always accumulating 3 I/O credits per GiB per second • Baseline performance = 3 IOPS per GiB or 100 IOPS
  28. 28. How long can I burst on gp2? [Chart: minutes of burst vs. volume size in GiB] • 300 GiB: 43 min • 500 GiB: 1 hour • 950 GiB: 10 hours
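The burst durations on this slide can be reproduced from the bucket figures on the previous one. The sketch below is only an illustration, assuming the bucket starts full, drains at a sustained 3,000 IOPS, and refills at the baseline rate the whole time; the function names are made up for this example.

```python
BUCKET_CREDITS = 5_400_000  # max I/O credits per gp2 bucket (from the slides)
BURST_IOPS = 3_000          # maximum burst rate
MIN_BASELINE_IOPS = 100
MAX_BASELINE_IOPS = 10_000
IOPS_PER_GIB = 3


def gp2_baseline_iops(size_gib):
    """Baseline IOPS: 3 per GiB, floored at 100 and capped at 10,000."""
    return max(MIN_BASELINE_IOPS, min(IOPS_PER_GIB * size_gib, MAX_BASELINE_IOPS))


def gp2_burst_minutes(size_gib):
    """Minutes a full bucket lasts at a sustained 3,000 IOPS.

    Returns None when the baseline already reaches the burst rate,
    because the bucket then never drains.
    """
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= BURST_IOPS:
        return None
    # Credits leave at the burst rate and come back at the baseline rate.
    drain_per_second = BURST_IOPS - baseline
    return BUCKET_CREDITS / drain_per_second / 60


for size in (300, 500, 950):
    print(size, "GiB:", round(gp2_burst_minutes(size)), "minutes")
```

Under these assumptions the three sizes come out at roughly 43, 60, and 600 minutes, which matches the chart.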
  29. 29. How do I monitor gp2 burst balance? [Chart: VolumeWriteOps and BurstBalance over one hour] • 900,000 write I/Os over 5 min = 3,000 IOPS • 450,000 write I/Os over 5 min = 1,500 IOPS
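The conversion on this slide is a division of the operation count by the length of the period in seconds. A minimal sketch, assuming the count is the sum of VolumeWriteOps over one period (the function name is hypothetical):

```python
def average_iops(operation_count, period_seconds):
    """Average IOPS implied by an operation count summed over one period."""
    return operation_count / period_seconds


five_minutes = 5 * 60
print(average_iops(900_000, five_minutes))
print(average_iops(450_000, five_minutes))
```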
  30. 30. How do I monitor gp2 burst balance?
  31. 31. Choosing an EBS volume type What is more important to your workload? IOPS or throughput?
  32. 32. Choosing an EBS volume type • IOPS is more important • IOPS: ≤ 80,000 or > 80,000 • Latency? < 1 ms or single-digit ms • Which is more important? Cost or performance • Options shown: i3, gp2, io1
  33. 33. EBS volume types: I/O Provisioned Provisioned IOPS SSD io1 Baseline: 100 to 20,000 IOPS Throughput: 320 MiB/s Latency: Single-digit ms Capacity: 4 GiB to 16 TiB Ideal for critical applications and databases with sustained IOPS
  34. 34. Scaling Provisioned IOPS SSD (io1) [Chart: available provisioned IOPS vs. volume size (TiB)] Maximum provisioned IOPS follows a maximum IOPS:GiB ratio of 50:1 and reaches 20,000 IOPS at ~400 GiB
  35. 35. Choosing an EBS volume type • IOPS is more important • IOPS: ≤ 80,000 or > 80,000 • Latency? < 1 ms or single-digit ms • Which is more important? Cost or performance • Options shown: i3, gp2, io1 • Throughput?
  36. 36. Choosing an EBS volume type • IOPS is more important (small, random I/O): IOPS ≤ 80,000 or > 80,000; Latency? < 1 ms or single-digit ms; Which is more important? Cost or performance; options shown: i3, gp2, io1 • Throughput is more important (large, sequential I/O): Aggregate throughput? ≤ 1,750 MiB/s or > 1,750 MiB/s; Which is more important? Cost or performance; options shown: st1, d2
  37. 37. EBS volume types: Throughput Provisioned st1 Baseline: 40 MiB/s per TiB up to 500 MiB/s Capacity: 500 GiB to 16 TiB Burst: 250 MiB/s per TiB up to 500 MiB/s Ideal for large-block, high-throughput sequential workloads
  38. 38. Throughput Optimized HDD – burst and base [Chart: throughput in MB/s vs. volume size in TB (0.5 to 16), with burst and base curves for st1; labeled point: 320]
  39. 39. Burst bucket: Throughput Optimized HDD (st1) • Max I/O bucket credit is 1 TiB of credit per TiB in volume • You can spend up to 250 MiB/s per TiB • Always accumulating 40 MiB/s per TiB • Baseline performance = 40 MiB/s per TiB
  40. 40. Burst bucket: example 8 TiB st1 volume • Up to 8 TiB in I/O credit • You can spend up to 500 MiB/s • Always accumulating 320 MiB/s • Baseline performance = 320 MiB/s
  41. 41. Choosing an EBS volume type • IOPS is more important (small, random I/O): IOPS ≤ 80,000 or > 80,000; Latency? < 1 ms or single-digit ms; Which is more important? Cost or performance; options shown: i3, gp2, io1 • Throughput is more important (large, sequential I/O): Aggregate throughput? ≤ 1,750 MiB/s or > 1,750 MiB/s; Which is more important? Cost or performance; options shown: sc1, st1, d2
  42. 42. EBS volume types: Throughput Provisioned Cold HDD (sc1) Baseline: 12 MiB/s per TiB up to 192 MiB/s Capacity: 500 GiB to 16 TiB Burst: 80 MiB/s per TiB up to 250 MiB/s Ideal for sequential throughput workloads, such as logging and backup
  43. 43. Cold HDD – burst and base [Chart: throughput in MiB/s vs. volume size in TiB (0.5 to 16), with burst and base curves for sc1; labeled point: 192]
  44. 44. Burst bucket: Cold HDD (sc1) Max I/O bucket credit is 1 TiB of credit per TiB in volume You can spend up to 80 MiB/s per TiB Baseline performance = 12 MiB/s per TiB Always accumulating 12 MiB/s per TiB sc1
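The st1 and sc1 figures on the preceding slides follow the same pattern: a per-TiB rate with a cap, plus a bucket of 1 TiB of credit per TiB of volume. The sketch below is an illustration under the assumption that a full bucket drains at the burst rate while refilling at the baseline rate; the table and function names are made up for this example.

```python
HDD_LIMITS = {
    # type: (baseline MiB/s per TiB, baseline cap, burst MiB/s per TiB, burst cap)
    "st1": (40, 500, 250, 500),
    "sc1": (12, 192, 80, 250),
}


def hdd_throughput(volume_type, size_tib):
    """Return (baseline, burst) throughput in MiB/s for an st1 or sc1 volume."""
    base_rate, base_cap, burst_rate, burst_cap = HDD_LIMITS[volume_type]
    baseline = min(base_rate * size_tib, base_cap)
    burst = min(burst_rate * size_tib, burst_cap)
    return baseline, burst


def full_bucket_burst_hours(volume_type, size_tib):
    """Hours a full bucket lasts at the burst rate, or None if it never drains."""
    baseline, burst = hdd_throughput(volume_type, size_tib)
    if burst <= baseline:
        return None
    credit_mib = size_tib * 1024 * 1024  # 1 TiB of credit per TiB of volume
    return credit_mib / (burst - baseline) / 3600


print(hdd_throughput("st1", 8))
print(hdd_throughput("sc1", 16))
```

For the 8 TiB st1 example this gives a 320 MiB/s baseline and a 500 MiB/s burst, in line with the slide.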
  45. 45. Choosing an EBS volume type • IOPS is more important (small, random I/O): IOPS ≤ 80,000 or > 80,000; Latency? < 1 ms or single-digit ms; Which is more important? Cost or performance; options shown: i3, gp2, io1 • Throughput is more important (large, sequential I/O): Aggregate throughput? ≤ 1,750 MiB/s or > 1,750 MiB/s; Which is more important? Cost or performance; options shown: sc1, st1, d2
  46. 46. I/O Provisioned Volumes: • gp2: $0.10 per GiB • io1: $0.125 per GiB plus $0.065 per PIOPS Throughput Provisioned Volumes: • st1: $0.045 per GiB • sc1: $0.025 per GiB Snapshot storage for all volume types is $0.05 per GiB per month * All prices are per month, prorated to the hour and from the us-west-2 Region as of August 2017
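To compare volume types on cost, the monthly charge can be estimated from these prices. This is a rough sketch, assuming the prices map to volume types as gp2 $0.10, io1 $0.125 plus $0.065 per PIOPS, st1 $0.045, and sc1 $0.025 per GiB (us-west-2, August 2017); the function name is hypothetical, and snapshot storage and hourly proration are left out.

```python
from decimal import Decimal

# Per-GiB monthly prices as read from the slide (an assumption about the mapping).
PRICE_PER_GIB = {
    "gp2": Decimal("0.10"),
    "io1": Decimal("0.125"),
    "st1": Decimal("0.045"),
    "sc1": Decimal("0.025"),
}
PRICE_PER_PIOPS = Decimal("0.065")  # io1 only


def monthly_cost(volume_type, size_gib, provisioned_iops=0):
    """Estimated monthly cost in USD for one volume."""
    cost = PRICE_PER_GIB[volume_type] * size_gib
    if volume_type == "io1":
        cost += PRICE_PER_PIOPS * provisioned_iops
    return cost


print("gp2 500 GiB:", monthly_cost("gp2", 500))
print("io1 500 GiB, 10,000 PIOPS:", monthly_cost("io1", 500, 10_000))
```

Decimal is used instead of float so that the sums stay exact to the cent.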
  47. 47. EBS Elastic Volumes
  48. 48. Elastic Volumes – features • Increase volume size • Change volume type • Increase / decrease provisioned IOPS *Volume modifications must be valid for the target volume type
  49. 49. Elastic Volumes – overview Maintenance Windows
  50. 50. Elastic Volumes – supported volume types • General Purpose SSD (gp2) • Provisioned IOPS SSD (io1) • Throughput Optimized HDD (st1) • Cold HDD (sc1)
  51. 51. Elastic Volumes - modify provisioned IOPS io1 increase decrease
  52. 52. Elastic Volumes - increase volume size gp2 gp2 io1 io1 st1 st1 sc1 sc1
  53. 53. Decrease volume size? gp2 gp2 sc1 sc1 io1 io1 st1 st1
  54. 54. Change volume type and increase size? gp2 sc1io1 st1 gp2 sc1io1 st1
  55. 55. Considerations Volume modifications must be valid for the target volume type Example: 400 GiB gp2 → 400 GiB st1?
  56. 56. EBS volume types: Throughput Provisioned Throughput Optimized HDD st1 Baseline: 40 MiB/s per TiB up to 500 MiB/s Capacity: 500 GiB to 16 TiB Burst: 250 MiB/s per TiB up to 500 MiB/s Ideal for large-block, high-throughput sequential workloads
  57. 57. Considerations Volume modifications must be valid for the target volume type Example: 400 GiB gp2 → 500 GiB st1 (meets the 500 GiB minimum capacity for st1)
  58. 58. Considerations Volume modifications must be valid for the target volume type Example: io1, 100 GiB at 1,000 IOPS → io1, 100 GiB at 10,000 IOPS?
  59. 59. EBS volume types: I/O Provisioned Provisioned IOPS SSD io1 Baseline: 100 to 20,000 IOPS Throughput: 320 MiB/s Latency: Single-digit ms Capacity: 4 GiB to 16 TiB Max ratio: 50:1 IOPS to size Ideal for critical applications and databases with sustained IOPS
  60. 60. Considerations Volume modifications must be valid for the target volume type Example: io1, 100 GiB at 1,000 IOPS → io1, 200 GiB at 10,000 IOPS (within the 50:1 IOPS-to-size ratio)
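The examples on these slides amount to checking a requested modification against the limits of the target volume type. The sketch below is a simplified illustration, not the service's actual validation logic: it only encodes the capacity ranges and io1 limits quoted in this deck, and it assumes that size can only grow because the features slide lists "Increase volume size" only. All names are hypothetical.

```python
MAX_SIZE_GIB = 16 * 1024  # 16 TiB upper bound quoted for every type
MIN_SIZE_GIB = {"gp2": 1, "io1": 4, "st1": 500, "sc1": 500}
IO1_MIN_IOPS = 100
IO1_MAX_IOPS = 20_000
IO1_MAX_IOPS_PER_GIB = 50


def modification_errors(current_size_gib, target_type, target_size_gib, target_iops=None):
    """Return a list of reasons the modification looks invalid (empty if none)."""
    if target_type not in MIN_SIZE_GIB:
        return ["unknown volume type: " + target_type]
    errors = []
    if target_size_gib < current_size_gib:
        errors.append("volume size cannot decrease")  # assumption, see lead-in
    if target_size_gib < MIN_SIZE_GIB[target_type]:
        errors.append("below minimum capacity for " + target_type)
    if target_size_gib > MAX_SIZE_GIB:
        errors.append("above 16 TiB maximum capacity")
    if target_type == "io1":
        if target_iops is None:
            errors.append("io1 needs a provisioned IOPS value")
        else:
            if not IO1_MIN_IOPS <= target_iops <= IO1_MAX_IOPS:
                errors.append("provisioned IOPS outside 100 to 20,000")
            if target_iops > IO1_MAX_IOPS_PER_GIB * target_size_gib:
                errors.append("exceeds 50:1 IOPS-to-size ratio")
    return errors


print(modification_errors(400, "st1", 400))
print(modification_errors(400, "st1", 500))
print(modification_errors(100, "io1", 100, 10_000))
print(modification_errors(100, "io1", 200, 10_000))
```

Under these rules, the 400 GiB st1 target and the 100 GiB io1 volume at 10,000 IOPS are both rejected, while the 500 GiB st1 and 200 GiB io1 targets pass.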
  61. 61. Elastic Volumes - automation ideas Amazon CloudWatchElastic Volumes
  62. 62. Elastic Volumes - automation ideas Right-sizing with Amazon CloudWatch: • Use a CloudWatch alarm to watch for a volume that is running at or near its IOPS limit. • Initiate workflow and approval process to provision additional IOPS • Publish a “free space” metric to CloudWatch and use a similar approval process to resize the volume and the filesystem.
  63. 63. Elastic Volumes - automation ideas Cost Reduction with Amazon CloudWatch: • Use metrics or schedules to reduce IOPS or to change the type of a volume https://github.com/awslabs/aws-elastic-volumes
  64. 64. EBS encryption
  65. 65. What is EBS encryption? Encryption
  66. 66. What is EBS encryption? • Attach both encrypted and unencrypted volumes • No volume performance impact • Any current generation instance • Supported by all EBS volume types • Snapshots also encrypted • No extra cost • Boot and data volumes can be encrypted
  67. 67. EBS snapshots
  68. 68. What is an EBS snapshot? [Diagram: an EBS volume and its replica in an Availability Zone; the EBS snapshot is stored in Amazon S3 within the AWS Region]
  69. 69. How does an EBS snapshot work? • Point-in-time backup of modified blocks • Stored in S3, accessed via EBS APIs • Subsequent snapshots are incremental • Deleting a snapshot will only remove data exclusive to that snapshot EBS volume EBS snapshot
  70. 70. Creating an EBS snapshot [ec2-user@summit ~]$ aws ec2 create-snapshot --volume-id vol-0f8e54c9346aca182 --description "Summit Snapshot" { "Description": "Summit Snapshot", "Encrypted": true, "VolumeId": "vol-0f8e54c9346aca182", "State": "pending", "VolumeSize": 8, "Progress": "", "StartTime": "2017-08-14T04:48:50.000Z", "SnapshotId": "snap-039cc569ea4cb40f6", "OwnerId": "51797716418770" } [ec2-user@summit ~]$
  71. 71. What can you do with a snapshot? [Diagram: EBS volume → EBS snapshot → AMI → EC2 instance, within an Availability Zone and AWS Region]
  72. 72. What can you do with a snapshot? [Diagram: an EBS snapshot stored in Amazon S3 is used to create an EBS volume, with its replica, in another Availability Zone of the same AWS Region]
  73. 73. What can you do with a snapshot? [Diagram: an EBS snapshot stored in Amazon S3 is copied to another AWS Region, where it is used to create an EBS volume with its replica]
  74. 74. EBS tagging
  75. 75. EBS tagging - tag on creation
  76. 76. EBS tagging - enforced tag usage • Enforce the use of specific tags with IAM • Block the deletion of tags • Control access to resources, ownership, and cost allocation
  77. 77. EBS tagging - resource-level permissions • CreateTags and DeleteTags • RunInstances and CreateVolume • Control users and groups that can tag resources on creation
  78. 78. EBS tagging - enforced volume encryption • RunInstances and CreateVolume • Mandate use of encryption for any volume created https://aws.amazon.com/blogs/aws/new-tag-ec2-instances-ebs-volumes-on-creation/
  79. 79. EBS use case: Hadoop (big data) Use Case • EBS GP2 (SSD), ST1 and SC1 Hard Disk Drive (HDD) volumes • Cloudera Hadoop to analyze 15TB of data / day so their customers can run advertising campaigns Value Proposition • Flexibility: Right size instance - CPU and Memory, not storage. With EBS, choose any instance • Cost: More capacity and less processing at a lower price per request • Elastic: increase capacity, tune performance, and change volume types live (no downtime) • Data Persistence: Detach and reattach volumes to resize your cluster • Operational Efficiency: Easy to expand, no more babysitting cluster Videology Example • Chose M4.10xlarge + ST1 HDD Throughput Optimized EBS “We were able to save $15,000 per month, increase our available storage by 5%, and turn off eight server nodes by moving to Amazon EBS ST1 volumes to support our Hadoop cluster”
  80. 80. EBS use case: Cassandra (NoSQL) Use Case • Elasticity on demand at a cost-effective endpoint to analyze petabytes of network traffic • Process, store, and analyze billions of network packets (metadata) and petabytes of raw packet data Value Proposition • Flexibility: Right size instance - CPU and Memory, not storage. With EBS, choose any instance • Cost: Less expensive than on-prem, saved 30% costs by leveraging multi-tiered storage system with EBS • Performance: Search petabytes of stored data with 1-3 second response time • Data Persistence: Detach and reattach volumes to resize your cluster, minutes to move data that used to take days or weeks Protectwise Example • Chose I2 EC2 (hot data) + GP2, ST1 EBS (warm data), and S3 (cold data) “Using AWS, we have reduced our storage costs by 95%, meaning we are spending $1 for every $20 we would have spent on a traditional system.”
  81. 81. EBS use case: hybrid volumes [Diagram: SQL Server database on i2; transaction logs, full backups, and partial backups on st1 volumes; EBS snapshots] “We’ve seen much stronger performance for our database backup workloads with the Amazon EBS ST1 volumes, and we’re also saving 75 percent on our monthly backup costs.” ~ Randy Young, Director of Cloud Operations, Infor Case Study: https://aws.amazon.com/solutions/case-studies/infor-ebs/
  82. 82. EBS deep dive
  83. 83. EBS deep dive Performance
  84. 84. How do we count I/Os for GP2 and IO1? When possible, we logically merge sequential I/Os (up to 256 KiB in size) ...To minimize I/O charges on IO1 and maximize burst on GP2
  85. 85. How do we count I/Os for GP2 and IO1? Example 1: Random I/Os • 4 random I/Os (i.e., non sequential I/Os) • Each I/O 64 KiB Up to 256 KiB EC2 instance EBS Counted as 4 I/Os
  86. 86. How do we count I/Os for GP2 and IO1? Example 2: Sequential I/O • 4 sequential I/Os • Each I/O 64 KiB Up to 256 KiB EC2 instance EBS Counted as 1 I/O
  87. 87. How do we count I/Os for GP2 and IO1? Example 3: Large I/O • 1 I/O • 1024 KiB Up to 256 KiB EC2 instance EBS Counted as 4 I/Os
  88. 88. How do we count I/Os for ST1 and SC1? • When possible, we merge sequential I/Os (up to 1 MiB in size) • Workloads with primarily large, sequential I/Os perform best on ST1 and SC1 • Ex: Big data / EMR, Hadoop, Kafka, log processing, data warehouses
  89. 89. How do we count I/Os for ST1 and SC1? Example 1: Random I/Os • 4 random I/Os • Each I/O 64 KiB Up to 1024 KiB EC2 instance EBS Counted as 4 I/Os or 4 MiB/s of burst
  90. 90. How do we count I/Os for ST1 and SC1? Example 2: Sequential I/O • 4 sequential I/Os • Each I/O 1024 KiB Up to 1024 KiB EC2 instance EBS Counted as 4 I/Os or 4 MiB/s of burst
  91. 91. How do we count I/Os for ST1 and SC1? Example 3: Mixed I/O • 2 × 512 KiB sequential I/Os • 2 × 64 KiB random I/Os • 2 × 128 KiB sequential I/Os Up to 1024 KiB EC2 instance EBS Counted as 4 I/Os or 4 MiB/s of burst (but only ~1.4 MiB of data transferred)
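The counting rules in these examples can be modeled with a small function. This is a simplified sketch of the behavior the slides describe, not the actual EBS accounting: a request that starts where the previous one ended is merged until the merged size would exceed the limit (256 KiB for GP2 and IO1, 1,024 KiB for ST1 and SC1), and a request larger than the limit is split. The function name and the offsets are illustrative.

```python
import math


def count_ios(requests, max_kib):
    """Count billed I/Os for a list of (offset_kib, size_kib) requests.

    Simplification: nothing is merged into the tail of a split request.
    """
    count = 0
    run_end = None  # offset where the current merged I/O ends
    run_size = 0    # size of the current merged I/O
    for offset, size in requests:
        if offset == run_end and run_size + size <= max_kib:
            run_size += size  # sequential and still fits: merge
        else:
            count += math.ceil(size / max_kib)  # new I/O, split if oversized
            run_size = min(size, max_kib)
        run_end = offset + size
    return count


random_64 = [(0, 64), (1000, 64), (5000, 64), (9000, 64)]
sequential_64 = [(0, 64), (64, 64), (128, 64), (192, 64)]
mixed = [(0, 512), (512, 512), (5000, 64), (9000, 64), (20000, 128), (20128, 128)]

print(count_ios(random_64, 256))
print(count_ios(sequential_64, 256))
print(count_ios([(0, 1024)], 256))
print(count_ios(mixed, 1024))
```

The mixed example transfers 1,408 KiB in total, which is the "~1.4 MiB" on the slide, yet it is counted as four 1 MiB I/Os against the ST1 or SC1 burst bucket.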
  92. 92. Burst balance for ST1 and SC1 - sequential [Chart: BurstBalance % vs. time in hours (0 to 10), 4 TiB ST1 volume] 1 MiB sequential: 500 MB/s for 3 hours
  93. 93. Burst balance for ST1 and SC1 - random [Chart: BurstBalance % vs. time in hours (0 to 10), 4 TiB ST1 volume] 16 KiB random: 8 MB/s for 3 hours
  94. 94. Burst balance for ST1 and SC1 Data Transfer Comparison: Sequential vs. Random [Chart: data transferred in GiB, 4 TiB ST1 volume] • 1 MiB sequential: 5.4 TiB (5,400 GiB) transferred • 16 KiB random: 87 GiB transferred
  95. 95. Verify workload I/O patterns • iostat for Linux • perfmon for Windows $ iostat -xm Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util xvdf 0.00 0.20 0.00 523.40 0.00 523.00 2046.44 3.99 7.62 1.61 100.00 2046 sectors x 512 bytes/sector = ~1024 KiB
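The arithmetic on this slide converts the avgrq-sz column, which is given in 512-byte sectors, into an average request size. A minimal sketch (the helper name is made up):

```python
SECTOR_BYTES = 512


def sectors_to_kib(sectors):
    """Convert a size in 512-byte sectors to KiB."""
    return sectors * SECTOR_BYTES / 1024


print(sectors_to_kib(2046), "KiB average request size")
```

The same conversion applies to read-ahead values given in sectors, for example 256 sectors is 128 KiB.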
  96. 96. Verify ST1 & SC1 workloads (Amazon CloudWatch console): 128 KiB
  97. 97. Verify ST1 & SC1 workloads (Amazon CloudWatch console): Under 64 KiB? Small or random IOPS likely
  98. 98. Verify ST1 & SC1 workloads (Amazon CloudWatch console): Stuck around 44 KiB?
  99. 99. ST1 & SC1: Linux performance tuning Increase maximum request size • Recommended for ST1 and SC1 on a 4.2+ Linux kernel • Memory allocated per device • Default is 32, max for EC2 is 256 • OS boot command line configuration For example with GRUB’s /boot/grub/menu.lst configuration: kernel /boot/vmlinuz-4.4.5-15.26.amzn1.x86_64 root=LABEL=/ console=ttyS0 xen_blkfront.max=256 https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html
  100. 100. ST1 & SC1: Linux performance tuning Increase read-ahead buffer: • Recommended for high-throughput read workloads • Per device configuration • Default is 128 KiB (256 sectors) for Amazon Linux • Smaller or random I/O will degrade performance For example: $ sudo blockdev --setra 2048 /dev/xvdf https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html
  101. 101. IOPS vs. throughput Example: io1 volume, 20,000 PIOPS, by I/O request size • 16 KiB: 20,000 IOPS = 320 MB/s • 32 KiB: 10,000 IOPS = 320 MB/s • 64 KiB: 10,000 IOPS = 640 MB/s (above the 320 MiB/s io1 throughput limit) • 256 KiB: 1,250 IOPS = 320 MB/s
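The relationship on this slide is that a volume delivers whichever is lower: its provisioned IOPS, or the IOPS that fit under its throughput limit at a given request size. The sketch below assumes the slide's round numbers, treating 1 MB as 1,000 KiB; the function name is hypothetical.

```python
def achievable_iops(provisioned_iops, throughput_limit_mb_s, io_size_kib):
    """IOPS a volume can sustain at one request size.

    Uses 1 MB = 1,000 KiB to match the slide's round numbers (an assumption).
    """
    throughput_bound = throughput_limit_mb_s * 1000 // io_size_kib
    return min(provisioned_iops, throughput_bound)


for size_kib in (16, 32, 64, 256):
    print(size_kib, "KiB:", achievable_iops(20_000, 320, size_kib), "IOPS")
```

Under this model, a 64 KiB request size on the example volume is bounded at 5,000 IOPS by the 320 MB/s limit.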
  102. 102. Performance: EBS-optimized • c4.large: 500 Mbps dedicated to EBS (~62.5 MB/s, 4,000 16K IOPS) • c4.2xlarge: 1 Gbps dedicated to EBS (~125 MB/s, 8,000 16K IOPS) • Attached volume in both cases: 2 TiB GP2, 6,000 IOPS, 160 MiB/s max throughput
  103. 103. Performance: throughput workload • m4.16xlarge: 10 Gbps (~1,250 MB/s) shared among Internet, database, and S3 traffic • EBS-optimized: 10 Gbps (~1,250 MB/s, 80,000 16K IOPS) dedicated to EBS • 8 TiB ST1: 320 MB/s base, 500 MB/s burst
  104. 104. Performance: throughput workload • m4.16xlarge: 10 Gbps (~1,250 MB/s) shared among Internet, database, and S3 traffic • EBS-optimized: 10 Gbps (~1,250 MB/s, 65,000 16K IOPS) dedicated to EBS • 3 x 8 TiB ST1: 960 MB/s base, 1,250 MB/s burst
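The point of these performance slides is that the usable limit is the smaller of what the instance's dedicated EBS bandwidth allows and what the attached volumes can deliver together. A minimal sketch of that comparison, with a hypothetical function name and the slides' MB/s and MiB/s figures treated as interchangeable for simplicity:

```python
def effective_limit(instance_limit, volume_limits):
    """Smaller of the instance's EBS limit and the combined volume limits.

    Works for throughput or IOPS, as long as both sides use the same unit.
    """
    return min(instance_limit, sum(volume_limits))


# c4.large with one 2 TiB gp2 volume: the instance is the bottleneck.
print(effective_limit(62.5, [160]))
# c4.2xlarge with the same volume, in IOPS: the volume is the bottleneck.
print(effective_limit(8_000, [6_000]))
# m4.16xlarge with three 8 TiB st1 volumes at burst.
print(effective_limit(1_250, [500, 500, 500]))
```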
  105. 105. Best practice: RAID When to RAID? • Storage requirement > 16 TiB • Throughput requirement > 500 MB/s • IOPS requirement > 20,000 @ 16K EBS EBS EBS
  106. 106. Best practice: RAID Avoid RAID for redundancy • EBS data is already replicated
 • RAID1 halves available EBS bandwidth
 • RAID5/6 loses 20–30% of usable I/O to parity EBS EBS EBS
  107. 107. Best practice: RAID Availability Zone AWS Region EC2 instance EBS EBS RAID 0
  108. 108. Best practice: RAID Availability Zone AWS Region EC2 instance Replica ReplicaEBS EBS RAID 10
  109. 109. EBS deep dive Performance Reliability
  110. 110. What about EC2 instance failure? AWS Region EC2 instance Availability Zone EBS Volume
  111. 111. What about EC2 instance failure? AWS Region EBS Volume New EC2 instance Availability Zone
  112. 112. EBS enables EC2 auto recovery • Amazon CloudWatch per-instance metric alarm: StatusCheckFailed_System • When alarm triggers? RECOVER instance • Instance retains: instance ID, instance metadata, private IP addresses, Elastic IP addresses, EBS volume attachments * Supported on C3, C4, M3, M4, P2, R3, R4, T2, and X1 instance types with EBS-only storage https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-recover.html
  113. 113. What about EC2 instance termination? Availability Zone EC2 instance AWS Region EBS volume Replica DeleteOnTermination = False
  114. 114. What about EC2 instance termination? Availability Zone EC2 instance AWS Region EBS volume Replica DeleteOnTermination = True
  115. 115. Best practice: taking snapshots from Linux Quiesce I/O 1. Database: FLUSH and LOCK tables 2. File system: sync and fsfreeze 3. EBS: snapshot all volumes 4. When CreateSnapshot API returns success, it is safe to resume
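Steps 2 to 4 above can be sketched as a single function. This is an illustration under several assumptions: `ec2` is a boto3 EC2 client (or any object with a compatible `create_snapshot` method), `run` is a callable that executes a command list such as `subprocess.check_call`, and the process has the privileges `fsfreeze` needs. The database step (FLUSH and LOCK tables) is engine-specific and is left out. The function name is hypothetical.

```python
def snapshot_with_frozen_filesystem(ec2, run, mount_point, volume_ids, description):
    """Quiesce one file system, snapshot its volumes, then resume I/O.

    Returns the snapshot IDs reported by create_snapshot.
    """
    snapshot_ids = []
    run(["sync"])                                  # flush dirty pages to disk
    run(["fsfreeze", "--freeze", mount_point])     # block new writes
    try:
        for volume_id in volume_ids:
            response = ec2.create_snapshot(VolumeId=volume_id, Description=description)
            snapshot_ids.append(response["SnapshotId"])
    finally:
        # Always unfreeze, even if a snapshot call fails.
        run(["fsfreeze", "--unfreeze", mount_point])
    return snapshot_ids


# Possible usage (not executed here):
#   import boto3, subprocess
#   snapshot_with_frozen_filesystem(
#       boto3.client("ec2"), subprocess.check_call,
#       "/data", ["vol-0f8e54c9346aca182"], "Summit Snapshot")
```

The file system is unfrozen as soon as the snapshot calls return, which follows step 4: once CreateSnapshot reports success, it is safe to resume.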
  116. 116. Best practice: taking snapshots from Windows 1. sync equivalent available 2. Use Volume Shadow Copy Service-(VSS) aware utilities for backups 3. EBS: backups on dedicated volume for snapshots
  117. 117. Best practice: taking snapshots from Windows [Diagram: a Windows EC2 instance with an EBS boot volume, an EBS data volume, and an EBS backup volume; Windows Server Backup writes to the backup volume, which is captured as an EBS snapshot]
  118. 118. EBS volume initialization • New EBS volume? Attach and it’s ready to go • New EBS volume from snapshot? Initialize for best performance with a random read across the volume https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-initialize.html
  119. 119. Best practice: EBS volume initialization Fio-based example: $ sudo yum install -y fio $ sudo fio --filename=/dev/xvdf --rw=randread --bs=128k --iodepth=32 --ioengine=libaio --direct=1 --name=volume-initialize https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-initialize.html
  120. 120. Best practice: automate snapshots Key ingredients: • AWS Lambda • Amazon EC2 Run command • Tagging https://aws.amazon.com/ec2/run-command/
  121. 121. Best practice: automate snapshots Tagging: EC2 instances tagged Backup, Retention 30 days
  122. 122. Best practice: automate snapshots Lambda scheduled event: daily snapshots (EC2 instances tagged Backup, Retention 30 days) 1. Search for instances tagged “Backup” 2. EC2 Run commands to quiesce file system 3. Snapshot attached volumes 4. Tag snapshots with expire date
  123. 123. Best practice: automate snapshot expiration Lambda scheduled event: daily expire (EBS snapshots tagged Backup, ExpireOn Date) 1. Search for snapshots tagged to “Expire On” today 2. Delete expired snapshots
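The expiration workflow can be sketched as two small helpers: one computes the expire-date tag value at snapshot time, the other deletes snapshots whose tag matches today. This is a rough illustration, not the prototype referenced in this deck. It assumes `ec2` is a boto3 EC2 client (or a compatible object), that the tag key is `ExpireOn`, and that its value is an ISO date string; that format is an assumption. Result pagination is ignored for brevity.

```python
from datetime import date, timedelta

EXPIRE_TAG = "ExpireOn"  # tag key from the slides; the value format is assumed


def expire_on_value(snapshot_day, retention_days=30):
    """Tag value (YYYY-MM-DD) for a snapshot taken on snapshot_day."""
    return (snapshot_day + timedelta(days=retention_days)).isoformat()


def delete_expired_snapshots(ec2, today):
    """Delete this account's snapshots tagged to expire today; return their IDs."""
    response = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "tag:" + EXPIRE_TAG, "Values": [today.isoformat()]}],
    )
    deleted = []
    for snapshot in response["Snapshots"]:
        ec2.delete_snapshot(SnapshotId=snapshot["SnapshotId"])
        deleted.append(snapshot["SnapshotId"])
    return deleted


print(expire_on_value(date(2017, 8, 14)))
```

A scheduled Lambda function could call `delete_expired_snapshots(boto3.client("ec2"), date.today())` once a day. Matching only today's date means a missed run leaves older snapshots behind, so a production version would likely compare dates instead.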
  124. 124. Prototype for EBS Snapshot Custodian (ExpireOn Date): https://github.com/dR0ski/lambda-ebs-snapshot-custodian
  125. 125. EBS best practices Performance Reliability Security
  126. 126. Best practice: encryption EBS encryption: data volumes
  127. 127. Best practice: encryption Create a new AWS KMS master key for EBS • Define key rotation policy • Enable AWS CloudTrail auditing • Control who can use key • Control who can administer key
  128. 128. Best practice: encryption EBS encryption: data volumes
  129. 129. Summary Use encryption if you need it Take snapshots, tag snapshots Select the right instance for your workload Select the right volume for your workload
  130. 130. Thank You
