AWS Auto Scaling: Optimize Cost and Performance

© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Capacity Management Made Easy with Amazon EC2 Auto Scaling
CMP377
Anoop Kapoor, Senior Product Manager, Auto Scaling, AWS
Vadim Filanovsky, Performance and Reliability Engineer, Netflix

Auto Scaling overview
• Amazon EC2 Auto Scaling
• AWS Auto Scaling
• AWS Application Auto Scaling

What is Amazon EC2 Auto Scaling?
Amazon EC2
Auto Scaling

Auto Scaling group introduction
• Logical group of instances for your service
• Minimum and maximum bound for the number of instances that can be in the Auto Scaling group
• Launch or terminate instances to meet the desired capacity
[Diagram labels: Min, Desired, Max]
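As a rough illustration of how the minimum, maximum, and desired values relate, here is a minimal sketch in Python. It is a toy model, not the service's implementation: the function names `clamp_desired` and `reconcile` are made up for this example, and it assumes an out-of-range desired value is simply clamped into the bounds.

```python
def clamp_desired(desired: int, min_size: int, max_size: int) -> int:
    """Keep the desired capacity inside the group's [min, max] bounds."""
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(min_size, min(desired, max_size))


def reconcile(current: int, desired: int) -> str:
    """Describe the action needed to move from current to desired capacity."""
    delta = desired - current
    if delta > 0:
        return f"launch {delta}"
    if delta < 0:
        return f"terminate {-delta}"
    return "no change"


# Hypothetical group: min 1, max 10, currently running 4 instances.
target = clamp_desired(12, min_size=1, max_size=10)
print(target, reconcile(4, target))
```

The group launches or terminates only the difference between what is running and the bounded desired value.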

Launch template determines what will launch
• Amazon EC2 instance type
• Amazon Machine Image (AMI)
• Security groups, SSH keys, AWS Identity and Access Management (IAM) instance profile
• User data
…
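The launch template fields listed on this slide can be pictured as a request payload. The sketch below only builds a dictionary and does not call AWS. The key names follow the parameters of the EC2 `CreateLaunchTemplate` API as used through boto3; the helper name `launch_template_request` and every value (template name, AMI ID, security group, key pair, instance profile) are placeholders assumed for illustration.

```python
import base64

# Placeholder bootstrap script; user data is sent base64-encoded.
USER_DATA = "#!/bin/bash\nyum update -y\n"


def launch_template_request(name, image_id, instance_type,
                            security_group_ids, key_name,
                            instance_profile_name, user_data):
    """Build the keyword arguments for a create-launch-template call."""
    encoded = base64.b64encode(user_data.encode("utf-8")).decode("ascii")
    return {
        "LaunchTemplateName": name,
        "LaunchTemplateData": {
            "ImageId": image_id,                      # Amazon Machine Image
            "InstanceType": instance_type,            # EC2 instance type
            "SecurityGroupIds": list(security_group_ids),
            "KeyName": key_name,                      # SSH key pair
            "IamInstanceProfile": {"Name": instance_profile_name},
            "UserData": encoded,
        },
    }


# All identifiers below are hypothetical.
request = launch_template_request(
    name="web-template",
    image_id="ami-0123456789abcdef0",
    instance_type="m4.large",
    security_group_ids=["sg-0123456789abcdef0"],
    key_name="example-key",
    instance_profile_name="example-instance-profile",
    user_data=USER_DATA,
)
print(sorted(request["LaunchTemplateData"]))
```

With boto3 installed and credentials configured, a dictionary shaped like this could be passed as `ec2_client.create_launch_template(**request)`.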

Automate provisioning of instances

Fully automated bootstrapping
Use Amazon Machine Image (AMI) with
all required configuration & software
(“golden image”)

Base Amazon Machine Image (AMI) + install code and configuration as needed
• User data in launch template
• AWS CodeDeploy
• AWS Systems Manager
• Configuration tools

Sample user data
#!/bin/bash
# Install updates
sudo yum update -y;
# Install Ruby, which the AWS CodeDeploy agent installer requires
sudo yum install -y ruby;
# Download and install the AWS CodeDeploy agent, then start it
cd /home/ec2-user;
curl https://aws-codedeploy-us-east-1.s3.amazonaws.com/latest/install -o install &&
chmod +x ./install &&
sudo ./install auto && sudo service codedeploy-agent start;

Perform additional actions with lifecycle hooks
[Diagram: Add an instance → Pending → InService → Terminating → Terminated; Terminating is entered through "Remove an instance" or "Health check failed"]
• Assign Amazon Elastic Compute Cloud (Amazon EC2) IP address or ENI on launch
• Register new instances with DNS, external monitoring systems, firewalls …
• Load existing state from Amazon Simple Storage Service (Amazon S3) or other system
• Pull down log files before instance is terminated
• Investigate issues with an instance before terminating it
• Persist instance state to external system
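The instance states on this slide can be sketched as a small state sequence. The wait and proceed state names below are the ones EC2 Auto Scaling uses when a lifecycle hook is attached; the function `lifecycle_path` itself is a hypothetical helper written for this illustration, and it ignores details such as heartbeat timeouts.

```python
def lifecycle_path(launch_hook: bool, terminate_hook: bool) -> list:
    """Return the states an instance passes through from launch to termination."""
    path = ["Pending"]
    if launch_hook:
        # The instance pauses here so custom actions can run before InService.
        path += ["Pending:Wait", "Pending:Proceed"]
    path.append("InService")
    path.append("Terminating")
    if terminate_hook:
        # The instance pauses here so logs or state can be saved first.
        path += ["Terminating:Wait", "Terminating:Proceed"]
    path.append("Terminated")
    return path


print(" -> ".join(lifecycle_path(launch_hook=True, terminate_hook=True)))
```

Launch-time actions (registering with DNS, loading state) belong in the `Pending:Wait` pause; termination-time actions (pulling logs, persisting state) belong in `Terminating:Wait`.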

Receive event notifications
[Diagram: Add an instance → Pending → InService → Terminating → Terminated; Terminating is entered through "Remove an instance" or "Health check failed"]
• Notifications get sent after a state transition
• Rely on notifications to react to changes that happened
• Available via Amazon Simple Notification Service (Amazon SNS) and Amazon CloudWatch Events
Notification events:
• Amazon EC2 instance launch successful
• Amazon EC2 instance launch unsuccessful
• Amazon EC2 instance terminate successful

Register instances behind load balancer
Full integration with Elastic Load Balancing allows
you to automatically register instances behind Application
Load Balancer, Network Load Balancer, and
Classic Load Balancer

Reduce paging frequency

Replace unhealthy instances
Amazon EC2 health checks
Instance state != ‘running’ or
System health check == ‘impaired’
Elastic Load Balancing health checks
ELB health == ‘OutOfService’
Includes Amazon EC2 health check
Custom health checks
Manually mark instances as ‘unhealthy’
Integrate with external monitoring systems
Elastic Load Balancing
Auto Scaling group
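The three kinds of health check on this slide can be combined into a single predicate. This is a minimal sketch, assuming the status strings shown on the slide plus `"Unhealthy"` for a manually set custom status; the function name `is_unhealthy` and its parameters are hypothetical.

```python
from typing import Optional


def is_unhealthy(instance_state: str,
                 system_status: str,
                 elb_state: Optional[str] = None,
                 custom_status: Optional[str] = None) -> bool:
    """Return True if any configured health check marks the instance for replacement."""
    # Amazon EC2 health check: not running, or system status impaired.
    ec2_failed = instance_state != "running" or system_status == "impaired"
    # Elastic Load Balancing health check (only when a load balancer reports a state).
    elb_failed = elb_state == "OutOfService"
    # Custom health check: an external monitor marked the instance unhealthy.
    custom_failed = custom_status == "Unhealthy"
    return ec2_failed or elb_failed or custom_failed


print(is_unhealthy("running", "ok", elb_state="OutOfService"))
```

Because the Elastic Load Balancing check includes the Amazon EC2 check, the predicate evaluates the EC2 conditions regardless of which other checks are supplied.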

Balance capacity across Availability Zones
[Diagram: Availability Zone 1, Availability Zone 2]

Re-target capacity to alternative Availability Zones

Re-balance capacity across Availability Zones
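Balancing and re-targeting across Availability Zones can be sketched as an even split of the desired capacity. This is an illustrative model only: the function names `spread` and `retarget` and the zone labels are assumptions, and the real service applies additional rules that are not modeled here.

```python
def spread(desired: int, zones: list) -> dict:
    """Split desired capacity as evenly as possible across the given zones."""
    if not zones:
        raise ValueError("at least one zone is required")
    base, extra = divmod(desired, len(zones))
    # The first `extra` zones each take one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


def retarget(desired: int, zones: list, unavailable: set) -> dict:
    """Re-target capacity to the zones that are still usable."""
    usable = [zone for zone in zones if zone not in unavailable]
    return spread(desired, usable)


zones = ["az-1", "az-2", "az-3"]           # hypothetical zone labels
print(spread(7, zones))                    # balanced placement
print(retarget(6, zones, {"az-2"}))        # az-2 cannot provide capacity
print(spread(6, zones))                    # re-balanced once az-2 recovers
```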

Save up to 90% using Amazon EC2 Auto Scaling
Automatically scale instances across instance families
and purchase models in a single Auto Scaling group
Lowest cost
Specify what percentage of your group capacity should be fulfilled
by On-Demand Instances and what percentage by Spot Instances to optimize cost
Prioritized list
Use a prioritized list of On-Demand Instance types to
scale capacity during an urgent, unpredictable event to
optimize performance
Amazon EC2
Auto Scaling
Reduce cost Optimize performance Eliminate operational overhead
On-Demand Instances
Spot Instances
Reserved Instances
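The percentage split between On-Demand and Spot capacity can be worked through with a short calculation. This sketch assumes a model with an On-Demand base plus a percentage above that base, similar in spirit to the group's mixed instances settings; the function name `split_capacity`, its parameters, and the choice to round the On-Demand share up are assumptions for illustration.

```python
def split_capacity(desired: int, on_demand_base: int, on_demand_pct_above_base: int):
    """Return (on_demand, spot) instance counts for a desired capacity."""
    if not 0 <= on_demand_pct_above_base <= 100:
        raise ValueError("percentage must be between 0 and 100")
    base = min(on_demand_base, desired)
    above = desired - base
    # Ceiling division: round the On-Demand share up (assumed rounding rule).
    on_demand_above = -(-above * on_demand_pct_above_base // 100)
    on_demand = base + on_demand_above
    return on_demand, desired - on_demand


# Hypothetical group: 10 instances, 2 always On-Demand, 50% of the rest On-Demand.
print(split_capacity(10, on_demand_base=2, on_demand_pct_above_base=50))
```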

Before: Multiple Auto Scaling groups to use Spot, On-Demand, and Reserved Instances together
The old way with three Auto Scaling groups, one for each instance type/purchase option
• m4.large Spot ASG Min: 1 Max: 10
• c4.xlarge O-D ASG Min: 1 Max: 10
[Diagram: groups span Availability Zone 1, Availability Zone 2, and Availability Zone 3]

After: Include Spot, On-Demand, and Reserved Instances in a single Auto Scaling group
The new way combines purchase options, instance types, and AZs in a single Auto Scaling group
• c4.xlarge O-D ASG Min: 1 Max: 10
[Diagram: the group spans Availability Zone 1, Availability Zone 2, and Availability Zone 3]

Related chalk talks
Tuesday, November 27
Optimize Compute Cost and Performance with Amazon EC2 Auto Scaling and Amazon EC2 Fleet
5:30–6:30 p.m. | Aria East, Level 2, Mariposa 3, T1
Thursday, November 29
Optimize Compute Cost and Performance with Amazon EC2 Auto Scaling and Amazon EC2 Fleet
(Repeat)
2:30–3:30 p.m. | MGM, Level 1, South Concourse 104, T1

Scales my infrastructure up and down to save costs

• Manual scaling
• Scheduled scaling
• Dynamic scaling
• Predictive scaling (New!)
[Diagram labels: Min, Desired, Max]

Manual scaling

Scheduled scaling
Recurring scaling events
Schedule individual events
Auto Scaling group

Dynamic scaling with target tracking
[Figure: traffic flows through Elastic Load Balancing to Amazon EC2 instances. Chart over time of traffic, instance count (axis 5 to 30), CPU utilization (axis 0% to 100%), and target utilization]
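The idea behind target tracking, keeping a metric near a target by resizing the group, can be approximated with a proportional rule. This is a simplified sketch and not the service's actual algorithm; the function name `target_tracking_capacity` and the sample numbers are assumptions.

```python
import math


def target_tracking_capacity(current: int, metric: float, target: float,
                             min_size: int, max_size: int) -> int:
    """Estimate the capacity that would bring the average metric back to target."""
    if current <= 0 or target <= 0:
        raise ValueError("current capacity and target must be positive")
    # If 10 instances run at 75% CPU, the same work at a 50% target needs 15.
    proposed = math.ceil(current * metric / target)
    return max(min_size, min(proposed, max_size))


# Hypothetical group of 10 instances with a 50% CPU target, bounds 1 to 30.
print(target_tracking_capacity(10, metric=75.0, target=50.0, min_size=1, max_size=30))
print(target_tracking_capacity(10, metric=40.0, target=50.0, min_size=1, max_size=30))
```

Rounding up errs on the side of extra capacity, and the result stays within the group's minimum and maximum.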

Dynamic scaling with step scaling

Predictive scaling in Amazon EC2 Auto Scaling
Machine learning technology behind the scenes
• Machine learning model: billions of data points from Amazon.com
• Forecasts the load metric for the next two days based on the pre-trained model
• Performs regression analysis between load metric and scaling metric
• Schedules scaling actions for the next two days, hourly
• Repeats every day
[Figure: three Load/Capacity over Time panels (capacity provisioning on-premises; capacity provisioning with dynamic scaling; capacity provisioning with predictive scaling and dynamic scaling), each plotting provisioned capacity, actual capacity, and demand]
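To make the regression step more concrete, here is a deliberately simple sketch: it fits a line through the origin relating per-instance load to the scaling metric, then converts an hourly load forecast into instance counts. This is not the model the service uses. The function names, the through-origin fit, and all sample numbers are assumptions chosen for illustration.

```python
import math


def fit_slope(load_per_instance: list, utilization: list) -> float:
    """Least-squares slope through the origin: utilization ~ slope * load per instance."""
    numerator = sum(x * y for x, y in zip(load_per_instance, utilization))
    denominator = sum(x * x for x in load_per_instance)
    if denominator == 0:
        raise ValueError("load history must contain non-zero values")
    return numerator / denominator


def hourly_schedule(forecast_load: list, slope: float,
                    target_utilization: float, min_size: int) -> list:
    """Turn an hourly total-load forecast into scheduled capacities."""
    plan = []
    for load in forecast_load:
        # At the target utilization, instances = load * slope / target.
        needed = math.ceil(round(load * slope / target_utilization, 9))
        plan.append(max(min_size, needed))
    return plan


# Hypothetical history: requests per instance vs. average CPU percent.
slope = fit_slope([100, 200, 300], [20, 40, 60])
# Hypothetical forecast of total requests for three upcoming hours.
print(hourly_schedule([10100, 4000, 100], slope, target_utilization=50.0, min_size=2))
```

In this picture, dynamic scaling would still run alongside the schedule to correct for forecast error.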

Netflix scale

Feature spotlight – Provisioning
The foundation
• Continuous delivery
• Immutable infrastructure

Feature spotlight – Replacing unhealthy instances

Feature spotlight – Lifecycle hooks

The need for dynamic scaling
Encoding optimization: 790 Kbps → 383 Kbps

Encoding jobs run during off-peak
[Figure: regular streaming usage and encoding capacity over the day, with 3 am and 7 pm marked]

Other benefits of dynamic scaling
• Recommendations
• Red-black deploys
• Regional failover

The feedback loop of dynamic scaling
• Threshold
• Eval periods
• Metric
• Scaling amount
• Warmup time
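The properties listed above fit together roughly as follows: a metric is compared with a threshold over a number of evaluation periods, and a breach adds the scaling amount. The sketch below is a simplified model with hypothetical function names and numbers; it leaves out warmup time, which would delay counting new instances toward the metric.

```python
def alarm_breached(datapoints: list, threshold: float, eval_periods: int) -> bool:
    """True if the last `eval_periods` datapoints all exceed the threshold."""
    if eval_periods < 1 or len(datapoints) < eval_periods:
        return False
    return all(value > threshold for value in datapoints[-eval_periods:])


def step_scale(size: int, datapoints: list, threshold: float,
               eval_periods: int, scaling_amount: int, max_size: int) -> int:
    """Add the scaling amount when the alarm is breached, capped at max_size."""
    if alarm_breached(datapoints, threshold, eval_periods):
        return min(size + scaling_amount, max_size)
    return size


# Hypothetical per-instance throughput samples, one per period.
samples = [80, 95, 104, 108, 112]
print(step_scale(20, samples, threshold=100, eval_periods=3,
                 scaling_amount=4, max_size=30))
```

More evaluation periods make the policy slower to react but less sensitive to a single noisy datapoint.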

Properties of dynamic scaling
Desired
Min
Max

Setting up dynamic scaling

What metric?

What metric, explained
Throughput per instance
• Example: How much work I did
• RequestCountPerTarget
• Pros: Direct measure of work; intuitive
• Cons: Drifts over time
VS.
Resource util. per instance
• Example: How tired I am
• avg. CPUUtilization
• Pros: Requires less adjustment
• Cons: More oscillation/jitter

Auto Scaling on multiple metrics
• Harder to reason about scaling behavior
• Different metrics might contradict each other,
causing oscillation
Typical Netflix setup
• Scaling on throughput + emergency scale-up on CPU
(aka “the hammer rule”)
OR
• Scaling on CPU

What is my target?
Squeeze testing
• curl, ab, siege, nghttp, JMeter, Gatling…

Squeezing with live prod traffic
[Diagram labels: Client Auto Scaling groups; Proxy Auto Scaling group; Server Auto Scaling group (normal traffic flow); Squeeze Auto Scaling group (clone, controlled throughput)]

Understanding failures

Traffic patterns
[Figure: Weekday vs. weekend traffic, Friday and Saturday, 1:30 to 7:30, axis 2k to 18k]
[Figure: Mixing regular and batch traffic, Service A, Service B, Service C, and Batch Service, 10:00 to 12:30, axis 6k to 21k]

What could go wrong?
[Figure: Auto Scaling group throughput and size, ASG size (0 to 400) and ASG throughput (0k to 40k)]
[Figure: Per-instance throughput (70 to 110) with scale-up and scale-down thresholds]

“No rush” scaling
Problem: Scaling amounts too small, cooldown too long
Effect: Scaling lags behind the traffic flow. Not enough capacity at peak, capacity wasted in trough.
Remedy: Increase scaling amounts … or migrate to target tracking!
[Figure: ASG size (0 to 400) and ASG throughput (15k to 75k); per-instance throughput (45 to 170) with scale-up and scale-down thresholds]
Twitchy scaling
Problem: Scale-up policy is too aggressive
Effect: Unnecessary capacity churn
Remedy: Reduce scale-up amount, increase the number of eval periods … or migrate to target tracking!
[Figure: ASG size (0 to 100) and ASG throughput (0k to 6k); per-instance throughput (40 to 100) with scale-up and scale-down thresholds]

Should I stay or should I go?
Problem: Scale-up and scale-down thresholds are too close to each other
Effect: Constant capacity oscillation
Remedy: Move scale-up and scale-down thresholds farther apart … or migrate to target tracking!
[Figure: ASG size (0 to 300) and ASG throughput (5k to 35k); per-instance throughput (80 to 110) with scale-up and scale-down thresholds]
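Why thresholds that sit too close together cause oscillation can be shown with a little arithmetic. When a scale-up fires at the scale-up threshold, the same total traffic is spread over more instances, so per-instance throughput drops. If it drops below the scale-down threshold, the group immediately scales back down. The sketch below assumes constant total traffic; the function names and numbers are hypothetical.

```python
def throughput_after_scale_up(up_threshold: float, size: int, added: int) -> float:
    """Per-instance throughput right after adding instances at the scale-up threshold."""
    total = up_threshold * size          # total traffic when the alarm fires
    return total / (size + added)


def will_oscillate(up_threshold: float, down_threshold: float,
                   size: int, added: int) -> bool:
    """True if a scale-up pushes per-instance throughput below the scale-down threshold."""
    return throughput_after_scale_up(up_threshold, size, added) < down_threshold


# 10 instances, scale up by 2 at 100 requests per instance.
print(will_oscillate(100, 90, size=10, added=2))   # thresholds close together
print(will_oscillate(100, 70, size=10, added=2))   # thresholds farther apart
```

In the first case the post-scale throughput is about 83 per instance, already under the scale-down threshold of 90, so the group flaps; moving the scale-down threshold to 70 leaves a stable band.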

Target tracking
[Slide graphic, partially recovered text: Cooling; 74; rapidly; more; slowly; less; (set forget]

Why target tracking?



25 250,000
there are features for everyone

Thank you!
Anoop Kapoor
anoopkap@amazon.com
