Automating EC2 Management with Auto Scaling

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Michael Hanisch, AWS Solutions Architecture
3/23/2017
Automating Management of Amazon EC2 Instances with Auto Scaling

[Diagram: two uses of Auto Scaling. Fleet Management: an Auto Scaling group of EC2 instances behind ELB. Dynamic Scaling: an Auto Scaling group of EC2 instances behind ELB, scaled on CPU utilization.]

Myth: My Application Doesn’t Need Scaling, So I Don’t Benefit From Auto Scaling
Fact: It Monitors and Heals Instances
Myth: It’s Hard To Use
Fact: You Can Get Started in Minutes
Myth: My Instances are Stateful or Unique; I can’t use Auto Scaling
Fact: It Works Well with Stateful Instances

Is Fleet Management For You?
“I’ve got instances serving a
business-impacting application”
“If my instances become unhealthy,
I’d like them replaced automatically”
“I would like my instances
distributed to maximize resilience”

[Diagram: an Auto Scaling group behind Elastic Load Balancing, with two instances across Availability Zone a and Availability Zone b. Minimum # = 2, Maximum # = 2, Desired # of instances = 2.]

Auto Scaling Groups
- Always keep minimum number of instances running
- Launch or terminate instances to meet desired capacity
- Never start more than maximum number of instances
- Keeps capacity balanced across AZs
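As a rough illustration of "balanced across AZs", the sketch below splits a desired capacity as evenly as possible over a list of Availability Zones. The function name and the rule that earlier zones receive the remainder are assumptions made for this example; the service's actual placement logic is not described in these slides.

```python
def spread_across_azs(desired, azs):
    """Split `desired` instances as evenly as possible over `azs`."""
    base, extra = divmod(desired, len(azs))
    # The first `extra` zones each take one additional instance.
    return {az: base + (1 if i < extra else 0) for i, az in enumerate(azs)}


print(spread_across_azs(2, ["a", "b"]))       # {'a': 1, 'b': 1}
print(spread_across_azs(5, ["a", "b", "c"]))  # {'a': 2, 'b': 2, 'c': 1}
```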

Launch Configurations
Determine what is going to be launched:
- EC2 instance type & size
- Amazon Machine Image (AMI)
- Security groups, SSH keys, IAM instance profile
- User data
…
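The settings listed above map onto parameters of the Auto Scaling CreateLaunchConfiguration API. The sketch below only assembles those parameters; the helper name and every value (AMI ID, security group, key name, instance profile) are placeholders invented for illustration.

```python
def launch_configuration_params(name, image_id, instance_type,
                                security_groups, key_name,
                                instance_profile, user_data):
    """Collect launch settings under the parameter names used by the
    Auto Scaling CreateLaunchConfiguration API."""
    return {
        "LaunchConfigurationName": name,
        "ImageId": image_id,                    # Amazon Machine Image (AMI)
        "InstanceType": instance_type,          # instance type & size
        "SecurityGroups": list(security_groups),
        "KeyName": key_name,                    # SSH key pair
        "IamInstanceProfile": instance_profile,
        "UserData": user_data,                  # bootstrap script
    }


# All values below are hypothetical placeholders.
params = launch_configuration_params(
    name="demo-lc",
    image_id="ami-12345678",
    instance_type="t2.micro",
    security_groups=["sg-12345678"],
    key_name="demo-key",
    instance_profile="demo-instance-profile",
    user_data="#!/bin/bash\nyum update -y\n",
)
# With boto3 installed and credentials configured, this could be passed as:
#   boto3.client("autoscaling").create_launch_configuration(**params)
```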

Bootstrapping
Installation & setup needs to be fully automated:
- Use Amazon Machine Image (AMI) with all required configuration & software (“golden image”)
- Base AMI + install code & configuration as needed
  - Via Userdata + scripts
  - Via Chef/Puppet/Ansible/…
  - Using AWS CodeDeploy
  - Using Amazon EC2 Systems Manager
…

Bootstrapping
#!/bin/bash
# Install updates
sudo yum update -y;
# Install Amazon EC2 Systems Manager Agent
cd /tmp;
curl https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm \
  -o amazon-ssm-agent.rpm && yum install -y amazon-ssm-agent.rpm;

Bootstrapping
#!/bin/bash
# Install updates
sudo yum update -y;
# Install AWS CodeDeploy agent
cd /home/ec2-user;
curl https://aws-codedeploy-us-east-1.s3.amazonaws.com/latest/install -o install &&
chmod +x ./install &&
sudo ./install auto && sudo service codedeploy-agent start;

Monitoring
Auto Scaling gives you access to new metrics in
Amazon CloudWatch:
- group-level metrics like number of running instances
- aggregate metrics like average CPU utilization for all
instances in the group

Auto Scaling Concepts
Launch Configuration
• Auto Scaling groups use a launch configuration to launch EC2 instances.
• Provides information about the AMI and EC2 instance types/size
Scaling Plan
• A scaling plan tells Auto Scaling when and how to scale.
• Create a scaling plan based on the occurrence of specified conditions (dynamic scaling) or create a plan based on a specific schedule.
Auto Scaling Groups
• EC2 instances are managed by Auto Scaling groups.
• Create Auto Scaling groups by defining the minimum, maximum, and, optionally, the desired number of running EC2 instances.

Termination Policies
Determine which instances are terminated first:
- Longest running
- Oldest launch configuration
- Closest to full billing hour
But: rebalancing of capacity across AZs takes precedence!
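One way to picture the precedence of AZ balance over the termination policy is the simplified model below: it first narrows the choice to the Availability Zone holding the most instances, then applies a "longest running" rule inside that zone. The `Instance` type, the `launch_order` field, and the alphabetical tie-break are assumptions for illustration, not the service's implementation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    az: str
    launch_order: int  # hypothetical field: lower means launched earlier


def pick_instance_to_terminate(instances):
    """Pick from the fullest AZ first, then the longest-running instance."""
    per_az = Counter(i.az for i in instances)
    # Ties between zones are broken alphabetically to keep the sketch deterministic.
    fullest_az = max(sorted(per_az), key=lambda az: per_az[az])
    candidates = [i for i in instances if i.az == fullest_az]
    return min(candidates, key=lambda i: i.launch_order)


fleet = [
    Instance("i-1", "a", 1),  # oldest overall, but alone in its zone
    Instance("i-2", "b", 2),
    Instance("i-3", "b", 3),
]
victim = pick_instance_to_terminate(fleet)
print(victim.instance_id)  # i-2
```

In this example the oldest instance overall (i-1) survives, because removing it would leave zone a empty while zone b keeps two instances.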

Scaling Plans
Determine when the Auto Scaling group will scale in or out:
desired capacity > current capacity: launch instances
desired capacity < current capacity: terminate instances
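The comparison above can be written as a small function. This is a minimal sketch: the function name is made up, and clamping the desired capacity to the group's minimum and maximum is a simplification added here to reflect the group boundaries.

```python
def scaling_action(current, desired, minimum, maximum):
    """Return ("launch" | "terminate" | "none", count) for a group."""
    # Keep the target inside the group's boundaries.
    target = max(minimum, min(desired, maximum))
    if target > current:
        return ("launch", target - current)
    if target < current:
        return ("terminate", current - target)
    return ("none", 0)


print(scaling_action(current=2, desired=4, minimum=2, maximum=6))  # ('launch', 2)
print(scaling_action(current=5, desired=3, minimum=2, maximum=6))  # ('terminate', 2)
```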

Scaling Plans
- Default: ensure current capacity of healthy instances
remains within boundaries (never less than minimum)
- ‘Manual scaling’: modify desired capacity (via API,
console, CLI) to trigger a scaling event
- Scheduled: scale in / out based on timed events
- Dynamic scaling: scale based on CloudWatch metrics

Auto Scaling Groups
- Always keep minimum number of instances running
- Launch or terminate instances to meet desired capacity
- Never start more than maximum number of instances
- Keeps capacity balanced across AZs
- Replace unhealthy instances

Health Checks
- Performed periodically
- Instances are marked as “Unhealthy” when checks fail
- Unhealthy instances are terminated and replaced
(if new number of instances < minimum or < desired capacity)

Different Kinds of Health Checks
- EC2 instance status:
  Instance is unhealthy when instance state != ‘running’ or system health check == ‘impaired’
- ELB health checks:
  Instance is unhealthy when ELB health check results in “OutOfService” (or EC2 health check failed)
- Manual: mark individual instances as ‘unhealthy’
  Instance is unhealthy when marked as such or EC2 health check failed. Use to integrate with external monitoring systems.
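The three rules above can be combined into one predicate. This is a sketch of the logic as stated on the slide; the function and argument names are illustrative and do not correspond to an AWS API.

```python
def is_unhealthy(instance_state, system_status, check_type="EC2",
                 elb_state=None, marked_unhealthy=False):
    """Apply the health check rules listed on the slide."""
    # EC2 status check: applies to every health check type.
    ec2_failed = instance_state != "running" or system_status == "impaired"
    if marked_unhealthy:
        # Manual: e.g. set by an external monitoring system.
        return True
    if check_type == "ELB":
        return ec2_failed or elb_state == "OutOfService"
    return ec2_failed


print(is_unhealthy("running", "ok"))                                       # False
print(is_unhealthy("running", "ok", check_type="ELB",
                   elb_state="OutOfService"))                              # True
```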

[Diagram: an Auto Scaling group with six instances behind Elastic Load Balancing. Minimum = 2, Maximum = 6.]

Unhealthy Instances Get Replaced…
[Diagram: Auto Scaling group of instances behind Elastic Load Balancing.]

…In a Different AZ if Necessary
[Diagram: Auto Scaling group of instances behind Elastic Load Balancing.]

Rebalancing Capacity
[Diagram: Auto Scaling group of instances behind Elastic Load Balancing.]

What happens when an instance is terminated?

Instance Lifecycle
[Diagram: Scale Out Event -> Instance launching: Pending -> InService -> Terminating -> Terminated. A Scale In Event or a failed health check moves an InService instance to Terminating.]

Instance Lifecycle
[Diagram: Add an Instance -> Instance launching: Pending -> InService. Remove an Instance or a failed health check takes an instance out of InService.]

Instance Lifecycle
[Diagram: Instance launching: Pending -> InService -> Entering Standby… -> Standby]

How can we influence the instance lifecycle?

Why? – Common Use Cases
• Assign Elastic IP address or ENI on launch
• Register new instances with DNS, external monitoring
systems, firewalls, load balancers, …
• Load existing state from S3 or other system
• Pull down log files before instance is terminated
• Investigate issues with an instance before terminating it
• Persist instance state to external system
• …

Lifecycle Hooks & Notifications

Instance Lifecycle Notifications
[Diagram: the instance lifecycle (Add an Instance -> Instance launching: Pending -> InService; Remove an Instance or Health check failed) annotated with the notifications “EC2 Instance Launch Successful”, “EC2 Instance Launch Unsuccessful”, and “EC2 Instance Terminate Successful”.]

Instance Lifecycle Notifications
• Notifications get sent after a state transition.
• Rely on notifications to react to changes that happened.
• Available via Amazon Simple Notification Service and
Amazon CloudWatch Events.
• Prefer CloudWatch Events due to ease of use and
extended feature set!

Sample Notification
Service: Auto Scaling
Time: 2017-03-23T21:53:43.989Z
RequestId: 52e21eba-718a-43a7-81a8-3b379054cba6
LifecycleActionToken: 979c0f97-80c5-44bd-a2b6-5a8aae339f35
AccountId: XXXXXXXXX
AutoScalingGroupName: demo-asg
LifecycleHookName: do-something-on-launch
EC2InstanceId: i-XXXXX
LifecycleTransition: autoscaling:EC2_INSTANCE_LAUNCHING
NotificationMetadata: null

Instance Lifecycle Hooks
• When Lifecycle Hooks are defined, instances enter
special “WAIT” states during state transitions.
• Allows you to react to lifecycle events & impact the state
• WAIT states bring their own notifications, too.

Instance Lifecycle Hooks
[Diagram: with hooks defined, Instance launching: Pending -> Pending:WAIT -> InService, and Terminating:WAIT on termination. Hook(s) are invoked during each WAIT state.]
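The effect of a hook on the state transitions can be modeled as a tiny state machine. This sketch uses the slide's state labels and leaves out any intermediate states the service may report; the function name and flags are assumptions for illustration.

```python
# Where each transition ends up once nothing is holding it back.
FINAL_STATE = {"Pending": "InService", "Terminating": "Terminated"}


def next_state(state, has_hook, hook_completed=False):
    """Advance one step through the lifecycle, pausing in WAIT for hooks."""
    if state in FINAL_STATE:
        # A defined hook diverts the instance into a WAIT state.
        return f"{state}:WAIT" if has_hook else FINAL_STATE[state]
    if state.endswith(":WAIT"):
        if not hook_completed:
            return state  # stay in WAIT until the hook is completed
        return FINAL_STATE[state.split(":")[0]]
    return state


print(next_state("Pending", has_hook=False))                          # InService
print(next_state("Pending", has_hook=True))                           # Pending:WAIT
print(next_state("Pending:WAIT", has_hook=True, hook_completed=True))  # InService
```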

Lifecycle Hooks
- Executed before taking a new instance into service / terminating it
- Put instances into a WAIT state while work can happen
Auto Scaling Notifications
- Notifications get sent after an instance has entered “InService” or “Terminated” state, respectively
- Cannot influence or stop a transition

Demo, part 2: set up a CloudWatch Events rule that invokes Amazon EC2 Systems Manager Run Command directly to save web server log files.
(This is a common example, but we recommend using a proper logging solution, e.g. CloudWatch Logs, instead.)


[Diagram: an Auto Scaling group with two instances, Auto Scaling, CloudWatch Events, a Lambda function, and Amazon EC2 Systems Manager]
1. Event fires, triggers Rule
2. Rule invokes Lambda
3. Lambda asks EC2 SSM to RunCommand
4. EC2 SSM invokes RunCommand on terminating instance
5. Command uploads logs to S3
6. Complete Hook
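Steps 2 to 4 could be sketched as a Lambda handler along the following lines. The document name `demo-backup-logs` is hypothetical, and the optional `ssm` argument is an addition for this example so the handler can be exercised with a stand-in client; the event fields follow the lifecycle notification shown earlier.

```python
DOCUMENT_NAME = "demo-backup-logs"  # hypothetical SSM document name


def handler(event, context, ssm=None):
    """Start a Run Command document on the instance named in the event."""
    if ssm is None:
        import boto3  # imported lazily so the sketch loads without boto3
        ssm = boto3.client("ssm")
    detail = event["detail"]
    response = ssm.send_command(
        InstanceIds=[detail["EC2InstanceId"]],
        DocumentName=DOCUMENT_NAME,
    )
    return response["Command"]["CommandId"]


class FakeSSM:
    """Stand-in client that records the request instead of calling AWS."""

    def send_command(self, **kwargs):
        self.request = kwargs
        return {"Command": {"CommandId": "cmd-1"}}


fake = FakeSSM()
sample_event = {"detail": {"EC2InstanceId": "i-0abc",
                           "AutoScalingGroupName": "demo-asg"}}
command_id = handler(sample_event, None, ssm=fake)
```

In this flow the command itself completes the hook (step 6), so the handler only needs to start it.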

How Do I Write a Lifecycle Hook?
1. Code the lifecycle hook’s action
2. Create new Rule in CloudWatch Events
3. Associate the lifecycle hook with the Auto Scaling group

Writing a Lifecycle Hook
1. Extract instanceID, auto scaling group, other params.
2. Do stuff…
• Beware of timeouts!
• Send “heartbeats” if you need more time
3. Call CompleteLifecycleAction to signal that you’re done!
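The three steps might look like the sketch below. It assumes `autoscaling` is a boto3 Auto Scaling client (or a stand-in with the same two methods) and that the work is supplied as a list of callables. Sending a heartbeat after every step and abandoning on any exception are simplifications chosen for this example.

```python
def run_hook(detail, work_steps, autoscaling):
    """Extract parameters, do the work with heartbeats, then complete."""
    # 1. Extract the identifiers of this lifecycle action.
    ids = dict(
        LifecycleHookName=detail["LifecycleHookName"],
        AutoScalingGroupName=detail["AutoScalingGroupName"],
        LifecycleActionToken=detail["LifecycleActionToken"],
    )
    result = "CONTINUE"
    try:
        # 2. Do the work, extending the timeout as we go.
        for step in work_steps:
            step(detail["EC2InstanceId"])
            autoscaling.record_lifecycle_action_heartbeat(**ids)
    except Exception:
        result = "ABANDON"
    # 3. Signal that the hook is done, whatever the outcome.
    autoscaling.complete_lifecycle_action(LifecycleActionResult=result, **ids)
    return result


class FakeAutoScaling:
    """Stand-in client that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def record_lifecycle_action_heartbeat(self, **kwargs):
        self.calls.append("heartbeat")

    def complete_lifecycle_action(self, **kwargs):
        self.calls.append(("complete", kwargs["LifecycleActionResult"]))


detail = {
    "LifecycleHookName": "do-something-on-launch",
    "AutoScalingGroupName": "demo-asg",
    "LifecycleActionToken": "979c0f97-80c5-44bd-a2b6-5a8aae339f35",
    "EC2InstanceId": "i-0abc",
}
client = FakeAutoScaling()
outcome = run_hook(detail, [lambda instance_id: None], client)
```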

1. Code the lifecycle hook’s action
• AWS Lambda function
• Amazon EC2 Systems Manager RunCommand
• Any code that consumes Kinesis Streams / SQS / SNS

{ "schemaVersion": "1.2",
"description": "Backup logs to S3", "parameters": {},
"runtimeConfig": {
"aws:runShellScript": {
"properties": [ {
"id": "0.aws:runShellScript",
"runCommand": [ "",
"ASGNAME='demo-asg'",
"LIFECYCLEHOOKNAME='demo-asg-backup-hook'",
"INSTANCEID=$(curl http://169.254.169.254/latest/meta-data/instance-id)",
"REGION=$(curl http://169.254.169.254/latest/meta-data/placement/availability-zone)",
"REGION=${REGION::-1}",
"HOOKRESULT='CONTINUE'",
[…]

[…]
aws s3 cp /tmp/${INSTANCEID}.tar s3://${S3BUCKET}/${INSTANCEID}/ &> /tmp/backup",
" MESSAGE=$(cat /tmp/backup)",
"fi",
"",
"aws autoscaling complete-lifecycle-action --lifecycle-hook-name ${LIFECYCLEHOOKNAME} --auto-scaling-group-name ${ASGNAME} --lifecycle-action-result ${HOOKRESULT} --instance-id ${INSTANCEID} --region ${REGION}"
]
} }
}
}

2. Create new Rule in CloudWatch Events
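A rule needs an event pattern that selects the lifecycle events of one group. The sketch below builds such a pattern as JSON; the helper name is invented, and `demo-asg` is the group name used elsewhere in this deck.

```python
import json


def lifecycle_event_pattern(group_name, terminating=True):
    """Build a CloudWatch Events pattern for one group's lifecycle actions."""
    detail_type = ("EC2 Instance-terminate Lifecycle Action" if terminating
                   else "EC2 Instance-launch Lifecycle Action")
    return {
        "source": ["aws.autoscaling"],
        "detail-type": [detail_type],
        "detail": {"AutoScalingGroupName": [group_name]},
    }


pattern_json = json.dumps(lifecycle_event_pattern("demo-asg"))
# With boto3 installed and credentials configured, the pattern could be used as:
#   boto3.client("events").put_rule(Name="demo-rule", EventPattern=pattern_json)
# where "demo-rule" is a hypothetical rule name.
```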

3. Associate the lifecycle hook with the Auto Scaling group
aws autoscaling put-lifecycle-hook \
  --auto-scaling-group-name demo-asg \
  --lifecycle-hook-name demo-hook-terminate \
  --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING

Dealing with Stateful Applications

Dealing With Stateful Applications
- While “InService”:
  - Persist state to EBS volume on a regular basis
  - Tag with InstanceId, application name
- On “Instance-terminate Lifecycle Action” event:
  - Detach EBS volume with state information
  - Remove InstanceId tag, keep application name tag
- On “Instance-launch Lifecycle Action” event:
  - Find & attach EBS volume tagged w/ application name
  - Tag with InstanceId, resume application
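The "find" step on launch could be sketched as a filter over volume records. The tag keys `application` and `InstanceId` and the simplified record shape (tags as a plain dict) are assumptions for illustration. The Availability Zone check is included because an EBS volume can only be attached to an instance in the same zone.

```python
def find_state_volume(volumes, app_name, az):
    """Return the ID of a free volume holding state for `app_name`, if any."""
    for vol in volumes:
        tags = vol.get("Tags", {})
        if (vol["State"] == "available"            # detached, ready to attach
                and vol["AvailabilityZone"] == az  # same zone as the instance
                and tags.get("application") == app_name
                and "InstanceId" not in tags):     # not claimed by an instance
            return vol["VolumeId"]
    return None


volumes = [
    {"VolumeId": "vol-1", "State": "in-use", "AvailabilityZone": "us-east-1a",
     "Tags": {"application": "demo-app", "InstanceId": "i-0abc"}},
    {"VolumeId": "vol-2", "State": "available", "AvailabilityZone": "us-east-1a",
     "Tags": {"application": "demo-app"}},
]
print(find_state_volume(volumes, "demo-app", "us-east-1a"))  # vol-2
```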

Fact
It Monitors and Heals Instances
• Direct Integration with CloudWatch
• Instance Replacement
• AZ Rebalancing
You Can Get Started in Minutes
• Options for Easy Bootstrapping
• Start off with Existing Instances
It Works Well with Stateful Instances
• Lots of Control via Lifecycle Hooks
• Keep Track with Notifications

Questions?
https://aws.amazon.com/blogs/compute/fleet-management-made-easy-with-auto-scaling/
http://docs.aws.amazon.com/autoscaling/latest/userguide/WhatIsAutoScaling.html
https://aws.amazon.com/autoscaling/getting-started/
