Horizontal scaling in the Cloud is the way to adapt resources to the load on your systems. The Cloud allows users to scale virtually indefinitely, or at least enough for their needs.
This way the number of servers follows the trend of requests, and the TCO (Total Cost of Ownership) of the IT infrastructure can be reduced. Moreover, companies can avoid dealing with capacity planning and pre-provisioning issues.
This talk will show how to use Python and the Rackspace/OpenStack API and SDK to implement an event-based scaling solution (software released under the open-source Apache License: stay tuned).
PyCon Russia 2014 - Auto Scale in the Cloud
1. An introduction
Scale in the Cloud
Created by: Simone Soldateschi
Modified Date: 2014-06-02
Classification: Public Conference
2. RACKSPACE® HOSTING | WWW.RACKSPACE.COM
Who am I?
Simone Soldateschi
• Java, C/C++, PHP, Python developer
• More than 8 years experience as SysAdm/SysEng
• Developer Support Engineer at Rackspace
• Task automation enthusiast
• MTB’ing, triathlon, photo, manga
@soldasimo
simonesoldateschi
3. Who are Rackspace?
Founded in 1998 in San Antonio, TX by three guys who wanted to create a hosting company
Home of Fanatical Support /o/
Second biggest Public Cloud in the world
OpenStack Project co-founder
4. Rackspace Vision
“To be recognized as one of the world’s great service companies.”
5. Roadmap
• Python SDK, Cloud
• Auto Scaling
• Management System
• Control law
• Garçon, all together now!
16. simone.soldateschi@rackspace.co.uk
New Usage Models – CLOUDSMART
Dedicated Servers are Pets
• Great thought given to their acquisition
• Name them and know each one
• Willing to pay big money for their care
Cloud Servers are Livestock
• Use them as long as they provide value
• Acquire more of them when needed
• Dispose of any that aren’t needed
• Get rid of them if they become ill
56. Desired State
Write code to tell the computer how to set itself up!
57. Roadmap
• Python SDK, Cloud
• Auto Scaling
• Management System
• Control law
• Garçon, all together now!
67. Closed-Loop Control Law – Garçon implementation
Garçon
68. Garçon in-depth – cm2asd
# fetch current list of servers
l_current_servers = scaling_group_servers(scaling_group_id)
69. Garçon in-depth – cm2asd
for i in range(len(l_current_servers)-1, -1, -1):
    server_id = l_current_servers[i]
    s = get_server(server_id)
    if s.status != 'ACTIVE':
        # server not active
        l_current_servers.pop(i)
        continue
    m = get_server_metadata(s.id)
    try:
        if m['aspoc.server_status'] != 'configured':
            # server not configured
            l_current_servers.pop(i)
            continue
    except KeyError:
        # server not configured (no metadata)
        l_current_servers.pop(i)
        continue
71. Garçon in-depth – cm2asd
# compute average system load for scaling group
servers_avg_load = servers_average_load(l_checks, samples, sample_time)

# compare current load against configured threshold
if servers_avg_load >= threshold_high:
    # trigger scale_up_webhook
    r = requests.post(scale_up_webhook)
    if r.status_code != 202:
        logger.error('scale_up_webhook (%s) returned HTTP %d' %
                     (scale_up_webhook, r.status_code))
72. Garçon in-depth – cm2asd
if servers_avg_load <= threshold_low:
    # trigger scale_down_webhook
    r = requests.post(scale_down_webhook)
    if r.status_code != 202:
        logger.error('scale_down_webhook (%s) returned HTTP %d' %
                     (scale_down_webhook, r.status_code))
73. Garçon in-depth – cfgmgmtd
for s_id in l_current_servers:
    ...
    # server exists?
    try:
        cs = pyrax.cloudservers
        cs.servers.get(s_id)
    except Exception:
        logging.warning('Auto Scale server (%s, %s) missing '
                        '(maybe deleted manually?)' % ('-', s_id))
        continue
    ...
    try:
        # read server metadata
        m = get_server_metadata(s_id)
        ...
74. Garçon in-depth – cfgmgmtd
Use metadata:
try:
    if (server_status != 'configured' and
            server_status != 'configuring'):
        ...
        # run thread to configure server
        threading.Thread(target=configure_server,
                         args=(s_id, ansible_timeout,)).start()
No metadata?
except KeyError:
    # CONFIGURE server (KeyError, no metadata) in thread
    threading.Thread(target=configure_server,
                     args=(s_id, ansible_timeout,)).start()
75. Garçon in-depth – cfgmgmtd, Ansible
Reset server’s password:
# set server password
password = generate_password(10, punctuation=False)
set_server_password(server_id, password)
Server’s info (e.g. IP address):
# fetch server info
ip = get_server_ipv4(server_id, MGMT_NETWORK)
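The `get_server_ipv4` helper used above is not spelled out in the slides. Here is a minimal sketch, assuming the usual novaclient/pyrax `addresses` layout; the pure extraction step is a separate function so it can be tested without API access, and `get_server_ipv4` itself is a hypothetical reconstruction:

```python
def extract_ipv4(addresses, network_label):
    """Return the first IPv4 address on the given network, or None.

    `addresses` follows the novaclient/pyrax shape:
    {"network-label": [{"version": 4, "addr": "10.0.0.5"}, ...], ...}
    """
    for entry in addresses.get(network_label, []):
        if entry.get("version") == 4:
            return entry["addr"]
    return None

def get_server_ipv4(server_id, network_label):
    """Hypothetical reconstruction of the helper in the slide above.

    Assumes pyrax has already been configured and authenticated.
    """
    import pyrax  # imported lazily; requires the SDK and credentials
    server = pyrax.cloudservers.servers.get(server_id)
    return extract_ipv4(server.addresses, network_label)
```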
77. Garçon in-depth – cfgmgmtd, Monitoring System
Create checks for the new server to be managed:
# cloud monitoring (agent_id := server_uuid)
add_cm_cpu_check(server_id)
# good, set 'aspoc.server_status=configured' in metadata
set_server_metadata(server_id, 'aspoc.server_status', 'configured')
78. RECAP
• Python SDK, Cloud
• Auto Scaling
• Management System
• Control law
• Garçon, all together now!
That’s me! I started as a developer, then worked as a Systems Engineer.
Since I moved to the UK I have strived to combine programming and SysEng skills.
I like real life too: cycling and all sorts of outdoor activities.
If you’ve seen a Rackspace presentation before, you have likely seen a statement like this about great customer service.
At Rackspace, we believe that Service is our key strategic differentiator and the reason customers will continue to trust their business to Rackspace. While technology is playing a bigger role than ever in RAX capabilities (OpenStack, Public/Private Cloud, RackConnect, etc.), we’ll continue to rely on service as our primary differentiator.
First of all
You are Python developers; should you want to use OpenStack you’ll need a Python SDK, namely pyrax.
Beware that the name may change, as it references Rackspace (i.e. the RAX suffix), but it is meant to support OpenStack too.
Let’s create a Python Virtual Environment, to keep things clean.
And install the pyrax SDK and ipython for testing purposes
We will use Python 2
See the pyrax project, docs and snippets on GitHub
Define environment variables
Use pyrax to authenticate
Define environment variables
Give it a shot -- Use pyrax to authenticate, and see what happens
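As a sketch of the two steps above (the environment-variable names OS_USERNAME and OS_API_KEY are my assumption, not something the talk prescribes), authentication with pyrax could look like this:

```python
import os

def rackspace_credentials(env=None):
    """Read the username and API key from the environment.

    The variable names OS_USERNAME / OS_API_KEY are assumptions here;
    use whatever names your deployment defines.
    """
    env = os.environ if env is None else env
    return env["OS_USERNAME"], env["OS_API_KEY"]

def authenticate():
    """Authenticate against the Rackspace identity service with pyrax.

    pyrax is imported lazily so this module can be loaded and tested
    without the SDK installed.
    """
    import pyrax
    pyrax.set_setting("identity_type", "rackspace")
    username, api_key = rackspace_credentials()
    pyrax.set_credentials(username, api_key)
    return pyrax.identity.authenticated
```

Calling `authenticate()` with valid credentials exported should return True; with bad credentials pyrax raises an authentication error.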
pyrax supports many Cloud components that you can choose from when writing software.
For our purposes we need to use just the following three Cloud components: Monitoring, Servers, Auto Scale
To fully understand how scaling infrastructure works in the Cloud, let’s discuss what Auto Scaling means.
This leads us to the next section of this presentation.
Discuss what Auto Scale is, how to scale, when and why. Then go deeper into scaling techniques.
Traditionally if you wanted a more powerful server, you would buy more RAM and add CPUs.
That approach is called vertical scaling.
Say that 40 servers can serve the highest traffic peak your infrastructure is ever going to see.
You need to provision those 40 servers by that date, if you know when it is going to happen,
or just buy 40 servers from the very beginning.
The gray area is wasted money.
You have many more servers than you need, just to be prepared for the high-traffic moment.
Chances are you are going to have a hard time explaining to someone on the finance team why you should buy and maintain 40 servers when 10 or 20 are fine most of the time.
CAPEX and OPEX are extremely high: you own every piece of hardware and have to maintain it.
Now, say that you are somehow able to make those two curves match.
This way you provision just a bit more capacity than you need, just in case, but:
CAPEX is shifted toward OPEX – you do not really own anything, you just use servers when you need them.
OPEX is minimised too – should something go wrong, destroy the faulty server and spin a new one up.
Auto Scaling is the ability to automatically or semi-automatically scale a group of servers up and down, based on computing or traffic demand, by provisioning new servers.
Does any of you name your servers?
ON & OFF
FAST GROWTH
VARIABLE
CONSISTENT
Boolean load
There are three different Autoscaling methodologies to choose from.
They can also be combined and mixed together.
The typical scenario for a web application is…
At 9 o’clock…
…on the 1st of November…
…spinning a Cloud Server up is scheduled!
For Reactive Autoscaling, let’s say that there are two servers working at 60%.
Load increases and the overall load rises to 80%.
Autoscaling adds a new server to the cluster, and the overall load decreases.
Then the overall load drops to 30% (e.g. fewer requests).
Autoscaling spins one server down.
Let’s discuss the last scaling type: Predictive Autoscaling
It is somehow possible to forecast traffic
Servers are spun up and down according to forecast
Let’s RECAP
Schedule based scaling: set time to scale up…
…then set time to scale down
On the other hand, event-based scaling:
Set thresholds; when they are hit, scale-up or scale-down policies are triggered.
The idea behind Cooldown is to set the right pace, much like the pace car in a race.
Let’s say your server requires 3 minutes to be fully provisioned, configured, deployed.
Within those three minutes there is no reason to scale up again.
Wait for the server being built to go live, then re-enable scaling up.
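The cooldown idea can be sketched as a small guard object (my own illustration, not the Auto Scale implementation); the clock is injectable so the behaviour is easy to test:

```python
import time

class Cooldown(object):
    """Suppress repeated scale actions while a new server is still building.

    `period` is the cooldown in seconds (e.g. 180 for the 3-minute
    provisioning time mentioned above).
    """

    def __init__(self, period, clock=time.time):
        self.period = period
        self.clock = clock
        self.last_fired = None

    def allow(self):
        """Return True if a scale action may fire now, and record it."""
        now = self.clock()
        if self.last_fired is None or now - self.last_fired >= self.period:
            self.last_fired = now
            return True
        return False
```

A second scale-up request arriving within the period is simply ignored; once the period has elapsed, scaling is re-enabled.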
Rings a bell?
How could we apply scaling to existing infrastructures?
Let’s view some scenarios.
Cluster of application or front-end servers
Adopting stateless servers
Scaling Boss-Worker clusters
Enough theory for now!
Let’s start discussing what tools you might want for Auto Scale in the Cloud.
How many of you use a Configuration Management System?
How many of you use Ansible?
How many of you use, or have used, Puppet/Chef/SaltStack?
Let’s see what a Configuration Management System does, and what desired state means.
How do you install Ansible on your laptop, or on a management server?
In fact, Ansible is agent-less. OK, OK, SSH is an agent ;)
Ansible streamlines managing remote servers, as there is no need for a pre-installed agent. So no golden image (which is not DevOpsy!), and no start-up script.
How would you provision a server manually?
Build server up
Attach block devices
Create filesystem
Install packages
Configure it (e.g. users, daemons, firewall policies, etc)
Now think that you can achieve the very same result with Ansible.
You just need to decide what you are aiming for.
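The desired-state idea can be shown in miniature (a sketch of the concept, not Ansible's actual mechanism): compare the state you are aiming for with the actual state, and apply only the difference.

```python
def plan(desired, actual):
    """Return only the settings that differ from the desired state.

    Running plan() against an already converged state yields an empty
    dict: re-applying is a no-op, which is the essence of idempotency.
    """
    return {key: value for key, value in desired.items()
            if actual.get(key) != value}
```

On a pristine server the whole desired state is returned; on an already configured one the plan is empty and there is nothing to do.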
What does a closed control loop look like?
This diagram shows components of a closed-loop control law.
It is called closed-loop because there is feedback, which is taken into account.
Systems load is monitored, and if it hits certain thresholds, then Auto Scale policies are triggered.
In the OpenStack universe, Otter replaces Auto Scale, and a monitoring system of the customer’s choice replaces Cloud Monitoring.
Auto Scale puts messages on the Cloud Servers message queue, and servers are spun up or down accordingly.
Ideally you would like a piece of software that is able to do the following:
read the configuration
get the list of current servers in the scaling group
fetch data and stats from the monitoring system
compute the average load across all systems
trigger scaling policies
The infrastructure scales according to the reference, which is the configuration file.
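One iteration of that loop could be sketched as follows. The collaborators are passed in as callables standing in for the Cloud Monitoring and Auto Scale calls shown later in the talk; function names and config keys are illustrative:

```python
def control_loop_iteration(config, get_servers, get_avg_load, trigger):
    """One pass of the closed-loop control law described above.

    `get_servers` lists the scaling group, `get_avg_load` queries the
    monitoring system, and `trigger` fires an Auto Scale webhook.
    """
    servers = get_servers(config["scaling_group_id"])
    if not servers:
        # nothing to measure yet
        return None
    load = get_avg_load(servers)
    if load >= config["threshold_high"]:
        trigger(config["scale_up_webhook"])
        return "up"
    if load <= config["threshold_low"]:
        trigger(config["scale_down_webhook"])
        return "down"
    return "steady"
```

Run from a daemon (or cron), each pass compares the measured load against the reference in the configuration file and triggers at most one policy.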
Putting it all together with Garçon.
Garçon is software I wrote in Python to integrate Cloud Monitoring and Auto Scale.
Garçon is the glue between Cloud Monitoring and Auto Scale.
It queries Cloud Monitoring, checks the load of scaling groups against configured thresholds, and triggers Auto Scale policies.
Garçon is composed of two daemons (which can also be run from cron), cm2asd and cfgmgmtd.
The former fetches stats from Cloud Monitoring. The latter triggers Auto Scale policies and runs the configuration management system on fresh new servers.
Just to recap: run cfgmgmtd, which runs Ansible and configures new/pristine Cloud Servers → makes them ready to go live!
Fetch the current list of servers in the scaling group.
Only ACTIVE servers within the scaling group participate in computing the overall load (e.g. average CPU load, average memory usage, length of the message queue).
AIM -- Fetch monitoring data for ACTIVE servers only.
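The `servers_average_load` function used by cm2asd is not spelled out in the slides. A plausible sketch, averaging the most recent samples per ACTIVE server and then across the group (the `sample_time` parameter is kept only to mirror the slide's signature):

```python
def servers_average_load(checks, samples, sample_time=None):
    """Average load across a scaling group.

    `checks` maps server id -> list of recent load samples (newest
    last), as fetched from the monitoring system; only the last
    `samples` readings per server are considered.
    """
    per_server = []
    for readings in checks.values():
        recent = readings[-samples:]
        if recent:
            # mean of this server's most recent samples
            per_server.append(sum(recent) / float(len(recent)))
    if not per_server:
        return 0.0
    # mean across all servers in the group
    return sum(per_server) / float(len(per_server))
```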
Get server status
Check whether the key/value pair exists in the metadata, and whether the value is 'configured'.
Scale up.
The webhook returns HTTP 202 regardless of the outcome, to prevent information leakage.
Scale down.
Again, HTTP 202 is returned regardless, to prevent information leakage.
What’s the servers’ status? Let’s cycle through them to find out…
Just ensure that every single server exists. REMEMBER, we are in the Cloud.
If the metadata tag does not indicate an already managed server,
OR there is no metadata,
THEN run Ansible against that server.
Now we are going to discuss how to configure server
Reset the server’s password, to let Ansible SSH into it.
Do you remember we said «servers are livestock»?
They are supposed to be managed programmatically. Nobody should ever SSH into a Cloud Server, especially if it is part of a scaling group.
Let the Configuration Management System do its work.
Set playbook related variables
Prepare the command statement with its options, and run it.
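Building and running that command could look like the sketch below. The `-i "host,"` trick (note the comma) is the standard way to pass an ad-hoc one-host inventory to ansible-playbook; the playbook path and extra variables are hypothetical, and the command builder is separated out so it can be tested without running Ansible:

```python
import subprocess

def build_ansible_cmd(playbook, host, extra_vars=None, timeout=None):
    """Assemble an ansible-playbook invocation for a single new server."""
    cmd = ["ansible-playbook", "-i", host + ",", playbook]
    if extra_vars:
        # pass playbook-related variables via -e key=value
        for key, value in extra_vars.items():
            cmd += ["-e", "%s=%s" % (key, value)]
    if timeout:
        # SSH connection timeout, matching ansible_timeout in the talk
        cmd += ["--timeout", str(timeout)]
    return cmd

def configure_server_with_ansible(playbook, host, **kwargs):
    """Run the playbook against the host; returns the exit code."""
    return subprocess.call(build_ansible_cmd(playbook, host, **kwargs))
```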
Create and attach new checks to new servers, so that the Monitoring System is aware of the new servers to monitor.
State that you are finished managing the server, i.e. TAG the server as CONFIGURED.
Configuration Management Systems are idempotent, meaning you can run the same playbook/recipe/manifest against an already configured server over and over again.
For performance, just skip servers that are already configured.