What are you going to do if 60,000 jobs arrive in the blink of an eye? In the machine learning world, it is normal to have to process a huge load of jobs that come in all at once. We are going to walk you through our journey of scaling out a Kubernetes cluster to handle them: the tools we used, the load testing, how to measure it, and our solution.
2. Journey of Kubernetes Scaling
#whoami
● Setthasarun Prasanpun (Beer)
● Former PHP developer
● DevOps Engineer @ Opsta
3. #whoami
● Jirayut Nimsaeng (Dear)
● Interested in Cloud and Open Source
● Agile practitioner with DevOps driven
● CEO and Founder of Opsta
4. Agenda
● What are Docker and Kubernetes?
● Batch Processing
● Solution to scale Batch Processing
● Optimization
● Benchmark
● Future
8. Kubernetes Automatic Bin Packing
[Diagram: kube-scheduler packs Service A and Service B containers across Node 1, Node 2, and Node 3]
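The bin packing shown above is driven by resource requests: kube-scheduler places each pod on a node that still has enough unreserved CPU and memory. A minimal sketch of such a request (the name, image, and values here are illustrative, not from the talk):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-a            # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: service-a
  template:
    metadata:
      labels:
        app: service-a
    spec:
      containers:
      - name: service-a
        image: example/service-a:latest   # placeholder image
        resources:
          requests:          # what the scheduler bin-packs on
            cpu: 250m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
```

Nodes are filled based on these requests, not on actual usage, which is why setting them accurately matters for packing density.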
9. Journey of Kubernetes Scaling
● Self-healing
● Service discovery & load balancing
● Automated rollouts and rollbacks
● Secret and configuration management
● Storage orchestration
● Batch execution
● Horizontal manual/auto-scaling
Some more features on Kubernetes
10. Batch Processing
[Diagram: multiple users submit jobs to a queue; workers consume jobs from the queue and produce results]
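The queue/worker pattern above can be sketched as a simple consume loop. This is a generic illustration with an in-memory queue standing in for the real message queue; the function names are hypothetical, not from the speakers' code:

```python
import queue
import threading

def process(job):
    # Placeholder for the real ML job; here we just square a number.
    return job * job

def worker(jobs: "queue.Queue", results: list) -> None:
    """Consume jobs until the queue is drained."""
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return
        results.append(process(job))
        jobs.task_done()

# Users enqueue jobs; several workers consume them concurrently.
jobs = queue.Queue()
for i in range(10):
    jobs.put(i)

results = []
threads = [threading.Thread(target=worker, args=(jobs, results)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # every job processed exactly once
```

Each job is consumed by exactly one worker, so adding workers is what lets the backlog drain faster.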
11. Challenge
[Diagram: users submit jobs through an API, which pushes them to a queue; workers consume jobs and write results to a DB]
12. First Design on AWS
[Diagram: users call the API, which pushes jobs to SQS; workers consume from SQS and write results to a DB]
13. Problem
[Diagram: the same design flooded by a burst of users all at once]
60,000 QUEUES!!!
14. Solution with Elastic Beanstalk
[Diagram: the API pushes jobs to SQS; an Elastic Beanstalk container runs an Auto Scaling instance group where each EC2 instance runs an sqsd daemon and one worker]
Scaling condition is set by CPU utilization
15. Problems
- CPU utilization is not a good metric for the autoscaling condition
- 1 EC2 instance contains only 1 worker container
- EC2 instance specs do not fit the worker requirements, wasting resources
- Very slow to scale up; Auto Scaling isn't really intended for bursting
16. Kubernetes Solution
[Diagram: users call the API, which pushes jobs to SQS; workers running on Kubernetes consume from SQS and write results to a DB]
17. Solution with Kubernetes
[Diagram: a Kubernetes cluster of three nodes (Node1, Node2, Node3) runs worker pods that consume from SQS]
18. Scale Pod with Kubernetes
[Diagram: the same three-node cluster with the worker pods scaled out to six, consuming from SQS]
20. What needs to be done
● Change the code not to depend on sqsd
● Build a Kubernetes cluster on AWS
● Find a solution to automatically scale pods and nodes
21. Scale Pods with kube-sqs-autoscaler
● https://github.com/Wattpad/kube-sqs-autoscaler
● Pod autoscaler based on queue size in AWS SQS
● Periodically retrieves the number of messages in SQS and scales pods accordingly, with configuration such as:
○ --scale-down-cool-down=30s
  --scale-up-cool-down=5m
  --scale-up-messages=100
  --scale-down-messages=10
  --max-pods=5
  --min-pods=1
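The decision logic these flags configure can be sketched roughly as follows. This is a simplified illustration of threshold-based scaling, not kube-sqs-autoscaler's actual code:

```python
def desired_replicas(queue_len: int, current: int,
                     scale_up_messages: int = 100,
                     scale_down_messages: int = 10,
                     min_pods: int = 1, max_pods: int = 5) -> int:
    """Threshold-based scaling: one pod up or down per evaluation,
    clamped between min_pods and max_pods."""
    if queue_len >= scale_up_messages:
        return min(current + 1, max_pods)
    if queue_len <= scale_down_messages:
        return max(current - 1, min_pods)
    return current

print(desired_replicas(250, current=2))  # backlog high -> 3
print(desired_replicas(5, current=3))    # backlog low  -> 2
print(desired_replicas(50, current=2))   # in between   -> 2
```

Note that each evaluation cycle moves the replica count by at most one, no matter how large the backlog is.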
25. Scale Nodes with OpenAI kubernetes-ec2-autoscaler
● https://github.com/openai/kubernetes-ec2-autoscaler
● Works with an AWS Auto Scaling group to scale instances up and down
● Scales nodes up by checking whether pods are in Pending status and no node with free capacity is left
● Scales nodes down by checking for idle CPU
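The scale-up rule described above (pending pods plus no spare capacity) can be sketched as a small predicate. This is an illustration of the idea, not the OpenAI autoscaler's code:

```python
def should_add_node(pending_pods: int, free_cpu_per_node: list) -> bool:
    """Add a node only when pods are stuck in Pending AND
    no existing node has free capacity left."""
    no_free_capacity = all(cpu <= 0 for cpu in free_cpu_per_node)
    return pending_pods > 0 and no_free_capacity

print(should_add_node(3, [0, 0, 0]))   # True: pods stuck, nodes full
print(should_add_node(0, [0, 0, 0]))   # False: nothing pending
print(should_add_node(3, [2, 0, 0]))   # False: a node still has room
```

The second condition matters: a pod can be Pending for reasons the scheduler will resolve on its own, so a new node is only worth paying for when the cluster is genuinely full.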
30. Enhance kube-sqs-autoscaler
● Scaling 1 pod at a time is too slow!
● So we improved the kube-sqs-autoscaler code to scale pods by the ratio between the SQS queue size and the pod count
○ --scale-by-ratio
  --queue-per-pod-ratio=100
  --scale-down-cool-down=30s
  --scale-up-cool-down=5m
  --max-pods=5
  --min-pods=1
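Ratio-based scaling computes the target pod count directly from the backlog instead of stepping one pod per cycle. A sketch of the arithmetic, matching the flags above (the function itself is hypothetical):

```python
import math

def desired_replicas(queue_len: int,
                     queue_per_pod_ratio: int = 100,
                     min_pods: int = 1, max_pods: int = 5) -> int:
    """One pod per queue_per_pod_ratio queued messages,
    clamped between min_pods and max_pods."""
    wanted = math.ceil(queue_len / queue_per_pod_ratio)
    return max(min_pods, min(wanted, max_pods))

print(desired_replicas(0))     # 1: min-pods floor
print(desired_replicas(250))   # 3 in a single step, not three cycles
print(desired_replicas(2000))  # 5: max-pods cap
```

A burst of messages now jumps straight to the pod count the backlog calls for, which is what makes this variant fast enough for spiky load.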
31. Move from OpenAI to autoscaler
● https://github.com/kubernetes/autoscaler
● The OpenAI autoscaler lacks development since its developers moved from AWS to Azure
● The OpenAI autoscaler does not support multiple instance groups
● Cluster Autoscaler is more mature since it is one of the Kubernetes components
32. Worker parallel optimization
- The worker consumed only 1 job at a time
- CPU usage was under 15% but memory went to ~35% per worker on a node; not good for us
- We improved our worker to consume and process multiple jobs simultaneously (a configurable setting)
- After some trials, a worker can do 5 concurrent jobs in the same processing time, using more CPU and only a bit more memory
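Processing several jobs per worker can be sketched with a thread pool; the concurrency of 5 mirrors the configurable setting mentioned above, but the code itself is illustrative rather than the speakers' implementation:

```python
from concurrent.futures import ThreadPoolExecutor

CONCURRENT_JOBS = 5  # configurable, as in the talk

def process(job):
    # Placeholder for the real ML inference work.
    return job * 2

def consume_batch(jobs):
    """Process up to CONCURRENT_JOBS jobs simultaneously."""
    with ThreadPoolExecutor(max_workers=CONCURRENT_JOBS) as pool:
        return list(pool.map(process, jobs))

print(consume_batch([1, 2, 3, 4, 5]))  # [2, 4, 6, 8, 10]
```

Running several jobs per worker raises CPU utilization per pod, which lines up with the observation that the single-job worker left most of its CPU idle while still holding memory.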
33. Worker CPU optimization
- Our worker uses TensorFlow installed via pip
- TensorFlow warned that the library wasn't compiled to use AVX and SSE4.1 instructions, even though these are available on the machine; the pip build is not compiled for any optional CPU instructions
- So we built TensorFlow with all CPU instructions available on the EC2 (t2.medium) machine
- Result: jobs are processed about 35% faster!!!
35. Benchmark questions
● How to do the load test?
○ Python script, 5,000 requests (200 concurrent users x 25 requests/user) within 1 minute
● What is the most cost-effective instance size?
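The load profile above (200 concurrent users x 25 requests each = 5,000 requests) can be sketched as follows. The HTTP call is stubbed out so only the shape of the test is shown; this is a hypothetical reconstruction, not the speakers' script:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat

USERS = 200
REQS_PER_USER = 25

def send_request():
    # Stub standing in for a real HTTP POST to the API under test.
    return 200  # pretend status code

def simulate_user(n_reqs: int) -> int:
    """One user firing its share of requests sequentially."""
    return sum(1 for _ in range(n_reqs) if send_request() == 200)

# 200 user threads, each sending 25 requests.
with ThreadPoolExecutor(max_workers=USERS) as pool:
    ok = sum(pool.map(simulate_user, repeat(REQS_PER_USER, USERS)))

print(ok)  # 5000 successful requests
```

In a real run, send_request would call the API and the elapsed time and per-request latency would be recorded alongside the success count.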
36. Benchmark Result Graph
[Graph: t2.medium wins at 1,570 queues/minute]
37. Benchmark result
● Worker scaling speed:
○ Elastic Beanstalk: 5-10 mins per worker instance
○ K8s: <2 mins when a node is available (uses a free node),
  <5 mins when no node is available (spins up a new one)
38. Conclusions
● K8s is flexible for batch processing jobs
● K8s has many components for autoscaling
● K8s helps us optimize resources cost-effectively
● K8s can finish 60,000 queued jobs in 10 mins
39. Future
● Use Kubernetes with AWS GPU instances
● Change the queue
○ RabbitMQ
○ Kafka
● Optimize cost with AWS Spot Instances