Top 5 Considerations for Running Stateful Apps on Kubernetes
Karthik Ranganathan
Co-Founder & CTO, YugaByte
April 2018
© 2018 All rights reserved.
Agenda
o About Us
o Why Run Stateful Apps on Kubernetes?
o Top 5 Considerations
1. High performance
2. Data resilience
3. Integration with application services
4. Day 2 operations
5. Automation with Operators
o Live Demo
o Q & A
About YugaByte
Kannan Muthukkaruppan, CEO
Nutanix ♦ Facebook ♦ Oracle
Karthik Ranganathan, CTO
Nutanix ♦ Facebook ♦ Microsoft
Mikhail Bautin, Software Architect
ClearStory Data ♦ Facebook ♦ D. E. Shaw
Founded Feb 2016
25 employees
Ex-Facebook, Oracle, Nutanix, Google & LinkedIn engineers
Apache HBase committers
Built NoSQL platform at Facebook for:
message inbox
message search
time series
spam detection
YugaByte DB
TRANSACTIONAL
– Distributed ACID Transactions
– Document-Based, Strongly Consistent
HIGH PERFORMANCE
– Low Latency, Tunable Reads
– High Throughput
PLANET-SCALE
– Auto Sharding & Rebalancing
– Global Data Distribution
CLOUD-NATIVE
– Built For The Container Era
– Self-Healing, Fault-Tolerant
OPEN SOURCE
– Apache 2.0
– Popular APIs Extended: Apache Cassandra, Redis and PostgreSQL (coming soon)
Under the Hood - 3 Node Cluster
YB-Master (yb-master1, yb-master2, yb-master3 on node1, node2, node3)
– Manage shard metadata & coordinate cluster-wide ops
YB-TServer (yb-tserver1, yb-tserver2, yb-tserver3)
– Stores/serves data in/from tablets (shards)
– Each tablet (tablet1, tablet2, tablet3, …) has one leader and two followers, spread across the yb-tservers
Global Transaction Manager
– Tracks ACID txns across multi-row ops, incl. clock skew mgmt.
Raft Consensus Replication
– Highly resilient, used for both data replication & leader election
DocDB Storage Engine
– Purpose-built for ever-growing data, extended from RocksDB
Why Run Stateful Apps on Kubernetes?
1. Unified orchestration across stateless & stateful apps
– Same set of compute/storage/network primitives across web server,
api server, message queue, cache, DB, file stores
2. Consistent, declarative provisioning across all envs
– minikube dev, intg test, failure test, perf test, staging, prod
3. Join one of the fastest-growing open source communities
– influence your peers and shape the future!
Starting a YugaByte DB Cluster
https://docs.yugabyte.com/deploy/multi-node-cluster/
STATEFUL: Ordered operations, Stable network IDs, Persistent volumes
1. Start all yb-masters first (usually 3)
2. Start as many yb-tservers as needed
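The two-step startup order above can be sketched as a small helper that assembles the command lines in the right sequence. This is only an illustration: the host names, the data directory and the helper itself are hypothetical, and port 7100 is assumed to be the yb-master RPC port. The flags `--master_addresses`, `--tserver_master_addrs` and `--fs_data_dirs` are taken from the YugaByte DB deployment docs linked on this slide; check them against the version you run.

```python
def startup_plan(master_hosts, tserver_hosts, data_dir="/mnt/data0", master_rpc_port=7100):
    """Return (phase, host, command) tuples: every yb-master first, then yb-tservers.

    Host names, data_dir and the port default are illustrative assumptions.
    """
    master_addrs = ",".join(f"{host}:{master_rpc_port}" for host in master_hosts)
    plan = []
    # Step 1: start all yb-masters (usually 3); each one is told about its peers.
    for host in master_hosts:
        cmd = f"yb-master --master_addresses={master_addrs} --fs_data_dirs={data_dir}"
        plan.append(("masters", host, cmd))
    # Step 2: start as many yb-tservers as needed, pointing them at the masters.
    for host in tserver_hosts:
        cmd = f"yb-tserver --tserver_master_addrs={master_addrs} --fs_data_dirs={data_dir}"
        plan.append(("tservers", host, cmd))
    return plan


plan = startup_plan(["m1", "m2", "m3"], ["t1", "t2"])
for phase, host, cmd in plan:
    print(phase, host, cmd)
```

The point of the sketch is the ordering constraint: no yb-tserver command appears before the last yb-master command, which is the behavior a StatefulSet's ordered operations are meant to provide.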
Mapping to Kubernetes Controller APIs
DEPLOYMENT
– Handles updates (rolling/recreation) on ReplicaSets
– Most commonly used
REPLICASET
– Run N identical Pods
– Can only use ephemeral storage (pod loss = data loss)
DAEMONSET
– Max 1 Pod per node
– For node-level functions (e.g. monitoring)
JOB
– Ensure N successful completions
– For batch jobs such as crons
NOT FOR STATEFUL APPS
StatefulSets API
1. Ordered operations with ordinal index
– Startup, scale-up, scale-down, rolling upgrades, termination
2. Stable, unique network ID/name across restarts
– Re-spawning a pod will not make the cluster treat it as a new member
3. Stable, persistent storage (linked to ordinal index/name)
– Attach same persistent disk to a pod even if it gets rescheduled to new node
4. Mandatory headless service (no single IP) for integrations
– No load balancer, smart clients aware of all pods and connect to any
Purpose-built for stateful apps
Alpha: v1.3 (Jul 2016)
Beta: v1.5 (Dec 2016)
Stable: v1.9 (Dec 2017)
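As a minimal sketch of point 2, the stable network ID of each pod can be derived from the StatefulSet name, the ordinal index and the headless service name. The helper name and the `cluster.local` default are assumptions for illustration; the pod naming pattern (`<statefulset>-<ordinal>`) and the DNS pattern (`<pod>.<service>.<namespace>.svc.<cluster-domain>`) follow the Kubernetes StatefulSet documentation.

```python
def statefulset_pod_dns(statefulset, service, namespace, replicas,
                        cluster_domain="cluster.local"):
    """List the stable DNS names of a StatefulSet's pods, in ordinal order."""
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


for name in statefulset_pod_dns("yb-master", "yb-masters", "default", 3):
    print(name)
```

Because the names depend only on the ordinal, a re-spawned `yb-master-1` comes back under the same name, which is why the rest of the cluster does not treat it as a new member.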
YugaByte DB Deployed as StatefulSets
Nodes: node1, node2, node3, node4
yb-master StatefulSet
– Pods: yb-master-0, yb-master-1, yb-master-2 (yugabytedb container in each)
– Service: yb-masters Headless Service
yb-tserver StatefulSet
– Pods: yb-tserver-0, yb-tserver-1, yb-tserver-2, yb-tserver-3, … (yugabytedb container in each, serving tablets)
– Service: yb-tservers Headless Service
Storage: Local/Remote Persistent Volumes
Clients: App Clients, Admin Clients
YB-Master
yb-masters
Headless Service
yb-master StatefulSet
Access the yb-masters service as
$name.$namespace.svc.cluster.local
3 pods
https://github.com/YugaByte/yugabyte-db/blob/master/cloud/kubernetes/yugabyte-statefulset-local-ssd-gke.yaml
ui & rpc ports
volume mountpath matches fs_data_dirs
Headless
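The annotations on this slide (headless service, 3 pods, ui & rpc ports, volume mountpath matching `fs_data_dirs`) can be sketched as manifests built in Python. This is not the manifest from the linked repository: the data directory, image tag, binary path, labels, port numbers (7000 for ui, 7100 for rpc) and volume size are assumptions chosen for illustration, while the field names are standard Kubernetes `Service` and `StatefulSet` fields.

```python
DATA_DIR = "/mnt/data0"  # assumed path, shared by mountPath and --fs_data_dirs


def headless_service(name, app_label, ports):
    """Service manifest with clusterIP set to None, i.e. a headless service."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "labels": {"app": app_label}},
        "spec": {
            "clusterIP": "None",  # headless: DNS resolves to the pod IPs
            "selector": {"app": app_label},
            "ports": [{"name": n, "port": p} for n, p in ports.items()],
        },
    }


def master_statefulset(service_name, namespace="default", replicas=3):
    """StatefulSet manifest for yb-master (values are illustrative)."""
    addrs = ",".join(
        f"yb-master-{i}.{service_name}.{namespace}.svc.cluster.local:7100"
        for i in range(replicas)
    )
    container = {
        "name": "yb-master",
        "image": "yugabytedb/yugabyte:latest",  # assumed image tag
        "command": [
            "/home/yugabyte/bin/yb-master",  # assumed binary path in the image
            f"--fs_data_dirs={DATA_DIR}",
            f"--master_addresses={addrs}",
        ],
        "ports": [
            {"name": "master-ui", "containerPort": 7000},
            {"name": "master-rpc", "containerPort": 7100},
        ],
        # The mount path is the same directory passed to --fs_data_dirs.
        "volumeMounts": [{"name": "datadir", "mountPath": DATA_DIR}],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": "yb-master"},
        "spec": {
            "serviceName": service_name,  # ties pod DNS names to the service
            "replicas": replicas,
            "selector": {"matchLabels": {"app": "yb-master"}},
            "template": {
                "metadata": {"labels": {"app": "yb-master"}},
                "spec": {"containers": [container]},
            },
            "volumeClaimTemplates": [{
                "metadata": {"name": "datadir"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": "1Gi"}},
                },
            }],
        },
    }


service = headless_service("yb-masters", "yb-master", {"ui": 7000, "rpc-port": 7100})
statefulset = master_statefulset("yb-masters")
```

Dumping either dictionary with a YAML or JSON serializer gives something that could be passed to `kubectl apply -f`; using one constant for both the mount path and the flag keeps the two from drifting apart.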
YB-TServer
yb-tservers
Headless Service
yb-tserver StatefulSet
Access the yb-tservers service as
$name.$namespace.svc.cluster.local
3 pods to begin with, scale as needed
https://github.com/YugaByte/yugabyte-db/blob/master/cloud/kubernetes/yugabyte-statefulset-local-ssd-gke.yaml
ui, rpc, cassandra & redis ports
volume mountpath matches fs_data_dirs
Headless
Other Examples
http://blog.kubernetes.io/2017/02/postgresql-clusters-kubernetes-statefulsets.html
https://www.slideshare.net/JoergHenning/elasticsearch-on-kubernetes
1. Ensuring High Performance
LOCAL STORAGE (Beta in latest v1.10)
– Lower latency, Higher throughput
– Recommended for workloads that do their own replication
– Pre-provision outside of K8s
– Use SSDs for latency-sensitive apps
REMOTE STORAGE (Stable)
– Higher latency, Lower throughput
– Recommended for workloads that do not perform any replication on their own
– Provision dynamically in K8s
– Use alongside local storage for cost-efficient tiering
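The local-versus-remote choice above could be expressed as a StorageClass picked by whether the app replicates its own data. This is a sketch under assumptions: the function and class names are hypothetical, and the remote variant assumes GKE persistent disks (`kubernetes.io/gce-pd` with `pd-ssd`). The `kubernetes.io/no-provisioner` provisioner and `WaitForFirstConsumer` binding mode are the settings the Kubernetes docs describe for pre-provisioned local volumes.

```python
def storage_class(name, app_replicates_data):
    """Return a StorageClass manifest matching the slide's recommendation."""
    manifest = {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
    }
    if app_replicates_data:
        # Local storage: volumes are pre-provisioned outside of K8s, so there
        # is no dynamic provisioner; binding waits until a pod is scheduled.
        manifest["provisioner"] = "kubernetes.io/no-provisioner"
        manifest["volumeBindingMode"] = "WaitForFirstConsumer"
    else:
        # Remote storage: provisioned dynamically in K8s (GKE PD assumed here).
        manifest["provisioner"] = "kubernetes.io/gce-pd"
        manifest["parameters"] = {"type": "pd-ssd"}
    return manifest


local_sc = storage_class("local-ssd", app_replicates_data=True)
remote_sc = storage_class("remote-ssd", app_replicates_data=False)
```

A database such as YugaByte DB, which replicates data itself, would fall in the first branch; a single-instance database without built-in replication would fall in the second.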
2. Configuring Data Resilience
POD ANTI-AFFINITY
– Pods of the same type should not be scheduled on the same node
– Keeps impact of node failures to absolute minimum
(Beta in latest v1.10)
MULTI-ZONE/REGIONAL/MULTI-REGION POD SCHEDULING
– Multi-Zone: Tolerate zone failures for k8s slave nodes
– Regional: Tolerate zone failures for both k8s slave and master nodes [GKE only]
– Multi-Region: Requires federation of k8s clusters
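The anti-affinity rule above ("pods of the same type should not be scheduled on the same node") might look like the following pod-spec fragment, built here as a Python dictionary. The helper name and the `app` label key are assumptions; the field names (`podAntiAffinity`, `requiredDuringSchedulingIgnoredDuringExecution`, `labelSelector`, `topologyKey`) and the `kubernetes.io/hostname` node label are standard Kubernetes.

```python
def pod_anti_affinity(app_label, topology_key="kubernetes.io/hostname"):
    """Affinity block that keeps pods carrying the same app label apart.

    With the default topology key, no two matching pods share a node.
    Passing a zone label instead spreads them across zones.
    """
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [{
                "labelSelector": {
                    "matchExpressions": [{
                        "key": "app",  # assumed label key
                        "operator": "In",
                        "values": [app_label],
                    }]
                },
                "topologyKey": topology_key,
            }]
        }
    }


# Goes under spec.template.spec.affinity of the yb-tserver StatefulSet.
affinity = pod_anti_affinity("yb-tserver")
```

The same helper covers the multi-zone case by switching the topology key to the zone label used by the cluster (`topology.kubernetes.io/zone` in current Kubernetes releases), so that losing a zone takes out at most one replica of each tablet.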
3. Integrating StatefulSets with App Services
https://www.slideshare.net/ssuser6bb12d/kubernetes-introduction-71846110
preferred for clarity/readability
4. Running Day 2 Operations
BACKUP & RESTORE
– Backups and restores are a database-level construct
– YugaByte DB can perform a distributed snapshot and copy it to a target for a backup
– Restore the backup into an existing cluster or a new cluster with a different number of tservers
ROLLING UPGRADES
– StatefulSets support two updateStrategy types: OnDelete and RollingUpdate (OnDelete is the default in apps/v1beta1; apps/v1 defaults to RollingUpdate)
– Pick the RollingUpdate strategy for DBs that support zero downtime upgrades such as YugaByte DB
– New instance of the pod spawned with same network id and storage
HANDLING FAILURES
– Pod failure handled by K8S automatically
– Node failure has to be handled manually by adding a new slave node to K8S cluster
– Local storage failure has to be handled manually by mounting new local volume to K8S
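To make the rolling upgrade behavior concrete, the sketch below lists the order in which a StatefulSet rolling update replaces pods. Per the Kubernetes StatefulSet documentation, pods are updated one at a time from the largest ordinal to the smallest, and pods with an ordinal below the optional `partition` value are left untouched. The helper itself is hypothetical.

```python
def rolling_update_order(statefulset, replicas, partition=0):
    """Pod names in the order a StatefulSet rolling update replaces them.

    Updates run from the highest ordinal down; ordinals below `partition`
    are skipped, which allows a staged (canary-style) rollout.
    """
    return [f"{statefulset}-{i}" for i in range(replicas - 1, partition - 1, -1)]


print(rolling_update_order("yb-tserver", 3))
print(rolling_update_order("yb-tserver", 3, partition=2))
```

Each replaced pod comes back with the same name and the same persistent volume claim, which is what lets a replicated database keep serving from the remaining pods while one is restarted.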
5. Extending StatefulSets with Operators
https://kubernetes.io/docs/concepts/api-extension/custom-resources/#custom-controllers
– Based on Custom Controllers that have direct access to lower level K8S API
– Excellent fit for stateful apps requiring human operational knowledge to correctly scale, reconfigure and upgrade while simultaneously ensuring high performance and data resilience
– Complementary to Helm for packaging
Example: monitor CPU usage in the yb-tserver StatefulSet; if CPU > 80% for 1min and max_threshold not exceeded, scale yb-tserver by 1 pod
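The scaling rule in this example can be sketched as the decision function an Operator's control loop might call. Everything here is illustrative: the function name, the 10-second sampling interval, and the reading of `max_threshold` as an upper bound on the replica count are assumptions, not details from the talk.

```python
def desired_replicas(cpu_samples, current, max_replicas,
                     threshold=80.0, window_s=60, interval_s=10):
    """Return the replica count the yb-tserver StatefulSet should have.

    cpu_samples: CPU usage percentages, oldest first, one per interval_s.
    Scale up by one pod only if every sample in the last window_s seconds
    is above the threshold and the ceiling has not been reached.
    """
    needed = window_s // interval_s
    recent = cpu_samples[-needed:]
    sustained = len(recent) >= needed and all(s > threshold for s in recent)
    if sustained and current < max_replicas:
        return current + 1
    return current


print(desired_replicas([85, 90, 88, 92, 86, 91], current=3, max_replicas=5))
```

A real Operator would then patch `spec.replicas` on the StatefulSet and let the StatefulSet controller create the next ordinal pod; keeping the rule a pure function makes it easy to test separately from the Kubernetes API calls.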
A Real-World Example
Yugastore – E-Commerce app on the YERN stack
Deployed on
github.com/YugaByte/yugastore
Kubernetes Deployment Architecture
yb-master StatefulSet
– Pods: yb-master-0, yb-master-1, yb-master-2 (yugabytedb container in each)
– yb-masters Headless Service
– yb-master-ui LoadBalancer Service (Admin User)
yb-tserver StatefulSet
– Pods: yb-tserver-0, yb-tserver-1, yb-tserver-2 (yugabytedb container in each, serving tablets)
– yb-tservers Headless Service
yugastore Deployment
– Pods: yugastore-0, yugastore-1 (yugastore container in each)
– yugastore LoadBalancer Service (End User)
Summary
1. StatefulSets are becoming increasingly powerful
2. Ensuring high performance and data resilience requires careful planning
3. Day 2 operations are getting simpler (thanks to Operators)
4. The community is ready to help if you want to get started!
Try YugaByte DB on your laptop
docs.yugabyte.com/quick-start
Join Us For Our Next Online Talk
gitter.im/YugaByte
@YugaByte
Questions?