Federation of Kubernetes Clusters ("Übernetes")
Kubecon 2015
Quinton Hoole <quinton@google.com>
Staff Software Engineer - Google
quinton_hoole@github

Google has beeeg data centers...
... but you know that already.
Images by Connie Zhou

But we also have rather a lot of them...

Treating these differently can have benefits...

[Diagram: users reach the Kubernetes control plane servers through the UI, CLI, and API; the control plane runs containers within a single cluster / data center / availability zone.]

All you really care about?
[Diagram: UI, API, and containers.]

Federation
[Diagram: users reach the Übernetes control plane through the UI, CLI, and API; Übernetes talks to the API of each underlying Kubernetes cluster, including one on premise.]

Reason 1: High Availability
• Cloud providers have outages, yes, but...
• Has one of your application software upgrades ever gone terribly wrong?
• How about infrastructure upgrades (auth systems? quota? data store?)
• How about a fat-fingered config change?
• There are several interesting variants:
  • Multiple availability zones?
  • Multiple cloud providers?
[Diagram: your paying customer reaches Cluster 1, Cluster 2, and Cluster 3 through a cross-cluster load balancer.]

Reason 2: Application Migration
• Migrating applications between clusters is tedious and error-prone if done manually
• Much like software upgrades, you *can* script them, but (K)ubernetes just does it quicker/safer/better.
• Now with rollback too!
• On-premise ↔ Cloud
• Amazon ↔ Google :-)
• ...
[Diagram: through the Ubernetes UI, an application migrates from an on-premise cluster to an in-cloud cluster, or to a different cloud provider.]

Reason 3: Policy Enforcement
• Some data must be stored and processed within specified political jurisdictions, by law.
• Some software/data must be on premise and air-gapped, by company policy.
• Some business units get to use the expensive gear, some don't.
• Auditing is also a big deal, so funnelling all operations through a central control point makes this easier.
[Diagram: the Ubernetes UI in front of a U.S. cloud cluster, an E.U. cloud cluster, and an on-premise cluster.]
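The policy constraints above can be read as a filter over candidate clusters. The following is a minimal sketch under stated assumptions, not how Ubernetes implements policy: the `ClusterInfo` and `Policy` types, their fields, and `eligible_clusters` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterInfo:
    """Hypothetical description of one federated cluster."""
    name: str
    jurisdiction: str  # e.g. "US" or "EU" (illustrative values)
    on_premise: bool


@dataclass(frozen=True)
class Policy:
    """Hypothetical placement policy for one workload."""
    allowed_jurisdictions: frozenset = frozenset()  # empty means "anywhere"
    require_on_premise: bool = False


def eligible_clusters(policy, clusters):
    """Return the names of clusters that satisfy every policy constraint."""
    result = []
    for cluster in clusters:
        if (policy.allowed_jurisdictions
                and cluster.jurisdiction not in policy.allowed_jurisdictions):
            continue  # data must stay inside the listed jurisdictions
        if policy.require_on_premise and not cluster.on_premise:
            continue  # workload must not leave the premises
        result.append(cluster.name)
    return result
```

Because every placement request would pass through one function like this, the same central point could also record an audit entry for each decision.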

Reason 4: Vendor Lock-in Avoidance
• Make it easy to migrate applications between cloud providers.
• Run the same app on multiple cloud providers and choose the best one for your:
  • workload characteristics
  • budget
  • performance requirements
  • availability requirements
[Diagram: the Ubernetes UI in front of Kubernetes on GCE, Kubernetes on AWS, and Kubernetes on-premise.]

Reason 5: Capacity Overflow
• Make intelligent placement decisions
  • Utilization
  • Cost
  • Performance
[Diagram: the user tells Ubernetes "Run my stuff"; Ubernetes places it on the on-premise cluster, the preferred cloud provider, or another cloud provider.]
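One simple reading of capacity overflow is "fill clusters in preference order and spill the remainder into the next one". The sketch below assumes exactly that; the function name `place`, the cluster names, and the notion of free capacity measured in replicas are all hypothetical, and a real scheduler would also weigh utilization, cost, and performance.

```python
def place(replicas, clusters):
    """Greedily assign replicas to clusters in preference order.

    clusters is a list of (name, free_capacity) pairs, most preferred first.
    Returns (placement, unplaced), where placement maps cluster name to the
    number of replicas assigned and unplaced counts what did not fit anywhere.
    """
    placement = {}
    remaining = replicas
    for name, free_capacity in clusters:
        if remaining == 0:
            break
        take = min(remaining, free_capacity)
        if take > 0:
            placement[name] = take
            remaining -= take
    return placement, remaining
```

For example, `place(10, [("on-premise", 4), ("preferred-cloud", 5), ("other-cloud", 100)])` fills the on-premise cluster first and overflows only the last replica to the least preferred provider.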

"OK, I'm sold. Where's the catch?"

Federation comes with some challenges...
• Different bandwidth charges/latency/throughput/reliability
• Different service discovery (but DNS!)
• Consolidated monitoring & alerting
[Diagram: clusters spread across Provider 1 (Zones A, B, and D) and Provider 2 (Zone C).]

Cross-cluster load balancing
• Geographically aware DNS gets clients to the "closest" healthy cluster.
• Standard Kubernetes service load balancing within each cluster.
• New L7 LBs available soon.
• Can be extended to divert traffic away from "healthy-but-saturated" clusters.

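The cross-cluster load balancing rules above (closest healthy cluster, with traffic diverted away from healthy-but-saturated ones) can be sketched as a selection function. This is an illustration under assumptions, not the Ubernetes or DNS implementation: `ClusterHealth`, its fields, `pick_cluster`, and the 0.9 saturation threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterHealth:
    """Hypothetical per-cluster signals as seen from one client location."""
    name: str
    latency_ms: float   # stand-in for geographic "closeness"
    healthy: bool
    utilization: float  # 0.0 (idle) to 1.0 (fully saturated)


def pick_cluster(clusters, saturation_threshold=0.9):
    """Return the closest healthy cluster, preferring unsaturated ones."""
    healthy = [c for c in clusters if c.healthy]
    if not healthy:
        return None  # nothing can serve the request
    unsaturated = [c for c in healthy if c.utilization < saturation_threshold]
    # Fall back to saturated clusters only when every healthy one is saturated.
    candidates = unsaturated or healthy
    return min(candidates, key=lambda c: c.latency_ms)
```

Load balancing inside the chosen cluster is left to the standard Kubernetes service mechanism.
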
Cross-cluster service discovery
• DNS + Kubernetes cluster-local service discovery.
• Can default to cluster-local with failover to remote clusters.
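The "cluster-local with failover" default above can be sketched as a lookup that tries the caller's own cluster first. This is a sketch under assumptions: `resolve_service` and the nested-dictionary shape of `endpoints_by_cluster` are hypothetical stand-ins for whatever DNS records a real deployment would publish.

```python
def resolve_service(service, local_cluster, endpoints_by_cluster):
    """Resolve a service, preferring endpoints in the caller's own cluster.

    endpoints_by_cluster maps cluster name -> {service name -> [addresses]}
    and is assumed to list healthy endpoints only.
    Returns (cluster_name, addresses), or (None, []) if nothing is available.
    """
    local = endpoints_by_cluster.get(local_cluster, {}).get(service, [])
    if local:
        return local_cluster, local
    # Failover: scan remote clusters in a deterministic (alphabetical) order.
    for cluster in sorted(endpoints_by_cluster):
        if cluster == local_cluster:
            continue
        remote = endpoints_by_cluster[cluster].get(service, [])
        if remote:
            return cluster, remote
    return None, []
```

A real implementation would more likely order remote clusters by proximity than alphabetically; the fixed order here only keeps the example deterministic.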

Location affinity
• Strictly coupled pods/applications
  • High bandwidth requirements
  • Low latency requirements
  • High fidelity requirements
  • Cannot easily span clusters
• Loosely coupled
  • Opposite of above
  • Relatively easily distributed across clusters
• Preferentially coupled
  • Strongly coupled but can be migrated piecemeal.

Cross-cluster monitoring and auditing...
• "Cluster per tab" might suffice for small numbers of clusters
• Some monitoring solutions provide stronger integration and global summarization

Cluster Federation - The Implementation...

API Compatible with Kubernetes
• Less new stuff to learn
• Can learn incrementally, as you need new functionality.
• Analogous argument applies to existing automation systems (PaaS etc).
• These can be ported to Ubernetes relatively easily.
• All Kubernetes entities are "federatable".
[Diagram: client applications send "Run my stuff" to either Ubernetes or Kubernetes.]

State and control reside in underlying clusters (for the most part)
• Better scalability
  • Kubernetes scales with number of nodes per cluster (<10,000)
  • Ubernetes scales with number of clusters (~100)
• Better fault isolation
  • Kubernetes clusters fail independently of Ubernetes
[Diagram: Ubernetes sits above the Kubernetes clusters; Ubernetes and each cluster have their own API, replication controller etc., and state.]

Similar control loops to Kubernetes
• Drive current state -> desired state
• But per-cluster state, not per node, per pod etc.
• Observed state is the truth
Recurring pattern in the system: observe, diff, act
Examples:
• ReplicationController
• Service
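The observe / diff / act pattern can be sketched for per-cluster replica counts. This is a minimal illustration, not Ubernetes code: `diff`, `reconcile_once`, and the `observe` and `act` callbacks are hypothetical names, and state is modelled as a plain mapping from cluster name to replica count.

```python
def diff(desired, observed):
    """Return per-cluster replica deltas (positive means scale up)."""
    deltas = {}
    for cluster in sorted(set(desired) | set(observed)):
        delta = desired.get(cluster, 0) - observed.get(cluster, 0)
        if delta != 0:
            deltas[cluster] = delta
    return deltas


def reconcile_once(desired, observe, act):
    """Run one pass of the loop: observe, diff, act.

    observe() returns the current {cluster: replicas} mapping; it is treated
    as the truth and is re-read on every pass rather than cached.
    act(cluster, delta) asks one cluster to add or remove replicas.
    """
    observed = observe()
    actions = diff(desired, observed)
    for cluster, delta in actions.items():
        act(cluster, delta)
    return actions
```

Calling `reconcile_once` repeatedly converges: once the observed state matches the desired state, the diff is empty and no further actions are issued.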

Modularity
Loose coupling is a goal everywhere
• simpler
• composable
• extensible
Code-level plugins where possible
Multi-process where possible
Isolate risk by interchangeable parts
Examples:
• MigrationController
• Scheduler

Federation status & plans
Federation Lite (single cluster, multiple zones)
• In alpha Q4 2015
• Productionized ~Q1 2016
Federation Proper (multiple clusters, federated)
• Alpha Q1 2016
Google Container Engine (GKE)
• hosted Federation too
• GKE Federation Lite ~Q1-Q2 2016
PaaSes and Distros
• Red Hat OpenShift, CoreOS Tectonic, Red Hat Atomic...
• ... watch this space...

I want more!
• Requirements doc - comments welcome
• tinyurl.com/ubernetesv2
• Special interest group
• groups.google.com/forum/kubernetes-sig-federation
• quinton@google.com
• quinton_hoole@github
