Kubernetes seems to be the biggest buzzword in the DevOps world right now. The Google-designed container orchestrator, based on their 10+ years of experience running production applications in containers, seems to have positioned itself as the market leader.
Open source, available on both the Google Cloud and Azure container platforms or as a custom installation, it is ready to receive production loads.
During this talk we will discover how Kubernetes works, its architecture, and which components make up a Kubernetes cluster. We will also learn which objects a developer can use to deploy applications on a Kubernetes cluster. We will see a live demo in which we deploy an application and then introduce changes to it without any downtime.
2. Introduction: Who Am I
Juan Larriba
DevOps Engineer at everis cloud services
@compilemymind
3. Introduction: Containers
Containers are gaining a lot of traction because they isolate different
applications on the same physical or virtual hardware
Usually, servers are provisioned for the worst case scenario, leading to a lot of
unused resources most of the time
Containerization lets us securely share that hardware between different
applications that are busy at different times, making better use of it
4. Introduction: Container Orchestrators
Currently there are 4 main container orchestrators fighting to be the market
leader
Kubernetes
Mesos
Docker Swarm
Service Fabric
7. Architecture
Kubernetes is programmed as a monolithic application but deployed as a
microservices application
It relies on external services for networking and persistent storage of its own
state
All communications, both external and internal, use the HTTPS protocol
8. Architecture: Software Defined Networking
One of the first problems we face when working with Docker is manual
port management
When deploying a number of containers on the same machine, we need to
manually track which ports each container exposes
To avoid this problem, Kubernetes uses Software Defined Networking
(commonly Flannel, but also WeaveNet and others)
Each container is then automatically assigned a different IP, so all of them
can expose the same port
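The idea above can be sketched in a few lines of Python. This is only an illustration, not Flannel's actual allocation logic: the 10.244.0.0/16 range, the /24 subnet per node and the node names are assumptions made for the example.

```python
import ipaddress

# Hypothetical cluster-wide range; real SDN plugins make this configurable.
CLUSTER_CIDR = ipaddress.ip_network("10.244.0.0/16")


def allocate_endpoints(nodes, containers_per_node, port=8080):
    """Give each node a /24 subnet and each container its own IP in it."""
    subnets = CLUSTER_CIDR.subnets(new_prefix=24)
    endpoints = []
    for node, subnet in zip(nodes, subnets):
        hosts = iter(subnet.hosts())
        for _ in range(containers_per_node):
            endpoints.append((node, str(next(hosts)), port))
    return endpoints


endpoints = allocate_endpoints(["node-a", "node-b"], containers_per_node=2)
for node, ip, port in endpoints:
    print(f"{node}: {ip}:{port}")
```

Every endpoint listens on the same port, yet none of them collide, because the IP is what differs.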
9. Architecture: etcd
Kubernetes needs to persist its state in some kind of persistent storage
It uses exclusively etcd as its backend
etcd is a distributed key-value store created by the CoreOS team
Each etcd major version breaks the previous API
As of Kubernetes 1.6, the version used is etcd3
10. Architecture: Kubelet
The Kubelet is a native Linux daemon that needs to be executed in each
member of a cluster: masters and nodes
It is the executor of the commands
It communicates with its node's Docker API to actually launch the Docker
containers required by other Kubernetes components
It can also work standalone, acting as a Supervisord for Docker containers
It is the only Kubernetes component that does not work as a Docker
container
11. Architecture: kube-apiserver
It is deployed only in the master
It is the entrypoint for the Kubernetes cluster
It exposes a REST API
The client communicates and sends commands to the apiserver, which
validates the information sent and, if it is correct, stores it in etcd
12. Architecture: kube-scheduler
It is deployed only in the master
The Scheduler is aware of the cluster status and decides where the new
objects must be placed
It is a very complex piece of software, the real “brain” of the Kubernetes
cluster
As stated in Kubernetes documentation:
The scheduler needs to take into account individual and collective resource requirements,
quality of service requirements, hardware/software/policy constraints, affinity and anti-
affinity specifications, data locality, inter-workload interference, deadlines, and so on
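As a rough illustration of that decision, the sketch below filters the nodes that can fit a request and then picks the one with the most free CPU. It is a toy model: the real scheduler weighs many more criteria (those listed in the quote above), and the `Node` class, its fields and the scoring rule are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu_millis: int
    free_memory_mib: int


def pick_node(nodes, cpu_millis, memory_mib):
    """Filter nodes that fit the request, then prefer the most free CPU."""
    feasible = [
        n for n in nodes
        if n.free_cpu_millis >= cpu_millis and n.free_memory_mib >= memory_mib
    ]
    if not feasible:
        return None  # nothing fits: the Pod would stay pending
    return max(feasible, key=lambda n: n.free_cpu_millis).name


nodes = [
    Node("node-a", free_cpu_millis=500, free_memory_mib=4096),
    Node("node-b", free_cpu_millis=2000, free_memory_mib=1024),
    Node("node-c", free_cpu_millis=1500, free_memory_mib=8192),
]
chosen = pick_node(nodes, cpu_millis=1000, memory_mib=2048)
print(chosen)
```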
13. Architecture: kube-controller-manager
It is deployed only in the master
The Controller-Manager is the control loop of the cluster
The Controller-Manager watches the shared state of the cluster stored in
etcd by the API Server
It continuously compares the desired state of the cluster with the current
state and notifies the other components of the cluster to perform the actions
needed to move the cluster towards the desired state
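The comparison of desired and current state can be sketched as a single reconciliation step. This is a simplified model, not the Controller-Manager's code: the state is reduced to a replica count per application, and the `reconcile` function and the application names are hypothetical.

```python
def reconcile(desired, current):
    """Return the actions needed to move `current` towards `desired`.

    Both arguments map an application name to a replica count.
    """
    actions = []
    for name, want in sorted(desired.items()):
        have = current.get(name, 0)
        if have < want:
            actions.append(("start", name, want - have))
        elif have > want:
            actions.append(("stop", name, have - want))
    # Anything running that is no longer desired must be stopped entirely.
    for name in sorted(set(current) - set(desired)):
        actions.append(("stop", name, current[name]))
    return actions


desired = {"web": 3, "worker": 2}
current = {"web": 1, "worker": 4, "legacy": 1}
actions = reconcile(desired, current)
for action in actions:
    print(action)
```

A real control loop would run this step continuously, which is why a cluster that already matches its desired state produces no actions at all.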
14. Architecture: kube-proxy
It is deployed as a static pod on each node of the cluster
It implements the Service capabilities
16. Addons: Ingress Controller
It provides a way to route external requests to applications in the cluster
Matches DNS names and contexts (which external clients like browsers can
understand) to Kubernetes Services
One specification, multiple implementations
Currently we use the Nginx implementation, but a custom implementation is
easily done
17. Addons: Dashboard
A web frontend for the cluster
It shows in a graphical UI all the information that can be obtained through
the API or the CLI
Embeds the limited monitoring capabilities previously present on Kubedash,
which has been deprecated
18. Addons: Heapster
Reads monitoring data from the Kubelet (extracted from the Docker API and
the node it lives in) and exposes it via a REST API
It can be deployed standalone and it will store all the cluster metrics for the
last 15 minutes
It can be plugged into different backends, currently supporting Log, InfluxDB,
Google Cloud Monitoring, Google Cloud Logging, Hawkular-Metrics,
OpenTSDB, Monasca, Kafka, Riemann, Elasticsearch…
When plugged into a backend, it will store unlimited metrics (limited by the
backend policies)
19. Addons: kube-dns
Kubernetes uses DNS for service discovery
As each application deployed in the cluster will have its own IP, Kubernetes
provides a way to resolve service names to IPs
Until version 1.3, it used SkyDNS, an implementation of the DNS
protocol in Go with etcd storage and a REST API
From 1.4 onwards, it uses dnsmasq with a Go REST API which modifies
and reloads the configuration
21. Objects: Pod
The most basic unit of computation in Kubernetes is a Pod
A Pod can contain one or more Docker containers, but for simplification, we
will only store one container in one Pod
Each Pod definition passed to the Kubelet creates, at least, two Docker
containers: the user container and a special Pod container that handles the
networking
A Pod has an SDN-assigned IP, and thus it is only reachable from inside the
cluster
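A single-container Pod definition might look like the sketch below. It is built as a Python dictionary and printed as JSON, a format `kubectl` accepts alongside YAML. The name `hello-web`, the label and the `nginx:1.25` image are placeholders chosen for the example.

```python
import json

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "hello-web", "labels": {"app": "hello-web"}},
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25",  # placeholder image
                "ports": [{"containerPort": 80}],
            }
        ]
    },
}

pod_json = json.dumps(pod, indent=2)
print(pod_json)
```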
22. Objects: Service
Defines a “ClusterIP” so a Pod can be reached from each node of the cluster
Every replica of the same Pod shares the same Service, which acts as a load
balancer
A Service is not an Nginx or an HAProxy: it does not consume resources, nor
is it deployed to a node. It is a kube-proxy configuration
Depending on the IaaS, a Service can acquire an external IP
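The link between a Service and its Pod replicas is a label selector. The sketch below shows a ClusterIP Service definition and a small helper that mimics how the selector picks its backends. The `backends` helper and the Pod names and labels are hypothetical, for illustration only.

```python
import json

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "hello-web"},
    "spec": {
        "type": "ClusterIP",
        "selector": {"app": "hello-web"},
        "ports": [{"port": 80, "targetPort": 80}],
    },
}


def backends(service, pods):
    """Return the names of Pods whose labels contain every selector pair."""
    selector = service["spec"]["selector"]
    return [
        name for name, labels in pods.items()
        if all(labels.get(k) == v for k, v in selector.items())
    ]


# Hypothetical Pods and their labels.
pods = {
    "hello-web-1": {"app": "hello-web"},
    "hello-web-2": {"app": "hello-web", "tier": "frontend"},
    "db-1": {"app": "db"},
}
selected = backends(service, pods)
print(json.dumps(service, indent=2))
print(selected)
```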
23. Objects: Ingress
Exposes a Service with a network-wide URL so it can be accessed from the
outside world
Provides a much safer and more manageable way of accessing services
than directly exposing IPs
The Ingress endpoint is provided by the Ingress Controller Addon
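An Ingress rule maps a host name and a path to a Service. The sketch below assumes the `networking.k8s.io/v1` API version used by current Kubernetes releases, which may differ from the one available when this talk was given. The host `hello.example.com` is a placeholder, and the `route` helper is a simplified stand-in for what an Ingress Controller does (it uses plain string prefix matching).

```python
import json

ingress = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "Ingress",
    "metadata": {"name": "hello-web"},
    "spec": {
        "rules": [
            {
                "host": "hello.example.com",  # placeholder DNS name
                "http": {
                    "paths": [
                        {
                            "path": "/",
                            "pathType": "Prefix",
                            "backend": {
                                "service": {
                                    "name": "hello-web",
                                    "port": {"number": 80},
                                }
                            },
                        }
                    ]
                },
            }
        ]
    },
}


def route(ingress, host, path):
    """Return the Service name for a request, or None if no rule matches."""
    for rule in ingress["spec"]["rules"]:
        if rule["host"] != host:
            continue
        for entry in rule["http"]["paths"]:
            if path.startswith(entry["path"]):
                return entry["backend"]["service"]["name"]
    return None


print(json.dumps(ingress, indent=2))
print(route(ingress, "hello.example.com", "/index.html"))
```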
24. Objects: ReplicationController
Ensures that a specified number of pod “replicas” are running at any one
time
If there are too many pods, it will kill some. If there are too few, the
replication controller will start more
You can think of a replication controller as something similar to a process
supervisor, but rather than individual processes on a single node, the
replication controller supervises multiple pods across multiple nodes
25. Objects: ReplicaSet
It is the next-gen ReplicationController, still in beta.
The biggest difference is that ReplicaSets do not support the rolling-update
command
ReplicaSets can be used standalone, but they are mainly used by
Deployments to orchestrate pod creation, deletion and updates
When you use Deployments you don’t have to worry about managing the
ReplicaSets that they create
26. Objects: Deployment
Provides declarative updates for ReplicaSets
It provides all the capabilities of a Replication Controller, but adds other
powerful features
It adds the versioning feature: a Deployment is able to track the previously
deployed versions and perform easy rollbacks
Pause and Resume
Update the Deployment to recreate the pods
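A Deployment that updates its pods without downtime could be declared roughly as follows. The sketch assumes the `apps/v1` API version of current Kubernetes releases. The names, replica count and image tags are placeholders, and `with_image` is a hypothetical helper that only shows which field changes between two versions.

```python
import copy
import json

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "hello-web"},
    "spec": {
        "replicas": 3,
        "revisionHistoryLimit": 5,  # old ReplicaSets kept for rollbacks
        "selector": {"matchLabels": {"app": "hello-web"}},
        "strategy": {
            "type": "RollingUpdate",
            # Never drop below the desired count; add one extra Pod at a time.
            "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1},
        },
        "template": {
            "metadata": {"labels": {"app": "hello-web"}},
            "spec": {
                "containers": [{"name": "web", "image": "nginx:1.24"}]
            },
        },
    },
}


def with_image(manifest, image):
    """Return a copy of the Deployment whose first container uses `image`."""
    updated = copy.deepcopy(manifest)
    updated["spec"]["template"]["spec"]["containers"][0]["image"] = image
    return updated


new_version = with_image(deployment, "nginx:1.25")
print(json.dumps(new_version, indent=2))
```

Applying the second definition changes the Pod template, which is what makes the Deployment create a new ReplicaSet and replace the pods gradually.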
27. Objects: DaemonSet
It is a special kind of ReplicationController that ensures one replica of a pod
is running on each node of the cluster
You do not directly specify how many replicas a DaemonSet deploys
As nodes are added to the cluster, pods are added to them. As nodes are
removed from the cluster, those pods are garbage collected
28. Objects: Namespace
Every Kubernetes Object must have a unique name
This can be a nightmare as the cluster grows
To avoid this problem, each Object is created inside a Namespace, and its
name only needs to be unique within that Namespace
DNS Service Discovery takes into account the Service Name and the
Namespace when resolving
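The DNS name of a Service is built from its name and its Namespace. A minimal sketch, assuming the default cluster domain `cluster.local` (clusters can be configured with a different one). The function name and the example Service names are hypothetical.

```python
def service_dns_name(service, namespace="default", cluster_domain="cluster.local"):
    """Build the fully qualified DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


print(service_dns_name("hello-web"))
print(service_dns_name("hello-web", namespace="staging"))
```

Two Services with the same name in different Namespaces therefore resolve to different DNS names.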
30. Persistence: Volume
A Kubernetes Volume is temporary data storage that lives as long as the pod is
alive
It persists through container restarts, but a pod restart will erase the
information
It is meant to be shared between different containers of the same Pod
As we take the approach of having just one container for each Pod, this
kind of volume is of little use to us
31. Persistence: Persistent Volume
When containers need to store information in a persistent way, we use
Persistent Volumes
A Persistent Volume is a piece of networked storage provisioned and made
available to the cluster by an administrator
It is not meant to be created during a normal Kubernetes workflow
It is an abstraction of hardware resources (disk storage) so Pods can use it
without knowing what underlying technology provides the storage
32. Persistence: Persistent Volume Claim
When a user of the cluster wants to request storage for their Pods, they create
a Persistent Volume Claim
The Claim object will automatically search the pooled and unused Persistent
Volumes to find one that matches the request
Once a Persistent Volume has been claimed, its ownership cannot be
changed until the Claim is removed from the cluster
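The matching of a Claim against the pool of Persistent Volumes can be modelled as below. This is a simplified sketch under stated assumptions, not the real binding logic: it only compares capacity and one access mode, and it picks the smallest volume that fits. The `PersistentVolume` class, the `bind` function and the volume names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PersistentVolume:
    name: str
    capacity_gib: int
    access_modes: Tuple[str, ...]
    claimed_by: Optional[str] = None


def bind(claim_name, request_gib, access_mode, volumes):
    """Bind the claim to the smallest unclaimed volume that satisfies it."""
    candidates = [
        pv for pv in volumes
        if pv.claimed_by is None
        and pv.capacity_gib >= request_gib
        and access_mode in pv.access_modes
    ]
    if not candidates:
        return None  # the claim would stay pending
    chosen = min(candidates, key=lambda pv: pv.capacity_gib)
    chosen.claimed_by = claim_name  # ownership is fixed until the claim is removed
    return chosen.name


volumes = [
    PersistentVolume("pv-1", 1, ("ReadWriteOnce",)),
    PersistentVolume("pv-10", 10, ("ReadWriteOnce",)),
    PersistentVolume("pv-100", 100, ("ReadWriteOnce", "ReadOnlyMany")),
]
first = bind("data-a", 5, "ReadWriteOnce", volumes)
second = bind("data-b", 5, "ReadWriteOnce", volumes)
third = bind("data-c", 5, "ReadWriteOnce", volumes)
print(first, second, third)
```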
33. Persistence: Storage Class
Persistent Volumes can be dynamically provisioned using Storage Classes
Each Storage Class is unique for a kind of storage. The key is that the
platform Kubernetes resides in has an API for storage provisioning
All the major IaaS providers have Storage Classes already available:
Amazon EBS, Google Cloud Disk, Azure Disk and OpenStack Cinder are
among the supported types
38. Advanced: Secret
It is meant to hold sensitive information, such as passwords (by default the
values are base64-encoded, not encrypted)
Putting secret info in a Secret is safer than putting it verbatim in a Pod
definition or a Docker image
Secrets are used by Pods by mounting them in a container Volume
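A Secret definition stores its values in a `data` field as base64 strings. Base64 is an encoding rather than encryption, so anyone who can read the object can decode it. The sketch below builds such a definition. The `make_secret` helper, the Secret name and the password value are made up for the example.

```python
import base64
import json


def make_secret(name, values):
    """Build an Opaque Secret definition with base64-encoded values."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {
            key: base64.b64encode(value.encode("utf-8")).decode("ascii")
            for key, value in values.items()
        },
    }


secret = make_secret("db-credentials", {"password": "s3cr3t"})
print(json.dumps(secret, indent=2))

# Decoding needs no key at all, which is why this is not encryption.
print(base64.b64decode(secret["data"]["password"]).decode("utf-8"))
```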
39. Advanced: ConfigMap
It is a standard way of storing generic configuration as a Kubernetes object
It is very similar to a Secret, but meant for strings that do not contain
sensitive information
It can be thought of as a HashMap for Kubernetes.
40. Advanced: Horizontal Pod Autoscaler
It can automatically scale the number of Pods in a ReplicationController,
Deployment or ReplicaSet based on observed CPU utilization
The user defines an autoscaling rule referencing CPU: Scale when the Pod
is at 80% CPU for 2 minutes with an upper limit of 10 replicas
Then, the autoscaler polls the CPU metric and scales up or down based on
that rule
Its functionality is very limited
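The scaling rule described in the Kubernetes documentation is proportional: the desired replica count is the current count multiplied by the ratio of observed to target utilization, rounded up. A minimal sketch of that formula follows. It ignores the time window mentioned above, and the function name and default bounds are assumptions for the example.

```python
import math


def desired_replicas(current_replicas, current_cpu_percent, target_cpu_percent,
                     min_replicas=1, max_replicas=10):
    """Scale proportionally to the ratio of observed to target CPU usage."""
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    # Keep the result inside the user-defined bounds.
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(4, 120, 80))  # overloaded: scale up
print(desired_replicas(8, 150, 80))  # capped by max_replicas
print(desired_replicas(3, 20, 80))   # mostly idle: scale down
```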
41. Advanced: Resource Limits
When created without limits, a container inside a Pod can potentially
demand all the node’s resources
As not all the containers peak at the same time, this behaviour is sometimes
wonderful, as it cuts down infrastructure costs
But for the moments when we need hard limits, we can set limits on either a
Pod or a Namespace
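Both kinds of limit are declared as part of object definitions. The sketch below shows a container with requests and limits, and a ResourceQuota that caps a whole Namespace. All names and quantities (`team-a`, `500m`, `8Gi` and so on) are placeholder values for illustration.

```python
import json

# Per-container limits, declared inside a Pod's container list.
limited_container = {
    "name": "web",
    "image": "nginx:1.25",  # placeholder image
    "resources": {
        "requests": {"cpu": "250m", "memory": "64Mi"},
        "limits": {"cpu": "500m", "memory": "128Mi"},
    },
}

# Namespace-wide cap on the sum of all limits and on the number of Pods.
namespace_quota = {
    "apiVersion": "v1",
    "kind": "ResourceQuota",
    "metadata": {"name": "team-quota", "namespace": "team-a"},
    "spec": {
        "hard": {"limits.cpu": "4", "limits.memory": "8Gi", "pods": "20"}
    },
}

print(json.dumps(limited_container, indent=2))
print(json.dumps(namespace_quota, indent=2))
```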
42. Advanced: REST API
As stated before, the only interface the Kubernetes components expose to
the world and to each other is an HTTPS one
Thus, everything can be achieved by directly accessing the REST API exposed
by the apiserver
An extensive API documentation can be found in the Kubernetes
documentation page
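The REST paths follow a regular pattern: core objects live under `/api/v1`, while objects from named API groups live under `/apis/<group>/<version>`. The helper below sketches how such paths are composed. `resource_path` is a hypothetical function, and it only covers namespaced resources.

```python
def resource_path(resource, namespace, name=None, group=None, version="v1"):
    """Build the apiserver path of a namespaced resource or collection."""
    prefix = f"/apis/{group}/{version}" if group else f"/api/{version}"
    path = f"{prefix}/namespaces/{namespace}/{resource}"
    return f"{path}/{name}" if name else path


print(resource_path("pods", "default"))
print(resource_path("services", "default", name="hello-web"))
print(resource_path("deployments", "default", name="hello-web", group="apps"))
```

A GET on the first path lists the Pods of the Namespace, and a GET on the others returns a single object.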
43. Advanced: Downward API
Allows containers to consume information about themselves or the system
and expose that information how they want it, without necessarily coupling to
the Kubernetes client or REST API
It is a way to declaratively use the Kubernetes API while writing YAML files
Examples of common information retrieved with Downward API are the
Pod’s IP or its memory and CPU limits
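The examples just mentioned could be declared as environment variables in a container definition, roughly as follows. The sketch is written as a Python dictionary printed as JSON. The variable names `MY_POD_IP`, `MY_CPU_LIMIT` and `MY_MEMORY_LIMIT`, as well as the Pod name and image, are arbitrary choices for the example.

```python
import json

container = {
    "name": "web",
    "image": "nginx:1.25",  # placeholder image
    "resources": {"limits": {"cpu": "500m", "memory": "128Mi"}},
    "env": [
        {
            # Filled in from the Pod's own status at start-up.
            "name": "MY_POD_IP",
            "valueFrom": {"fieldRef": {"fieldPath": "status.podIP"}},
        },
        {
            "name": "MY_CPU_LIMIT",
            "valueFrom": {
                "resourceFieldRef": {
                    "containerName": "web",
                    "resource": "limits.cpu",
                }
            },
        },
        {
            "name": "MY_MEMORY_LIMIT",
            "valueFrom": {
                "resourceFieldRef": {
                    "containerName": "web",
                    "resource": "limits.memory",
                }
            },
        },
    ],
}

downward_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "downward-demo"},
    "spec": {"containers": [container]},
}

print(json.dumps(downward_pod, indent=2))
```

The application then reads plain environment variables and never has to call the Kubernetes API itself.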