2. Introduction: Seiichiro Inoue
• CTO, Ariel Networks, Inc. (until June 2016)
• Executive Fellow, Works Applications Co., Ltd.
• Author of:
  • “P2P Textbook”
  • “Perfect Java”
  • “Perfect JavaScript”
  • “Practical JS: Introduction to Server Side JavaScript”
  • “Perfect Java EE” (to be published in August 2016)
3. Goal of Today's Session
• To demystify Kubernetes, which looks complicated.
• The explanation is based on Kubernetes version 1.2.6.
4. To Ease Complicated Kubernetes...
• Kubernetes has many specific concepts and jargon terms.
• I simplify them here for explanation purposes.
• Also, I build up its concepts from the bottom to the top.
5. Required Knowledge to Understand Kubernetes (from My Viewpoint)
To be explained in this order:
1. Understand Docker
2. Understand the Docker network
3. Understand flanneld
4. Understand the relationship between container and pod
5. Understand the relationship between pod and service
6. Understand the Kubernetes network (DNS and routing)
7. Understand the Kubernetes tools
6. Simplification Regarding Container
• Theoretically, a container is equivalent to one OS, so one container can run many processes, e.g. a load balancer, an application server, and a database.
• However, Kubernetes has the philosophy of limiting the number of processes in one container to the minimum and managing containers instead.
• In this explanation, I assume that only one process runs in one container, though this is not a requirement of Kubernetes.
7. Before Breaking Down How Kubernetes Works
• What on earth does Kubernetes do?
• What are the benefits of using Kubernetes?
8. What Kubernetes Does
• Deploys containers to multiple hosts.
  • Conceals which container (process) is deployed to which host.
• Manages the network among containers (including name resolution).
  • A feature equivalent to service discovery.
• Monitors whether containers are dead or alive.
  • Automatically starts a new container when a container (process) dies.
• Balances load across containers.
  • A feature to balance accesses across multiple containers with the same function (not so rich, though).
• Allocates resources to containers.
  • A feature to allocate CPU and memory resources to each container (not so rich, though).
9. Without Kubernetes
[Diagram: the developer deploys process A and many instances of process B, plus their dependent execution environment, to hosts by hand, and must configure each endpoint on the load balancers in front of process A and the process B group individually.]
10. With Kubernetes
[Diagram: the developer hands process A, process B, and their dependent execution environment to Kubernetes, which deploys them. A service name is defined for the group of process Bs, and process A is given that service name instead of individual endpoints.]
12. Host Machine
• A “host machine” is an OS on which Docker processes run.
  • Multiple containers run on one host.
• Kubernetes does not care whether the host machine is physical or virtual. You do not have to care, either.
• Similarly, which network the host machine is on (private or with a global IP) is also out of scope for Kubernetes. You do not have to care about that, either.
13. Docker Network-Related Topics and flanneld
• The first thing you may get confused about in Kubernetes is the Docker network-related part.
• Here, I separate the flanneld and Kubernetes topics to avoid confusion.
• First, I talk about flanneld only.
14. Role of flanneld (1)
• Without flanneld, a container running on one host cannot access the IP address of a container on another host.
  • Strictly speaking, with some configuration it is possible to access a remote container via the IP address of the other host.
  • However, that is basically cumbersome.
15. Role of flanneld (2)
• flanneld is a daemon process running on each host.
• With flanneld, containers on a group of hosts can access one another using their own IP addresses.
  • Each container gets a unique IP address.
  • This appears to require coordination among hosts. However, it is actually simple; the flanneld processes merely share a routing table in the same data store (etcd).
• In this area, there are also other technologies with similar functions, such as Docker Swarm.
16. Overview of Flow to Run flanneld (1)
1. Install Docker itself, as it is the Docker network that flanneld supports.
2. Start etcd somewhere, as it is required as the shared data store. etcd itself is a distributed KVS, but a standalone instance is still enough for an operation check.
3. Register the network address flanneld uses onto etcd. The user can choose which network address to use.
Example:
$ etcdctl set /coreos.com/network/config '{ "Network": "10.1.0.0/16" }'
17. Overview of Flow to Run flanneld (2)
4. Start the flanneld daemon process on each host.
Example:
If etcd is running on the same host, simply:
$ sudo bin/flanneld
If etcd is running on another host (IP: 10.140.0.14):
$ sudo bin/flanneld -etcd-endpoints 'http://10.140.0.14:4001,http://10.140.0.14:2379'
18. Overview of Flow to Run flanneld (3)
5. Each flanneld writes information about which subnet it keeps onto /run/flannel/subnet.env.
Example:
$ cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.1.0.0/16
FLANNEL_SUBNET=10.1.19.1/24
FLANNEL_MTU=1432
FLANNEL_IPMASQ=true
6. Start the Docker daemon process so that it uses the subnet above.
$ source /run/flannel/subnet.env
$ sudo docker -d --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU}
19. Operation Check of flanneld (1)
Confirm the network addresses of docker0 and flannel0 with ifconfig. The output example below is an excerpt. In this example, containers on this host form the network 10.1.19.0/24.
$ ifconfig
docker0 Link encap:Ethernet HWaddr 02:42:a5:18:b3:73
  inet addr:10.1.19.1 Bcast:0.0.0.0 Mask:255.255.255.0
flannel0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
  inet addr:10.1.19.0 P-t-P:10.1.19.0 Mask:255.255.0.0
20. Operation Check of flanneld (2)
Check the routing table:
$ netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 10.140.0.1 0.0.0.0 UG 0 0 0 ens4
10.1.0.0 0.0.0.0 255.255.0.0 U 0 0 0 flannel0
10.1.19.0 0.0.0.0 255.255.255.0 U 0 0 0 docker0
10.140.0.1 0.0.0.0 255.255.255.255 UH 0 0 0 ens4
It is okay if this host can connect to the IP address of a container on another host, e.g. 10.1.79.2.
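For example, a quick reachability check (a sketch, assuming a container with IP 10.1.79.2 is actually running on the other host):
$ ping -c 3 10.1.79.2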
21. Automatic Startup of flanneld When OS Starts Up (1)
• Quite cumbersome...
• First, the flanneld process must be started before the Docker daemon process starts.
• Besides, when the flanneld process starts, the fixed address of etcd must be given to it externally.
• Furthermore, when the Docker daemon process starts, the contents of /run/flannel/subnet.env, which flanneld has written, must be passed to it.
22. Automatic Startup of flanneld When OS Starts Up (2)
If you are on systemd, in the following file:
$ sudo vi /lib/systemd/system/docker.service
add the following:
EnvironmentFile=-/run/flannel/subnet.env
Here, the hyphen at the beginning is an option to ignore the file if it does not exist.
Then, modify as follows:
# Before: ExecStart=/usr/bin/docker daemon -H fd:// $DOCKER_OPTS
ExecStart=/usr/bin/docker daemon -H fd:// $DOCKER_OPTS --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU}
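On the flanneld side, the start ordering can be expressed with a unit file along these lines (a sketch; the install path and the etcd endpoint are assumptions, not from this talk):
# /lib/systemd/system/flanneld.service
[Unit]
Description=flanneld overlay network daemon
After=network-online.target
Before=docker.service

[Service]
ExecStart=/usr/local/bin/flanneld -etcd-endpoints=http://10.140.0.14:4001
Restart=on-failure

[Install]
WantedBy=multi-user.target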
23. Summary up to Here
• This is the minimum understanding of flanneld needed to understand Kubernetes.
• With flanneld, each container under Kubernetes' management has a unique IP address and becomes able to reach the others.
• The flanneld network is private. Thus, it cannot be reached from outside Kubernetes.
  • This network is different from the host IP address network.
  • It is also different from the Kubernetes service IP address network, which is explained later (this is confusing).
25. Host Machine and Node (1)
• “Node” is Kubernetes-specific jargon.
• If confused, you may think “host machine = node”. That is not that wrong.
• Strictly speaking, there are two types of nodes: “master node” and “worker node”.
26. Host Machine and Node (2)
• A worker node is a host on which the Docker daemon process runs. Containers under Kubernetes' management run on it.
  • For this one, there is no problem in understanding it as “worker node = host”.
• A master node is a group of some Kubernetes server processes.
  • Those processes do not have to run in containers.
  • The jargon “master node” may be somewhat misleading; “master processes” is more appropriate.
# A worker node is called a “minion” in old documents.
27. Container and pod
• A pod is a Kubernetes-specific concept that groups containers (a minimal example manifest follows below).
• Containers in one pod instance run on the same host. Also, in terms of the Docker network, they share the same IP address.
• Group tightly coupled processes, i.e. processes that need to die at the same time, into one pod.
• However, for today's explanation, I assume a model that has only one container in one pod.
• As I have already explained, I assume the model in which one process corresponds to one container. Thus, in today's explanation, one pod corresponds to one process.
  • There is a special container called pause in each pod, but I omit the explanation here, as it is not essential.
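For reference, a minimal single-container pod manifest (a sketch; this file is not part of the talk's flow, and it borrows the sample image used later):
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: mynode
    image: guest/mynode
    ports:
    - containerPort: 8888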
28. pod and Replication Controller (rc)
• A replication controller (rc) is a Kubernetes-specific concept to make a pod multi-instance.
• In actual operation, Kubernetes does not bring many benefits when pods are used as single instances. Thus, it is normal to configure an rc and specify the required number of replicas.
• An rc makes as many pod instances as the specified number of replicas. In this context, the number of processes to be started is the number of replicas.
• On which host each process starts is decided at execution time; Kubernetes finds an available host.
• An rc keeps the number of pods at the specified replica count. In other words, if one pod (= container = process) dies, it starts a new pod automatically.
29. fyi: rc, Replica Set, and Deployment
• In Kubernetes v1.3 and later, it seems that rc is to be replaced with new concepts, replica set and Deployment (a sketch follows below).
• Today, I explain with rc.
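For reference, a Deployment roughly equivalent to the rc used later would look like this (a sketch against the v1.3-era extensions/v1beta1 API; it is not used in this talk):
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-node
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: sample
    spec:
      containers:
      - name: mynode
        image: guest/mynode
        ports:
        - containerPort: 8888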
30. pod and Service
• With the rc functions, a pod has multiple instances.
  • This time, think of it as multiple processes started from the same program.
  • Besides, on which host these processes run is decided when they are executed. Thus, the IP address of each process is decided only then.
• Service is a feature of Kubernetes that allocates a single IP address to these multiple instances.
• As a result, access to the IP address of the service is distributed to the multiple pods behind it, which makes it work like a load balancer.
• Internally, a process named kube-proxy running on each worker node adds entries to iptables, which implements the service IP address.
  • The service IP address is one that cannot be seen with ifconfig, etc. (explained later).
31. Name Resolution and SkyDNS
• DNS is used to resolve the name of each Kubernetes service IP address.
• A DNS implementation, SkyDNS, is (actually) built into Kubernetes.
• When a new service is started, an entry with the service name and IP address is automatically registered to SkyDNS by a process named kube2dns.
• A process in each pod can access a service if it knows the applicable service name (a feature equivalent to so-called service discovery).
• It is the application developers' responsibility to consider how they name services and how they give service names to applications.
33. kubectl
• Command line tool to manage Kubernetes.
• I will introduce its use cases later.
34. etcd
• A distributed KVS.
• The data store used by the Kubernetes master processes.
• SkyDNS and flanneld also use etcd as their data store.
• In this explanation, it is enough to regard etcd as a data store somewhere.
  • etcd does not have to be in a container. It does not have to run on a master node or a worker node, either.
  • In today's explanation, I start etcd on the master node in a container, just for convenience.
35. etcdctl
• Command line tool to manage etcd.
• I will introduce its use cases later.
36. hyperkube
• A program that selects among the various Kubernetes processes by the 1st argument of the command line.
• For example, if you type hyperkube kubelet, it starts the kubelet process.
• It is not required to use this tool, but it is easy to use, so I use hyperkube this time.
37. Kubernetes Master Processes
• The master processes include apiserver, controller-manager, and scheduler.
• Those may change in the future when Kubernetes upgrades. So, for now it is enough to understand that these processes exist.
• The only process to consider here is apiserver.
  • kubectl is a program that calls the apiserver REST API (see the example below). It is necessary to give the apiserver address as an argument (it can be omitted if apiserver runs on the same host).
  • apiserver uses etcd as its data store. Thus, etcd must be started before apiserver starts. Also, apiserver must know the etcd address when it starts.
  • The other Kubernetes processes must know the apiserver process address when they start.
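For example, the information kubectl shows can also be fetched directly from the apiserver REST API (a sketch, assuming the apiserver address 10.140.0.14:8080 used later):
$ curl http://10.140.0.14:8080/api/v1/nodes
$ curl http://10.140.0.14:8080/api/v1/namespaces/default/pods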
38. Processes Run on Each Worker Node
• docker daemon process: containers do not run without it. Needless to say, it is necessary.
• kubelet: the process that lets a worker node really be a worker node. It starts pods (= containers), etc.
• kube-proxy: manages service IP addresses (internally operates iptables).
• flanneld: connects containers on different hosts to one another (already explained).
41. Sample Configuration
• We use two hosts here.
• The host OS is Ubuntu 16.04, but I will explain in a way that depends on the distribution as little as possible.
• Both hosts are worker nodes. Besides, the master processes run on one of those hosts, in containers.
• The worker-node processes (kubelet and kube-proxy) also run in containers.
• It is not required to run those processes in containers. It would be simpler if they became installable with apt-get in the future.
42. Overview of Flow
1. Preparation
2. On the host to be the master node
3. On the host to be the worker node
4. Serve our own application
43. Preparation
Install the kubectl command.
Basically, you can install the kubectl command onto any machine, as long as it can reach the master process IP address.
$ export K8S_VERSION=1.2.6
$ curl http://storage.googleapis.com/kubernetes-release/release/v${K8S_VERSION}/bin/linux/amd64/kubectl > kubectl
$ sudo mv kubectl /usr/local/bin/
$ sudo chmod +x /usr/local/bin/kubectl
If it is installed onto a host other than the master process host, you have to specify the IP address and port of the master process host (10.140.0.14:8080 here) in the -s option of the kubectl command, or configure them in a kubeconfig file.
$ kubectl -s 10.140.0.14:8080 cluster-info
44. Host to be Master Node
• As I have previously explained, the term “master node” is somewhat misleading.
• A “master node” is merely the node that starts some master processes.
• This time, this host is also a worker node.
45. On Host to be Master Node (1)
Install Docker itself:
$ sudo apt-get update; sudo apt-get -y upgrade; sudo apt-get -y install docker.io
Confirm that no etcd process is running on the host, as it becomes an obstacle if one is.
Stop it if it is running:
$ sudo systemctl stop etcd
$ sudo systemctl disable etcd
46. On Host to be Master Node (2)
Set environment variables (for convenience):
$ export MASTER_IP=10.140.0.14 # Host IP address: confirm with ifconfig.
$ export K8S_VERSION=1.2.6
$ export ETCD_VERSION=2.2.5
$ export FLANNEL_VERSION=0.5.5
$ export FLANNEL_IFACE=ens4 # Confirm with ifconfig.
$ export FLANNEL_IPMASQ=true
47. On Host to be Master Node (3)
• Run etcd first, then run flanneld, due to the dependency.
• Though it is not necessary to run them in containers, run both in containers here for convenience.
• Run a Docker daemon process dedicated to flanneld and etcd. It is a little bit tricky, though.
• The variable item in this flow is the network address used by flanneld ("10.1.0.0/16"). You can decide it as you like.
48. On Host to be Master Node (4)
Start the dedicated Docker daemon process.
$ sudo sh -c 'docker daemon -H unix:///var/run/docker-bootstrap.sock -p /var/run/docker-bootstrap.pid --iptables=false --ip-masq=false --bridge=none --graph=/var/lib/docker-bootstrap 2> /var/log/docker-bootstrap.log 1> /dev/null &'
49. On Host to be Master Node (5)
Start the etcd process (in a container).
$ sudo docker -H unix:///var/run/docker-bootstrap.sock run -d --net=host \
    gcr.io/google_containers/etcd-amd64:${ETCD_VERSION} \
    /usr/local/bin/etcd \
    --listen-client-urls=http://127.0.0.1:4001,http://${MASTER_IP}:4001 \
    --advertise-client-urls=http://${MASTER_IP}:4001 \
    --data-dir=/var/etcd/data
50. On Host to be Master Node (6)
Import the initial data into etcd.
$ sudo docker -H unix:///var/run/docker-bootstrap.sock run \
    --net=host \
    gcr.io/google_containers/etcd-amd64:${ETCD_VERSION} \
    etcdctl set /coreos.com/network/config '{ "Network": "10.1.0.0/16" }'
51. On Host to be Master Node (7)
Pause the normal Docker daemon.
$ sudo systemctl stop docker
Start the flanneld process (in a container).
$ sudo docker -H unix:///var/run/docker-bootstrap.sock run -d \
    --net=host --privileged \
    -v /dev/net:/dev/net \
    quay.io/coreos/flannel:${FLANNEL_VERSION} \
    /opt/bin/flanneld \
    --ip-masq=${FLANNEL_IPMASQ} \
    --iface=${FLANNEL_IFACE}
52. On Host to be Master Node (8)
Read the flanneld subnet network address in order to tell it to the normal Docker daemon process.
$ sudo docker -H unix:///var/run/docker-bootstrap.sock exec [container ID printed when the flanneld container started] cat /run/flannel/subnet.env
Input example:
$ sudo docker -H unix:///var/run/docker-bootstrap.sock exec 195ea9f70770ac20a3f04e02c240fb24a74e1d08ef749f162beab5ee8c905734 cat /run/flannel/subnet.env
Output example:
FLANNEL_NETWORK=10.1.0.0/16
FLANNEL_SUBNET=10.1.19.1/24
FLANNEL_MTU=1432
FLANNEL_IPMASQ=true
53. On Host to be Master Node (9)
In:
$ sudo vi /lib/systemd/system/docker.service
rewrite as follows:
ExecStart=/usr/bin/docker daemon -H fd:// $DOCKER_OPTS --bip=10.1.19.1/24 --mtu=1432
54. On Host to be Master Node (10)
Restart the Docker daemon process.
$ sudo /sbin/ifconfig docker0 down
$ sudo brctl delbr docker0
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
55. On Host to be Master Node (11)
Check before restarting Docker (excerpt):
$ ifconfig
docker0 Link encap:Ethernet HWaddr 02:42:25:65:c5:f3
  inet addr:10.1.20.1 Bcast:0.0.0.0 Mask:255.255.255.0
Check after restarting Docker (excerpt):
$ ifconfig
docker0 Link encap:Ethernet HWaddr 02:42:a5:18:b3:73
  inet addr:10.1.19.1 Bcast:0.0.0.0 Mask:255.255.255.0
flannel0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
  inet addr:10.1.19.0 P-t-P:10.1.19.0 Mask:255.255.0.0
56. On Host to be Master Node (12)
Check the routing table after restarting Docker.
$ netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 10.140.0.1 0.0.0.0 UG 0 0 0 ens4
10.1.0.0 0.0.0.0 255.255.0.0 U 0 0 0 flannel0
10.1.19.0 0.0.0.0 255.255.255.0 U 0 0 0 docker0
10.140.0.1 0.0.0.0 255.255.255.255 UH 0 0 0 ens4
57. On Host to be Master Node (13)
Start the master processes and the processes required for a worker node (e.g. kubelet and kube-proxy) with hyperkube.
$ sudo docker run \
    --volume=/:/rootfs:ro --volume=/sys:/sys:ro \
    --volume=/var/lib/docker/:/var/lib/docker:rw \
    --volume=/var/lib/kubelet/:/var/lib/kubelet:rw \
    --volume=/var/run:/var/run:rw \
    --net=host --privileged=true --pid=host -d \
    gcr.io/google_containers/hyperkube-amd64:v${K8S_VERSION} \
    /hyperkube kubelet --allow-privileged=true \
    --api-servers=http://localhost:8080 \
    --v=2 --address=0.0.0.0 --enable-server \
    --hostname-override=127.0.0.1 \
    --config=/etc/kubernetes/manifests-multi --containerized \
    --cluster-dns=10.0.0.10 --cluster-domain=cluster.local
58. On Host to be Master Node (14)
Operation check of the Kubernetes master processes:
$ kubectl cluster-info
Kubernetes master is running at http://localhost:8080
59. On Host to be Master Node (15)
Run SkyDNS as a pod.
$ curl http://kubernetes.io/docs/getting-started-guides/docker-multinode/skydns.yaml.in > skydns.yaml.in
$ export DNS_REPLICAS=1
$ export DNS_DOMAIN=cluster.local # Domain name; you can decide it as you like.
$ export DNS_SERVER_IP=10.0.0.10 # DNS server IP address (an IP address as a Kubernetes service); you can decide it as you like.
$ sed -e "s/{{ pillar['dns_replicas'] }}/${DNS_REPLICAS}/g;s/{{ pillar['dns_domain'] }}/${DNS_DOMAIN}/g;s/{{ pillar['dns_server'] }}/${DNS_SERVER_IP}/g" skydns.yaml.in > ./skydns.yaml
60. On Host to be Master Node (16)
Create the rc and the service (skydns.yaml contains both an rc and a service).
$ kubectl create -f ./skydns.yaml
61. SkyDNS Operation Check (1)
$ kubectl cluster-info
Kubernetes master is running at http://localhost:8080
KubeDNS is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
Confirm (this particular check fails; the cause is unknown):
$ curl http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "no endpoints available for service \"kube-dns\"",
  "reason": "ServiceUnavailable",
  "code": 503
}
62. SkyDNS Operation Check (2)
$ kubectl get --all-namespaces svc
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes 10.0.0.1 <none> 443/TCP 2m
kube-system kube-dns 10.0.0.10 <none> 53/UDP,53/TCP 1m
$ kubectl get --all-namespaces ep
NAMESPACE NAME ENDPOINTS AGE
default kubernetes 10.140.0.14:6443 2m
kube-system kube-dns 10.1.19.2:53,10.1.19.2:53 1m
63. SkyDNS Operation Check (3)
$ dig @10.0.0.10 cluster.local.
(excerpt)
;; ANSWER SECTION:
cluster.local. 30 IN A 10.1.19.2
cluster.local. 30 IN A 127.0.0.1
cluster.local. 30 IN A 10.0.0.10
cluster.local. 30 IN A 10.0.0.1
The following also works, but do not depend on it, because this IP address may vary by situation.
$ dig @10.1.19.2 cluster.local.
64. Summary up to Here
• Configured flanneld on the first host.
• Started the master processes (e.g. apiserver) on this host.
• Made this host a worker node (i.e. started kubelet and kube-proxy).
• Started SkyDNS as a Kubernetes service (accessible at IP address 10.0.0.10).
65. On Host to be Worker Node (1)
Install Docker itself.
$ sudo apt-get update; sudo apt-get -y upgrade; sudo apt-get -y install docker.io
Set environment variables (for convenience).
$ export MASTER_IP=10.140.0.14 # IP address of the master node host.
$ export K8S_VERSION=1.2.6
$ export FLANNEL_VERSION=0.5.5
$ export FLANNEL_IFACE=ens4 # Check with ifconfig.
$ export FLANNEL_IPMASQ=true
66. On Host to be Worker Node (2)
Start flanneld in a container in the same flow as on the master node, but have it refer to the etcd on the master node.
Start the dedicated Docker daemon process.
$ sudo sh -c 'docker daemon -H unix:///var/run/docker-bootstrap.sock -p /var/run/docker-bootstrap.pid --iptables=false --ip-masq=false --bridge=none --graph=/var/lib/docker-bootstrap 2> /var/log/docker-bootstrap.log 1> /dev/null &'
Pause the normal Docker daemon.
$ sudo systemctl stop docker
67. On Host to be Worker Node (3)
Start the flanneld process (in a container).
$ sudo docker -H unix:///var/run/docker-bootstrap.sock run -d \
    --net=host --privileged -v /dev/net:/dev/net \
    quay.io/coreos/flannel:${FLANNEL_VERSION} \
    /opt/bin/flanneld \
    --ip-masq=${FLANNEL_IPMASQ} \
    --etcd-endpoints=http://${MASTER_IP}:4001 \
    --iface=${FLANNEL_IFACE}
68. On Host to be Worker Node (4)
Read the flanneld subnet network address in order to tell it to the normal Docker daemon process.
$ sudo docker -H unix:///var/run/docker-bootstrap.sock exec [container ID of the flanneld container] cat /run/flannel/subnet.env
Output example:
FLANNEL_NETWORK=10.1.0.0/16
FLANNEL_SUBNET=10.1.79.1/24
FLANNEL_MTU=1432
FLANNEL_IPMASQ=true
In:
$ sudo vi /lib/systemd/system/docker.service
rewrite as follows:
ExecStart=/usr/bin/docker daemon -H fd:// $DOCKER_OPTS --bip=10.1.79.1/24 --mtu=1432
69. On Host to be Worker Node (5)
Restart the Docker daemon process.
$ sudo /sbin/ifconfig docker0 down
$ sudo brctl delbr docker0
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
70. On Host to be Worker Node (6)
Confirm with ifconfig and netstat -rn.
Check the routing table after restarting.
$ netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 10.140.0.1 0.0.0.0 UG 0 0 0 ens4
10.1.0.0 0.0.0.0 255.255.0.0 U 0 0 0 flannel0
10.1.79.0 0.0.0.0 255.255.255.0 U 0 0 0 docker0
10.140.0.1 0.0.0.0 255.255.255.255 UH 0 0 0 ens4
71. On Host to be Worker Node (7)
Start a process required for a worker node (kubelet) with hyperkube.
$ sudo docker run \
    --volume=/:/rootfs:ro --volume=/sys:/sys:ro \
    --volume=/dev:/dev \
    --volume=/var/lib/docker/:/var/lib/docker:rw \
    --volume=/var/lib/kubelet/:/var/lib/kubelet:rw \
    --volume=/var/run:/var/run:rw \
    --net=host --privileged=true --pid=host -d \
    gcr.io/google_containers/hyperkube-amd64:v${K8S_VERSION} \
    /hyperkube kubelet \
    --allow-privileged=true --api-servers=http://${MASTER_IP}:8080 \
    --v=2 --address=0.0.0.0 --enable-server --containerized \
    --cluster-dns=10.0.0.10 --cluster-domain=cluster.local
72. On Host to be Worker Node (8)
Start another process required for a worker node (kube-proxy) with hyperkube.
$ sudo docker run -d --net=host --privileged \
    gcr.io/google_containers/hyperkube-amd64:v${K8S_VERSION} \
    /hyperkube proxy \
    --master=http://${MASTER_IP}:8080 --v=2
73. Summary up to Here
• Configured flanneld on the second host.
• Made this host a worker node (i.e. started kubelet and kube-proxy).
74. Basic Operation Check for Kubernetes (1)
$ kubectl -s 10.140.0.14:8080 cluster-info
Kubernetes master is running at 10.140.0.14:8080
KubeDNS is running at 10.140.0.14:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
Check services:
$ kubectl -s 10.140.0.14:8080 get --all-namespaces svc
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes 10.0.0.1 <none> 443/TCP 47m
kube-system kube-dns 10.0.0.10 <none> 53/UDP,53/TCP 45m
75. Basic Operation Check for Kubernetes (2)
Check nodes:
$ kubectl get nodes
NAME STATUS AGE
127.0.0.1 Ready 52m
ubuntu16k5 Ready 16m
76. Run Our Own Node.js Application as a Service on Kubernetes
77. Prepare Docker Image for Sample Application (1)
server.js =>
var http = require('http');
var handleRequest = function(request, response) {
  response.writeHead(200);
  response.end("Hello World");
};
var www = http.createServer(handleRequest);
www.listen(8888);
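The Dockerfile itself is not included in this extract; a minimal sketch that would match the image tag mynode:latest and the container command seen later at the Docker level (/bin/sh -c 'node server.js') could look like this (the node:4 base image is an assumption):
FROM node:4
COPY server.js /server.js
WORKDIR /
EXPOSE 8888
CMD node server.js
Build it with:
$ docker build -t mynode:latest .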
79. Register Docker Image to Registry
This Docker image needs to be registered to a registry so that multiple hosts can pull it.
It would be best to run our own Docker registry, but we use DockerHub as an alternative this time.
$ docker login
$ docker tag mynode:latest guest/mynode # Replace "guest" with your DockerHub login ID.
$ docker push guest/mynode
80. Configuration File for rc
mynode.yaml =>
apiVersion: v1
kind: ReplicationController  # Configuration file for a ReplicationController (rc)
metadata:
  name: my-node              # Identification name for this rc (named by developer)
spec:
  replicas: 2                # Number of replicas
  template:
    metadata:
      labels:
        app: sample          # Labels (both key and value are named by developer)
    spec:
      containers:
      - name: mynode
        image: guest/mynode  # Docker image (on DockerHub)
        ports:
        - containerPort: 8888
81. Configuration File for Service (for Selector's Reference to mynode.yaml)
mynode-svc.yaml =>
apiVersion: v1
kind: Service      # Configuration file for a service (svc)
metadata:
  name: frontend   # Identification name for this service (named by developer)
  labels:
    app: sample    # Label (named by developer)
spec:
  ports:
  - port: 8888
  selector:
    app: sample    # Selects the rc (and pods) by labels
83. Start rc (= Start pod Implicitly)
First, start the rc (= start pods implicitly).
$ kubectl create -f mynode.yaml
84. Check pod and rc
Check pods:
$ kubectl get --all-namespaces po
NAMESPACE NAME READY STATUS RESTARTS AGE
default k8s-master-127.0.0.1 4/4 Running 0 50m
default k8s-proxy-127.0.0.1 1/1 Running 0 50m
default my-node-ejvv9 1/1 Running 0 10s
default my-node-lm62r 1/1 Running 0 10s
kube-system kube-dns-v10-suqsw 4/4 Running 0 48m
Check the rc:
$ kubectl get --all-namespaces rc
NAMESPACE NAME DESIRED CURRENT AGE
default my-node 2 2 36s
kube-system kube-dns-v10 1 1 49m
85. Check Service and End Point
The service (svc) does not exist yet.
$ kubectl get --all-namespaces svc
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes 10.0.0.1 <none> 443/TCP 51m
kube-system kube-dns 10.0.0.10 <none> 53/UDP,53/TCP 49m
The end point (ep) does not exist yet, either.
$ kubectl get --all-namespaces ep
NAMESPACE NAME ENDPOINTS AGE
default kubernetes 10.140.0.14:6443 51m
kube-system kube-dns 10.1.19.2:53,10.1.19.2:53 49m
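The service is then created from mynode-svc.yaml in the same way as the rc (the create step itself is not shown in this material, but it follows the same kubectl pattern):
$ kubectl create -f mynode-svc.yaml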
92. Check End Point
Check the end point (= container IP addresses).
$ kubectl describe ep frontend
Name: frontend
Namespace: default
Labels: app=sample
Subsets:
Addresses: 10.1.19.3,10.1.79.2
NotReadyAddresses: <none>
Ports:
Name Port Protocol
---- ---- --------
<unset> 8888 TCP
No events.
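The rc is then scaled out to 5 replicas before the next checks (the scaling step itself is not shown in this material; kubectl scale is the standard way, sketched here):
$ kubectl scale rc my-node --replicas=5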
96. Check Scale-Out Verification for rc (3)
$ kubectl get --all-namespaces ep
NAMESPACE NAME ENDPOINTS AGE
default frontend 10.1.19.3:8888,10.1.19.4:8888,10.1.79.2:8888 + 2 more... 5m
default kubernetes 10.140.0.14:6443 58m
kube-system kube-dns 10.1.19.2:53,10.1.19.2:53 56m
97. Check Scale-Out Verification for rc (4)
$ kubectl describe ep frontend
Name: frontend
Namespace: default
Labels: app=sample
Subsets:
Addresses: 10.1.19.3,10.1.19.4,10.1.79.2,10.1.79.3,10.1.79.4
NotReadyAddresses: <none>
Ports:
Name Port Protocol
---- ---- --------
<unset> 8888 TCP
No events.
One host has 2 pods (10.1.19.3, 10.1.19.4), and the other has 3 (10.1.79.2, 10.1.79.3, 10.1.79.4).
98. Check Scale-Out Verification for rc (5)
The appearance as a service does not change even when the number of pods increases.
$ kubectl get --all-namespaces svc
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default frontend 10.0.0.206 <none> 8888/TCP 5m
default kubernetes 10.0.0.1 <none> 443/TCP 58m
kube-system kube-dns 10.0.0.10 <none> 53/UDP,53/TCP 56m
99. Check Physical Location of Each pod (Node Line)
$ kubectl describe po my-node-3blsf
Name: my-node-3blsf
Namespace: default
Node: ubuntu16k5/10.140.0.15
Start Time: Wed, 20 Jul 2016 09:29:54 +0000
(snip)
100. Log in to Specific pod
$ kubectl exec -it my-node-3blsf sh
=> This is more convenient than logging in to the Docker-level container (docker exec -it), as it is also possible to log in to a pod on another host.
101. Check Logs in Specific pod
$ kubectl logs my-node-3blsf
The equivalent of tail -f is also possible.
$ kubectl logs -f my-node-3blsf
102. fyi, Check at Docker Level
$ sudo docker ps | grep mynode
9b8ecdb7a42f guest/mynode "/bin/sh -c 'node ser" 15 minutes ago Up 15 minutes k8s_mynode.6062cb3_my-node-a2015_default_841729f4-4e5c-11e6-930a-42010a8c000e_a8947e68
50e8cd85abec guest/mynode "/bin/sh -c 'node ser" 21 minutes ago Up 21 minutes k8s_mynode.6062cb3_my-node-lm62r_default_94564430-4e5b-11e6-930a-42010a8c000e_bcf722c3
fyi, check the details of a container at the Docker level:
$ sudo docker inspect 9b8ecdb7a42f
(snip)
103. Test to Make Specific pod Down
• For example, take a container down at the Docker level, or kill a process in a Docker container from the host OS.
• For example, test taking a specific machine (e.g. a VM instance) down.
=> Kubernetes keeps the number of replicas (an example follows below).
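A sketch of the container-level test, reusing the container ID from the Docker-level check above:
$ sudo docker kill 9b8ecdb7a42f
$ kubectl get po # a replacement pod appears shortly, keeping the replica count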
104. Summary up to Here
• Started our own application as a service.
• Made the pod (= container = process) multi-instance with an rc.
• Regardless of the number of pod instances, the service is always accessible at its IP address.
• If you would like to investigate trouble in a specific pod, you can log in directly with a shell and monitor its logs.
106. Name Resolution (DNS)
Each pod can see the others by service name, i.e.
curl http://frontend:8888
=> The IP address returned by DNS is the service IP address; it is not the Docker container's or the host's.
107. Name Resolution from Host
$ dig @10.0.0.10 frontend.default.svc.cluster.local.
(snip)
;; ANSWER SECTION:
frontend.default.svc.cluster.local. 30 IN A 10.0.0.206
=> The FQDN for DNS is frontend.default.svc.cluster.local.
Format: [service name].[namespace name].svc.cluster.local.
108. Detailed Explanation for Network
• Topic up to here:
  • We could derive the service IP address from the service name via DNS.
• Topic from here on:
  • We need to reach a pod linked to the service from the service IP address.
  • We need to reach exactly one pod, as multiple pods may be linked to the service due to the rc mechanism.
  • Strictly speaking, we need to reach the container in the pod.
109. Overview of How the Network Works
• The service IP address does not exist in the ifconfig world.
• iptables rewrites the service IP address to a pod (= container) address.
• If there are multiple pods due to the rc mechanism, iptables' load-balancing feature picks one of them.
• kube-proxy adds the service IP address entries to iptables.
• flanneld redirects packets destined for a pod (= container) address to the host IP address.
  • This routing table is in etcd.
110. Check iptables (1)
Packets for 10.0.0.206 go to one of the 5 endpoints, picked by iptables at random (when the number of replicas in the rc is 5):
$ sudo iptables-save | grep 10.0.0.206
-A KUBE-SERVICES -d 10.0.0.206/32 -p tcp -m comment --comment "default/frontend: cluster IP" -m tcp --dport 8888 -j KUBE-SVC-GYQQTB6TY565JPRW
$ sudo iptables-save | grep KUBE-SVC-GYQQTB6TY565JPRW
:KUBE-SVC-GYQQTB6TY565JPRW - [0:0]
-A KUBE-SERVICES -d 10.0.0.206/32 -p tcp -m comment --comment "default/frontend: cluster IP" -m tcp --dport 8888 -j KUBE-SVC-GYQQTB6TY565JPRW
-A KUBE-SVC-GYQQTB6TY565JPRW -m comment --comment "default/frontend:" -m statistic --mode random --probability 0.20000000019 -j KUBE-SEP-IABZAQPI4OCAAEYI
-A KUBE-SVC-GYQQTB6TY565JPRW -m comment --comment "default/frontend:" -m statistic --mode random --probability 0.25000000000 -j KUBE-SEP-KOOQP76EBZUHPEOS
-A KUBE-SVC-GYQQTB6TY565JPRW -m comment --comment "default/frontend:" -m statistic --mode random --probability 0.33332999982 -j KUBE-SEP-R2LUGYH3W6MZDZRV
-A KUBE-SVC-GYQQTB6TY565JPRW -m comment --comment "default/frontend:" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-RHTBT7WLGW2VONI3
-A KUBE-SVC-GYQQTB6TY565JPRW -m comment --comment "default/frontend:" -j KUBE-SEP-DSHEFNPOTRMM5FWS
111. Check iptables (2)
$ sudo iptables-save | grep KUBE-SEP-DSHEFNPOTRMM5FWS
:KUBE-SEP-DSHEFNPOTRMM5FWS - [0:0]
-A KUBE-SEP-DSHEFNPOTRMM5FWS -s 10.1.79.4/32 -m comment --comment "default/frontend:" -j KUBE-MARK-MASQ
-A KUBE-SEP-DSHEFNPOTRMM5FWS -p tcp -m comment --comment "default/frontend:" -m tcp -j DNAT --to-destination 10.1.79.4:8888
-A KUBE-SVC-GYQQTB6TY565JPRW -m comment --comment "default/frontend:" -j KUBE-SEP-DSHEFNPOTRMM5FWS
=> Packets for 10.0.0.206:8888 are converted to packets for 10.1.79.4:8888 with some probability.
112. Check Routing Table on Host
$ netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 10.140.0.1 0.0.0.0 UG 0 0 0 ens4
10.1.0.0 0.0.0.0 255.255.0.0 U 0 0 0 flannel0
10.1.19.0 0.0.0.0 255.255.255.0 U 0 0 0 docker0
10.140.0.1 0.0.0.0 255.255.255.255 UH 0 0 0 ens4
=> Packets for 10.1.79.4:8888 go to flannel0 (i.e. to flanneld).
113. Check Routing Table in flanneld
$ etcdctl ls --recursive /coreos.com/network/subnets
/coreos.com/network/subnets/10.1.19.0-24
/coreos.com/network/subnets/10.1.79.0-24
$ etcdctl get /coreos.com/network/subnets/10.1.79.0-24
{"PublicIP":"10.140.0.15"}
=> "10.140.0.15" is the IP address of the host on which the pods (containers) in 10.1.79.0/24 are running.
114. Summary up to Here
• iptables redirects packets for a service IP address to a pod IP address.
• flanneld redirects packets for a pod to the host on which the applicable pod is running.
115. Publish Port outside Host
To make 10.0.0.206 accessible from outside the hosts, expose the port with NodePort (or LoadBalancer).
mynode-svc.yaml =>
apiVersion: v1
kind: Service
metadata:
  name: frontend
  labels:
    app: sample
spec:
  type: NodePort
  ports:
  - port: 8888
  selector:
    app: sample
Though it works differently, it is like exposing a port with Docker.
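To apply and check (a sketch; note that recreating a service changes its cluster IP, and the node port number is assigned by Kubernetes unless specified):
$ kubectl delete svc frontend
$ kubectl create -f mynode-svc.yaml
$ kubectl describe svc frontend
Then access http://[any host IP]:[assigned NodePort]/ from outside.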
116. Summary
• Kubernetes appears quite complicated at first sight, but you will manage to understand it if you analyze it properly.
• Some considerations:
  • When something goes wrong, it may be better for Kubernetes that processes die immediately.
  • As long as iptables is used, its load balancing cannot be richer than round robin, which may be controversial (it is something of a compromise as an LB).
117. Last, Engineers Wanted, as a Sponsor of JTF (Works Applications Co., Ltd.)
• Programming languages mainly used: Java, JavaScript, and Swift (partially).
• Middleware languages used: Scala (Spark, Kafka) and Go (Kubernetes).
• Experts in OS (Linux), JVM, algorithms, middleware, networks, and/or browsers are especially welcome.
• Data available for real machine learning; we analyze enterprise organizations as architecture.
• We develop in a large group spanning Tokyo, Osaka, Shanghai, Singapore, and Chennai. Thus, we welcome those with a tough mind who do not cling only to Japan, and with the ability to abstract things and grasp the big picture.