Linux Containers – NextGen
Virtualization for Cloud
Boden Russell (brussell@us.ibm.com)
OpenStack Summit
May 12 – 16, 2014
Atlanta, Georgia
Definitions
▪ Linux Containers (LXC → LinuX Containers)
– Lightweight virtualization
– Realized using features provided by a modern Linux kernel
– VMs without the hypervisor (kind of)
▪ Containerization of
– (Linux) Operating Systems
– Single or multiple applications
▪ LXC as a technology ≠ LXC “tools”
5/14/2014 © 2014 IBM Corporation 2
Hypervisors vs. Linux Containers
[Figure: three stacks compared side by side.
– Type 1 Hypervisor: Hardware → Hypervisor → Virtual Machines, each with its own Operating System, bins / libs and apps
– Type 2 Hypervisor: Hardware → Operating System → Hypervisor → Virtual Machines, each with its own Operating System, bins / libs and apps
– Linux Containers: Hardware → Operating System → Containers, each holding only bins / libs and apps]
Containers share the host's OS kernel and are therefore lightweight.
However, every container on a host must run on that same OS kernel.
Containers are isolated from one another, but share the OS and, where appropriate, libs / bins.
LXC Technology Stack
[Figure: the LXC technology stack, top to bottom.
User Space:
– Orchestration & Management
– Linux Container Commoditization
– Linux Container Tooling
– GLIBC / Pseudo FS / User Space Tools & Libs
Kernel Space:
– System Call Interface
– Kernel, including the features lxc relies on: cgroups, namespaces, chroots, LSM
– Architecture Dependent Kernel Code
Hardware]
So You Want To Build A Container?
▪ High level checklist
– Process(es)
– Throttling / limits
– Prioritization
– Resource isolation
– Root file system
– Security
[Figure: the container to be built, labeled my-lxc, shown as an open question]
Linux Control Groups (cgroups)
▪ Problem
– How do I throttle, prioritize, control and obtain metrics for a group of tasks (processes)?
▪ Solution → control groups (cgroups)
– Device Access
– Resource limiting
– Prioritization
– Accounting
– Control
– Injection
[Figure: cgroup "blue" containing three processes]
Linux cgroup Subsystems
Subsystem   Tunable Parameters
blkio       - Weighted proportional block I/O access. Group wide or per device.
            - Per device hard limits on block I/O read/write, specified as bytes per second or I/O operations per second (IOPS).
cpu         - Time period (microseconds per second) a group should have CPU access.
            - Group wide upper limit on CPU time per second.
            - Weighted proportional value of relative CPU time for a group.
cpuset      - CPUs (cores) the group can access.
            - Memory nodes the group can access, and memory migration.
            - Memory hardwall, pressure, spread, etc.
devices     - Define which devices and access type a group can use.
freezer     - Suspend/resume group tasks.
memory      - Max memory limits for the group (in bytes).
            - Memory swappiness, OOM control, hierarchy, etc.
hugetlb     - Limit HugeTLB size usage.
            - Per cgroup HugeTLB metrics.
net_cls     - Tag network packets with a class ID.
            - Use tc to prioritize tagged packets.
net_prio    - Weighted proportional priority on egress traffic (per interface).
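To make two of the tunables above concrete, here is a minimal Python sketch of how a CPU cap and a block I/O throttle value could be computed. It assumes the cgroup v1 CFS bandwidth files cpu.cfs_period_us and cpu.cfs_quota_us, which the slides do not name, and a period of 100000 microseconds. The helper names cfs_quota_us and blkio_throttle_line are hypothetical, chosen only for this illustration.

```python
def cfs_quota_us(cpu_fraction, period_us=100_000):
    """Return a value for cpu.cfs_quota_us (assumed cgroup v1 file name).

    cpu_fraction is the number of CPUs' worth of time the group may use
    per period, e.g. 0.5 for half a CPU or 2 for two full CPUs.
    """
    if cpu_fraction <= 0:
        raise ValueError("cpu_fraction must be positive")
    return int(round(cpu_fraction * period_us))


def blkio_throttle_line(major, minor, limit):
    """Format a 'major:minor limit' line for a blkio.throttle.* file."""
    return f"{major}:{minor} {limit}"


if __name__ == "__main__":
    # Half a CPU with the assumed 100000 microsecond period.
    print(cfs_quota_us(0.5))
    # 1 MiB per second on device 8:16.
    print(blkio_throttle_line(8, 16, 1024 * 1024))
```

The resulting strings are what would be written into the corresponding pseudo files of a cgroup; the sketch deliberately stops short of touching the file system.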
Linux cgroups Pseudo FS Interface
/sys/fs/cgroup/my-lxc
|-- blkio
| |-- blkio.io_merged
| |-- blkio.io_queued
| |-- blkio.io_service_bytes
| |-- blkio.io_serviced
| |-- blkio.io_service_time
| |-- blkio.io_wait_time
| |-- blkio.reset_stats
| |-- blkio.sectors
| |-- blkio.throttle.io_service_bytes
| |-- blkio.throttle.io_serviced
| |-- blkio.throttle.read_bps_device
| |-- blkio.throttle.read_iops_device
| |-- blkio.throttle.write_bps_device
| |-- blkio.throttle.write_iops_device
| |-- blkio.time
| |-- blkio.weight
| |-- blkio.weight_device
| |-- cgroup.clone_children
| |-- cgroup.event_control
| |-- cgroup.procs
| |-- notify_on_release
| |-- release_agent
| `-- tasks
|-- cpu
| |-- ...
|-- ...
`-- perf_event
echo "8:16 1048576" > blkio.throttle.read_bps_device
cat blkio.weight_device
dev weight
8:1 200
8:16 500
▪ Linux pseudo FS is the interface to cgroups
– Directory per subsystem per cgroup
– Read / write to pseudo file(s) in your cgroup directory
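The read / write pattern above can also be sketched in Python. This is only an illustration, not tooling from the talk: the function names are hypothetical, and the cgroup directory is passed in as a parameter so the code can be tried against an ordinary scratch directory instead of a live /sys/fs/cgroup tree (which needs root).

```python
import os


def write_cgroup_value(cgroup_dir, filename, value):
    """Write one value to a pseudo file, like `echo value > filename`."""
    with open(os.path.join(cgroup_dir, filename), "w") as handle:
        handle.write(f"{value}\n")


def read_device_table(cgroup_dir, filename):
    """Parse 'major:minor value' lines, like the output of `cat filename`.

    Lines that are not in that shape (for example the 'dev weight'
    header shown on the slide) are skipped.
    """
    table = {}
    with open(os.path.join(cgroup_dir, filename)) as handle:
        for line in handle:
            parts = line.split()
            if len(parts) != 2 or ":" not in parts[0]:
                continue
            table[parts[0]] = int(parts[1])
    return table
```

For example, write_cgroup_value(path, "blkio.throttle.read_bps_device", "8:16 1048576") mirrors the echo command on the slide, and read_device_table(path, "blkio.weight_device") returns a dictionary keyed by device number.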
Linux cgroups FS Layout
So You Want To Build A Container?
Linux namespaces
▪ Problem
– How do I provide an isolated view of global resources to a group of tasks (processes)?
▪ Solution → namespaces
– MNT; mount points, file systems, etc.
– PID; processes
– NET; NICs, routing, etc.
– IPC; System V IPC
– UTS; host and domain name
– USER; UID and GID
[Figure: namespace "blue" wrapping three processes with its own MNT, PID, NET, UTS and USER views]
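One way to see which namespaces a process belongs to is the /proc/<pid>/ns directory, where each entry is a symbolic link whose target looks like pid:[4026531836]. The Python sketch below is an assumption-laden illustration rather than anything shown in the talk: the function names are hypothetical, and namespaces_of only works on a Linux host with /proc mounted.

```python
import os
import re

# Link targets under /proc/<pid>/ns look like "net:[4026531956]".
_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (type, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def namespaces_of(pid="self"):
    """Map each entry in /proc/<pid>/ns to its namespace inode number."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for name in sorted(os.listdir(ns_dir)):
        _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        result[name] = inode
    return result


def share_namespace(pid_a, pid_b, name):
    """True if both processes are in the same namespace of that name."""
    return namespaces_of(pid_a)[name] == namespaces_of(pid_b)[name]
```

Two processes in the same container would be expected to report identical inode numbers for each namespace type, while a process on the host reports different ones.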
Linux namespaces: Conceptual Overview
global (i.e. root) namespace
– MNT NS: /, /proc, /mnt/fsrd, /mnt/fsrw, /mnt/cdrom, /run2
– UTS NS: globalhost, rootns.com
– PID NS (PID COMMAND): 1 /sbin/init, 2 [kthreadd], 3 [ksoftirqd], 4 [cpuset], 5 /sbin/udevd, 6 /bin/sh, 7 /bin/bash
– IPC NS: SHMID 32452 (owner root), SHMID 43321 (owner boden); SEMID 0 (owner root), SEMID 1 (owner boden); no MSQIDs
– NET NS: lo: UNKNOWN…, eth0: UP…, eth1: UP…, br0: UP…; app1 IP:5000, app2 IP:6000, app3 IP:7000
– USER NS: root 0:0, ntp 104:109, mysql 105:110, boden 106:111
purple namespace
– MNT NS: /, /proc, /mnt/purplenfs, /mnt/fsrw, /mnt/cdrom
– UTS NS: purplehost, purplens.com
– PID NS (PID COMMAND): 1 /bin/bash, 2 /bin/vim
– IPC NS: no SHMIDs; SEMID 0 (owner root); no MSQIDs
– NET NS: lo: UNKNOWN…, eth0: UP…; app1 IP:1000, app2 IP:7000
– USER NS: root 0:0, app 106:111
blue namespace
– MNT NS: /, /proc, /mnt/cdrom, /bluens
– UTS NS: bluehost, bluens.com
– PID NS (PID COMMAND): 1 /bin/bash, 2 python, 3 node
– IPC NS: no SHMIDs, SEMIDs or MSQIDs
– NET NS: lo: UNKNOWN…, eth0: DOWN…, eth1: UP; app1 IP:7000, app2 IP:9000
– USER NS: root 0:0, app 104:109
Linux namespaces & cgroups: Availability
Note: user namespace support is in the upstream kernel as of 3.8, but distributions are rolling out support in phases:
- Map LXC UID/GID between container and host
- Non-root LXC creation
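The UID/GID mapping mentioned in the note can be illustrated with the three-column format used by /proc/<pid>/uid_map and gid_map: first ID inside the namespace, first ID outside, and range length. The following Python sketch is a hedged illustration only. The function names are hypothetical, and the mapping 0 100000 65536 is an assumed example value, not one taken from the slides.

```python
def parse_id_map(text):
    """Parse uid_map / gid_map text into (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore blank or malformed lines
        entries.append(tuple(int(field) for field in fields))
    return entries


def to_host_id(container_id, id_map):
    """Translate an ID seen inside the container to the host ID.

    Returns None when the ID falls outside every mapped range.
    """
    for inside, outside, length in id_map:
        if inside <= container_id < inside + length:
            return outside + (container_id - inside)
    return None


if __name__ == "__main__":
    # Assumed example mapping: container IDs 0..65535 start at host ID 100000.
    example = parse_id_map("0 100000 65536\n")
    print(to_host_id(0, example))    # container root on the host
    print(to_host_id(106, example))  # an ordinary container user
```

With such a mapping, root inside the container corresponds to an unprivileged ID on the host, which is what makes non-root container creation possible.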
So You Want To Build A Container?
Linux chroot & pivot_root
▪ Using pivot_root with a MNT namespace addresses concerns about escaping a chroot
▪ The pivot_root target directory becomes the “new root FS”
So You Want To Build A Container?
Linux Security Modules & MAC
▪ Linux Security Modules (LSM) – kernel modules which provide a framework for Mandatory Access Control (MAC) security implementations
▪ MAC vs DAC
– In MAC, an admin (user or process) assigns access controls to the subject / initiator
– In DAC, the resource owner (user) assigns access controls to individual resources
▪ Existing LSM implementations include: AppArmor, SELinux, GRSEC, etc.
Linux Capabilities
▪ Per-process privileges which define system call access
▪ Can be assigned to LXC process(es)
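A process's capability sets are exposed as hexadecimal bit masks on the CapInh, CapPrm, CapEff and CapBnd lines of /proc/<pid>/status. As a rough illustration, the Python sketch below decodes such a mask. The function names are hypothetical, and the table lists only a handful of capability bits chosen for this example, not the full set.

```python
# Partial table of capability bit numbers (illustrative subset only).
CAPABILITY_BITS = {
    0: "CAP_CHOWN",
    1: "CAP_DAC_OVERRIDE",
    5: "CAP_KILL",
    10: "CAP_NET_BIND_SERVICE",
    12: "CAP_NET_ADMIN",
    18: "CAP_SYS_CHROOT",
    21: "CAP_SYS_ADMIN",
    27: "CAP_MKNOD",
}


def decode_capabilities(mask_hex):
    """Turn a hex capability mask into a list of capability names."""
    mask = int(mask_hex, 16)
    names = []
    bit = 0
    while mask >> bit:
        if mask & (1 << bit):
            names.append(CAPABILITY_BITS.get(bit, f"bit {bit}"))
        bit += 1
    return names


def effective_capability_mask(status_text):
    """Extract the CapEff value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return line.split()[1]
    return None
```

A container process that has had capabilities dropped shows a much sparser CapEff mask than a root process on the host.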
Other Security Measures
▪ Reduce shared FS access using read-only (RO) bind mounts
▪ Linux seccomp
– Confine system calls
▪ Keep the Linux kernel up to date
▪ User namespaces in 3.8+ kernel
– Launching containers as non-root user
– Mapping UID / GID into container
So You Want To Build A Container?
LXC Industry Tooling
Virtuozzo
– Summary: Commercial product using OpenVZ under the hood
– Part of upstream kernel? No
– License: Commercial
– APIs / bindings: CLI, API
– Management plane / dashboard: Virtuozzo Parallels
OpenVZ
– Summary: Custom kernel providing well seasoned LXC support
– Part of upstream kernel? No
– License: GNU GPL v2
– APIs / bindings: CLI, C
– Management plane / dashboard: Virtuozzo Parallels + others
Linux VServer
– Summary: A set of kernel patches providing LXC. Not based on cgroups or namespaces.
– Part of upstream kernel? Partial
– License: GNU GPL v2
– APIs / bindings: CLI
Libvirt-lxc
– Summary: Libvirt support for LXC via cgroups and namespaces.
– Part of upstream kernel? Yes
– License: GNU LGPL
– APIs / bindings: C, Python, Java, C#, PHP
– Management plane / dashboard: OpenStack, Archipel, Virt-Manager
Lxc (tools)
– Summary: Lib + set of user space tools / bindings for LXC.
– Part of upstream kernel? Yes
– License: GNU LGPL
– APIs / bindings: Python, Lua, Go
– Management plane / dashboard: LXC web panel, Lexy
Warden
– Summary: LXC management tooling used by CF.
– Part of upstream kernel? Yes
– License: Apache v2
– APIs / bindings: CLI
lmctfy
– Summary: Similar to LXC, but provides more intent based focus.
– Part of upstream kernel? Yes, but additional patches needed for specific features.
– License: Apache v2
– APIs / bindings: Go
Docker
– Summary: Commoditization of LXC adding support for images, build files, etc.
– Part of upstream kernel? Yes
– License: Apache v2
– APIs / bindings: REST, CLI, Python, other 3rd party libs
– Management plane / dashboard: OpenStack, Shipyard, Docker UI
LXC Orchestration & Management
▪ Docker & libvirt-lxc in OpenStack
– Manage containers heterogeneously with traditional VMs… but not with the level of support & features we might like
▪ CoreOS
– Zero-touch admin Linux distro with docker images as the unit of operation
– Centralized key/value store to coordinate distributed environment
▪ Various other 3rd party apps
– Maestro for docker
– Shipyard for docker
– Fleet for CoreOS
– Etc.
▪ LXC migration
– Container migration via criu
▪ But…
– Still no great way to tie all virtual resources together with LXC – e.g. storage + networking
• IMO, an area which needs focus for LXC to become more generally applicable
CLOUDY BENCHMARKING WITH
KVM, DOCKER AND OPENSTACK
Benchmark Environment Topology @ SoftLayer
[Figure: two identical benchmark topologies, each made of a controller and a compute node.
– Controller: glance api / reg, nova api / cond / etc, keystone, …, rally
– Compute node: nova api / cond / etc, cinder api / sch / vol, dstat, plus the virtualization under test (docker lxc in one topology, KVM in the other)]
Cloudy Performance: Steady State Packing
▪ Benchmark scenario overview
– Pre-cache VM image on compute node prior to test
– Boot 15 VMs asynchronously in succession
– Wait for 5 minutes (to achieve steady-state on the compute node)
– Delete all 15 VMs asynchronously in succession
▪ Benchmark driver
– cpu_bench.py
▪ High level goals
– Understand compute node characteristics under steady-state conditions with 15 packed / active VMs
[Chart: Benchmark Visualization; active VMs (0 to 16) over time]
Document v2.0
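The steady-state packing scenario can be outlined in a few lines of Python. This is not the cpu_bench.py driver used for the benchmark, whose contents the slides do not show. It is a minimal sketch in which boot and delete are hypothetical, non-blocking callables supplied by the caller (for instance thin wrappers around a cloud client), and sleep is injectable so the flow can be exercised without waiting five minutes.

```python
import time


def steady_state_packing(boot, delete, vm_count=15, settle_seconds=300,
                         sleep=time.sleep):
    """Boot vm_count VMs, let the node settle, then delete them all.

    boot(name) must start a VM without waiting for it and return a handle;
    delete(handle) must request deletion without waiting. Both are
    assumptions of this sketch, not an API from the talk.
    """
    # Fire off all boots back to back (asynchronously, in succession).
    handles = [boot(f"vm-{index}") for index in range(vm_count)]
    # Give the compute node time to reach steady state.
    sleep(settle_seconds)
    # Request deletion of every VM, again without waiting in between.
    for handle in handles:
        delete(handle)
    return handles
```

Metrics collection (dstat in the benchmark environment) would run alongside this loop on the compute node rather than inside it.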
Cloudy Performance: Serial VM Boot
▪ Benchmark scenario overview
– Pre-cache VM image on compute node prior to test
– Boot VM
– Wait for VM to become ACTIVE
– Repeat the above steps for a total of 15 VMs
– Delete all VMs
▪ Benchmark driver
– OpenStack Rally
▪ High level goals
– Understand compute node characteristics under sustained VM boots
[Chart: Benchmark Visualization; active VMs (0 to 16) over time]
Cloudy Performance: Serial VM Reboot
▪ Benchmark scenario overview
– Pre-cache VM image on compute node prior to test
– Boot a VM & wait for it to become ACTIVE
– Soft reboot the VM and wait for it to become ACTIVE
• Repeat reboot a total of 5 times
– Delete VM
– Repeat the above for a total of 5 VMs
▪ Benchmark driver
– OpenStack Rally
▪ High level goals
– Understand compute node characteristics under sustained VM reboots
[Chart: Benchmark Visualization; active VMs (0 to 6) over time]
Cloudy Performance: Snapshot VM To Image
▪ Benchmark scenario overview
– Boot a VM
– Wait for it to become active
– Snapshot the VM
– Wait for image to become active
– Delete VM
Cloudy Ops: VM Boot
[Chart: Average Server Boot Time, in seconds]
docker: 3.529113102
KVM: 5.781662448
Cloudy Ops: VM Reboot
[Chart: Average Server Reboot Time, in seconds]
docker: 2.577879581
KVM: 124.433239
Cloudy Ops: VM Delete
[Chart: Average Server Delete Time, in seconds]
docker: 3.567586041
KVM: 3.479760051
Cloudy Ops: VM Snapshot
[Chart: Average Snapshot Server Time, in seconds]
docker: 36.88756394
KVM: 48.02313805
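The four Cloudy Ops averages are easier to compare as ratios. The short Python sketch below only restates the numbers reported on the boot, reboot, delete and snapshot slides and divides them; the dictionary layout and function name are illustrative choices, not part of the benchmark tooling.

```python
# Average times in seconds, copied from the four Cloudy Ops slides.
AVERAGE_SECONDS = {
    "boot": {"docker": 3.529113102, "kvm": 5.781662448},
    "reboot": {"docker": 2.577879581, "kvm": 124.433239},
    "delete": {"docker": 3.567586041, "kvm": 3.479760051},
    "snapshot": {"docker": 36.88756394, "kvm": 48.02313805},
}


def kvm_to_docker_ratio(operation):
    """How many times longer KVM took than docker for one operation."""
    times = AVERAGE_SECONDS[operation]
    return times["kvm"] / times["docker"]


if __name__ == "__main__":
    for operation in AVERAGE_SECONDS:
        print(f"{operation}: {kvm_to_docker_ratio(operation):.2f}x")
```

Run this way, boot comes out at roughly 1.6 times, reboot at roughly 48 times, delete at about parity, and snapshot at roughly 1.3 times, so the reboot gap dominates the comparison.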
Cloudy Performance: Steady State Packing
[Chart: Docker: Compute Node CPU (full test duration); CPU usage in percent over time]
Averages: usr 0.54, sys 0.17
[Chart: KVM: Compute Node CPU (full test duration); CPU usage in percent over time]
Averages: usr 7.64, sys 1.4
Cloudy Performance: Steady State Packing
[Chart: Docker: Compute Node Steady-State CPU (segment: 31s – 243s); CPU usage in percent over time]
Averages: usr 0.2, sys 0.03
[Chart: KVM: Compute Node Steady-State CPU (segment: 95s – 307s); CPU usage in percent over time]
Averages: usr 1.91, sys 0.36
Cloudy Performance: Steady State Packing
[Chart: Docker / KVM: Compute Node Used Memory (Overlay); memory used over time]
docker: delta 734 MB, 49 MB per VM
KVM: delta 4387 MB, 292 MB per VM
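The per-VM figures follow from dividing each memory delta by the 15 VMs packed onto the compute node. A small Python check, using only the numbers on this slide (the function name is an illustrative choice):

```python
VM_COUNT = 15  # VMs packed in the steady-state scenario


def per_vm_mb(delta_mb, vm_count=VM_COUNT):
    """Average memory growth per VM, in MB."""
    return delta_mb / vm_count


if __name__ == "__main__":
    print(round(per_vm_mb(734)))   # docker
    print(round(per_vm_mb(4387)))  # KVM
```

The same deltas put KVM's memory growth at just under six times docker's for this workload.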
Cloudy Performance: Serial VM Boot
[Chart: Docker: Compute Node CPU; CPU usage in percent over time]
Averages: usr 1.39, sys 0.57
[Chart: KVM: Compute Node CPU Usage; CPU usage in percent over time]
Averages: usr 13.45, sys 2.23
Cloudy Performance: Serial VM Boot
[Chart: Docker / KVM: Serial VM Boot Usr CPU (segment: 8s - 58s); usr CPU in percent over time, with linear trend lines]
docker trend line: y = 0.009x + 1.008
KVM trend line: y = 0.358x + 1.063
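The slides do not say how the trend lines were produced. A common choice for a linear trend line is an ordinary least-squares fit, and the Python sketch below shows that technique under the assumption that samples are taken at x = 1, 2, …, n. The function name linear_fit is hypothetical, and the code is a generic illustration rather than the author's method.

```python
def linear_fit(ys):
    """Least-squares fit of y = slope * x + intercept for x = 1..len(ys)."""
    count = len(ys)
    if count < 2:
        raise ValueError("need at least two samples")
    xs = range(1, count + 1)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Feeding it per-second usr CPU samples from a dstat capture would give a slope in percent per second; a larger slope means CPU usage grows faster as more VMs are booted.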
Cloudy Performance: Serial VM Boot
[Chart: Docker / KVM: Compute Node Memory Used (Unnormalized Overlay); memory used (0 to 5.00E+09) over time]
Cloudy Performance: Serial VM Boot
[Chart: Docker / KVM: Serial VM Boot Memory Usage (segment: 1s - 67s); memory usage over time, with linear trend lines]
docker trend line: y = 1E+07x + 1E+09
KVM trend line: y = 3E+07x + 1E+09
Guest Ops: Network
[Chart: Network Throughput, in 10^6 bits/second]
docker: 940.26
KVM: 940.56
Guest Ops: Near Bare Metal Performance
▪ Typical docker LXC performance near par with bare metal
[Chart: linpack performance @ 45000; GFlops by number of vcpus, plus bare metal (BM)]
Bare metal: 220.77
220.5 @ 32 vcpu
220.9 @ 31 vcpu
[Chart: Memory Benchmark Performance; MiB/s for the MEMCPY, DUMB and MCBLOCK memory tests on bare metal, docker and KVM]
Guest Ops: File I/O Random Read / Write
[Chart: Sysbench Synchronous File I/O Random Read/Write @ R/W Ratio of 1.50; total transferred in Kb/sec for docker and KVM at 1, 2, 4, 8, 16, 32 and 64 threads]
Guest Ops: MySQL OLTP
[Chart: MySQL OLTP Random Transactional R/W (60s); total transactions for docker and KVM at 1, 2, 4, 8, 16, 32 and 64 threads]
Guest Ops: MySQL Indexed Insertion
[Chart: MySQL Indexed Insertion @ 100K Intervals; seconds per 100K insertion batch for docker and kvm as the table grows from 100000 to 1000000 rows]
Cloud Management Impacts on LXC
[Chart: Docker: Boot Container - CLI vs Nova Virt, in seconds]
docker cli: 0.17
nova-docker: 3.529113102
Cloud management often caps true ops performance of LXC
Ubuntu MySQL Image Size
[Chart: Docker / KVM: Ubuntu MySQL image size, in MB]
docker: 381.5
kvm: 1080
Out of the box JeOS images for docker are lightweight
LXC In Summary
▪ Near bare metal performance in the guest
▪ Fast operations in the Cloud
▪ Reduced resource consumption (CPU, MEM) on the compute node
▪ Out of the box smaller image footprint
LXC Gaps
There are gaps…
▪ Lack of industry tooling / support
▪ Live migration still a WIP
▪ Full orchestration across resources (compute / storage / networking)
▪ Security fears
▪ Not a well known technology… yet
▪ Integration with existing virtualization and Cloud tooling
▪ Few, if any, industry standards
▪ Missing skillset
▪ Slower upstream support due to kernel dev process
▪ Memory / CPU proc FS not cgroup aware
▪ Etc.
References & Related Links
▪ http://www.slideshare.net/BodenRussell/realizing-linux-containerslxc
▪ http://bodenr.blogspot.com/2014/05/kvm-and-docker-lxc-benchmarking-with.html
▪ https://www.docker.io/
▪ http://sysbench.sourceforge.net/
▪ http://dag.wiee.rs/home-made/dstat/
▪ http://www.openstack.org/
▪ https://wiki.openstack.org/wiki/Rally
▪ https://wiki.openstack.org/wiki/Docker
▪ http://devstack.org/
▪ http://www.linux-kvm.org/page/Main_Page
▪ https://github.com/stackforge/nova-docker
▪ https://github.com/dotcloud/docker-registry
▪ http://www.netperf.org/netperf/
▪ http://www.tokutek.com/products/iibench/
▪ http://www.brendangregg.com/activebenchmarking.html
▪ http://wiki.openvz.org/Performance
Thank you!
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 

Linux Containers – NextGen Virtualization for Cloud (Atlanta Summit)

  • 1. Linux Containers – NextGen Virtualization for Cloud. Boden Russell (brussell@us.ibm.com), OpenStack Summit, May 12 – 16, 2014, Atlanta, Georgia
  • 2. Definitions
      • Linux Containers (LXC, short for LinuX Containers)
          – Lightweight virtualization
          – Realized using features provided by a modern Linux kernel
          – VMs without the hypervisor (kind of)
      • Containerization of
          – (Linux) operating systems
          – Single or multiple applications
      • LXC as a technology ≠ LXC “tools”
      5/14/2014 © 2014 IBM Corporation
  • 3. Hypervisors vs. Linux Containers
      [Diagram: three stacks compared side by side, listed bottom to top]
      • Type 1 hypervisor: Hardware, Hypervisor, Virtual Machines (each with its own Operating System, Bins / libs and Apps)
      • Type 2 hypervisor: Hardware, Operating System, Hypervisor, Virtual Machines (each with its own Operating System, Bins / libs and Apps)
      • Linux Containers: Hardware, Operating System, Containers (each with Bins / libs and Apps)
      Containers share the OS kernel of the host and thus are lightweight. However, every container must therefore run on that same OS kernel. Containers are isolated, but share the OS and, where appropriate, libs / bins.
  • 4. LXC Technology Stack
      [Diagram: layered stack, listed top to bottom]
      • User space: Orchestration & Management; Linux Container Commoditization; Linux Container Tooling; GLIBC / Pseudo FS / User Space Tools & Libs
      • Kernel space: System Call Interface; Kernel; Architecture Dependent Kernel Code
      • Hardware
      • Components called out in the diagram: cgroups, namespaces, chroots, LSM, lxc
  • 5. So You Want To Build A Container?
      • High level checklist for "my-lxc"
          – Process(es)
          – Throttling / limits
          – Prioritization
          – Resource isolation
          – Root file system
          – Security
  • 6. Linux Control Groups (cgroups)
      • Problem
          – How do I throttle, prioritize, control and obtain metrics for a group of tasks (processes)?
      • Solution: control groups (cgroups)
          – Device access
          – Resource limiting
          – Prioritization
          – Accounting
          – Control
          – Injection
      [Diagram: "cgroup blue" containing three processes]
  • 7. Linux cgroup Subsystems
      Subsystem: tunable parameters
      • blkio
          – Weighted proportional block I/O access, group wide or per device.
          – Per device hard limits on block I/O read/write, specified as bytes per second or IOPS.
      • cpu
          – Time period (microseconds per second) a group should have CPU access.
          – Group wide upper limit on CPU time per second.
          – Weighted proportional value of relative CPU time for a group.
      • cpuset
          – CPUs (cores) the group can access.
          – Memory nodes the group can access, and migration ability.
          – Memory hardwall, pressure, spread, etc.
      • devices
          – Define which devices and access type a group can use.
      • freezer
          – Suspend/resume group tasks.
      • memory
          – Max memory limits for the group (in bytes).
          – Memory swappiness, OOM control, hierarchy, etc.
      • hugetlb
          – Limit HugeTLB size usage.
          – Per cgroup HugeTLB metrics.
      • net_cls
          – Tag network packets with a class ID.
          – Use tc to prioritize tagged packets.
      • net_prio
          – Weighted proportional priority on egress traffic (per interface).
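The "weighted proportional" tunables listed above can be illustrated with a little arithmetic. The sketch below assumes the usual reading of such weights (in cgroup v1 the cpu weight is exposed as `cpu.shares`): when every group is busy, each one receives CPU time in proportion to its weight. The group names and weights are hypothetical and do not come from the slides.

```python
def cpu_share_fractions(shares):
    """Return each group's fraction of CPU time when all groups are busy.

    `shares` maps a group name to its relative weight. Weights only
    matter relative to each other, so 1024 vs 512 behaves like 2 vs 1.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one group needs a positive weight")
    return {name: weight / total for name, weight in shares.items()}


if __name__ == "__main__":
    # Hypothetical groups; the weights are illustrative only.
    demo = {"blue": 1024, "green": 512, "red": 512}
    for name, fraction in cpu_share_fractions(demo).items():
        print(f"{name}: {fraction:.0%} of CPU under full contention")
```

A weight is not a hard cap: an otherwise idle machine lets a low-weight group use spare CPU, which is why the slide lists the upper limit as a separate tunable.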
  • 8. Linux cgroups Pseudo FS Interface
      • Linux pseudo FS is the interface to cgroups
          – Directory per subsystem per cgroup
          – Read / write to pseudo file(s) in your cgroup directory
      Example layout:
          /sys/fs/cgroup/my-lxc
          |-- blkio
          |   |-- blkio.io_merged
          |   |-- blkio.io_queued
          |   |-- blkio.io_service_bytes
          |   |-- blkio.io_serviced
          |   |-- blkio.io_service_time
          |   |-- blkio.io_wait_time
          |   |-- blkio.reset_stats
          |   |-- blkio.sectors
          |   |-- blkio.throttle.io_service_bytes
          |   |-- blkio.throttle.io_serviced
          |   |-- blkio.throttle.read_bps_device
          |   |-- blkio.throttle.read_iops_device
          |   |-- blkio.throttle.write_bps_device
          |   |-- blkio.throttle.write_iops_device
          |   |-- blkio.time
          |   |-- blkio.weight
          |   |-- blkio.weight_device
          |   |-- cgroup.clone_children
          |   |-- cgroup.event_control
          |   |-- cgroup.procs
          |   |-- notify_on_release
          |   |-- release_agent
          |   `-- tasks
          |-- cpu
          |   |-- ...
          |-- ...
          `-- perf_event
      Example commands:
          echo "8:16 1048576" > blkio.throttle.read_bps_device
          cat blkio.weight_device
          dev     weight
          8:1     200
          8:16    500
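The `echo "8:16 1048576" > blkio.throttle.read_bps_device` command on this slide can also be issued from a program, since the interface is just a file write. The following is a minimal sketch assuming a cgroup v1 blkio directory; the helper names `throttle_line` and `set_read_bps` are hypothetical, and the demo writes into a temporary directory rather than a real cgroup, which would require root.

```python
import os
import tempfile


def throttle_line(major, minor, bytes_per_sec):
    """Format a '<major>:<minor> <bytes_per_second>' throttle entry."""
    return f"{major}:{minor} {bytes_per_sec}"


def set_read_bps(cgroup_dir, major, minor, bytes_per_sec):
    """Write a read-bandwidth limit into the given cgroup directory.

    cgroup_dir is assumed to be a blkio cgroup directory such as the
    my-lxc example on the slide; nothing here checks that it really is.
    """
    path = os.path.join(cgroup_dir, "blkio.throttle.read_bps_device")
    with open(path, "w") as handle:
        handle.write(throttle_line(major, minor, bytes_per_sec) + "\n")
    return path


if __name__ == "__main__":
    # Stand-in directory so the sketch runs without root or a real cgroup.
    with tempfile.TemporaryDirectory() as demo_dir:
        written = set_read_bps(demo_dir, 8, 16, 1024 * 1024)
        with open(written) as handle:
            print(handle.read().strip())
```

On a real host the same call would target the cgroup's blkio directory, and the kernel, not this code, would validate the device numbers.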
  • 9. Linux cgroups FS Layout
  • 10. So You Want To Build A Container?
  • 11. Linux namespaces
      • Problem
          – How do I provide an isolated view of global resources to a group of tasks (processes)?
      • Solution: namespaces
          – MNT: mount points, file systems, etc.
          – PID: processes
          – NET: NICs, routing, etc.
          – IPC: System V IPC
          – UTS: host and domain name
          – USER: UID and GID
      [Diagram: "namespace blue" with MNT, PID, NET, UTS and USER namespaces around three processes]
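One way to see which namespaces a process belongs to is to read the symbolic links under `/proc/<pid>/ns`, where each link target names a namespace type and an identifier. The sketch below assumes a Linux host with procfs mounted; the helper names are hypothetical, and `proc_root` is a parameter only so the logic can be exercised against a fake directory tree.

```python
import os


def namespace_ids(pid="self", proc_root="/proc"):
    """Map namespace name (e.g. 'pid', 'net') to its link target."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    ids = {}
    for name in sorted(os.listdir(ns_dir)):
        ids[name] = os.readlink(os.path.join(ns_dir, name))
    return ids


def shared_namespaces(ids_a, ids_b):
    """Names of the namespaces that two processes have in common."""
    return sorted(name for name in ids_a if ids_b.get(name) == ids_a[name])


if __name__ == "__main__":
    # Only meaningful on Linux; skip quietly elsewhere.
    if os.path.isdir("/proc/self/ns"):
        for name, target in namespace_ids().items():
            print(f"{name:10s} {target}")
```

Two processes in the same container would typically report identical targets for every entry, while a containerized process and a host process would differ in at least the namespaces the container unshared.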
  • 12. Linux namespaces: Conceptual Overview
      • global (i.e. root) namespace
          – MNT NS: /, /proc, /mnt/fsrd, /mnt/fsrw, /mnt/cdrom, /run2
          – UTS NS: globalhost, rootns.com
          – PID NS (PID COMMAND): 1 /sbin/init, 2 [kthreadd], 3 [ksoftirqd], 4 [cpuset], 5 /sbin/udevd, 6 /bin/sh, 7 /bin/bash
          – IPC NS: SHMID 32452 (owner root), SHMID 43321 (owner boden); SEMID 0 (owner root), SEMID 1 (owner boden); no MSQID entries
          – NET NS: lo: UNKNOWN…, eth0: UP…, eth1: UP…, br0: UP…; app1 IP:5000, app2 IP:6000, app3 IP:7000
          – USER NS: root 0:0, ntp 104:109, mysql 105:110, boden 106:111
      • purple namespace
          – MNT NS: /, /proc, /mnt/purplenfs, /mnt/fsrw, /mnt/cdrom
          – UTS NS: purplehost, purplens.com
          – PID NS (PID COMMAND): 1 /bin/bash, 2 /bin/vim
          – IPC NS: no SHMID entries; SEMID 0 (owner root); no MSQID entries
          – NET NS: lo: UNKNOWN…, eth0: UP…; app1 IP:1000, app2 IP:7000
          – USER NS: root 0:0, app 106:111
      • blue namespace
          – MNT NS: /, /proc, /mnt/cdrom, /bluens
          – UTS NS: bluehost, bluens.com
          – PID NS (PID COMMAND): 1 /bin/bash, 2 python, 3 node
          – IPC NS: no SHMID, SEMID or MSQID entries
          – NET NS: lo: UNKNOWN…, eth0: DOWN…, eth1: UP; app1 IP:7000, app2 IP:9000
          – USER NS: root 0:0, app 104:109
  • 13. Linux namespaces & cgroups: Availability
      Note: user namespace support is in the upstream kernel from 3.8 onward, but distributions are rolling out support in phases:
          – Map LXC UID/GID between container and host
          – Non-root LXC creation
  • 14. So You Want To Build A Container?
  • 15. Linux chroot & pivot_root
      • Using pivot_root with a MNT namespace addresses concerns about escaping a chroot
      • The pivot_root target directory becomes the “new root FS”
  • 16. So You Want To Build A Container?
  • 17. Linux Security Modules & MAC
      • Linux Security Modules (LSM): kernel modules which provide a framework for Mandatory Access Control (MAC) security implementations
      • MAC vs DAC
          – In MAC, the admin (user or process) assigns access controls to the subject / initiator
          – In DAC, the resource owner (user) assigns access controls to individual resources
      • Existing LSM implementations include: AppArmor, SELinux, GRSEC, etc.
  • 18. Linux Capabilities
      • Per process privileges which define system call access
      • Can be assigned to LXC process(es)
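A process's effective capabilities are visible as a hexadecimal bitmask on the `CapEff:` line of `/proc/<pid>/status`, with one bit per capability. The sketch below decodes such a mask; it covers only a subset of capability numbers (taken from the kernel's capability header), and the sample status text is made up for illustration.

```python
# Subset of capability bit numbers; unknown bits are reported as "cap_<n>".
CAP_NAMES = {
    0: "CAP_CHOWN",
    1: "CAP_DAC_OVERRIDE",
    5: "CAP_KILL",
    6: "CAP_SETGID",
    7: "CAP_SETUID",
    10: "CAP_NET_BIND_SERVICE",
    12: "CAP_NET_ADMIN",
    13: "CAP_NET_RAW",
    18: "CAP_SYS_CHROOT",
    21: "CAP_SYS_ADMIN",
}


def parse_cap_eff(status_text):
    """Extract the effective-capability bitmask from status file text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def decode_caps(mask):
    """List the names of the capabilities whose bits are set in mask."""
    names = []
    bit = 0
    while mask >> bit:
        if mask & (1 << bit):
            names.append(CAP_NAMES.get(bit, f"cap_{bit}"))
        bit += 1
    return names


if __name__ == "__main__":
    # Made-up status excerpt: only CAP_CHOWN and CAP_NET_BIND_SERVICE set.
    sample = "Name:\tdemo\nCapEff:\t0000000000000401\n"
    print(decode_caps(parse_cap_eff(sample)))
```

A container process started with a reduced capability set would show a much sparser mask than a root process on the host.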
  • 19. Other Security Measures
      • Reduce shared FS access using RO bind mounts
      • Linux seccomp
          – Confine system calls
      • Keep Linux kernel up to date
      • User namespaces in 3.8+ kernel
          – Launching containers as non-root user
          – Mapping UID / GID into container
  • 20. So You Want To Build A Container?
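The UID / GID mapping mentioned for user namespaces is described by `/proc/<pid>/uid_map`, where each line holds three numbers: the first ID inside the namespace, the first ID outside it, and the length of the range. The sketch below translates a container UID to a host UID under that format; the function names and the 100000 base used in the demo are assumptions chosen for illustration.

```python
def parse_uid_map(text):
    """Parse uid_map text into (inside_start, outside_start, length) tuples."""
    entries = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, length = (int(field) for field in line.split())
            entries.append((inside, outside, length))
    return entries


def to_host_uid(container_uid, entries):
    """Translate a UID seen inside the namespace to the host UID.

    Returns None when the UID falls outside every mapped range.
    """
    for inside, outside, length in entries:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


if __name__ == "__main__":
    # Hypothetical mapping: container UIDs 0..65535 start at host UID 100000.
    mapping = parse_uid_map("0 100000 65536\n")
    print("container root is host UID", to_host_uid(0, mapping))
```

With a mapping like this, root inside the container is an unprivileged UID on the host, which is the point of launching containers as a non-root user.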
  • 21. LXC Industry Tooling
      • Virtuozzo: Commercial product using OpenVZ under the hood. Part of upstream kernel: No. License: Commercial.
      • OpenVZ: Custom kernel providing well seasoned LXC support. Part of upstream kernel: No. License: GNU GPL v2.
      • Linux VServer: A set of kernel patches providing LXC. Not based on cgroups or namespaces. Part of upstream kernel: Partial. License: GNU GPL v2.
      • Libvirt-lxc: Libvirt support for LXC via cgroups and namespaces. Part of upstream kernel: Yes. License: GNU LGPL.
      • Lxc (tools): Lib + set of user space tools / bindings for LXC. Part of upstream kernel: Yes. License: GNU LGPL.
      • Warden: LXC management tooling used by CF. Part of upstream kernel: Yes. License: Apache v2.
      • lmctfy: Similar to LXC, but provides more intent based focus. Part of upstream kernel: Yes, but additional patches needed for specific features. License: Apache v2.
      • Docker: Commoditization of LXC adding support for images, build files, etc. Part of upstream kernel: Yes. License: Apache v2.
      • APIs / bindings (listed left to right across the tools above): CLI, API, CLI, C, CLI, C, Python, Java, C#, PHP, Python, Lua, Go, CLI, Go, REST, CLI, Python, other 3rd party libs
      • Management plane / dashboard (listed left to right across the tools above): Virtuozzo, Parallels Virtuozzo, Parallels + others, OpenStack, Archipel, Virt-Manager, LXC web panel, Lexy, OpenStack, Shipyard, Docker UI
  • 22. LXC Orchestration & Management
      • Docker & libvirt-lxc in OpenStack
          – Manage containers heterogeneously with traditional VMs… but not with the level of support & features we might like
      • CoreOS
          – Zero-touch admin Linux distro with docker images as the unit of operation
          – Centralized key/value store to coordinate distributed environment
      • Various other 3rd party apps
          – Maestro for docker
          – Shipyard for docker
          – Fleet for CoreOS
          – Etc.
      • LXC migration
          – Container migration via criu
      • But…
          – Still no great way to tie all virtual resources together with LXC, e.g. storage + networking
          – IMO, an area which needs focus for LXC to become more generally applicable
  • 23. CLOUDY BENCHMARKING WITH KVM, DOCKER AND OPENSTACK
  • 24. Benchmark Environment Topology @ SoftLayer
      [Diagram: two matching OpenStack environments, one per virtualization technology]
      • Docker environment
          – Controller: glance api / reg, nova api / cond / etc, keystone, …, rally
          – Compute node: nova api / cond / etc, cinder api / sch / vol, docker lxc, dstat
      • KVM environment
          – Controller: glance api / reg, nova api / cond / etc, keystone, …, rally
          – Compute node: nova api / cond / etc, cinder api / sch / vol, KVM, dstat
  • 25. Cloudy Performance: Steady State Packing
      • Benchmark scenario overview
          – Pre-cache VM image on compute node prior to test
          – Boot 15 VMs asynchronously in succession
          – Wait for 5 minutes (to achieve steady-state on the compute node)
          – Delete all 15 VMs asynchronously in succession
      • Benchmark driver
          – cpu_bench.py
      • High level goals
          – Understand compute node characteristics under steady-state conditions with 15 packed / active VMs
      [Chart: Benchmark Visualization, active VMs over time]
      Document v2.0
  • 26. Cloudy Performance: Serial VM Boot
      • Benchmark scenario overview
          – Pre-cache VM image on compute node prior to test
          – Boot VM
          – Wait for VM to become ACTIVE
          – Repeat the above steps for a total of 15 VMs
          – Delete all VMs
      • Benchmark driver
          – OpenStack Rally
      • High level goals
          – Understand compute node characteristics under sustained VM boots
      [Chart: Benchmark Visualization, active VMs over time]
  • 27. Cloudy Performance: Serial VM Reboot
      • Benchmark scenario overview
          – Pre-cache VM image on compute node prior to test
          – Boot a VM & wait for it to become ACTIVE
          – Soft reboot the VM and wait for it to become ACTIVE (repeat the reboot a total of 5 times)
          – Delete VM
          – Repeat the above for a total of 5 VMs
      • Benchmark driver
          – OpenStack Rally
      • High level goals
          – Understand compute node characteristics under sustained VM reboots
      [Chart: Benchmark Visualization, active VMs over time]
  • 28. Cloudy Performance: Snapshot VM To Image
      • Benchmark scenario overview
          – Boot a VM
          – Wait for it to become active
          – Snapshot the VM
          – Wait for image to become active
          – Delete VM
  • 29. Cloudy Ops: VM Boot
      [Chart: Average Server Boot Time, in seconds] docker: 3.529113102; KVM: 5.781662448
  • 30. Cloudy Ops: VM Reboot
      [Chart: Average Server Reboot Time, in seconds] docker: 2.577879581; KVM: 124.433239
  • 31. Cloudy Ops: VM Delete
      [Chart: Average Server Delete Time, in seconds] docker: 3.567586041; KVM: 3.479760051
  • 32. Cloudy Ops: VM Snapshot
      [Chart: Average Snapshot Server Time, in seconds] docker: 36.88756394; KVM: 48.02313805
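The four averages reported on slides 29 to 32 are easier to compare as ratios. The sketch below takes the numbers from those slides (assuming the first value on each chart belongs to docker and the second to KVM, as the legends suggest) and computes how many times longer each operation took on KVM; the dictionary and function names are illustrative only.

```python
# Average seconds per operation, copied from slides 29-32.
AVERAGE_SECONDS = {
    "boot": {"docker": 3.529113102, "kvm": 5.781662448},
    "reboot": {"docker": 2.577879581, "kvm": 124.433239},
    "delete": {"docker": 3.567586041, "kvm": 3.479760051},
    "snapshot": {"docker": 36.88756394, "kvm": 48.02313805},
}


def kvm_to_docker_ratio(operation):
    """Return KVM time divided by docker time for one operation."""
    times = AVERAGE_SECONDS[operation]
    return times["kvm"] / times["docker"]


if __name__ == "__main__":
    for operation in AVERAGE_SECONDS:
        print(f"{operation:9s} KVM / docker = {kvm_to_docker_ratio(operation):.2f}x")
```

A ratio above 1 means docker was faster for that operation; delete is the one case where the two are nearly even, with KVM marginally ahead.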
  • 33. Cloudy Performance: Steady State Packing
      [Charts: compute node CPU usage in percent (usr, sys) over time, full test duration]
      • Docker averages: usr 0.54, sys 0.17
      • KVM averages: usr 7.64, sys 1.4
  • 34. Cloudy Performance: Steady State Packing
      [Charts: compute node steady-state CPU usage in percent (usr, sys) over time]
      • Docker, segment 31s – 243s, averages: usr 0.2, sys 0.03
      • KVM, segment 95s – 307s, averages: usr 1.91, sys 0.36
  • 35. Cloudy Performance: Steady State Packing
      [Chart: Docker / KVM compute node used memory over time (overlay)]
      • docker: delta 734 MB, per VM 49 MB
      • KVM: delta 4387 MB, per VM 292 MB
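The per-VM figures on this slide follow from dividing each memory delta by the 15 VMs packed in the steady-state scenario (slide 25). A short check of that arithmetic, with hypothetical variable names:

```python
VM_COUNT = 15  # VMs booted in the steady-state packing scenario

# Compute-node memory growth in MB, from the slide.
MEMORY_DELTA_MB = {"docker": 734, "kvm": 4387}


def per_vm_mb(delta_mb, vm_count=VM_COUNT):
    """Average memory growth per VM, rounded to the nearest MB."""
    return round(delta_mb / vm_count)


if __name__ == "__main__":
    for name, delta in MEMORY_DELTA_MB.items():
        print(f"{name}: {per_vm_mb(delta)} MB per VM")
    ratio = MEMORY_DELTA_MB["kvm"] / MEMORY_DELTA_MB["docker"]
    print(f"KVM used about {ratio:.1f}x the memory of docker")
```

The rounded results match the 49 MB and 292 MB shown on the slide.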
  • 36. Cloudy Performance: Serial VM Boot
      [Charts: compute node CPU usage in percent (usr, sys) over time]
      • Docker averages: usr 1.39, sys 0.57
      • KVM averages: usr 13.45, sys 2.23
  • 37. Cloudy Performance: Serial VM Boot
      [Chart: Docker / KVM serial VM boot usr CPU in percent over time, segment 8s – 58s; series docker(8-58) and kvm(8-58), each with a linear trend line]
      • Linear trend lines shown: y = 0.009x + 1.008 and y = 0.358x + 1.063
  • 38. Cloudy Performance: Serial VM Boot
      [Chart: Docker / KVM compute node memory used over time (unnormalized overlay); series kvm and docker]
  • 39. Cloudy Performance: Serial VM Boot
      [Chart: Docker / KVM serial VM boot memory usage over time, segment 1s – 67s; series docker and kvm, each with a linear trend line]
      • Linear trend lines shown: y = 1E+07x + 1E+09 and y = 3E+07x + 1E+09
  • 40. Guest Ops: Network
      [Chart: Network Throughput, in 10^6 bits/second] docker: 940.26; KVM: 940.56
  • 41. Guest Ops: Near Bare Metal Performance
      • Typical docker LXC performance near par with bare metal
      [Chart: linpack performance @ 45000, GFlops by vcpus] Labeled values: 220.77 (bare metal), 220.5 @ 32 vcpu, 220.9 @ 31 vcpu
      [Chart: Memory Benchmark Performance in MiB/s for MEMCPY, DUMB and MCBLOCK; series bare metal, docker and KVM]
  • 42. Guest Ops: File I/O Random Read / Write
      [Chart: Sysbench synchronous file I/O random read/write @ R/W ratio of 1.50; total transferred in Kb/sec by thread count (1, 2, 4, 8, 16, 32, 64); series docker and KVM]
  • 43. Guest Ops: MySQL OLTP
      [Chart: MySQL OLTP random transactional R/W (60s); total transactions by thread count (1, 2, 4, 8, 16, 32, 64); series docker and KVM]
  • 44. Guest Ops: MySQL Indexed Insertion
      [Chart: MySQL indexed insertion @ 100K intervals; seconds per 100K insertion batch by table size in rows (100000 to 1000000); series docker and kvm]
  • 45. Cloud Management Impacts on LXC
      [Chart: Docker: Boot Container, CLI vs Nova Virt, in seconds] docker cli: 0.17; nova-docker: 3.529113102
      • Cloud management often caps true ops performance of LXC
  • 46. Ubuntu MySQL Image Size
      [Chart: Docker / KVM: Ubuntu MySQL, size in MB] docker: 381.5; kvm: 1080
      • Out of the box JeOS images for docker are lightweight
  • 47. LXC In Summary
      • Near bare metal performance in the guest
      • Fast operations in the Cloud
      • Reduced resource consumption (CPU, MEM) on the compute node
      • Out of the box smaller image footprint
  • 48. LXC Gaps
      There are gaps…
      • Lack of industry tooling / support
      • Live migration still a WIP
      • Full orchestration across resources (compute / storage / networking)
      • Fears of security
      • Not a well known technology… yet
      • Integration with existing virtualization and Cloud tooling
      • Few, if any, industry standards
      • Missing skillset
      • Slower upstream support due to kernel dev process
      • Memory / CPU proc FS not cgroup aware
      • Etc.
  • 49. References & Related Links
      • http://www.slideshare.net/BodenRussell/realizing-linux-containerslxc
      • http://bodenr.blogspot.com/2014/05/kvm-and-docker-lxc-benchmarking-with.html
      • https://www.docker.io/
      • http://sysbench.sourceforge.net/
      • http://dag.wiee.rs/home-made/dstat/
      • http://www.openstack.org/
      • https://wiki.openstack.org/wiki/Rally
      • https://wiki.openstack.org/wiki/Docker
      • http://devstack.org/
      • http://www.linux-kvm.org/page/Main_Page
      • https://github.com/stackforge/nova-docker
      • https://github.com/dotcloud/docker-registry
      • http://www.netperf.org/netperf/
      • http://www.tokutek.com/products/iibench/
      • http://www.brendangregg.com/activebenchmarking.html
      • http://wiki.openvz.org/Performance
  • 50. IBM Sponsored Sessions
      • Monday, May 12 – Room B314
          – 12:05-12:45: OpenStack is Rockin’ the OpenCloud Movement! Who’s Next to Join the Band? Angel Diaz, VP Open Technology and Cloud Labs; David Lindquist, IBM Fellow, VP, CTO Cloud & Smarter Infrastructure
      • Wednesday, May 14 – Room B312
          – 9:00-9:40: Getting from enterprise ready to enterprise bliss: why OpenStack and IBM is a match made in Cloud heaven. Todd Moore, Director, Open Technologies and Partnerships
          – 9:50-10:30: Taking OpenStack beyond Infrastructure with IBM SmartCloud Orchestrator. Andrew Trossman, Distinguished Engineer, IBM Common Cloud Stack and SmartCloud Orchestrator
          – 11:00-11:40: IBM, SoftLayer and OpenStack: present and future. Michael Fork, Cloud Architect
          – 11:50-12:30: IBM and OpenStack: Enabling Enterprise Cloud Solutions Now. Tammy Van Hove, Distinguished Engineer, Software Defined Systems
  • 51. IBM Technical Sessions
      • Monday, May 12: 3:40 - 4:20, 3:40 - 4:20
      • Tuesday, May 13: 11:15 - 11:55, 2:00 - 2:40, 5:30 - 6:10, 5:30 - 6:10
      • Wednesday, May 14: 9:50 - 10:30, 2:40 - 3:20
      • Thursday, May 15: 9:50 - 10:30, 1:30 - 2:10, 2:20 - 3:00
  • 52. Be sure to stop by the IBM booth to see some demos and get your rockin’ OpenStack t-shirt while they last. Don’t miss Monday evening’s booth crawl where you can enjoy Atlanta’s own SWEET WATER IPA! Thank you!