devconf.cz 2016
QEMU Disk IO
Which Performs Better:
Native or Threads?
Pradeep Kumar Surisetty
Red Hat, Inc.
devconf.cz, February 2016
Outline
devconf.cz 2016
● KVM I/O Architecture
● Storage transport choices in KVM
● Virtio-blk Storage Configurations
● Performance Benchmark Tools
● Challenges
● Performance Results with Native & Threads
● Limitations
● Future Work
KVM I/O Architecture
[Diagram: host hardware CPUs run QEMU's hardware emulation, with one vcpu thread per guest CPU plus an iothread; inside the KVM guest, applications sit on the guest kernel's file system & block drivers; on the host side sit KVM (kvm.ko), the host file systems and block devices, and the physical drivers]
QEMU generates I/O requests to the host on the guest's behalf and handles events.
Notes:
Each guest CPU has a dedicated vcpu thread that uses the kvm.ko module to execute guest code.
There is an I/O thread that runs a select(2) loop to handle events.
devconf.cz 2016
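As a rough illustration of those notes, here is a minimal, self-contained C sketch of the pattern (illustration only, not QEMU's actual main loop; all names in it are made up): a dedicated I/O thread blocks in select(2) on an eventfd, and another thread, standing in for a vcpu thread, kicks it.

/* Build with: gcc -pthread io_loop_demo.c */
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/eventfd.h>
#include <sys/select.h>

/* Simplified stand-in for the QEMU "I/O thread": wait in select(2)
 * on an eventfd and handle an event whenever it is signaled. */
static void *io_thread(void *arg)
{
    int efd = *(int *)arg;

    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(efd, &rfds);

        if (select(efd + 1, &rfds, NULL, NULL, NULL) < 0)
            break;

        if (FD_ISSET(efd, &rfds)) {
            uint64_t count;
            if (read(efd, &count, sizeof(count)) == sizeof(count))
                printf("I/O thread woken, %llu pending event(s)\n",
                       (unsigned long long)count);
        }
    }
    return NULL;
}

int main(void)
{
    int efd = eventfd(0, 0);
    pthread_t tid;

    pthread_create(&tid, NULL, io_thread, &efd);

    /* A "vcpu thread" (here just main) kicks the I/O thread by
     * writing to the eventfd. */
    uint64_t one = 1;
    if (write(efd, &one, sizeof(one)) != sizeof(one))
        perror("write");

    sleep(1);   /* give the I/O thread time to print before the demo exits */
    return 0;
}

QEMU's real loop multiplexes many file descriptors (ioeventfds, timers, character devices); this sketch only shows the wake-up mechanism.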
Storage transport choices in KVM
● Full virtualization: IDE, SATA, SCSI
  ● Good guest compatibility
  ● Lots of trap-and-emulate, poor performance
● Para-virtualization: virtio-blk, virtio-scsi
  ● Efficient guest ↔ host communication through the virtio ring buffer (virtqueue)
  ● Good performance; provides a more virtualization-friendly interface
  ● In the AIO case, io_submit() runs under the global QEMU mutex
devconf.cz 2016
Storage transport choices in KVM
● Device assignment (passthrough)
  ● Passes hardware to the guest; high-end usage, high performance
  ● Limited number of PCI devices
  ● Hard to live-migrate
devconf.cz 2016
Storage transport choices in KVM
[Diagram: full virtualization vs. para-virtualization I/O paths]
devconf.cz 2016
Ring buffer with para-virtualization
[Diagram: the guest's virtio device and virtio PCI controller share a vring with QEMU's virtio device model; the guest kicks QEMU when it places new requests on the ring]
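For reference, a minimal sketch of the data structures behind a virtqueue; the field layout mirrors the Linux UAPI header linux/virtio_ring.h (the struct definitions here are re-typed for illustration, not copied from QEMU or the kernel):

#include <stdint.h>

/* One descriptor: a guest-physical buffer the device should read or write. */
struct vring_desc {
    uint64_t addr;   /* guest-physical address of the buffer */
    uint32_t len;    /* buffer length in bytes */
    uint16_t flags;  /* e.g. chained descriptor, device-writable buffer */
    uint16_t next;   /* index of the next descriptor in a chain */
};

/* Ring of descriptor indices the guest makes available to the host. */
struct vring_avail {
    uint16_t flags;
    uint16_t idx;     /* where the guest will place the next entry */
    uint16_t ring[];  /* queue-size entries */
};

/* Entry the host writes back once a request has completed. */
struct vring_used_elem {
    uint32_t id;   /* head index of the completed descriptor chain */
    uint32_t len;  /* number of bytes written into the buffer */
};

struct vring_used {
    uint16_t flags;
    uint16_t idx;
    struct vring_used_elem ring[];
};

The guest fills the descriptor table, publishes indices through the available ring, kicks QEMU, and later reaps completions from the used ring; aio=native and aio=threads differ only in how QEMU then submits those buffers to the host kernel.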
Virtio-blk-data-plane:
● Accelerated data path for the para-virtualized block I/O driver
● Threads are defined with -object iothread,id=<id>, and the user can set up arbitrary device→iothread mappings (multiple devices can share an iothread)
● No need to acquire the big QEMU lock
[Diagrams: a KVM guest on the host kernel, first with virtio-blk served from the single QEMU event loop, then with virtio-blk-data-plane thread(s) that use Linux AIO and signal the guest via irqfd]
devconf.cz 2016
Virtio-blk Storage Configurations
Device-Backed Virtual Storage
[Diagram: guest applications use an LVM volume on virtual devices (/dev/vda, /dev/vdb, …); the KVM guest's para-virtualized drivers do direct I/O to the host's block devices (/dev/sda, /dev/sdb, …) on physical storage]
File-Backed Virtual Storage
[Diagram: guest applications use an LVM volume on virtual devices backed by a RAW or QCOW2 file on the host server; the guest's para-virtualized drivers do direct I/O, and the file lives on an LVM volume on physical storage]
devconf.cz 2016
OpenStack:
libvirt AIO mode for disk devices:
1) Asynchronous I/O (aio=native)
   Uses io_submit() calls
2) Synchronous I/O (aio=threads)
   Uses pread64()/pwrite64() calls
The default choice in OpenStack is aio=threads*
Ref: https://specs.openstack.org/openstack/novaspecs/specs/mitaka/approved/libvirt-aio-mode.html
* Before solving this problem
devconf.cz 2016
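To make the two libvirt modes concrete, here is a hedged C sketch (illustration only, not QEMU or libvirt code) of the two request paths: a synchronous pread(), like the pread64 calls issued by the aio=threads worker pool, followed by a Linux AIO io_submit()/io_getevents() pair, as used by aio=native. The file path is a placeholder, the file must already exist, and the example needs libaio:

/* Build with: gcc aio_demo.c -laio */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <libaio.h>

#define BUF_SIZE 4096

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "/tmp/testfile";   /* placeholder */
    int fd = open(path, O_RDONLY | O_DIRECT);                  /* cache='none' style */
    if (fd < 0) { perror("open"); return 1; }

    void *buf;
    if (posix_memalign(&buf, 4096, BUF_SIZE)) return 1;        /* O_DIRECT needs alignment */

    /* aio=threads style: a plain synchronous positional read
     * (QEMU issues these from a pool of worker threads). */
    if (pread(fd, buf, BUF_SIZE, 0) < 0)
        perror("pread");

    /* aio=native style: Linux AIO via io_submit(2). */
    io_context_t ctx = 0;
    if (io_setup(8, &ctx) < 0) { perror("io_setup"); return 1; }

    struct iocb cb;
    struct iocb *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, BUF_SIZE, 0);

    if (io_submit(ctx, 1, cbs) != 1)                 /* hand the request to the kernel */
        perror("io_submit");

    struct io_event ev;
    if (io_getevents(ctx, 1, 1, &ev, NULL) == 1)     /* wait for completion */
        printf("native AIO read completed: %ld bytes\n", (long)ev.res);

    io_destroy(ctx);
    close(fd);
    free(buf);
    return 0;
}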
Example XML
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none' io='native'/>
  <source file='/home/psuriset/xfs/vm2-native-ssd.qcow2'/>
  <target dev='vdb' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>

<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none' io='threads'/>
  <source file='/home/psuriset/xfs/vm2-threads-ssd.qcow2'/>
  <target dev='vdc' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</disk>
devconf.cz 2016
CPU Usage with aio=Native
devconf.cz 2016
CPU Usage with aio=Threads
devconf.cz 2016
Multiple layers evaluated with virtio-blk
● File systems: Ext4, XFS
● NFS
● Disks: SSD, HDD
● Image formats: Qcow2, Qcow2 (with falloc), Qcow2 (with fallocate), Raw (preallocated)
● Backends: File, block device
● Jobs: Seq Read, Seq Write, Rand Read, Rand Write, Rand Read Write
● Block sizes: 4k, 16k, 64k, 256k
● Number of VMs: 1, 16 (concurrent)
devconf.cz 2016
Test Environment
Hardware
● 2 x Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
● 256 GiB memory @1866MHz
● 1 x 1 TB NVMe PCI SSD
● 1 x 500 GB HDD
Software
● Host: RHEL 7.2 (kernel 3.10.0-327)
● QEMU: 2.3.0-31 + AIO merge patch
● VM: RHEL 7.2
devconf.cz 2016
Tools
What is Pbench?
pbench (perf bench) aims to:
● Provide easy access to benchmarking & performance tools on Linux systems
● Standardize the collection of telemetry and configuration information
● Automate benchmark execution
● Produce effective visualizations for analysis and allow ingestion into Elasticsearch
devconf.cz 2016
Pbench Continued...
Tool visualization:
sar tool, total cpu consumption:
devconf.cz 2016
Pbench Continued ..
tool visualization:
iostat tool, disk request size:
devconf.cz 2016
Pbench Continued ..
tool visualization:
proc-interrupts tool, function call interrupts/sec:
devconf.cz 2016
Pbench Continued ..
tool visualization:
proc-vmstat tool, numa stats: entries in /proc/vmstat which begin with “numa_” (delta/sec)
devconf.cz 2016
Pbench Continued ..
pbench benchmarks
Example: fio benchmark
# pbench_fio --config=baremetal-hdd
Runs a default set of iterations: [read, rand-read] * [4KB, 8KB, …, 64KB]
Takes 5 samples per iteration and computes avg, stddev
Handles start/stop/post-processing of tools for each iteration
Other fio options:
--targets=<devices or files>
--ioengine=[sync, libaio, others]
--test-types=[read,randread,write,randwrite,randrw]
--block-sizes=[<int>,[<int>]] (in KB)
devconf.cz 2016
FIO: Flexible IO Tester
● I/O type
  Defines the I/O pattern issued to the file(s). We may only be reading sequentially from the file(s), or we may be writing randomly, or even mixing reads and writes, sequentially or randomly.
● Block size
  How large are the chunks we issue I/O in? This may be a single value, or it may describe a range of block sizes.
● I/O size
  How much data are we going to be reading/writing?
● I/O engine
  How do we issue I/O? We could be memory-mapping the file, using regular read/write, using splice, async I/O, syslet, or even SG (SCSI generic sg).
● I/O depth
  If the I/O engine is async, how large a queue depth do we want to maintain?
● Buffered vs. direct
  Should we be doing buffered I/O, or direct/raw I/O?
devconf.cz 2016
Guest & host iostat during 4K seq read with aio=native
[Graphs: guest and host iostat]
devconf.cz 2016
AIO Native
● aio=native uses Linux AIO io_submit(2) for read and write requests; request completion is signaled using eventfd.
● Virtqueue kicks are handled in the iothread. When the guest writes to the virtqueue kick hardware register, the kvm.ko module signals the ioeventfd that the main loop thread is monitoring.
● Requests are collected from the virtqueue and submitted (after write-request merging) via either aio=threads or aio=native.
● Request completion callbacks are invoked in the main loop thread and an interrupt is injected into the guest.
[Diagram: guest virtio device and virtio PCI controller sharing a vring with QEMU, with the guest kick path]
devconf.cz 2016
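The eventfd-based completion signalling mentioned above can be sketched with libaio's io_set_eventfd() helper. This is an assumption-level illustration of the mechanism, not QEMU's code; the path is a placeholder and the file must exist:

/* Build with: gcc aio_eventfd_demo.c -laio */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <libaio.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "/tmp/testfile";   /* placeholder */
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    int efd = eventfd(0, 0);          /* completion notifications land here */

    io_context_t ctx = 0;
    if (io_setup(8, &ctx) < 0) { perror("io_setup"); return 1; }

    void *buf;
    if (posix_memalign(&buf, 4096, 4096)) return 1;

    struct iocb cb;
    struct iocb *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, 4096, 0);
    io_set_eventfd(&cb, efd);         /* ask the kernel to signal efd on completion */

    if (io_submit(ctx, 1, cbs) != 1) { perror("io_submit"); return 1; }

    /* An event loop (QEMU's main loop or an iothread) would poll efd along
     * with its other descriptors; here we simply block on it. */
    uint64_t completions;
    if (read(efd, &completions, sizeof(completions)) == sizeof(completions)) {
        struct io_event ev;
        io_getevents(ctx, 1, 1, &ev, NULL);          /* reap the completed request */
        printf("%llu completion(s) signaled, res=%ld\n",
               (unsigned long long)completions, (long)ev.res);
    }

    io_destroy(ctx);
    close(fd);
    free(buf);
    return 0;
}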
Challenges for Read with aio=native
● virtio-blk does *not* merge read requests in qemu-kvm; it only merges write requests.
● QEMU submits each 4 KB request through a separate io_submit() call.
● QEMU would submit only one request at a time even though multiple requests are waiting to be processed.
● A batching method was implemented for both virtio-scsi and virtio-blk-data-plane disks.
Batch Submission
What is I/O batch submission?
● Handle multiple requests in a single system call (io_submit), so the number of io_submit() syscalls can be decreased significantly
Abstracting with generic interfaces
● bdrv_io_plug( ) / bdrv_io_unplug( )
● Merged in fc73548e444ae3239f6cef44a5200b5d2c3e85d1 (virtio-blk: submit I/O as a batch)
devconf.cz 2016
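The plug/unplug idea can be shown with a hedged C sketch (not QEMU's bdrv_io_plug()/bdrv_io_unplug() implementation): prepare several iocbs first, then hand all of them to the kernel in a single io_submit() call instead of one syscall per 4 KB request. Path and sizes are placeholders:

/* Build with: gcc aio_batch_demo.c -laio */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <libaio.h>

#define NR_REQS 8
#define BLK     4096

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "/tmp/testfile";   /* placeholder */
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    io_context_t ctx = 0;
    if (io_setup(NR_REQS, &ctx) < 0) { perror("io_setup"); return 1; }

    struct iocb cbs[NR_REQS];
    struct iocb *ptrs[NR_REQS];
    for (int i = 0; i < NR_REQS; i++) {
        void *buf;
        if (posix_memalign(&buf, BLK, BLK)) return 1;
        /* "plug": only prepare the requests, do not submit yet */
        io_prep_pread(&cbs[i], fd, buf, BLK, (long long)i * BLK);
        ptrs[i] = &cbs[i];
    }

    /* "unplug": submit the whole batch with ONE io_submit() call,
     * instead of NR_REQS separate syscalls. */
    int submitted = io_submit(ctx, NR_REQS, ptrs);
    printf("submitted %d requests in one syscall\n", submitted);

    struct io_event evs[NR_REQS];
    if (submitted > 0)
        io_getevents(ctx, submitted, NR_REQS, evs, NULL);   /* reap all completions */

    io_destroy(ctx);
    close(fd);
    return 0;
}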
Performance Comparison Graphs
devconf.cz 2016
Results

Disk: SSD | FS: None (used LVM) | Image: raw | Preallocated: yes
  Single VM: aio=threads has better performance with LVM; 4K read/randread performance is 10-15% higher and 4K write is 26% higher.
  16 VMs: Native & threads perform equally in most cases, but native does better in a few cases.

Disk: SSD | FS: EXT4 | Image: raw | Preallocated: yes
  Single VM: Native performs well with randwrite, write, and randread-write; threads 4K read is 10-15% higher and 4K randread is 8% higher.
  16 VMs: Both have similar results; 4K seq reads are 1% higher with threads.

Disk: SSD | FS: XFS | Image: raw | Preallocated: yes
  Single VM: aio=threads has better performance.
  16 VMs: Native & threads perform equally in most cases, but native does better in a few cases; threads is better in seq writes.

Disk: SSD | FS: EXT4 | Image: raw | Preallocated: yes | NFS: yes
  Single VM: Native performs well with randwrite, write, and randread-write; threads do well with 4K/16K read and randread, by 12%.
  16 VMs: Native & threads perform equally in most cases, but native does better in a few cases.
devconf.cz 2016
Disk: SSD | FS: XFS | Image: raw | Preallocated: yes | NFS: yes
  Single VM: Native performs well with all tests except read & randread, where threads performs better.
  16 VMs: Native performs well with all tests.

Disk: SSD | FS: EXT4 | Image: qcow2 | Preallocated: no
  Single VM: Native does well with all tests; threads outperforms native by <10% for read and randread.
  16 VMs: Native is better than threads in most cases; seq reads are 10-15% higher with native.

Disk: SSD | FS: XFS | Image: qcow2 | Preallocated: no
  Single VM: Native performs well with all tests except seq read, where threads is 6% higher.
  16 VMs: Native performs better than threads except seq write, where threads is 8% higher.

Disk: SSD | FS: EXT4 | Image: qcow2 | Preallocated: with falloc (using qemu-img)
  Single VM: Native is optimal for almost all tests; threads is slightly better (<10%) for seq reads.
  16 VMs: Native is optimal with randwrite, write, and randread-write; threads has slightly better performance for read and randread.

Disk: SSD | FS: XFS | Image: qcow2 | Preallocated: with falloc
  Single VM: Native is optimal for write and randread-write; threads is better (<10%) for read and randread.
  16 VMs: Native is optimal for all tests; threads is better for seq writes.

Disk: SSD | FS: EXT4
  Single VM: Native performs better for randwrite, write, and randread-write; threads does better for read and randread, with 4K/16K read and randread 12% higher.
  16 VMs: Native outperforms threads.
Disk: SSD | FS: XFS | Image: qcow2 | Preallocated: with fallocate
  Single VM: Native is optimal for randwrite, write, and randread; threads is better (<10%) for read.
  16 VMs: Native is optimal for all tests; threads is optimal for randread and 4K seq write.

Disk: HDD | FS: None (used LVM) | Image: raw | Preallocated: yes
  Single VM: Native outperforms threads in all tests.
  16 VMs: Native outperforms threads in all tests.

Disk: HDD | FS: EXT4 | Image: raw | Preallocated: yes
  Single VM: Native outperforms threads in all tests.
  16 VMs: Native outperforms threads in all tests.

Disk: HDD | FS: XFS | Image: raw | Preallocated: yes
  Single VM: Native outperforms threads in all tests.
  16 VMs: Native is optimal or equal in all tests.

Disk: HDD | FS: EXT4 | Image: qcow2 | Preallocated: no
  Single VM: Native is optimal or equal in all test cases except randread, where threads is 30% higher.
  16 VMs: Native is optimal except for 4K seq reads.

Disk: HDD | FS: XFS | Image: qcow2 | Preallocated: no
  Single VM: Native is optimal except for seq writes, where threads is 30% higher.
  16 VMs: Native is optimal except for 4K seq reads.

Disk: HDD | FS: EXT4
  Single VM: Native is optimal or equal in all cases except randread, where threads is 30% higher.
  16 VMs: Native is optimal or equal in all tests except 4K read.
Disk: HDD | FS: XFS | Image: qcow2 | Preallocated: with falloc (using qemu-img)
  Single VM: Native is optimal or equal in all cases except seq write, where threads is 30% higher.
  16 VMs: Native is optimal or equal in all cases except for 4K randread.

Disk: HDD | FS: EXT4 | Image: qcow2 | Preallocated: with fallocate
  Single VM: Native is optimal or equal in all tests.
  16 VMs: Native is optimal or equal in all tests except 4K randread, where threads is 15% higher.

Disk: HDD | FS: XFS | Image: qcow2 | Preallocated: with fallocate
  Single VM: Native is optimal in all tests except for seq write, where threads is 30% higher.
  16 VMs: Native is better; threads has slightly better performance (<3-4%), excluding randread where threads is 30% higher.
devconf.cz 2016
Performance Graphs
devconf.cz 2016
1. Disk: SSD, Image: raw, Preallocated: yes, VMs: 16
[Graphs: RandRead, RandReadWrite, RandWrite, Seq Read, Seq Write; series: LVM (no FS), EXT4, and XFS, each with aio=native and aio=threads]
devconf.cz 2016
2. Disk: HDD, Image: raw, Preallocated: yes, VMs: 16
[Graphs: RandRead, RandReadWrite, RandWrite, Seq Read, Seq Write; same six series as above]
devconf.cz 2016
3. Disk: SSD, Image: raw, Preallocated: yes, VMs: 1
[Graphs: RandRead, RandReadWrite, Seq Read, Seq Write; same six series as above]
devconf.cz 2016
4. Disk: HDD, Image: raw, Preallocated: yes, VMs: 1
[Graphs: RandRead, RandReadWrite, RandWrite, Seq Read, Seq Write; same six series as above]
devconf.cz 2016
5. Disk: SSD, Image: qcow2, VMs: 16
[Graphs: RandRead, RandReadWrite, RandWrite, Seq Read, Seq Write; 12 series: EXT4 and XFS qcow2 images with no preallocation, falloc preallocation, and fallocate preallocation, each with aio=native and aio=threads]
devconf.cz 2016
6. Disk: HDD, Image: qcow2, VMs: 16
[Graphs: RandRead, RandReadWrite, RandWrite, Seq Read; same 12 series as above]
devconf.cz 2016
7. Disk: SSD, Image: qcow2, VMs: 1
[Graphs: RandRead, RandReadWrite, RandWrite, Seq Read, Seq Write; same 12 series as above]
devconf.cz 2016
8. Disk: HDD, Image: qcow2, VMs: 1
[Graphs: RandRead, RandReadWrite, RandWrite, Seq Read, Seq Write; same 12 series as above]
devconf.cz 2016
9. Disk: SSD, Image: raw, NFS: yes, VMs: 1
[Graphs: RandRead, RandReadWrite, RandWrite, Seq Read, Seq Write; series: EXT4 and XFS, each with aio=native and aio=threads]
devconf.cz 2016
https://review.openstack.org/#/c/232514/7/specs/mitaka/approved/libvirt-aio-mode.rst
devconf.cz 2016
Performance Brief
● https://access.redhat.com/articles/2147661
devconf.cz 2016
Conclusion & Limitations
● Throughput increased a lot because the I/O thread uses less CPU to submit I/O
● aio=native is the preferable choice, with a few limitations:
● Native AIO can block the VM if the file is not fully allocated and is therefore not recommended for use on sparse files.
● Writes to sparsely allocated files are more likely to block than writes to fully preallocated files. Therefore it is recommended to only use aio=native on fully preallocated files, local disks, or logical volumes.
devconf.cz 2016
Future work
devconf.cz 2016
● Evaluate virtio data-plane performance
● Reduce CPU utilization for aio=threads and consider …
Questions
devconf.cz 2016
References
● Stefan Hajnoczi, Optimizing the QEMU Storage Stack, Linux Plumbers 2010
● Asias He, Virtio-blk Performance Improvement, KVM Forum 2012
● Khoa Huynh, Exploiting the Latest KVM Features for Optimized Virtualized Enterprise Storage Performance, LinuxCon 2012
● Pbench: http://distributed-system-analysis.github.io/pbench/
  https://github.com/distributed-system-analysis/pbench
● FIO: https://github.com/axboe/fio/
devconf.cz 2016
Special Thanks to
Andrew Theurer
Stefan Hajnoczi
Thanks
IRC: #psuriset
Blog: psuriset.com
More Related Content

What's hot

Ceph RBD Update - June 2021
Ceph RBD Update - June 2021Ceph RBD Update - June 2021
Ceph RBD Update - June 2021Ceph Community
 
Linux Memory Management
Linux Memory ManagementLinux Memory Management
Linux Memory ManagementNi Zo-Ma
 
Physical Memory Models.pdf
Physical Memory Models.pdfPhysical Memory Models.pdf
Physical Memory Models.pdfAdrian Huang
 
Page cache in Linux kernel
Page cache in Linux kernelPage cache in Linux kernel
Page cache in Linux kernelAdrian Huang
 
UEFI時代のブートローダ
UEFI時代のブートローダUEFI時代のブートローダ
UEFI時代のブートローダTakuya ASADA
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016Brendan Gregg
 
XPDDS18: EPT-Based Sub-page Write Protection On Xenc - Yi Zhang, Intel
XPDDS18: EPT-Based Sub-page Write Protection On Xenc - Yi Zhang, IntelXPDDS18: EPT-Based Sub-page Write Protection On Xenc - Yi Zhang, Intel
XPDDS18: EPT-Based Sub-page Write Protection On Xenc - Yi Zhang, IntelThe Linux Foundation
 
Memory Compaction in Linux Kernel.pdf
Memory Compaction in Linux Kernel.pdfMemory Compaction in Linux Kernel.pdf
Memory Compaction in Linux Kernel.pdfAdrian Huang
 
Windows Server 2019 で Container を使ってみる
Windows Server 2019 で Container を使ってみるWindows Server 2019 で Container を使ってみる
Windows Server 2019 で Container を使ってみるKazuki Takai
 
Linux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringLinux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringGeorg Schönberger
 
[232] 성능어디까지쥐어짜봤니 송태웅
[232] 성능어디까지쥐어짜봤니 송태웅[232] 성능어디까지쥐어짜봤니 송태웅
[232] 성능어디까지쥐어짜봤니 송태웅NAVER D2
 
10分で分かるデータストレージ
10分で分かるデータストレージ10分で分かるデータストレージ
10分で分かるデータストレージTakashi Hoshino
 
Linux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKBLinux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKBshimosawa
 
I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜Ryousei Takano
 
Reverse Mapping (rmap) in Linux Kernel
Reverse Mapping (rmap) in Linux KernelReverse Mapping (rmap) in Linux Kernel
Reverse Mapping (rmap) in Linux KernelAdrian Huang
 
Linux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionLinux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionGene Chang
 
from Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Worksfrom Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu WorksZhen Wei
 

What's hot (20)

Ceph RBD Update - June 2021
Ceph RBD Update - June 2021Ceph RBD Update - June 2021
Ceph RBD Update - June 2021
 
Linux Memory Management
Linux Memory ManagementLinux Memory Management
Linux Memory Management
 
Physical Memory Models.pdf
Physical Memory Models.pdfPhysical Memory Models.pdf
Physical Memory Models.pdf
 
Page cache in Linux kernel
Page cache in Linux kernelPage cache in Linux kernel
Page cache in Linux kernel
 
Deploying IPv6 on OpenStack
Deploying IPv6 on OpenStackDeploying IPv6 on OpenStack
Deploying IPv6 on OpenStack
 
UEFI時代のブートローダ
UEFI時代のブートローダUEFI時代のブートローダ
UEFI時代のブートローダ
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016
 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
 
XPDDS18: EPT-Based Sub-page Write Protection On Xenc - Yi Zhang, Intel
XPDDS18: EPT-Based Sub-page Write Protection On Xenc - Yi Zhang, IntelXPDDS18: EPT-Based Sub-page Write Protection On Xenc - Yi Zhang, Intel
XPDDS18: EPT-Based Sub-page Write Protection On Xenc - Yi Zhang, Intel
 
Memory Compaction in Linux Kernel.pdf
Memory Compaction in Linux Kernel.pdfMemory Compaction in Linux Kernel.pdf
Memory Compaction in Linux Kernel.pdf
 
Windows Server 2019 で Container を使ってみる
Windows Server 2019 で Container を使ってみるWindows Server 2019 で Container を使ってみる
Windows Server 2019 で Container を使ってみる
 
Linux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringLinux Performance Profiling and Monitoring
Linux Performance Profiling and Monitoring
 
[232] 성능어디까지쥐어짜봤니 송태웅
[232] 성능어디까지쥐어짜봤니 송태웅[232] 성능어디까지쥐어짜봤니 송태웅
[232] 성능어디까지쥐어짜봤니 송태웅
 
10分で分かるデータストレージ
10分で分かるデータストレージ10分で分かるデータストレージ
10分で分かるデータストレージ
 
Linux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKBLinux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKB
 
I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜
 
Reverse Mapping (rmap) in Linux Kernel
Reverse Mapping (rmap) in Linux KernelReverse Mapping (rmap) in Linux Kernel
Reverse Mapping (rmap) in Linux Kernel
 
Linux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionLinux MMAP & Ioremap introduction
Linux MMAP & Ioremap introduction
 
from Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Worksfrom Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Works
 
systemd
systemdsystemd
systemd
 

Similar to QEMU Disk IO Which performs Better: Native or threads?

Achieving the ultimate performance with KVM
Achieving the ultimate performance with KVM Achieving the ultimate performance with KVM
Achieving the ultimate performance with KVM ShapeBlue
 
Achieving the ultimate performance with KVM
Achieving the ultimate performance with KVMAchieving the ultimate performance with KVM
Achieving the ultimate performance with KVMStorPool Storage
 
Malware analysis
Malware analysisMalware analysis
Malware analysisxabean
 
Talk 160920 @ Cat System Workshop
Talk 160920 @ Cat System WorkshopTalk 160920 @ Cat System Workshop
Talk 160920 @ Cat System WorkshopQuey-Liang Kao
 
Get Your GeekOn with Ron - Session One: Designing your VDI Servers
Get Your GeekOn with Ron - Session One: Designing your VDI ServersGet Your GeekOn with Ron - Session One: Designing your VDI Servers
Get Your GeekOn with Ron - Session One: Designing your VDI ServersUnidesk Corporation
 
The Unofficial VCAP / VCP VMware Study Guide
The Unofficial VCAP / VCP VMware Study GuideThe Unofficial VCAP / VCP VMware Study Guide
The Unofficial VCAP / VCP VMware Study GuideVeeam Software
 
Virtualization overheads
Virtualization overheadsVirtualization overheads
Virtualization overheadsSandeep Joshi
 
Achieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMAchieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMDevOps.com
 
Current and Future of Non-Volatile Memory on Linux
Current and Future of Non-Volatile Memory on LinuxCurrent and Future of Non-Volatile Memory on Linux
Current and Future of Non-Volatile Memory on Linuxmountpoint.io
 
Achieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMAchieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMdata://disrupted®
 
HKG15-401: Ceph and Software Defined Storage on ARM servers
HKG15-401: Ceph and Software Defined Storage on ARM serversHKG15-401: Ceph and Software Defined Storage on ARM servers
HKG15-401: Ceph and Software Defined Storage on ARM serversLinaro
 
Swift Install Workshop - OpenStack Conference Spring 2012
Swift Install Workshop - OpenStack Conference Spring 2012Swift Install Workshop - OpenStack Conference Spring 2012
Swift Install Workshop - OpenStack Conference Spring 2012Joe Arnold
 
KVM and docker LXC Benchmarking with OpenStack
KVM and docker LXC Benchmarking with OpenStackKVM and docker LXC Benchmarking with OpenStack
KVM and docker LXC Benchmarking with OpenStackBoden Russell
 
Nytro-XV_NWD_VM_Performance_Acceleration
Nytro-XV_NWD_VM_Performance_AccelerationNytro-XV_NWD_VM_Performance_Acceleration
Nytro-XV_NWD_VM_Performance_AccelerationKhai Le
 
What’s Slowing Down Your Kafka Pipeline? With Ruizhe Cheng and Pete Stevenson...
What’s Slowing Down Your Kafka Pipeline? With Ruizhe Cheng and Pete Stevenson...What’s Slowing Down Your Kafka Pipeline? With Ruizhe Cheng and Pete Stevenson...
What’s Slowing Down Your Kafka Pipeline? With Ruizhe Cheng and Pete Stevenson...HostedbyConfluent
 
Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013dotCloud
 
Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013Docker, Inc.
 

Similar to QEMU Disk IO Which performs Better: Native or threads? (20)

Achieving the ultimate performance with KVM
Achieving the ultimate performance with KVM Achieving the ultimate performance with KVM
Achieving the ultimate performance with KVM
 
Achieving the ultimate performance with KVM
Achieving the ultimate performance with KVMAchieving the ultimate performance with KVM
Achieving the ultimate performance with KVM
 
Malware analysis
Malware analysisMalware analysis
Malware analysis
 
Talk 160920 @ Cat System Workshop
Talk 160920 @ Cat System WorkshopTalk 160920 @ Cat System Workshop
Talk 160920 @ Cat System Workshop
 
Get Your GeekOn with Ron - Session One: Designing your VDI Servers
Get Your GeekOn with Ron - Session One: Designing your VDI ServersGet Your GeekOn with Ron - Session One: Designing your VDI Servers
Get Your GeekOn with Ron - Session One: Designing your VDI Servers
 
The Unofficial VCAP / VCP VMware Study Guide
The Unofficial VCAP / VCP VMware Study GuideThe Unofficial VCAP / VCP VMware Study Guide
The Unofficial VCAP / VCP VMware Study Guide
 
Virtualization overheads
Virtualization overheadsVirtualization overheads
Virtualization overheads
 
oSC22ww4.pdf
oSC22ww4.pdfoSC22ww4.pdf
oSC22ww4.pdf
 
Achieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMAchieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVM
 
Current and Future of Non-Volatile Memory on Linux
Current and Future of Non-Volatile Memory on LinuxCurrent and Future of Non-Volatile Memory on Linux
Current and Future of Non-Volatile Memory on Linux
 
Achieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMAchieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVM
 
HKG15-401: Ceph and Software Defined Storage on ARM servers
HKG15-401: Ceph and Software Defined Storage on ARM serversHKG15-401: Ceph and Software Defined Storage on ARM servers
HKG15-401: Ceph and Software Defined Storage on ARM servers
 
Swift Install Workshop - OpenStack Conference Spring 2012
Swift Install Workshop - OpenStack Conference Spring 2012Swift Install Workshop - OpenStack Conference Spring 2012
Swift Install Workshop - OpenStack Conference Spring 2012
 
How swift is your Swift - SD.pptx
How swift is your Swift - SD.pptxHow swift is your Swift - SD.pptx
How swift is your Swift - SD.pptx
 
KVM and docker LXC Benchmarking with OpenStack
KVM and docker LXC Benchmarking with OpenStackKVM and docker LXC Benchmarking with OpenStack
KVM and docker LXC Benchmarking with OpenStack
 
Nytro-XV_NWD_VM_Performance_Acceleration
Nytro-XV_NWD_VM_Performance_AccelerationNytro-XV_NWD_VM_Performance_Acceleration
Nytro-XV_NWD_VM_Performance_Acceleration
 
What’s Slowing Down Your Kafka Pipeline? With Ruizhe Cheng and Pete Stevenson...
What’s Slowing Down Your Kafka Pipeline? With Ruizhe Cheng and Pete Stevenson...What’s Slowing Down Your Kafka Pipeline? With Ruizhe Cheng and Pete Stevenson...
What’s Slowing Down Your Kafka Pipeline? With Ruizhe Cheng and Pete Stevenson...
 
Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013
 
Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013
 
XS Boston 2008 Self IO Emulation
XS Boston 2008 Self IO EmulationXS Boston 2008 Self IO Emulation
XS Boston 2008 Self IO Emulation
 

Recently uploaded

SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 

Recently uploaded (20)

SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 

QEMU Disk IO Which performs Better: Native or threads?

  • 1. devconf.cz 2014 QEMU Disk IO Which performs Better: Native or threads? Pradeep Kumar Surisetty Red Hat, Inc. devconf.cz, February 2016
  • 2. Outline devconf.cz 2016 ● KVM IO Architecture ● Storage transport choices in KVM ● Virtio-blk Storage Configurations ● Performance Benchmark tools ● Challenges ● Performance Results with Native & Threads ● Limitations ● Future Work
  • 3. KVM I/O Architecture HARDWARE cpu0 …. cpuM HARDWARE cpu0 …. cpuM HARDWARE …. Applications File System & Block Drivers vcpu0 … vcpuN … vcpuN iothread cpu0 cpuM Applications File System & Block Drivers KVM GUEST KVM Guest’s Kernel vcpu0 iothread Hardware Emulation (QEMU) Generates I/O requests to host on guest’s behalf & handle events Notes: Each guest CPU has a dedicated vcpu thread that uses kvm.ko module to execute guest code There is an I/O thread that runs a select(2) loop to handle events devconf.cz 2016 KVM (kvm.ko) File Systems and Block Devices Physical Drivers
  • 4. Storage transport choices in KVM ● Full virtualization : IDE, SATA, SCSI ● Good guest compatibility ● Lots of trap-and-emulate, bad performance ● Para virtualization: virtio-blk, virtio-scsi ● Efficient guest ↔ host communication through virtio ring buffer (virtqueue) ● Good performance ● Provide more virtualization friendly interface, higher performance. ● In AIO case, io_submit() is under the global mutex devconf.cz 2016
  • 5. Storage transport choices in KVM ● Device assignment (Passthrough) ● Pass hardware to guest, high-end usage, high performance ● Limited Number of PCI Devices ● Hard for Live Migration devconf.cz 2016
  • 6. Full virtualization Para-virtualization Storage transport choices in KVM devconf.cz 2016
  • 7. Virtio PCI Controller Virtio Device vring Guest Qemu Virtio pci controller Virtio Device Kick Ring buffer with para virtualization
  • 8. Virtio-blk-data-plane: ● Accelerated data path for para-virtualized block I/O driver ● Threads are defined by -object othread,iothread=<id> and the user can set up arbitrary device->iothread mappings (multiple devices can share an iothread) ● No need to acquire big QEMU lock KVM Guest Host Kernel QEMU Event Loop Virtio-bl data-pla KVM Guest Host Kernel QEMU Event Loop vityio-blk- data-plane thread(s) Linux AIO irqfd devconf.cz 2016
  • 9. Virtio-blk Storage Configurations KVM Applications Guest LVM Volume on Virtual Devices Host Server KVM Guest LVM Volume on Virtual Devices Physical Storage Applications Direct I/O w/ Para-Virtualized Drivers Block Devices /dev/sda, /dev/sdb,… Device-Backed Virtual Storage Virtual Devices /dev/vda, /dev/vdb,… KVM Guest Applications LVM File Volume Host Server KVM Guest LVM Volume Physical Storage File Applications Para- virtualized Drivers, Direct I/O RAW or QCOW2 File-Backed Virtual Storage devconf.cz 2016
  • 10. Openstack: Libvirt: AIO mode for disk devices 1) Asynchronous IO (AIO=Native) Using io_submit calls 2) Synchronous (AIO=Threads) pread64, pwrite64 calls Default Choice in Openstack is aio=threads* Ref: https://specs.openstack.org/openstack/novaspecs/specs/mitaka/approved/libvirt-aio-mode.html * Before solving this problem devconf.cz 2016
  • 11. ● </disk> <disk type='file' device='disk'> <driver name='qemu' type='qcow2' cache='none' io='native'/> <source file='/home/psuriset/xfs/vm2-native-ssd.qcow2'/> <target dev='vdb' bus='virtio'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> </disk> ● <disk type='file' device='disk'> <driver name='qemu' type='qcow2' cache='none' io='threads'/> <source file='/home/psuriset/xfs/vm2-threads-ssd.qcow2'/> <target dev='vdc' bus='virtio'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/> </disk> Example XML devconf.cz 2016
  • 12. CPU Usage with aio=Native devconf.cz 2016
  • 13. CPU Usage with aio=Threads devconf.cz 2016
  • 14. ext4/XFS Ext4, XFS NFS SSD, HDD Qcow2, Qcow2 (With Falloc), Qcow2 (With Fallocate), Raw(With Preallocated) File, Block device Jobs: Seq Read, Seq Write,Rand Read, Rand Write, Rand Read Write Block Sizes: 4k, 16k, 64k, 256k Number of VM: 1, 16 (Concurrent) devconf.cz 2016 Multiple layers Evaluated with virtio-blk
  • 15. Test Environment Hardware ● 2 x Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz ● 256 GiB memory @1866MHz ● 1 x 1 TB NVMe PCI SSD ● 1 x 500 GB HDD Software ● Host: RHEL 7.2 :3.10.0-327 ● Qemu: 2.3.0-31 + AIO Merge Patch ● VM: RHEL 7.2 devconf.cz 2016
  • 16. Tools What is Pbench? pbench (perf bench) aims to: ● Provide easy access to benchmarking & performance tools on Linux systems ● Standardize the collection of telemetry and configuration Information ● Automate benchmark execution ● Output effective visualization for analysis allow for ingestion into elastic search devconf.cz 2016
  • 17. Pbench Continued... Tool visualization: sar tool, total cpu consumption: devconf.cz 2016
  • 18. Pbench Continued .. tool visualization: iostat tool, disk request size: devconf.cz 2016
  • 19. Pbench Continued .. tool visualization: proc-interrupts tool, function call interrupts/sec: devconf.cz 2016
  • 20. Pbench Continued .. tool visualization: proc-vmstat tool, numa stats: entries in /proc/vmstat which begin with “numa_” (delta/sec) devconf.cz 2016
  • 21. Pbench Continued .. pbench benchmarks example: fio benchmark # pbench_fio --config=baremetal-hdd runs a default set of iterations: [read,rand-read]*[4KB, 8KB….64KB] takes 5 samples per iteration and compute avg, stddev handles start/stop/post-process of tools for each iteration other fio options: --targets=<devices or files> --ioengine=[sync, libaio, others] --test-types=[read,randread,write,randwrite,randrw] --block-sizes=[<int>,[<int>]] (in KB) devconf.cz 2016
  • 22. FIO: Flexible IO Tester ● IO type Defines the io pattern issued to the file(s). We may only be reading sequentially from this file(s), or we may be writing randomly. Or even mixing reads and writes, sequentially or Randomly ● Block size In how large chunks are we issuing io? This may be a single value, or it may describe a range of block sizes. ● IO size How much data are we going to be reading/writing ● IO Engine How do we issue io? We could be memory mapping the file, we could be using regular read/write, we could be using splice, async io, syslet, or even SG (SCSI generic sg) ● IO depth If the io engine is async, how large a queuing depth do we want to maintain? ● IO Type Should we be doing buffered io, or direct/raw io? devconf.cz 2016
  • 23. Guest Host Guest & Host iostat during 4k seq read with aio=native devconf.cz 2016
  • 24. ● Aio=native uses Linux AIO io_submit(2) for read and write requests and Request completion is signaled using eventfd. ● Virtqueue kicks are handled in the iothread. When the guest writes to the virtqueue kick hardware register the kvm.ko module signals the ioeventfd which the main loop thread is monitoring. ● Requests are collected from the virtqueue and submitted (after write request merging) either via aio=threads or aio=native. ● Request completion callbacks are invoked in the main loop thread and an interrupt is injected into the guest. . AIO Native devconf.cz 2016 Virtio PCI Controller Virtio Device vring Guest Qemu Virtio pci controller Virtio Device Kick
  • 25. Challenges for Read with aio=native ● ● virtio-blk does *not* merge read requests in qemu-kvm. It only merges write requests. ● QEMU submits each 4 KB request through a separate io_submit() call. ● Qemu would submit only 1 request at a time though Multiple requests to process ● Batching method was implemented for both virtio-scsi and virtio-blk-data-plane disk
  • 26. Batch Submission What is I/O batch submission ● Handle more requests in one single system call(io_submit), so calling number of the syscall of io_submit can be decrease a lot Abstracting with generic interfaces ● bdrv_io_plug( ) / bdrv_io_unplug( ) ● merged in fc73548e444ae3239f6cef44a5200b5d2c3e85d1 (virtio-blk: submit I/O as a batch) devconf.cz 2016
  • 28. Test Specifications Single VM Results Multiple VM (16) Results Disk: SSD FS: None (used LVM) Image: raw Preallocated: yes aio=threads has better performance with LVM. 4K read,randread performance is 10-15% higher. 4K write is 26% higher. Native & threads perform equally in most cases but native does better in few cases. Disk: SSD FS: EXT4 Image: raw Preallocated: yes Native performs well with randwrite, write, and randread-write. Threads 4K read is 10-15% Higher.. 4K randread is 8% higher. Both have similar results. 4K seq reads: threads 1% higher. Disk: SSD FS: XFS Image: Raw Preallocated: yes aio=threads has better performance Native & threads perform equally in most cases but native does better in few cases. Threads better in seq writes Disk: SSD FS: EXT4 Image: raw Preallocated: yes NFS : yes Native performs well with randwrite, write and randread-write. Threads do well with 4K/16K read, randread by 12% higher. Native & threads perform equally in most cases but native does better in few cases. Results devconf.cz 2016
  • 29. Test Specifications Single VM Results Multiple VM (16) Results Disk: SSD FS: XFS Image: raw Preallocated: yes NFS : yes Native performs well with all tests except read & randread tests where threads perform better. Native performs well with all tests. Disk: SSD FS: EXT4 Image: qcow2 Preallocated: no Native does well with all tests. Threads outperform native <10% for read and randread. Native is better than threads in most cases. Seq reads are 10-15% higher with native. Disk: SSD FS: XFS Image: qcow2 Preallocated: no Native performs well with all tests except seq read which is 6% higher Native performs better than threads except seq write, which is 8% higher Disk: SSD FS: EXT4 Image: qcow2 Preallocated: with falloc (using qemu-img) Native is optimal for almost all tests. Threads slightly better (<10%) for seq reads. Native is optimal with randwrite, write and randread-write. Threads have slightly better performance for read and randread. Disk: SSD FS: XFS Image: qcow2 Preallocate: with falloc Native is optimal for write and randread-write. Threads better (<10%) for read and randread. Native is optimal for all tests. Threads is better for seq writes. Disk: SSD FS: EXT4 Native performs better for randwrite, write, randread-write. Threads does better for read and randread. 4K,16K read,randread is 12% higher. Native outperforms threads.
  • 30. Test Specifications | Single VM Results | Multiple VM (16) Results
Disk: SSD, FS: XFS, Image: qcow2, Preallocated: with fallocate | Native is optimal for randwrite, write, and randread; threads is better (<10%) for read. | Native is optimal for all tests; threads is optimal for randread and 4K seq write.
Disk: HDD, FS: none (used LVM), Image: raw, Preallocated: yes | Native outperforms threads in all tests. | Native outperforms threads in all tests.
Disk: HDD, FS: EXT4, Image: raw, Preallocated: yes | Native outperforms threads in all tests. | Native outperforms threads in all tests.
Disk: HDD, FS: XFS, Image: raw, Preallocated: yes | Native outperforms threads in all tests. | Native is optimal or equal in all tests.
Disk: HDD, FS: EXT4, Image: qcow2, Preallocated: no | Native is optimal or equal in all tests except randread, where threads is 30% higher. | Native is optimal except for 4K seq reads.
Disk: HDD, FS: XFS, Image: qcow2, Preallocated: no | Native is optimal except for seq writes, where threads is 30% higher. | Native is optimal except for 4K seq reads.
Disk: HDD, FS: EXT4 | Native is optimal or equal in all cases except randread, where threads is 30% higher. | Native is optimal or equal in all tests except 4K read.
  • 31. Test Specifications | Single VM Results | Multiple VM (16) Results
Disk: HDD, FS: XFS, Image: qcow2, Preallocated: with falloc (using qemu-img) | Native is optimal or equal in all cases except seq write, where threads is 30% higher. | Native is optimal or equal in all cases except 4K randread.
Disk: HDD, FS: EXT4, Image: qcow2, Preallocated: with fallocate | Native is optimal or equal in all tests. | Native is optimal or equal in all tests except 4K randread, where threads is 15% higher.
Disk: HDD, FS: XFS, Image: qcow2, Preallocated: with fallocate | Native is optimal in all tests except seq write, where threads is 30% higher. | Native is better; threads has slightly better performance (<3-4%) in some tests, excluding randread where threads is 30% higher.
devconf.cz 2016
  • 33-37. Result charts for scenario 1 (Disk: SSD, Image: raw, Preallocated: yes, VMs: 16): RandRead, RandReadWrite, RandWrite, Seq Read, and Seq Write, each comparing aio=native vs aio=threads on LVM (no FS), EXT4, and XFS. devconf.cz 2016
  • 38-39. Result charts for scenario 2 (Disk: HDD, Image: raw, Preallocated: yes, VMs: 16): RandRead, RandReadWrite, RandWrite, Seq Read, and Seq Write, each comparing aio=native vs aio=threads on LVM (no FS), EXT4, and XFS. devconf.cz 2016
  • 40-42. Result charts for scenario 3 (Disk: SSD, Image: raw, Preallocated: yes, VMs: 1): RandRead, RandReadWrite, Seq Read, and Seq Write, each comparing aio=native vs aio=threads on LVM (no FS), EXT4, and XFS. devconf.cz 2016
  • 43-44. Result charts for scenario 4 (Disk: HDD, Image: raw, Preallocated: yes, VMs: 1): RandRead, RandReadWrite, RandWrite, Seq Read, and Seq Write, each comparing aio=native vs aio=threads on LVM (no FS), EXT4, and XFS. devconf.cz 2016
  • 45-47. Result charts for scenario 5 (Disk: SSD, Image: qcow2, VMs: 16): RandRead, RandReadWrite, RandWrite, Seq Read, and Seq Write, each comparing twelve configurations: EXT4 and XFS, with aio=native and aio=threads, for qcow2 images with no preallocation, preallocation with falloc, and preallocation with fallocate. devconf.cz 2016
  • 48-49. Result charts for scenario 6 (Disk: HDD, Image: qcow2, VMs: 16): RandRead, RandReadWrite, RandWrite, and Seq Read, each comparing the same twelve EXT4/XFS, native/threads, and preallocation configurations. devconf.cz 2016
  • 50-53. Result charts for scenario 7 (Disk: SSD, Image: qcow2, VMs: 1): RandRead, RandReadWrite, RandWrite, Seq Read, and Seq Write, each comparing the same twelve EXT4/XFS, native/threads, and preallocation configurations. devconf.cz 2016
  • 54-56. Result charts for scenario 8 (Disk: HDD, Image: qcow2, VMs: 1): RandRead, RandReadWrite, RandWrite, Seq Read, and Seq Write, each comparing the same twelve EXT4/XFS, native/threads, and preallocation configurations. devconf.cz 2016
  • 57-58. Result charts for scenario 9 (Disk: SSD, Image: raw, NFS: yes, VMs: 1): RandRead, RandReadWrite, RandWrite, Seq Read, and Seq Write, each comparing aio=native vs aio=threads on EXT4 and XFS. devconf.cz 2016
  • 61. Conclusion & Limitations ● Throughput increased significantly because the iothread needs less CPU to submit I/O. ● aio=native is the preferable choice, with a few limitations. ● Native AIO can block the VM if the file is not fully allocated, and is therefore not recommended for use on sparse files. ● Writes to sparsely allocated files are more likely to block than writes to fully preallocated files, so it is recommended to use aio=native only on fully preallocated files, local disks, or logical volumes (see the preallocation sketch below). devconf.cz 2016
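One way to satisfy the "fully preallocated" requirement for file-backed disks is to reserve all blocks up front, for example with qemu-img's preallocation=falloc option when creating the image, or, as in the sketch below, directly with posix_fallocate(3) on an existing raw image. The path and size are hypothetical placeholders; this is an illustration, not a recommendation of a specific tool.

```c
/* Sketch: fully allocate a raw image file before exposing it to the guest
 * with io='native', so that guest writes do not trigger host block
 * allocation (which is what can make io_submit() block on sparse files).
 * Path and size below are hypothetical.
 */
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/var/lib/libvirt/images/vm-disk.raw";   /* hypothetical image */
    off_t size = 20LL * 1024 * 1024 * 1024;                     /* 20 GiB */

    int fd = open(path, O_CREAT | O_WRONLY, 0600);
    if (fd < 0) { perror("open"); return 1; }

    /* Reserve all blocks up front; returns 0 on success, an errno value otherwise. */
    int err = posix_fallocate(fd, 0, size);
    if (err != 0) {
        fprintf(stderr, "posix_fallocate: %s\n", strerror(err));
        close(fd);
        return 1;
    }
    close(fd);
    return 0;
}
```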
  • 62. Future work ● Evaluate virtio data plane performance. ● Reduce CPU utilization for aio=threads and consider … devconf.cz 2016
  • 64. References ● Stefan Hajnoczi, "Optimizing the QEMU Storage Stack", Linux Plumbers Conference 2010 ● Asias He, "Virtio-blk Performance Improvement", KVM Forum 2012 ● Khoa Huynh, "Exploiting The Latest KVM Features For Optimized Virtualized Enterprise Storage Performance", LinuxCon 2012 ● Pbench: http://distributed-system-analysis.github.io/pbench/ and https://github.com/distributed-system-analysis/pbench ● FIO: https://github.com/axboe/fio/ devconf.cz 2016