DPDK brings high-performance, low-latency virtualization networking capabilities thanks to its Vhost/Virtio support. The session will first introduce DPDK and its Vhost/Virtio implementations, presenting examples of possible uses and the challenges that need to be addressed to achieve high performance, functionality and reliability. Then, the Vhost/Virtio improvements introduced in the last DPDK release will be covered, such as receive path optimizations, Virtio's indirect descriptors support, and transmit zero-copy, to name a few. The speakers will explain which problems they aim to address, how they address them, and what limitations remain.
Finally, the speakers, who are active contributors to DPDK's Virtio/Vhost code, will present the new developments in the pipeline to tackle the remaining challenges.
The session is targeted at DPDK developers and users looking for information on current developments and status. People not familiar with DPDK will get an overview and can exchange ideas with other projects.
DPDK – project overview
Overview
DPDK is a set of userspace libraries aimed at fast
packet processing.
● Data Plane Development Kit
● Goal:
● Benefit from software flexibility
● Achieve performance close to dedicated HW solutions
DPDK – project overview
Overview
● License: BSD
● CPU architectures: x86, Power8, TILE-Gx & ARM
● NICs: Intel, Mellanox, Broadcom, Cisco,...
● Other HW: Crypto, SCSI for SPDK project
● Operating systems: Linux, BSD
DPDK - project history
Overview
● 2012–2013: 1st release by Intel as a zip file; 6WIND initiates the dpdk.org community
● 2014–2015: Packaged in Fedora; Power8 & TILE-Gx support
● 2016–2017: ARM support / Crypto; moving to the Linux Foundation
● Releases: v1.2, v1.3, v1.4, v1.5, v1.6, v1.7, v1.8, v2.0, v2.1, v2.2, v16.04, v16.07, v16.11, v17.02, v17.05, v17.08, v17.11
● v16.11: ~750K LoC / ~6000 commits / ~350 contributors
DPDK – performance
Overview
DPDK uses:
● CPU isolation/partitioning & polling
→ Dedicated CPU cores to poll the device
● VFIO/UIO
→ Direct access to device registers from user space
● NUMA awareness
→ Resources local to the Poll-Mode Driver’s (PMD) CPU
● Hugepages
→ Fewer TLB misses, no swap
DPDK – performance
Overview
To avoid:
● Interrupt handling
→ Kernel’s NAPI polling mode is not enough
● Context switching
● Kernel/user data copies
● Syscall overhead
→ A single syscall costs more than the ~67 ns time budget for a 64B packet at 14.88 Mpps
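Putting these together, the core of a DPDK application is a busy-polling, run-to-completion loop on a dedicated core. A minimal sketch, assuming port 0 has already been configured and started (mempool/queue setup, error handling and TX omitted):

```c
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

/* Busy-poll one RX queue from a dedicated core: no interrupts, no syscalls
 * on the data path, device registers accessed directly from user space. */
static void rx_loop(uint16_t port_id, uint16_t queue_id)
{
    struct rte_mbuf *bufs[BURST_SIZE];

    for (;;) {
        uint16_t nb_rx = rte_eth_rx_burst(port_id, queue_id, bufs, BURST_SIZE);

        for (uint16_t i = 0; i < nb_rx; i++) {
            /* ... process the packet ... */
            rte_pktmbuf_free(bufs[i]);
        }
    }
}

int main(int argc, char **argv)
{
    if (rte_eal_init(argc, argv) < 0)
        return -1;

    /* Mempool, device and queue setup omitted for brevity. */
    rx_loop(0, 0);
    return 0;
}
```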
Virtio/Vhost
Overview
● I/O virtualization options: device emulation, direct assignment, VirtIO
● Vhost: In-kernel virtio device emulation
● Allows device emulation code to directly call into kernel subsystems
● Vhost worker thread in host kernel
● Bypasses system calls from user to kernel space on host
In-kernel device emulation
● In-kernel restricted to virtqueue emulation
● QEMU handles control plane, feature negotiation, migration,
etc
● File descriptor polling done by vhost in kernel
● Buffers moved between tap device and virtqueues by kernel
worker thread
Overview
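On the host side this handover is driven through ioctls on /dev/vhost-net. A condensed sketch of the control-plane calls QEMU makes so that the in-kernel vhost worker can take over the data path (memory-table setup and error handling omitted):

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

/* Sketch: hand the virtqueue notification fds and the tap backend
 * over to the in-kernel vhost-net worker thread. */
static int setup_vhost_net(int tap_fd, int kick_fd, int call_fd)
{
    int vhost_fd = open("/dev/vhost-net", O_RDWR);
    if (vhost_fd < 0)
        return -1;

    ioctl(vhost_fd, VHOST_SET_OWNER, NULL);
    /* VHOST_SET_MEM_TABLE (guest memory layout) omitted for brevity. */

    struct vhost_vring_file kick    = { .index = 0, .fd = kick_fd };
    struct vhost_vring_file call    = { .index = 0, .fd = call_fd };
    struct vhost_vring_file backend = { .index = 0, .fd = tap_fd };

    ioctl(vhost_fd, VHOST_SET_VRING_KICK, &kick);     /* guest -> worker wakeup */
    ioctl(vhost_fd, VHOST_SET_VRING_CALL, &call);     /* worker -> guest irqfd  */
    ioctl(vhost_fd, VHOST_NET_SET_BACKEND, &backend); /* attach the tap device  */

    return vhost_fd;
}
```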
Vhost as user space interface
● Vhost architecture is not tied to KVM
● Backend: Vhost instance in user space
● Eventfd is set up to signal backend when new buffers are
placed by guest (kickfd)
● Irqfd is set up to signal the guest about new buffers placed by
backend (callfd)
● The beauty: backend only knows about guest memory
mapping, kick eventfd and call eventfd
● Vhost-user implemented in DPDK in v16.07
Overview
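The kick/call signaling is built on plain Linux eventfds. A minimal sketch of the backend side of that handshake (in a real vhost-user backend the two descriptors are received from QEMU via the VHOST_USER_SET_VRING_KICK and VHOST_USER_SET_VRING_CALL messages; this loop is illustrative only):

```c
#include <stdint.h>
#include <unistd.h>

/* Backend side of the vring notification scheme:
 *  - kick_fd: eventfd signalled by the guest when it places new buffers
 *  - call_fd: eventfd (wired to an irqfd) used to interrupt the guest */
static void notification_loop(int kick_fd, int call_fd)
{
    uint64_t cnt;

    for (;;) {
        /* Block until the guest kicks us. */
        if (read(kick_fd, &cnt, sizeof(cnt)) != sizeof(cnt))
            continue;

        /* ... consume available descriptors from shared guest memory,
         *     fill the used ring ... */

        /* Tell the guest that used buffers are ready. */
        cnt = 1;
        write(call_fd, &cnt, sizeof(cnt));
    }
}
```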
Performance
Challenges
DPDK is about performance, which is a trade-off between:
● Bandwidth → achieving line rate even for small packets
● Latency → as low as possible (of course)
● CPU utilization → $$$
→ Prefer bandwidth & latency at the expense of CPU utilization
→ Take into account HW architectures as much as possible
0% packet-loss
• Some use-cases of Virtio cannot afford packet loss, like NFV
• Hard to achieve max perf without loss, as Virtio is CPU intensive
→ Scheduling “glitches” may cause packet drops
Migration
• Requires restoration of internal state, including the backend’s
• Interface exposed by QEMU must stay unchanged for cross-
version migration
• Interface exposed to guest depends on capabilities of third-party
application
• Support from the management tool is required
Reliability
Challenges
• Isolation of untrusted guests
• Direct access to device from untrusted guests
• Current implementations require a mediator for guest-to-guest communication
• Zero-copy is problematic from a security point of view
Security
Challenges
Pros:
• Increase ring capacity
• Improve performance for large numbers of large requests
• Improve 0% packet-loss perf even for small requests
→ If the system is not fine-tuned
→ If Virtio headers are in a dedicated descriptor
Cons:
• One more level of indirection
→ Impacts raw performance (~-3%)
Indirect descriptors (DPDK v16.11)
New & upcoming features
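For context, an indirect descriptor is a normal ring slot whose VRING_DESC_F_INDIRECT flag marks its buffer as containing a table of further descriptors. A sketch of the layout (field layout mirrors struct vring_desc from the Virtio spec / <linux/virtio_ring.h>):

```c
#include <stdint.h>

/* Split-ring descriptor; layout mirrors struct vring_desc. */
struct vring_desc {
    uint64_t addr;   /* guest-physical address of the buffer     */
    uint32_t len;    /* length of the buffer in bytes            */
    uint16_t flags;  /* see flag definitions below               */
    uint16_t next;   /* next descriptor index when F_NEXT is set */
};

#define VRING_DESC_F_NEXT     1  /* chain continues in descriptor 'next'  */
#define VRING_DESC_F_WRITE    2  /* buffer is write-only for the device   */
#define VRING_DESC_F_INDIRECT 4  /* buffer holds a table of descriptors   */

/* With F_INDIRECT, a single ring slot can describe a whole chain stored in
 * a separate descriptor table, increasing effective ring capacity at the
 * cost of one extra level of indirection (the ~3% raw-performance hit). */
```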
Vhost dequeue 0-copy (DPDK v16.11)
New & upcoming features
[Diagram] Default dequeuing: the data in the descriptor's buffer is memcpy'd into the mbuf's own buffer. Zero-copy dequeuing: the mbuf's buffer address is set to point directly at the descriptor's buffer, so no copy is performed.
Pros:
• Big perf improvement for standard & large packet sizes
→ More than +50% for VM-to-VM with iperf benchmarks
• Reduces memory footprint
Cons:
• Performance degradation for small packets
→ But disabled by default
• Only for VM-to-VM using Vhost lib API (No PMD support)
• Does not work for VM-to-NIC
→ Mbuf lacks a release notification mechanism / no headroom
Vhost dequeue 0-copy (DPDK v16.11)
New & upcoming features
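Since the feature is off by default, it is enabled per vhost-user socket at registration time. A minimal sketch using the DPDK vhost library (the header name and exact API vary slightly across DPDK versions; the socket path is an example):

```c
#include <rte_vhost.h>   /* rte_virtio_net.h in older DPDK releases */

/* Register a vhost-user socket with dequeue zero-copy opted in. */
int register_zero_copy_socket(void)
{
    const char *path = "/tmp/vhost-user0.sock";   /* example path */

    if (rte_vhost_driver_register(path, RTE_VHOST_USER_DEQUEUE_ZERO_COPY) < 0)
        return -1;

    /* ... register device callbacks and start the vhost session as usual ... */
    return 0;
}
```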
Way for the host to share its max supported MTU
• Can be used to set MTU values across the infra
• Can improve performance for small packets
→ If MTU fits in rx buffer size, disable Rx mergeable buffers
→ Save one cache-miss when parsing the virtio-net header
MTU feature (DPDK v17.05?)
New & upcoming features
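The performance point boils down to a simple size check once the MTU is known. A sketch of that decision (function name and the Ethernet overhead constant are illustrative, not a DPDK API):

```c
#include <stdbool.h>
#include <stdint.h>

/* If an MTU-sized frame plus the virtio-net header always fits into a single
 * Rx buffer, mergeable Rx buffers (VIRTIO_NET_F_MRG_RXBUF) need not be
 * negotiated, which saves a cache miss when parsing the header's
 * num_buffers field. */
static bool need_mergeable_rx_bufs(uint16_t mtu, uint32_t rx_buf_size,
                                   uint32_t virtio_net_hdr_len)
{
    const uint32_t eth_overhead = 18; /* L2 header + FCS, illustrative value */

    return (uint32_t)mtu + eth_overhead + virtio_net_hdr_len > rx_buf_size;
}
```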
Vhost-pci (DPDK v17.05?)
New & upcoming features
[Diagram] Traditional VM-to-VM communication: each VM's Virtio device is backed by a Vhost backend in the host vSwitch, so packets go through the vSwitch. Direct VM-to-VM communication: VM1 uses a Vhost-pci device connected straight to VM2's Virtio device, bypassing the vSwitch.
Vhost-pci (DPDK v17.05?)
New & upcoming features
[Diagram] VM1 runs a Vhost-pci driver on a Vhost-pci device; VM2 runs a regular Virtio-pci driver on a Virtio-pci device. The two devices are connected back to back through the Vhost-pci protocol, in a client/server setup between the two sides.
Pros:
• Performance improvement
→ The 2 VMs share the same virtqueues
→ Packets don’t go through the host’s vSwitch
• No change needed in Virtio’s guest drivers
Vhost-pci (DPDK v17.05?)
New & upcoming features
Cons:
• Security
→ The Vhost-pci VM maps the whole memory space of the Virtio-pci VM
→ Could be solved with IOTLB support
• Live migration
→ Not supported in current version
→ Hard to implement as VMs are connected to each other
through a socket
Vhost-pci (DPDK v17.05?)
New & upcoming features
IOTLB in kernel
New & upcoming features
[Diagram] The guest IOMMU driver programs the virtual IOMMU emulated by QEMU. The in-kernel vhost backend keeps a device IOTLB that translates guest IOVAs into host virtual addresses (HVAs): on an IOTLB miss it asks QEMU for the translation, and QEMU sends IOTLB updates and invalidations back; the exchanges go over syscalls/eventfds.
IOTLB for vhost-user
New & upcoming features
[Diagram] Same flow for vhost-user: the user-space backend maintains an IOTLB cache of IOVA-to-HVA translations, and the IOTLB miss, update and invalidation messages are exchanged with QEMU's virtual IOMMU over the vhost-user Unix socket instead of syscalls/eventfds.
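The translation messages themselves are small. A sketch of their layout (mirrors struct vhost_iotlb_msg from <linux/vhost.h>; the vhost-user protocol carries an equivalent payload in its IOTLB message):

```c
#include <stdint.h>

/* IOTLB message exchanged between the vIOMMU (QEMU) and the backend;
 * layout mirrors struct vhost_iotlb_msg in <linux/vhost.h>. */
struct iotlb_msg {
    uint64_t iova;   /* I/O virtual address used by the guest device        */
    uint64_t size;   /* size of the mapping                                 */
    uint64_t uaddr;  /* host user-space virtual address (hva)               */
    uint8_t  perm;   /* VHOST_ACCESS_RO / VHOST_ACCESS_WO / VHOST_ACCESS_RW */
    uint8_t  type;   /* VHOST_IOTLB_MISS / _UPDATE / _INVALIDATE            */
};
```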
• DPDK support for VMs is in active development
• 10 Mpps is real in a VM with DPDK
• New features to boost performance of VM networking
• Accelerate transition to NFV / SDN
Conclusions