SlideShare a Scribd company logo
1 of 34
Download to read offline
Can VMs networking benefit from
DPDK?
Virtio/Vhost-user status & updates
Maxime Coquelin – Victor Kaplansky
2017-01-27
2
AGENDA
Can VMs networking benefit from DPDK?
● Overview
● Challenges
● New & upcoming features
Overview
4
DPDK – project overview
Overview
DPDK is a set of userspace libraries aimed at fast
packet processing.
● Data Plane Development Kit
● Goal:
● Benefit from software flexibility
● Achieving performance close to dedicated HW solutions
5
DPDK – project overview
Overview
● License: BSD
● CPU architectures: x86, Power8, TILE-Gx & ARM
● NICs: Intel, Mellanox, Broadcom, Cisco,...
● Other HW: Crypto, SCSI for SPDK project
● Operating systems: Linux, BSD
6
1st
release
by Intel as
Zip file
2012 2013
6Wind intiates
dpdk.org
community
DPDK - project history
Overview
Packaged
In Fedora
2014 2015
Power8 &
TILE-Gx
support
ARM
Support /
Crypto
2016 2017
Moving to
Linux Foundation
v1.2
v1.3
v1.4
v1.5
v1.6
v1.7
v1.8
v2.0
v2.1
v2.2
v16.04v16.07
v16.11
v17.02v17.05v17.08v17.11
v16.11: ~750K LoC / ~6000 commits / ~350 contributors
7
DPDK – comparison
Overview
User
Kernel
NIC
Application
Socket
Net stack
Driver
User
Kernel
NIC
Application
DPDK
VFIO
8
DPDK – performance
Overview
DPDK uses:
● CPU isolation/partitioning & polling
→ Dedicated CPU cores to poll the device
● VFIO/UIO
→ Direct devices registers accesses from user-space
● NUMA awareness
● → Resources local to the Poll-Mode Driver’s (PMD) CPU
● Hugepages
→ Less TLB misses, no swap
9
DPDK – performance
Overview
To avoid:
● Interrupt handling
→ Kernel’s NAPI polling mode is not enough
● Context switching
● Kernel/user data copies
● Syscalls overhead
→ More than the time budget for a 64B packet at 14.88Mpps
10
DPDK - components
Overview
Credits: Tim O’Driscoll - Intel
11
Virtio/Vhost
Overview
●
Device emulation, direct assignment, VirtIO
● Vhost: In-kernel virtio device emulation
● device emulation code calls to directly call into kernel
subsystems
● Vhost worker thread in host kernel
● Bypasses system calls from user to kernel space on host
12
Vhost driver model
VM
QEMU
Userspace
Kvm.ko
ioeventfd
irqfd
ioeventfd
Vhost
Host Kernel
13
In-kernel device emulation
● In-kernel restricted to virtqueue emulation
● QEMU handles control plane, feature negotiation, migration,
etc
● File descriptor polling done by vhost in kernel
● Buffers moved between tap device and virtqueues by kernel
worker thread
Overview
14
Vhost as user space interface
● Vhost architecture is not tied to KVM
● Backend: Vhost instance in user space
● Eventfd is set up to signal backend when new buffers are
placed by guest (kickfd)
● Irqfd is set up to signal the guest about new buffers placed by
backend (callfd)
● The beauty: backend only knows about guest memory
mapping, kick eventfd and call eventfd
● Vhost-user implemented in DPDK in v16.7
Overview
Challenges
16
Performance
Challenges
DPDK is about performance, which is a trade-off between:
● Bandwidth → achieving line rate even for small packets
● Latency → as low as possible (of course)
● CPU utilization → $$$
→ Prefer bandwidth & latency at the expense of CPU utilization
→ Take into account HW architectures as much as possible
17
0% packet-loss
• Some use-cases of Virtio cannot afford packet loss, like NFV
• Hard to achieve max perf without loss, as Virtio is CPU intensive
→ Scheduling “glitches” may cause packets drop
Migration
• Requires restoration of internal state including the backend
• Interface exposed by QEMU must stay unchanged for cross-
version migration
• Interface exposed to guest depends on capabilities of third-party
application
• Support from the management tool is required
Reliability
Challenges
18
• Isolation of untrusted guests
• Direct access to device from untrusted guests
• Current implementations require mediator for guest -to-
guest communication.
• Zero-copy is problematic from security point of view
Security
Challenges
New & upcoming features
20
Pro:
• Allows receiving packets larger
than descriptors’ buffer size
Con:
• Introduce extra-cache miss in the
dequeue path
Rx mergeable buffers
New & upcoming features
Desc 0
Desc 1
Desc 2
Desc 3
Desc 4
Desc n
Desc n-1
num_buffers = 1
num_buffers = 3
21
Indirect descriptors (DPDK v16.11)
New & upcoming features
Desc 0
Desc 1
Desc 2
Desc 3
Desc 4
Desc n
Desc n-1
Desc 0
Desc 1
Desc 2
Desc 3
Desc 4
Desc n
Desc n-1
iDesc 2-0
iDesc 2-1
iDesc 2-2
iDesc 2-3
Direct descriptors chaining Indirect descriptors table
22
Pros:
• Increase ring capacity
• Improve performance for large number of large requests
• Improve 0% packet loss perf even for small requests
→ If system is not fine-tuned
→ If Virtio headers are in dedicated descriptor
Cons:
• One more level of indirection
→ Impacts raw performance (~-3%)
Indirect descriptors (DPDK v16.11)
New & upcoming features
23
Vhost dequeue 0-copy (DPDK v16.11)
New & upcoming features
addr
len
flags
next
descriptor
desc's buf mbuf's buf
addr
paddr
len
...
mbuf
Memcpy
addr
len
flags
next
descriptor
desc's buf
addr
paddr
len
...
mbuf
Default
dequeuing
Zero-copy
dequeuing
24
Pros:
• Big perf improvement for standard & large packet sizes
→ More than +50% for VM-to-VM with iperf benchs
• Reduces memory footprint
Cons:
• Performance degradation for small packets
→ But disabled by default
• Only for VM-to-VM using Vhost lib API (No PMD support)
• Does not work for VM-to-NIC
→ Mbuf lacks release notif mechanism / No headroom
Vhost dequeue 0-copy (DPDK v16.11)
New & upcoming features
25
Way for the host to share its max supported MTU
• Can be used to set MTU values across the infra
• Can improve performance for small packet
→ If MTU fits in rx buffer size, disable Rx mergeable buffers
→ Save one cache-miss when parsing the virtio-net header
MTU feature (DPDK v17.05?)
New & upcoming features
26
Vhost-pci (DPDK v17.05?)
New & upcoming features
vSwitch
VM1
Virtio
Vhost
VM2
Virtio
Vhost
VM1
Vhost-pci
VM2
Virtio
Traditional VM to VM
communication
Direct VM to VM
communication
27
Vhost-pci (DPDK v17.05?)
New & upcoming features
VM1
Vhost-pci
driver 1
VM2
Virtio-pci
driver
Vhost-pci
device 1
Virtio-pci
device
Vhost-pci
client
Vhost-pci server
Vhost-pci protocol
28
Pros:
• Performance improvement
→ The 2 VMs share the same virtqueues
→ Packets doesn’t go through host’s vSwitch
• No change needed in Virtio’s guest drivers
Vhost-pci (DPDK v17.05?)
New & upcoming features
29
Cons:
• Security
→ Vhost-pci’s VM maps all Virtio-pci’s VM memory space
→ Could be solved with IOTLB support
• Live migration
→ Not supported in current version
→ Hard to implement as VMs are connected to each other
through a socket
Vhost-pci (DPDK v17.05?)
New & upcoming features
30
IOTLB in kernel
New & upcoming features
VM
IOMMU
driver
VM
QEMU VHOST
IOMMU
Device
IOTLB
iova
IOTLB miss
IOTLB update,
invalidate
TLB invalidation
hva
memory
Syscall/eventfd
31
IOTLB for vhost-user
New & upcoming features
VM
IOMMU
driver
VM
QEMU Backend
IOMMU
IOTLB
Cache
iova
IOTLB miss
IOTLB update,
invalidate
TLB invalidation
hva
memory
Unix Socket
32
• DPDK support for VM is in active development
• 10M pps is real in VM with DPDK
• New features to boost performance of VM networking
• Accelerate transition to NFV / SDN
Conclusions
33
Q / A
THANK YOU

More Related Content

What's hot

Introduction to NVMe Over Fabrics-V3R
Introduction to NVMe Over Fabrics-V3RIntroduction to NVMe Over Fabrics-V3R
Introduction to NVMe Over Fabrics-V3R
Simon Huang
 

What's hot (20)

OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray
OVS and DPDK - T.F. Herbert, K. Traynor, M. GrayOVS and DPDK - T.F. Herbert, K. Traynor, M. Gray
OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray
 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructions
 
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel ArchitectureDPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
 
Netsft2017 day in_life_of_nfv
Netsft2017 day in_life_of_nfvNetsft2017 day in_life_of_nfv
Netsft2017 day in_life_of_nfv
 
DPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet ProcessingDPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet Processing
 
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
DPDK summit 2015: It's kind of fun  to do the impossible with DPDKDPDK summit 2015: It's kind of fun  to do the impossible with DPDK
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
 
DPDK Summit - 08 Sept 2014 - 6WIND - High Perf Networking Leveraging the DPDK...
DPDK Summit - 08 Sept 2014 - 6WIND - High Perf Networking Leveraging the DPDK...DPDK Summit - 08 Sept 2014 - 6WIND - High Perf Networking Leveraging the DPDK...
DPDK Summit - 08 Sept 2014 - 6WIND - High Perf Networking Leveraging the DPDK...
 
SR-IOV ixgbe Driver Limitations and Improvement
SR-IOV ixgbe Driver Limitations and ImprovementSR-IOV ixgbe Driver Limitations and Improvement
SR-IOV ixgbe Driver Limitations and Improvement
 
Intel dpdk Tutorial
Intel dpdk TutorialIntel dpdk Tutorial
Intel dpdk Tutorial
 
WAN - trends and use cases
WAN - trends and use casesWAN - trends and use cases
WAN - trends and use cases
 
High Performance Networking Leveraging the DPDK and Growing Community
High Performance Networking Leveraging the DPDK and Growing CommunityHigh Performance Networking Leveraging the DPDK and Growing Community
High Performance Networking Leveraging the DPDK and Growing Community
 
7 hands on
7 hands on7 hands on
7 hands on
 
Openstack v4 0
Openstack v4 0Openstack v4 0
Openstack v4 0
 
DPDK Summit 2015 - Intel - Keith Wiles
DPDK Summit 2015 - Intel - Keith WilesDPDK Summit 2015 - Intel - Keith Wiles
DPDK Summit 2015 - Intel - Keith Wiles
 
Intel® Ethernet Update
Intel® Ethernet Update Intel® Ethernet Update
Intel® Ethernet Update
 
DPDK Summit 2015 - HP - Al Sanders
DPDK Summit 2015 - HP - Al SandersDPDK Summit 2015 - HP - Al Sanders
DPDK Summit 2015 - HP - Al Sanders
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)
 
DPDK Summit 2015 - Aspera - Charles Shiflett
DPDK Summit 2015 - Aspera - Charles ShiflettDPDK Summit 2015 - Aspera - Charles Shiflett
DPDK Summit 2015 - Aspera - Charles Shiflett
 
Quieting noisy neighbor with Intel® Resource Director Technology
Quieting noisy neighbor with Intel® Resource Director TechnologyQuieting noisy neighbor with Intel® Resource Director Technology
Quieting noisy neighbor with Intel® Resource Director Technology
 
Introduction to NVMe Over Fabrics-V3R
Introduction to NVMe Over Fabrics-V3RIntroduction to NVMe Over Fabrics-V3R
Introduction to NVMe Over Fabrics-V3R
 

Similar to Devconf2017 - Can VMs networking benefit from DPDK

Similar to Devconf2017 - Can VMs networking benefit from DPDK (20)

Simplify Networking for Containers
Simplify Networking for ContainersSimplify Networking for Containers
Simplify Networking for Containers
 
Midokura Gluecon 2014 - Level up your OpenStack Neutron Networking
Midokura Gluecon 2014 - Level up your OpenStack Neutron NetworkingMidokura Gluecon 2014 - Level up your OpenStack Neutron Networking
Midokura Gluecon 2014 - Level up your OpenStack Neutron Networking
 
Network Design patters with Docker
Network Design patters with DockerNetwork Design patters with Docker
Network Design patters with Docker
 
What Happens in Vegas (Amcom at EMC Global Forum)
What Happens in Vegas (Amcom at EMC Global Forum)What Happens in Vegas (Amcom at EMC Global Forum)
What Happens in Vegas (Amcom at EMC Global Forum)
 
6WINDGate™ - Powering the New-Generation of IPsec Gateways
6WINDGate™ - Powering the New-Generation of IPsec Gateways6WINDGate™ - Powering the New-Generation of IPsec Gateways
6WINDGate™ - Powering the New-Generation of IPsec Gateways
 
Container network security
Container network securityContainer network security
Container network security
 
Known basic of NFV Features
Known basic of NFV FeaturesKnown basic of NFV Features
Known basic of NFV Features
 
Varnish extend
Varnish extendVarnish extend
Varnish extend
 
Cumulus Linux 2.2 Overview
Cumulus Linux 2.2 OverviewCumulus Linux 2.2 Overview
Cumulus Linux 2.2 Overview
 
Решения NFV в контексте операторов связи
Решения NFV в контексте операторов связиРешения NFV в контексте операторов связи
Решения NFV в контексте операторов связи
 
Understanding network and service virtualization
Understanding network and service virtualizationUnderstanding network and service virtualization
Understanding network and service virtualization
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
 
VMworld 2013: Extreme Performance Series: Network Speed Ahead
VMworld 2013: Extreme Performance Series: Network Speed Ahead VMworld 2013: Extreme Performance Series: Network Speed Ahead
VMworld 2013: Extreme Performance Series: Network Speed Ahead
 
Practical Design Patterns in Docker Networking
Practical Design Patterns in Docker NetworkingPractical Design Patterns in Docker Networking
Practical Design Patterns in Docker Networking
 
Introductio to Docker and usage in HPC applications
Introductio to Docker and usage in HPC applicationsIntroductio to Docker and usage in HPC applications
Introductio to Docker and usage in HPC applications
 
DockerCon EU 2018 Workshop: Container Networking for Swarm and Kubernetes in ...
DockerCon EU 2018 Workshop: Container Networking for Swarm and Kubernetes in ...DockerCon EU 2018 Workshop: Container Networking for Swarm and Kubernetes in ...
DockerCon EU 2018 Workshop: Container Networking for Swarm and Kubernetes in ...
 
Dev Conf 2017 - Meeting nfv networking requirements
Dev Conf 2017 - Meeting nfv networking requirementsDev Conf 2017 - Meeting nfv networking requirements
Dev Conf 2017 - Meeting nfv networking requirements
 
Libvirt/KVM Driver Update (Kilo)
Libvirt/KVM Driver Update (Kilo)Libvirt/KVM Driver Update (Kilo)
Libvirt/KVM Driver Update (Kilo)
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
 
6WINDGate™ - Enabling Cloud RAN Virtualization
6WINDGate™ - Enabling Cloud RAN Virtualization6WINDGate™ - Enabling Cloud RAN Virtualization
6WINDGate™ - Enabling Cloud RAN Virtualization
 

Recently uploaded

%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
masabamasaba
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
VishalKumarJha10
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
masabamasaba
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
masabamasaba
 

Recently uploaded (20)

OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 

Devconf2017 - Can VMs networking benefit from DPDK

  • 1. Can VMs networking benefit from DPDK? Virtio/Vhost-user status & updates Maxime Coquelin – Victor Kaplansky 2017-01-27
  • 2. 2 AGENDA Can VMs networking benefit from DPDK? ● Overview ● Challenges ● New & upcoming features
  • 4. 4 DPDK – project overview Overview DPDK is a set of userspace libraries aimed at fast packet processing. ● Data Plane Development Kit ● Goal: ● Benefit from software flexibility ● Achieving performance close to dedicated HW solutions
  • 5. 5 DPDK – project overview Overview ● License: BSD ● CPU architectures: x86, Power8, TILE-Gx & ARM ● NICs: Intel, Mellanox, Broadcom, Cisco,... ● Other HW: Crypto, SCSI for SPDK project ● Operating systems: Linux, BSD
  • 6. 6 1st release by Intel as Zip file 2012 2013 6Wind intiates dpdk.org community DPDK - project history Overview Packaged In Fedora 2014 2015 Power8 & TILE-Gx support ARM Support / Crypto 2016 2017 Moving to Linux Foundation v1.2 v1.3 v1.4 v1.5 v1.6 v1.7 v1.8 v2.0 v2.1 v2.2 v16.04v16.07 v16.11 v17.02v17.05v17.08v17.11 v16.11: ~750K LoC / ~6000 commits / ~350 contributors
  • 7. 7 DPDK – comparison Overview User Kernel NIC Application Socket Net stack Driver User Kernel NIC Application DPDK VFIO
  • 8. 8 DPDK – performance Overview DPDK uses: ● CPU isolation/partitioning & polling → Dedicated CPU cores to poll the device ● VFIO/UIO → Direct devices registers accesses from user-space ● NUMA awareness ● → Resources local to the Poll-Mode Driver’s (PMD) CPU ● Hugepages → Less TLB misses, no swap
  • 9. 9 DPDK – performance Overview To avoid: ● Interrupt handling → Kernel’s NAPI polling mode is not enough ● Context switching ● Kernel/user data copies ● Syscalls overhead → More than the time budget for a 64B packet at 14.88Mpps
  • 10. 10 DPDK - components Overview Credits: Tim O’Driscoll - Intel
  • 11. 11 Virtio/Vhost Overview ● Device emulation, direct assignment, VirtIO ● Vhost: In-kernel virtio device emulation ● device emulation code calls to directly call into kernel subsystems ● Vhost worker thread in host kernel ● Bypasses system calls from user to kernel space on host
  • 13. 13 In-kernel device emulation ● In-kernel restricted to virtqueue emulation ● QEMU handles control plane, feature negotiation, migration, etc ● File descriptor polling done by vhost in kernel ● Buffers moved between tap device and virtqueues by kernel worker thread Overview
  • 14. 14 Vhost as user space interface ● Vhost architecture is not tied to KVM ● Backend: Vhost instance in user space ● Eventfd is set up to signal backend when new buffers are placed by guest (kickfd) ● Irqfd is set up to signal the guest about new buffers placed by backend (callfd) ● The beauty: backend only knows about guest memory mapping, kick eventfd and call eventfd ● Vhost-user implemented in DPDK in v16.7 Overview
  • 16. 16 Performance Challenges DPDK is about performance, which is a trade-off between: ● Bandwidth → achieving line rate even for small packets ● Latency → as low as possible (of course) ● CPU utilization → $$$ → Prefer bandwidth & latency at the expense of CPU utilization → Take into account HW architectures as much as possible
  • 17. 17 0% packet-loss • Some use-cases of Virtio cannot afford packet loss, like NFV • Hard to achieve max perf without loss, as Virtio is CPU intensive → Scheduling “glitches” may cause packets drop Migration • Requires restoration of internal state including the backend • Interface exposed by QEMU must stay unchanged for cross- version migration • Interface exposed to guest depends on capabilities of third-party application • Support from the management tool is required Reliability Challenges
  • 18. 18 • Isolation of untrusted guests • Direct access to device from untrusted guests • Current implementations require mediator for guest -to- guest communication. • Zero-copy is problematic from security point of view Security Challenges
  • 19. New & upcoming features
  • 20. 20 Pro: • Allows receiving packets larger than descriptors’ buffer size Con: • Introduce extra-cache miss in the dequeue path Rx mergeable buffers New & upcoming features Desc 0 Desc 1 Desc 2 Desc 3 Desc 4 Desc n Desc n-1 num_buffers = 1 num_buffers = 3
  • 21. 21 Indirect descriptors (DPDK v16.11) New & upcoming features Desc 0 Desc 1 Desc 2 Desc 3 Desc 4 Desc n Desc n-1 Desc 0 Desc 1 Desc 2 Desc 3 Desc 4 Desc n Desc n-1 iDesc 2-0 iDesc 2-1 iDesc 2-2 iDesc 2-3 Direct descriptors chaining Indirect descriptors table
  • 22. 22 Pros: • Increase ring capacity • Improve performance for large number of large requests • Improve 0% packet loss perf even for small requests → If system is not fine-tuned → If Virtio headers are in dedicated descriptor Cons: • One more level of indirection → Impacts raw performance (~-3%) Indirect descriptors (DPDK v16.11) New & upcoming features
  • 23. 23 Vhost dequeue 0-copy (DPDK v16.11) New & upcoming features addr len flags next descriptor desc's buf mbuf's buf addr paddr len ... mbuf Memcpy addr len flags next descriptor desc's buf addr paddr len ... mbuf Default dequeuing Zero-copy dequeuing
  • 24. 24 Pros: • Big perf improvement for standard & large packet sizes → More than +50% for VM-to-VM with iperf benchs • Reduces memory footprint Cons: • Performance degradation for small packets → But disabled by default • Only for VM-to-VM using Vhost lib API (No PMD support) • Does not work for VM-to-NIC → Mbuf lacks release notif mechanism / No headroom Vhost dequeue 0-copy (DPDK v16.11) New & upcoming features
  • 25. 25 Way for the host to share its max supported MTU • Can be used to set MTU values across the infra • Can improve performance for small packet → If MTU fits in rx buffer size, disable Rx mergeable buffers → Save one cache-miss when parsing the virtio-net header MTU feature (DPDK v17.05?) New & upcoming features
  • 26. 26 Vhost-pci (DPDK v17.05?) New & upcoming features vSwitch VM1 Virtio Vhost VM2 Virtio Vhost VM1 Vhost-pci VM2 Virtio Traditional VM to VM communication Direct VM to VM communication
  • 27. 27 Vhost-pci (DPDK v17.05?) New & upcoming features VM1 Vhost-pci driver 1 VM2 Virtio-pci driver Vhost-pci device 1 Virtio-pci device Vhost-pci client Vhost-pci server Vhost-pci protocol
  • 28. 28 Pros: • Performance improvement → The 2 VMs share the same virtqueues → Packets doesn’t go through host’s vSwitch • No change needed in Virtio’s guest drivers Vhost-pci (DPDK v17.05?) New & upcoming features
  • 29. 29 Cons: • Security → Vhost-pci’s VM maps all Virtio-pci’s VM memory space → Could be solved with IOTLB support • Live migration → Not supported in current version → Hard to implement as VMs are connected to each other through a socket Vhost-pci (DPDK v17.05?) New & upcoming features
  • 30. 30 IOTLB in kernel New & upcoming features VM IOMMU driver VM QEMU VHOST IOMMU Device IOTLB iova IOTLB miss IOTLB update, invalidate TLB invalidation hva memory Syscall/eventfd
  • 31. 31 IOTLB for vhost-user New & upcoming features VM IOMMU driver VM QEMU Backend IOMMU IOTLB Cache iova IOTLB miss IOTLB update, invalidate TLB invalidation hva memory Unix Socket
  • 32. 32 • DPDK support for VM is in active development • 10M pps is real in VM with DPDK • New features to boost performance of VM networking • Accelerate transition to NFV / SDN Conclusions