5. Netlink as an IPC
• Netlink is an intra-kernel messaging system
• Netlink is an IPC between the Linux kernel
and the userspace that has:
• Socket interface (AF_NETLINK family with
various protocols)
• Broadcast messages (notifications) from the
kernel triggered by other processes
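To make the socket interface concrete, here is a minimal Python sketch of the fixed 16-byte header (struct nlmsghdr, described in man 7 netlink) that every Netlink message starts with. The helper names build_nlmsg and parse_nlmsg_header are illustrations only and do not come from any library.

```python
import struct

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid.
# "=" selects native byte order without implicit padding.
NLMSG_HDR = struct.Struct("=IHHII")
NLM_F_REQUEST = 0x01  # value from <linux/netlink.h>


def build_nlmsg(msg_type, flags, seq, payload=b"", pid=0):
    """Prefix a payload with a Netlink header (hypothetical helper)."""
    length = NLMSG_HDR.size + len(payload)
    return NLMSG_HDR.pack(length, msg_type, flags, seq, pid) + payload


def parse_nlmsg_header(data):
    """Return (len, type, flags, seq, pid) read from the start of data."""
    return NLMSG_HDR.unpack_from(data)
```

The resulting bytes are what a process would write to an AF_NETLINK socket; the protocol chosen when opening the socket decides which kernel subsystem interprets the type field.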
6. History of Netlink
• It was introduced in Linux 2.2 (1999) by
Alexey Kuznetsov at INR RAS as a successor
to ioctl for the networking interfaces
• In 1995, Linux 1.3 had /dev/netlink
(Skiplink; obsolete) by Alan Cox
• Generic Netlink has been supported since
Linux 2.6.15 (2006)
7. Netlink use cases
• iproute2, which provides the ip command
• by Alexey Kuznetsov and Stephen Hemminger
• Open vSwitch (OVS)
• e.g., the communication between the
datapath in the kernel and ovs-vswitchd
in the userspace
10. Netlink documentation
• man 7 netlink
• Netlink Protocol Library Suite (libnl)
• http://www.carisma.slowglass.com/~tgr/libnl/
• RFC 3549
• https://tools.ietf.org/html/rfc3549
• Linux kernel and iproute2 source code
11. Linux source code
• Use cscope, global or your preferred tag system
• net/netlink/*.[ch]
• include/linux/{genetlink, netlink,
rtnetlink}.h
• include/net/{genetlink, netlink,
rtnetlink}.h
• include/uapi/linux/*.h
12. iproute2 source code
• Use cscope, global or your preferred tag
system
• ip/*.[ch]
• include/linux/*.h
• e.g., include/linux/if_link.h for
links
• lib/libnetlink.c
16. Notes on Netlink
• Netlink data is transferred in the
native (host) byte order
• Little endian on a little-endian system
• Big endian on a big-endian system
• You need to subscribe to multicast groups
to get the notifications from the kernel
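Both notes can be sketched in a few lines of Python. The group bitmasks are the legacy RTMGRP_* values from <linux/rtnetlink.h>; the function names are hypothetical, and open_rtnetlink_listener only works on Linux, where the socket module exposes AF_NETLINK.

```python
import socket
import struct

# Legacy rtnetlink multicast group bitmasks from <linux/rtnetlink.h>.
RTMGRP_LINK = 0x1
RTMGRP_IPV4_IFADDR = 0x10
NETLINK_ROUTE = 0  # protocol number of rtnetlink


def pack_u32_native(value):
    """Encode a 32-bit field in host byte order, with no conversion."""
    return struct.pack("=I", value)


def open_rtnetlink_listener(groups):
    """Open a NETLINK_ROUTE socket subscribed to `groups` (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, NETLINK_ROUTE)
    # A Netlink address is the pair (pid, groups); pid 0 lets the kernel
    # assign a unique port ID to this socket.
    sock.bind((0, groups))
    return sock
```

Calling open_rtnetlink_listener(RTMGRP_LINK | RTMGRP_IPV4_IFADDR) and then recv() on the returned socket would block until the kernel reports a link or IPv4 address change.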
20. [Diagram: MidoNet architecture overview. Components shown:
clients/users; Horizon, MidoNet CLI, Nova API, Neutron API with the
MidoNet plugin, and MidoNet API; three NSDB nodes on a private network;
two compute hosts, each with Nova compute, VMs, and Midolman with a cache
over a datapath and flow table; two BGP gateways, each with Midolman over
a datapath and flow table; GRE/VXLAN tunneling; the Internet.]
21. Midolman (MidoNet agent)
[Diagram: a single host. Midolman (MidoNet agent), with its cache and
local state, watches/modifies the interfaces (IF) on the host and
adds/removes flows in the flow table of the Open vSwitch datapath.
VMs and Nova compute run on the same host. The NSDB nodes, reached over
the network, store virtual topology information.]
22. [Diagram: close-up of one host: Midolman (MidoNet agent) with its
cache and local state, the interfaces (IF) on the host, VMs, Nova compute,
and the Open vSwitch datapath with its flow table. Midolman
watches/modifies and adds/removes flows.]
23. [Diagram: the same host view, repeated, with the added label
"Netlink".]
24. MidoNet speaks Netlink
• The MidoNet agent drives the OVS datapath
kernel module
• The MidoNet agent communicates with
the kernel through Netlink
• e.g., upcalls and flow installations/invalidations
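The OVS datapath is reached through Generic Netlink, so a client first has to ask the Generic Netlink controller for the numeric ID of a family such as "ovs_flow". The sketch below only builds that lookup request in Python; it is not taken from MidoNet's odp library, the function names are hypothetical, and the version value 1 in the generic header is an assumption.

```python
import struct

NLM_F_REQUEST = 0x01
GENL_ID_CTRL = 0x10          # message type of the Generic Netlink controller
CTRL_CMD_GETFAMILY = 3       # from <linux/genetlink.h>
CTRL_ATTR_FAMILY_NAME = 2
NLMSG_HDRLEN = 16


def align4(n):
    """Round n up to the 4-byte Netlink alignment."""
    return (n + 3) & ~3


def pack_attr(attr_type, value):
    """Encode one attribute: u16 len, u16 type, value, zero padding."""
    length = 4 + len(value)
    padding = b"\x00" * (align4(length) - length)
    return struct.pack("=HH", length, attr_type) + value + padding


def build_getfamily_request(family_name, seq):
    """Build a CTRL_CMD_GETFAMILY request for a family name."""
    # struct genlmsghdr: u8 cmd, u8 version, u16 reserved.
    genl_hdr = struct.pack("=BBH", CTRL_CMD_GETFAMILY, 1, 0)
    name = family_name.encode("ascii") + b"\x00"  # NUL-terminated string
    body = genl_hdr + pack_attr(CTRL_ATTR_FAMILY_NAME, name)
    nl_hdr = struct.pack("=IHHII", NLMSG_HDRLEN + len(body),
                         GENL_ID_CTRL, NLM_F_REQUEST, seq, 0)
    return nl_hdr + body
```

The kernel's reply carries the family ID, which then becomes the message type of later requests to that family, such as flow installations.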
25. Upcall Lifecycle
1. Input stage
• Get upcalls with packets from the datapath
2. Packet processing stage
1. Deduplicate and queue packets
2. Simulate packets on the virtual topology
3. Deal with the wildcard flows
4. Determine the egress physical port
3. Output stage
• Emit packets and install flows based on the simulation results
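The three stages can be sketched as a small batch function. This is an illustration of the lifecycle only, not Midolman's implementation: process_upcalls and the simulate callback are hypothetical names, the wildcard-flow step is left out, and a None result stands for "no egress port".

```python
from collections import OrderedDict


def process_upcalls(upcalls, simulate):
    """Run one batch of upcalls through the three stages.

    upcalls:  iterable of (flow_key, packet) pairs from the datapath.
    simulate: hypothetical callable that maps a flow key to an egress
              port, or to None when the packet should be dropped.
    """
    # Packet processing: deduplicate by flow key and queue the packets,
    # so each flow is simulated only once.
    pending = OrderedDict()
    for flow_key, packet in upcalls:
        pending.setdefault(flow_key, []).append(packet)

    flows_to_install = []
    packets_to_emit = []
    for flow_key, packets in pending.items():
        # Simulate on the virtual topology and determine the egress port.
        egress_port = simulate(flow_key)
        flows_to_install.append((flow_key, egress_port))
        if egress_port is not None:
            packets_to_emit.extend((egress_port, p) for p in packets)

    # Output stage: the caller would emit the packets and install the flows.
    return flows_to_install, packets_to_emit
```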
32. Netlink/rtnetlink in MidoNet
• odp library to communicate with the OVS
datapath
• Hard-coded invocations of the ip command
• InterfaceScanner scans
interface information on the host
• e.g., interface type, MTU, …
35. InterfaceScanner
• Scans interface information on the
host periodically
• Exposes the subscription interface
• e.g., Notify the current status of all
interfaces to other components
36. New InterfaceScanner
• Gets the notifications from the kernel
through rtnetlink for the updates
• Exposes the subscription interface
• e.g., Notify the current status of all
interfaces to other components
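A notification-driven scanner has to split each receive buffer into individual Netlink messages before it can react to them. The following Python sketch shows that step under the assumption that the buffer came from a NETLINK_ROUTE socket; the message type numbers are from <linux/rtnetlink.h>, while split_messages and describe are hypothetical names.

```python
import struct

NLMSG_HDR = struct.Struct("=IHHII")  # len, type, flags, seq, pid
EVENT_NAMES = {
    16: "RTM_NEWLINK",
    17: "RTM_DELLINK",
    20: "RTM_NEWADDR",
    21: "RTM_DELADDR",
}


def split_messages(buf):
    """Yield (type, payload) for each Netlink message in a buffer."""
    offset = 0
    while offset + NLMSG_HDR.size <= len(buf):
        length, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(buf, offset)
        if length < NLMSG_HDR.size or offset + length > len(buf):
            break  # truncated or malformed message; stop parsing
        yield msg_type, buf[offset + NLMSG_HDR.size:offset + length]
        offset += (length + 3) & ~3  # messages are 4-byte aligned


def describe(buf):
    """Name the rtnetlink events found in a buffer."""
    return [EVENT_NAMES.get(t, "other") for t, _ in split_messages(buf)]
```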
38. Do you know what ip a does?
1. Retrieve link information
2. Retrieve address information
3. Combine them into a single
representation
4. Display the result
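Steps 3 and 4 can be sketched in Python as a join on the interface index. Steps 1 and 2 correspond to RTM_GETLINK and RTM_GETADDR dump requests and are assumed to have already produced the plain dicts used here; the record layout and the function names are hypothetical, and the output format only loosely imitates ip a.

```python
def combine(links, addresses):
    """Merge link and address records by interface index.

    links:     dicts with "index", "name", "mtu" (from the link dump)
    addresses: dicts with "index", "address" (from the address dump)
    """
    merged = {link["index"]: dict(link, addresses=[]) for link in links}
    for addr in addresses:
        entry = merged.get(addr["index"])
        if entry is not None:  # ignore addresses of unknown links
            entry["addresses"].append(addr["address"])
    return [merged[index] for index in sorted(merged)]


def render(interfaces):
    """Format the combined records as text, one interface per block."""
    lines = []
    for itf in interfaces:
        lines.append(f'{itf["index"]}: {itf["name"]}: mtu {itf["mtu"]}')
        lines.extend(f"    inet {address}" for address in itf["addresses"])
    return "\n".join(lines)
```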
42. Do you know what ip a does?
1. Retrieve link information
2. Retrieve address information
3. Combine them into a single
representation
4. Display the result
Blocking operation from the
perspective of Midolman
50. rtnetlink library
• Retrieves/creates rtnetlink resources
• An Observer or Subject consumes the retrieved
data
• Coordinates async operations with RxJava
• map a ByteBuffer to an rtnetlink resource
• filter some resources
• zip a few different resources
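The map and filter steps can be illustrated without RxJava. The Python sketch below decodes an RTM_NEWLINK payload (struct ifinfomsg followed by rtattr TLVs) into a plain dict and then filters the results with the built-in map and filter, which stand in for the RxJava operators. The function names and the dict layout are hypothetical; the attribute numbers are from <linux/if_link.h>.

```python
import struct

IFLA_IFNAME = 3
IFLA_MTU = 4
# struct ifinfomsg: u8 family, u8 pad, u16 type, s32 index, u32 flags, u32 change
IFINFOMSG = struct.Struct("=BxHiII")


def parse_attrs(data):
    """Decode rtattr TLVs (u16 len, u16 type, value) into a dict."""
    attrs, offset = {}, 0
    while offset + 4 <= len(data):
        length, attr_type = struct.unpack_from("=HH", data, offset)
        if length < 4 or offset + length > len(data):
            break  # malformed attribute; stop parsing
        attrs[attr_type] = data[offset + 4:offset + length]
        offset += (length + 3) & ~3  # attributes are 4-byte aligned
    return attrs


def to_link(payload):
    """Map the bytes of one RTM_NEWLINK payload to a link description."""
    _family, _type, index, _flags, _change = IFINFOMSG.unpack_from(payload)
    attrs = parse_attrs(payload[IFINFOMSG.size:])
    name = attrs.get(IFLA_IFNAME, b"").rstrip(b"\x00").decode()
    mtu = struct.unpack("=I", attrs[IFLA_MTU])[0] if IFLA_MTU in attrs else None
    return {"index": index, "name": name, "mtu": mtu}


def usable_links(payloads, min_mtu=0):
    """map bytes to links, then filter out links without a usable MTU."""
    links = map(to_link, payloads)
    return list(filter(lambda l: l["mtu"] is not None and l["mtu"] >= min_mtu, links))
```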
51. New InterfaceScanner
• Get the notifications from the kernel
• Update the single representation
format
• Let Observers consume the data
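The three bullets above amount to a small observable state holder. The Python sketch below is an illustration of that idea, not MidoNet code: the class name, the method names, and the convention that a None description means "link deleted" are all assumptions.

```python
class InterfaceTable:
    """Holds the latest view of host interfaces and notifies observers."""

    def __init__(self):
        self._interfaces = {}  # ifindex -> description dict
        self._observers = []

    def snapshot(self):
        """Return a copy of the current state, safe to hand out."""
        return {index: dict(desc) for index, desc in self._interfaces.items()}

    def subscribe(self, callback):
        """Register an observer; it immediately gets the current state."""
        self._observers.append(callback)
        callback(self.snapshot())

    def on_link_event(self, index, description):
        """Apply one kernel notification and publish the new state."""
        if description is None:
            self._interfaces.pop(index, None)  # link was deleted
        else:
            self._interfaces[index] = dict(description)
        view = self.snapshot()
        for callback in self._observers:
            callback(view)
```

Because every update publishes the whole table, observers always see the current status of all interfaces and never have to poll.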
52. Acknowledgements
• The following people helped me a lot
• Takayuki Usui, Guillermo Ontañón,
Duarte Nunes and Ivan Kelly
• Special thanks to Antoni Segura
Puimedon and Hugo Benichi
53. (Non-academic) References
• The Netlink protocol: Mysteries Uncovered, Jan
Engelhardt, 2010
• http://inai.de/documents/Netlink_Protocol.pdf
• Understanding And Programming With Netlink Sockets,
Neil Horman, 2004
• http://people.redhat.com/nhorman/papers/netlink.pdf
• Netlink - Wikipedia, the free encyclopaedia
• http://en.wikipedia.org/wiki/Netlink