BPF - in-kernel virtual machine
BPF is
• Berkeley Packet Filter
• low-level instruction set
• kernel infrastructure around it
  • interpreter
  • JITs
  • maps
  • helper functions
Agenda
• status and new use cases
• architecture and design
• demo
extended BPF JITs and compilers
• x64 JIT upstreamed
• arm64 JIT upstreamed
• s390 JIT in progress
• ppc JIT in progress
• LLVM backend is upstreamed
• gcc backend is in progress
extended BPF use cases
1. networking
2. tracing (analytics, monitoring, debugging)
3. in-kernel optimizations
4. hw modeling
5. crazy stuff...
1. extended BPF in networking
• socket filters
• four use cases of bpf in openvswitch (bpf+ovs)
  • bpf as an action on flow-hit
  • bpf as fallback on flow-miss
  • bpf as packet parser before flow lookup
  • bpf to completely replace ovs datapath
• two use cases in traffic control (bpf+tc)
  • cls - packet parser and classifier
  • act - action
• bpf as net_device
2. extended BPF in tracing
• bpf+kprobe - dtrace/systemtap like
• bpf+syscalls - analytics and monitoring
• bpf+tracepoints - faster alternative to kprobes
• TCP stack instrumentation with bpf+tracepoints as non-intrusive alternative to web10g
• disk latency monitoring
• live kernel debugging (with and without debug info)
3. extended BPF for in-kernel optimizations
• kernel interface is kept unmodified; subsystems use bpf to accelerate internal execution
• predicate tree walker of tracing filters -> bpf
• nft (netfilter tables) -> bpf
4. extended BPF for HW modeling
• p4 - language for programming flexible network switches
• p4 compiler into bpf (userspace)
• pass bpf into kernel via switchdev abstraction
• rocker device (part of qemu) to execute bpf
5. other crazy uses of BPF
• 'reverse BPF' was proposed
  • in-kernel NIC drivers expose BPF back to user space as generic program to construct hw specific data structures
• bpf -> NPUs
  • some networking HW vendors planning to translate bpf directly to HW
classic BPF
• BPF - Berkeley Packet Filter
• inspired by BSD
• introduced in Linux in 1997 in version 2.1.75
• initially used as socket filter by packet capture tool tcpdump (via libpcap)
classic BPF
• two 32-bit registers: A, X
• implicit stack of 16 32-bit slots (LD_MEM, ST_MEM insns)
• full integer arithmetic
• explicit load/store from packet (LD_ABS, LD_IND insns)
• conditional branches (with two destinations: jump true/false)
Ex: tcpdump syntax and classic BPF assembler
• tcpdump -d 'ip and tcp port 22'
(000) ldh  [12]                    // fetch eth proto
(001) jeq  #0x800   jt 2   jf 12   // is it IPv4 ?
(002) ldb  [23]                    // fetch ip proto
(003) jeq  #0x6     jt 4   jf 12   // is it TCP ?
(004) ldh  [20]                    // fetch frag_off
(005) jset #0x1fff  jt 12  jf 6    // is it a frag?
(006) ldxb 4*([14]&0xf)            // fetch ip header len
(007) ldh  [x + 14]                // fetch src port
(008) jeq  #0x16    jt 11  jf 9    // is it 22 ?
(009) ldh  [x + 16]                // fetch dest port
(010) jeq  #0x16    jt 11  jf 12   // is it 22 ?
(011) ret  #65535                  // trim packet and pass
(012) ret  #0                      // ignore packet
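To make the listing above concrete, here is a minimal user-space sketch of an interpreter for just the instructions it uses. It is not the kernel's implementation: the names `cbpf_op`, `cbpf_insn`, `ssh_filter` and `cbpf_run` are hypothetical, jump targets are stored as the absolute indices that `tcpdump -d` prints (the real encoding uses relative offsets), and the program is assumed to have already passed a verifier, so jumps only go forward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode names covering only the instructions in the listing. */
enum cbpf_op { LDH_ABS, LDB_ABS, LDXB_MSH, LDH_IND, JEQ, JSET, RET };

struct cbpf_insn {
    enum cbpf_op op;
    uint8_t jt, jf;   /* absolute jump targets, as printed by tcpdump -d */
    uint32_t k;       /* packet offset or immediate value */
};

/* 'ip and tcp port 22', transcribed from the listing. */
const struct cbpf_insn ssh_filter[] = {
    { LDH_ABS,   0,  0, 12     },  /* (000) fetch eth proto      */
    { JEQ,       2, 12, 0x800  },  /* (001) is it IPv4 ?         */
    { LDB_ABS,   0,  0, 23     },  /* (002) fetch ip proto       */
    { JEQ,       4, 12, 0x6    },  /* (003) is it TCP ?          */
    { LDH_ABS,   0,  0, 20     },  /* (004) fetch frag_off       */
    { JSET,     12,  6, 0x1fff },  /* (005) is it a frag?        */
    { LDXB_MSH,  0,  0, 14     },  /* (006) X = ip header len    */
    { LDH_IND,   0,  0, 14     },  /* (007) fetch src port       */
    { JEQ,      11,  9, 0x16   },  /* (008) is it 22 ?           */
    { LDH_IND,   0,  0, 16     },  /* (009) fetch dest port      */
    { JEQ,      11, 12, 0x16   },  /* (010) is it 22 ?           */
    { RET,       0,  0, 65535  },  /* (011) trim packet and pass */
    { RET,       0,  0, 0      },  /* (012) ignore packet        */
};
const size_t ssh_filter_len = sizeof ssh_filter / sizeof ssh_filter[0];

/* Runs the filter over one packet; returns the number of bytes to keep
 * (0 means drop). Out-of-bounds loads drop the packet, which models the
 * dynamic packet-boundary checks of classic BPF. */
uint32_t cbpf_run(const struct cbpf_insn *prog, size_t len,
                  const uint8_t *pkt, size_t pkt_len)
{
    uint32_t A = 0, X = 0;   /* the two 32-bit registers */
    size_t pc = 0, off;

    assert(prog != NULL && pkt != NULL);
    while (pc < len) {
        const struct cbpf_insn *i = &prog[pc];

        switch (i->op) {
        case LDH_ABS:
        case LDH_IND:
            off = (size_t)i->k + (i->op == LDH_IND ? X : 0);
            if (off + 2 > pkt_len)
                return 0;
            A = ((uint32_t)pkt[off] << 8) | pkt[off + 1];
            pc++;
            break;
        case LDB_ABS:
            if (i->k >= pkt_len)
                return 0;
            A = pkt[i->k];
            pc++;
            break;
        case LDXB_MSH:
            if (i->k >= pkt_len)
                return 0;
            X = 4u * (pkt[i->k] & 0xf);
            pc++;
            break;
        case JEQ:
            pc = (A == i->k) ? i->jt : i->jf;
            break;
        case JSET:
            pc = (A & i->k) ? i->jt : i->jf;
            break;
        case RET:
            return i->k;
        }
    }
    return 0;   /* fell off the end: drop */
}
```

Calling `cbpf_run(ssh_filter, ssh_filter_len, pkt, pkt_len)` on an Ethernet frame carrying an unfragmented IPv4/TCP segment with source or destination port 22 should return 65535; any other frame should return 0.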
Classic BPF use cases
• socket filters (drop or trim packet and pass to user space)
  • used by tcpdump/libpcap, wireshark, nmap, dhcp, arpd, ...
• in networking subsystems
  • cls_bpf (TC classifier), xt_bpf, ppp, team, ...
• seccomp (chrome sandboxing)
  • introduced in 2012 to filter syscall arguments with bpf program
Classic BPF safety
• verifier checks all instructions, forward jumps only, stack slot load/store, etc.
• instruction set has some built-in safety (no exposed stack pointer; instead, the load instruction has a 'mem' modifier)
• dynamic packet-boundary checks
Classic BPF extensions
• over the years, multiple extensions were added in the form of 'load from negative hard-coded offset'
  • LD_ABS -0x1000    - skb->protocol
    LD_ABS -0x1000+4  - skb->pkt_type
    LD_ABS -0x1000+56 - get_random()
Extended BPF
• design goals:
  • parse, lookup, update, modify network packets
  • loadable as kernel modules on demand, on live traffic
  • safe on production system
  • performance equal to native x86 code
  • fast interpreter speed (good performance on all architectures)
  • calls into bpf and calls from bpf to kernel should be free (no FFI overhead)
in kernel 3.15
[Diagram; labels: tcpdump, dhclient, chrome; cls, xt, team, ppp, ...; classic -> extended; bpf engine]
in kernel 3.18
[Diagram; labels: tcpdump, dhclient, chrome; gcc/llvm; libbpf; classic -> extended; bpf engine; bpf syscall; verifier; x64 JIT; arm64 JIT]
Early prototypes
• Failed approach #1 (design a VM from scratch)
  • performance was too slow, user tools need to be developed from scratch as well
• Failed approach #2 (have kernel disassemble and verify x86 instructions)
  • too many instruction combinations, disasm/verifier needs to be rewritten for every architecture
Extended BPF
• take a mix of real CPU instructions
  • 10% classic BPF + 70% x86 + 25% arm64 + 5% risc
• rename every x86 instruction 'mov rax, rbx' into 'mov r1, r2'
• analyze x86/arm64/risc calling conventions and define a common one for this 'renamed' instruction set
• make instruction encoding fixed size (for high interpreter speed)
• reuse classic BPF instruction encoding (for trivial classic->extended conversion)
extended vs classic BPF
• ten 64-bit registers vs two 32-bit registers
• arbitrary load/store vs stack load/store
• call instruction
Performance
• user space compiler 'thinks' that it's emitting simplified x86 code
• kernel verifies this 'simplified x86' code
• kernel JIT translates each 'simplified x86' insn into real x86
  • all registers map one-to-one
  • most instructions map one-to-one
  • bpf 'call' instruction maps to x86 'call'
Extended BPF calling convention
• BPF calling convention was carefully selected to match a subset of amd64/arm64 ABIs to avoid extra copy in calls:
  • R0 - return value
  • R1..R5 - function arguments
  • R6..R9 - callee saved
  • R10 - frame pointer
Mapping of BPF registers to x86
• R0  - rax  return value from function
  R1  - rdi  1st argument
  R2  - rsi  2nd argument
  R3  - rdx  3rd argument
  R4  - rcx  4th argument
  R5  - r8   5th argument
  R6  - rbx  callee saved
  R7  - r13  callee saved
  R8  - r14  callee saved
  R9  - r15  callee saved
  R10 - rbp  frame pointer
calls and helper functions
• bpf 'call' and set of in-kernel helper functions define what bpf programs can do
• bpf code itself is a 'glue' between calls to in-kernel helper functions
• helpers
  • map_lookup/update/delete
  • ktime_get
  • packet_write
  • fetch
BPF maps
• maps are generic storage of different types for sharing data between kernel and userspace
• maps are accessed from user space via the bpf() syscall, which has commands:
  • create a map with given type and attributes
    map_fd = bpf(BPF_MAP_CREATE, union bpf_attr *attr, u32 size)
  • lookup key/value, update, delete, iterate, delete a map
• userspace programs use this syscall to create/access maps that BPF programs are concurrently updating
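The map commands above can be sketched as thin wrappers around the raw syscall. This is a minimal sketch, assuming a Linux system whose UAPI headers provide `<linux/bpf.h>` and `__NR_bpf` (kernel 3.18 or later); the wrapper names `bpf_create_map` and `bpf_lookup_elem` follow the names used elsewhere in these slides, but their exact signatures here are an assumption. On failure (for example without sufficient privileges) both return -1 with `errno` set.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Create a map of the given type; returns a map fd, or -1 on error. */
int bpf_create_map(enum bpf_map_type map_type, unsigned int key_size,
                   unsigned int value_size, unsigned int max_entries)
{
    union bpf_attr attr;

    assert(key_size > 0 && value_size > 0 && max_entries > 0);
    memset(&attr, 0, sizeof(attr));   /* unused fields must be zero */
    attr.map_type = map_type;
    attr.key_size = key_size;
    attr.value_size = value_size;
    attr.max_entries = max_entries;
    return (int)syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
}

/* Copy the value stored under *key into *value; returns 0 or -1. */
int bpf_lookup_elem(int fd, const void *key, void *value)
{
    union bpf_attr attr;

    assert(key != NULL && value != NULL);
    memset(&attr, 0, sizeof(attr));
    attr.map_fd = (__u32)fd;
    attr.key = (__u64)(unsigned long)key;     /* pointers travel as u64 */
    attr.value = (__u64)(unsigned long)value;
    return (int)syscall(__NR_bpf, BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
}
```

For example, `bpf_create_map(BPF_MAP_TYPE_ARRAY, sizeof(__u32), sizeof(__u64), 64)` would request a 64-entry array map with 32-bit keys and 64-bit values.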
BPF compilers
• BPF backend for LLVM is in trunk and will be released as part of LLVM 3.7
• BPF backend for GCC is being worked on
• C front-end (clang) is used today to compile C code into BPF
• tracing and networking use cases may need custom languages
• BPF backend only knows how to emit instructions (calls to helper functions look like normal calls)
Extended BPF assembler
 0: r1 = *(u64 *)(r1 +8)
 1: *(u64 *)(r10 -8) = r1
 2: r1 = 1
 3: *(u64 *)(r10 -16) = r1
 4: r1 = map_fd
 6: r2 = r10
 7: r2 += -8
 8: call 1
 9: if r0 == 0x0 goto pc+4
10: r1 = *(u64 *)(r0 +0)
11: r1 += 1
12: *(u64 *)(r0 +0) = r1
13: goto pc+8
14: r1 = map_fd
16: r2 = r10
17: r2 += -8
18: r3 = r10
19: r3 += -16
20: r4 = 0
21: call 2
22: r0 = 0
23: exit

int bpf_prog(struct bpf_context *ctx)
{
    u64 loc = ctx->arg2;
    u64 init_val = 1;
    u64 *value;

    value = bpf_map_lookup_elem(&my_map, &loc);
    if (value)
        *value += 1;
    else
        bpf_map_update_elem(&my_map, &loc,
                            &init_val, BPF_ANY);
    return 0;
}

compiled by LLVM from C to bpf asm
compiler as a library
[Diagram; labels: tracing script in .txt file; .txt parser; bpf_create_map; llvm mcjit api; bpf backend; x64 backend; bpf code; x86 code; bpf_prog_load; run it; user; kernel; libllvm; perf binary]
BPF verifier (CFG check)
• To minimize run-time overhead, anything that can be checked statically is done by the verifier
• all jumps of a program form a CFG, which is checked for loops
  • DAG check = non-recursive depth-first search
  • if a back-edge exists -> there is a loop -> reject program
• jumps back are allowed if they don't form loops
• bpf compiler can move cold basic blocks out of critical path
• likely/unlikely() hints give extra performance
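The DAG check can be illustrated with a small user-space sketch. It is not the kernel's code: the instruction model (`insn_kind`, `struct insn`) and the function name `cfg_is_dag` are hypothetical, a jump target is taken to be `pc + 1 + off` as in the 'goto pc+N' notation, and only back-edges and out-of-range targets are rejected (other checks a verifier performs are left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified instruction model: only control flow matters. */
enum insn_kind { K_OTHER, K_JA, K_JCOND, K_EXIT };
struct insn { enum insn_kind kind; int off; };

/* Writes the successors of instruction pc into out[], returns their count. */
static int successors(const struct insn *prog, int pc, int out[2])
{
    switch (prog[pc].kind) {
    case K_OTHER: out[0] = pc + 1; return 1;                 /* fall through */
    case K_JA:    out[0] = pc + 1 + prog[pc].off; return 1;  /* goto pc+off  */
    case K_JCOND: out[0] = pc + 1;
                  out[1] = pc + 1 + prog[pc].off; return 2;  /* both paths   */
    case K_EXIT:  return 0;
    }
    return 0;
}

/* Non-recursive depth-first search from instruction 0.
 * Returns false on a back-edge (loop) or an out-of-range target. */
bool cfg_is_dag(const struct insn *prog, int len)
{
    enum { WHITE, GREY, BLACK };   /* unvisited, on DFS stack, finished */
    bool ok = true;
    int sp = 0;

    assert(prog != NULL && len > 0);
    unsigned char *color = calloc((size_t)len, 1);
    unsigned char *next = calloc((size_t)len, 1);  /* next edge to try */
    int *stack = malloc((size_t)len * sizeof *stack);
    if (!color || !next || !stack) {
        free(color); free(next); free(stack);
        return false;
    }

    color[0] = GREY;
    stack[sp++] = 0;
    while (ok && sp > 0) {
        int pc = stack[sp - 1];
        int out[2];
        int n = successors(prog, pc, out);

        if (next[pc] == n) {       /* all edges explored */
            color[pc] = BLACK;
            sp--;
            continue;
        }
        int t = out[next[pc]++];
        if (t < 0 || t >= len)
            ok = false;            /* jump or fall-through out of range */
        else if (color[t] == GREY)
            ok = false;            /* back-edge: there is a loop */
        else if (color[t] == WHITE) {
            color[t] = GREY;
            stack[sp++] = t;
        }
    }
    free(color); free(next); free(stack);
    return ok;
}
```

Note that a backward jump is accepted as long as its target is not currently on the DFS stack, which matches the point that jumps back are allowed when they don't form loops.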
BPF verifier (instruction walking)
• once it's known that all paths through the program reach the final 'exit' instruction, a brute-force analyzer of all instructions starts
• it descends all possible paths from the 1st insn till the 'exit' insn
• it simulates execution of every insn and updates the state change of registers and stack
BPF verifier
• at the start of the program:
  • type of R1 = PTR_TO_CTX
    type of R10 = FRAME_PTR
    other registers and stack are unreadable
• when verifier sees:
  • 'R2 = R1' instruction, it copies the type of R1 into R2
  • 'R3 = 123' instruction, the type of R3 becomes CONST_IMM
  • 'exit' instruction, it checks that R0 is readable
  • 'if (R4 == 456) goto pc+5' instruction, it checks that R4 is readable and forks the current state of registers and stack into 'true' and 'false' branches
BPF verifier (state pruning)
• every branch adds another fork for the verifier to explore, therefore branch pruning is important
• when the verifier sees an old (already explored) state that has a stricter register state and a stricter stack state, the current branch doesn't need to be explored further, since the verifier already concluded that the stricter state leads to a valid 'exit'
• two states are considered equivalent if the explored register state and the explored stack state are more conservative than the current ones
unprivileged programs?
• today extended BPF is root only
• to consider unprivileged access:
  • teach verifier to conditionally reject programs that expose kernel addresses to user space
  • constant blinding pass
BPF for tracing
• BPF is seen as alternative to systemtap/dtrace
• provides in-kernel aggregation, event filtering
• can be 'always on'
• must have minimal overhead
BPF for tracing (kernel part)
/* sent to kernel as bpf map via bpf() syscall */
struct bpf_map_def SEC("maps") my_hist_map = {
    .type = BPF_MAP_TYPE_ARRAY,
    .key_size = sizeof(u32),
    .value_size = sizeof(u64),
    .max_entries = 64,
};

/* name of elf section - tracing event to attach via perf_event ioctl */
SEC("events/syscalls/sys_enter_write")
/* compiled by llvm into .o and loaded via bpf() syscall */
int bpf_prog(struct bpf_context *ctx)
{
    u64 write_size = ctx->arg3;
    u32 index = log2(write_size);
    u64 *value;

    value = bpf_map_lookup_elem(&my_hist_map, &index);
    if (value)
        __sync_fetch_and_add(value, 1);
    return 0;
}
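The program above calls `log2()` on the write size to pick a histogram slot. Since the verifier rejects loops, an integer log2 would have to be loop-free; the following is a user-space sketch of one such helper, with the hypothetical names `log2_u64`, `bucket_low` and `bucket_high`. It is an illustration of the bucketing, not the helper the slides actually used.

```c
#include <assert.h>
#include <stdint.h>

/* Loop-free floor(log2(v)) by binary search over the bit position.
 * Returns 0 for v == 0 so the result is always a valid index 0..63. */
unsigned int log2_u64(uint64_t v)
{
    unsigned int r = 0;

    if (v >> 32) { v >>= 32; r += 32; }
    if (v >> 16) { v >>= 16; r += 16; }
    if (v >> 8)  { v >>= 8;  r += 8; }
    if (v >> 4)  { v >>= 4;  r += 4; }
    if (v >> 2)  { v >>= 2;  r += 2; }
    if (v >> 1)  { r += 1; }
    return r;
}

/* Smallest value that lands in histogram slot `index`: 2^index. */
uint64_t bucket_low(unsigned int index)
{
    assert(index < 64);
    return (uint64_t)1 << index;
}

/* Largest value that lands in slot `index`: 2^(index+1) - 1,
 * computed without overflowing for index == 63. */
uint64_t bucket_high(unsigned int index)
{
    assert(index < 64);
    return (((uint64_t)1 << index) - 1) * 2 + 1;
}
```

With this bucketing a 64-entry array covers every possible 64-bit size, and slot 3, for instance, spans the values 8 through 15.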
BPF for tracing (user part)
u64 data[64] = {};
u32 key, max_ind = 0;
u64 value, max_value = 0;

/* user space walks the map and fetches elements via bpf() syscall */
for (key = 0; key < 64; key++) {
    bpf_lookup_elem(fd, &key, &value);
    data[key] = value;
    if (value && key > max_ind)
        max_ind = key;
    if (value > max_value)
        max_value = value;
}
printf("syscall write() stats\n");

syscall write() stats
 byte_size   : count  distribution
   1 -> 1    : 9      |*************************** |
   2 -> 3    : 0      | |
   4 -> 7    : 0      | |
   8 -> 15   : 2      |***** |
  16 -> 31   : 0      | |
  32 -> 63   : 10     |****************************** |
  64 -> 127  : 12     |************************************* |
 128 -> 255  : 1      |** |
 256 -> 511  : 2      |***** |
(Brendan Gregg's slide)
Extended BPF
demo

More Related Content

What's hot

Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsBrendan Gregg
ย 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingViller Hsiao
ย 
Staring into the eBPF Abyss
Staring into the eBPF AbyssStaring into the eBPF Abyss
Staring into the eBPF AbyssSasha Goldshtein
ย 
Performance Wins with BPF: Getting Started
Performance Wins with BPF: Getting StartedPerformance Wins with BPF: Getting Started
Performance Wins with BPF: Getting StartedBrendan Gregg
ย 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringScyllaDB
ย 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsHisaki Ohara
ย 
Faster packet processing in Linux: XDP
Faster packet processing in Linux: XDPFaster packet processing in Linux: XDP
Faster packet processing in Linux: XDPDaniel T. Lee
ย 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance AnalysisBrendan Gregg
ย 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Brendan Gregg
ย 
Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Ray Jenkins
ย 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareBrendan Gregg
ย 
Introduction to eBPF and XDP
Introduction to eBPF and XDPIntroduction to eBPF and XDP
Introduction to eBPF and XDPlcplcp1
ย 
Linux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersBrendan Gregg
ย 
ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!Affan Syed
ย 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking ExplainedThomas Graf
ย 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network StackAdrien Mahieux
ย 
Security Monitoring with eBPF
Security Monitoring with eBPFSecurity Monitoring with eBPF
Security Monitoring with eBPFAlex Maestretti
ย 
Dpdk applications
Dpdk applicationsDpdk applications
Dpdk applicationsVipin Varghese
ย 
New Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingNew Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingScyllaDB
ย 
DPDK In Depth
DPDK In DepthDPDK In Depth
DPDK In DepthKernel TLV
ย 

What's hot (20)

Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
ย 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracing
ย 
Staring into the eBPF Abyss
Staring into the eBPF AbyssStaring into the eBPF Abyss
Staring into the eBPF Abyss
ย 
Performance Wins with BPF: Getting Started
Performance Wins with BPF: Getting StartedPerformance Wins with BPF: Getting Started
Performance Wins with BPF: Getting Started
ย 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uring
ย 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructions
ย 
Faster packet processing in Linux: XDP
Faster packet processing in Linux: XDPFaster packet processing in Linux: XDP
Faster packet processing in Linux: XDP
ย 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
ย 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)
ย 
Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!
ย 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
ย 
Introduction to eBPF and XDP
Introduction to eBPF and XDPIntroduction to eBPF and XDP
Introduction to eBPF and XDP
ย 
Linux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF Superpowers
ย 
ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!
ย 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking Explained
ย 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
ย 
Security Monitoring with eBPF
Security Monitoring with eBPFSecurity Monitoring with eBPF
Security Monitoring with eBPF
ย 
Dpdk applications
Dpdk applicationsDpdk applications
Dpdk applications
ย 
New Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingNew Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using Tracing
ย 
DPDK In Depth
DPDK In DepthDPDK In Depth
DPDK In Depth
ย 

Similar to BPF - in-kernel virtual machine

Modern Linux Tracing Landscape
Modern Linux Tracing LandscapeModern Linux Tracing Landscape
Modern Linux Tracing LandscapeSasha Goldshtein
ย 
eBPF Basics
eBPF BasicseBPF Basics
eBPF BasicsMichael Kehoe
ย 
Berkeley Packet Filters
Berkeley Packet FiltersBerkeley Packet Filters
Berkeley Packet FiltersKernel TLV
ย 
MM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary Demos
MM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary DemosMM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary Demos
MM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary DemosAMD Developer Central
ย 
Kernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at FacebookKernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at FacebookAnne Nicolas
ย 
Dynamic Instrumentation- OpenEBS Golang Meetup July 2017
Dynamic Instrumentation- OpenEBS Golang Meetup July 2017Dynamic Instrumentation- OpenEBS Golang Meetup July 2017
Dynamic Instrumentation- OpenEBS Golang Meetup July 2017OpenEBS
ย 
OSN days 2019 - Open Networking and Programmable Switch
OSN days 2019 - Open Networking and Programmable SwitchOSN days 2019 - Open Networking and Programmable Switch
OSN days 2019 - Open Networking and Programmable SwitchChun Ming Ou
ย 
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)Yuuki Takano
ย 
Comprehensive XDP Offโ€Œload-handling the Edge Cases
Comprehensive XDP Offโ€Œload-handling the Edge CasesComprehensive XDP Offโ€Œload-handling the Edge Cases
Comprehensive XDP Offโ€Œload-handling the Edge CasesNetronome
ย 
Building Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCCBuilding Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCCKernel TLV
ย 
Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Andriy Berestovskyy
ย 
Netlink-Optimization.pptx
Netlink-Optimization.pptxNetlink-Optimization.pptx
Netlink-Optimization.pptxKalimuthuVelappan
ย 
BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!Linaro
ย 
15 ia64
15 ia6415 ia64
15 ia64dilip kumar
ย 
DEF CON 27 - JEFF DILEO - evil e bpf in depth
DEF CON 27 - JEFF DILEO - evil e bpf in depthDEF CON 27 - JEFF DILEO - evil e bpf in depth
DEF CON 27 - JEFF DILEO - evil e bpf in depthFelipe Prado
ย 
Routing basics/CEF
Routing basics/CEFRouting basics/CEF
Routing basics/CEFDmitry Figol
ย 
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RCThe Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RCThomas Graf
ย 
Microchip's PIC Micro Controller
Microchip's PIC Micro ControllerMicrochip's PIC Micro Controller
Microchip's PIC Micro ControllerMidhu S V Unnithan
ย 
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIWLec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIWHsien-Hsin Sean Lee, Ph.D.
ย 

Similar to BPF - in-kernel virtual machine (20)

Modern Linux Tracing Landscape
Modern Linux Tracing LandscapeModern Linux Tracing Landscape
Modern Linux Tracing Landscape
ย 
Ebpf ovsconf-2016
Ebpf ovsconf-2016Ebpf ovsconf-2016
Ebpf ovsconf-2016
ย 
eBPF Basics
eBPF BasicseBPF Basics
eBPF Basics
ย 
Berkeley Packet Filters
Berkeley Packet FiltersBerkeley Packet Filters
Berkeley Packet Filters
ย 
MM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary Demos
MM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary DemosMM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary Demos
MM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary Demos
ย 
Kernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at FacebookKernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at Facebook
ย 
Dynamic Instrumentation- OpenEBS Golang Meetup July 2017
Dynamic Instrumentation- OpenEBS Golang Meetup July 2017Dynamic Instrumentation- OpenEBS Golang Meetup July 2017
Dynamic Instrumentation- OpenEBS Golang Meetup July 2017
ย 
OSN days 2019 - Open Networking and Programmable Switch
OSN days 2019 - Open Networking and Programmable SwitchOSN days 2019 - Open Networking and Programmable Switch
OSN days 2019 - Open Networking and Programmable Switch
ย 
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
ย 
Comprehensive XDP Offโ€Œload-handling the Edge Cases
Comprehensive XDP Offโ€Œload-handling the Edge CasesComprehensive XDP Offโ€Œload-handling the Edge Cases
Comprehensive XDP Offโ€Œload-handling the Edge Cases
ย 
Building Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCCBuilding Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCC
ย 
Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)
ย 
Netlink-Optimization.pptx
Netlink-Optimization.pptxNetlink-Optimization.pptx
Netlink-Optimization.pptx
ย 
BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!
ย 
15 ia64
15 ia6415 ia64
15 ia64
ย 
DEF CON 27 - JEFF DILEO - evil e bpf in depth
DEF CON 27 - JEFF DILEO - evil e bpf in depthDEF CON 27 - JEFF DILEO - evil e bpf in depth
DEF CON 27 - JEFF DILEO - evil e bpf in depth
ย 
Routing basics/CEF
Routing basics/CEFRouting basics/CEF
Routing basics/CEF
ย 
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RCThe Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
ย 
Microchip's PIC Micro Controller
Microchip's PIC Micro ControllerMicrochip's PIC Micro Controller
Microchip's PIC Micro Controller
ย 
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIWLec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
ย 

Recently uploaded

CALL ON โžฅ8923113531 ๐Ÿ”Call Girls Badshah Nagar Lucknow best Female service
CALL ON โžฅ8923113531 ๐Ÿ”Call Girls Badshah Nagar Lucknow best Female serviceCALL ON โžฅ8923113531 ๐Ÿ”Call Girls Badshah Nagar Lucknow best Female service
CALL ON โžฅ8923113531 ๐Ÿ”Call Girls Badshah Nagar Lucknow best Female serviceanilsa9823
ย 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
ย 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto Gonzรกlez Trastoy
ย 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
ย 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
ย 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
ย 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
ย 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
ย 
CALL ON โžฅ8923113531 ๐Ÿ”Call Girls Kakori Lucknow best sexual service Online โ˜‚๏ธ
CALL ON โžฅ8923113531 ๐Ÿ”Call Girls Kakori Lucknow best sexual service Online  โ˜‚๏ธCALL ON โžฅ8923113531 ๐Ÿ”Call Girls Kakori Lucknow best sexual service Online  โ˜‚๏ธ
CALL ON โžฅ8923113531 ๐Ÿ”Call Girls Kakori Lucknow best sexual service Online โ˜‚๏ธanilsa9823
ย 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
ย 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
ย 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
ย 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
ย 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
ย 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
ย 
CHEAP Call Girls in Pushp Vihar (-DELHI )๐Ÿ” 9953056974๐Ÿ”(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )๐Ÿ” 9953056974๐Ÿ”(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )๐Ÿ” 9953056974๐Ÿ”(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )๐Ÿ” 9953056974๐Ÿ”(=)/CALL GIRLS SERVICE9953056974 Low Rate Call Girls In Saket, Delhi NCR
ย 
Shapes for Sharing between Graph Data Spacesย - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spacesย - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spacesย - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spacesย - and Epistemic Querying of RDF-...Steffen Staab
ย 
call girls in Vaishali (Ghaziabad) ๐Ÿ” >เผ’8448380779 ๐Ÿ” genuine Escort Service ๐Ÿ”โœ”๏ธโœ”๏ธ
call girls in Vaishali (Ghaziabad) ๐Ÿ” >เผ’8448380779 ๐Ÿ” genuine Escort Service ๐Ÿ”โœ”๏ธโœ”๏ธcall girls in Vaishali (Ghaziabad) ๐Ÿ” >เผ’8448380779 ๐Ÿ” genuine Escort Service ๐Ÿ”โœ”๏ธโœ”๏ธ
call girls in Vaishali (Ghaziabad) ๐Ÿ” >เผ’8448380779 ๐Ÿ” genuine Escort Service ๐Ÿ”โœ”๏ธโœ”๏ธDelhi Call girls
ย 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
ย 

BPF - in-kernel virtual machine

  • 1. BPF – in-kernel virtual machine
  • 2. BPF is
      • Berkeley Packet Filter
      • low level instruction set
      • kernel infrastructure around it
          • interpreter
          • JITs
          • maps
          • helper functions
  • 3. Agenda
      • status and new use cases
      • architecture and design
      • demo
  • 4. extended BPF JITs and compilers
      • x64 JIT upstreamed
      • arm64 JIT upstreamed
      • s390 JIT in progress
      • ppc JIT in progress
      • LLVM backend is upstreamed
      • gcc backend is in progress
  • 5. extended BPF use cases
      1. networking
      2. tracing (analytics, monitoring, debugging)
      3. in-kernel optimizations
      4. hw modeling
      5. crazy stuff...
  • 6. 1. extended BPF in networking
      • socket filters
      • four use cases of bpf in openvswitch (bpf+ovs)
          • bpf as an action on flow-hit
          • bpf as fallback on flow-miss
          • bpf as packet parser before flow lookup
          • bpf to completely replace ovs datapath
      • two use cases in traffic control (bpf+tc)
          • cls – packet parser and classifier
          • act – action
      • bpf as net_device
  • 7. 2. extended BPF in tracing
      • bpf+kprobe – dtrace/systemtap like
      • bpf+syscalls – analytics and monitoring
      • bpf+tracepoints – faster alternative to kprobes
      • TCP stack instrumentation with bpf+tracepoints as non-intrusive alternative to web10g
      • disk latency monitoring
      • live kernel debugging (with and without debug info)
  • 8. 3. extended BPF for in-kernel optimizations
      • kernel interface is kept unmodified. subsystems use bpf to accelerate internal execution
      • predicate tree walker of tracing filters -> bpf
      • nft (netfilter tables) -> bpf
  • 9. 4. extended BPF for HW modeling
      • p4 – language for programming flexible network switches
      • p4 compiler into bpf (userspace)
      • pass bpf into kernel via switchdev abstraction
      • rocker device (part of qemu) to execute bpf
  • 10. 5. other crazy uses of BPF
      • 'reverse BPF' was proposed
      • in-kernel NIC drivers expose BPF back to user space as generic program to construct hw specific data structures
      • bpf -> NPUs
      • some networking HW vendors planning to translate bpf directly to HW
  • 11. classic BPF
      • BPF - Berkeley Packet Filter
      • inspired by BSD
      • introduced in linux in 1997 in version 2.1.75
      • initially used as socket filter by packet capture tool tcpdump (via libpcap)
  • 12. classic BPF
      • two 32-bit registers: A, X
      • implicit stack of 16 32-bit slots (LD_MEM, ST_MEM insns)
      • full integer arithmetic
      • explicit load/store from packet (LD_ABS, LD_IND insns)
      • conditional branches (with two destinations: jump true/false)
  • 13. Ex: tcpdump syntax and classic BPF assembler
      • tcpdump -d 'ip and tcp port 22'
        (000) ldh  [12]                     // fetch eth proto
        (001) jeq  #0x800    jt 2   jf 12   // is it IPv4 ?
        (002) ldb  [23]                     // fetch ip proto
        (003) jeq  #0x6      jt 4   jf 12   // is it TCP ?
        (004) ldh  [20]                     // fetch frag_off
        (005) jset #0x1fff   jt 12  jf 6    // is it a frag?
        (006) ldxb 4*([14]&0xf)             // fetch ip header len
        (007) ldh  [x + 14]                 // fetch src port
        (008) jeq  #0x16     jt 11  jf 9    // is it 22 ?
        (009) ldh  [x + 16]                 // fetch dest port
        (010) jeq  #0x16     jt 11  jf 12   // is it 22 ?
        (011) ret  #65535                   // trim packet and pass
        (012) ret  #0                       // ignore packet
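
For readers less used to classic BPF assembler, the same decision logic can be written as ordinary C over a raw Ethernet frame. This is only a sketch: `filter_ip_tcp_port22` and `load_be16` are hypothetical names, not part of tcpdump, libpcap, or the kernel, and the explicit length checks stand in for the dynamic packet-boundary checks that classic BPF loads perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 16-bit value; the caller guarantees off + 2 <= len. */
static uint16_t load_be16(const uint8_t *pkt, size_t off)
{
    return (uint16_t)((pkt[off] << 8) | pkt[off + 1]);
}

/*
 * Hypothetical C rendering of the listing for 'ip and tcp port 22'.
 * Returns 65535 (pass, keep up to 65535 bytes) or 0 (ignore packet).
 * A load outside the packet drops the packet.
 */
static uint32_t filter_ip_tcp_port22(const uint8_t *pkt, size_t len)
{
    size_t x;

    if (len < 24)                          /* covers the loads at [12], [14], [20], [23] */
        return 0;
    if (load_be16(pkt, 12) != 0x0800)      /* (000)-(001) is it IPv4? */
        return 0;
    if (pkt[23] != 6)                      /* (002)-(003) is it TCP? */
        return 0;
    if (load_be16(pkt, 20) & 0x1fff)       /* (004)-(005) non-first fragment */
        return 0;
    x = 4u * (pkt[14] & 0x0f);             /* (006) IP header length in bytes */
    if (x + 16 > len)                      /* bounds check for the load at [x + 14] */
        return 0;
    if (load_be16(pkt, x + 14) == 22)      /* (007)-(008) source port */
        return 65535;                      /* (011) */
    if (x + 18 > len)                      /* bounds check for the load at [x + 16] */
        return 0;
    if (load_be16(pkt, x + 16) == 22)      /* (009)-(010) destination port */
        return 65535;                      /* (011) */
    return 0;                              /* (012) */
}
```

The numbers in the comments refer to the instruction indices of the listing, so each `if` can be matched to the `jt`/`jf` pair it replaces.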
  • 14. Classic BPF use cases
      • socket filters (drop or trim packet and pass to user space)
          • used by tcpdump/libpcap, wireshark, nmap, dhcp, arpd, ...
      • in networking subsystems
          • cls_bpf (TC classifier), xt_bpf, ppp, team, ...
      • seccomp (chrome sandboxing)
          • introduced in 2012 to filter syscall arguments with bpf program
  • 15. Classic BPF safety
      • verifier checks all instructions, forward jumps only, stack slot load/store, etc
      • instruction set has some built-in safety (no exposed stack pointer, instead load instruction has 'mem' modifier)
      • dynamic packet-boundary checks
  • 16. Classic BPF extensions
      • over the years multiple extensions were added in the form of 'load from negative hard coded offset'
      • LD_ABS -0x1000 – skb->protocol
        LD_ABS -0x1000+4 – skb->pkt_type
        LD_ABS -0x1000+56 – get_random()
  • 17. Extended BPF
      • design goals:
          • parse, lookup, update, modify network packets
          • loadable as kernel modules on demand, on live traffic
          • safe on production system
          • performance equal to native x86 code
          • fast interpreter speed (good performance on all architectures)
          • calls into bpf and calls from bpf to kernel should be free (no FFI overhead)
  • 18. in kernel 3.15
      • (diagram; labels: tcpdump, dhclient, chrome, cls, xt, team, ppp, …, classic -> extended, bpf engine)
  • 19. in kernel 3.18
      • (diagram; labels: tcpdump, dhclient, chrome, gcc/llvm, libbpf, classic -> extended, bpf engine, bpf syscall, verifier, x64 JIT, arm64 JIT)
  • 20. Early prototypes
      • Failed approach #1 (design a VM from scratch)
          • performance was too slow, user tools need to be developed from scratch as well
      • Failed approach #2 (have kernel disassemble and verify x86 instructions)
          • too many instruction combinations, disasm/verifier needs to be rewritten for every architecture
  • 21. Extended BPF
      • take a mix of real CPU instructions
          • 10% classic BPF + 70% x86 + 25% arm64 + 5% risc
      • rename every x86 instruction 'mov rax, rbx' into 'mov r1, r2'
      • analyze x86/arm64/risc calling conventions and define a common one for this 'renamed' instruction set
      • make instruction encoding fixed size (for high interpreter speed)
      • reuse classic BPF instruction encoding (for trivial classic->extended conversion)
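
To make "fixed size" concrete, here is a sketch of the two 8-byte instruction layouts, modeled on `struct sock_filter` (classic) and `struct bpf_insn` (extended) from the kernel's UAPI headers. The `_sketch` struct names and the `mov64_reg` helper are illustrative and not kernel identifiers; the bit-field packing assumes a gcc/clang-style ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Classic BPF instruction: opcode, two jump targets, one 32-bit operand. */
struct classic_insn_sketch {
    uint16_t code;  /* opcode */
    uint8_t  jt;    /* jump offset if the condition is true */
    uint8_t  jf;    /* jump offset if the condition is false */
    uint32_t k;     /* generic operand (constant, packet offset, ...) */
};

/* Extended BPF instruction: same 8-byte size, but with register fields. */
struct ext_insn_sketch {
    uint8_t code;         /* opcode */
    uint8_t dst_reg : 4;  /* destination register, R0..R10 */
    uint8_t src_reg : 4;  /* source register, R0..R10 */
    int16_t off;          /* signed offset for jumps and memory access */
    int32_t imm;          /* signed 32-bit immediate */
};

/* 64-bit register-to-register move: ALU64 (0x07) | MOV (0xb0) | X (0x08). */
#define SKETCH_OP_MOV64_REG 0xbf

/* Encode 'mov rDST, rSRC', the renamed form of x86 'mov rax, rbx'. */
static struct ext_insn_sketch mov64_reg(uint8_t dst, uint8_t src)
{
    struct ext_insn_sketch insn = {
        .code = SKETCH_OP_MOV64_REG,
        .dst_reg = dst & 0xf,
        .src_reg = src & 0xf,
        .off = 0,
        .imm = 0,
    };
    return insn;
}
```

Because every instruction is eight bytes, an interpreter can step through a program with plain array indexing, and `mov64_reg(1, 2)` is one such array element for 'mov r1, r2'.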
  • 22. extended vs classic BPF
      • ten 64-bit registers vs two 32-bit registers
      • arbitrary load/store vs stack load/store
      • call instruction
  • 23. Performance
      • user space compiler 'thinks' that it's emitting simplified x86 code
      • kernel verifies this 'simplified x86' code
      • kernel JIT translates each 'simplified x86' insn into real x86
          • all registers map one-to-one
          • most of instructions map one-to-one
          • bpf 'call' instruction maps to x86 'call'
  • 24. Extended BPF calling convention
      • BPF calling convention was carefully selected to match a subset of amd64/arm64 ABIs to avoid extra copy in calls:
          • R0 – return value
          • R1..R5 – function arguments
          • R6..R9 – callee saved
          • R10 – frame pointer
  • 25. Mapping of BPF registers to x86
        R0  – rax   return value from function
        R1  – rdi   1st argument
        R2  – rsi   2nd argument
        R3  – rdx   3rd argument
        R4  – rcx   4th argument
        R5  – r8    5th argument
        R6  – rbx   callee saved
        R7  – r13   callee saved
        R8  – r14   callee saved
        R9  – r15   callee saved
        R10 – rbp   frame pointer
  • 26. calls and helper functions
      • bpf 'call' and set of in-kernel helper functions define what bpf programs can do
      • bpf code itself is a 'glue' between calls to in-kernel helper functions
      • helpers
          • map_lookup/update/delete
          • ktime_get
          • packet_write
          • fetch
  • 27. BPF maps
      • maps are generic storage of different types for sharing data between kernel and userspace
      • the maps are accessed from user space via the BPF syscall, which has commands:
          • create a map with given type and attributes
            map_fd = bpf(BPF_MAP_CREATE, union bpf_attr *attr, u32 size)
          • lookup key/value, update, delete, iterate, delete a map
      • userspace programs use this syscall to create/access maps that BPF programs are concurrently updating
  • 28. BPF compilers
      • BPF backend for LLVM is in trunk and will be released as part of 3.7
      • BPF backend for GCC is being worked on
      • C front-end (clang) is used today to compile C code into BPF
      • tracing and networking use cases may need custom languages
      • BPF backend only knows how to emit instructions (calls to helper functions look like normal calls)
  • 29. Extended BPF assembler
        int bpf_prog(struct bpf_context *ctx)
        {
            u64 loc = ctx->arg2;
            u64 init_val = 1;
            u64 *value;

            value = bpf_map_lookup_elem(&my_map, &loc);
            if (value)
                *value += 1;
            else
                bpf_map_update_elem(&my_map, &loc,
                                    &init_val, BPF_ANY);
            return 0;
        }

      compiled by LLVM from C to bpf asm:

         0: r1 = *(u64 *)(r1 +8)
         1: *(u64 *)(r10 -8) = r1
         2: r1 = 1
         3: *(u64 *)(r10 -16) = r1
         4: r1 = map_fd
         6: r2 = r10
         7: r2 += -8
         8: call 1
         9: if r0 == 0x0 goto pc+4
        10: r1 = *(u64 *)(r0 +0)
        11: r1 += 1
        12: *(u64 *)(r0 +0) = r1
        13: goto pc+8
        14: r1 = map_fd
        16: r2 = r10
        17: r2 += -8
        18: r3 = r10
        19: r3 += -16
        20: r4 = 0
        21: call 2
        22: r0 = 0
        23: exit
  • 30. compiler as a library
      • (diagram; labels: tracing script in .txt file, .txt parser, llvm mcjit api, bpf backend, x64 backend, bpf code, x86 code, bpf_create_map, bpf_prog_load, user, kernel, libllvm, perf binary, run it)
  • 31. BPF verifier (CFG check)
      • To minimize run-time overhead anything that can be checked statically is done by verifier
      • all jumps of a program form a CFG which is checked for loops
          • DAG check = non-recursive depth-first-search
          • if back-edge exists -> there is a loop -> reject program
      • jumps back are allowed if they don't form loops
      • bpf compiler can move cold basic blocks out of critical path
      • likely/unlikely() hints give extra performance
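
The DAG check can be sketched as an iterative depth-first search over a toy instruction model. This is not the kernel's code: `toy_insn`, `successors`, and `cfg_is_dag` are names invented for this illustration, and the model only distinguishes plain instructions, unconditional jumps, conditional jumps, and exit.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_INSNS 64

enum insn_kind { INSN_PLAIN, INSN_JUMP, INSN_COND_JUMP, INSN_EXIT };

struct toy_insn {
    enum insn_kind kind;
    int off;                 /* jump target is index + 1 + off */
};

enum { UNSEEN = 0, ON_PATH = 1, DONE = 2 };

/* Fill succ[] with the successors of insn i; return their count, or -1 if one is out of range. */
static int successors(const struct toy_insn *prog, int n, int i, int succ[2])
{
    int cnt = 0;

    if (prog[i].kind == INSN_EXIT)
        return 0;
    if (prog[i].kind != INSN_JUMP)
        succ[cnt++] = i + 1;                  /* fall-through edge */
    if (prog[i].kind != INSN_PLAIN)
        succ[cnt++] = i + 1 + prog[i].off;    /* jump edge */
    for (int k = 0; k < cnt; k++)
        if (succ[k] < 0 || succ[k] >= n)
            return -1;
    return cnt;
}

/* Return true if no path from insn 0 contains a loop (no back-edge). */
static bool cfg_is_dag(const struct toy_insn *prog, int n)
{
    int state[MAX_INSNS] = {0};
    int next_edge[MAX_INSNS] = {0};
    int stack[MAX_INSNS];
    int top = 0;

    if (n <= 0 || n > MAX_INSNS)
        return false;
    state[0] = ON_PATH;
    stack[top++] = 0;
    while (top > 0) {
        int i = stack[top - 1];
        int succ[2];
        int cnt = successors(prog, n, i, succ);

        if (cnt < 0)
            return false;                     /* jump or fall-through leaves the program */
        if (next_edge[i] == cnt) {            /* all edges explored: pop */
            state[i] = DONE;
            top--;
            continue;
        }
        int t = succ[next_edge[i]++];
        if (state[t] == ON_PATH)
            return false;                     /* back-edge: there is a loop */
        if (state[t] == UNSEEN) {
            state[t] = ON_PATH;
            stack[top++] = t;
        }
    }
    return true;
}
```

A jump to a lower index is accepted as long as its target is not on the current DFS path, which is the "jumps back are allowed if they don't form loops" case.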
  • 32. BPF verifier (instruction walking)
      • once it's known that all paths through the program reach final 'exit' instruction, brute force analyzer of all instructions starts
      • it descends all possible paths from the 1st insn till 'exit' insn
      • it simulates execution of every insn and updates the state change of registers and stack
  • 33. BPF verifier
      • at the start of the program:
          • type of R1 = PTR_TO_CTX
          • type of R10 = FRAME_PTR
          • other registers and stack are unreadable
      • when verifier sees:
          • 'R2 = R1' instruction, it copies the type of R1 into R2
          • 'R3 = 123' instruction, the type of R3 becomes CONST_IMM
          • 'exit' instruction, it checks that R0 is readable
          • 'if (R4 == 456) goto pc+5' instruction, it checks that R4 is readable and forks current state of registers and stack into 'true' and 'false' branches
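
The register-type rules above can be sketched for straight-line programs as follows. This is a toy model under stated assumptions: the `vinsn`, `vstate`, and `verify_straight_line` names are invented here, only three instruction kinds are modeled, branch forking and stack tracking are left out, and treating R10 as read-only is an added rule that the slide does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGS 11  /* R0..R10 */

enum reg_type { NOT_INIT, PTR_TO_CTX, FRAME_PTR, CONST_IMM };

struct vstate { enum reg_type regs[NUM_REGS]; };

enum vop { OP_MOV_REG, OP_MOV_IMM, OP_EXIT };

struct vinsn { enum vop op; int dst; int src; int32_t imm; };

/* State at program start: R1 is the context pointer, R10 the frame pointer. */
static void vstate_init(struct vstate *s)
{
    for (int i = 0; i < NUM_REGS; i++)
        s->regs[i] = NOT_INIT;
    s->regs[1] = PTR_TO_CTX;
    s->regs[10] = FRAME_PTR;
}

static bool reg_ok(int r) { return r >= 0 && r < NUM_REGS; }

/* Simulate each insn's effect on register types; true if the program is accepted. */
static bool verify_straight_line(const struct vinsn *prog, int n, struct vstate *s)
{
    vstate_init(s);
    for (int i = 0; i < n; i++) {
        const struct vinsn *in = &prog[i];

        switch (in->op) {
        case OP_MOV_REG:
            if (!reg_ok(in->dst) || !reg_ok(in->src) || in->dst == 10)
                return false;                 /* assumption: R10 is read-only */
            if (s->regs[in->src] == NOT_INIT)
                return false;                 /* source register is unreadable */
            s->regs[in->dst] = s->regs[in->src];
            break;
        case OP_MOV_IMM:
            if (!reg_ok(in->dst) || in->dst == 10)
                return false;
            s->regs[in->dst] = CONST_IMM;
            break;
        case OP_EXIT:
            return s->regs[0] != NOT_INIT;    /* R0 must be readable at exit */
        }
    }
    return false;                             /* ran off the end without 'exit' */
}
```

A full verifier would additionally copy the whole state at each conditional jump and walk both copies, which is where the cost discussed on the next slide comes from.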
  • 34. BPF verifier (state pruning)
      • every branch adds another fork for verifier to explore, therefore branch pruning is important
      • when the verifier sees an old state that has a more strict register state and a more strict stack state, the current branch doesn't need to be explored further, since the verifier already concluded that the more strict state leads to a valid 'exit'
      • two states are treated as equivalent if the explored state's registers and stack are more conservative than the current one's
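
One way to read "more strict" is that the already-explored state assumed less: wherever it differs from the current state, it treated the register or stack slot as unreadable. The sketch below encodes that reading over a toy state; `vstate`, `slot_subsumes`, and `can_prune` are hypothetical names, the stack is modeled as a small array of typed slots, and the real verifier tracks considerably more per register.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_REGS    11  /* R0..R10 */
#define STACK_SLOTS 8   /* toy stack size, chosen for illustration */

enum reg_type { NOT_INIT, PTR_TO_CTX, FRAME_PTR, CONST_IMM };

struct vstate {
    enum reg_type regs[NUM_REGS];
    enum reg_type stack[STACK_SLOTS];
};

/*
 * The old (explored) value covers the current one if they are identical,
 * or if the old path was verified without being able to read it at all.
 */
static bool slot_subsumes(enum reg_type old, enum reg_type cur)
{
    return old == cur || old == NOT_INIT;
}

/* True if 'cur' need not be explored because 'old' already led to a valid exit. */
static bool can_prune(const struct vstate *old, const struct vstate *cur)
{
    for (int i = 0; i < NUM_REGS; i++)
        if (!slot_subsumes(old->regs[i], cur->regs[i]))
            return false;
    for (int i = 0; i < STACK_SLOTS; i++)
        if (!slot_subsumes(old->stack[i], cur->stack[i]))
            return false;
    return true;
}
```

The comparison is deliberately one-directional: a path verified with R2 unreadable is also safe when R2 holds a constant, but not the other way around.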
  • 35. unprivileged programs?
      • today extended BPF is root only
      • to consider unprivileged access:
          • teach verifier to conditionally reject programs that expose kernel addresses to user space
          • constant blinding pass
  • 36. BPF for tracing
      • BPF is seen as alternative to systemtap/dtrace
      • provides in-kernel aggregation, event filtering
      • can be 'always on'
      • must have minimal overhead
  • 37. BPF for tracing (kernel part)
        struct bpf_map_def SEC("maps") my_hist_map = {
            .type = BPF_MAP_TYPE_ARRAY,
            .key_size = sizeof(u32),
            .value_size = sizeof(u64),
            .max_entries = 64,
        };

        SEC("events/syscalls/sys_enter_write")
        int bpf_prog(struct bpf_context *ctx)
        {
            u64 write_size = ctx->arg3;
            u32 index = log2(write_size);
            u64 *value;

            value = bpf_map_lookup_elem(&my_hist_map, &index);
            if (value)
                __sync_fetch_and_add(value, 1);
            return 0;
        }

      • my_hist_map: sent to kernel as bpf map via bpf() syscall
      • bpf_prog: compiled by llvm into .o and loaded via bpf() syscall
      • SEC("events/..."): name of elf section - tracing event to attach via perf_event ioctl
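
The slide does not show the `log2()` helper that picks the histogram bucket. A minimal sketch follows, written without loops because the verifier described earlier rejects programs whose CFG contains a loop. The name `log2_bucket` is hypothetical, and mapping an input of 0 to bucket 0 is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * floor(log2(v)) for v >= 1, computed with a fixed sequence of branches.
 * The result is at most 63, so it always indexes a 64-entry array map.
 */
static uint32_t log2_bucket(uint64_t v)
{
    uint32_t r = 0;

    if (v >> 32) { v >>= 32; r += 32; }
    if (v >> 16) { v >>= 16; r += 16; }
    if (v >> 8)  { v >>= 8;  r += 8;  }
    if (v >> 4)  { v >>= 4;  r += 4;  }
    if (v >> 2)  { v >>= 2;  r += 2;  }
    if (v >> 1)  { r += 1; }
    return r;
}
```

With this definition, bucket k counts writes of 2^k to 2^(k+1) - 1 bytes, for example bucket 6 for sizes 64 to 127.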
  • 38. BPF for tracing (user part)
        u64 data[64] = {};
        u32 key;
        u64 value;

        for (key = 0; key < 64; key++) {
            bpf_lookup_elem(fd, &key, &value);
            data[key] = value;
            if (value && key > max_ind)
                max_ind = key;
            if (value > max_value)
                max_value = value;
        }
        printf("syscall write() stats\n");

      • user space walks the map and fetches elements via bpf() syscall

        syscall write() stats
          byte_size   : count  distribution
            1 -> 1    : 9      |*************************** |
            2 -> 3    : 0      | |
            4 -> 7    : 0      | |
            8 -> 15   : 2      |***** |
           16 -> 31   : 0      | |
           32 -> 63   : 10     |****************************** |
           64 -> 127  : 12     |************************************* |
          128 -> 255  : 1      |** |
          256 -> 511  : 2      |***** |