eBPF Basics

An Introduction to eBPF (and cBPF). Topics covered include history, implementation, program types & maps. Also gives a brief introduction to XDP and DPDK

Published in: Engineering

eBPF Basics

  1. 1. (c|e)BPF Basics Michael Kehoe Sr Staff Site Reliability Engineer
  2. 2. Agenda
  3. 3. Today’s agenda 1 Introduction 2 cBPF Introduction, History & Implementation 3 eBPF Introduction, History & Implementation 5 eBPF Uses 6 XDP 7 DPDK
  4. 4. Introduction
  5. 5. Michael Kehoe $ WHOAMI • Sr Staff Site Reliability Engineer @ LinkedIn • Production-SRE Team • What I do: • Disaster Recovery • (Organizational) Visibility Engineering • Incident Management • Reliability Research
  6. 6. (c)BPF Introduction & History & Implementation
  7. 7. “BPF is a highly flexible and efficient virtual machine-like construct in the Linux kernel allowing to execute bytecode at various hook points in a safe manner. It is used in a number of Linux kernel subsystems, most prominently networking, tracing and security (e.g. sandboxing).” Cilium
  8. 8. What is cBPF? • cBPF – Classic BPF • Also known as “Linux Packet Filtering” • BPF was first introduced in 1992 by Steven McCanne and Van Jacobson in BSD • Better known as the packet filter language in tcpdump
  9. 9. What is cBPF? • Network packet filtering, Seccomp • Filter Expressions → Bytecode → Interpret • Small, in-kernel VM: register-based, switch-dispatch interpreter, few instructions • BPF uses a simple, non-shared buffer model made possible by today’s larger address space
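
The "filter expression, bytecode, interpret" pipeline can be illustrated with a toy interpreter. The sketch below is a user-space Python model, not kernel code. It supports only the three cBPF opcodes used by the program that `tcpdump -dd ip` typically prints (the accept length in the third instruction varies by tcpdump version). The names `run_filter`, `IP_ONLY` and `ethernet_frame` are hypothetical and chosen for illustration.

```python
import struct

# Opcode values as defined in <linux/filter.h> (class | size | mode).
BPF_LD_H_ABS = 0x28   # A = 16-bit big-endian value at packet[k]
BPF_JMP_JEQ_K = 0x15  # if A == k skip jt instructions, else skip jf
BPF_RET_K = 0x06      # return k (bytes to accept, 0 means drop)

# (code, jt, jf, k) tuples: accept IPv4 frames, drop everything else.
IP_ONLY = [
    (BPF_LD_H_ABS, 0, 0, 12),       # load the EtherType field
    (BPF_JMP_JEQ_K, 0, 1, 0x0800),  # IPv4: fall through, otherwise skip one
    (BPF_RET_K, 0, 0, 0x40000),     # accept up to 262144 bytes
    (BPF_RET_K, 0, 0, 0),           # drop
]


def run_filter(program, packet):
    """Interpret a tiny cBPF subset; return the accept length (0 = drop)."""
    acc = 0  # the accumulator register "A"
    pc = 0   # program counter
    while pc < len(program):
        code, jt, jf, k = program[pc]
        pc += 1
        if code == BPF_LD_H_ABS:
            if k + 2 > len(packet):
                return 0  # an out-of-bounds load rejects the packet
            (acc,) = struct.unpack_from("!H", packet, k)
        elif code == BPF_JMP_JEQ_K:
            pc += jt if acc == k else jf
        elif code == BPF_RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")
    return 0


def ethernet_frame(ethertype, payload=b""):
    """Build a frame with zeroed MAC addresses and the given EtherType."""
    return bytes(12) + struct.pack("!H", ethertype) + payload
```

Calling `run_filter(IP_ONLY, ethernet_frame(0x0800))` accepts the frame, while an IPv6 frame (EtherType 0x86DD) is dropped.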
  10. 10. History
  11. 11. History of BPF • Before BPF, each OS (Sun, DEC, SGI etc) had its own packet filtering API • In 1993: Steven McCanne & Van Jacobson published the paper “The BSD Packet Filter: A New Architecture for User-level Packet Capture” • Implemented as “Linux Socket Filter” (LSF) in kernel 2.2 • LSF keeps the BPF language (for describing filters) but uses a different internal architecture
  12. 12. Implementation
  13. 13. BPF (original) implementation • Open a special-purpose character-device, namely /dev/bpfn, for dealing with raw packets. • Associate the previous device with a network interface by using the ioctl(2) system call https://www.tcpdump.org/papers/bpf-usenix93.pdf
  14. 14. BPF (original) implementation • Set various BPF parameters (e.g. buffer size) and attach BPF filters using the ioctl(2) system call • Read packets from the kernel, or send raw packets, by reading/writing to the corresponding file descriptor of /dev/bpf using read(2)/write(2) system calls https://www.tcpdump.org/papers/bpf-usenix93.pdf
  15. 15. BPF (LSF) implementation • Utilizes sockets for passing/receiving packets to/from the kernel-space • Filters are attached with the setsockopt(2) system call https://www.tcpdump.org/papers/bpf-usenix93.pdf
  16. 16. BPF (LSF) implementation • Create a special-purpose socket (i.e., PF_PACKET) • Attach a BPF program to the socket using the setsockopt(2) system call https://www.tcpdump.org/papers/bpf-usenix93.pdf
  17. 17. BPF (LSF) implementation • Optionally set the network interface to promiscuous mode with ioctl(2) • Read packets from the kernel, or send raw packets, by reading/writing to the file descriptor of the socket using recvfrom(2)/sendto(2) system calls https://www.tcpdump.org/papers/bpf-usenix93.pdf
  18. 18. BPF (LSF) implementation TCPDUMP EXAMPLE https://static.sched.com/hosted_files/kccnceu19/b8/KubeCon-Europe-2019-Beatriz_Martinez_eBPF.pdf
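
The LSF steps (create a PF_PACKET socket, attach a filter with setsockopt(2), then read packets) might be sketched in Python as follows. This is a minimal sketch under stated assumptions: it is Linux-only, `SO_ATTACH_FILTER = 26` is the asm-generic value and differs on some architectures, and the helper names `pack_program` and `open_filtered_socket` are hypothetical. Opening the socket requires CAP_NET_RAW.

```python
import ctypes
import socket
import struct

SO_ATTACH_FILTER = 26  # assumed: value from <asm-generic/socket.h>
ETH_P_ALL = 0x0003     # from <linux/if_ether.h>: every protocol

# (code, jt, jf, k) tuples: accept IPv4 frames, drop everything else.
IP_ONLY = [
    (0x28, 0, 0, 12),       # ldh [12]      load EtherType
    (0x15, 0, 1, 0x0800),   # jeq #0x800    IPv4?
    (0x06, 0, 0, 0x40000),  # ret #262144   accept
    (0x06, 0, 0, 0),        # ret #0        drop
]


def pack_program(program):
    """Encode instructions as an array of struct sock_filter (8 bytes each)."""
    return b"".join(struct.pack("HBBI", *insn) for insn in program)


def open_filtered_socket(program, ifname=None):
    """Create a PF_PACKET socket and attach the cBPF program to it."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(ETH_P_ALL))
    # struct sock_fprog is { unsigned short len; struct sock_filter *filter; }.
    buf = ctypes.create_string_buffer(pack_program(program))
    fprog = struct.pack("HP", len(program), ctypes.addressof(buf))
    sock.setsockopt(socket.SOL_SOCKET, SO_ATTACH_FILTER, fprog)
    if ifname is not None:
        sock.bind((ifname, 0))  # restrict capture to one interface
    return sock
```

With sufficient privileges, `open_filtered_socket(IP_ONLY).recvfrom(65535)` would then return only frames that the filter accepted.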
  19. 19. (e)BPF Introduction & History & Implementation
  20. 20. (e)BPF 1 Introduction 2 History 3 Implementation 5 Program Types 6 Maps
  21. 21. “eBPF is Linux’s new superpower” Gaurav Gupta
  22. 22. “eBPF does to Linux what JavaScript does to HTML” Brendan Gregg
  23. 23. “Run code in the kernel without having to write a kernel module” Liz Rice
  24. 24. “Stateful, programmable in-kernel decisions for networking, tracing and security” Suchakrapani Datt Sharma
  25. 25. What is eBPF? • eBPF – extended Berkeley Packet Filter • User-defined, sandboxed bytecode executed by the kernel • VM that implements a RISC-like assembly language in kernel space • All interactions between kernel and user space are done through eBPF “maps” • The verifier rejects unbounded loops (bounded loops are accepted since Linux 5.3)
  26. 26. What is eBPF? • Similar to LSF, but with the following improvements: • More registers, JIT compiler (flexible/faster), verifier • Attach on Tracepoint, Kprobe, Uprobe, USDT • In-kernel trace aggregation & filtering • Control via bpf() • Designed for general event processing within the kernel • All interactions between kernel and user space are done through eBPF “maps”
  27. 27. History
  28. 28. History of BPF • 3.15: Optimization of BPF Interpreter’s instruction set • 3.18: Linux eBPF was released (bpf() syscall) • 3.19: Socket support, BPF Maps • 4.1: Kprobe support • 4.4: Perf events • 4.7: Attach to tracepoints • 4.8: XDP core • 4.10: cgroups support • 4.18: bpfilter released http://hsdm.dorsal.polymtl.ca/system/files/eBPF-5May2017%20%281%29.pdf
  29. 29. Implementation
  30. 30. What is eBPF? http://hsdm.dorsal.polymtl.ca/system/files/eBPF-5May2017%20%281%29.pdf
  31. 31. Program Types
  32. 32. (e)BPF Program Types • prog_type determines the subset of kernel helper functions that the program may call • Determines the program input (bpf_context) https://www.tcpdump.org/papers/bpf-usenix93.pdf
  33. 33. (e)BPF Program Types SOCKET-RELATED • SOCKET_FILTER: Filtering actions (e.g. drop packets) • SK_SKB: Access SKB and socket details with a view to redirecting SKBs • SOCK_OPS: Catch socket operations • XDP: Allows access to packet data as early as possible (DDoS mitigation/load-balancing) https://www.tcpdump.org/papers/bpf-usenix93.pdf
  34. 34. (e)BPF Program Types XDP • XDP: Allows access to packet data as early as possible (DDoS mitigation/ Load-balancing) https://www.tcpdump.org/papers/bpf-usenix93.pdf
  35. 35. (e)BPF Program Types KPROBES, TRACEPOINTS & PERF • KPROBE – Instrument code in any kernel function • TRACEPOINT – Instrument tracepoints in kernel code • PERF_EVENT: Instrument software and hardware perf events https://www.tcpdump.org/papers/bpf-usenix93.pdf
  36. 36. (e)BPF Program Types CGROUPS • CGROUP_SKB – Allow or deny network access on IP egress/ingress • CGROUP_SOCK – Allow or deny network access at various socket-related events • CGROUP_DEVICE – Determine if a device operation should be permitted https://www.tcpdump.org/papers/bpf-usenix93.pdf
  37. 37. (e)BPF Program Types LIGHTWEIGHT TUNNELS • LWT_IN – Examine inbound packets for lightweight tunnel de-encapsulation • LWT_OUT – Implement encapsulation tunnels for specific destination routes • LWT_XMIT – Allowed to modify content and prepend an L2 header https://www.tcpdump.org/papers/bpf-usenix93.pdf
  38. 38. (e)BPF Program Types TRAFFIC CONTROL • SCHED_CLS: A network traffic-control classifier • SCHED_ACT: A network traffic-control action https://www.tcpdump.org/papers/bpf-usenix93.pdf
  39. 39. Maps
  40. 40. (e)BPF Maps • Generic structure for storage of different types of data • Allow sharing of data between: • eBPF kernel program • Kernel and user-space https://www.tcpdump.org/papers/bpf-usenix93.pdf
  41. 41. (e)BPF Maps • Each map has the following attributes: • Type • Max number of elements • Key Size (bytes) • Value Size (bytes) http://man7.org/linux/man-pages/man2/bpf.2.html
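
The four map attributes can be made concrete with a small user-space model. The sketch below is not the kernel API: `ModelMap` is a hypothetical Python class that only mimics how a hash-type map enforces fixed key and value sizes and a maximum number of entries. Raising `E2BIG` when the map is full mirrors what the kernel hash map does on update, while returning `None` for a missing key is a simplification of this model.

```python
import errno
import struct


class ModelMap:
    """User-space stand-in for a hash-type eBPF map (illustrative only)."""

    def __init__(self, map_type, key_size, value_size, max_entries):
        self.map_type = map_type        # e.g. "HASH"
        self.key_size = key_size        # bytes per key
        self.value_size = value_size    # bytes per value
        self.max_entries = max_entries  # capacity fixed at creation time
        self._data = {}

    def _check(self, blob, size, what):
        if not isinstance(blob, bytes) or len(blob) != size:
            raise ValueError(f"{what} must be exactly {size} bytes")

    def update(self, key, value):
        self._check(key, self.key_size, "key")
        self._check(value, self.value_size, "value")
        if key not in self._data and len(self._data) >= self.max_entries:
            raise OSError(errno.E2BIG, "map is full")
        self._data[key] = value

    def lookup(self, key):
        self._check(key, self.key_size, "key")
        return self._data.get(key)  # None when the key is absent

    def delete(self, key):
        self._check(key, self.key_size, "key")
        if key not in self._data:
            raise OSError(errno.ENOENT, "no such key")
        del self._data[key]


def u32(n):
    """Pack an integer as a 4-byte native-endian key."""
    return struct.pack("=I", n)


def u64(n):
    """Pack an integer as an 8-byte native-endian value."""
    return struct.pack("=Q", n)
```

For example, `ModelMap("HASH", key_size=4, value_size=8, max_entries=2)` accepts two distinct `u32` keys with `u64` values and refuses a third.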
  42. 42. (e)BPF Maps • HASH - A hash table • ARRAY - An array map, optimized for fast lookup speeds • PROG_ARRAY - An array of FDs corresponding to eBPF programs • PERCPU_ARRAY - A per-CPU array, used to implement histograms • PERF_EVENT_ARRAY - Stores pointers to struct perf_event • CGROUP_ARRAY - Stores pointers to control groups https://lwn.net/Articles/740157/
  43. 43. (e)BPF Maps • LRU_HASH - A hash table that only retains the most recently used items • LRU_PERCPU_HASH - A per-CPU hash table that only retains the most recently used items • LPM_TRIE - A longest-prefix match trie, good for matching IP addresses • STACK_TRACE - Stores stack traces • ARRAY_OF_MAPS - A map-in-map data structure • HASH_OF_MAPS - A map-in-map data structure https://lwn.net/Articles/740157/
  44. 44. (e)BPF Maps • DEVMAP - For storing and looking up network device references • SOCKMAP - Stores and looks up sockets and allows redirection https://lwn.net/Articles/740157/
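
Of the map types listed, the longest-prefix match trie is the one whose lookup rule benefits most from an example. The sketch below is a plain Python binary trie for IPv4 only. It is an assumption-laden illustration of the lookup semantics, not how the kernel implements the map, and the class name `LpmTrie` is hypothetical.

```python
import ipaddress


class LpmTrie:
    """Binary trie over IPv4 address bits; lookup returns the value
    stored under the longest prefix that matches the address."""

    def __init__(self):
        # Each node is a dict with optional keys "0", "1" and "value".
        self.root = {}

    def insert(self, cidr, value):
        net = ipaddress.ip_network(cidr)
        bits = format(int(net.network_address), "032b")[:net.prefixlen]
        node = self.root
        for bit in bits:
            node = node.setdefault(bit, {})
        node["value"] = value

    def lookup(self, addr):
        bits = format(int(ipaddress.ip_address(addr)), "032b")
        node = self.root
        best = node.get("value")  # a /0 entry matches everything
        for bit in bits:
            node = node.get(bit)
            if node is None:
                break
            if "value" in node:
                best = node["value"]  # a longer prefix overrides a shorter one
        return best
```

With `10.0.0.0/8` and `10.1.0.0/16` inserted, looking up `10.1.2.3` returns the value stored for the /16, because it is the more specific match.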
  45. 45. eBPF Uses
  46. 46. What can BPF be used for? 1 Networking (e.g. load balancing) 2 Firewalls 3 DDOS mitigation 4 Profiling & Tracing 5 Container Security 6 Device Drivers 7 Chaos Engineering
  47. 47. What can BPF be used for NETWORKING • Load-balancing • Katran (Facebook) • General networking • Cilium • Extending the TCP stack • Network Monitoring • Flowmill • Weaveworks
  48. 48. What can BPF be used for FIREWALLS • Bpfilter (Linux 4.18)
  49. 49. What can BPF be used for DDOS MITIGATION • Use of eBPF & XDP to perform infra-wide DDoS mitigation • Facebook • Cloudflare
  50. 50. What can BPF be used for PROFILING & TRACING • Sysdig • bpftrace
  51. 51. What can BPF be used for SECURITY • Cilium • Seccomp BPF
  52. 52. What can BPF be used for DEVICE DRIVERS • eBPF provides a pseudo device driver → possible to extend this in multiple ways
  53. 53. What can BPF be used for CHAOS ENGINEERING • Use Cilium to inject latency, packet-loss, L7 HTTP errors (via a Go extension)
  54. 54. Introduction to XDP
  55. 55. Introduction to XDP • XDP – eXpress Data Path • High performance, programmable network data path (IO Visor Project) • The Linux kernel’s answer to DPDK (released in 4.8)
  56. 56. Introduction to XDP • Features: • Does not require specialized hardware • Does not require kernel bypass • Does not replace TCP/IP stack • Works with TCP/IP stack with eBPF
  57. 57. Introduction to XDP • XDP program runs as soon as the packet gets to the network driver • XDP program needs to exit with an action: • XDP_TX • XDP_DROP • XDP_PASS
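
The action an XDP program returns can be modelled in user space. The sketch below is a Python simulation of the decision logic only; real XDP programs are compiled to eBPF and run in the driver. The numeric values follow `enum xdp_action` in `<linux/bpf.h>`, which also defines XDP_ABORTED and XDP_REDIRECT. The function `xdp_verdict`, the blocklist policy and the helper `ipv4_frame` are hypothetical.

```python
import enum
import ipaddress
import struct


class XdpAction(enum.IntEnum):
    ABORTED = 0   # program error, packet is dropped
    DROP = 1      # drop the packet in the driver
    PASS = 2      # hand the packet to the normal network stack
    TX = 3        # send the packet back out of the receiving interface
    REDIRECT = 4  # forward to another interface, CPU or socket


ETH_HLEN = 14       # Ethernet header length
ETH_P_IP = 0x0800   # EtherType for IPv4
IPV4_MIN_HLEN = 20  # minimum IPv4 header length


def xdp_verdict(frame, blocked_sources):
    """Return DROP for IPv4 frames from a blocked source, else PASS."""
    if len(frame) < ETH_HLEN:
        return XdpAction.DROP  # truncated Ethernet header
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETH_P_IP:
        return XdpAction.PASS  # not IPv4: leave it to the stack
    if len(frame) < ETH_HLEN + IPV4_MIN_HLEN:
        return XdpAction.DROP  # truncated IPv4 header
    # The source address sits 12 bytes into the IPv4 header.
    src = ipaddress.IPv4Address(frame[ETH_HLEN + 12:ETH_HLEN + 16])
    return XdpAction.DROP if src in blocked_sources else XdpAction.PASS


def ipv4_frame(src):
    """Build a minimal frame whose only meaningful field is the source IP."""
    header = bytes(12) + ipaddress.IPv4Address(src).packed + bytes(4)
    return bytes(12) + struct.pack("!H", ETH_P_IP) + header
```

The explicit length checks before each read mirror the bounds checks that the eBPF verifier requires of a real XDP program before it touches packet data.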
  58. 58. Introduction to DPDK
  59. 59. Introduction to DPDK • DPDK – Data Plane Development Kit • Created in 2010 by Intel • Collection of data plane libraries & NIC drivers for fast packet processing • Open-Source under Linux Foundation • Support for multiple CPU architectures
  60. 60. DPDK Architecture https://core.dpdk.org/
  61. 61. XDP & DPDK
  62. 62. XDP & DPDK BENEFITS OF XDP • No 3rd party code • Option of busy polling or interrupt driven networking • Removes the need to: • Allocate large pages • Dedicate CPUs • Inject packets into the kernel from 3rd party user space • Define a new security model https://www.iovisor.org/technology/xdp