
Designing scalable Docker networks

This slide deck was presented at a Docker meetup in Melbourne in March 2016. As an introduction, it covers Linux network namespaces and how they work together with Docker. The main part discusses a solution that uses VXLAN networks together with EVPN BGP signalling to route traffic between Docker containers.


  1. Designing scalable Docker Networks [March 15 2016] [ Murat Mukhtarov ] Zendesk
  2. Contents
     ● Linux network namespaces
       ○ Introduction
       ○ Binding an interface to a namespace
     ● Docker networking
       ○ Namespaces
       ○ Inbound and outbound traffic flows
       ○ Clustered environments
       ○ Challenges
     ● VXLAN
       ○ Introduction
       ○ VXLAN signalling
       ○ VXLAN and Docker
     ● BGP
       ○ Routing VXLAN with BGP
       ○ Scaling VXLAN-based Docker networks with BGP
       ○ PoC
     ● What wasn’t covered in this presentation
  3. Linux network namespaces
     Network namespaces are part of the containerisation technology used by the Linux kernel. They allow you:
       ○ To create isolated network instances (namespaces) for Linux containers
       ○ Each with its own routing table, virtual interfaces and L2 isolation
     ● The tool used to operate on network namespaces is iproute2
     ● Network namespaces are stored in /var/run/netns
     ● There are two types of network namespaces:
       ○ Root namespace [ ip link ]
       ○ Non-root namespace [ ip netns .. ip link ]
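The basic namespace operations above can be sketched with iproute2 (a sketch, not from the slides; the namespace name is illustrative and the commands require root):

```shell
# Create a non-root network namespace; it appears under /var/run/netns
ip netns add NAMESPACE1
ip netns list

# Compare the root namespace with the new one
ip link                              # interfaces in the root namespace
ip netns exec NAMESPACE1 ip link     # only a DOWN loopback so far
```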
  4. Bind an interface to a network namespace
     When a network namespace is created, it has only one interface, the loopback. We can create a pair of peered ip links in the root namespace, then change the namespace of eth0-NAMESPACE1 from the root namespace to NAMESPACE1.
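The veth pair creation and namespace move described on this slide could look like the following (a reconstruction; the veth peer name follows the slide, the root-side name veth1 is an assumption, root required):

```shell
# Create a peered veth pair in the root namespace
ip link add veth1 type veth peer name eth0-NAMESPACE1

# Move one end of the pair into NAMESPACE1
ip link set eth0-NAMESPACE1 netns NAMESPACE1

# The interface is now gone from the root namespace...
ip link show eth0-NAMESPACE1 || true
# ...and shows up inside NAMESPACE1 instead
ip netns exec NAMESPACE1 ip link
```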
  5. Bringing the namespaced interface UP
     We can rename the interface inside the namespace and try to bring it UP. After bringing UP the veth end of the pipe, the interface inside NAMESPACE1 also becomes UP. Finally, assign an IP address to the eth0 interface inside NAMESPACE1.
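A possible command sequence for this slide (a sketch; the address 10.0.0.2/24 and the root-side name veth1 are assumptions, root required):

```shell
# Rename the namespaced end to the conventional eth0
ip netns exec NAMESPACE1 ip link set eth0-NAMESPACE1 name eth0

# Bring up both ends of the pipe; eth0 comes UP once its veth peer is UP
ip link set veth1 up
ip netns exec NAMESPACE1 ip link set eth0 up
ip netns exec NAMESPACE1 ip link set lo up

# Finally assign an IP address to eth0 inside NAMESPACE1
ip netns exec NAMESPACE1 ip addr add 10.0.0.2/24 dev eth0
ip netns exec NAMESPACE1 ip addr show eth0
```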
  6. Docker and network namespaces
     Docker supports different containerisation backends:
     ● libcontainer: Docker's own native Go implementation of the kernel's containerisation capabilities; the default since 0.9
     ● LXC: the default before 0.9
     Because Docker uses libcontainer, the network namespace of every container it creates is not visible in ip netns output. However, it is possible to expose it if you know the Docker container's process PID:
     PID=$(docker inspect -f '{{.State.Pid}}' $container_id)
     ln -s /proc/$PID/ns/net /var/run/netns/$PID
     Instead of $PID you can use any name for the symlink, e.g. the container ID.
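Putting the slide's two commands into context (a sketch; the container name my_container is hypothetical, root and a running Docker daemon required):

```shell
# Expose a running container's network namespace to iproute2
container_id=my_container                       # hypothetical container name
PID=$(docker inspect -f '{{.State.Pid}}' $container_id)

# /var/run/netns may not exist if ip netns was never used on this host
mkdir -p /var/run/netns
ln -s /proc/$PID/ns/net /var/run/netns/$container_id

# The container's namespace is now visible to ip netns
ip netns list
ip netns exec $container_id ip addr
```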
  7. Docker networking: introduction
     Docker does the following for you:
     - Creates an ip link pair: vethXXXXXX <-> eth0 inside the container’s namespace
     - Attaches the vethXXXXXX interface (the tunnel end in the root namespace) to the docker0 bridge (by default)
     - Assigns an IP address from the docker0 network range
     - Creates an iptables rule that sets up NAT (PAT) for you, masquerading the containers’ network behind the default eth0 interface
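Each of these pieces can be inspected on a Docker host (a sketch of inspection commands, not from the slides; root required):

```shell
# The host side of the veth pairs Docker created
ip link show type veth

# The veths attached to the docker0 bridge, and the bridge's own address
bridge link
ip addr show docker0

# The MASQUERADE rule hiding the containers' subnet behind the host
iptables -t nat -S POSTROUTING
```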
  8. Docker networking: exposing ports
     Docker can expose internal ports and even interfaces:
     - Network type host: no network namespace isolation; the root namespace is used
     - Supplying port numbers to be exposed: iptables rules are created that allow the given port(s) and set up a port mapping (port translation) rule
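Both modes in practice (a sketch; image and port choices are illustrative, root and a Docker daemon required):

```shell
# Host networking: the container shares the root namespace,
# so it sees the host's interfaces directly
docker run --rm --net=host busybox ip addr

# Port exposure: map host port 8080 to container port 80
docker run -d -p 8080:80 --name web nginx

# Docker adds a DNAT (port translation) rule for the mapping
iptables -t nat -S DOCKER
```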
  9. Docker networking: clustered environments
     Docker now offers multi-host networking: a KV store to signal the network, clustering using Docker Swarm, and an overlay transport. Requires Linux kernel version > 3.17.
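The KV-store setup from this era could look roughly like this (a sketch; the Consul address, interface name and subnet are assumptions, and the daemon flags match Docker ~1.9):

```shell
# On each node, point the Docker daemon at a shared KV store
docker daemon \
  --cluster-store=consul://192.168.33.1:8500 \
  --cluster-advertise=eth1:2376

# Then an overlay network created on one node is visible on all of them
docker network create --driver overlay --subnet=10.10.0.0/24 my-overlay
docker run -d --net=my-overlay --name web nginx
```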
  10. Current challenges
      The KV-store approach is a great way to interconnect different Docker-running nodes in Docker-only environments, but it still has scalability limitations for WAN, multi-datacenter and mixed (not Docker-only) scenarios.
      - Modern service-oriented applications consist of multiple processes; a platform can sometimes be described as 30-40 applications, which it would be great to containerise
      - Old networking childhood issues could return: broadcast domain problems, segmentation, etc.
      - Docker offers VXLAN support, which allows you to scale to a certain extent. However, how do you distribute knowledge about the VXLAN database to non-Docker networks?
  11. VXLAN introduction
      VXLAN is an overlay networking technology that allows Ethernet traffic to be sent encapsulated in UDP datagrams over IP networks. A detailed description of VXLAN can be found in RFC 7348. The 24-bit VNI field is the VXLAN address field; it can be compared to an 802.1q tag for Ethernet frames or an MPLS label. Bear in mind the MTU value when using VXLAN.
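The MTU caveat can be made concrete: VXLAN adds an outer Ethernet, IPv4, UDP and VXLAN header to every frame, so the interfaces inside the overlay must use a reduced MTU (assuming a standard 1500-byte underlay and IPv4):

```shell
# VXLAN encapsulation overhead per frame, in bytes:
# outer Ethernet (14) + outer IPv4 (20) + outer UDP (8) + VXLAN header (8)
OVERHEAD=$((14 + 20 + 8 + 8))

# Largest inner MTU that still fits into a 1500-byte underlay MTU
INNER_MTU=$((1500 - OVERHEAD))

echo "overhead=${OVERHEAD} bytes, inner MTU=${INNER_MTU}"
# → overhead=50 bytes, inner MTU=1450
```

Alternatively, the underlay MTU can be raised (e.g. to 1550 or jumbo frames) so containers can keep a 1500-byte MTU.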
  12. VXLAN signalling
      A VXLAN network must be properly signalled; otherwise the participating hosts will not know of each other's existence. In terms of signalling, the following information must be advertised:
      - VXLAN Tunnel End-Point (VTEP): identifies an endpoint, the entity that originates and terminates VXLAN tunnels
      - VXLAN Network Identifier (VNI): identifies the network, similar to an 802.1q tag or MPLS label
      - IP and MAC addresses
      Ways of signalling VXLAN:
      - Unicast: a dedicated controller
      - Multicast: using PIM, with VNI:VTEP pairs propagated as multicast routes
      - Docker's implementation with a KV store
      - OpenContrail, which can use XMPP
      - BGP
  13. VXLAN signalling with BGP: EVPN
      Using the BGP protocol to carry VXLAN and MAC/IP information is described in the following RFCs and drafts:
      - https://tools.ietf.org/html/rfc7432
      - https://tools.ietf.org/html/draft-ietf-bess-evpn-overlay-02
      - https://tools.ietf.org/html/rfc4684
      BGP is designed to be highly extensible, which is why NLRI can be used to carry information other than IPv4/IPv6 routes. For EVPN, the following address families were allocated:
      ● AFI 25, which corresponds to L2VPN network signalling over BGP (the Kompella approach)
      ● SAFI 70, the subsequent address family for EVPN (VXLAN)
      Basically, VXLAN information is carried as BGP routes.
  14. VXLAN and Docker
      To create multi-tenant Docker networks with advanced isolation, we can use VXLAN in the following way:
      - Create a dedicated interface of type vxlan
      - Create a bridge interface where we stitch together the vxlan interface and the root-namespace leg of the container interface
      - Create a forwarding table entry: bridge fdb add to 00:17:42:8a:b4:05 dst 192.19.0.2 dev vxlan0
      - It would be signalled using multicast address 239.1.1.1 on port 4789 (multicast must be supported)
      OR
      - Configure KV-store parameters as daemon arguments and create an overlay network: docker network create --driver overlay my-multi-host-network
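The manual VXLAN path above could be sketched as follows (VNI 42, the bridge name and the underlay device eth0 are assumptions; MAC, remote VTEP, group and port follow the slide; root required):

```shell
# Dedicated VXLAN interface, multicast-signalled on 239.1.1.1, UDP port 4789
ip link add vxlan0 type vxlan id 42 group 239.1.1.1 dstport 4789 dev eth0

# Bridge that stitches the vxlan interface and container veth legs together
ip link add br-vxlan type bridge
ip link set vxlan0 master br-vxlan
ip link set vxlan0 up
ip link set br-vxlan up

# Static alternative to multicast learning: pre-populate the forwarding table
bridge fdb add to 00:17:42:8a:b4:05 dst 192.19.0.2 dev vxlan0

# The root-namespace leg of a container's veth pair joins the same bridge
ip link set vethXXXXXX master br-vxlan
```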
  15. Docker and VXLAN traffic flow [diagram]
  16. Docker with EVPN and BGP
      To achieve a highly scalable network for Docker we can:
      - Use VXLAN as the forwarding plane to carry network traffic and isolate different container groups and hosts
      - Signal VXLAN using BGP to manage large multi-datacenter networks
      - Use a CNI plugin to bring EVPN tunnels up automatically (Kubernetes)
      A BGP implementation for VXLAN and EVPN written in Python, based on ExaBGP code: bagpipe-bgp, https://github.com/Orange-OpenSource/bagpipe-bgp
      A BGP implementation in Go, used here as a route reflector: GoBGP, https://github.com/osrg/gobgp
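A GoBGP route-reflector configuration for this design could look roughly like the fragment below (a sketch based on GoBGP's TOML configuration format; the AS number and addresses follow the demo topology, but every value is an assumption):

```toml
# gobgpd.toml — route reflector for the EVPN peers (illustrative values)
[global.config]
  as = 65000
  router-id = "192.168.33.30"

[[neighbors]]
  [neighbors.config]
    neighbor-address = "192.168.33.10"   # a bagpipe-bgp node
    peer-as = 65000
  [[neighbors.afi-safis]]
    [neighbors.afi-safis.config]
      afi-safi-name = "l2vpn-evpn"       # carry EVPN routes, not IPv4
  [neighbors.route-reflector.config]
    route-reflector-client = true
    route-reflector-cluster-id = "192.168.33.30"
```

One such `[[neighbors]]` stanza per bagpipe-bgp node; as a route reflector, GoBGP re-advertises each client's EVPN routes to the others without needing a full iBGP mesh.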
  17. Stitching together Docker, BGP and VXLAN [diagram]
  18. Proof of concept: Docker + VXLAN + BGP
  19. DEMO
      Description:
      - 4 virtual machines: 3 running bagpipe-bgp and 1 running a GoBGP route reflector
      - dockerbgp1, dockerbgp2 and dockerbgp3 establish BGP sessions to the GoBGP RR: 192.168.33.30
      - dockerbgp1: 192.168.33.10, running a web server
      - dockerbgp2: 192.168.33.20, running curl
      - dockerbgp3: 192.168.33.30, just busybox for a ping test
      EVPN network: 192.168.10.0/24
      IP network for hosts: 192.168.33.0/24
  20. What we did not cover
      - Another BGP project for Docker and Kubernetes IP networking: https://www.projectcalico.org/why-bgp/
      - CNI, the Container Network Interface, a proposed standard for configuring network interfaces for Linux application containers: https://github.com/appc/cni
      - IP VPN networks using BagPipe BGP and Open vSwitch
  21. Q&A
      mmukhtarov@zendesk.com
      Links:
      GoBGP project and EVPN: https://github.com/osrg/gobgp/blob/master/docs/sources/evpn.md
      BagPipe BGP: https://github.com/Orange-OpenSource/bagpipe-bgp
      BagPipe BGP Docker image: https://hub.docker.com/r/yoshima/bagpipe-bgp/
      VXLAN: https://tools.ietf.org/html/rfc7348
      EVPN: https://tools.ietf.org/html/rfc7432 and https://tools.ietf.org/html/draft-ietf-bess-evpn-overlay-02
