This document provides a conceptual and hands-on introduction to deploying Neo4j containers across a cluster using three popular container orchestration tools: Docker Swarm, Kubernetes, and Mesos. It begins with an overview of containerization and orchestration, then dives into details of each tool - describing their core concepts, architectures, and strengths. Hands-on demos are provided for deploying a Neo4j container on a local Docker Swarm cluster, a Google Kubernetes cluster, and a Mesos/Marathon cluster. The document aims to help readers understand and choose between these leading orchestration approaches.
Gentle Intro to Container Orchestration with Docker Swarm, Kubernetes & Mesos
1. A Gentle (Hands-on) Introduction to Container Orchestration with Docker Swarm, Kubernetes, Mesos
Dippy Aggarwal
Ph.D. Candidate, University of Cincinnati, Ohio
2. Dippy Aggarwal
Dissertation focus: Graph databases (Neo4j), data warehouses, schema evolution and provenance
Summer Intern 2016 at Cincinnati Children’s Hospital, Biomedical Informatics, High Performance Computing Team
3. High Performance Computing Team, Biomedical Informatics, Cincinnati Children’s Hospital
Carmen De Vito, Prakash Velayuthum, Roberto Perea, Kevin Sandy, Mark Cunningham, Jason Curtis
4. Outline
• Orchestration 101
• Overview of three orchestration tools: Kubernetes, Docker Swarm and Mesos
• Demos: automated cluster deployment of Neo4j with the three orchestration approaches
Motivation: To present a conceptual and hands-on introduction for deploying Neo4j containers across a cluster using the three popular container orchestration tools: Docker Swarm, Kubernetes and Mesos.
5. What is Docker/Containerization all about?
What purpose do containers serve in general?
A shipping container ships goods; a Docker container ships software. Both are stackable, portable, and isolated.
• Avoid “Runs on my machine” issues
• Package your application as a standardized unit
[1] http://www.computerweekly.com/feature/Demystifying-Kubernetes-the-tool-to-manage-Google-scale-workloads-in-the-cloud
[2] The Docker Book: Containerization is the new virtualization, James Turnbull, https://www.docker.com/what-docker
6. Docker Progression (contd.)
• Docker Adoption Is Up 30% in One Year
• 2/3 of Companies That Try Docker Adopt It
• Adopters 5x Their Container Count within 9 Months
• Docker Now Runs on 10% of the Hosts We Monitor
Source: https://www.datadoghq.com/docker-adoption/
8. The Challenge
• With containers alone, scaling and fault tolerance are difficult to achieve
• How to handle replication?
• How to make multiple containers communicate?
• How to deploy and manage multiple containers across a cluster of machines?
Orchestration is the idea of going from launching a container on one machine to running many containers spread across a fleet of machines.
10. Last year @GraphConnect SF 2015
David Makogon (Microsoft) and Patrick Chanezon (Docker): Containerized Neo4j: Automating Deployments with Docker
https://neo4j.com/blog/neo4j-containers-docker-azure/
11. Docker Swarm
[Figure: a single Docker Engine compared with Docker Swarm]
- Native tool by Docker
- Serves the standard Docker API
Swarm manager:
• Adds additional nodes to the cluster seamlessly
• Supports a single pool of resources
• Maintains the state of all the containers running on different Docker engines
• Makes scheduling decisions
13. Scheduling in Docker Swarm
Swarm scheduler strategies
Spread strategy: Swarm optimizes for the node with the fewest containers.
Binpack strategy: Swarm optimizes for the node that is most packed.
Swarm filters
• Running two containers on the same host (affinity)
• Running containers only on nodes that meet certain constraints: health checks, storage, etc.
Example (affinity filter, placing logger on the same host as frontend):
docker -H tcp://<manager_ip>:<manager_port> run -d --name logger -e affinity:container==frontend logger
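To make the interplay between filters and strategies concrete, here is a minimal Python sketch. It is a toy model, not Swarm's actual implementation: the `Node` class, the `schedule` function, and the per-node `capacity` limit are all hypothetical names introduced for illustration. It assumes that filters first narrow down the candidate nodes and that the strategy then ranks whatever remains by container count.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity: int                                   # hypothetical container limit per node
    containers: list = field(default_factory=list)  # names of containers running here


def schedule(nodes, container, strategy="spread", affinity=None):
    """Place `container` on one of `nodes` and return the chosen node's name."""
    # Step 1 (filters): drop full nodes, then apply the optional affinity filter,
    # which keeps only nodes already running the named container.
    candidates = [n for n in nodes if len(n.containers) < n.capacity]
    if affinity is not None:
        candidates = [n for n in candidates if affinity in n.containers]
    if not candidates:
        raise RuntimeError("no node satisfies the constraints")

    # Step 2 (strategy): rank the remaining nodes by how many containers they run.
    if strategy == "spread":
        chosen = min(candidates, key=lambda n: len(n.containers))  # least loaded
    elif strategy == "binpack":
        chosen = max(candidates, key=lambda n: len(n.containers))  # most packed
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    chosen.containers.append(container)
    return chosen.name


if __name__ == "__main__":
    cluster = [
        Node("node-1", capacity=4, containers=["frontend"]),
        Node("node-2", capacity=4, containers=["db", "cache", "web"]),
        Node("node-3", capacity=4),
    ]
    print(schedule(cluster, "worker", strategy="spread"))
    print(schedule(cluster, "batch", strategy="binpack"))
    print(schedule(cluster, "logger", affinity="frontend"))
```

In this toy cluster, spread favors the empty node, binpack fills the busiest node that still has room, and the affinity filter forces the logger onto whichever node already runs frontend.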
15. Commands to set up a Swarm cluster
Create a discovery token:
docker-machine create -d virtualbox local
docker run swarm create
export TOKEN=<token obtained from the last command>
Launch a master and two agent nodes to form a cluster:
Master: docker-machine create -d virtualbox --swarm --swarm-strategy=binpack --swarm-master --swarm-discovery token://${TOKEN} swarm-master
(--swarm-strategy is optional; the default strategy is spread.)
Agent: docker-machine create -d virtualbox --swarm --swarm-discovery token://${TOKEN} swarm-agent1
(Repeat the agent command with a different machine name for the second agent.)
17. Docker Swarm - Pros and Cons
+ Simplicity
+ With Docker 1.12, several advanced features, such as filters and scaling, are made simpler
- Limited by Docker API functionality
Creating a service: docker service create --name frontend --replicas 5 -p 80:80/tcp nginx:latest
Scaling: docker service scale frontend=100
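The service commands on this slide are declarative: you state how many replicas you want, and the orchestrator works out what to start or stop. The sketch below illustrates that idea only. The `reconcile` function and the task names are hypothetical and do not reflect how Docker implements services internally.

```python
def reconcile(desired, running):
    """Return the actions needed to move from the `running` tasks to `desired` replicas.

    `running` is a list of task names; each action is a (verb, task) tuple.
    New tasks have no name yet, so their task slot is None.
    """
    running = list(running)
    if len(running) < desired:
        # Too few replicas: start the missing ones.
        return [("start", None)] * (desired - len(running))
    # Too many (or exactly enough): stop everything beyond the desired count.
    return [("stop", task) for task in running[desired:]]


if __name__ == "__main__":
    print(reconcile(5, ["frontend.1", "frontend.2", "frontend.3"]))
    print(reconcile(2, ["frontend.1", "frontend.2", "frontend.3"]))
```

Scaling up and scaling down are the same operation from the user's point of view: only the desired count changes.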
19. What is Kubernetes?
• Container orchestration tool developed by Google, with many other participants
• Container cluster manager
Image credit: http://www.webopedia.com/TERM/G/google-container-engine.html
20. Kubernetes - core concepts
• Pod
• Service
• Replication Controller
• Deployment
• etcd
• API server
• Scheduler
• Kubelet daemon
• Kube-proxy
24. Kubernetes - Pros and Cons
+ Driven by Google
+ Provides more concepts than Swarm
+ Docker 1.12 leverages the Kubernetes idea of abstraction using pods and services
26. How does Mesos help?
Image Credits: http://www.slideshare.net/charmalloc/introductionapachemesosjstein20140714?next_slideshow=2
27. How does Mesos help?
• No static cluster partitioning required
• Mesos offers a level of abstraction
• Interleaved workloads
28. Another stack variation for Mesos
• Use Kubernetes as the container management tool
• You can even have a heterogeneous cluster: private and cloud
29. Mesos – core components
• Master: mediator between the underlying resources and the different frameworks.
-- Makes offers to frameworks about available resources and launches tasks on slaves for accepted offers.
• Slaves: the actual workhorses of the cluster.
-- Execute tasks submitted by frameworks.
• Frameworks: applications that run on Mesos and solve a specific use case.
-- Two components: Scheduler and Executor.
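The division of labor among master, slaves, and frameworks can be sketched as a simple offer loop. This is a toy model under stated assumptions, not the Mesos API: the `Offer`, `Framework`, and `Master` classes, the resource figures, and the slave names are all hypothetical. The master offers each slave's free resources to a framework, the framework's scheduler accepts or rejects, and an accepted offer results in a task launched on that slave.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    slave: str
    cpus: float
    mem_mb: int


class Framework:
    """Scheduler half of a framework: decides whether an offer is good enough."""

    def __init__(self, name, task_cpus, task_mem_mb):
        self.name = name
        self.task_cpus = task_cpus
        self.task_mem_mb = task_mem_mb

    def accepts(self, offer):
        return offer.cpus >= self.task_cpus and offer.mem_mb >= self.task_mem_mb


class Master:
    """Mediator: tracks free resources per slave and makes offers to frameworks."""

    def __init__(self, slaves):
        self.free = dict(slaves)  # slave name -> (free cpus, free memory in MB)
        self.launched = []        # (framework name, slave name) per launched task

    def offer_round(self, framework):
        """Offer each slave in turn; launch one task on the first accepted offer."""
        for slave, (cpus, mem_mb) in self.free.items():
            if framework.accepts(Offer(slave, cpus, mem_mb)):
                # Accepted: deduct the task's resources and record the launch.
                self.free[slave] = (cpus - framework.task_cpus,
                                    mem_mb - framework.task_mem_mb)
                self.launched.append((framework.name, slave))
                return slave
        return None  # every offer was rejected


if __name__ == "__main__":
    master = Master({"slave-1": (1.0, 1024), "slave-2": (4.0, 8192)})
    neo4j = Framework("neo4j", task_cpus=2.0, task_mem_mb=4096)
    print(master.offer_round(neo4j))
    print(master.free)
```

The key design point is that the master never decides where a task should run. It only advertises resources, and the placement decision stays with each framework's scheduler.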
31. Which one to choose?
• Use Docker Swarm if:
- You want to use the familiar Docker API to run Docker containers.
• Use Kubernetes if:
- You want to launch pods, which are groups of containers co-scheduled and co-located together, sharing resources.
- You are a Google fan!
• Use Marathon if:
- You want to launch Docker or non-Docker long-running apps/services.
Choose your own adventure!
“The great thing about it is that lots of modern scalable data processing application run well on Mesos (Hadoop, Kafka, Spark) and it is nice because you can run them all on the same basic resource pool, along with your new age container packaged apps” – [source]
-- Mesos provides primitives to manage an aggregated resource pool (Source: Apache Mesos Essentials book)
Mesos master: at any point, only one master is active. When running in fault-tolerant mode, multiple masters exist, but all except one are in standby mode.
Frameworks: the Scheduler decides whether to accept or reject an offer.
Executors are resource consumers and run on slaves.
Marathon: a scheduling framework.