Cluster schedulers
Agenda
• What is a cluster scheduler, and why would you need one?

• Cluster scheduler architectures

• Specifics of YARN, Kubernetes, Mesos and Nomad:

• Architecture

• Specific features / positioning

• Pros and cons
What is a cluster scheduler?
Do I really need it?
• A software component (monolithic or distributed) with two major functions:

• Allocate resources on node(s) for incoming workload

• Maintain the task lifecycle on allocated resources (distribute, run, keep up, shut down)

• A cluster scheduler is different from an application scheduler

• You need one (and are probably already using one) if you run a distributed application

• You need a real one if you run more than one application and need
some elasticity
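
A minimal Python sketch of the two functions named above: allocating resources on a node and tracking the task lifecycle. The `Node` and `Task` classes, the first-fit placement rule and the state names are assumptions made for illustration, not taken from any of the schedulers discussed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpus: float
    free_mem_mb: int
    tasks: dict = field(default_factory=dict)  # task name -> lifecycle state


@dataclass
class Task:
    name: str
    cpus: float
    mem_mb: int


def allocate(nodes, task):
    """Function 1: reserve resources on the first node with enough room."""
    for node in nodes:
        if node.free_cpus >= task.cpus and node.free_mem_mb >= task.mem_mb:
            node.free_cpus -= task.cpus
            node.free_mem_mb -= task.mem_mb
            node.tasks[task.name] = "RUNNING"  # Function 2: start tracking the task
            return node
    return None  # no node can host the task right now


def shutdown(node, task):
    """Function 2: end the task and give its resources back."""
    node.tasks[task.name] = "FINISHED"
    node.free_cpus += task.cpus
    node.free_mem_mb += task.mem_mb


nodes = [Node("n1", 2.0, 4096), Node("n2", 8.0, 16384)]
web = Task("web", cpus=4.0, mem_mb=8192)
placed = allocate(nodes, web)
print(placed.name if placed else "pending")  # prints n2
```

A real scheduler would also restart failed tasks ("keep up") and distribute the task's code to the node; both are left out here.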
Monolithic architecture
• The scheduler is a single process that controls everything about workloads

• Examples: Hadoop JobTracker, Kubernetes (kube-scheduler)

• Simple initial implementation

• Hard to implement different requirements for different workloads
* Picture source: http://www.firmament.io/blog/scheduler-architectures.html
Two-level architecture
• Task lifecycle is separated from resource allocation

• Examples: YARN (you have to see it), Mesos

• Easy to add different types of applications

• Hard to implement anti-interference measures and priority-based preemption across applications
* Picture source: http://www.firmament.io/blog/scheduler-architectures.html
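
The split between resource allocation and task lifecycle can be sketched in Python as follows. This is an illustrative model only: `ResourceManager`, `Framework`, `on_offer` and `offer_round` are hypothetical names, only CPUs are tracked, and the offer order is fixed, which no real two-level scheduler would do.

```python
class Framework:
    """Second level: an application-specific scheduler (hypothetical)."""

    def __init__(self, name, pending):
        self.name = name
        self.pending = list(pending)  # CPU demand of each waiting task
        self.launched = []  # (node, cpus) pairs this framework started

    def on_offer(self, node, cpus):
        """Take what fits from the offer; return the CPUs actually used."""
        used = 0.0
        for demand in list(self.pending):
            if used + demand <= cpus:
                self.pending.remove(demand)
                self.launched.append((node, demand))
                used += demand
        return used


class ResourceManager:
    """First level: owns the resources and decides who is offered what."""

    def __init__(self, free_cpus):
        self.free_cpus = dict(free_cpus)  # node name -> free CPUs

    def offer_round(self, frameworks):
        for node in self.free_cpus:
            for fw in frameworks:
                if self.free_cpus[node] <= 0:
                    break
                self.free_cpus[node] -= fw.on_offer(node, self.free_cpus[node])


rm = ResourceManager({"n1": 4.0, "n2": 2.0})
batch = Framework("batch", [2.0, 2.0])
service = Framework("service", [1.0, 1.0])
rm.offer_round([batch, service])
```

The resource manager never learns what the tasks are, which is why adding a new application type is easy, and also why it cannot easily preempt one framework's tasks in favour of another's.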
Shared-state architecture
• Each scheduler (i.e. application type) maintains its own view of the cluster state and commits changes as transactions (which can succeed or fail)

• Example: Nomad

• State has to be synchronised between schedulers
* Picture source: http://www.firmament.io/blog/scheduler-architectures.html
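
The "commit as a transaction that may fail" idea can be sketched with optimistic concurrency control. The code below is an assumption-laden Python model, not Nomad's implementation: `ClusterState`, `commit` and `schedule` are hypothetical names, and a single cluster-wide version counter stands in for the finer-grained conflict checks a real system would use.

```python
class ClusterState:
    """Shared cluster state with a version counter (hypothetical model)."""

    def __init__(self, free_cpus):
        self.free_cpus = dict(free_cpus)
        self.version = 0

    def snapshot(self):
        return self.version, dict(self.free_cpus)

    def commit(self, seen_version, node, cpus):
        """Apply a placement only if nobody committed since the snapshot."""
        if seen_version != self.version or self.free_cpus[node] < cpus:
            return False  # conflict: caller must re-read and retry
        self.free_cpus[node] -= cpus
        self.version += 1
        return True


def schedule(state, cpus, max_retries=3):
    """Plan against a private snapshot, then try to commit the plan."""
    for _ in range(max_retries):
        version, view = state.snapshot()
        candidates = [n for n, free in view.items() if free >= cpus]
        if not candidates:
            return None
        if state.commit(version, candidates[0], cpus):
            return candidates[0]
    return None


state = ClusterState({"n1": 4.0})
v_service, _ = state.snapshot()  # two schedulers read the same state
v_batch, _ = state.snapshot()
first = state.commit(v_service, "n1", 3.0)  # succeeds
second = state.commit(v_batch, "n1", 3.0)  # stale snapshot, rejected
retry = schedule(state, 1.0)  # re-reads the state and fits into what is left
```

The rejected commit is the cost of the design: schedulers work in parallel without locks, but some of their work is thrown away when they collide.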
Distributed architecture
• No centralised resource
allocation, simplified model

• Example: Sparrow

• Works very well for fine-grained tasks randomly distributed across a large cluster

• Any synchronisation (e.g. to avoid interference) is hard
* Picture source: http://www.firmament.io/blog/scheduler-architectures.html
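
One way to place tasks without central allocation is random probing: ask a few randomly chosen workers for their queue length and pick the shortest (the "power of two choices" idea). The Python sketch below illustrates only that idea; it is not Sparrow's actual algorithm, and `place`, `probes` and the queue model are assumptions.

```python
import random


def place(queue_lengths, probes=2, rng=random):
    """Probe a few random workers and pick the least loaded one."""
    sampled = rng.sample(range(len(queue_lengths)), probes)
    best = min(sampled, key=lambda w: queue_lengths[w])
    queue_lengths[best] += 1
    return best


rng = random.Random(0)
queues = [0] * 100
for _ in range(1000):
    place(queues, probes=2, rng=rng)
```

Each placement touches only the probed workers, so many such schedulers can run side by side; the price is that none of them sees the whole cluster.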
YARN: Yet Another Resource Negotiator
History
• A generalisation of the MapReduce JobTracker (decoupled into Resource Manager and Application Master), one of the two parts of “Hadoop”

• Resource allocation based on requests

• Works fine with large containers and batch processes, not so well with fine-grained tasks or services

• All Hadoop frameworks have first-class support for YARN (MRv2, Pig, Hive, Spark)

• Supports pluggable schedulers (cluster-level) and containerisation
Architecture
* Picture source: Apache Hadoop Website
Specific features / issues
• Pluggable “queue management” scheduler:

• FairScheduler: memory-fair by default; a DRF (Dominant Resource Fairness) policy can be set for a specific queue

• CapacityScheduler: pluggable resource calculator; DominantResourceCalculator supports CPU and memory

• Data locality support possible (e.g. MRv2)

• Preemption: across queues and within queues (2.8.0/3.0.0)

• Kerberos authentication, ACLs at queue and cluster level

• Awful metrics system; no support for collecting metrics from “frameworks”

• No volume management
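
The DRF policy mentioned above can be illustrated with a small Python sketch of Dominant Resource Fairness: the next task always goes to the queue whose largest resource share is currently the smallest. This is a sketch of the general technique, not YARN code; the function names, the cluster size and the per-task demands are assumptions chosen for the example.

```python
def dominant_share(usage, capacity):
    """Largest fraction of any single resource that a queue holds."""
    return max(usage[r] / capacity[r] for r in capacity)


def drf_allocate(capacity, demands, rounds):
    """Hand out one task at a time to the queue with the lowest dominant share."""
    usage = {q: {r: 0.0 for r in capacity} for q in demands}
    used = {r: 0.0 for r in capacity}
    tasks = {q: 0 for q in demands}
    for _ in range(rounds):
        # Queues whose next task still fits into the remaining capacity.
        fitting = [
            q for q in demands
            if all(used[r] + demands[q][r] <= capacity[r] for r in capacity)
        ]
        if not fitting:
            break
        q = min(fitting, key=lambda name: dominant_share(usage[name], capacity))
        for r in capacity:
            usage[q][r] += demands[q][r]
            used[r] += demands[q][r]
        tasks[q] += 1
    return tasks


capacity = {"cpu": 9.0, "mem_gb": 18.0}
demands = {
    "A": {"cpu": 1.0, "mem_gb": 4.0},  # memory-heavy tasks
    "B": {"cpu": 3.0, "mem_gb": 1.0},  # CPU-heavy tasks
}
print(drf_allocate(capacity, demands, rounds=10))  # prints {'A': 3, 'B': 2}
```

Both queues end up with a dominant share of two thirds (A in memory, B in CPU), which a memory-only fairness policy would not achieve.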
Google Kubernetes
History
• Kubernetes followed the internal “Borg” project at Google

• Initially: greenfield implementation of container orchestration targeted at services

• kube-scheduler is a small part of what K8s does

• Best for microservices in the cloud

• Huge momentum

• Very ops-friendly; Google dogfoods it (Google Container Engine runs upstream K8s)
Architecture
* Picture source: Wikipedia
Specific features
• Pods / Controllers / Services

• Controllers: ReplicaSets / StatefulSets / DaemonSets

• Volumes!

• Resources, oversubscription and QoS

• Service Discovery / Load Balancing

• Secrets

• Authentication / Authorization / Admission Control

• Monitoring: Heapster / cAdvisor

• Federation!

• …
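
For the "Resources, oversubscription and QoS" item, the Python sketch below shows how a pod's QoS class follows from the CPU and memory requests and limits of its containers. It is a simplified illustration, not Kubernetes code: `qos_class` and the dict layout are hypothetical, and quantities are compared as plain strings, so "500m" and "0.5" would wrongly count as different.

```python
def qos_class(containers):
    """Classify a pod from its containers' requests and limits.

    Each container is a dict like
    {"requests": {"cpu": "500m"}, "limits": {"cpu": "1"}}.
    """
    resources = ("cpu", "memory")
    any_set = False
    guaranteed = True
    for c in containers:
        requests = c.get("requests", {})
        limits = c.get("limits", {})
        if requests or limits:
            any_set = True
        for r in resources:
            limit = limits.get(r)
            # A missing request defaults to the limit.
            request = requests.get(r, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


print(qos_class([{"requests": {"cpu": "500m"}}]))  # prints Burstable
```

Oversubscription follows from this: Burstable and BestEffort pods may use more than they requested while the node has spare capacity, and they are the first candidates for eviction when it runs out.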
Issues
• Many concepts, hard to master and reason about (e.g. controllers are like schedulers, but not really)

• The monolithic kube-scheduler can be slow

• No IO isolation, not suitable for analytical workloads on large on-premises clusters

• No real enterprise support (that I know of)
Apache Mesos
History
• UC Berkeley 2009, Apache top-level project 2013

• Clean two-level architecture implementation

• Resource allocation based on offers

• Initially part of BDAS (the Berkeley Data Analytics Stack), targeted at Big Data first (Apache Spark started as a proof of concept for Mesos)

• Popularised by Mesosphere in its DC/OS product
Architecture
* Picture source: Apache Mesos Website
Specific features
• Flexible in the kinds of resources that can be allocated: CPUs, memory, disks / volumes, GPUs

• Pluggable: schedulers (called frameworks), containerizers, loggers, networking (CNI/libnetwork)

• Oversubscription, revocable resources, quotas

• Some volume management

• Very rough around the edges
Framework support
• Running X on Y is very common, but Mesos is a leader in terms of hosting other systems

• It’s really easy to develop a Mesos framework

• Some examples:

• Marathon/Aurora for container orchestration (some people have even tried running K8s, but that is too much)

• HDFS/Kafka/NoSQL DBs - if you like to live on the edge

• Jenkins/Artifactory/GitLab

• Spark/TF/Flink/Storm
Real-world example
HashiCorp Nomad
History
• 2015, developed by HashiCorp

• Shared-state architecture (service/batch/system schedulers), Docker scheduler

• Depends on other HashiCorp tools: Consul, Vault
Architecture
* Picture source: Nomad Website
Specific features & issues
• Multi-DC and multi-region support based on a gossip protocol

• Service/batch/system schedulers

• No authorisation, only basic TLS for communication

• No volume management

• No IO isolation

• Preemption?
Q & A
