SlideShare a Scribd company logo
1 of 33
Download to read offline
Heterogeneity-Aware Cluster
Scheduling Policies for Deep
Learning Workloads
Deepak Narayanan†, Keshav Santhanam†, Fiodar Kazhamiaka†, Amar
Phanishayee*, Matei Zaharia†
†Stanford University, *Microsoft Research
Hardware for ML training is becoming highly
specialized and heterogeneous!
Nvidia GPUs: K80,
P100, V100, A100
Google TPU FPGAs in Azure
…and others
How should we allocate
heterogeneous resources?
Scheduler
V100 GPU
P100 GPU
Training jobs written in
existing frameworks
…
…
…
Objective (e.g., fairness)
Heterogeneous
cluster
How should one allocate heterogeneous resources to DL training
jobs from multiple users while optimizing different objectives?
Challenge 1: Heterogeneous performance
• Models and operators (e.g., convolutional, attention) perform
differently across hardware architectures
• Disregarding heterogeneity can lead to unfair allocations
Magnitude of speedup across GPU generations varies significantly
Challenge 2: Diverse scheduling objectives
• Single-job objectives: “maximize throughput” or “minimize cost”
• Minimizing cost subject to SLOs involves moving between fast but
expensive, and slow but cheap instances
• Multi-job objectives: fairness or more complicated hierarchical policies
Hierarchical policy: Weighted
fairness
across sub-organizations, FIFO and
fairness within
How should we allocate
heterogeneous resources?
Accelerator
devices
Deep learning
training workloads
Image
classification
Object
detection
Translation
Dog Perro
Objectives
Maximize throughput, Minimize cost Single-job
Multi-jobMax-min fairness, Makespan, Hierarchical
Gavel: A new heterogeneity-aware cluster scheduler
• Generalizes a wide range of existing scheduling policies by expressing
policies as optimization problems over the allocation
• Provides abstraction to incorporate performance heterogeneity
• Round-based scheduling mechanism ensures jobs receive optimal allocation
• Improves objectives such as average job completion time by 3.5×
Throughput
Estimator
Policy
Scheduling
MechanismThroughput
tensor
Allocation
Per-round
placement
Throughput measurements from runs fed back into throughput estimator
V100
P100
Training jobs written in
existing frameworks
…
…
…
If measurements provided by user User objective
Outline
• Background and Motivation
• Challenges with allocating resources over heterogeneous resources
• Heterogeneity-aware Policies
• Round-based Scheduling Mechanism
• Throughput Estimation
• Evaluation
𝑋 specifies the fraction of time a job spends on each accelerator between
allocation recomputations
Allocations (𝑿) as time fractions
Allocations recomputed either at periodic intervals of time, or
on a reset event (new job arrival, or old job completes)
Scheduling policies to be made heterogeneity-aware
• LAS [1]: Max-min fairness by total compute time
• LAS w/ weights: Max-min fairness by total compute time with weights
• Minimize Makespan: Minimize time taken by batch of jobs
• Finish Time Fairness [2]: Maximize minimum job speedup
• FIFO: First in, first out
• Shortest Job First: Minimize time taken by shortest job
• Minimize cost (w/ SLOs): Minimize total cost in public cloud (subject to SLOs)
• Hierarchical: Multi-level policy with fairness as top-level policy, and FIFO or fairness
as lower-level policies. Per-job weights can be specified
[1] Tiresias: A GPU Cluster Manager for Distributed Deep Learning, NSDI 2019, Gu et al.
[2] Themis: Fair and Efficient GPU Cluster Scheduling, NSDI 2020, Mahajan et al.
• In a homogeneous cluster, policy objectives are functions of throughput
(e.g., duration = training steps / throughput)
• To make policies heterogeneity-aware, policy objectives can be
expressed in terms of effective throughput (given allocation 𝑋):
throughput 𝑚, 𝑋 = ,
!
𝑇"! ⋅ 𝑋"!
Policies as optimization problems
𝑇 is matrix of raw throughputs of
each job on each accelerator type
Heterogeneity-aware policies:
Max-Min fairness
• On a homogeneous cluster, Least Attained Service policy is a max-min
fairness policy that equalizes the total compute time each job receives
Maximize# min
"
𝑋"
• Jobs can see unequal throughput reductions on heterogeneous clusters
• Instead, compute max-min fairness over normalized throughputs:
Maximize# min
"
throughput(𝑚, 𝑋)
normalizing_factor"
Scheduling policies to be made
heterogeneity-aware
• LAS: Max-min fairness by total compute time
• LAS w/ weights: Max-min fairness by total compute time with weights
• Minimize Makespan: Minimize time taken by batch of jobs
• Finish Time Fairness: Maximize minimum job speedup
• FIFO: First in, first out
• Shortest Job First: Minimize time taken by shortest job
• Minimize cost (w/ SLOs): Minimize total cost in public cloud (subject to SLOs)
• Hierarchical: Multi-level policy with fairness as top-level policy, and FIFO or fairness
as lower-level policies. Per-job weights can be specified
Performance optimizations:
space sharing and placement awareness
• Gavel can also deploy existing performance optimizations like space-
sharing and placement awareness [1, 2] in a heterogeneity-aware way
• Objectives in terms of throughput(𝑚, 𝑋) unchanged
• 𝑋 needs to be modified to account for performance optimization (e.g.,
allocation for each job combination)
• Raw throughputs (𝑇) for concurrently running applications might need to
be measured / estimated on the fly (see paper for details)
[1] Gandiva: Introspective Cluster Scheduling for Deep Learning, OSDI 2018, Xiao et al.
[2] Themis: Fair and Efficient GPU Cluster Scheduling, NSDI 2020, Mahajan et al.
Outline
• Background and Motivation
• Challenges with allocating resources over heterogeneous resources
• Heterogeneity-aware Policies
• Round-based Scheduling Mechanism
• Throughput Estimation
• Evaluation
How do we realize an optimal allocation?
Given an optimal heterogeneity-aware allocation by a policy, how do we
assign resources to jobs?
Assignments of jobs to
heterogeneous cluster
resources
Gavel’s round-based scheduling
• Round-based scheduler ensures jobs receive time on accelerator
types according to the computed optimal allocation 𝑋
Gavel’s round-based scheduling
• Round-based scheduler ensures jobs receive time on accelerator
types according to the computed optimal allocation 𝑋
• Priority score for every (job, accelerator) combination
• High when a job has received a smaller time fraction than 𝑋
18
Gavel’s round-based scheduling
• Round-based scheduler ensures jobs receive time on accelerator
types according to the computed optimal allocation 𝑋
• Priority score for every (job, accelerator) combination
• High when a job has received a smaller time fraction than 𝑋
• Round durations are 6 minutes in our experiments
• Jobs are issued lease extensions if they do not change workers
between rounds
Outline
• Background and Motivation
• Challenges with allocating resources over heterogeneous resources
• Heterogeneity-aware Policies
• Round-based Scheduling Mechanism
• Throughput Estimation
• Evaluation
Throughput estimation in Gavel
• How well does a newly arrived job perform in isolation on each
worker type?
• How well does a newly arrived job co-locate with other jobs on each
worker type?
Throughput estimation in Gavel
• Naively measuring all combinations will not scale – O(kn)
combinations to measure for each newly arrived job (k resource
types, n active jobs)
• Solution: Use matrix completion approach proposed by Paragon /
Quasar [2, 3]
[2] "QoS-Aware Scheduling in Heterogeneous Datacenters with Paragon". Christina Delimitrou and Christos Kozyrakis.
In the ACM Transactions on Computer Systems (TOCS), Vol. 31 Issue 4, December 2013.
[3] "Quasar: Resource-Efficient and QoS-Aware Cluster Management". Christina Delimitrou and Christos Kozyrakis. In
Proc. of the Nineteenth International Conference on Architectural Support for Programming Languages and Operating
Systems (ASPLOS), Salt Lake City, March 2014.
Throughput estimation in Gavel
Matrix completion
Green entries measured
Black entries not measured
Crossed entries estimates of
missing black entries
! !"#$
Fingerprint
of job i
Find closest
reference job
(computed offline)
Ref. job 1
Ref. job 2
Ref. job r
New job i
...
Outline
• Background and Motivation
• Challenges with allocating resources over heterogeneous resources
• Heterogeneity-aware Policies
• Round-based Scheduling Mechanism
• Throughput Estimation
• Evaluation
Main questions
• Do Gavel’s policies improve objective metrics in a physical cluster?
• What is the impact of input load on objectives using Gavel’s policies?
• Can Gavel’s policy framework support hierarchical policies?
• How do Gavel’s policies scale with the number of active jobs?
• How well does Gavel’s scheduling mechanism realize optimal
allocations?
• What is the overhead of preemption in Gavel?
Gavel improves end objectives
on a physical cluster
• Gavel reduces average JCT by 1.5x
• Gavel without space sharing reduces makespan by 1.2x compared to a
baseline that uses ad-hoc space sharing
• Results in simulation reflect reality (< 8% difference)
System Policy Physical Simulated
Heterogeneity-agnostic Least Attained Service
(average JCT)
5.1 hrs 5.4 hrs
Heterogeneity-aware 3.4 hrs 3.7 hrs
Heterogeneity-agnostic (w/ ad
hoc space sharing)
Makespan 21.3 hrs 22.1 hrs
Heterogeneity-aware 17.7 hrs 17.6 hrs
Physical cluster with 8
V100 GPUs, 16 P100
GPUs, 24 K80 GPUs
Gavel can enable the same heterogeneous
cluster to support higher input load
Higher input
job rate
3.5× better
average JCT
• Simulated cluster with 36 V100 GPUs, 36 P100 GPUs, 36 K80 GPUs
• Each policy evaluated on multiple traces (different Poisson arrival rates)
Shorter CDF tail
JCT CDF (input job rate = 5.6 jobs/hr)
Gavel can support hierarchical policies
Weighted fairness
at both levels Entity 0
Job 0 Job 1 Job 2
𝑤!"#$#% & = 1
Entity 1 Entity 2
… …
• Six jobs per entity
• 𝑤!"#$#% & < 𝑤!"#$#% ' < 𝑤!"#$#% (
• 𝑤!"#$#% ' = 2 implies that entity 1 should get 2× resources as entity 0
Organization
𝑤!"#$#% ' = 2 𝑤!"#$#% ( = 3
Gavel can support hierarchical policies
Widths of bars indicate that inter- and
intra-entity weights are respected
Allocation
in ratio of
3:2:1
Gavel scales to clusters with hundreds
of active jobs
Gavel can compute heterogeneity-aware
allocations over 2048 jobs in a minute
64 seconds
for 2k jobs< 0.13 seconds
for 2k jobs
Gavel’s scheduling mechanism nearly matches
ideal execution
Gavel’s scheduling mechanism with 6 minute rounds
matches ideal baseline (jobs get exact allocation)
Gavel’s scheduling mechanism introduces
minimal overhead
Jobs experience overhead of < 3%, even
without lease renewals
Conclusion
https://cs.stanford.edu/~keshav2/
• Gavel is a heterogeneity-aware cluster scheduler able to optimize for many
high-level objectives such as fairness, makespan, and cost
• Gavel formulates existing policies as optimization problems, and extends
these optimization problems to be heterogeneity-aware
• Gavel can reduce average job completion time by 3.5×
Code open sourced at https://github.com/stanford-futuredata/gavel
keshav2@stanford.edu

More Related Content

Similar to Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads

Task Scheduling using Tabu Search algorithm in Cloud Computing Environment us...
Task Scheduling using Tabu Search algorithm in Cloud Computing Environment us...Task Scheduling using Tabu Search algorithm in Cloud Computing Environment us...
Task Scheduling using Tabu Search algorithm in Cloud Computing Environment us...
AzarulIkhwan
 
Scheduling and sequencing
Scheduling and sequencingScheduling and sequencing
Scheduling and sequencing
Akanksha Gupta
 

Similar to Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads (20)

genetic paper
genetic papergenetic paper
genetic paper
 
AI4Media WP3 workshop - Distributed training introduction
AI4Media WP3 workshop - Distributed training introductionAI4Media WP3 workshop - Distributed training introduction
AI4Media WP3 workshop - Distributed training introduction
 
Podila mesos con-northamerica_sep2017
Podila mesos con-northamerica_sep2017Podila mesos con-northamerica_sep2017
Podila mesos con-northamerica_sep2017
 
A New Approach for Job Scheduling Using Hybrid GA-ST Optimization-Crimson Pub...
A New Approach for Job Scheduling Using Hybrid GA-ST Optimization-Crimson Pub...A New Approach for Job Scheduling Using Hybrid GA-ST Optimization-Crimson Pub...
A New Approach for Job Scheduling Using Hybrid GA-ST Optimization-Crimson Pub...
 
Service Request Scheduling in Cloud Computing using Meta-Heuristic Technique:...
Service Request Scheduling in Cloud Computing using Meta-Heuristic Technique:...Service Request Scheduling in Cloud Computing using Meta-Heuristic Technique:...
Service Request Scheduling in Cloud Computing using Meta-Heuristic Technique:...
 
Chapter 2 Time boxing &amp; agile models
Chapter 2   Time boxing &amp; agile modelsChapter 2   Time boxing &amp; agile models
Chapter 2 Time boxing &amp; agile models
 
TEMPORALLY EXTENDED ACTIONS FOR REINFORCEMENT LEARNING BASED SCHEDULERS
TEMPORALLY EXTENDED ACTIONS FOR REINFORCEMENT LEARNING BASED SCHEDULERSTEMPORALLY EXTENDED ACTIONS FOR REINFORCEMENT LEARNING BASED SCHEDULERS
TEMPORALLY EXTENDED ACTIONS FOR REINFORCEMENT LEARNING BASED SCHEDULERS
 
TEMPORALLY EXTENDED ACTIONS FOR REINFORCEMENT LEARNING BASED SCHEDULERS
TEMPORALLY EXTENDED ACTIONS FOR REINFORCEMENT LEARNING BASED SCHEDULERSTEMPORALLY EXTENDED ACTIONS FOR REINFORCEMENT LEARNING BASED SCHEDULERS
TEMPORALLY EXTENDED ACTIONS FOR REINFORCEMENT LEARNING BASED SCHEDULERS
 
Temporally Extended Actions For Reinforcement Learning Based Schedulers
Temporally Extended Actions For Reinforcement Learning Based Schedulers Temporally Extended Actions For Reinforcement Learning Based Schedulers
Temporally Extended Actions For Reinforcement Learning Based Schedulers
 
LEARNING SCHEDULER PARAMETERS FOR ADAPTIVE PREEMPTION
LEARNING SCHEDULER PARAMETERS FOR ADAPTIVE PREEMPTIONLEARNING SCHEDULER PARAMETERS FOR ADAPTIVE PREEMPTION
LEARNING SCHEDULER PARAMETERS FOR ADAPTIVE PREEMPTION
 
Learning scheduler parameters for adaptive preemption
Learning scheduler parameters for adaptive preemptionLearning scheduler parameters for adaptive preemption
Learning scheduler parameters for adaptive preemption
 
Task Scheduling using Tabu Search algorithm in Cloud Computing Environment us...
Task Scheduling using Tabu Search algorithm in Cloud Computing Environment us...Task Scheduling using Tabu Search algorithm in Cloud Computing Environment us...
Task Scheduling using Tabu Search algorithm in Cloud Computing Environment us...
 
Scheduling in symbian os
Scheduling in symbian os Scheduling in symbian os
Scheduling in symbian os
 
ch09s.ppt
ch09s.pptch09s.ppt
ch09s.ppt
 
Hadoop scheduler
Hadoop schedulerHadoop scheduler
Hadoop scheduler
 
Scheduling and sequencing
Scheduling and sequencingScheduling and sequencing
Scheduling and sequencing
 
High Dimensionality Structures Selection for Efficient Economic Big data usin...
High Dimensionality Structures Selection for Efficient Economic Big data usin...High Dimensionality Structures Selection for Efficient Economic Big data usin...
High Dimensionality Structures Selection for Efficient Economic Big data usin...
 
Data warehouse 26 exploiting parallel technologies
Data warehouse  26 exploiting parallel technologiesData warehouse  26 exploiting parallel technologies
Data warehouse 26 exploiting parallel technologies
 
SC17 Panel: Energy Efficiency Gains From HPC Software
SC17 Panel: Energy Efficiency Gains From HPC SoftwareSC17 Panel: Energy Efficiency Gains From HPC Software
SC17 Panel: Energy Efficiency Gains From HPC Software
 
Comparison between Cloud Mirror, Mesos Cluster, and Google Omega
Comparison between Cloud Mirror, Mesos Cluster, and Google OmegaComparison between Cloud Mirror, Mesos Cluster, and Google Omega
Comparison between Cloud Mirror, Mesos Cluster, and Google Omega
 

More from Databricks

Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Recently uploaded

➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
JoseMangaJr1
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
only4webmaster01
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
amitlee9823
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
amitlee9823
 

Recently uploaded (20)

➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 

Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads

  • 1. Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads Deepak Narayanan†, Keshav Santhanam†, Fiodar Kazhamiaka†, Amar Phanishayee*, Matei Zaharia† †Stanford University, *Microsoft Research
  • 2. Hardware for ML training is becoming highly specialized and heterogeneous! Nvidia GPUs: K80, P100, V100, A100 Google TPU FPGAs in Azure …and others
  • 3. How should we allocate heterogeneous resources? Scheduler V100 GPU P100 GPU Training jobs written in existing frameworks … … … Objective (e.g., fairness) Heterogeneous cluster How should one allocate heterogeneous resources to DL training jobs from multiple users while optimizing different objectives?
  • 4. Challenge 1: Heterogeneous performance • Models and operators (e.g., convolutional, attention) perform differently across hardware architectures • Disregarding heterogeneity can lead to unfair allocations Magnitude of speedup across GPU generations varies significantly
  • 5. Challenge 2: Diverse scheduling objectives • Single-job objectives: “maximize throughput” or “minimize cost” • Minimizing cost subject to SLOs involves moving between fast but expensive, and slow but cheap instances • Multi-job objectives: fairness or more complicated hierarchical policies Hierarchical policy: Weighted fairness across sub-organizations, FIFO and fairness within
  • 6. How should we allocate heterogeneous resources? Accelerator devices Deep learning training workloads Image classification Object detection Translation Dog Perro Objectives Maximize throughput, Minimize cost Single-job Multi-jobMax-min fairness, Makespan, Hierarchical
  • 7. Gavel: A new heterogeneity-aware cluster scheduler • Generalizes a wide range of existing scheduling policies by expressing policies as optimization problems over the allocation • Provides abstraction to incorporate performance heterogeneity • Round-based scheduling mechanism ensures jobs receive optimal allocation • Improves objectives such as average job completion time by 3.5× Throughput Estimator Policy Scheduling MechanismThroughput tensor Allocation Per-round placement Throughput measurements from runs fed back into throughput estimator V100 P100 Training jobs written in existing frameworks … … … If measurements provided by user User objective
  • 8. Outline • Background and Motivation • Challenges with allocating resources over heterogeneous resources • Heterogeneity-aware Policies • Round-based Scheduling Mechanism • Throughput Estimation • Evaluation
  • 9. 𝑋 specifies the fraction of time a job spends on each accelerator between allocation recomputations Allocations (𝑿) as time fractions Allocations recomputed either at periodic intervals of time, or on a reset event (new job arrival, or old job completes)
  • 10. Scheduling policies to be made heterogeneity-aware • LAS [1]: Max-min fairness by total compute time • LAS w/ weights: Max-min fairness by total compute time with weights • Minimize Makespan: Minimize time taken by batch of jobs • Finish Time Fairness [2]: Maximize minimum job speedup • FIFO: First in, first out • Shortest Job First: Minimize time taken by shortest job • Minimize cost (w/ SLOs): Minimize total cost in public cloud (subject to SLOs) • Hierarchical: Multi-level policy with fairness as top-level policy, and FIFO or fairness as lower-level policies. Per-job weights can be specified [1] Tiresias: A GPU Cluster Manager for Distributed Deep Learning, NSDI 2019, Gu et al. [2] Themis: Fair and Efficient GPU Cluster Scheduling, NSDI 2020, Mahajan et al.
  • 11. • In a homogeneous cluster, policy objectives are functions of throughput (e.g., duration = training steps / throughput) • To make policies heterogeneity-aware, policy objectives can be expressed in terms of effective throughput (given allocation 𝑋): throughput 𝑚, 𝑋 = , ! 𝑇"! ⋅ 𝑋"! Policies as optimization problems 𝑇 is matrix of raw throughputs of each job on each accelerator type
  • 12. Heterogeneity-aware policies: Max-Min fairness • On a homogeneous cluster, Least Attained Service policy is a max-min fairness policy that equalizes the total compute time each job receives Maximize# min " 𝑋" • Jobs can see unequal throughput reductions on heterogeneous clusters • Instead, compute max-min fairness over normalized throughputs: Maximize# min " throughput(𝑚, 𝑋) normalizing_factor"
  • 13. Scheduling policies to be made heterogeneity-aware • LAS: Max-min fairness by total compute time • LAS w/ weights: Max-min fairness by total compute time with weights • Minimize Makespan: Minimize time taken by batch of jobs • Finish Time Fairness: Maximize minimum job speedup • FIFO: First in, first out • Shortest Job First: Minimize time taken by shortest job • Minimize cost (w/ SLOs): Minimize total cost in public cloud (subject to SLOs) • Hierarchical: Multi-level policy with fairness as top-level policy, and FIFO or fairness as lower-level policies. Per-job weights can be specified
  • 14. Performance optimizations: space sharing and placement awareness • Gavel can also deploy existing performance optimizations like space- sharing and placement awareness [1, 2] in a heterogeneity-aware way • Objectives in terms of throughput(𝑚, 𝑋) unchanged • 𝑋 needs to be modified to account for performance optimization (e.g., allocation for each job combination) • Raw throughputs (𝑇) for concurrently running applications might need to be measured / estimated on the fly (see paper for details) [1] Gandiva: Introspective Cluster Scheduling for Deep Learning, OSDI 2018, Xiao et al. [2] Themis: Fair and Efficient GPU Cluster Scheduling, NSDI 2020, Mahajan et al.
  • 15. Outline • Background and Motivation • Challenges with allocating resources over heterogeneous resources • Heterogeneity-aware Policies • Round-based Scheduling Mechanism • Throughput Estimation • Evaluation
  • 16. How do we realize an optimal allocation? Given an optimal heterogeneity-aware allocation by a policy, how do we assign resources to jobs? Assignments of jobs to heterogeneous cluster resources
  • 17. Gavel’s round-based scheduling • Round-based scheduler ensures jobs receive time on accelerator types according to the computed optimal allocation 𝑋
  • 18. Gavel’s round-based scheduling • Round-based scheduler ensures jobs receive time on accelerator types according to the computed optimal allocation 𝑋 • Priority score for every (job, accelerator) combination • High when a job has received a smaller time fraction than 𝑋 18
  • 19. Gavel’s round-based scheduling • Round-based scheduler ensures jobs receive time on accelerator types according to the computed optimal allocation 𝑋 • Priority score for every (job, accelerator) combination • High when a job has received a smaller time fraction than 𝑋 • Round durations are 6 minutes in our experiments • Jobs are issued lease extensions if they do not change workers between rounds
  • 20. Outline • Background and Motivation • Challenges with allocating resources over heterogeneous resources • Heterogeneity-aware Policies • Round-based Scheduling Mechanism • Throughput Estimation • Evaluation
  • 21. Throughput estimation in Gavel • How well does a newly arrived job perform in isolation on each worker type? • How well does a newly arrived job co-locate with other jobs on each worker type?
  • 22. Throughput estimation in Gavel • Naively measuring all combinations will not scale – O(kn) combinations to measure for each newly arrived job (k resource types, n active jobs) • Solution: Use matrix completion approach proposed by Paragon / Quasar [2, 3] [2] "QoS-Aware Scheduling in Heterogeneous Datacenters with Paragon". Christina Delimitrou and Christos Kozyrakis. In the ACM Transactions on Computer Systems (TOCS), Vol. 31 Issue 4, December 2013. [3] "Quasar: Resource-Efficient and QoS-Aware Cluster Management". Christina Delimitrou and Christos Kozyrakis. In Proc. of the Nineteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Salt Lake City, March 2014.
  • 23. Throughput estimation in Gavel Matrix completion Green entries measured Black entries not measured Crossed entries estimates of missing black entries ! !"#$ Fingerprint of job i Find closest reference job (computed offline) Ref. job 1 Ref. job 2 Ref. job r New job i ...
  • 24. Outline • Background and Motivation • Challenges with allocating resources over heterogeneous resources • Heterogeneity-aware Policies • Round-based Scheduling Mechanism • Throughput Estimation • Evaluation
  • 25. Main questions • Do Gavel’s policies improve objective metrics in a physical cluster? • What is the impact of input load on objectives using Gavel’s policies? • Can Gavel’s policy framework support hierarchical policies? • How do Gavel’s policies scale with the number of active jobs? • How well does Gavel’s scheduling mechanism realize optimal allocations? • What is the overhead of preemption in Gavel?
  • 26. Gavel improves end objectives on a physical cluster • Gavel reduces average JCT by 1.5x • Gavel without space sharing reduces makespan by 1.2x compared to a baseline that uses ad-hoc space sharing • Results in simulation reflect reality (< 8% difference) System Policy Physical Simulated Heterogeneity-agnostic Least Attained Service (average JCT) 5.1 hrs 5.4 hrs Heterogeneity-aware 3.4 hrs 3.7 hrs Heterogeneity-agnostic (w/ ad hoc space sharing) Makespan 21.3 hrs 22.1 hrs Heterogeneity-aware 17.7 hrs 17.6 hrs Physical cluster with 8 V100 GPUs, 16 P100 GPUs, 24 K80 GPUs
  • 27. Gavel can enable the same heterogeneous cluster to support higher input load Higher input job rate 3.5× better average JCT • Simulated cluster with 36 V100 GPUs, 36 P100 GPUs, 36 K80 GPUs • Each policy evaluated on multiple traces (different Poisson arrival rates) Shorter CDF tail JCT CDF (input job rate = 5.6 jobs/hr)
  • 28. Gavel can support hierarchical policies Weighted fairness at both levels Entity 0 Job 0 Job 1 Job 2 𝑤!"#$#% & = 1 Entity 1 Entity 2 … … • Six jobs per entity • 𝑤!"#$#% & < 𝑤!"#$#% ' < 𝑤!"#$#% ( • 𝑤!"#$#% ' = 2 implies that entity 1 should get 2× resources as entity 0 Organization 𝑤!"#$#% ' = 2 𝑤!"#$#% ( = 3
  • 29. Gavel can support hierarchical policies Widths of bars indicate that inter- and intra-entity weights are respected Allocation in ratio of 3:2:1
  • 30. Gavel scales to clusters with hundreds of active jobs Gavel can compute heterogeneity-aware allocations over 2048 jobs in a minute 64 seconds for 2k jobs< 0.13 seconds for 2k jobs
  • 31. Gavel’s scheduling mechanism nearly matches ideal execution Gavel’s scheduling mechanism with 6 minute rounds matches ideal baseline (jobs get exact allocation)
  • 32. Gavel’s scheduling mechanism introduces minimal overhead Jobs experience overhead of < 3%, even without lease renewals
  • 33. Conclusion https://cs.stanford.edu/~keshav2/ • Gavel is a heterogeneity-aware cluster scheduler able to optimize for many high-level objectives such as fairness, makespan, and cost • Gavel formulates existing policies as optimization problems, and extends these optimization problems to be heterogeneity-aware • Gavel can reduce average job completion time by 3.5× Code open sourced at https://github.com/stanford-futuredata/gavel keshav2@stanford.edu