Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Enabling diverse workload scheduling in YARN
June, 2015
Wangda Tan, Hortonworks, (wangda@apache.com)
Craig Welch, Hortonworks, (cwelch@hortonworks.com)
About us
Wangda Tan
• Last 5+ years in big data field,
Hadoop, Open-MPI, etc.
• Past
– Pivotal (PHD team, brings
OpenMPI/GraphLab to YARN)
– Alibaba (ODPS team, platform for
distributed data-mining)
• Now
– Apache Hadoop Committer
@Hortonworks, all in YARN.
– Now spending most of my time on
resource scheduling enhancements.
Craig Welch
• YARN Contributor
Hadoop+YARN is the home of
big data processing.
Our workloads vary,
Service | Batch | Interactive / Real-time
They have different CRAZY requirements
I wanna be fast!
When cluster is busy
Don’t take away
MY RESOURCES
A huge job
needs to be scheduled
at a special time
We want to make them
AS HAPPY AS POSSIBLE
to run together in YARN.
Let’s start…
Agenda today
• Overview
• Node Label
• Resource Preemption
• Reservation system
• Pluggable behavior for Scheduler
• Docker support
• Resource scheduling beyond memory
Overview
Background
• Resources are
managed by a
hierarchy of queues.
• One queue can have
multiple applications
• A container is the result of
resource scheduling:
a bundle of
resources that can run
one or more processes
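To make the queue hierarchy concrete, here is a minimal Python sketch. It assumes Capacity Scheduler style configuration, where each queue is given a percentage of its parent's resources. The queue names and percentages are hypothetical and are not taken from the talk.

```python
# Hypothetical queue tree: each value is the queue's share (in %) of its parent.
QUEUES = {
    "root": 100.0,
    "root.marketing": 40.0,
    "root.finance": 60.0,
    "root.finance.batch": 75.0,
    "root.finance.interactive": 25.0,
}


def absolute_capacity(queue_path: str) -> float:
    """Return the queue's share of the whole cluster as a fraction in [0, 1]."""
    parts = queue_path.split(".")
    share = 1.0
    # Multiply the percentages along the path from the root down to the queue.
    for depth in range(1, len(parts) + 1):
        share *= QUEUES[".".join(parts[:depth])] / 100.0
    return share


# root.finance gets 60% of the cluster, and batch gets 75% of that (about 45%).
print(absolute_capacity("root.finance.batch"))
```

Applications are submitted to leaf queues, so the leaf's absolute share is what bounds the containers those applications can be guaranteed.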
How to manage your workload by queues
• By organization:
–Marketing/Finance
queue
• By workload
–Interactive/Batch queue
• Hybrid
–Finance-
batch/Marketing-
realtime queue
Node Label
Node Label – Overview
• Types of node labels
– Node partition (Since 2.6)
– Node constraints (WIP)
• Node partition (Today’s focus)
– One node belongs to only one
partition
– Related to resource planning
• Node constraints
– One node can be assigned multiple
constraints
– Not related to resource planning
Node partition – Resource planning
• Nodes belong to the “default partition” if no partition is specified
• Queues can be given different capacities on different partitions
–For example, the sales queue can have different capacities on the GPU partition and the default partition.
• A partition can be restricted so that only certain queues may use it
(ACL for partition)
–For example, only the sales queue can access the “Large memory partition”
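The per-partition planning described above can be sketched as a small lookup. This is an illustrative model only: the partition sizes, queue names, and percentages are hypothetical. In the Capacity Scheduler the equivalent settings live in the `yarn.scheduler.capacity.<queue-path>.accessible-node-labels` family of properties.

```python
# Hypothetical partition sizes (GB of memory).
PARTITION_MEMORY_GB = {"default": 1000, "gpu": 200}

# Hypothetical per-partition capacities (%) for each queue.
# A queue with no entry for a partition has no access to it (the "ACL").
QUEUE_CAPACITY = {
    "sales": {"default": 30, "gpu": 100},
    "engineering": {"default": 70},
}


def guaranteed_gb(queue: str, partition: str) -> float:
    """Guaranteed memory of a queue on one partition; 0 if it has no access."""
    percent = QUEUE_CAPACITY[queue].get(partition, 0)
    return PARTITION_MEMORY_GB[partition] * percent / 100


print(guaranteed_gb("sales", "default"))      # 300.0
print(guaranteed_gb("sales", "gpu"))          # 200.0
print(guaranteed_gb("engineering", "gpu"))    # 0.0
```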
Node partition – Exclusive vs. Non-exclusive
[Figure: a resource request against three partitions (Snake partition, Bear partition, Default partition), contrasting an exclusive partition with a non-exclusive partition. Caption: “Use it when they're not at home”.]
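The difference between the two partition types can be sketched as a placement check. This is a simplified model, not YARN code. Which of the two named partitions is exclusive is an assumption made for illustration, and `None` stands for the default partition.

```python
from typing import Optional

# Assumption for illustration: "snake" is exclusive, "bear" is non-exclusive.
EXCLUSIVE = {"snake": True, "bear": False}


def can_allocate(requested_label: Optional[str],
                 node_partition: Optional[str],
                 partition_is_idle: bool) -> bool:
    """Decide whether a request may be placed on a node in the given partition.

    None stands for the default partition.
    """
    if requested_label == node_partition:
        return True  # the request asked for exactly this partition
    if requested_label is None and node_partition is not None:
        # A default-partition request may borrow idle resources,
        # but only from a non-exclusive partition.
        return not EXCLUSIVE[node_partition] and partition_is_idle
    return False


print(can_allocate(None, "snake", True))  # False: exclusive, never shared
print(can_allocate(None, "bear", True))   # True: non-exclusive and idle
print(can_allocate(None, "bear", False))  # False: the owners are "at home"
```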
Node Partition – Use cases & best practice
• Dedicate nodes to run important services:
–E.g. Running HBase region server using Apache Slider
• Reserve nodes with special hardware for specific organizations:
–E.g. You may want a queue dedicated to the marketing department to use 80% of
these memory-heavy nodes.
• Use non-exclusive node partitions to improve resource utilization.
• Be careful about user limits, capacity, etc. to make sure jobs can be
launched
I will cover more details about implementation and usage in Thursday morning’s
session “YARN Node Labels” with Mayank Bansal from eBay.
Resource Preemption
Resource Preemption – Overview
• Each queue has a configured minimum resource.
• The preemption policy (which performs the actual preemption of
resources) enforces that minimum:
–When a queue is below its “minimum resource” and the cluster has no
available resources, the preemption policy can reclaim resources from other queues that are using
more than their minimum resource.
[Figure: queues A, B, and C, labeled 20%, 30%, and 50%]
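The rule above can be sketched as a small calculation. This is a simplified model, not the actual preemption policy, which is more elaborate. All amounts are percentages of a cluster of size 100, and the pairing of queues A, B, C with 20%, 30%, 50% is an assumption read off the figure labels.

```python
def preemption_plan(guaranteed, used, pending):
    """Return how much to reclaim from each over-capacity queue.

    All arguments map queue name -> amount, in % of the cluster.
    """
    # Demand of starved queues: what they ask for, capped at their guarantee.
    needed = sum(
        min(guaranteed[q] - used[q], pending.get(q, 0))
        for q in guaranteed
        if used[q] < guaranteed[q]
    )
    free = 100 - sum(used.values())
    to_reclaim = max(0, needed - free)  # free resources are used first

    plan = {}
    # Take from the queues that are furthest above their guarantee first.
    for q in sorted(guaranteed, key=lambda q: used[q] - guaranteed[q], reverse=True):
        surplus = used[q] - guaranteed[q]
        if surplus <= 0 or to_reclaim <= 0:
            break
        take = min(surplus, to_reclaim)
        plan[q] = take
        to_reclaim -= take
    return plan


guaranteed = {"A": 20, "B": 30, "C": 50}   # assumed mapping, see lead-in
used = {"A": 0, "B": 50, "C": 50}          # B borrowed A's idle share
print(preemption_plan(guaranteed, used, pending={"A": 20}))  # {'B': 20}
```

Without preemption, queue A in this example would have to wait until containers in B finish on their own.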
Resource Preemption – Example
• When preemption is not enabled
• When preemption is enabled
Resource Preemption – best practice
•Configurations to control the pace of preemption:
–yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill
–yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round
–yarn.resourcemanager.monitor.capacity.preemption.natural_termination_factor
•Configurations to control when or if preemption happens
–yarn.resourcemanager.monitor.capacity.preemption.max_ignored_over_capacity
(deadzone)
–yarn.scheduler.capacity.<queue-path>.disable_preemption
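As a rough illustration of how the pacing settings interact, the sketch below models only `natural_termination_factor`: each round, the policy preempts just that fraction of the remaining gap, so the gap shrinks geometrically. It assumes no container finishes on its own and ignores the per-round cap (`total_preemption_per_round`) and the deadzone. The function name is hypothetical.

```python
def reclaimed_after(rounds: int, natural_termination_factor: float) -> float:
    """Fraction of the original preemption target reclaimed after N rounds.

    Assumes no container finishes by itself and no per-round cap applies.
    """
    remaining = 1.0
    for _ in range(rounds):
        # Each round only a fraction of the remaining delta is preempted.
        remaining -= remaining * natural_termination_factor
    return 1.0 - remaining


# With a factor of 0.5, five rounds reclaim 1 - 0.5**5 of the target.
print(reclaimed_after(5, 0.5))  # 0.96875
```

A smaller factor gives running containers more time to finish naturally, at the cost of a slower return to the configured minimums.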
Reservation System
Reservation System – Overview
• Reserving resources ahead of time
– Just like booking a table at a restaurant
– “I need a table for X people at Y time”
– “Wait a moment … Reservation
confirmed, sir”
– (After some time) “Your table is ready”
• What the Reservation System does:
– Send a reservation request
– RM checks the time table
– RM sends back a reservation confirmation ID
– Notify when ready
• Enables more predictable start and
run times for time-critical / resource-
intensive applications
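The request / check / confirm cycle can be sketched with a toy time table. This is a hypothetical model for illustration, not the YARN reservation API: the class name, the slot granularity, and the ID format are all assumptions.

```python
import itertools


class ReservationPlan:
    """Toy time table: capacity per time slot, counted in containers."""

    def __init__(self, capacity: int, slots: int):
        self.capacity = capacity
        self.booked = [0] * slots        # containers already promised per slot
        self._ids = itertools.count(1)

    def reserve(self, start: int, duration: int, containers: int):
        """Return a reservation ID if the request fits, else None."""
        if start < 0 or start + duration > len(self.booked):
            return None                  # outside the planning horizon
        window = range(start, start + duration)
        if any(self.booked[t] + containers > self.capacity for t in window):
            return None                  # the time table is full somewhere
        for t in window:
            self.booked[t] += containers
        return f"reservation_{next(self._ids)}"


plan = ReservationPlan(capacity=10_000, slots=24)
print(plan.reserve(start=1, duration=2, containers=10_000))  # reservation_1
print(plan.reserve(start=2, duration=1, containers=5_000))   # None (slot 2 is full)
print(plan.reserve(start=3, duration=1, containers=5_000))   # reservation_2
```

Because admission is decided against the whole time window up front, an accepted request knows in advance when its resources will be available.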
Reservation System – Use cases
•Gang scheduling
– Currently, YARN can do gang
scheduling from the application side (holding
resources until it meets requirements)
– Resources could be wasted and there’s
a risk of deadlocks.
–RS lays the foundation for gang scheduling
•Workflow support
– I want to run jobs in stages
– Stage-1 at 1 AM tomorrow, needs 10k
containers
– Stage-2 after stage-1, needs 5k
containers
– Stage-3 after stage-2, needs 2k
containers
– You can submit such requests to RS!
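The staged workflow above can be turned into a chain of back-to-back reservation requests. In this sketch the container counts come from the example, while the stage durations and the request layout are hypothetical, chosen only to show the chaining.

```python
# (name, containers, hours): container counts are from the example above,
# the durations in hours are hypothetical.
STAGES = [("stage-1", 10_000, 2), ("stage-2", 5_000, 3), ("stage-3", 2_000, 1)]


def workflow_requests(first_start_hour: int):
    """Turn the stages into back-to-back reservation requests."""
    requests, start = [], first_start_hour
    for name, containers, hours in STAGES:
        requests.append({"stage": name, "start": start,
                         "end": start + hours, "containers": containers})
        start += hours  # the next stage begins when this one ends
    return requests


# Stage-1 starts at 1 AM; each later stage starts when the previous one ends.
for request in workflow_requests(first_start_hour=1):
    print(request)
```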
Reservation System – Result & References
•Before & After Reservation System
(reports from MSR)
– It increased cluster utilization a lot!
•References
– Design / Discussion / Report : YARN-1051
– More detail about example : YARN-2609
Pluggable scheduler behavior
Why
• Problem
• It’s difficult to share functionality
between schedulers
• Users cannot achieve the same
behavior with all schedulers
• Fixes and enhancements tend to end up
in one scheduler, not all, leading to
fragmentation
• No simple mechanism exists to mix
behaviors for a given feature in a single
cluster
• Solution
• Move to sharable, pluggable scheduler
behavior
How
• The Goal
–Recast scheduler behavior as
policies – candidates include
–Resource limits for apps, users...
–Ordering for allocation and
preemption
• With this, we can:
–Maximize feature availability and
reduce fragmentation
–Configure different queues for
different workloads in a single
cluster
Flexible Scheduler configuration,
as simple
as building with Legos!
Ordering Policy of Capacity Scheduler
• Pluggable ordering policies for
LeafQueues in Capacity Scheduler
–Enables the implementation of different
policies for ordering assignment and
preemption of containers for applications
–Initial implementations include FIFO
(Capacity Scheduler original behavior)
and Fair
–User Limits and Queue Capacity limits
are still respected
• Fair scheduling inside Capacity
Scheduler
–Based on the Fair Sharing logic in
FairScheduler
–Assigns containers to applications in
order of least to greatest resource usage
–Allows many applications to make
progress concurrently
–Lets short jobs finish in reasonable time
while not starving long running jobs
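A minimal sketch of the two orderings, assuming usage is measured by memory alone. The `App` type and its fields are hypothetical, and the real policies consider more than this.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    submit_order: int    # smaller means submitted earlier
    used_memory_mb: int  # resources the app currently holds


def fifo_order(apps):
    """Original Capacity Scheduler behavior: earliest submitted app first."""
    return sorted(apps, key=lambda a: a.submit_order)


def fair_order(apps):
    """Fair ordering: the app with the least usage is served first
    (ties broken by submission order)."""
    return sorted(apps, key=lambda a: (a.used_memory_mb, a.submit_order))


apps = [App("long-etl", 1, 8192), App("short-query", 2, 1024)]
print([a.name for a in fifo_order(apps)])  # ['long-etl', 'short-query']
print([a.name for a in fair_order(apps)])  # ['short-query', 'long-etl']
```

Under fair ordering the short query gets the next container even though it arrived later, which is how short jobs finish in reasonable time.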
Configuration and tuning
• Rough guidelines for when to use Fair
and FIFO ordering policies
• Configuration
–yarn.scheduler.capacity.<queue>.ordering-policy (“fifo” or “fair”, default “fifo”)
–yarn.scheduler.capacity.<queue>.ordering-policy.fair.enable-size-based-weight (true or false)
• Tuning
–Use max-am-resource-percent to
avoid “peanut buttering” from having
too many apps running at once
–Sometimes it’s necessary to separate
large and small apps in different
queues, or use size-based-weight, to
avoid large app starvation
Workloads | Policy
On-demand/interactive/exploratory | Fair
Predictable/Recurring batch | FIFO
Mix of above two | Fair
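The guideline table and the configuration key can be combined into a small helper. This is only a sketch: the workload category names and the queue path are hypothetical, while the property name follows the pattern given in the Configuration bullet.

```python
# Guideline table from the slide, as a lookup (category names are hypothetical).
POLICY_FOR_WORKLOAD = {
    "interactive": "fair",  # on-demand / interactive / exploratory
    "batch": "fifo",        # predictable / recurring batch
    "mixed": "fair",        # mix of the two
}


def ordering_policy_property(queue_path: str, workload: str) -> tuple:
    """Return the (property name, value) pair for a leaf queue."""
    key = f"yarn.scheduler.capacity.{queue_path}.ordering-policy"
    return key, POLICY_FOR_WORKLOAD[workload]


print(ordering_policy_property("root.adhoc", "interactive"))
# ('yarn.scheduler.capacity.root.adhoc.ordering-policy', 'fair')
```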
Docker container support
Docker container support – Overview
• Containers for the Cluster
–Brings the sandboxing and
dependency isolation of container
technology to Hadoop
–Containers make it simple to use
Hadoop resources for a wider range of
applications
Docker container support – Status
• Done
–(V1) An initial implementation,
translating Kubernetes to an
Application Master that launches
Docker containers from the cluster,
met with success.
–(V2) A custom container launcher
for Docker containers. This brought
the capability more fully under the
management of YARN,
but a single cluster could not
support both traditional YARN
applications (MapReduce, etc.)
and Docker concurrently
• Next phase
–(V3) WIP: adding support for
running Docker and traditional
YARN applications side by side in
a single cluster
It’s not all about memory
It’s not all about Memory - CPU
• What’s in a CPU
–Some workloads are CPU
intensive. Without accounting for
this, nodes may end up CPU bound,
or CPU may be underutilized
cluster-wide
–CPU awareness at the scheduler
level is enabled by selecting the
DominantResourceCalculator.
–Dominant? “Dominant” stands for
the “dominant factor”, or the
“bottleneck”. In simplified terms,
the resource type that is the
most constrained becomes the
dominant factor for any given
comparison or calculation
–For example, if there is enough
memory but not enough CPU for a
resource request, the CPU
component is dominant (and the
answer is “No”)
–See
https://www.cs.berkeley.edu/~alig/papers/drf.pdf
for more detail
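The “dominant factor” idea can be sketched in a few lines. This is an illustration of the concept from the DRF paper, not the DominantResourceCalculator source, and the resource names and numbers are hypothetical.

```python
def dominant_share(request, cluster):
    """Largest fraction of any cluster resource that the request consumes."""
    return max(request[r] / cluster[r] for r in cluster)


def dominant_resource(request, cluster):
    """Name of the resource type that yields the dominant share."""
    return max(cluster, key=lambda r: request[r] / cluster[r])


def fits(request, available):
    """A request fits only if every resource type is available."""
    return all(request[r] <= available[r] for r in request)


cluster = {"memory_mb": 100_000, "vcores": 100}
request = {"memory_mb": 4_000, "vcores": 8}

print(dominant_resource(request, cluster))  # vcores (8% of CPU vs 4% of memory)
print(dominant_share(request, cluster))     # 0.08

# Enough memory but not enough CPU: the answer is "No".
print(fits(request, {"memory_mb": 50_000, "vcores": 4}))  # False
```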
It’s not all about Memory – CPU - Vcores
• What’s in a CPU
–The unit used to abstract CPU
capability in YARN is the vcore
–Vcore counts are configured per
node in yarn-site.xml, typically
one vcore per physical CPU
–If some nodes’ CPUs outclass
other nodes’, the number of vcores
per physical CPU can be adjusted
upward to compensate
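The adjustment described above is simple arithmetic, sketched here. The `speed_ratio` parameter is a hypothetical way to express how much faster a node's CPUs are than the baseline node type. The resulting number is what would be set per node through `yarn.nodemanager.resource.cpu-vcores` in yarn-site.xml.

```python
def vcores_for_node(physical_cpus: int, speed_ratio: float = 1.0) -> int:
    """Number of vcores to advertise for a node.

    speed_ratio is how fast this node's CPUs are relative to the baseline
    node type (1.0 means one vcore per physical CPU).
    """
    return max(1, int(physical_cpus * speed_ratio))


print(vcores_for_node(16))        # 16: baseline node, 1-1 mapping
print(vcores_for_node(16, 1.5))   # 24: CPUs that are 1.5x faster
```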
Q & A
?

HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 

Recently uploaded (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Enabling Diverse Workload Scheduling in YARN

  • 1. Enabling diverse workload scheduling in YARN
      June 2015
      Wangda Tan, Hortonworks (wangda@apache.com)
      Craig Welch, Hortonworks (cwelch@hortonworks.com)
      © Hortonworks Inc. 2011 – 2014. All Rights Reserved
  • 2. About us
      Wangda Tan
      • Last 5+ years in the big data field: Hadoop, Open-MPI, etc.
      • Past
        – Pivotal (PHD team, bringing OpenMPI/GraphLab to YARN)
        – Alibaba (ODPS team, platform for distributed data mining)
      • Now
        – Apache Hadoop Committer at Hortonworks, all in on YARN
        – Spending most of my time on resource scheduling enhancements
      Craig Welch
      • YARN contributor
  • 3. Hadoop+YARN is the home of big data processing.
  • 4. Our workloads vary: service | batch | interactive/real-time
  • 5. They have different CRAZY requirements
      – “I wanna be fast!”
      – “When the cluster is busy, don’t take away MY RESOURCES”
      – “A huge job needs to be scheduled at a special time”
  • 6. We want to make them AS HAPPY AS POSSIBLE to run together in YARN.
  • 7. Let’s start…
  • 8. Agenda today
      • Overview
      • Node Label
      • Resource Preemption
      • Reservation system
      • Pluggable behavior for Scheduler
      • Docker support
      • Resource scheduling beyond memory
  • 9. Overview
  • 10. Background
      • Resources are managed by a hierarchy of queues.
      • One queue can have multiple applications.
      • A container is the result of resource scheduling: a bundle of resources that can run one or more processes.
  • 11. How to manage your workload by queues
      • By organization:
        – Marketing/Finance queue
      • By workload:
        – Interactive/Batch queue
      • Hybrid:
        – Finance-batch/Marketing-realtime queue
  • 12. Node Label
  • 13. Node Label – Overview
      • Types of node labels
        – Node partition (since 2.6)
        – Node constraints (WIP)
      • Node partition (today’s focus)
        – One node belongs to only one partition
        – Related to resource planning
      • Node constraints
        – One node can be assigned multiple constraints
        – Not related to resource planning
  • 14. Node partition – Resource planning
      • Nodes belong to the “default partition” if none is specified
      • Queues can be given different capacities on different partitions
        – For example, the sales queue can have a different share of the GPU partition than of the default partition.
      • A partition can be restricted to certain queues (ACL for partition)
        – For example, only the sales queue can access the “Large memory partition”
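The per-partition planning on slide 14 can be pictured with a small sketch. This is an illustrative model only, not YARN code: the partition sizes, queue names, and percentages below are made-up assumptions. In a real cluster the equivalent settings would be expressed through Capacity Scheduler properties such as yarn.scheduler.capacity.<queue-path>.accessible-node-labels.<label>.capacity.

```python
# Hypothetical cluster: partition name -> total memory in GB.
# The empty string stands for the default partition.
PARTITION_MEMORY_GB = {"": 1000, "gpu": 200, "large_memory": 400}

# Hypothetical queue capacities, as a percentage of each partition.
# A queue with no entry for a partition has no access to it,
# which models the "ACL for partition" idea from the slide.
QUEUE_CAPACITY_PERCENT = {
    "sales": {"": 30, "gpu": 50, "large_memory": 100},
    "marketing": {"": 70, "gpu": 50},
}


def can_access(queue, partition):
    """Return True if the queue is allowed to use the partition."""
    return partition in QUEUE_CAPACITY_PERCENT[queue]


def guaranteed_memory_gb(queue, partition):
    """Memory the queue is guaranteed inside one partition."""
    percent = QUEUE_CAPACITY_PERCENT[queue].get(partition, 0)
    return PARTITION_MEMORY_GB[partition] * percent / 100


if __name__ == "__main__":
    for queue in QUEUE_CAPACITY_PERCENT:
        for partition in PARTITION_MEMORY_GB:
            label = partition or "<default>"
            print(queue, label, guaranteed_memory_gb(queue, partition))
```

The point of the sketch is that a queue's capacity is tracked separately per partition, so the same queue can hold a small share of the default partition and a large share of a specialized one.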
  • 15. Node partition – Exclusive vs. Non-exclusive
      [Figure labels: Snake partition, Bear partition, Default partition; Exclusive partition; Non-exclusive partition (“Use it when they’re not at home”); Resource Request]
  • 16. Node Partition – Use cases & best practice
      • Dedicate nodes to running important services:
        – E.g. running HBase region servers using Apache Slider
      • Let specific organizations use nodes with special hardware:
        – E.g. you may want a queue dedicated to the marketing department to use 80% of the memory-heavy nodes.
      • Use non-exclusive node partitions to improve resource utilization.
      • Be careful with user limits, capacity, etc. to make sure jobs can be launched.
      I will cover more details about implementation & usage in Thursday morning’s session “YARN Node Labels” with Mayank Bansal from eBay.
  • 17. Resource Preemption
  • 18. Resource Preemption – Overview
      • Each queue has a configured minimum resource.
      • Given that minimum, the preemption policy (which carries out resource preemption) ensures that:
        – When a queue is under its “minimum resource” and the cluster has no available resources, the preemption policy can take resources from other queues that are using more than their minimum resource.
      [Figure: queues A, B, and C at 20%, 30%, and 50%]
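A minimal sketch of the idea on slide 18, under simplifying assumptions: resources are a single number (percent of the cluster), and the amount to reclaim is split across over-minimum queues in proportion to how far each is above its minimum. The function name and the proportional split are illustrative choices, not the actual policy in YARN, which also handles pacing, container selection, and queue hierarchies.

```python
def preemption_plan(guaranteed, used, pending):
    """Return {queue: amount to preempt} for a fully used cluster.

    guaranteed: queue -> configured minimum resource
    used:       queue -> resource currently in use
    pending:    queue -> resource requested but not yet allocated
    """
    # How much under-served queues are entitled to claim back:
    # limited both by their demand and by the gap to their minimum.
    needed = sum(
        min(pending.get(q, 0), max(guaranteed[q] - used[q], 0))
        for q in guaranteed
    )
    # How far each queue is above its minimum.
    over = {
        q: used[q] - guaranteed[q]
        for q in guaranteed
        if used[q] > guaranteed[q]
    }
    total_over = sum(over.values())
    if needed == 0 or total_over == 0:
        return {}
    to_take = min(needed, total_over)
    # Split the reclaimed amount proportionally to each queue's overage.
    return {q: to_take * amount / total_over for q, amount in over.items()}


if __name__ == "__main__":
    guaranteed = {"A": 20, "B": 30, "C": 50}
    used = {"A": 0, "B": 50, "C": 50}
    pending = {"A": 20}
    print(preemption_plan(guaranteed, used, pending))
```

With the 20% / 30% / 50% minimums from the slide, queue B running at 50% and queue A idle but now asking for 20%, the sketch reclaims 20 from B and leaves C, which is exactly at its minimum, untouched.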
  • 19. Resource Preemption – Example
      • When preemption is not enabled
      • When preemption is enabled
  • 20. Resource Preemption – best practice
      • Configurations to control the pace of preemption:
        – yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill
        – yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round
        – yarn.resourcemanager.monitor.capacity.preemption.natural_termination_factor
      • Configurations to control when or if preemption happens:
        – yarn.resourcemanager.monitor.capacity.preemption.max_ignored_over_capacity (deadzone)
        – yarn.scheduler.capacity.<queue-path>.disable_preemption
  • 21. Reservation System
  • 22. Reservation System – Overview
      • Reserving resources ahead of time
        – Just like booking a table in a restaurant
        – “I need a table for X people at Y time”
        – “Wait a moment … Reservation confirmed, sir”
        – (After some time) “Your table is ready”
      • What the Reservation System does:
        – Send a reservation request
        – RM checks the timetable
        – Send back a reservation confirmation ID
        – Notify when ready
      • Enables more predictable start and run times for time-critical / resource-intensive applications
  • 23. Reservation System – Use cases
      • Gang scheduling
        – Currently, YARN can do gang scheduling from the application side (holding resources until requirements are met)
        – Resources can be wasted and there is a risk of deadlocks.
        – RS lays the foundation for gang scheduling
      • Workflow support
        – I want to run jobs in stages
        – Stage-1 at 1 AM tomorrow, needs 10k containers
        – Stage-2 after stage-1, needs 5k containers
        – Stage-3 after stage-2, needs 2k containers
        – You can submit such requests to RS!
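The request / check timetable / confirm flow from slides 22 and 23 can be sketched as a toy timetable. Everything here is an assumption made for illustration: the class name, hourly time slots, the ID format, and the all-or-nothing admission rule. It is not the YARN Reservation System API.

```python
class ReservationPlan:
    """Toy timetable: tracks reserved containers per time slot."""

    def __init__(self, capacity, horizon):
        self.capacity = capacity          # containers available per slot
        self.committed = [0] * horizon    # containers already reserved per slot
        self.next_id = 1

    def reserve(self, start, duration, containers):
        """Return a confirmation ID, or None if the request does not fit."""
        if start < 0 or duration <= 0 or start + duration > len(self.committed):
            return None
        slots = range(start, start + duration)
        # "RM checks the timetable": every slot must have enough room.
        if any(self.committed[t] + containers > self.capacity for t in slots):
            return None
        for t in slots:
            self.committed[t] += containers
        reservation_id = f"reservation_{self.next_id}"
        self.next_id += 1
        return reservation_id


if __name__ == "__main__":
    # Hypothetical cluster with 12,000 containers and a 24-slot horizon.
    plan = ReservationPlan(capacity=12000, horizon=24)
    print(plan.reserve(start=1, duration=2, containers=10000))  # stage 1
    print(plan.reserve(start=3, duration=1, containers=5000))   # stage 2
    print(plan.reserve(start=4, duration=1, containers=2000))   # stage 3
```

Because the whole request is admitted or rejected up front, a staged workflow either gets a guaranteed slot for each stage or learns immediately that it has to pick another time, instead of holding partial resources while waiting.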
  • 24. Reservation System – Result & References
      • Before & after Reservation System (reports from MSR)
        – It increased cluster utilization a lot!
      • References
        – Design / Discussion / Report: YARN-1051
        – More detail about the example: YARN-2609
  • 25. Pluggable scheduler behavior
  • 26. Why
      • Problem
        – It’s difficult to share functionality between schedulers
        – Users cannot achieve the same behavior with all schedulers
        – Fixes and enhancements tend to end up in one scheduler, not all, leading to fragmentation
        – No simple mechanism exists to mix behaviors for a given feature in a single cluster
      • Solution
        – Move to sharable, pluggable scheduler behavior
  • 27. How
      • The goal
        – Recast scheduler behavior as policies; candidates include:
        – Resource limits for apps, users...
        – Ordering for allocation and preemption
      • With this, we can:
        – Maximize feature availability and reduce fragmentation
        – Configure different queues for different workloads in a single cluster
      Flexible scheduler configuration, as simple as building with Legos!
  • 28. Ordering Policy of Capacity Scheduler
      • Pluggable ordering policies for LeafQueues in Capacity Scheduler
        – Enables the implementation of different policies for ordering assignment and preemption of containers for applications
        – Initial implementations include FIFO (Capacity Scheduler original behavior) and Fair
        – User limits and queue capacity limits are still respected
      • Fair scheduling inside Capacity Scheduler
        – Based on the fair sharing logic in FairScheduler
        – Assigns containers to applications in order of least to greatest resource usage
        – Allows many applications to make progress concurrently
        – Lets short jobs finish in reasonable time while not starving long-running jobs
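The difference between the two ordering policies on slide 28 can be sketched as two sort orders over the applications in a leaf queue. The dictionary fields and the tie-break on submit time are assumptions for illustration; the real policies live inside the Capacity Scheduler and operate on its own application objects.

```python
def fifo_order(apps):
    """FIFO: earliest-submitted application is offered containers first."""
    return sorted(apps, key=lambda app: app["submit_time"])


def fair_order(apps):
    """Fair: application with the least resource usage goes first.

    Ties are broken by submit time (an assumption of this sketch).
    """
    return sorted(apps, key=lambda app: (app["used_memory_mb"], app["submit_time"]))


if __name__ == "__main__":
    apps = [
        {"name": "long_batch", "submit_time": 1, "used_memory_mb": 80000},
        {"name": "medium_job", "submit_time": 2, "used_memory_mb": 16000},
        {"name": "short_query", "submit_time": 3, "used_memory_mb": 0},
    ]
    print([app["name"] for app in fifo_order(apps)])
    print([app["name"] for app in fair_order(apps)])
```

Under FIFO the long batch job stays at the head of the line, while under fair ordering the newly arrived short query is served first, which is why fair ordering lets short jobs finish in reasonable time.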
  • 29. Configuration and tuning
      • Rough guidelines for when to use Fair and FIFO ordering policies

        Workloads                              Policy
        On-demand/interactive/exploratory      Fair
        Predictable/recurring batch            FIFO
        Mix of the above two                   Fair

      • Configuration
        – yarn.scheduler.capacity.<queue>.ordering-policy (“fifo” or “fair”, default “fifo”)
        – yarn.scheduler.capacity.<queue>.ordering-policy.fair.enable-size-based-weight (true or false)
      • Tuning
        – Use max-am-resource-percent to avoid “peanut buttering” from having too many apps running at once
        – Sometimes it’s necessary to separate large and small apps into different queues, or to use size-based-weight, to avoid large app starvation
  • 30. Docker container support
  • 31. Docker container support – Overview
      • Containers for the cluster
        – Brings the sandboxing and dependency isolation of container technology to Hadoop
        – Containers make it simple to use Hadoop resources for a wider range of applications
  • 32. Docker container support – Status
      • Done
        – (V1) The initial implementation, translating Kubernetes to an Application Master launching Docker containers from the cluster, met with success.
        – (V2) A custom container launcher for Docker containers. This brought the capability more fully under the management of YARN, but a single cluster could not support both traditional YARN applications (MapReduce, etc.) and Docker concurrently.
      • Next phase
        – (V3) WIP: adding support for running Docker and traditional YARN applications side by side in a single cluster
  • 33. It’s not all about memory
  • 34. It’s not all about Memory – CPU
      • What’s in a CPU
        – Some workloads are CPU intensive; without accounting for this, nodes may end up CPU bound or CPU may be underutilized cluster-wide
        – CPU awareness at the scheduler level is enabled by selecting the DominantResourceCalculator
        – Dominant? “Dominant” stands for the “dominant factor”, or the “bottleneck”. In simplified terms, the resource type that is most constrained becomes the dominant factor for any given comparison or calculation
        – For example, if there is enough memory but not enough CPU for a resource request, the CPU component is dominant (and the answer is “No”)
        – See https://www.cs.berkeley.edu/~alig/papers/drf.pdf for more detail
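The “dominant factor” idea on slide 34 can be sketched in a few lines. This is a simplified illustration with made-up resource names and numbers, not the DominantResourceCalculator implementation: the dominant resource is taken to be the one with the largest usage-to-capacity ratio, and a request fits only if every component fits.

```python
def dominant_resource(usage, cluster):
    """Name of the resource type with the highest share of the cluster."""
    return max(cluster, key=lambda resource: usage[resource] / cluster[resource])


def dominant_share(usage, cluster):
    """The highest per-resource share, i.e. the bottleneck share."""
    return max(usage[resource] / cluster[resource] for resource in cluster)


def fits(request, available):
    """A request fits only if every resource component fits."""
    return all(request[resource] <= available[resource] for resource in request)


if __name__ == "__main__":
    cluster = {"memory_mb": 100000, "vcores": 100}
    usage = {"memory_mb": 20000, "vcores": 50}
    print(dominant_resource(usage, cluster), dominant_share(usage, cluster))

    # Enough memory, not enough CPU: the answer is "No".
    request = {"memory_mb": 1024, "vcores": 4}
    available = {"memory_mb": 8192, "vcores": 2}
    print(fits(request, available))
```

In the example the application uses 20% of cluster memory but 50% of cluster vcores, so CPU is its dominant resource and 0.5 is the share a DRF-style comparison would use.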
  • 35. It’s not all about Memory – CPU – Vcores
      • What’s in a CPU
        – The unit used to abstract CPU capability in YARN is the vcore
        – Vcore counts are configured per node in yarn-site.xml, typically one vcore per physical CPU
        – If some nodes’ CPUs outclass other nodes’, the number of vcores per physical CPU can be adjusted upward to compensate
  • 36. Q & A
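The vcore adjustment on slide 35 amounts to simple arithmetic, sketched here with hypothetical node names and a hypothetical speed multiplier. The resulting number is what an administrator would set per node in yarn-site.xml (the NodeManager property for this is yarn.nodemanager.resource.cpu-vcores); the multiplier itself is a planning assumption, not a YARN setting.

```python
def vcores_for_node(physical_cores, vcores_per_core=1.0):
    """Vcores to advertise for a node; 1 vcore per physical core by default."""
    return int(physical_cores * vcores_per_core)


if __name__ == "__main__":
    # Hypothetical fleet: older nodes at 1:1, newer and faster nodes at 1.5:1.
    nodes = {
        "old-node": vcores_for_node(16),
        "new-node": vcores_for_node(16, vcores_per_core=1.5),
    }
    print(nodes)
```

Both nodes have 16 physical cores, but the faster node advertises 24 vcores, so the scheduler places proportionally more CPU work on it.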