Page 1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Multi-Tenancy & The Capacity Scheduler
Apache YARN
Joseph Niemiec
Senior Solutions Architect
jniemiec@hortonworks.com
Quick Bio
•  Hadoop User for ~4 years
•  Co-Author for Apache Hadoop YARN
•  Originally used Hadoop for location based services
•  Destination Prediction
•  Traffic Analysis
•  Effects of weather at client locations on call center call types
•  Pending Patent in Automotive/Telematics domain
•  Defensive Paper on M2M Validation
•  Started on analytics to be better at an MMORPG
•  HWX SME for YARN, Tez & MapReduce
Agenda
Multi-Tenancy, A Goal, A Definition
YARN Primer
Capacity Scheduler Basics
Workload Management
•  Queue Mapping
•  Node Labels
•  Fair Sharing
•  Preemption
•  Chargeback
Resource Control
•  Memory
•  CPU & CGroups
•  Future Resources
Quick Preemption Demo – If we have time
Multi-Tenancy
A Goal, A Definition
Business Objectives of Multi-Tenancy
•  Elimination of data silos
•  Collate and share data across LoBs
•  Lower cluster TCO through:
•  Blending workloads
•  Higher cluster utilization
•  Economies of scale
•  Enable applications to:
•  Exploit 3rd party data sources
•  Share LoB data with:
–  External customers
–  Other LoBs; and
–  Supply chain partners
Fall 2013: Largely siloed deployments with single workload clusters
Spring 2015: 65% of clusters host multiple workloads
YARN
Yet Another Resource Negotiator
Transition from Hadoop 1 to Hadoop 2
HADOOP 1.0: Single Use System (Batch Apps)
•  MapReduce (cluster resource management & data processing)
•  HDFS (redundant, reliable storage)
HADOOP 2.0: Multi Purpose Platform (Batch, Interactive, Online, Streaming)
•  MapReduce (data processing)
•  Others (data processing)
•  YARN (cluster resource management)
•  HDFS2 (redundant, reliable storage)
YARN-1
Concepts
Application
Application is a job submitted to the framework
Example – Map Reduce Job
Container
Basic unit of allocation
Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.)
container_0 = 2GB, 1CPU
container_1 = 1GB, 6 CPU
Replaces the fixed map/reduce slots
YARN: what is it good for?
Compute for Data Processing
Metal Detectors for your Hay Stacks
Compute for Embarrassingly Parallel Problems
Problems with tiny datasets and/or tasks that don’t depend on one another
e.g. Exhaustive Search, Trade Simulations, Climate Models, Genetic Algorithms
Beyond MapReduce
Enables Multi Workload Compute Applications on a Single Shared Infrastructure
Stream Processing, NoSQL, Search, InMemory, Graphs, etc
!ANYTHING YOU CAN START FROM CLI!
YARN Architecture - Walkthrough
Figure: Client2 submits to the ResourceManager (Scheduler). Twelve NodeManagers host AM 1 with Containers 1.1, 1.2 and 1.3, and AM2 with Containers 2.1, 2.2, 2.3 and 2.4.
Capacity Scheduler
The Basics
YARN Capacity Scheduler
FUNCTION: Capacity Sharing
•  Elasticity over Queues
•  Job submission Access Control Lists
FUNCTION: Capacity Enforcement
•  Max capacity per queue
•  User limits within queue
FUNCTION: Administration
•  Management Admin. Access Control Lists
•  Capacity-Scheduler.xml
Hierarchical Queues
Figure: queue tree in the ResourceManager's Scheduler, listed by sibling group (each group sums to 100%)
root
•  Adhoc 10%, DW 60%, Mrkting 30%
•  Dev 10%, Reserved 20%, Prod 70%
•  Prod 80%, Dev 20%
•  P0 70%, P1 30%
Multi-Tenancy with Capacity Scheduler
Queues
Economics as queue-capacity
§  Hierarchical Queues
SLAs
§  Preemption
Resource Isolation
§  Cgroups
Administration
§  Queue ACLs
§  Run-time re-configuration for queues
§  Charge-back
Figure: Capacity Scheduler hierarchical queues in the ResourceManager's Scheduler, listed by sibling group
root
•  Adhoc 10%, DW 70%, Mrkting 20%
•  Dev 10%, Reserved 20%, Prod 70%
•  Prod 80%, Dev 20%
•  P0 70%, P1 30%
Set Limits on Capacity
Minimum Capacity
•  Used to enforce SLAs across Business Units
•  Guaranteed minimum resources for the queue
Maximum Capacity
•  Hard limits on maximum % of cluster resources
•  Resource Elasticity when not being used by other queues
Minimum User Limits
•  Enforces sharing amongst users in a Business Unit
•  User sharing for a given queue
User Limit Factor
•  Maximum queue capacity that one user can take up
Application Limit
•  Maximum # of applications submitted to one queue
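How these limits interact can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the scheduler's actual algorithm: the class and function names are hypothetical, memory is the only resource, and the per-user share is modelled as an equal split that never drops below the minimum user limit.

```python
from dataclasses import dataclass


@dataclass
class QueueLimits:
    """Hypothetical holder for the limits named on this slide."""
    capacity_pct: float        # minimum (guaranteed) capacity, % of cluster
    max_capacity_pct: float    # maximum capacity, % of cluster
    min_user_limit_pct: float  # minimum user limit, % of the queue
    user_limit_factor: float   # multiple of the guaranteed capacity per user


def queue_ceiling_mb(limits: QueueLimits, cluster_mb: int) -> float:
    """Most memory the queue may grow into when other queues are idle."""
    return cluster_mb * limits.max_capacity_pct / 100.0


def single_user_ceiling_mb(limits: QueueLimits, cluster_mb: int) -> float:
    """Most memory one user may hold in the queue (simplified)."""
    guaranteed_mb = cluster_mb * limits.capacity_pct / 100.0
    return min(guaranteed_mb * limits.user_limit_factor,
               queue_ceiling_mb(limits, cluster_mb))


def per_user_share_pct(limits: QueueLimits, active_users: int) -> float:
    """Share of the queue per active user under contention (simplified):
    an equal split that never drops below the minimum user limit."""
    return max(100.0 / active_users, limits.min_user_limit_pct)


# Illustrative numbers only: a 100,000 MB cluster and a 30%/40% queue.
example = QueueLimits(capacity_pct=30, max_capacity_pct=40,
                      min_user_limit_pct=20, user_limit_factor=1.0)
print(queue_ceiling_mb(example, cluster_mb=100_000))
print(single_user_ceiling_mb(example, cluster_mb=100_000))
print(per_user_share_pct(example, active_users=2))
```

Raising the user limit factor above 1.0 lets a single user borrow beyond the queue's guaranteed capacity, but in this model never beyond its maximum capacity.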
CS: Example Queue Configuration
Finance: 10 users | Ad-hoc BI Query jobs only | High Priority, Stricter SLAs
Data Warehouse: 2 users | Batch ETL and Report Generation jobs | Production SLAs
Marketing: 4 users | Ad-hoc Data Science (Pig+Mahout) | Loose SLAs
yarn.scheduler.capacity.root.finance
Capacity: Min: 0.30 | Max: 0.40 | User Limit: 0.20; ACLs: ‘Finance’ group
yarn.scheduler.capacity.root.datawarehouse
Capacity: Min: 0.50 | Max: 0.60 | User Limit: 1.0; ACLs: ‘DataWarehouse’ group
yarn.scheduler.capacity.root.marketing
Capacity: Min: 0.20 | Max: 0.20 | User Limit: 1.0; ACLs: ‘Marketing’ group
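The three queues above could be turned into capacity-scheduler.xml properties roughly as follows. This is a sketch, not a tested configuration: it assumes the slide's Min and Max map to the `capacity` and `maximum-capacity` properties (expressed as percentages) and that the group ACL maps to `acl_submit_applications`. The "User Limit" column is left out because the slide does not say which property it corresponds to. Property names should be checked against the documentation for the Hadoop version in use.

```python
# Values copied from the slide: (queue name, min capacity, max capacity, group).
QUEUES = [
    ("finance", 0.30, 0.40, "Finance"),
    ("datawarehouse", 0.50, 0.60, "DataWarehouse"),
    ("marketing", 0.20, 0.20, "Marketing"),
]


def build_properties(queues):
    """Return capacity-scheduler.xml properties as a name -> value dict."""
    total = sum(round(minimum * 100) for _, minimum, _, _ in queues)
    if total != 100:
        raise ValueError(f"sibling capacities must sum to 100, got {total}")

    props = {
        "yarn.scheduler.capacity.root.queues":
            ",".join(name for name, _, _, _ in queues),
    }
    for name, minimum, maximum, group in queues:
        prefix = f"yarn.scheduler.capacity.root.{name}"
        props[f"{prefix}.capacity"] = str(round(minimum * 100))
        props[f"{prefix}.maximum-capacity"] = str(round(maximum * 100))
        # Hadoop ACL strings are "users groups"; a leading space means
        # "no individual users, only these groups".
        props[f"{prefix}.acl_submit_applications"] = f" {group}"
    return props


for key, value in build_properties(QUEUES).items():
    print(f"{key} = {value!r}")
```

The sum check mirrors the rule that the capacities of sibling queues add up to 100%, which the slide's values (0.30 + 0.50 + 0.20) satisfy.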
Workload Management
Queue Mapping, Labels, Fair Sharing, Preemption, Chargeback
Default User/Group to Queue Mapping
yarn.scheduler.capacity.root.marketing
Capacity: Min: 0.30 | Max: 0.50; ACLs: ‘Web’ group
yarn.scheduler.capacity.root.ops
Capacity: Min: 0.30 | Max: 0.50; ACLs: ‘SupplyChain’ group
yarn.scheduler.capacity.root.default
Capacity: Min: 0.40
User/Group | CS Queue
U: Joe | Marketing
G: Web | Marketing
G: SupplyChain | Ops
…
Figure: Joe submits an MR App, which is mapped to the Marketing queue
YARN-2411
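The mapping table on this slide can be sketched as a first-match lookup. The rule string follows the `u:user:queue` / `g:group:queue` form used by the `yarn.scheduler.capacity.queue-mappings` property. The parser and resolver below are hypothetical helpers written for illustration, and the fallback to a queue named `default` is an assumption of the sketch.

```python
# Rules mirroring the slide's table (lower-cased names are an assumption).
MAPPINGS = "u:joe:marketing,g:web:marketing,g:supplychain:ops"


def parse_mappings(spec):
    """Split 'u:user:queue,g:group:queue,...' into (kind, source, queue) rules."""
    rules = []
    for entry in spec.split(","):
        kind, source, queue = entry.strip().split(":")
        if kind not in ("u", "g"):
            raise ValueError(f"unknown mapping type: {kind!r}")
        rules.append((kind, source, queue))
    return rules


def resolve_queue(rules, user, groups, default="default"):
    """Return the queue of the first rule matching the user or their groups."""
    for kind, source, queue in rules:
        if kind == "u" and source == user:
            return queue
        if kind == "g" and source in groups:
            return queue
    return default


rules = parse_mappings(MAPPINGS)
print(resolve_queue(rules, "joe", groups=[]))
print(resolve_queue(rules, "amy", groups=["supplychain"]))
print(resolve_queue(rules, "bob", groups=["hr"]))
```

Because the first matching rule wins, the order of entries matters: a user rule placed before a group rule takes precedence for that user.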
Node Labels in YARN
Enable configuration of node partitions
Why
Need a mechanism to enforce node-level isolation
Account for resource contention amongst non-YARN managed resources
Account for hardware or software constraints
Two options:
Non-exclusive (Soft) Node Labels
Exclusive (Hard) Node Labels
Exclusive Node Labels enable Isolated Partitions
Figure: four nodes carry the label S and run Storm; two apps are shown, S (Storm) and B.
•  Configure Partitions
•  Exclusive Labels enforce Isolation
YARN-796
Non-Exclusive Node Labels
Figure: four nodes carry the label S and run Spark; two apps are shown, S (Spark) and B.
•  Configure non-exclusive labels
•  Schedule if free capacity
YARN-3214
Fair Sharing: Pluggable Queue Policies
Choose scheduling policy per leaf queue
FIFO
Application Container requests accommodated on first come first serve basis
Multi-fair weight
Application Container requests accommodated according to:
•  Order of least resources used – multiple applications make progress
•  (Optional) Size based weight – adjustment to boost large applications making progress
YARN-3319, YARN-3318
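The difference between the two policies can be sketched with a toy allocator. This is an illustration of the ordering idea only, not the scheduler's code: the `App` class and `assign_free_slots` function are hypothetical, every container is the same size, and the optional size-based weight is left out. The demo starts from an empty 20-slot queue, a simplification of the scenario where Job2 arrives while Job1 is already running.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    submit_order: int
    pending: int       # containers still waiting to be scheduled
    running: int = 0   # containers currently holding resources


def assign_free_slots(apps, free_slots, policy):
    """Hand out free container slots one at a time under the given policy."""
    for _ in range(free_slots):
        waiting = [app for app in apps if app.pending > 0]
        if not waiting:
            break
        if policy == "fifo":
            # First come, first served: earliest submission wins every slot.
            chosen = min(waiting, key=lambda app: app.submit_order)
        elif policy == "fair":
            # Least resources used first; submission order breaks ties.
            chosen = min(waiting, key=lambda app: (app.running, app.submit_order))
        else:
            raise ValueError(f"unknown policy: {policy!r}")
        chosen.pending -= 1
        chosen.running += 1
    return {app.name: app.running for app in apps}


def demo(policy):
    """Two jobs compete for 20 slots, as in the Fair Sharing slides."""
    apps = [App("Job1", 0, pending=100), App("Job2", 1, pending=20)]
    return assign_free_slots(apps, free_slots=20, policy=policy)


print(demo("fifo"))
print(demo("fair"))
```

Under FIFO the earlier job takes every slot until it has nothing pending; under the fair ordering both jobs make progress.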
Fair Sharing
Marketing queue: Job 1 submitted (U: etl, Q: Marketing)
•  Job1: Containers: 100, Running/Finished: 20
•  Job1 - 20 Containers Running
Marketing queue: Job 2 submitted (U: etl, Q: Marketing)
•  Job1: Containers: 100, Running/Finished: 20
•  Job2: Containers: 20, Running/Finished: 0
•  Job1 - 20 Containers Running
Fair Sharing
Marketing queue
•  Job1: Containers: 100, Running/Finished: 30
•  Job2: Containers: 20, Running/Finished: 10
•  Job1 - 10 Containers Running
•  Job2 - 10 Containers Running
Marketing queue
•  Job1: Containers: 100, Running/Finished: 40
•  Job2: Containers: 20, Running/Finished: 20
•  Job1 - 10 Containers Running
•  Job2 - 10 Containers Running
Preemption
•  Across Queues: Supported
•  Within Queue across Users: Roadmap
•  Within Queue within User (Fair Sharing only): Not Supported
•  Across Queues (Node Labels): Supported
YARN-569
Preemption: Overview
Preempt based on lowest priority
Queue that is most over subscribed
Last scheduled app in FIFO queue
Does not account for user limits within a queue
Warn and request from application
Requests application to un-reserve a container
After a grace period, forcibly kills the container
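The "most over-subscribed queue first" rule can be sketched as below. This is a greedy illustration, not the preemption policy YARN ships: `plan_preemption` is a hypothetical function, capacities are whole percentages of the cluster, and the usage numbers in the example are assumed for illustration.

```python
def plan_preemption(queues, needed_pct):
    """Decide how much capacity to reclaim from each over-capacity queue.

    queues: dict of queue name -> (guaranteed_pct, used_pct)
    needed_pct: capacity an under-served queue is still owed
    Returns a dict of queue name -> percentage points to reclaim.
    """
    # Queues above their guarantee, most over-subscribed first.
    over = sorted(
        ((used - guaranteed, name)
         for name, (guaranteed, used) in queues.items()
         if used > guaranteed),
        reverse=True,
    )
    plan = {}
    remaining = needed_pct
    for excess, name in over:
        if remaining <= 0:
            break
        take = min(excess, remaining)
        plan[name] = take
        remaining -= take
    return plan


# Assumed state: Marketing has grown to 65% using elasticity, Product sits
# at its 20% guarantee, and Finance holds 15% of its guaranteed 35%.
state = {"product": (20, 20), "marketing": (45, 65), "finance": (35, 15)}
print(plan_preemption(state, needed_pct=20))
```

Queues at or below their guarantee are never touched, which is what makes the minimum capacity a guarantee rather than a target.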
Preemption in action: step 1
•  Product queue: Min: 20%, Max: 30%
•  Marketing queue: Min: 45%, Max: 75%
•  Finance queue: Min: 35%, Max: 35%
•  Jobs shown: Job1
Preemption in action: step 2
•  Product queue: Min: 20%, Max: 30%
•  Marketing queue: Min: 45%, Max: 75%
•  Finance queue: Min: 35%, Max: 35%
•  Jobs shown: Job1, Job2
Preemption in action: step 3
•  Product queue: Min: 20%, Max: 30%
•  Marketing queue: Min: 45%, Max: 75%
•  Finance queue: Min: 35%, Max: 35%
•  Jobs shown: Job1, Job2, Job3; label: 65%
Preemption in action: step 4
•  Product queue: Min: 20%, Max: 30%
•  Marketing queue: Min: 45%, Max: 75%
•  Finance queue: Min: 35%, Max: 35%
•  Jobs shown: Job1, Job2, Job3; Job2 preempted
Preemption in action: step 5
•  Product queue: Min: 20%, Max: 30%
•  Marketing queue: Min: 45%, Max: 75%
•  Finance queue: Min: 35%, Max: 35%
•  Jobs shown: Job2, Job3; Job1 finishes
Chargeback: App-Level Aggregate
Capture resources of an app.
Resources
•  Exposes Memory and CPU Seconds
•  Based on the reserved amount, not the amount utilized
Exposed
•  REST API, CLI, WebUI
•  Ambari Chargeback View coming soon!
YARN-415
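The aggregate amounts to reserved resources multiplied by the time they were held, summed over an app's containers. A minimal sketch, assuming hypothetical container records and made-up prices (neither the record layout nor the rates come from the slides):

```python
from dataclasses import dataclass


@dataclass
class ContainerUsage:
    """Hypothetical record of one container's reservation."""
    memory_mb: int   # memory reserved for the container
    vcores: int      # vcores reserved for the container
    seconds: int     # how long the reservation was held


def aggregate(containers):
    """Return (memory MB-seconds, vcore-seconds) for one application."""
    memory_seconds = sum(c.memory_mb * c.seconds for c in containers)
    vcore_seconds = sum(c.vcores * c.seconds for c in containers)
    return memory_seconds, vcore_seconds


def charge(memory_seconds, vcore_seconds, per_gb_hour, per_vcore_hour):
    """Convert the aggregates into a cost using illustrative rates."""
    gb_hours = memory_seconds / 1024 / 3600
    vcore_hours = vcore_seconds / 3600
    return gb_hours * per_gb_hour + vcore_hours * per_vcore_hour


app_containers = [
    ContainerUsage(memory_mb=2048, vcores=1, seconds=600),
    ContainerUsage(memory_mb=1024, vcores=2, seconds=300),
]
mem_s, cpu_s = aggregate(app_containers)
print(mem_s, cpu_s)
print(round(charge(mem_s, cpu_s, per_gb_hour=0.01, per_vcore_hour=0.05), 6))
```

Because the inputs are reservations, an app that requests large containers and leaves them idle is charged the same as one that uses them fully.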
Resource Control
Memory Scheduling
What
Default method of scheduling today
Applications request containers with X MB/GB of memory
YARN Capacity Scheduler schedules containers based on node memory availability
Why
Most tasks are not CPU Bound
Typically an abundant resource on newer clusters (256GB Per Node Standard)
Had to start with something ;)
CPU Scheduling
What
Admin tells YARN how much CPU capacity is available in cluster
Applications specify CPU capacity needed for each container
YARN Capacity Scheduler schedules application taking CPU capacity availability into account
Why
Applications (for example Storm, HBase, Machine Learning) need predictable access to CPU as a resource
CPU has become the bottleneck instead of memory in certain clusters (128 GB RAM, 6 CPUs)
YARN-2
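The 128 GB / 6 CPU example shows why memory-only scheduling can oversubscribe CPU. The sketch below compares the two modes for a single node. The function is hypothetical, the 4 GB / 1 vcore container size is an assumed request, and the node's 6 CPUs are treated as 6 vcores. In the Capacity Scheduler the choice between the two behaviours is, to the best of my knowledge, made through the `yarn.scheduler.capacity.resource-calculator` setting, which should be checked against the documentation for the version in use.

```python
def containers_per_node(node_mb, node_vcores, req_mb, req_vcores, cpu_aware):
    """How many identical containers fit on one node.

    With cpu_aware=False only memory is considered (memory scheduling);
    with cpu_aware=True the scarcer of memory and CPU decides.
    """
    by_memory = node_mb // req_mb
    if not cpu_aware:
        return by_memory
    by_cpu = node_vcores // req_vcores
    return min(by_memory, by_cpu)


NODE_MB = 128 * 1024   # 128 GB RAM, as in the slide's example
NODE_VCORES = 6        # 6 CPUs, treated here as 6 vcores

print(containers_per_node(NODE_MB, NODE_VCORES, 4096, 1, cpu_aware=False))
print(containers_per_node(NODE_MB, NODE_VCORES, 4096, 1, cpu_aware=True))
```

With memory-only scheduling the node accepts far more containers than it has CPUs; taking CPU into account caps the count at what the processors can serve.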
CGroup Isolation for CPU
What
Admin enables CGroups for CPU Isolation for all YARN application workloads
Why
Applications need guaranteed access to CPU resources
To ensure SLAs, need to enforce CPU allocations given to an Application container
Effects
Limits each container's CPU use according to the vCores requested, with hard or soft enforcement
Sets the NodeManager's max allowed CPU usage for ALL containers on the node (% of total host CPU)
YARN-3
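The two effects can be put into numbers with a short sketch. The settings dictionary lists the NodeManager properties usually associated with CGroups CPU isolation; they are given from memory and should be verified against the yarn-site.xml reference for the Hadoop version in use. The `container_core_cap` function is a hypothetical approximation of the hard limit applied in strict mode, not the NodeManager's exact arithmetic.

```python
# Assumed yarn-site.xml settings for CGroups CPU isolation (verify per version).
CGROUP_SETTINGS = {
    "yarn.nodemanager.container-executor.class":
        "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor",
    "yarn.nodemanager.linux-container-executor.resources-handler.class":
        "org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler",
    # Share of the host's CPU that all YARN containers together may use.
    "yarn.nodemanager.resource.percentage-physical-cpu-limit": "80",
    # "true" = hard enforcement, "false" = soft (containers may use idle CPU).
    "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage": "true",
}


def container_core_cap(physical_cores, node_vcores, container_vcores, cpu_limit_pct):
    """Approximate hard CPU cap, in physical cores, for one container."""
    yarn_cores = physical_cores * cpu_limit_pct / 100.0
    return yarn_cores * container_vcores / node_vcores


limit = int(CGROUP_SETTINGS["yarn.nodemanager.resource.percentage-physical-cpu-limit"])
print(container_core_cap(physical_cores=8, node_vcores=8,
                         container_vcores=2, cpu_limit_pct=limit))
```

Under soft enforcement the same ratio acts as a relative weight when the CPU is contended, so a container may exceed this figure while other containers are idle.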
[Coming Soon] Disk Resources
What
Just isolation – enforce equal sharing of Disk or dedication of spindles
Disk Isolation: local disk IOPS at runtime, not HDFS reads/writes
Disk Dedication : Let applications request dedicated spindles
How
Linux only – uses CGroups
Use Cgroups resource handler:
org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler
Enable Disk resource
yarn.nodemanager.resource.disk.enabled
YARN-2619
Preemption Demo
Demo
1.  Multiple Queues without Preemption
2.  Multiple Queues with Preemption
Simulated workload
1.  Start 2 jobs and use all elasticity of the reports & ops queues
2.  Wait ~30 Seconds
3.  Start 2 more jobs in adhoc & batch queues
Thank You!
Questions?
Page 40 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

More Related Content

What's hot

Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveDataWorks Summit
 
Python Streaming Pipelines with Beam on Flink
Python Streaming Pipelines with Beam on FlinkPython Streaming Pipelines with Beam on Flink
Python Streaming Pipelines with Beam on FlinkAljoscha Krettek
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeFlink Forward
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm Chandler Huang
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Storesconfluent
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013mumrah
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkDataWorks Summit
 
NATS Streaming - an alternative to Apache Kafka?
NATS Streaming - an alternative to Apache Kafka?NATS Streaming - an alternative to Apache Kafka?
NATS Streaming - an alternative to Apache Kafka?Anton Zadorozhniy
 
Transactional operations in Apache Hive: present and future
Transactional operations in Apache Hive: present and futureTransactional operations in Apache Hive: present and future
Transactional operations in Apache Hive: present and futureDataWorks Summit
 
Apache Tez – Present and Future
Apache Tez – Present and FutureApache Tez – Present and Future
Apache Tez – Present and FutureDataWorks Summit
 
Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived Vinoth Chandar
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiManish Gupta
 
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...DataWorks Summit
 
CDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkCDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkTimo Walther
 
ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database Systemconfluent
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidHortonworks
 

What's hot (20)

Kafka 101
Kafka 101Kafka 101
Kafka 101
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Python Streaming Pipelines with Beam on Flink
Python Streaming Pipelines with Beam on FlinkPython Streaming Pipelines with Beam on Flink
Python Streaming Pipelines with Beam on Flink
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Stores
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
NATS Streaming - an alternative to Apache Kafka?
NATS Streaming - an alternative to Apache Kafka?NATS Streaming - an alternative to Apache Kafka?
NATS Streaming - an alternative to Apache Kafka?
 
Transactional operations in Apache Hive: present and future
Transactional operations in Apache Hive: present and futureTransactional operations in Apache Hive: present and future
Transactional operations in Apache Hive: present and future
 
Apache Tez – Present and Future
Apache Tez – Present and FutureApache Tez – Present and Future
Apache Tez – Present and Future
 
Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFi
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
 
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
 
CDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkCDC Stream Processing with Apache Flink
CDC Stream Processing with Apache Flink
 
ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database System
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
 
Effective Spark on Multi-Tenant Clusters
Effective Spark on Multi-Tenant ClustersEffective Spark on Multi-Tenant Clusters
Effective Spark on Multi-Tenant Clusters
 

Similar to Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - StampedeCon 2015

Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureDataWorks Summit
 
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionDataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionWangda Tan
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionDataWorks Summit
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionDataWorks Summit
 
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUGReal-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUGskumpf
 
Internet of things Crash Course Workshop
Internet of things Crash Course WorkshopInternet of things Crash Course Workshop
Internet of things Crash Course WorkshopDataWorks Summit
 
Internet of Things Crash Course Workshop at Hadoop Summit
Internet of Things Crash Course Workshop at Hadoop SummitInternet of Things Crash Course Workshop at Hadoop Summit
Internet of Things Crash Course Workshop at Hadoop SummitDataWorks Summit
 
Running Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache HadoopRunning Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache Hadoophitesh1892
 
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureHadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureVinod Kumar Vavilapalli
 
Storm Demo Talk - Denver Apr 2015
Storm Demo Talk - Denver Apr 2015Storm Demo Talk - Denver Apr 2015
Storm Demo Talk - Denver Apr 2015Mac Moore
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondYahoo Developer Network
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureDataWorks Summit
 
YARN - Past, Present, & Future
YARN - Past, Present, & FutureYARN - Past, Present, & Future
YARN - Past, Present, & FutureDataWorks Summit
 
Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014Hortonworks
 
June 10 145pm hortonworks_tan & welch_v2
June 10 145pm hortonworks_tan & welch_v2June 10 145pm hortonworks_tan & welch_v2
June 10 145pm hortonworks_tan & welch_v2DataWorks Summit
 
Hadoop summit-diverse-workload
Hadoop summit-diverse-workloadHadoop summit-diverse-workload
Hadoop summit-diverse-workloadWangda Tan
 
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnhdhappy001
 

Similar to Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - StampedeCon 2015 (20)

Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
 
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionDataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the union
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the union
 
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUGReal-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
 
Internet of things Crash Course Workshop
Internet of things Crash Course WorkshopInternet of things Crash Course Workshop
Internet of things Crash Course Workshop
 
Internet of Things Crash Course Workshop at Hadoop Summit
Internet of Things Crash Course Workshop at Hadoop SummitInternet of Things Crash Course Workshop at Hadoop Summit
Internet of Things Crash Course Workshop at Hadoop Summit
 
Running Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache HadoopRunning Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache Hadoop
 
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureHadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
 
Storm Demo Talk - Denver Apr 2015
Storm Demo Talk - Denver Apr 2015Storm Demo Talk - Denver Apr 2015
Storm Demo Talk - Denver Apr 2015
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step Beyond
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and Future
 
YARN - Past, Present, & Future
YARN - Past, Present, & FutureYARN - Past, Present, & Future
YARN - Past, Present, & Future
 
Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014Developing YARN Applications - Integrating natively to YARN July 24 2014
Developing YARN Applications - Integrating natively to YARN July 24 2014
 
June 10 145pm hortonworks_tan & welch_v2
June 10 145pm hortonworks_tan & welch_v2June 10 145pm hortonworks_tan & welch_v2
June 10 145pm hortonworks_tan & welch_v2
 
Hadoop summit-diverse-workload
Hadoop summit-diverse-workloadHadoop summit-diverse-workload
Hadoop summit-diverse-workload
 
What's new in Ambari
What's new in AmbariWhat's new in Ambari
What's new in Ambari
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
 
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
 

More from StampedeCon

Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...StampedeCon
 
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017StampedeCon
 
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017StampedeCon
 
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...StampedeCon
 
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017StampedeCon
 
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017StampedeCon
 
Foundations of Machine Learning - StampedeCon AI Summit 2017
Foundations of Machine Learning - StampedeCon AI Summit 2017Foundations of Machine Learning - StampedeCon AI Summit 2017
Foundations of Machine Learning - StampedeCon AI Summit 2017StampedeCon
 
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...StampedeCon
 
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - StampedeCon 2015

  • 1. Multi-Tenancy & The Capacity Scheduler: Apache YARN. Joseph Niemiec, Senior Solutions Architect, jniemiec@hortonworks.com. © Hortonworks Inc. 2011 – 2015. All Rights Reserved
  • 2. Quick Bio: Hadoop user for ~4 years; co-author of Apache Hadoop YARN; originally used Hadoop for location-based services; destination prediction; traffic analysis; effects of weather at client locations on call center call types; pending patent in the Automotive/Telematics domain; defensive paper on M2M validation; started on analytics to be better at an MMORPG; HWX SME for YARN, Tez & MapReduce
  • 3. Agenda: Multi-Tenancy, A Goal, A Definition; YARN Primer; Capacity Scheduler Basics; Workload Management (Queue Mapping, Node Labels, Fair Sharing Preemption, Chargeback); Resource Control (Memory, CPU & CGroups, Future Resources); Quick Preemption Demo, if we have time
  • 4. Multi-Tenancy: A Goal, A Definition
  • 5. Business Objectives of Multi-Tenancy. Elimination of data silos: collate and share data across LoBs. Lower cluster TCO through: blending workloads; higher cluster utilization; economies of scale. Enable applications to: exploit 3rd party data sources; share LoB data with external customers, other LoBs, and supply chain partners. Fall 2013: largely siloed deployments with single-workload clusters. Spring 2015: 65% of clusters host multiple workloads.
  • 6. YARN: Yet Another Resource Negotiator
  • 7. Transition from Hadoop 1 to Hadoop 2 (diagram). HADOOP 1.0, a single-use system for batch apps: HDFS (redundant, reliable storage); MapReduce (cluster resource management & data processing). HADOOP 2.0, a multi-purpose platform for batch, interactive, online, and streaming: HDFS2 (redundant, reliable storage); YARN (cluster resource management); MapReduce (data processing); Others (data processing). YARN-1
  • 8. Concepts. Application: a job submitted to the framework; example: a MapReduce job. Container: the basic unit of allocation; fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.); container_0 = 2GB, 1 CPU; container_1 = 1GB, 6 CPU; replaces the fixed map/reduce slots.
  • 9. YARN: what is it good for? Compute for data processing: metal detectors for your haystacks. Compute for embarrassingly parallel problems: problems with tiny datasets and/or that don't depend on one another, e.g. exhaustive search, trade simulations, climate models, genetic algorithms. Beyond MapReduce: enables multi-workload compute applications on a single shared infrastructure (stream processing, NoSQL, search, in-memory, graphs, etc.). ANYTHING YOU CAN START FROM THE CLI!
  • 10. YARN Architecture: Walkthrough (diagram). Client2; ResourceManager with its Scheduler; twelve NodeManagers hosting AM 1, AM2, Containers 1.1 to 1.3, and Containers 2.1 to 2.4.
  • 11. Capacity Scheduler: The Basics
  • 12. YARN Capacity Scheduler. Capacity Sharing function: elasticity over queues; job submission Access Control Lists. Capacity Enforcement function: max capacity per queue; user limits within queue. Administration function: management admin Access Control Lists; capacity-scheduler.xml.
  • 13. Hierarchical Queues (diagram). ResourceManager Scheduler; root: Adhoc 10%, DW 60%, Mrkting 30%; sub-queues: Dev 10%, Reserved 20%, Prod 70%; Prod 80%, Dev 20%; P0 70%, P1 30%.
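The percentages on slide 13 are relative to each parent queue, so a queue's share of the whole cluster is the product of the percentages along its path. The Python sketch below works through that arithmetic. The parent/child assignments in `QUEUE_TREE` are an assumption (the flattened diagram does not say which sub-queues belong to which parent), and the function name is illustrative, not part of YARN.

```python
# Each entry maps a queue name to (percent of parent, child queues).
# ASSUMPTION: which sub-queue hangs under which parent is a guess at the diagram.
QUEUE_TREE = {
    "Adhoc": (10, {}),
    "DW": (60, {
        "Dev": (10, {}),
        "Reserved": (20, {}),
        "Prod": (70, {"P0": (70, {}), "P1": (30, {})}),
    }),
    "Mrkting": (30, {"Prod": (80, {}), "Dev": (20, {})}),
}


def absolute_capacities(tree, parent_share=100.0, prefix="root"):
    """Return {queue path: percent of the whole cluster} for every queue."""
    result = {}
    for name, (percent, children) in tree.items():
        share = parent_share * percent / 100.0
        path = f"{prefix}.{name}"
        result[path] = share
        result.update(absolute_capacities(children, share, path))
    return result


CAPACITIES = absolute_capacities(QUEUE_TREE)

if __name__ == "__main__":
    for path, share in sorted(CAPACITIES.items()):
        print(f"{path}: {share:.1f}% of cluster")
```

Under this assumed layout, `root.DW.Prod.P0` works out to 60% × 70% × 70% = 29.4% of the cluster, and the leaf queues together account for 100%.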
  • 14. Multi-Tenancy with Capacity Scheduler. Queues. Economics as queue capacity: hierarchical queues. SLAs: preemption. Resource isolation: Cgroups. Administration: queue ACLs; run-time re-configuration for queues; charge-back. Diagram (Capacity Scheduler hierarchical queues): ResourceManager Scheduler; root: Adhoc 10%, DW 70%, Mrkting 20%; sub-queues: Dev 10%, Reserved 20%, Prod 70%; Prod 80%, Dev 20%; P0 70%, P1 30%.
  • 15. Set Limits on Capacity. Minimum Capacity: used to enforce SLAs across business units; guaranteed minimum resources for the queue. Maximum Capacity: hard limit on the maximum % of cluster resources; resource elasticity when not being used by other queues. Minimum User Limits: enforces sharing amongst users in a business unit; user sharing for a given queue. User Limit Factor: maximum queue capacity that one user can take up. Application Limit: maximum number of applications submitted to one queue.
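The minimum user limit on slide 15 is easiest to see as arithmetic. As the Capacity Scheduler documentation describes `minimum-user-limit-percent`, a queue is shared evenly among its active users until the even share would drop below the configured minimum; beyond that point additional users wait. The helper below is a hypothetical sketch of that rule, not scheduler code, and it ignores the separate user limit factor.

```python
def per_user_share_percent(active_users, minimum_user_limit_percent):
    """Percent of the queue one user may hold, given the number of active users.

    Users share the queue evenly, but the share never drops below the
    configured minimum (later users simply have to wait).
    """
    if active_users < 1:
        raise ValueError("need at least one active user")
    return max(100.0 / active_users, float(minimum_user_limit_percent))


if __name__ == "__main__":
    for users in (1, 2, 3, 4, 10):
        print(users, "active users ->", per_user_share_percent(users, 25))
```

With a minimum of 25, one user may take the whole queue, two users get 50% each, and from four users onward each user is held at 25%.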
  • 16. CS: Example Queue Configuration. Finance: 10 users | ad-hoc BI query jobs only | high priority, stricter SLAs. Data Warehouse: 2 users | batch ETL and report generation jobs | production SLAs. Marketing: 4 users | ad-hoc data science (Pig+Mahout) | loose SLAs. yarn.scheduler.capacity.root.finance: Capacity Min: 0.30 | Max: 0.40 | User Limit: 0.20; ACLs: 'Finance' group. yarn.scheduler.capacity.root.datawarehouse: Capacity Min: 0.50 | Max: 0.60 | User Limit: 1.0; ACLs: 'DataWarehouse' group. yarn.scheduler.capacity.root.marketing: Capacity Min: 0.20 | Max: 0.20 | User Limit: 1.0; ACLs: 'Marketing' group.
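The Min/Max figures on slide 16 can be turned into Capacity Scheduler property names with a short script. The Python sketch below checks that the minimum capacities of the sibling queues add up to 100% and then emits the `capacity` and `maximum-capacity` properties as percentages. The function name is hypothetical, and the slide's "User Limit" column is deliberately left out because the slide does not say which property it refers to.

```python
# (min, max) fractions copied from slide 16; queue names follow its property paths.
SLIDE_QUEUES = {
    "finance": (0.30, 0.40),
    "datawarehouse": (0.50, 0.60),
    "marketing": (0.20, 0.20),
}


def to_properties(queues, parent="root"):
    """Build capacity-scheduler.xml property names and percentage values."""
    total = sum(minimum for minimum, _ in queues.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"sibling capacities must sum to 100%, got {total:.0%}")
    props = {f"yarn.scheduler.capacity.{parent}.queues": ",".join(queues)}
    for name, (minimum, maximum) in queues.items():
        if maximum < minimum:
            raise ValueError(f"{name}: maximum is below the guaranteed minimum")
        base = f"yarn.scheduler.capacity.{parent}.{name}"
        props[f"{base}.capacity"] = str(round(minimum * 100))
        props[f"{base}.maximum-capacity"] = str(round(maximum * 100))
    return props


PROPERTIES = to_properties(SLIDE_QUEUES)

if __name__ == "__main__":
    for key, value in PROPERTIES.items():
        print(f"{key} = {value}")
```

The sum check mirrors the scheduler's own requirement that the capacities of the queues under one parent total 100%.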
  • 17. Workload Management: Queue Mapping, Labels, Fair Sharing, Preemption, Chargeback
  • 18. Default User/Group to Queue Mapping. yarn.scheduler.capacity.root.marketing: Capacity Min: 0.30 | Max: 0.50; ACLs: 'Web' group. yarn.scheduler.capacity.root.ops: Capacity Min: 0.30 | Max: 0.50; ACLs: 'SupplyChain' group. yarn.scheduler.capacity.root.default: Capacity Min: 0.40. User/Group to CS Queue mapping: U: Joe → Marketing; G: Web → Marketing; G: SupplyChain → Ops; … Diagram: MR app submitted by Joe. YARN-2411
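The mapping table on slide 18 corresponds to the `yarn.scheduler.capacity.queue-mappings` property, whose entries take the form `u:user:queue` or `g:group:queue` and are evaluated in order, with the first match winning. The Python sketch below mimics that lookup. The fallback to a `default` queue, the lowercase names, and the function names are assumptions made for illustration.

```python
def parse_mappings(spec):
    """Parse 'u:joe:marketing,g:web:marketing' into (kind, name, queue) tuples."""
    mappings = []
    for entry in spec.split(","):
        kind, name, queue = entry.strip().split(":")
        if kind not in ("u", "g"):
            raise ValueError(f"unknown mapping type: {kind!r}")
        mappings.append((kind, name, queue))
    return mappings


def resolve_queue(user, groups, mappings, default="default"):
    """Return the queue of the first mapping that matches the user or a group."""
    for kind, name, queue in mappings:
        if kind == "u" and name == user:
            return queue
        if kind == "g" and name in groups:
            return queue
    return default  # ASSUMPTION: unmatched submitters land in 'default'


# The three rows of the slide's table, written in queue-mappings syntax.
SLIDE_MAPPINGS = parse_mappings("u:joe:marketing,g:web:marketing,g:supplychain:ops")

if __name__ == "__main__":
    print(resolve_queue("joe", [], SLIDE_MAPPINGS))
    print(resolve_queue("ann", ["supplychain"], SLIDE_MAPPINGS))
    print(resolve_queue("bob", ["finance"], SLIDE_MAPPINGS))
```

Because evaluation stops at the first match, a user-level entry placed before a group-level entry overrides the group's queue for that user.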
  • 19. Node Labels in YARN: enable configuration of node partitions. Why: need a mechanism to enforce node-level isolation; account for resource contention amongst non-YARN-managed resources; account for hardware or software constraints. Two options: non-exclusive (soft) node labels; exclusive (hard) node labels.
  • 20. Exclusive Node Labels enable Isolated Partitions (diagram). Configure partitions: nodes carry the label S for Storm. The S app (Storm) runs on the labeled nodes; exclusive labels enforce isolation from the B app. YARN-796
  • 21. Non-Exclusive Node Labels (diagram). Configure non-exclusive labels: nodes carry the label S for Spark. The S app (Spark) runs on the labeled nodes; the B app is scheduled there if there is free capacity. YARN-3214
  • 22. Fair Sharing: Pluggable Queue Policies. Choose a scheduling policy per leaf queue. FIFO: application container requests accommodated on a first-come, first-served basis. Multi-fair weight: application container requests accommodated according to order of least resources used, so multiple applications make progress, and (optionally) a size-based weight, an adjustment to boost large applications making progress. YARN-3319, YARN-3318
  • 23. Fair Sharing (diagram, Marketing queue). Panel 1: Job 1 (U: etl, Q: Marketing) is submitted; Job1 containers: 100, running/finished: 20; Job1 has 20 containers running. Panel 2: Job 2 (U: etl, Q: Marketing) is submitted; Job1 containers: 100, running/finished: 20; Job2 containers: 20, running/finished: 0; Job1 has 20 containers running.
  • 24. Fair Sharing (diagram, Marketing queue, continued). Panel 1: Job1 containers: 100, running/finished: 30; Job2 containers: 20, running/finished: 10; Job1 has 10 containers running and Job2 has 10 containers running. Panel 2: Job1 containers: 100, running/finished: 40; Job2 containers: 20, running/finished: 20; Job1 has 10 containers running and Job2 has 10 containers running.
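The 10/10 split on slides 23 and 24 is what "order of least resources used" produces once Job 2 joins Job 1 in a 20-container queue. The Python sketch below hands out free containers one at a time to whichever application currently holds the fewest. It is a simplified illustration with hypothetical names, not the scheduler's ordering-policy code, and it ignores the optional size-based weight.

```python
import heapq


def fair_allocate(capacity, pending):
    """Split `capacity` containers among apps, least-used app first.

    `pending` maps app name -> number of containers the app still wants.
    Returns app name -> number of containers granted.
    """
    allocated = {app: 0 for app in pending}
    # Heap of (containers currently held, app); the smallest holder is served next.
    heap = [(0, app) for app, want in pending.items() if want > 0]
    heapq.heapify(heap)
    free = capacity
    while free > 0 and heap:
        used, app = heapq.heappop(heap)
        allocated[app] = used + 1
        free -= 1
        if allocated[app] < pending[app]:
            heapq.heappush(heap, (used + 1, app))
    return allocated


if __name__ == "__main__":
    # Job1 still wants 80 of its 100 containers; Job2 wants all 20 of its own.
    print(fair_allocate(20, {"job1": 80, "job2": 20}))
```

When one application wants fewer containers than its even share, the remainder flows to the other: with demands of 80 and 4, the result is 16 and 4.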
  • 25. Preemption. Across queues: supported. Within queue, across users: roadmap. Within queue, within user: not supported (fair sharing only). Across queues (node labels): supported. YARN-569
  • 26. Preemption: Overview. Preempt based on lowest priority: the queue that is most over-subscribed; the last scheduled app in a FIFO queue; does not account for user limits within a queue. Warn and request from application: requests the application to un-reserve a container; after a period, will forcibly kill the container.
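The selection rule on slide 26 (the most over-subscribed queue, then the last scheduled app in it) can be sketched in a few lines of Python. This is a deliberate simplification rather than YARN's actual preemption policy: over-subscription is measured here in plain percentage points above the guaranteed minimum, and the usage figures and job placements in `QUEUES` are hypothetical values chosen to resemble the walkthrough slides.

```python
# Guaranteed minimums come from the "Preemption in action" slides;
# the "used" figures and the job-to-queue placement are HYPOTHETICAL.
QUEUES = {
    "product": {"guaranteed": 20, "used": 0, "apps": []},
    "marketing": {"guaranteed": 45, "used": 65, "apps": ["job2"]},
    "finance": {"guaranteed": 35, "used": 35, "apps": ["job1"]},
}


def pick_preemption_victim(queues):
    """Return (queue, app, points over guarantee), or None if nothing is over.

    The queue furthest above its guarantee is chosen, and within it the
    most recently scheduled app (the last entry of its FIFO list).
    """
    over = {
        name: q["used"] - q["guaranteed"]
        for name, q in queues.items()
        if q["used"] > q["guaranteed"] and q["apps"]
    }
    if not over:
        return None
    queue = max(over, key=over.get)
    return queue, queues[queue]["apps"][-1], over[queue]


if __name__ == "__main__":
    print(pick_preemption_victim(QUEUES))
```

In the real scheduler the chosen application is first asked to give containers back, and containers are killed only after a grace period, as the slide notes.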
  • 27. Preemption in action, step 1 (diagram). Product queue: Min 20%, Max 30%. Marketing queue: Min 45%, Max 75%. Finance queue: Min 35%, Max 35%. Job 1 is submitted.
  • 28. Preemption in action, step 2 (diagram). Product queue: Min 20%, Max 30%. Marketing queue: Min 45%, Max 75%. Finance queue: Min 35%, Max 35%. Job 2 is submitted; Job1 is running.
  • 29. Preemption in action, step 3 (diagram). Product queue: Min 20%, Max 30%. Marketing queue: Min 45%, Max 75%. Finance queue: Min 35%, Max 35%. Job 3 is submitted; Job1 and Job2 are running; 65%.
  • 30. Preemption in action, step 4 (diagram). Product queue: Min 20%, Max 30%. Marketing queue: Min 45%, Max 75%. Finance queue: Min 35%, Max 35%. Job 3 is submitted; Job1, Job3, and Job2 are shown; Job2 is preempted.
  • 31. Preemption in action, step 5 (diagram). Product queue: Min 20%, Max 30%. Marketing queue: Min 45%, Max 75%. Finance queue: Min 35%, Max 35%. Job2 and Job3 are shown; Job1 finishes.
  • 32. Chargeback: App-Level Aggregate. Captures the resources of an app. Resources: exposes memory-seconds and CPU-seconds; based on the reserved amount, not the utilized amount. Exposed through: REST API, CLI, Web UI. Ambari Chargeback View coming soon! YARN-415
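The aggregate on slide 32 is a sum over an application's containers of the reserved resource multiplied by how long it was held. The Python sketch below shows that arithmetic for memory and CPU. The `ContainerRun` type, the function name, and the run durations are hypothetical; the two container sizes reuse the 2 GB / 1 CPU and 1 GB / 6 CPU examples from the Concepts slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerRun:
    memory_mb: int  # memory reserved for the container, in MB
    vcores: int     # CPU cores reserved for the container
    seconds: int    # how long the container was held


def aggregate_usage(runs):
    """Return (MB-seconds, vcore-seconds) reserved across all containers."""
    memory_seconds = sum(run.memory_mb * run.seconds for run in runs)
    vcore_seconds = sum(run.vcores * run.seconds for run in runs)
    return memory_seconds, vcore_seconds


# HYPOTHETICAL durations: 10 minutes and 5 minutes.
APP_RUNS = [
    ContainerRun(memory_mb=2048, vcores=1, seconds=600),
    ContainerRun(memory_mb=1024, vcores=6, seconds=300),
]

if __name__ == "__main__":
    memory_seconds, vcore_seconds = aggregate_usage(APP_RUNS)
    print(f"{memory_seconds} MB-seconds, {vcore_seconds} vcore-seconds")
```

Because the figures are based on what was reserved, an application that requests large containers and leaves them idle is charged the same as one that keeps them busy.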
  • 33. Resource Control
  • 34. Memory Scheduling. What: the default method of scheduling today; applications request containers of X memory resources (MB/GB); the YARN Capacity Scheduler schedules containers based on node memory availability. Why: most tasks are not CPU-bound; memory is typically an abundant resource on newer clusters (256 GB per node standard); had to start with something ;)
  • 35. CPU Scheduling. What: the admin tells YARN how much CPU capacity is available in the cluster; applications specify the CPU capacity needed for each container; the YARN Capacity Scheduler schedules applications taking CPU capacity availability into account. Why: applications (for example Storm, HBase, machine learning) need predictable access to CPU as a resource; CPU has become the bottleneck instead of memory in certain clusters (128 GB RAM, 6 CPUs). YARN-2
  • 36. CGroup Isolation for CPU. What: the admin enables CGroups for CPU isolation for all YARN application workloads. Why: applications need guaranteed access to CPU resources; to ensure SLAs, the CPU allocations given to an application container need to be enforced. Effects: a container's ability to use CPU is set by the vCores requested, with hard or soft enforcement; the NodeManager's maximum allowed CPU usage for ALL containers on the node (host CPU % total). YARN-3
  • 37. [Coming Soon] Disk Resources. What: just isolation, enforcing equal sharing of disk or dedication of spindles; disk isolation: local disk IOPS at runtime… not HDFS reads/writes; disk dedication: let applications request dedicated spindles. How: Linux only, uses CGroups; use the Cgroups resource handler org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler; enable the disk resource with yarn.nodemanager.resource.disk.enabled. YARN-2619
  • 38. Preemption Demo. Demo: 1. Multiple queues without preemption; 2. Multiple queues with preemption. Simulated workload: 1. Start 2 jobs and use all elasticity of the reports & ops queues; 2. Wait ~30 seconds; 3. Start 2 more jobs in the adhoc & batch queues.
  • 39. Thank You! Questions?