Mark Wagner
Engineer, Hadoop Infrastructure
LinkedIn
Dr. Elephant: Self-serve performance tuning for Hadoop
3
Hadoop @ LinkedIn
•  Thousands of users of Hadoop infrastructure
•  Tens of thousands of jobs a day
•  Thousands of registered projects
•  Multiple analytics, experimentation, and metrics platforms built on top
•  Diverse backgrounds and levels of experience with Hadoop
4
Hadoop team @ LinkedIn
•  Roll our own distribution
•  Build next generation systems
•  Optimize our investment in hardware
•  Enable our users to be productive
5
Optimizing people
Better compatibility: ByteRay
•  We have 1000s of developer-months in existing codebases
•  Hadoop 2 has incompatible APIs
6
Optimizing people
Workflow tooling: Gradle DSL for Hadoop
•  Nobody writes one Hadoop job
•  How do you structure Hadoop codebases?

hadoop {
  buildPath 'conf/jobs';

  propertyFile('common') {
    set properties: [
      'user.to.proxy' : 'mwagner'
    ]
  }

  workflow('my-first-workflow') {
    commandJob('start-job') {
      uses 'echo "Hello, World!"'
    }

    pigLiJob('vowels') {
      uses 'src/main/pig/vowels.pig'
      depends 'start-job'
    }

    targets 'vowels'
  }
}

Easier tuning?
7
Optimizing people
•  Large investment in hardware
•  Cost(People) >> Cost(Machines)
•  Can’t throw machines at the problem forever
•  Some tuning needed to get things running
•  Minimum effort gives the worst of both worlds
8
Barriers to tuning
 Problems are not obvious
•  What’s wrong with this job? Anything?
...
2015-06-09 05:57:56,281 Stage-1 map = 95%,  reduce = 0%, Cumulative CPU 12602.08 sec
2015-06-09 05:58:17,821 Stage-1 map = 96%,  reduce = 0%, Cumulative CPU 12688.5 sec
2015-06-09 05:58:23,952 Stage-1 map = 97%,  reduce = 0%, Cumulative CPU 12705.91 sec
2015-06-09 05:58:24,976 Stage-1 map = 99%,  reduce = 0%, Cumulative CPU 12710.31 sec
2015-06-09 05:58:26,000 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 12712.08 sec
2015-06-09 05:58:40,317 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 12714.17 sec
MapReduce Total cumulative CPU time: 0 days 3 hours 31 minutes 54 seconds 170 msec
Ended Job = job_1433389922983_133809
MapReduce Jobs Launched:
Job 0: Map: 35  Reduce: 1   Cumulative CPU: 12714.17 sec   HDFS Read: 23223452 HDFS Write: 18 SUCCESS
Total MapReduce CPU Time Spent: 0 days 3 hours 31 minutes 54 seconds 170 msec
OK
1234567
Time taken: 564.189 seconds, Fetched: 1 row(s)
hive (default)>
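One way to make the numbers in that console output easier to judge is to turn the job summary line into per-task ratios. The sketch below is only an illustration: the regular expression and the `summarize` helper are hypothetical, written for this one line format, and the ratios it prints are not thresholds from any tool.

```python
import re

# Summary line copied from the Hive console output above.
SUMMARY = ("Job 0: Map: 35  Reduce: 1   Cumulative CPU: 12714.17 sec   "
           "HDFS Read: 23223452 HDFS Write: 18 SUCCESS")

# Hypothetical pattern for this specific line layout.
PATTERN = re.compile(
    r"Map: (?P<maps>\d+)\s+Reduce: (?P<reduces>\d+)\s+"
    r"Cumulative CPU: (?P<cpu>[\d.]+) sec\s+"
    r"HDFS Read: (?P<read>\d+)\s+HDFS Write: (?P<write>\d+)"
)


def summarize(line):
    """Derive per-task ratios from one 'Job N:' summary line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a job summary line")
    maps = int(m["maps"])
    reduces = int(m["reduces"])
    cpu = float(m["cpu"])
    read = int(m["read"])
    return {
        "maps": maps,
        "cpu_sec_per_task": cpu / (maps + reduces),
        "bytes_read_per_map": read / maps,
        "cpu_sec_per_mb_read": cpu / (read / 2**20),
    }


stats = summarize(SUMMARY)
for key, value in stats.items():
    print(f"{key}: {value:.1f}")
```

For this job the ratios come out to well under 1 MB of HDFS input per map task and several minutes of CPU per task. Whether that points to a real problem depends on what the query does, which is exactly why the raw log is hard to read.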
  
 Critical information is scattered
9
Barriers to tuning
 Inter-related settings
10
Barriers to tuning
What interface are you using?
Did you set max split size?
Did you set min split size?
Did you have split combination enabled?
How large are your files?
Extend CombineFileInputFormat? CombineHiveInputFormat?
What’s your maxCombinedSplitSize?
What’s your block size?
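To see why these questions depend on each other, here is a rough sketch of how block size, min split size, and max split size interact when splits are not combined. The clamp in `compute_split_size` mirrors the one in Hadoop's `FileInputFormat`; `estimate_splits` and the file sizes are assumptions made up for illustration, and the estimate ignores the small slop Hadoop allows on the last split of a file.

```python
import math

MB = 2**20


def compute_split_size(block_size, min_size, max_size):
    # Same clamp as FileInputFormat.computeSplitSize in the
    # org.apache.hadoop.mapreduce API.
    return max(min_size, min(max_size, block_size))


def estimate_splits(file_sizes, block_size, min_size=1, max_size=2**63 - 1):
    """Rough split count without split combination: splits never span files."""
    split_size = compute_split_size(block_size, min_size, max_size)
    return sum(max(1, math.ceil(size / split_size)) for size in file_sizes)


# Made-up input: four large files and one hundred small ones.
files = [300 * MB] * 4 + [5 * MB] * 100

default_splits = estimate_splits(files, block_size=128 * MB)
big_min_splits = estimate_splits(files, block_size=128 * MB, min_size=512 * MB)
small_max_splits = estimate_splits(files, block_size=128 * MB, max_size=64 * MB)

print(default_splits, big_min_splits, small_max_splits)
```

With this input the estimate is 112 splits by default, 104 with a 512 MB minimum, and 120 with a 64 MB maximum. The hundred small files dominate in every case, so the min and max settings barely matter until split combination is enabled, which is the kind of interaction the questions above are probing.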
 Large Parameter Space
11
Barriers to tuning
mapreduce.task.io.sort.mb
mapreduce.job.min.split.size
pig.maxCombinedSplitSize
hive.auto.convert.join
mapreduce.task.io.sort.factor
hive.exec.reducers.bytes.per.reducer
pig.exec.reducers.max
pig.exec.reducers.bytes.per.reducer
hive.map.aggr
hive.groupby.skewindata
hive.multigroupby.singlemr
mapreduce.map.memory.mb
pig.cachedbag.memusage
hive.optimize.correlation
hive.exec.orc.dictionary.key.size.threshold
pig.exec.mapPartAgg
pig.exec.mapPartAgg.minReduction
pig.skewedjoin.reduce.memusage
mapreduce.map.sort.spill.percent
mapreduce.job.max.split.locations
mapreduce.reduce.shuffle.parallelcopies
mapreduce.reduce.shuffle.merge.percent
mapreduce.map.speculative
mapreduce.reduce.speculative
mapreduce.map.output.compress
mapreduce.job.ubertask.maxmaps
mapreduce.ifile.readahead.bytes
hive.exec.compress.intermediate
hive.merge.mapfiles
200+ configuration settings in MapReduce
300+ more in Hive
 Not this
12
Tuning Hadoop
Photo credit: __ Night Flier __
 This
13
Tuning Hadoop
Photo credit: Ben Cooper
 Expert intervention
14
Things that don’t work
•  Not enough support resources available
•  Poor coverage
•  Difficult to prioritize efforts
•  Delays user development
 Extensive training
15
Things that don’t work
•  Too many users
•  Diverse backgrounds
•  Scope is large and evolving
•  Other responsibilities are more important
 Goals
16
Dr. Elephant
•  Help every user get the best performance from their jobs
•  Impose minimal burden on the user
•  Development burden
•  Intellectual burden
•  Provide a platform for other performance related tools
17
Dashboard and feed
18
Search
19
Individual Job
20
Help Pages
 Internals
21
Dr. Elephant
•  All completed jobs are monitored
•  Diagnostic information collected automatically
•  REST API for everything
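The slides do not show how the collected diagnostics are evaluated. As a purely hypothetical sketch of what an automatic check over per-task data could look like, the rule below grades input skew across a job's tasks. The severity names, the thresholds, and the `TaskStats` type are all assumptions for illustration, not Dr. Elephant's actual rules.

```python
from dataclasses import dataclass
from statistics import median

# Hypothetical severity scale and thresholds, chosen only for illustration.
SEVERITIES = ["NONE", "LOW", "MODERATE", "SEVERE", "CRITICAL"]
SKEW_THRESHOLDS = [2, 4, 8, 16]  # ratio of largest task input to the median


@dataclass
class TaskStats:
    task_id: str
    input_bytes: int


def input_skew_severity(tasks):
    """Grade how unevenly input is spread across a job's tasks."""
    sizes = [t.input_bytes for t in tasks]
    if not sizes:
        return "NONE"
    mid = median(sizes)
    if mid == 0:
        return "NONE"
    ratio = max(sizes) / mid
    # Count how many thresholds the ratio meets; that count is the level.
    level = sum(ratio >= threshold for threshold in SKEW_THRESHOLDS)
    return SEVERITIES[level]


even = [TaskStats(f"m_{i}", 100) for i in range(4)]
skewed = even[:3] + [TaskStats("m_3", 900)]
print(input_skew_severity(even), input_skew_severity(skewed))
```

A rule of this shape needs no input from the job's owner, which fits the goal of imposing minimal burden on the user.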
22
Dr. Elephant
 Monitoring scheduled workflows
•  Performance characteristics change
•  Data growth
•  Data distribution change
•  Hardware change
•  Incremental software change
•  Monitor performance on each execution
•  Compare behavior across revisions
======TOP 20 BAD JOBS YESTERDAY======
JobId                       Score
job_1431576474881_181412    36035
job_1431576474881_185548    27710
.
.
.

======TOP 20 BAD FLOWS YESTERDAY======
FlowUrl                     Score
https://prod-azkaban/...    45379
.
.
.

======TOP 10 FLOWS WITH SIGNIFICANT PERFORMANCE CHANGE======
Project      Flow           ChangeScore    User
myProject    score-daily    48755          mwagner
.
.
.
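A report like the one above can be produced from any mapping of job IDs to scores. The sketch below assumes that a higher score means a worse job, which the slide implies but does not define; `top_n`, `format_report`, and the third job ID are hypothetical names and data added for illustration.

```python
def top_n(scores, n):
    """Return the n highest-scoring (name, score) pairs, worst first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:n]


def format_report(title, name_header, rows):
    """Render rows of (name, score) as a fixed-width text table."""
    width = max([len(name_header)] + [len(name) for name, _ in rows]) + 2
    lines = [f"======{title}======", f"{name_header:<{width}}Score"]
    lines += [f"{name:<{width}}{score}" for name, score in rows]
    return "\n".join(lines)


job_scores = {
    "job_1431576474881_181412": 36035,
    "job_1431576474881_185548": 27710,
    "job_0000000000000_000001": 120,  # made-up job with a low score
}

rows = top_n(job_scores, 2)
report = format_report("TOP 2 BAD JOBS YESTERDAY", "JobId", rows)
print(report)
```

The same two helpers would cover the flow and change-score sections if they were given flow URLs or project names as keys.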
  
 Automated audits
23
Dr. Elephant
•  Separate cluster for critical workloads
•  Audit before deployment
•  Improved accuracy
•  Faster turnaround
•  Higher throughput
24
Dr. Elephant
 As an operator utility
•  Global view of performance issues
•  Search and identify jobs for extra attention
•  Dr. Elephant sign-off as a requirement for capacity requests
•  Dr. Elephant can grade itself
•  Social pressures encourage good behavior
•  Tuning degrades over time
25
Results and experiences
Chart: Fraction of healthy jobs (axis: Fraction, 0 to 1)
26
Dr. Elephant for all
•  Plugins for other execution engines
•  Tez, Spark on the way
•  Allow the user community to build a knowledge-base
27
Dr. Elephant today
•  Evaluating 60000+ jobs a day across multiple clusters
•  Open source release coming soon
©2014 LinkedIn Corporation. All Rights Reserved.

 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 

Recently uploaded (20)

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 

Self-serve Hadoop Performance Tuning with Dr. Elephant

  • 1.
  • 2. Mark Wagner, Engineer, Hadoop Infrastructure, LinkedIn. Dr. Elephant: Self-serve performance tuning for Hadoop
  • 3. Hadoop @ LinkedIn • Thousands of users of Hadoop infrastructure • Tens of thousands of jobs a day • Thousands of registered projects • Multiple analytics, experimentation, and metrics platforms built on top • Diverse backgrounds and levels of experience with Hadoop
  • 4. Hadoop team @ LinkedIn • Roll our own distribution • Build next generation systems • Optimize our investment in hardware • Enable our users to be productive
  • 5. Optimizing people. Better compatibility: ByteRay • We have 1000s of developer-months in existing codebases • Hadoop 2 has incompatible APIs
  • 6. Optimizing people. Workflow tooling: Gradle DSL for Hadoop • Nobody writes one Hadoop job • How do you structure Hadoop codebases?

        hadoop {
          buildPath 'conf/jobs';

          propertyFile('common'){
            set properties: [
              'user.to.proxy' : 'mwagner'
            ]
          }

          workflow('my-first-workflow'){
            commandJob('start-job'){
              uses 'echo "Hello, World!"'
            }

            pigLiJob('vowels'){
              uses 'src/main/pig/vowels.pig'
              depends 'start-job'
            }

            targets 'vowels'
          }
        }

  • 7. Optimizing people. Easier tuning? • Large investment in hardware • Cost(People) >> Cost(Machines) • Can’t throw machines at the problem forever • Some tuning needed to get things running • Minimum effort gives the worst of both worlds
  • 8. Barriers to tuning: Problems are not obvious • What’s wrong with this job? Anything?

        ...
        2015-06-09 05:57:56,281 Stage-1 map = 95%,  reduce = 0%, Cumulative CPU 12602.08 sec
        2015-06-09 05:58:17,821 Stage-1 map = 96%,  reduce = 0%, Cumulative CPU 12688.5 sec
        2015-06-09 05:58:23,952 Stage-1 map = 97%,  reduce = 0%, Cumulative CPU 12705.91 sec
        2015-06-09 05:58:24,976 Stage-1 map = 99%,  reduce = 0%, Cumulative CPU 12710.31 sec
        2015-06-09 05:58:26,000 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 12712.08 sec
        2015-06-09 05:58:40,317 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 12714.17 sec
        MapReduce Total cumulative CPU time: 0 days 3 hours 31 minutes 54 seconds 170 msec
        Ended Job = job_1433389922983_133809
        MapReduce Jobs Launched:
        Job 0: Map: 35  Reduce: 1   Cumulative CPU: 12714.17 sec   HDFS Read: 23223452 HDFS Write: 18 SUCCESS
        Total MapReduce CPU Time Spent: 0 days 3 hours 31 minutes 54 seconds 170 msec
        OK
        1234567
        Time taken: 564.189 seconds, Fetched: 1 row(s)
        hive (default)>

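The numbers that hint at a problem are present in the job summary above, but they have to be dug out and combined by hand. As a rough sketch (not how Dr. Elephant itself works; the function name, regular expression, and derived metrics are illustrative assumptions), the summary line can be parsed and turned into per-task figures like this:

```python
import re

# Summary line copied from the Hive output shown on the slide.
SUMMARY = ("Job 0: Map: 35  Reduce: 1   Cumulative CPU: 12714.17 sec   "
           "HDFS Read: 23223452 HDFS Write: 18 SUCCESS")

# Hypothetical pattern for this one line format; other Hive versions may differ.
PATTERN = re.compile(
    r"Map:\s*(?P<maps>\d+)\s+Reduce:\s*(?P<reduces>\d+)\s+"
    r"Cumulative CPU:\s*(?P<cpu>[\d.]+)\s*sec\s+"
    r"HDFS Read:\s*(?P<read>\d+)\s+HDFS Write:\s*(?P<write>\d+)"
)


def summarize(line):
    """Derive simple per-task figures from a Hive job summary line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a job summary line")
    maps = int(m["maps"])
    reduces = int(m["reduces"])
    cpu_sec = float(m["cpu"])
    bytes_read = int(m["read"])
    tasks = maps + reduces
    return {
        "tasks": tasks,
        "bytes_per_map": bytes_read / maps,    # input handled by each mapper
        "cpu_sec_per_task": cpu_sec / tasks,   # average CPU time per task
    }


stats = summarize(SUMMARY)
for name, value in stats.items():
    print(f"{name}: {value:.1f}")
```

For the summary line above this gives 36 tasks, roughly 0.66 MB read per map task, and about 353 CPU-seconds per task. Whether those figures indicate a problem depends on the query, which is the slide's point: the raw output does not say.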
  • 9. Barriers to tuning: Critical information is scattered
  • 10. Barriers to tuning: Inter-related settings • What interface are you using? • Did you set max split size? • Did you set min split size? • Did you have split combination enabled? • How large are your files? • Extend CombineFileInputFormat? CombineHiveInputFormat? • What’s your maxCombinedSplitSize? • What’s your block size?
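Several of the questions above interact through a single formula. As a minimal sketch: Hadoop's plain FileInputFormat chooses a split size as max(minSize, min(maxSize, blockSize)). The combine input formats mentioned on the slide follow different rules that are not modeled here, and the function names and the split-count estimate below are illustrative assumptions.

```python
MB = 1024 * 1024


def compute_split_size(block_size, min_size, max_size):
    """Split size as chosen by plain FileInputFormat.

    Note that min_size wins when it is larger than max_size.
    """
    return max(min_size, min(max_size, block_size))


def estimate_splits(file_size, split_size):
    """Rough number of splits for one file (ceiling division).

    Simplification: ignores the small slop Hadoop allows on the last split.
    """
    return -(-file_size // split_size)


# The same 1 GiB file under three different settings.
file_size = 1024 * MB
for min_size, max_size in [(1, 256 * MB), (1, 64 * MB), (512 * MB, 256 * MB)]:
    size = compute_split_size(128 * MB, min_size, max_size)
    print(min_size, max_size, size // MB, estimate_splits(file_size, size))
```

Changing any one of block size, min split size, or max split size can change the mapper count, which is why the settings cannot be tuned in isolation.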
  • 11. Barriers to tuning: Large parameter space • mapreduce.task.io.sort.mb, mapreduce.job.min.split.size, pig.maxCombinedSplitSize, hive.auto.convert.join, mapreduce.task.io.sort.factor, hive.exec.reducers.bytes.per.reducer, pig.exec.reducers.max, pig.exec.reducers.bytes.per.reducer, hive.map.aggr, hive.groupby.skewindata, hive.multigroupby.singlemr, mapreduce.map.memory.mb, pig.cachedbag.memusage, hive.optimize.correlation, hive.exec.orc.dictionary.key.size.threshold, pig.exec.mapPartAgg, pig.exec.mapPartAgg.minReduction, pig.skewedjoin.reduce.memusage, mapreduce.map.sort.spill.percent, mapreduce.job.max.split.locations, mapreduce.reduce.shuffle.parallelcopies, mapreduce.reduce.shuffle.merge.percent, mapreduce.map.speculative, mapreduce.reduce.speculative, mapreduce.map.output.compress, mapreduce.job.ubertask.maxmaps, mapreduce.ifile.readahead.bytes, hive.exec.compress.intermediate, hive.merge.mapfiles • 200+ configuration settings in MapReduce • 300+ more in Hive
  • 12. Tuning Hadoop: Not this (Photo credit: __ Night Flier __)
  • 14. Things that don’t work: Expert intervention • Not enough support resources available • Poor coverage • Difficult to prioritize efforts • Delays user development
  • 15. Things that don’t work: Extensive training • Too many users • Diverse backgrounds • Scope is large and evolving • Other responsibilities are more important
  • 16. Dr. Elephant: Goals • Help every user to get the best performance out of their jobs • Impose minimal burden on the user (development burden, intellectual burden) • Provide a platform for other performance-related tools
  • 21. Dr. Elephant: Internals • All completed jobs are monitored • Diagnostic information collected automatically • REST API for everything
  • 22. Dr. Elephant: Monitoring scheduled workflows • Performance characteristics change (data growth, data distribution change, hardware change, incremental software change) • Monitor performance on each execution • Compare behavior across revisions

        ======TOP 20 BAD JOBS YESTERDAY======
        JobId                           Score
        job_1431576474881_181412        36035
        job_1431576474881_185548        27710
        ...

        ======TOP 20 BAD FLOWS YESTERDAY======
        FlowUrl                         Score
        https://prod-azkaban/...        45379
        ...

        ======TOP 10 FLOWS WITH SIGNIFICANT PERFORMANCE CHANGE======
        Project      Flow           ChangeScore   User
        myProject    score-daily    48755         mwagner
        ...

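A report like the "top bad jobs" listing above amounts to a sort over per-job scores. The sketch below is only an illustration of that ranking step: the slides do not say how Dr. Elephant computes a score, so scores are treated as opaque numbers, and the function names and column widths are assumptions. The two job IDs and scores are the ones shown on the slide.

```python
def top_bad_jobs(scores, n=20):
    """Return the n (job_id, score) pairs with the highest score, worst first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:n]


def format_report(title, rows):
    """Render rows in a layout similar to the report on the slide."""
    lines = [f"======{title}======", f"{'JobId':<32}Score"]
    lines += [f"{job_id:<32}{score}" for job_id, score in rows]
    return "\n".join(lines)


# Scores as they appear on the slide; how they are computed is not shown there.
scores = {
    "job_1431576474881_185548": 27710,
    "job_1431576474881_181412": 36035,
}

report = format_report("TOP 20 BAD JOBS YESTERDAY", top_bad_jobs(scores))
print(report)
```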
  • 23. Dr. Elephant: Automated audits • Separate cluster for critical workloads • Audit before deployment • Improved accuracy • Faster turnaround • Higher throughput
  • 24. Dr. Elephant: As an operator utility • Global view of performance issues • Search and identify jobs for extra attention • Dr. Elephant sign-off as a requirement for capacity requests
  • 25. Results and experiences • Dr. Elephant can grade itself • Social pressures encourage good behavior • Tuning degrades over time [Chart: Fraction of healthy jobs; y-axis: Fraction, 0 to 1]
  • 26. Dr. Elephant for all • Plugins for other execution engines • Tez, Spark on the way • Allow the user community to build a knowledge base
  • 27. Dr. Elephant today • Evaluating 60000+ jobs a day across multiple clusters • Open source release coming soon
  • 28. ©2014 LinkedIn Corporation. All Rights Reserved.