Copyright ©2014 Treasure Data. All Rights Reserved.
Treasure Data on The YARN
Ryu Kobayashi
Hadoop Conference Japan 2014
8 July 2014
Who am I?
• Ryu Kobayashi
• @ryu_kobayashi
• https://github.com/ryukobayashi
• Treasure Data, Inc.
• Software Engineer
• Background
• Hadoop, Cassandra, Machine Learning, ...
• I developed the Huahin Framework for Hadoop:
  http://huahinframework.org/
What is Treasure Data?
Our Service
[Diagram: Data Collection -> Data Warehouse -> Data Analysis]
Data Collection
• Sources: Web Log, App Log, Sensor, RDBMS, CRM, ERP
• Streaming Upload: Open-Source Log Collector
• Bulk Upload / Parallel Upload: Bulk Loader (CSV / TSV, MySQL, Postgres, Oracle, etc.)
Data Warehouse
• Columnar Storage + Hadoop MapReduce
• schema-less
Data Analysis
• SQL (HiveQL) or Pig
• TD command / Web Console, REST API, JDBC / ODBC
• BI Tools: Tableau, QlikView, Pentaho, Excel, etc.
• Result push: External Service / Storage (Custom App, RDBMS, FTP, etc.)
Our Query Language
Our System
[Diagram: Hadoop Cluster and PlazmaDB]
HDFS is not used
• Customize Hadoop
• Customize Hive
• Customize Pig
• Customize Impala
• Customize Presto
We have 4 production Hadoop clusters
• user1, user4, user5, …
• user2, user9, user34, …
• user10, user40, user102, …
• user50, user88, user1023, …
Our Scheduler and Queue
[Diagram: Queue and Scheduler in front of the Hadoop clusters]
We have 4 production Hadoop clusters
and a Hadoop (YARN) cluster
[Diagram: YARN Cluster]
MRv1 and YARN Queue
[Diagram: Queue in front of the Hadoop clusters]
Our Service
• About 4700 users
• About 6 trillion records
• About 12 million jobs
• About 40,000 jobs per day
What is YARN?
YARN(Yet Another Resource Negotiator)
Architecture
• MRv1
• JobTracker
• TaskTracker
• YARN
• ResourceManager
• NodeManager
• ApplicationMaster
• Job History Server
  (Job log history cannot be viewed unless it is installed)
Note!!!
Use Hadoop 2.4.0
or later!!!
• Versions that must not be used
• Apache Hadoop 2.2.0
• Apache Hadoop 2.3.0
• HDP 2.0(2.2.0 based)
• Current versions
• Apache Hadoop 2.4.1
• CDH 5.0.2 (2.3.0 based, with patches)
• HDP 2.1 (2.4.0 based)
• Why should they not be used?
• Capacity Scheduler
• There is a bug
• Fair Scheduler
• There is a bug
• What bugs?
• Each scheduler will cause a deadlock
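The version advice above can be turned into a small pre-flight check. This is a minimal sketch in Python and not part of the talk: the names `parse_version` and `is_safe_hadoop_version` are hypothetical, and the check only encodes the rule from the slides (avoid Apache Hadoop releases older than 2.4.0). Vendor builds such as CDH carry backported patches, so the Apache base version alone does not settle the question for them.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.4.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum Apache Hadoop release recommended in the slides.
MIN_SAFE = (2, 4, 0)


def is_safe_hadoop_version(text):
    """Return True when the Apache Hadoop version is 2.4.0 or later."""
    return parse_version(text) >= MIN_SAFE


if __name__ == "__main__":
    for version in ("2.2.0", "2.3.0", "2.4.0", "2.4.1"):
        verdict = "ok" if is_safe_hadoop_version(version) else "avoid"
        print(version, verdict)
```

Comparing tuples of integers rather than raw strings keeps the ordering correct for multi-digit components (for example 2.10.0 sorts after 2.4.0).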
Distribution
• CDH 5.0.2
• Red Hat/CentOS/Oracle 5
• Red Hat/CentOS/Oracle 6
• Ubuntu/Debian
• HDP 2.1
• Red Hat/CentOS/SLES (64-bit)
• (Ubuntu 12 packages are already in the repository)
• Windows Server 2008 & 2012
The configuration files have changed in several places (MRv1 to YARN)
reference: http://goo.gl/vBIYQP
Deprecated Properties
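As an illustration of what a deprecated-property check might look like, here is a minimal Python sketch. It is not from the talk: the function name `find_deprecated` is hypothetical, and the table holds only a handful of well-known renames from the Apache Hadoop "Deprecated Properties" list, not the full set.

```python
# A few renames from the Apache Hadoop "Deprecated Properties" table.
# This is a small, non-exhaustive subset chosen for illustration.
DEPRECATED = {
    "fs.default.name": "fs.defaultFS",
    "mapred.job.name": "mapreduce.job.name",
    "mapred.map.tasks": "mapreduce.job.maps",
    "mapred.reduce.tasks": "mapreduce.job.reduces",
    "mapred.job.tracker": "mapreduce.jobtracker.address",
}


def find_deprecated(settings):
    """Return {old_name: new_name} for every deprecated key in settings."""
    return {key: DEPRECATED[key] for key in settings if key in DEPRECATED}


if __name__ == "__main__":
    # Hypothetical settings, as they might be read from an old *-site.xml.
    old_settings = {"fs.default.name": "hdfs://example:8020",
                    "mapred.reduce.tasks": "4"}
    for old, new in find_deprecated(old_settings).items():
        print(old, "->", new)
```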
Other notes for configuration file
• hadoop-conf-pseudo does not work
• There are some mistakes
  e.g. yarn.nodemanager.aux-services:
  mapreduce.shuffle -> mapreduce_shuffle
• 2.2.0 and 2.4.0
• There are some differences
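The aux-services rename mentioned above is easy to miss when reusing an older yarn-site.xml. The following Python sketch shows one way to rewrite the value; the function name `fix_aux_services` and the sample file contents are assumptions made for illustration, and only the standard-library XML parser is used.

```python
import xml.etree.ElementTree as ET

OLD_VALUE = "mapreduce.shuffle"
NEW_VALUE = "mapreduce_shuffle"


def fix_aux_services(xml_text):
    """Return yarn-site.xml text with the old shuffle service name replaced."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.find("value")
        if name == "yarn.nodemanager.aux-services" and value is not None:
            # The value is a comma-separated list of service names.
            services = [s.strip() for s in (value.text or "").split(",")]
            value.text = ",".join(
                NEW_VALUE if s == OLD_VALUE else s for s in services
            )
    return ET.tostring(root, encoding="unicode")


# Hypothetical minimal yarn-site.xml using the old service name.
SAMPLE = """<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(fix_aux_services(SAMPLE))
```

Only the exact service name is replaced, so other entries in the comma-separated list are left untouched.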
What should we do?
• Copy the configuration files from the CDH VM and HDP VM
• Use Ambari or Cloudera Manager
• Work it out on your own!
Slots have changed (MRv1 to YARN)
• MRv1
• map slot, reduce slot
• YARN(MRv2)
• resource(container)
MRv1
mapred-site.xml
• mapred.tasktracker.map.tasks.maximum
• mapred.tasktracker.reduce.tasks.maximum
scheduler.xml
• maxMaps, minMaps
• maxReduces, minReduces
YARN (MRv2)
yarn-site.xml
• yarn.nodemanager.resource.memory-mb
• (yarn.nodemanager.vmem-pmem-ratio)
• (yarn.scheduler.minimum-allocation-mb)
mapred-site.xml
• yarn.app.mapreduce.am.resource.mb
• mapreduce.map.memory.mb
• mapreduce.reduce.memory.mb
fair-scheduler.xml
• maxResources, minResources
YARN Resource Management
yarn.nodemanager.resource.memory-mb =>
Memory on the node that the NodeManager can allocate to containers

yarn.app.mapreduce.am.resource.mb =>
Memory for the MapReduce ApplicationMaster container

mapreduce.map.memory.mb =>
Memory for each map task container

mapreduce.reduce.memory.mb =>
Memory for each reduce task container
YARN Resource Example
yarn.nodemanager.resource.memory-mb = 4096
yarn.app.mapreduce.am.resource.mb = 1024
mapreduce.map.memory.mb = 1024
mapreduce.reduce.memory.mb = 2048

MRv2 Application
  ApplicationMaster => 1
  Mapper => 3
  Reducer => 1
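One way to read the numbers in this example is as simple memory arithmetic on a single node. The sketch below is an interpretation rather than the speaker's own calculation: the function name `containers_that_fit` is hypothetical, and it assumes one node, one ApplicationMaster, memory as the only resource, and no rounding of requests by the scheduler.

```python
def containers_that_fit(node_mb, am_mb, task_mb):
    """Count task containers that fit on one node next to one ApplicationMaster."""
    remaining = node_mb - am_mb
    if remaining < 0:
        raise ValueError("the ApplicationMaster does not fit on the node")
    return remaining // task_mb


NODE_MB = 4096    # yarn.nodemanager.resource.memory-mb
AM_MB = 1024      # yarn.app.mapreduce.am.resource.mb
MAP_MB = 1024     # mapreduce.map.memory.mb
REDUCE_MB = 2048  # mapreduce.reduce.memory.mb

if __name__ == "__main__":
    print("mappers:", containers_that_fit(NODE_MB, AM_MB, MAP_MB))
    print("reducers:", containers_that_fit(NODE_MB, AM_MB, REDUCE_MB))
```

With these values the ApplicationMaster takes 1024 MB, leaving 3072 MB: enough for three 1024 MB mappers, or for one 2048 MB reducer, which matches the counts on the slide.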
YARN Resource Example
In addition to this (e.g. Fair Scheduler):
  minResources
  maxResources
  maxRunningApps
  schedulingPolicy
Fair Scheduler Changes
In addition to this (e.g. Fair Scheduler):
  pool -> queue
  user.maxRunningJobs -> user.maxRunningApps
  userMaxJobsDefault -> userMaxAppsDefault
  etc…
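The renames listed above can be applied mechanically to an existing allocation file. This is a minimal Python sketch under stated assumptions: the function name `rename_allocations` and the sample file are hypothetical, and the code renames every `maxRunningJobs` element it finds, including one inside a pool, which goes slightly beyond the `user` case named on the slide. Slot-based limits such as maxMaps have no one-to-one resource equivalent and are not handled.

```python
import xml.etree.ElementTree as ET

# Element renames listed on the slide (MRv1 name -> YARN name).
RENAMES = {
    "pool": "queue",
    "maxRunningJobs": "maxRunningApps",
    "userMaxJobsDefault": "userMaxAppsDefault",
}


def rename_allocations(xml_text):
    """Rename MRv1 Fair Scheduler elements to their YARN names."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = RENAMES.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


# Hypothetical MRv1-style allocation file.
SAMPLE = """<allocations>
  <pool name="analytics">
    <maxRunningJobs>5</maxRunningJobs>
  </pool>
  <user name="user1">
    <maxRunningJobs>2</maxRunningJobs>
  </user>
  <userMaxJobsDefault>3</userMaxJobsDefault>
</allocations>"""

if __name__ == "__main__":
    print(rename_allocations(SAMPLE))
```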
yarn.nodemanager.resource.memory-mb
YARN Scheduler Management
YARN Resource Management
e.g.
  Use the hdp-configuration-utils.py script
    http://goo.gl/L2hxyq

  Use Ambari
    http://ambari.apache.org/
    (Ubuntu 12 is not supported yet; support is coming soon)
YARN Container Executor
DefaultContainerExecutor
• Launches containers as processes
• Same as before (MRv1)

LinuxContainerExecutor
• Linux only
• Some restrictions
• cgroups, etc…
YARN Directory Structure
MRv1
• Initial setup is required

YARN
• Initial setup is required
• The layout has changed from MRv1 (e.g. /tmp/hadoop-yarn/)
What should we do?
• Refer to the HDFS directory layout of the CDH VM and HDP VM
• Use Ambari or Cloudera Manager
• Work it out on your own!
Enjoy the YARN!!!
We are hiring!!!
Thanks!!!

More Related Content

What's hot

Advanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop ConsultingAdvanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop ConsultingImpetus Technologies
 
Hw09 Monitoring Best Practices
Hw09   Monitoring Best PracticesHw09   Monitoring Best Practices
Hw09 Monitoring Best PracticesCloudera, Inc.
 
Improving Hadoop Performance via Linux
Improving Hadoop Performance via LinuxImproving Hadoop Performance via Linux
Improving Hadoop Performance via LinuxAlex Moundalexis
 
Hadoop Operations at LinkedIn
Hadoop Operations at LinkedInHadoop Operations at LinkedIn
Hadoop Operations at LinkedInAllen Wittenauer
 
Hadoop for Scientific Workloads__HadoopSummit2010
Hadoop for Scientific Workloads__HadoopSummit2010Hadoop for Scientific Workloads__HadoopSummit2010
Hadoop for Scientific Workloads__HadoopSummit2010Yahoo Developer Network
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingGreat Wide Open
 
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, ClouderaHadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, ClouderaCloudera, Inc.
 
Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...
Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...
Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...Sumeet Singh
 
Keynote: Getting Serious about MySQL and Hadoop at Continuent
Keynote: Getting Serious about MySQL and Hadoop at ContinuentKeynote: Getting Serious about MySQL and Hadoop at Continuent
Keynote: Getting Serious about MySQL and Hadoop at ContinuentContinuent
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationAlex Moundalexis
 
A Basic Introduction to the Hadoop eco system - no animation
A Basic Introduction to the Hadoop eco system - no animationA Basic Introduction to the Hadoop eco system - no animation
A Basic Introduction to the Hadoop eco system - no animationSameer Tiwari
 
July 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter SlidesJuly 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter Slidesryancox
 
Hadoop Performance at LinkedIn
Hadoop Performance at LinkedInHadoop Performance at LinkedIn
Hadoop Performance at LinkedInAllen Wittenauer
 
Deview2013 SQL-on-Hadoop with Apache Tajo, and application case of SK Telecom
Deview2013 SQL-on-Hadoop with Apache Tajo, and application case of SK TelecomDeview2013 SQL-on-Hadoop with Apache Tajo, and application case of SK Telecom
Deview2013 SQL-on-Hadoop with Apache Tajo, and application case of SK TelecomNAVER D2
 
Introduction To Elastic MapReduce at WHUG
Introduction To Elastic MapReduce at WHUGIntroduction To Elastic MapReduce at WHUG
Introduction To Elastic MapReduce at WHUGAdam Kawa
 

What's hot (20)

Advanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop ConsultingAdvanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Hw09 Monitoring Best Practices
Hw09   Monitoring Best PracticesHw09   Monitoring Best Practices
Hw09 Monitoring Best Practices
 
Improving Hadoop Performance via Linux
Improving Hadoop Performance via LinuxImproving Hadoop Performance via Linux
Improving Hadoop Performance via Linux
 
Hadoop Operations at LinkedIn
Hadoop Operations at LinkedInHadoop Operations at LinkedIn
Hadoop Operations at LinkedIn
 
Hadoop for Scientific Workloads__HadoopSummit2010
Hadoop for Scientific Workloads__HadoopSummit2010Hadoop for Scientific Workloads__HadoopSummit2010
Hadoop for Scientific Workloads__HadoopSummit2010
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed Debugging
 
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, ClouderaHadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
 
Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...
Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...
Hadoop Summit Amsterdam 2014: Capacity Planning In Multi-tenant Hadoop Deploy...
 
Keynote: Getting Serious about MySQL and Hadoop at Continuent
Keynote: Getting Serious about MySQL and Hadoop at ContinuentKeynote: Getting Serious about MySQL and Hadoop at Continuent
Keynote: Getting Serious about MySQL and Hadoop at Continuent
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
 
HBase with MapR
HBase with MapRHBase with MapR
HBase with MapR
 
Hadoop 2.0 handout 5.0
Hadoop 2.0 handout 5.0Hadoop 2.0 handout 5.0
Hadoop 2.0 handout 5.0
 
A Basic Introduction to the Hadoop eco system - no animation
A Basic Introduction to the Hadoop eco system - no animationA Basic Introduction to the Hadoop eco system - no animation
A Basic Introduction to the Hadoop eco system - no animation
 
July 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter SlidesJuly 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter Slides
 
Hadoop2.2
Hadoop2.2Hadoop2.2
Hadoop2.2
 
Hadoop Performance at LinkedIn
Hadoop Performance at LinkedInHadoop Performance at LinkedIn
Hadoop Performance at LinkedIn
 
Hadoop 1.x vs 2
Hadoop 1.x vs 2Hadoop 1.x vs 2
Hadoop 1.x vs 2
 
Deview2013 SQL-on-Hadoop with Apache Tajo, and application case of SK Telecom
Deview2013 SQL-on-Hadoop with Apache Tajo, and application case of SK TelecomDeview2013 SQL-on-Hadoop with Apache Tajo, and application case of SK Telecom
Deview2013 SQL-on-Hadoop with Apache Tajo, and application case of SK Telecom
 
Introduction To Elastic MapReduce at WHUG
Introduction To Elastic MapReduce at WHUGIntroduction To Elastic MapReduce at WHUG
Introduction To Elastic MapReduce at WHUG
 

Viewers also liked

Taming YARN @ Hadoop conference Japan 2014
Taming YARN @ Hadoop conference Japan 2014Taming YARN @ Hadoop conference Japan 2014
Taming YARN @ Hadoop conference Japan 2014Tsuyoshi OZAWA
 
Sparkパフォーマンス検証
Sparkパフォーマンス検証Sparkパフォーマンス検証
Sparkパフォーマンス検証BrainPad Inc.
 
FluentdやNorikraを使った データ集約基盤への取り組み紹介
FluentdやNorikraを使った データ集約基盤への取り組み紹介FluentdやNorikraを使った データ集約基盤への取り組み紹介
FluentdやNorikraを使った データ集約基盤への取り組み紹介Recruit Technologies
 
Hivemall v0.3の機能紹介@1st Hivemall meetup
Hivemall v0.3の機能紹介@1st Hivemall meetupHivemall v0.3の機能紹介@1st Hivemall meetup
Hivemall v0.3の機能紹介@1st Hivemall meetupMakoto Yui
 
「PV、UBなどの数値からでは見えてこないユーザー行動の可視化」#yjdsw2
「PV、UBなどの数値からでは見えてこないユーザー行動の可視化」#yjdsw2「PV、UBなどの数値からでは見えてこないユーザー行動の可視化」#yjdsw2
「PV、UBなどの数値からでは見えてこないユーザー行動の可視化」#yjdsw2Yahoo!デベロッパーネットワーク
 
Hadoopカンファレンス20140707
Hadoopカンファレンス20140707Hadoopカンファレンス20140707
Hadoopカンファレンス20140707Recruit Technologies
 
Mahoutによるアルツハイマー診断支援へ向けた取り組み (Hadoop Confernce Japan 2014)
Mahoutによるアルツハイマー診断支援へ向けた取り組み (Hadoop Confernce Japan 2014)Mahoutによるアルツハイマー診断支援へ向けた取り組み (Hadoop Confernce Japan 2014)
Mahoutによるアルツハイマー診断支援へ向けた取り組み (Hadoop Confernce Japan 2014)Hadoop / Spark Conference Japan
 
実践機械学習 — MahoutとSolrを活用したレコメンデーションにおけるイノベーション - 2014/07/08 Hadoop Conference ...
実践機械学習 — MahoutとSolrを活用したレコメンデーションにおけるイノベーション - 2014/07/08 Hadoop Conference ...実践機械学習 — MahoutとSolrを活用したレコメンデーションにおけるイノベーション - 2014/07/08 Hadoop Conference ...
実践機械学習 — MahoutとSolrを活用したレコメンデーションにおけるイノベーション - 2014/07/08 Hadoop Conference ...MapR Technologies Japan
 
Shib: WebUI tool provides crossover of Hive and MPP
Shib: WebUI tool provides crossover of Hive and MPPShib: WebUI tool provides crossover of Hive and MPP
Shib: WebUI tool provides crossover of Hive and MPPSATOSHI TAGOMORI
 

Viewers also liked (12)

Taming YARN @ Hadoop conference Japan 2014
Taming YARN @ Hadoop conference Japan 2014Taming YARN @ Hadoop conference Japan 2014
Taming YARN @ Hadoop conference Japan 2014
 
Sparkパフォーマンス検証
Sparkパフォーマンス検証Sparkパフォーマンス検証
Sparkパフォーマンス検証
 
FluentdやNorikraを使った データ集約基盤への取り組み紹介
FluentdやNorikraを使った データ集約基盤への取り組み紹介FluentdやNorikraを使った データ集約基盤への取り組み紹介
FluentdやNorikraを使った データ集約基盤への取り組み紹介
 
Hivemall v0.3の機能紹介@1st Hivemall meetup
Hivemall v0.3の機能紹介@1st Hivemall meetupHivemall v0.3の機能紹介@1st Hivemall meetup
Hivemall v0.3の機能紹介@1st Hivemall meetup
 
「PV、UBなどの数値からでは見えてこないユーザー行動の可視化」#yjdsw2
「PV、UBなどの数値からでは見えてこないユーザー行動の可視化」#yjdsw2「PV、UBなどの数値からでは見えてこないユーザー行動の可視化」#yjdsw2
「PV、UBなどの数値からでは見えてこないユーザー行動の可視化」#yjdsw2
 
Hadoopカンファレンス20140707
Hadoopカンファレンス20140707Hadoopカンファレンス20140707
Hadoopカンファレンス20140707
 
Mahoutによるアルツハイマー診断支援へ向けた取り組み (Hadoop Confernce Japan 2014)
Mahoutによるアルツハイマー診断支援へ向けた取り組み (Hadoop Confernce Japan 2014)Mahoutによるアルツハイマー診断支援へ向けた取り組み (Hadoop Confernce Japan 2014)
Mahoutによるアルツハイマー診断支援へ向けた取り組み (Hadoop Confernce Japan 2014)
 
「最近傍検索とその応用」#yjdsw2
「最近傍検索とその応用」#yjdsw2「最近傍検索とその応用」#yjdsw2
「最近傍検索とその応用」#yjdsw2
 
Hcj2014 myui
Hcj2014 myuiHcj2014 myui
Hcj2014 myui
 
Gwt sdm public
Gwt sdm publicGwt sdm public
Gwt sdm public
 
実践機械学習 — MahoutとSolrを活用したレコメンデーションにおけるイノベーション - 2014/07/08 Hadoop Conference ...
実践機械学習 — MahoutとSolrを活用したレコメンデーションにおけるイノベーション - 2014/07/08 Hadoop Conference ...実践機械学習 — MahoutとSolrを活用したレコメンデーションにおけるイノベーション - 2014/07/08 Hadoop Conference ...
実践機械学習 — MahoutとSolrを活用したレコメンデーションにおけるイノベーション - 2014/07/08 Hadoop Conference ...
 
Shib: WebUI tool provides crossover of Hive and MPP
Shib: WebUI tool provides crossover of Hive and MPPShib: WebUI tool provides crossover of Hive and MPP
Shib: WebUI tool provides crossover of Hive and MPP
 

Similar to Treasure Data on The YARN - Hadoop Conference Japan 2014

[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史Insight Technology, Inc.
 
Analyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache DrillAnalyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache DrillTomer Shiran
 
Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014Tsuyoshi OZAWA
 
Analyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache DrillAnalyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache Drilltshiran
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsHortonworks
 
Hadoop-Quick introduction
Hadoop-Quick introductionHadoop-Quick introduction
Hadoop-Quick introductionSandeep Singh
 
Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0Big Data Joe™ Rossi
 
NYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache HadoopNYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache Hadoopmarkgrover
 
AWS Earth and Space 2018 - Element 84 Processing and Streaming GOES-16 Data...
AWS Earth and Space 2018 -   Element 84 Processing and Streaming GOES-16 Data...AWS Earth and Space 2018 -   Element 84 Processing and Streaming GOES-16 Data...
AWS Earth and Space 2018 - Element 84 Processing and Streaming GOES-16 Data...Dan Pilone
 
PhillyDB Talk - Beyond Batch
PhillyDB Talk - Beyond BatchPhillyDB Talk - Beyond Batch
PhillyDB Talk - Beyond Batchboorad
 
Apache Drill - Why, What, How
Apache Drill - Why, What, HowApache Drill - Why, What, How
Apache Drill - Why, What, Howmcsrivas
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesDataWorks Summit
 
Combine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNCombine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNHortonworks
 
NoSQL in Real-time Architectures
NoSQL in Real-time ArchitecturesNoSQL in Real-time Architectures
NoSQL in Real-time ArchitecturesRonen Botzer
 

Similar to Treasure Data on The YARN - Hadoop Conference Japan 2014 (20)

[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から  by NTT 小沢健史
[db tech showcase Tokyo 2014] C32: Hadoop最前線 - 開発の現場から by NTT 小沢健史
 
Analyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache DrillAnalyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache Drill
 
Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014
 
Analyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache DrillAnalyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache Drill
 
Yarnthug2014
Yarnthug2014Yarnthug2014
Yarnthug2014
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data Applications
 
Hadoop-Quick introduction
Hadoop-Quick introductionHadoop-Quick introduction
Hadoop-Quick introduction
 
MHUG - YARN
MHUG - YARNMHUG - YARN
MHUG - YARN
 
Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0
 
NYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache HadoopNYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache Hadoop
 
AWS Earth and Space 2018 - Element 84 Processing and Streaming GOES-16 Data...
AWS Earth and Space 2018 -   Element 84 Processing and Streaming GOES-16 Data...AWS Earth and Space 2018 -   Element 84 Processing and Streaming GOES-16 Data...
AWS Earth and Space 2018 - Element 84 Processing and Streaming GOES-16 Data...
 
Big data
Big dataBig data
Big data
 
PhillyDB Talk - Beyond Batch
PhillyDB Talk - Beyond BatchPhillyDB Talk - Beyond Batch
PhillyDB Talk - Beyond Batch
 
Apache Drill - Why, What, How
Apache Drill - Why, What, HowApache Drill - Why, What, How
Apache Drill - Why, What, How
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod Narasimha
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod Narasimha
 
Apache Hadoop YARN: best practices
Apache Hadoop YARN: best practicesApache Hadoop YARN: best practices
Apache Hadoop YARN: best practices
 
Yarn
YarnYarn
Yarn
 
Combine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNCombine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARN
 
NoSQL in Real-time Architectures
NoSQL in Real-time ArchitecturesNoSQL in Real-time Architectures
NoSQL in Real-time Architectures
 

More from Ryu Kobayashi

PLAZMA TD Tech Talk 2018 at Shibuya: Hive2 as a new td hadoop core engine
PLAZMA TD Tech Talk 2018 at Shibuya: Hive2 as a new td hadoop core enginePLAZMA TD Tech Talk 2018 at Shibuya: Hive2 as a new td hadoop core engine
PLAZMA TD Tech Talk 2018 at Shibuya: Hive2 as a new td hadoop core engineRyu Kobayashi
 
Huahin Framework for Hadoop, Hadoop Conference Japan 2013 Winter
Huahin Framework for Hadoop, Hadoop Conference Japan 2013 WinterHuahin Framework for Hadoop, Hadoop Conference Japan 2013 Winter
Huahin Framework for Hadoop, Hadoop Conference Japan 2013 WinterRyu Kobayashi
 
Hadoop Conference Japan 2011 Fall
Hadoop Conference Japan 2011 FallHadoop Conference Japan 2011 Fall
Hadoop Conference Japan 2011 FallRyu Kobayashi
 
Developers summit cassandraで見るNoSQL
Developers summit cassandraで見るNoSQLDevelopers summit cassandraで見るNoSQL
Developers summit cassandraで見るNoSQLRyu Kobayashi
 
Hadoopソースコードリーディング第3回 Hadopo MR + Cassandra
Hadoopソースコードリーディング第3回 Hadopo MR + CassandraHadoopソースコードリーディング第3回 Hadopo MR + Cassandra
Hadoopソースコードリーディング第3回 Hadopo MR + CassandraRyu Kobayashi
 
AWSを使ったトラッキングログ収集
AWSを使ったトラッキングログ収集AWSを使ったトラッキングログ収集
AWSを使ったトラッキングログ収集Ryu Kobayashi
 
Hadoopソースコードリーディング MapReduce障害時のフロー
Hadoopソースコードリーディング MapReduce障害時のフローHadoopソースコードリーディング MapReduce障害時のフロー
Hadoopソースコードリーディング MapReduce障害時のフローRyu Kobayashi
 

More from Ryu Kobayashi (7)

PLAZMA TD Tech Talk 2018 at Shibuya: Hive2 as a new td hadoop core engine
PLAZMA TD Tech Talk 2018 at Shibuya: Hive2 as a new td hadoop core enginePLAZMA TD Tech Talk 2018 at Shibuya: Hive2 as a new td hadoop core engine
PLAZMA TD Tech Talk 2018 at Shibuya: Hive2 as a new td hadoop core engine
 
Huahin Framework for Hadoop, Hadoop Conference Japan 2013 Winter
Huahin Framework for Hadoop, Hadoop Conference Japan 2013 WinterHuahin Framework for Hadoop, Hadoop Conference Japan 2013 Winter
Huahin Framework for Hadoop, Hadoop Conference Japan 2013 Winter
 
Hadoop Conference Japan 2011 Fall
Hadoop Conference Japan 2011 FallHadoop Conference Japan 2011 Fall
Hadoop Conference Japan 2011 Fall
 
Developers summit cassandraで見るNoSQL
Developers summit cassandraで見るNoSQLDevelopers summit cassandraで見るNoSQL
Developers summit cassandraで見るNoSQL
 
Hadoopソースコードリーディング第3回 Hadopo MR + Cassandra
Hadoopソースコードリーディング第3回 Hadopo MR + CassandraHadoopソースコードリーディング第3回 Hadopo MR + Cassandra
Hadoopソースコードリーディング第3回 Hadopo MR + Cassandra
 
AWSを使ったトラッキングログ収集
AWSを使ったトラッキングログ収集AWSを使ったトラッキングログ収集
AWSを使ったトラッキングログ収集
 
Hadoopソースコードリーディング MapReduce障害時のフロー
Hadoopソースコードリーディング MapReduce障害時のフローHadoopソースコードリーディング MapReduce障害時のフロー
Hadoopソースコードリーディング MapReduce障害時のフロー
 

Recently uploaded

React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
Software Coding for software engineering
Software Coding for software engineeringSoftware Coding for software engineering
Software Coding for software engineeringssuserb3a23b
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf31events.com
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Treasure Data on The YARN - Hadoop Conference Japan 2014