Apache Horn (Incubating)
a Large-scale Deep Learning Platform
Edward J. Yoon @eddieyoon
Oct 15, 2015 @ R3 Diva-Hall, Samsung Electronics
I am ..
● Member of Apache Software Foundation
● PMC member, committer, or mentor of
○ Apache Incubator,
○ Apache Hama, Apache Horn, Apache MRQL,
○ Apache Rya, and Apache Bigtop.
● Cloud Tech Lab, Software R&D Center.
○ HPC Cloud (Network Analysis, ML & DNN)
What’s Apache Horn?
Horn [hɔ:n]: from the Korean 혼 (魂, also 얼), meaning "mind"
● Horn is a clone of Google’s DistBelief that supports
both data and model parallelism.
○ Apache Incubator project (since Sep 2015)
○ The 9 initial members come from Samsung Electronics, Microsoft, Cldi Inc,
LINE plus, TUM, KAIST, and others.
Google’s DistBelief
● GPUs are expensive, both to buy and to rent.
● Most GPUs can hold only a relatively small amount of data in
memory, and CPU-to-GPU data transfer is very slow.
○ Therefore, the training speed-up is small when the model
does not fit in GPU memory.
● DistBelief is a framework for training deep neural networks
that avoids a GPU-only approach (for the reasons above) and
handles problems with a large number of examples and
dimensions (e.g., high-resolution images).
Google’s DistBelief
● It supports both Data and Model Parallelism
○ Data Parallelism: The training data is partitioned
across several machines, each holding its own replica
of the model. Each replica trains on its partition of
the data in parallel.
○ Model Parallelism: The layers of each model
replica are distributed across machines.
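The two kinds of parallelism can be illustrated with a minimal sketch. It is not DistBelief or Horn code: the class name ParallelismSketch, the round-robin sharding of examples, and the contiguous block assignment of layers are all assumptions chosen only to make the idea concrete.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ParallelismSketch {
  /** Data parallelism: split example indices 0..numExamples-1 round-robin over model replicas. */
  static List<List<Integer>> partitionData(int numExamples, int numReplicas) {
    List<List<Integer>> shards = new ArrayList<>();
    for (int r = 0; r < numReplicas; r++) {
      shards.add(new ArrayList<>());
    }
    for (int i = 0; i < numExamples; i++) {
      shards.get(i % numReplicas).add(i);
    }
    return shards;
  }

  /** Model parallelism: map each layer of one replica to a machine, in contiguous blocks. */
  static int[] assignLayers(int numLayers, int numMachines) {
    int[] machineOfLayer = new int[numLayers];
    for (int layer = 0; layer < numLayers; layer++) {
      machineOfLayer[layer] = (int) ((long) layer * numMachines / numLayers);
    }
    return machineOfLayer;
  }

  public static void main(String[] args) {
    System.out.println(partitionData(10, 3));               // shards of example indices
    System.out.println(Arrays.toString(assignLayers(6, 3))); // machine index per layer
  }
}
```

In a combined setup, each shard from partitionData would feed one replica, and that replica's layers would be spread over machines as in assignLayers.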
DistBelief: Basic Architecture
Each worker group performs minibatch training
in the BSP paradigm and interacts with the
Parameter Server asynchronously.
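The asynchronous interaction with a parameter server can be sketched roughly as follows. This is a toy under stated assumptions, not DistBelief's protocol: a single scalar parameter, the objective (w - 3)^2, the learning rate, and the names ParameterServerSketch, pull, and push are all made up for illustration. Each thread stands in for one worker group.

```java
final class ParameterServerSketch {
  /** Holds one shared parameter; a real server would shard many parameters across machines. */
  static final class ParameterServer {
    private double w;
    private final double learningRate;

    ParameterServer(double initial, double learningRate) {
      this.w = initial;
      this.learningRate = learningRate;
    }

    synchronized double pull() {
      return w;
    }

    synchronized void push(double gradient) {
      w -= learningRate * gradient;
    }
  }

  /** Each group repeatedly pulls a (possibly stale) parameter and pushes a gradient, with no barrier between groups. */
  static double train(int numGroups, int stepsPerGroup) {
    ParameterServer server = new ParameterServer(0.0, 0.01);
    Thread[] groups = new Thread[numGroups];
    for (int g = 0; g < numGroups; g++) {
      groups[g] = new Thread(() -> {
        for (int s = 0; s < stepsPerGroup; s++) {
          double w = server.pull();           // may already be outdated when the push arrives
          double gradient = 2.0 * (w - 3.0);  // derivative of the toy objective (w - 3)^2
          server.push(gradient);
        }
      });
    }
    for (Thread group : groups) {
      group.start();
    }
    for (Thread group : groups) {
      try {
        group.join();
      } catch (InterruptedException e) {
        throw new IllegalStateException(e);
      }
    }
    return server.pull();
  }

  public static void main(String[] args) {
    System.out.println(train(4, 500)); // ends close to the minimizer 3.0; exact value varies per run
  }
}
```

Because groups never wait for each other, a pushed gradient may be computed from an outdated parameter; the small learning rate keeps this toy stable despite that staleness.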
What’s BSP?
● Bulk Synchronous Parallel
It was developed by Leslie Valiant of
Harvard University during the 1980s.
● Iteratively:
a. Local Computation
b. Communication (Message Passing)
c. Global Barrier Synchronization
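The three steps above can be sketched with plain Java threads. This is only an illustration of the superstep pattern, not Apache Hama's API: the class name BspSketch, the ring-shaped message passing, and the initial values are assumptions. A CyclicBarrier from java.util.concurrent plays the role of the global barrier.

```java
import java.util.Arrays;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

final class BspSketch {
  /** Runs numTasks threads for the given number of supersteps and returns each task's final value. */
  static long[] run(int numTasks, int supersteps) {
    long[] value = new long[numTasks];
    // Two inbox buffers, alternated by superstep parity, so a fast task never
    // overwrites a message that a slower task has not read yet.
    long[][] inbox = new long[2][numTasks];
    CyclicBarrier barrier = new CyclicBarrier(numTasks);
    Thread[] tasks = new Thread[numTasks];
    for (int t = 0; t < numTasks; t++) {
      final int id = t;
      value[id] = id + 1; // illustrative initial state: task i starts with i + 1
      tasks[t] = new Thread(() -> {
        try {
          for (int s = 0; s < supersteps; s++) {
            long local = value[id];                     // a. local computation
            inbox[s % 2][(id + 1) % numTasks] = local;  // b. message to the right neighbour
            barrier.await();                            // c. global barrier synchronization
            value[id] = local + inbox[s % 2][id];       // consume the delivered message
          }
        } catch (InterruptedException | BrokenBarrierException e) {
          throw new IllegalStateException(e);
        }
      });
    }
    for (Thread task : tasks) {
      task.start();
    }
    for (Thread task : tasks) {
      try {
        task.join();
      } catch (InterruptedException e) {
        throw new IllegalStateException(e);
      }
    }
    return value;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(run(4, 3))); // prints [24, 20, 16, 20]
  }
}
```

Messages written before the barrier are only read after it, which is what makes each superstep's result independent of thread scheduling.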
DistBelief: Batch Optimization
The Coordinator:
1) finds stragglers (slow tasks) for better load
balancing and resource usage, similar to the
“Backup Tasks” of Google MapReduce;
2) reduces communication overhead between
the central Parameter Server and the workers,
acting something like Aggregators.
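Both coordinator duties can be illustrated with a small sketch. The heuristics here are assumptions, not taken from DistBelief or Horn: a task is flagged as a straggler when its progress falls below a chosen fraction (slack) of the median progress, and aggregation is a plain element-wise sum of the workers' gradients so that one combined message goes to the Parameter Server instead of one per worker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CoordinatorSketch {
  /** Returns the indices of tasks whose progress is below slack * median progress (hypothetical rule). */
  static List<Integer> findStragglers(double[] progress, double slack) {
    double[] sorted = progress.clone();
    Arrays.sort(sorted);
    double median = sorted[sorted.length / 2]; // upper median for even lengths
    List<Integer> stragglers = new ArrayList<>();
    for (int task = 0; task < progress.length; task++) {
      if (progress[task] < slack * median) {
        stragglers.add(task);
      }
    }
    return stragglers;
  }

  /** Sums per-worker gradients element-wise into a single update. */
  static double[] aggregate(double[][] workerGradients) {
    double[] sum = new double[workerGradients[0].length];
    for (double[] gradient : workerGradients) {
      for (int i = 0; i < sum.length; i++) {
        sum[i] += gradient[i];
      }
    }
    return sum;
  }

  public static void main(String[] args) {
    double[] progress = {0.9, 0.85, 0.2, 0.95, 0.8};
    System.out.println(findStragglers(progress, 0.5)); // prints [2]
    double[][] gradients = {{1, 2}, {3, 4}, {5, 6}};
    System.out.println(Arrays.toString(aggregate(gradients))); // prints [9.0, 12.0]
  }
}
```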
As a result:
● A CPU cluster can train deep networks significantly faster
than a GPU, without a limit on the maximum model size.
○ The CPU cluster is 10x faster than a GPU.
● A model with over 1 billion parameters was trained to
achieve better than state-of-the-art performance on the
ImageNet challenge.
Nov 2012: IBM simulates 530 billion neurons, 100 trillion synapses
* 1,572,864 processor cores, 1.5 PB memory, and 6,291,456 threads.
Wait, .. Why do we need this?
● Deep learning is likely to spur other applications beyond
speech and image recognition in the nearer term.
○ e.g., medicine, manufacturing, and transportation.
And it’s closed-source software
● We need to handle scale (both the training set and the
size of the neural networks), but many OSS projects such as Caffe,
DeepDist, Spark MLlib, Deeplearning4j, and
NeuralGiraph support only data parallelism or only model parallelism.
● So we started to clone Google’s DistBelief as
Apache Horn (Incubating).
The key idea of implementation
● .. is to use existing OSS distributed systems
○ Apache Hadoop: Distributed File System, Resource
Manager.
○ Apache Hama: a general-purpose BSP computing
engine on top of Hadoop, which can be used flexibly
for both data-parallel and graph-parallel
workloads.
Apache Hama: BSP framework
[Diagram: Task 1, Task 2, Task 3, ..., Task N run on the BSP framework (on Hama or YARN) over Hadoop HDFS.]
● Like MapReduce, the Apache Hama BSP framework schedules tasks
according to the distance between the input data of the tasks and
request nodes.
● BSP tasks are globally synchronized after performing
computations on local data and communication actions.
Global Regional Synchronization
[Diagram: Task 1 through Task 6 run on the BSP framework (on Hama or YARN) over Hadoop HDFS.]
● All tasks within the same group are synchronized
with each other.
● Each group works asynchronously as an
independent BSP job.
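Regional synchronization can be sketched by giving every group its own barrier, so tasks wait only for peers in the same group. This is an illustration, not Hama or Horn code: the class name RegionalSyncSketch and the per-group superstep counts are assumptions, and the barrier action merely counts completed supersteps to show that groups progress independently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicIntegerArray;

final class RegionalSyncSketch {
  /** Runs one barrier per group and returns how many supersteps each group completed. */
  static int[] run(int tasksPerGroup, int[] superstepsPerGroup) {
    int numGroups = superstepsPerGroup.length;
    AtomicIntegerArray completed = new AtomicIntegerArray(numGroups);
    List<Thread> tasks = new ArrayList<>();
    for (int g = 0; g < numGroups; g++) {
      final int group = g;
      // The barrier action runs once per superstep, when all tasks of this group have arrived.
      CyclicBarrier regional =
          new CyclicBarrier(tasksPerGroup, () -> completed.incrementAndGet(group));
      for (int t = 0; t < tasksPerGroup; t++) {
        tasks.add(new Thread(() -> {
          try {
            for (int s = 0; s < superstepsPerGroup[group]; s++) {
              regional.await(); // waits only for tasks in the same group
            }
          } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
          }
        }));
      }
    }
    for (Thread task : tasks) {
      task.start();
    }
    for (Thread task : tasks) {
      try {
        task.join();
      } catch (InterruptedException e) {
        throw new IllegalStateException(e);
      }
    }
    int[] result = new int[numGroups];
    for (int g = 0; g < numGroups; g++) {
      result[g] = completed.get(g);
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(run(2, new int[] {3, 5}))); // prints [3, 5]
  }
}
```

The group with fewer supersteps finishes without waiting for the other, which a single global barrier would not allow.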
Async mini-batches using Regional Synchronization
[Diagram: Task 1 through Task 6 run on the BSP framework (on Hama or YARN) over Hadoop HDFS, alongside two Parameter Servers; labeled “Parameter Swapping”.]
● Each group performs minibatch training in the BSP
paradigm and interacts with the Parameter Server
asynchronously.
Async mini-batches using Regional Synchronization
[Diagram: Task 1 through Task 6 run on the BSP framework (on Hama or YARN) over Hadoop HDFS, alongside two Parameter Servers; labeled “Parameter Swapping”.]
● One of the groups works as a Coordinator.
Neuron-centric Programming APIs
User-defined neuron-centric
programming APIs:
The activation and cost
functions compute the
propagated information or
error messages and send
their updates to the Parameter
Server (not fully
designed yet).
Similar to Google’s Pregel.
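Since the slide notes that the API is not fully designed yet, the following is only a guess at what a neuron-centric computation could look like. All names (NeuronSketch, forward, backward) are hypothetical, the sigmoid activation is an arbitrary choice, and the backward step is ordinary gradient descent on the incoming weights.

```java
import java.util.Arrays;

final class NeuronSketch {
  /** Forward pass: weighted sum of the incoming messages, squashed by a sigmoid. */
  static double forward(double[] inputs, double[] weights) {
    double sum = 0.0;
    for (int i = 0; i < inputs.length; i++) {
      sum += inputs[i] * weights[i];
    }
    return 1.0 / (1.0 + Math.exp(-sum));
  }

  /**
   * Backward pass: turns the error message received from the next layer into
   * per-weight updates, which a worker could then send to the Parameter Server.
   */
  static double[] backward(double output, double[] inputs, double errorFromNextLayer,
      double learningRate) {
    double delta = errorFromNextLayer * output * (1.0 - output); // sigmoid derivative
    double[] updates = new double[inputs.length];
    for (int i = 0; i < inputs.length; i++) {
      updates[i] = -learningRate * delta * inputs[i];
    }
    return updates;
  }

  public static void main(String[] args) {
    double[] inputs = {1.0, 2.0};
    double[] weights = {0.0, 0.0};
    double output = forward(inputs, weights); // 0.5, because the weighted sum is 0
    System.out.println(output);
    System.out.println(Arrays.toString(backward(output, inputs, 1.0, 0.1)));
  }
}
```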
Job Configuration APIs
/*
 * Sigmoid activation function
 */
public static class Sigmoid extends ActivationFunction {
  public double apply(double input) {
    return 1.0 / (1.0 + Math.exp(-input));
  }
}
...
public static void main(String[] args) {
  ANNJob ann = new ANNJob();
  // Initialize the topology of the model, one addLayer call per layer.
  // featureDimension and numOfTasks are int values chosen for each layer.
  ann.addLayer(featureDimension, Sigmoid.class, numOfTasks);
  ann.addLayer(featureDimension, Step.class, numOfTasks);
  ann.addLayer(featureDimension, Tanh.class, numOfTasks);
  …
  ann.setCostFunction(CrossEntropy.class);
  ..
}
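The job configuration refers to Step, Tanh, and CrossEntropy without showing them. The sketch below gives standard textbook definitions as plain functions; it does not use Horn's ActivationFunction class, and the step convention (1 for inputs >= 0) and the binary form of cross-entropy are assumptions.

```java
import java.util.function.DoubleUnaryOperator;

final class ActivationSketch {
  static final DoubleUnaryOperator SIGMOID = x -> 1.0 / (1.0 + Math.exp(-x));
  static final DoubleUnaryOperator STEP = x -> x >= 0.0 ? 1.0 : 0.0; // assumed threshold at 0
  static final DoubleUnaryOperator TANH = Math::tanh;

  /** Binary cross-entropy for a target in {0, 1} and a predicted output in (0, 1). */
  static double crossEntropy(double target, double output) {
    return -(target * Math.log(output) + (1.0 - target) * Math.log(1.0 - output));
  }

  public static void main(String[] args) {
    System.out.println(SIGMOID.applyAsDouble(0.0)); // 0.5
    System.out.println(STEP.applyAsDouble(-1.0));   // 0.0
    System.out.println(TANH.applyAsDouble(0.0));    // 0.0
    System.out.println(crossEntropy(1.0, 0.5));     // ln 2, about 0.693
  }
}
```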
Job Submission Flow
[Diagram elements: User’s ANN Job; Apache Horn Client and Web UI; BSP framework on Apache Hama or YARN clusters; Hadoop HDFS; Task 1 through Task 9 arranged in worker groups, annotated with “Data Parallelism” and “Model Parallelism”; two Parameter Servers, labeled “Parameter Swapping”.]
● One of the worker groups works as a Coordinator.
Horn Community
● https://horn.incubator.apache.org/
● https://issues.apache.org/jira/browse/HORN
● Mailing lists
○ dev-subscribe@horn.incubator.apache.org

More Related Content

What's hot

A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)Alexander Ulanov
 
Enterprise Scale Topological Data Analysis Using Spark
Enterprise Scale Topological Data Analysis Using SparkEnterprise Scale Topological Data Analysis Using Spark
Enterprise Scale Topological Data Analysis Using SparkAlpine Data
 
Neural Networks, Spark MLlib, Deep Learning
Neural Networks, Spark MLlib, Deep LearningNeural Networks, Spark MLlib, Deep Learning
Neural Networks, Spark MLlib, Deep LearningAsim Jalis
 
Apache hama 0.2-userguide
Apache hama 0.2-userguideApache hama 0.2-userguide
Apache hama 0.2-userguideEdward Yoon
 
Challenges in Large Scale Machine Learning
Challenges in Large Scale  Machine LearningChallenges in Large Scale  Machine Learning
Challenges in Large Scale Machine LearningSudarsun Santhiappan
 
Prediction as a service with ensemble model in SparkML and Python ScikitLearn
Prediction as a service with ensemble model in SparkML and Python ScikitLearnPrediction as a service with ensemble model in SparkML and Python ScikitLearn
Prediction as a service with ensemble model in SparkML and Python ScikitLearnJosef A. Habdank
 
Harnessing Big Data with Spark
Harnessing Big Data with SparkHarnessing Big Data with Spark
Harnessing Big Data with SparkAlpine Data
 
Machine learning at scale with Google Cloud Platform
Machine learning at scale with Google Cloud PlatformMachine learning at scale with Google Cloud Platform
Machine learning at scale with Google Cloud PlatformMatthias Feys
 
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016MLconf
 
Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...
Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...
Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...Databricks
 
Developing a Map Reduce Application
Developing a Map Reduce ApplicationDeveloping a Map Reduce Application
Developing a Map Reduce ApplicationDr. C.V. Suresh Babu
 
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016MLconf
 
Data Science and Deep Learning on Spark with 1/10th of the Code with Roope As...
Data Science and Deep Learning on Spark with 1/10th of the Code with Roope As...Data Science and Deep Learning on Spark with 1/10th of the Code with Roope As...
Data Science and Deep Learning on Spark with 1/10th of the Code with Roope As...Databricks
 
Surge: Rise of Scalable Machine Learning at Yahoo!
Surge: Rise of Scalable Machine Learning at Yahoo!Surge: Rise of Scalable Machine Learning at Yahoo!
Surge: Rise of Scalable Machine Learning at Yahoo!DataWorks Summit
 
Deep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim Hunter
Deep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim HunterDeep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim Hunter
Deep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim HunterDatabricks
 
Yggdrasil: Faster Decision Trees Using Column Partitioning In Spark
Yggdrasil: Faster Decision Trees Using Column Partitioning In SparkYggdrasil: Faster Decision Trees Using Column Partitioning In Spark
Yggdrasil: Faster Decision Trees Using Column Partitioning In SparkJen Aman
 
Slides for In-Datacenter Performance Analysis of a Tensor Processing Unit
Slides for In-Datacenter Performance Analysis of a Tensor Processing UnitSlides for In-Datacenter Performance Analysis of a Tensor Processing Unit
Slides for In-Datacenter Performance Analysis of a Tensor Processing UnitCarlo C. del Mundo
 
Stories About Spark, HPC and Barcelona by Jordi Torres
Stories About Spark, HPC and Barcelona by Jordi TorresStories About Spark, HPC and Barcelona by Jordi Torres
Stories About Spark, HPC and Barcelona by Jordi TorresSpark Summit
 

What's hot (20)

A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
 
Enterprise Scale Topological Data Analysis Using Spark
Enterprise Scale Topological Data Analysis Using SparkEnterprise Scale Topological Data Analysis Using Spark
Enterprise Scale Topological Data Analysis Using Spark
 
Neural Networks, Spark MLlib, Deep Learning
Neural Networks, Spark MLlib, Deep LearningNeural Networks, Spark MLlib, Deep Learning
Neural Networks, Spark MLlib, Deep Learning
 
Apache hama 0.2-userguide
Apache hama 0.2-userguideApache hama 0.2-userguide
Apache hama 0.2-userguide
 
Challenges in Large Scale Machine Learning
Challenges in Large Scale  Machine LearningChallenges in Large Scale  Machine Learning
Challenges in Large Scale Machine Learning
 
Prediction as a service with ensemble model in SparkML and Python ScikitLearn
Prediction as a service with ensemble model in SparkML and Python ScikitLearnPrediction as a service with ensemble model in SparkML and Python ScikitLearn
Prediction as a service with ensemble model in SparkML and Python ScikitLearn
 
Harnessing Big Data with Spark
Harnessing Big Data with SparkHarnessing Big Data with Spark
Harnessing Big Data with Spark
 
Machine learning at scale with Google Cloud Platform
Machine learning at scale with Google Cloud PlatformMachine learning at scale with Google Cloud Platform
Machine learning at scale with Google Cloud Platform
 
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
 
Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...
Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...
Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...
 
Map Reduce introduction
Map Reduce introductionMap Reduce introduction
Map Reduce introduction
 
Developing a Map Reduce Application
Developing a Map Reduce ApplicationDeveloping a Map Reduce Application
Developing a Map Reduce Application
 
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
 
Data Science and Deep Learning on Spark with 1/10th of the Code with Roope As...
Data Science and Deep Learning on Spark with 1/10th of the Code with Roope As...Data Science and Deep Learning on Spark with 1/10th of the Code with Roope As...
Data Science and Deep Learning on Spark with 1/10th of the Code with Roope As...
 
Surge: Rise of Scalable Machine Learning at Yahoo!
Surge: Rise of Scalable Machine Learning at Yahoo!Surge: Rise of Scalable Machine Learning at Yahoo!
Surge: Rise of Scalable Machine Learning at Yahoo!
 
Deep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim Hunter
Deep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim HunterDeep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim Hunter
Deep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim Hunter
 
Yggdrasil: Faster Decision Trees Using Column Partitioning In Spark
Yggdrasil: Faster Decision Trees Using Column Partitioning In SparkYggdrasil: Faster Decision Trees Using Column Partitioning In Spark
Yggdrasil: Faster Decision Trees Using Column Partitioning In Spark
 
Tutorial5
Tutorial5Tutorial5
Tutorial5
 
Slides for In-Datacenter Performance Analysis of a Tensor Processing Unit
Slides for In-Datacenter Performance Analysis of a Tensor Processing UnitSlides for In-Datacenter Performance Analysis of a Tensor Processing Unit
Slides for In-Datacenter Performance Analysis of a Tensor Processing Unit
 
Stories About Spark, HPC and Barcelona by Jordi Torres
Stories About Spark, HPC and Barcelona by Jordi TorresStories About Spark, HPC and Barcelona by Jordi Torres
Stories About Spark, HPC and Barcelona by Jordi Torres
 

Viewers also liked

(소스콘 2015 발표자료) Apache HORN, a large scale deep learning
(소스콘 2015 발표자료) Apache HORN, a large scale deep learning(소스콘 2015 발표자료) Apache HORN, a large scale deep learning
(소스콘 2015 발표자료) Apache HORN, a large scale deep learningEdward Yoon
 
Time Series Processing with Apache Spark
Time Series Processing with Apache SparkTime Series Processing with Apache Spark
Time Series Processing with Apache SparkQAware GmbH
 
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaDeep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaSpark Summit
 
Apache Zeppelin으로 데이터 분석하기
Apache Zeppelin으로 데이터 분석하기Apache Zeppelin으로 데이터 분석하기
Apache Zeppelin으로 데이터 분석하기SangWoo Kim
 
Zeppelin notebook 만들기
Zeppelin notebook 만들기Zeppelin notebook 만들기
Zeppelin notebook 만들기Soo-Kyung Choi
 
Zeppelin(Spark)으로 데이터 분석하기
Zeppelin(Spark)으로 데이터 분석하기Zeppelin(Spark)으로 데이터 분석하기
Zeppelin(Spark)으로 데이터 분석하기SangWoo Kim
 
Applied Deep Learning with Spark and Deeplearning4j
Applied Deep Learning with Spark and Deeplearning4jApplied Deep Learning with Spark and Deeplearning4j
Applied Deep Learning with Spark and Deeplearning4jDataWorks Summit
 
[모두의연구소] 쫄지말자딥러닝
[모두의연구소] 쫄지말자딥러닝[모두의연구소] 쫄지말자딥러닝
[모두의연구소] 쫄지말자딥러닝Modulabs
 

Viewers also liked (8)

(소스콘 2015 발표자료) Apache HORN, a large scale deep learning
(소스콘 2015 발표자료) Apache HORN, a large scale deep learning(소스콘 2015 발표자료) Apache HORN, a large scale deep learning
(소스콘 2015 발표자료) Apache HORN, a large scale deep learning
 
Time Series Processing with Apache Spark
Time Series Processing with Apache SparkTime Series Processing with Apache Spark
Time Series Processing with Apache Spark
 
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaDeep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
 
Apache Zeppelin으로 데이터 분석하기
Apache Zeppelin으로 데이터 분석하기Apache Zeppelin으로 데이터 분석하기
Apache Zeppelin으로 데이터 분석하기
 
Zeppelin notebook 만들기
Zeppelin notebook 만들기Zeppelin notebook 만들기
Zeppelin notebook 만들기
 
Zeppelin(Spark)으로 데이터 분석하기
Zeppelin(Spark)으로 데이터 분석하기Zeppelin(Spark)으로 데이터 분석하기
Zeppelin(Spark)으로 데이터 분석하기
 
Applied Deep Learning with Spark and Deeplearning4j
Applied Deep Learning with Spark and Deeplearning4jApplied Deep Learning with Spark and Deeplearning4j
Applied Deep Learning with Spark and Deeplearning4j
 
[모두의연구소] 쫄지말자딥러닝
[모두의연구소] 쫄지말자딥러닝[모두의연구소] 쫄지말자딥러닝
[모두의연구소] 쫄지말자딥러닝
 

Similar to Introduction to apache horn (incubating)

Big Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-onBig Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-onDony Riyanto
 
Hadoop Tutorial.ppt
Hadoop Tutorial.pptHadoop Tutorial.ppt
Hadoop Tutorial.pptSathish24111
 
Architecting and productionising data science applications at scale
Architecting and productionising data science applications at scaleArchitecting and productionising data science applications at scale
Architecting and productionising data science applications at scalesamthemonad
 
Data Analytics and Machine Learning: From Node to Cluster on ARM64
Data Analytics and Machine Learning: From Node to Cluster on ARM64Data Analytics and Machine Learning: From Node to Cluster on ARM64
Data Analytics and Machine Learning: From Node to Cluster on ARM64Ganesh Raju
 
BKK16-404B Data Analytics and Machine Learning- from Node to Cluster
BKK16-404B Data Analytics and Machine Learning- from Node to ClusterBKK16-404B Data Analytics and Machine Learning- from Node to Cluster
BKK16-404B Data Analytics and Machine Learning- from Node to ClusterLinaro
 
BKK16-408B Data Analytics and Machine Learning From Node to Cluster
BKK16-408B Data Analytics and Machine Learning From Node to ClusterBKK16-408B Data Analytics and Machine Learning From Node to Cluster
BKK16-408B Data Analytics and Machine Learning From Node to ClusterLinaro
 
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...Chris Baglieri
 
Hadoop ecosystem framework n hadoop in live environment
Hadoop ecosystem framework  n hadoop in live environmentHadoop ecosystem framework  n hadoop in live environment
Hadoop ecosystem framework n hadoop in live environmentDelhi/NCR HUG
 
The Future of Computing is Distributed
The Future of Computing is DistributedThe Future of Computing is Distributed
The Future of Computing is DistributedAlluxio, Inc.
 
A Database-Hadoop Hybrid Approach to Scalable Machine Learning
A Database-Hadoop Hybrid Approach to Scalable Machine LearningA Database-Hadoop Hybrid Approach to Scalable Machine Learning
A Database-Hadoop Hybrid Approach to Scalable Machine LearningMakoto Yui
 
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduceBIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduceMahantesh Angadi
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation HadoopVarun Narang
 
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce Framework
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce FrameworkBIGDATA- Survey on Scheduling Methods in Hadoop MapReduce Framework
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce FrameworkMahantesh Angadi
 
Hadoop online-training
Hadoop online-trainingHadoop online-training
Hadoop online-trainingGeohedrick
 
Deep Learning with Spark and GPUs
Deep Learning with Spark and GPUsDeep Learning with Spark and GPUs
Deep Learning with Spark and GPUsDataWorks Summit
 

Similar to Introduction to apache horn (incubating) (20)

Big Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-onBig Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-on
 
Hadoop Tutorial.ppt
Hadoop Tutorial.pptHadoop Tutorial.ppt
Hadoop Tutorial.ppt
 
Hadoop tutorial
Hadoop tutorialHadoop tutorial
Hadoop tutorial
 
Architecting and productionising data science applications at scale
Architecting and productionising data science applications at scaleArchitecting and productionising data science applications at scale
Architecting and productionising data science applications at scale
 
Data Analytics and Machine Learning: From Node to Cluster on ARM64
Data Analytics and Machine Learning: From Node to Cluster on ARM64Data Analytics and Machine Learning: From Node to Cluster on ARM64
Data Analytics and Machine Learning: From Node to Cluster on ARM64
 
BKK16-404B Data Analytics and Machine Learning- from Node to Cluster
BKK16-404B Data Analytics and Machine Learning- from Node to ClusterBKK16-404B Data Analytics and Machine Learning- from Node to Cluster
BKK16-404B Data Analytics and Machine Learning- from Node to Cluster
 
BKK16-408B Data Analytics and Machine Learning From Node to Cluster
BKK16-408B Data Analytics and Machine Learning From Node to ClusterBKK16-408B Data Analytics and Machine Learning From Node to Cluster
BKK16-408B Data Analytics and Machine Learning From Node to Cluster
 
Hadoop
HadoopHadoop
Hadoop
 
final report
final reportfinal report
final report
 
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
 
Hadoop ecosystem framework n hadoop in live environment
Hadoop ecosystem framework  n hadoop in live environmentHadoop ecosystem framework  n hadoop in live environment
Hadoop ecosystem framework n hadoop in live environment
 
The Future of Computing is Distributed
The Future of Computing is DistributedThe Future of Computing is Distributed
The Future of Computing is Distributed
 
A Database-Hadoop Hybrid Approach to Scalable Machine Learning
A Database-Hadoop Hybrid Approach to Scalable Machine LearningA Database-Hadoop Hybrid Approach to Scalable Machine Learning
A Database-Hadoop Hybrid Approach to Scalable Machine Learning
 
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduceBIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation Hadoop
 
F07-Cloud-Hadoop-BAM
F07-Cloud-Hadoop-BAMF07-Cloud-Hadoop-BAM
F07-Cloud-Hadoop-BAM
 
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce Framework
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce FrameworkBIGDATA- Survey on Scheduling Methods in Hadoop MapReduce Framework
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce Framework
 
Hadoop online-training
Hadoop online-trainingHadoop online-training
Hadoop online-training
 
hadoop
hadoophadoop
hadoop
 
Deep Learning with Spark and GPUs
Deep Learning with Spark and GPUsDeep Learning with Spark and GPUs
Deep Learning with Spark and GPUs
 

More from Edward Yoon

K means 알고리즘을 이용한 영화배우 클러스터링
K means 알고리즘을 이용한 영화배우 클러스터링K means 알고리즘을 이용한 영화배우 클러스터링
K means 알고리즘을 이용한 영화배우 클러스터링Edward Yoon
 
차세대하둡과 주목해야할 오픈소스
차세대하둡과 주목해야할 오픈소스차세대하둡과 주목해야할 오픈소스
차세대하둡과 주목해야할 오픈소스Edward Yoon
 
The evolution of web and big data
The evolution of web and big dataThe evolution of web and big data
The evolution of web and big dataEdward Yoon
 
Apache hama @ Samsung SW Academy
Apache hama @ Samsung SW AcademyApache hama @ Samsung SW Academy
Apache hama @ Samsung SW AcademyEdward Yoon
 
Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011Edward Yoon
 
MongoDB introduction
MongoDB introductionMongoDB introduction
MongoDB introductionEdward Yoon
 
Monitoring and mining network traffic in clouds
Monitoring and mining network traffic in cloudsMonitoring and mining network traffic in clouds
Monitoring and mining network traffic in cloudsEdward Yoon
 
Usage case of HBase for real-time application
Usage case of HBase for real-time applicationUsage case of HBase for real-time application
Usage case of HBase for real-time applicationEdward Yoon
 
Apache HAMA: An Introduction toBulk Synchronization Parallel on Hadoop
Apache HAMA: An Introduction toBulk Synchronization Parallel on HadoopApache HAMA: An Introduction toBulk Synchronization Parallel on Hadoop
Apache HAMA: An Introduction toBulk Synchronization Parallel on HadoopEdward Yoon
 
Understand Of Linear Algebra
Understand Of Linear AlgebraUnderstand Of Linear Algebra
Understand Of Linear AlgebraEdward Yoon
 
BigTable And Hbase
BigTable And HbaseBigTable And Hbase
BigTable And HbaseEdward Yoon
 

More from Edward Yoon (12)

K means 알고리즘을 이용한 영화배우 클러스터링
K means 알고리즘을 이용한 영화배우 클러스터링K means 알고리즘을 이용한 영화배우 클러스터링
K means 알고리즘을 이용한 영화배우 클러스터링
 
차세대하둡과 주목해야할 오픈소스
차세대하둡과 주목해야할 오픈소스차세대하둡과 주목해야할 오픈소스
차세대하둡과 주목해야할 오픈소스
 
The evolution of web and big data
The evolution of web and big dataThe evolution of web and big data
The evolution of web and big data
 
Apache hama @ Samsung SW Academy
Apache hama @ Samsung SW AcademyApache hama @ Samsung SW Academy
Apache hama @ Samsung SW Academy
 
Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011
 
MongoDB introduction
MongoDB introductionMongoDB introduction
MongoDB introduction
 
Monitoring and mining network traffic in clouds
Monitoring and mining network traffic in cloudsMonitoring and mining network traffic in clouds
Monitoring and mining network traffic in clouds
 
Usage case of HBase for real-time application
Usage case of HBase for real-time applicationUsage case of HBase for real-time application
Usage case of HBase for real-time application
 
Apache HAMA: An Introduction toBulk Synchronization Parallel on Hadoop
Apache HAMA: An Introduction toBulk Synchronization Parallel on HadoopApache HAMA: An Introduction toBulk Synchronization Parallel on Hadoop
Apache HAMA: An Introduction toBulk Synchronization Parallel on Hadoop
 
Understand Of Linear Algebra
Understand Of Linear AlgebraUnderstand Of Linear Algebra
Understand Of Linear Algebra
 
BigTable And Hbase
BigTable And HbaseBigTable And Hbase
BigTable And Hbase
 
Heart Proposal
Heart ProposalHeart Proposal
Heart Proposal
 

Recently uploaded

CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Introduction to Apache Horn (Incubating)

  • 1. Apache Horn (Incubating) a Large-scale Deep Learning Platform Edward J. Yoon @eddieyoon Oct 15, 2015 @ R3 Diva-Hall, Samsung Electronics
  • 2. I am .. ● Member of Apache Software Foundation ● PMC member and committer, or Mentor of ○ Apache Incubator, ○ Apache Hama, Apache Horn, Apache MRQL, ○ and Apache Rya, Apache BigTop. ● Cloud Tech Lab, Software R&D Center. ○ HPC Cloud (Network Analysis, ML & DNN)
  • 3. What’s Apache Horn? Horn [hɔ:n]: 얼(혼) 魂 = Mind ● Horn is a clone of Google’s DistBelief that supports both data and model parallelism. ○ Apache Incubator Project (since Sep 2015) ○ The 9 initial members are from Samsung Electronics, Microsoft, Cldi Inc, LINE plus, TUM, KAIST, etc.
  • 4. Google’s DistBelief ● GPUs are expensive, both to buy and to rent. ● Most GPUs can hold only a relatively small amount of data in memory, and CPU-to-GPU data transfer is very slow. ○ Therefore, the training speed-up is small when the model does not fit in GPU memory. ● DistBelief is a framework for training deep neural networks that avoids a GPU-only approach (for the above reasons) and handles problems with a large number of examples and dimensions (e.g., high-resolution images).
  • 5. Google’s DistBelief ● It supports both data and model parallelism ○ Data Parallelism: The training data is partitioned across several machines, each having its own replica of the model. Each replica trains on its partition of the data in parallel. ○ Model Parallelism: The layers of each model replica are distributed across machines.
  • 6. DistBelief: Basic Architecture Each worker group performs mini-batch training in the BSP paradigm and interacts with the Parameter Server asynchronously.
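The two kinds of partitioning on slide 5 can be illustrated with one small helper. This is a minimal sketch, not Horn or DistBelief code: the class `ParallelismSketch` and the method `partition` are hypothetical names, and contiguous, near-equal ranges are an assumed splitting policy. Applied to example indices it mimics data parallelism (one shard per model replica); applied to layer indices it mimics model parallelism (one block of layers per machine).

```java
import java.util.Arrays;

class ParallelismSketch {
    // Splits the index range [0, n) into `parts` contiguous half-open
    // ranges [start, end) whose sizes differ by at most one.
    static int[][] partition(int n, int parts) {
        int[][] ranges = new int[parts][2];
        int base = n / parts;   // minimum size of every range
        int extra = n % parts;  // the first `extra` ranges get one more item
        int start = 0;
        for (int p = 0; p < parts; p++) {
            int size = base + (p < extra ? 1 : 0);
            ranges[p][0] = start;
            ranges[p][1] = start + size;
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Data parallelism: 10 training examples over 3 model replicas.
        System.out.println(Arrays.deepToString(partition(10, 3)));
        // Model parallelism: 4 layers of one replica over 2 machines.
        System.out.println(Arrays.deepToString(partition(4, 2)));
    }
}
```

Running `main` prints `[[0, 4], [4, 7], [7, 10]]` and then `[[0, 2], [2, 4]]`.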
  • 7. What’s BSP? ● Bulk Synchronous Parallel It was developed by Leslie Valiant of Harvard University during the 1980s. ● Iteratively: a. Local Computation b. Communication (Message Passing) c. Global Barrier Synchronization
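The three BSP phases on slide 7 can be sketched with plain Java threads. This is an illustration under stated assumptions, not Apache Hama's API: `BspSketch`, `run`, and the per-worker inbox queues are hypothetical, and a `CyclicBarrier` stands in for the global barrier. Each worker computes a local value, sends it to every worker, waits at the barrier, and only then reads its inbox.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CyclicBarrier;

class BspSketch {
    // Runs one superstep on n worker threads; returns what each worker
    // computed from the messages it received.
    static long[] run(int n) {
        List<Queue<Long>> inbox = new ArrayList<>();
        for (int i = 0; i < n; i++) inbox.add(new ConcurrentLinkedQueue<>());
        CyclicBarrier barrier = new CyclicBarrier(n);
        long[] result = new long[n];
        Thread[] workers = new Thread[n];
        for (int i = 0; i < n; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                // a. Local computation: square of the 1-based worker id.
                long local = (long) (id + 1) * (id + 1);
                // b. Communication: send the local value to every worker.
                for (Queue<Long> q : inbox) q.add(local);
                // c. Global barrier: nobody reads until everyone has sent.
                try {
                    barrier.await();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
                long sum = 0;
                for (long m : inbox.get(id)) sum += m;
                result[id] = sum;
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run(4)));
    }
}
```

With four workers every entry of the result is 1 + 4 + 9 + 16 = 30, because the barrier guarantees that all messages have been delivered before any worker sums its inbox.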
  • 8. DistBelief: Batch Optimization The Coordinator 1) finds stragglers (slow tasks) for better load balancing and resource usage, similar to the “Backup Tasks” in Google’s MapReduce, and 2) reduces communication overhead between the central Parameter Server and the workers, somewhat like Aggregators.
  • 9. As a result: ● A CPU cluster can train deep networks significantly faster than a GPU, without a limit on the maximum model size. ○ The CPU cluster is 10x faster than a GPU. ● Trained a model with over 1 billion parameters to achieve better than state-of-the-art performance on the ImageNet challenge. Nov 2012: IBM simulates 530 billion neurons and 100 trillion synapses on 1,572,864 processor cores, 1.5 PB of memory, and 6,291,456 threads.
  • 10. Wait, .. Why do we need this? ● Deep learning is likely to spur other applications beyond speech and image recognition in the nearer term. ○ e.g., medicine, manufacturing, and transportation.
  • 11. And it’s closed-source software ● We need to address scale (both the training set and the size of the neural networks), but many OSS projects such as Caffe, DeepDist, Spark MLlib, Deeplearning4j, and NeuralGiraph are data-parallel or model-parallel only. ● So we started to clone Google’s DistBelief as Apache Horn (Incubating).
  • 12. The key idea of implementation ● .. is to use existing OSS distributed systems ○ Apache Hadoop: Distributed File System, Resource Manager. ○ Apache Hama: a general-purpose BSP computing engine on top of Hadoop, which can be used for both data-parallel and graph-parallel workloads in a flexible way.
  • 13. Apache Hama: BSP framework [Diagram: Task 1, Task 2, Task 3, ..., Task N on the BSP framework (Hama or YARN) over Hadoop HDFS] Like MapReduce, the Apache Hama BSP framework schedules tasks according to the distance between the input data of the tasks and the request nodes. BSP tasks are globally synchronized after performing computations on local data and communication actions.
  • 14. Global Regional Synchronization [Diagram: Task 1 to Task 6 in groups on the BSP framework (Hama or YARN) over Hadoop HDFS] Like MapReduce, the Apache Hama BSP framework schedules tasks according to the distance between the input data of the tasks and the request nodes. All tasks within the same group are synchronized with each other. Each group works asynchronously as an independent BSP job.
  • 15. Async mini-batches using Regional Synchronization [Diagram: Task 1 to Task 6 in groups on the BSP framework (Hama or YARN) over Hadoop HDFS, swapping parameters with two Parameter Servers] Like MapReduce, the Apache Hama BSP framework schedules tasks according to the distance between the input data of the tasks and the request nodes. Each group performs mini-batch training in the BSP paradigm and interacts with the Parameter Server asynchronously.
  • 16. Async mini-batches using Regional Synchronization [Diagram: Task 1 to Task 6 in groups on the BSP framework (Hama or YARN) over Hadoop HDFS, swapping parameters with two Parameter Servers] Like MapReduce, the Apache Hama BSP framework schedules tasks according to the distance between the input data of the tasks and the request nodes. One of the groups works as a Coordinator. Each group performs mini-batch training in the BSP paradigm and interacts with the Parameter Server asynchronously.
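The asynchronous interaction between worker groups and the Parameter Server (slides 15 and 16) can be sketched as follows. This is a simplified illustration, not Horn's implementation: `AsyncParamServerSketch`, `ParameterServer`, `pull`, and `push` are hypothetical names, each worker group is collapsed into a single thread (so the BSP synchronization inside a group is omitted), and the model is a one-weight linear fit of y = 2x trained with plain gradient descent.

```java
class AsyncParamServerSketch {
    /** Hypothetical in-memory stand-in for a parameter server holding one weight. */
    static class ParameterServer {
        private double w = 0.0;
        private int updates = 0;

        synchronized double pull() { return w; }

        // Applies one gradient step; callers never wait for each other's batches.
        synchronized void push(double gradient, double lr) {
            w -= lr * gradient;
            updates++;
        }

        synchronized int updates() { return updates; }
    }

    // Mean gradient of 0.5 * (w * x - y)^2 over one mini-batch.
    static double gradient(double w, double[] xs, double[] ys) {
        double g = 0.0;
        for (int i = 0; i < xs.length; i++) g += (w * xs[i] - ys[i]) * xs[i];
        return g / xs.length;
    }

    // One thread per data partition; each repeatedly pulls, computes, and pushes.
    static ParameterServer train(double[][] xParts, double[][] yParts, int steps, double lr) {
        ParameterServer ps = new ParameterServer();
        Thread[] groups = new Thread[xParts.length];
        for (int g = 0; g < groups.length; g++) {
            final double[] xs = xParts[g];
            final double[] ys = yParts[g];
            groups[g] = new Thread(() -> {
                for (int s = 0; s < steps; s++) {
                    double w = ps.pull();              // may already be stale
                    ps.push(gradient(w, xs, ys), lr);  // asynchronous update
                }
            });
            groups[g].start();
        }
        for (Thread t : groups) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        return ps;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 2}, {3, 4}};  // two data partitions
        double[][] y = {{2, 4}, {6, 8}};  // targets for y = 2x
        ParameterServer ps = train(x, y, 2000, 0.02);
        System.out.println("w = " + ps.pull() + " after " + ps.updates() + " updates");
    }
}
```

Because a group may push a gradient computed from a weight that another group has since changed, the exact trajectory varies from run to run, but with this small learning rate the weight settles close to 2.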
  • 17. Neuron-centric Programming APIs User-defined neuron-centric programming APIs: the activation and cost functions compute the propagated information or error messages and send their updates to the Parameter Server (not fully designed yet). Similar to Google’s Pregel.
  • 18. Job Configuration APIs
        /*
         * Sigmoid Activation Function
         */
        public static class Sigmoid extends ActivationFunction {
          public double apply(double input) {
            return 1.0 / (1 + Math.exp(-input));
          }
        }
        ...
        public static void main(String[] args) {
          ANNJob ann = new ANNJob();
          // Initialize the topology of the model
          ann.addLayer(int featureDimension, Sigmoid.class, int numOfTasks);
          ann.addLayer(int featureDimension, Step.class, int numOfTasks);
          ann.addLayer(int featureDimension, Tanh.class, int numOfTasks);
          …
          ann.setCostFunction(CrossEntropy.class);
          ..
        }
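To make the activation functions named on slide 18 concrete, here is a stand-alone sketch that does not use `ANNJob` or any Horn class. `ActivationSketch` and `forward` are hypothetical names, and the threshold of 0 for the step function is an assumption; the slide shows only the Sigmoid body.

```java
import java.util.function.DoubleUnaryOperator;

class ActivationSketch {
    // Same formula as the Sigmoid class on the slide.
    static final DoubleUnaryOperator SIGMOID = v -> 1.0 / (1.0 + Math.exp(-v));
    static final DoubleUnaryOperator TANH = Math::tanh;
    // Assumed definition: 1 for non-negative input, 0 otherwise.
    static final DoubleUnaryOperator STEP = v -> v >= 0 ? 1.0 : 0.0;

    // One fully connected layer: out[j] = f(bias[j] + sum_i weights[j][i] * in[i]).
    static double[] forward(double[] in, double[][] weights, double[] bias,
                            DoubleUnaryOperator f) {
        double[] out = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double sum = bias[j];
            for (int i = 0; i < in.length; i++) sum += weights[j][i] * in[i];
            out[j] = f.applyAsDouble(sum);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] in = {1.0, 2.0};
        double[][] w = {{0.5, -0.25}};  // one neuron with two inputs
        double[] b = {0.0};
        // The weighted sum is 0.5 - 0.5 = 0, so the sigmoid output is 0.5.
        System.out.println(forward(in, w, b, SIGMOID)[0]);
    }
}
```

Passing the activation as a function value is only a convenience for this sketch; the slide's API passes activation classes such as `Sigmoid.class` instead.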
  • 19. Job Submission Flow [Diagram: the user’s ANN job is submitted through the Apache Horn Client and Web UI to the BSP framework on Apache Hama or YARN clusters. Tasks 1 to 9 read from Hadoop HDFS and are arranged for data parallelism and model parallelism, swapping parameters with two Parameter Servers. One of the worker groups works as a Coordinator.]
  • 20. Horn Community ● https://horn.incubator.apache.org/ ● https://issues.apache.org/jira/browse/HORN ● Mailing lists ○ dev-subscribe@horn.incubator.apache.org