Course Instructor: Dr. Zarifzadeh
Presented by: Pouyan Rezazadeh, Ali Rezaie
Apache Storm
Introduction
Hadoop and related technologies have made it possible to store and process data at large scale.
Unfortunately, these data processing technologies are not realtime systems: Hadoop does batch processing, not realtime processing.
Batch processing
  Jobs are processed in batches.
  A batch job can take hours.
  E.g. a billing system
Realtime processing
  Jobs are processed one by one, as soon as they arrive.
  E.g. an airline system
Realtime data processing at massive scale is becoming more and more of a requirement for businesses.
The lack of a "Hadoop of realtime" has become the biggest hole in the data processing ecosystem.
There's no hack that will turn Hadoop into a realtime system.
Solution: Apache Storm
Apache Storm
A distributed realtime computation system
Open-sourced in 2011
Implemented in Clojure (a dialect of Lisp), with some Java
Advantages
Free, simple and open source
Can be used with any programming language
Very fast
Scalable
Fault-tolerant
Guarantees your data will be processed
Integrates with any database technology
Extremely robust
Storm Use Cases
[Figure: use-case examples, not recoverable from the slide text] ... and many others.
Storm vs Hadoop
A Storm cluster is superficially similar to a
Hadoop cluster.
Hadoop runs "MapReduce jobs", while Storm
runs "topologies".
A MapReduce job eventually finishes, whereas a
topology processes messages forever (or until you
kill it).
Spouts and Bolts
A stream is an unbounded sequence of tuples.
A spout is a source of streams. For example, a spout may read tuples off a queue and emit them as a stream.
A bolt consumes any number of input streams, does some processing, and possibly emits new streams.
Each node (spout or bolt) in a Storm topology executes in parallel.
[Figure: example topology with Spout 1, Spout 2 and Bolts 1 to 4]
Architecture
A machine in a Storm cluster may run one or more worker processes.
Each topology has one or more worker processes.
Each worker process runs executors (threads) for a specific topology.
Each executor runs one or more tasks of the same component (spout or bolt).
[Figure: a worker process containing an executor that runs several tasks]
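To make the worker/executor/task hierarchy concrete, the plain-Java sketch below spreads the tasks of one component over its executors. It is only an illustration: the class and method names are hypothetical, and the even split is an assumption about the assignment policy, not Storm's scheduler code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Storm: models how many tasks each executor (thread) runs.
final class ExecutorLayoutSketch {

    // Spreads numTasks over numExecutors as evenly as possible (assumed policy).
    // Every executor must end up with at least one task.
    static List<Integer> tasksPerExecutor(int numTasks, int numExecutors) {
        if (numExecutors < 1 || numTasks < numExecutors) {
            throw new IllegalArgumentException("need at least one task per executor");
        }
        List<Integer> layout = new ArrayList<>();
        int base = numTasks / numExecutors;
        int extra = numTasks % numExecutors;
        for (int i = 0; i < numExecutors; i++) {
            // The first 'extra' executors take one additional task.
            layout.add(base + (i < extra ? 1 : 0));
        }
        return layout;
    }

    public static void main(String[] args) {
        // A component with 4 tasks and 2 executors: each thread runs 2 tasks.
        System.out.println(tasksPerExecutor(4, 2)); // prints [2, 2]
        // 5 tasks on 2 executors: one thread runs 3 tasks, the other 2.
        System.out.println(tasksPerExecutor(5, 2)); // prints [3, 2]
    }
}
```

In this model the number of executors can be changed without changing the number of tasks, which is the point of keeping the two notions separate.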
[Figure: Nimbus, a ZooKeeper cluster, and several Supervisors]
Storm daemons and their Hadoop v1 counterparts:
Nimbus (only 1; counterpart of the Hadoop v1 JobTracker)
  distributes code around the cluster
  assigns tasks to machines/supervisors
  monitors for failures
Supervisor (many; counterpart of the Hadoop v1 TaskTracker)
  listens for work assigned to its machine
  starts and stops worker processes as necessary, based on what Nimbus has assigned to it
ZooKeeper
  coordinates between Nimbus and the Supervisors
Nimbus and the Supervisors are stateless; all state is kept in ZooKeeper or on local disk.
1 ZK instance per machine
When Nimbus or a Supervisor fails, it starts back up as if nothing had happened.
A topology is submitted by running its main class with the storm command:
    storm jar all-my-code.jar org.apache.storm.MyTopology arg1 arg2
A running topology consists of many worker processes spread across many machines.
[Figure: a topology made up of two worker processes, each running several tasks]
Topology With Tasks in Detail
[Figure not recoverable from the slide text]
Stream Groupings
Shuffle grouping: randomized round-robin; tuples are spread randomly over the consumer's tasks so that each task receives roughly the same number.
Fields grouping: all tuples with the same field value(s) are always routed to the same task.
Direct grouping: the producer of the tuple decides which task of the consumer will receive it.
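The routing rules behind shuffle and fields grouping can be sketched in plain Java. The code below is a hypothetical model for illustration only: it does not use Storm's classes, and the hash-modulo rule and the reshuffle-per-round chooser are assumptions that reproduce the behaviour described on the slide, not Storm's actual implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical model of grouping decisions; it does not use or reproduce Storm's classes.
final class GroupingSketch {

    // Fields grouping: the target task depends only on the field value,
    // so equal values always map to the same task index.
    static int fieldsGroupingTask(String fieldValue, int numTasks) {
        return Math.floorMod(fieldValue.hashCode(), numTasks);
    }

    // Shuffle grouping as "randomized round-robin": visit every task once per round,
    // in a random order that is reshuffled at the start of each round.
    static final class ShuffleChooser {
        private final List<Integer> tasks = new ArrayList<>();
        private final Random random;
        private int next = 0;

        ShuffleChooser(int numTasks, Random random) {
            this.random = random;
            for (int task = 0; task < numTasks; task++) {
                tasks.add(task);
            }
            Collections.shuffle(tasks, random);
        }

        int nextTask() {
            if (next == tasks.size()) {
                Collections.shuffle(tasks, random);
                next = 0;
            }
            return tasks.get(next++);
        }
    }

    public static void main(String[] args) {
        int numTasks = 4;
        ShuffleChooser shuffle = new ShuffleChooser(numTasks, new Random());
        for (String word : "the dog ate my dog food".split(" ")) {
            System.out.println(word + " -> fields task " + fieldsGroupingTask(word, numTasks)
                    + ", shuffle task " + shuffle.nextTask());
        }
    }
}
```

Both occurrences of "dog" get the same fields-grouping task, while their shuffle-grouping tasks are unrelated to the word. This is why a word counter needs fields grouping: all counts for one word must accumulate in a single task.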
Sample Code for Configuring a Topology
    TopologyBuilder topologyBuilder = new TopologyBuilder();
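Only the first line of the slide's code survives. As a hedged illustration of what configuring a topology typically involves (not a reconstruction of the presenters' code), the sketch below wires the four word-count components from the sample slides later in this deck. The class name WordCountTopology, the component IDs, the topology name and all parallelism values are assumptions chosen for the example. It needs the Storm library (org.apache.storm packages) and those four classes on the classpath.

```java
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;

// Assumed class name; SentenceSpout and the three bolts are the classes from the sample slides.
public class WordCountTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder topologyBuilder = new TopologyBuilder();

        // One executor for the spout (illustrative parallelism hint).
        topologyBuilder.setSpout("sentence-spout", new SentenceSpout(), 1);

        // Any splitter task can handle any sentence, so shuffle grouping is enough.
        // Two executors share four tasks.
        topologyBuilder.setBolt("split-bolt", new SplitSentenceBolt(), 2)
                .setNumTasks(4)
                .shuffleGrouping("sentence-spout");

        // The same word must always reach the same counter task: fields grouping on "word".
        topologyBuilder.setBolt("count-bolt", new WordCountBolt(), 2)
                .fieldsGrouping("split-bolt", new Fields("word"));

        // A single reporter task receives every count.
        topologyBuilder.setBolt("report-bolt", new ReportBolt(), 1)
                .globalGrouping("count-bolt");

        Config config = new Config();
        config.setNumWorkers(2); // worker processes requested for this topology

        StormSubmitter.submitTopology("word-count", config, topologyBuilder.createTopology());
    }
}
```

Packaged into a jar, a class like this would be started with the storm jar command, passing the jar and the fully qualified class name.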
Fault Tolerance
Workers heartbeat back to Nimbus via ZooKeeper.
When a worker dies, the supervisor will restart it.
If the worker continuously fails on startup and is unable to heartbeat to Nimbus, Nimbus will reschedule it.
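Heartbeat-based failure detection can be illustrated with a small plain-Java sketch. Everything here is hypothetical (class name, method names, the 30-second timeout); it models the idea of declaring a worker dead once its heartbeats stop, and is not Storm's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of heartbeat-based failure detection; not Storm's implementation.
final class HeartbeatMonitorSketch {
    private final long timeoutMillis;
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    HeartbeatMonitorSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a worker reports that it is alive.
    void recordHeartbeat(String workerId, long nowMillis) {
        lastHeartbeat.put(workerId, nowMillis);
    }

    // Workers whose last heartbeat is older than the timeout are considered dead
    // and would be candidates for a restart or for rescheduling.
    List<String> deadWorkers(long nowMillis) {
        List<String> dead = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastHeartbeat.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                dead.add(entry.getKey());
            }
        }
        return dead;
    }

    public static void main(String[] args) {
        HeartbeatMonitorSketch monitor = new HeartbeatMonitorSketch(30_000); // assumed timeout
        monitor.recordHeartbeat("worker-1", 0);
        monitor.recordHeartbeat("worker-2", 0);
        monitor.recordHeartbeat("worker-1", 25_000); // worker-2 stays silent
        System.out.println(monitor.deadWorkers(40_000)); // prints [worker-2]
    }
}
```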
If a supervisor node dies, Nimbus will reassign the work to other nodes.
If Nimbus dies, topologies will continue to function normally, but no reassignments can be performed until it is back.
This is in contrast to Hadoop, where all running jobs are lost if the JobTracker dies.
Preferably run ZooKeeper with at least 3 nodes, so that you can tolerate the failure of 1 ZK server.
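The "at least 3 nodes" advice follows from majority-quorum arithmetic: a ZooKeeper ensemble stays available only while more than half of its servers are up. The helper below is a small illustrative sketch with hypothetical names that works this out for a few ensemble sizes.

```java
// Majority-quorum arithmetic for a ZooKeeper ensemble (illustrative helper, hypothetical names).
final class ZkQuorumSketch {

    // Smallest number of servers that forms a strict majority.
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // Servers that may fail while a majority is still available.
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " servers: quorum " + quorumSize(n)
                    + ", tolerated failures " + toleratedFailures(n));
        }
    }
}
```

With 3 servers the quorum is 2, so 1 failure is tolerated. With 4 servers the quorum is 3 and still only 1 failure is tolerated, which is why odd ensemble sizes are the usual choice.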
A Sample Word Count Topology
Sentence Spout -> Split Sentence Bolt -> Word Count Bolt -> Report Bolt
Sentence Spout: emits sentence tuples, e.g. { "sentence": "my dog has fleas" }
Split Sentence Bolt: emits one tuple per word, e.g. { "word" : "my" }, { "word" : "dog" }, { "word" : "has" }, { "word" : "fleas" }
Word Count Bolt: emits the running count for each word, e.g. { "word" : "dog", "count" : 5 }
Report Bolt: prints the contents
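A Sample Word Count Code
    import java.util.Map;
    import org.apache.storm.spout.SpoutOutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichSpout;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Values;

    // Emits the sample sentences one at a time, cycling through them forever.
    public class SentenceSpout extends BaseRichSpout {
        private SpoutOutputCollector collector;
        private String[] sentences = {
            "my dog has fleas", "i like cold beverages", "the dog ate my homework",
            "don't have a cow man", "i don't think i like fleas"
        };
        private int index = 0;

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("sentence"));
        }
        public void open(Map config, TopologyContext context, SpoutOutputCollector collector) {
            this.collector = collector;
        }
        // Called repeatedly by Storm; emits the next sentence and wraps around at the end.
        public void nextTuple() {
            this.collector.emit(new Values(sentences[index]));
            index++;
            if (index >= sentences.length) { index = 0; }
        }
    }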
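    import java.util.Map;
    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Tuple;
    import org.apache.storm.tuple.Values;

    // Splits each incoming sentence into words and emits one tuple per word.
    public class SplitSentenceBolt extends BaseRichBolt {
        private OutputCollector collector;

        public void prepare(Map config, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
        }
        public void execute(Tuple tuple) {
            String sentence = tuple.getStringByField("sentence");
            String[] words = sentence.split(" ");
            for (String word : words) {
                this.collector.emit(new Values(word));
            }
        }
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word"));
        }
    }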
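    import java.util.HashMap;
    import java.util.Map;
    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Tuple;
    import org.apache.storm.tuple.Values;

    // Keeps a running count per word and emits the updated (word, count) pair.
    public class WordCountBolt extends BaseRichBolt {
        private OutputCollector collector;
        private HashMap<String, Long> counts = null;

        public void prepare(Map config, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            this.counts = new HashMap<String, Long>();
        }
        public void execute(Tuple tuple) {
            String word = tuple.getStringByField("word");
            Long count = this.counts.get(word);
            if (count == null) {
                count = 0L; // first time this word is seen
            }
            count++;
            this.counts.put(word, count);
            this.collector.emit(new Values(word, count));
        }
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word", "count"));
        }
    }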
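    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Tuple;

    // Remembers the latest count per word and prints all counts, sorted by word, on shutdown.
    public class ReportBolt extends BaseRichBolt {
        private HashMap<String, Long> counts = null;

        public void prepare(Map config, TopologyContext context, OutputCollector collector) {
            this.counts = new HashMap<String, Long>();
        }
        public void execute(Tuple tuple) {
            String word = tuple.getStringByField("word");
            Long count = tuple.getLongByField("count");
            this.counts.put(word, count);
        }
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // this bolt does not emit anything
        }
        // Note: Storm only guarantees that cleanup() is called in local mode.
        public void cleanup() {
            List<String> keys = new ArrayList<String>();
            keys.addAll(this.counts.keySet());
            Collections.sort(keys);
            for (String key : keys) {
                System.out.println(key + " : " + this.counts.get(key));
            }
        }
    }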