An introduction to Apache Storm
1. Apache Storm
● What is it?
● Architecture
● Storm vs Hadoop
● History
● Terms
www.semtech-solutions.co.nz
info@semtech-solutions.co.nz
2. Apache Storm – What is it?
● A real-time big data processing system
● Stream based
● Fault tolerant and distributed
● Non-persistent
● In the Apache Incubator
● Written in Clojure and Java
● Released under the Eclipse Public License
3. Apache Storm – Storm vs Hadoop
Hadoop                              Storm
● Distributed & fault tolerant      ● Distributed & fault tolerant
● Batch / file based                ● Real-time / stream based
● Master/slave plus ZooKeeper       ● Master/slave plus ZooKeeper
● Persistent, uses HDFS             ● Non-persistent
● Big data analysis                 ● Big data analysis
4. Apache Storm – Storm vs Hadoop
Hadoop versus Storm
● They are complementary technologies
● They might both be used in a single system
● Storm to process real-time streams of data
● Hadoop and MapReduce to process batched data on HDFS
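The batch versus stream distinction can be illustrated with a minimal sketch in plain Python. It uses neither the Hadoop nor the Storm API; the function names `batch_word_count` and `streaming_word_count` are hypothetical and exist only for this illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def batch_word_count(stored_lines: List[str]) -> Dict[str, int]:
    """Batch style: the whole data set is available before processing starts."""
    counts: Counter = Counter()
    for line in stored_lines:
        counts.update(line.split())
    return dict(counts)


def streaming_word_count(lines: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Stream style: emit an updated running count after every incoming line."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(line.split())
        # A snapshot is available immediately, without waiting for the input to end.
        yield dict(counts)


events = ["storm hadoop", "storm"]
batch_result = batch_word_count(events)
running_results = list(streaming_word_count(iter(events)))
```

In the batch version a single answer appears once all stored input has been read. In the streaming version each arriving line produces a fresh result, which is the property that suits continuous feeds.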
5. Apache Storm – Architecture
Storm architecture at a high level
6. Apache Storm – Architecture
● Composed of streams of tuples, bolted together
● Sourced via spouts
7. Apache Storm – Architecture
● From these components we form topologies
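How spouts and bolts combine into a topology can be sketched in a few lines of plain Python. This is a single-process toy model and not Storm's API: `sentence_spout`, `split_bolt`, `count_bolt`, and `build_topology` are hypothetical names, and the spout is finite so that the example terminates. A real topology can be a graph rather than a straight chain, and it runs distributed across worker processes.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Record = Tuple  # stand-in for a Storm tuple: an ordered list of values
Bolt = Callable[[Iterable[Record]], Iterator[Record]]


def sentence_spout() -> Iterator[Record]:
    """Spout: the source of the stream."""
    for sentence in ["the cow jumped", "the moon"]:
        yield (sentence,)


def split_bolt(stream: Iterable[Record]) -> Iterator[Record]:
    """Bolt: split each sentence tuple into one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream: Iterable[Record]) -> Iterator[Record]:
    """Bolt: emit each word together with its running count."""
    counts: Dict[str, int] = {}
    for (word,) in stream:
        counts[word] = counts.get(word, 0) + 1
        yield (word, counts[word])


def build_topology(spout: Callable[[], Iterator[Record]],
                   bolts: List[Bolt]) -> Iterator[Record]:
    """Topology: wire the spout's stream through the bolts in order."""
    stream: Iterator[Record] = spout()
    for bolt in bolts:
        stream = bolt(stream)
    return stream


output = list(build_topology(sentence_spout, [split_bolt, count_bolt]))
```

Because every stage is a generator, tuples flow through the chain one at a time instead of being collected between stages, which loosely mirrors the stream-based processing described on these slides.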
8. Apache Storm – History
What is Apache Storm's history?
● Developed by BackType
● Acquired by Twitter
● Open sourced by Twitter in Sept 2011
● Added to Apache Incubator in 2013
9. Apache Storm – Terms
● Tuple – an ordered list of elements
● Stream – an unbounded feed of tuples
● Spout – like a tap or faucet, a source of streams
● Bolt – functions, filters, etc. that process streams
● Topology – an ETL-like architecture built from spouts, streams, and bolts
● Nimbus – the master node, like the Hadoop JobTracker
● Supervisor – controls worker processes
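The first two terms, tuple and stream, can be modelled as data in a short plain-Python sketch. It does not use Storm's API; the `Reading` type, its fields, and `reading_stream` are assumptions made up for this illustration.

```python
from itertools import count, islice
from typing import Iterator, NamedTuple


class Reading(NamedTuple):
    """A tuple: an ordered list of elements, here with named positions."""
    sensor_id: str
    sequence: int


def reading_stream() -> Iterator[Reading]:
    """A stream: an unbounded feed of tuples (this generator never ends)."""
    for n in count():
        yield Reading("sensor-1", n)


# A consumer takes only what it needs; the stream itself has no end.
first_three = list(islice(reading_stream(), 3))
```

The stream is never materialised as a whole, so anything that consumes it has to work incrementally, one tuple at a time.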
10. Contact Us
● Feel free to contact us at
  – www.semtech-solutions.co.nz
  – info@semtech-solutions.co.nz
● We offer IT project consultancy
● We are happy to hear about your problems
● You pay only for the hours you need to solve your problems