© 2017 MapR Technologies
Streaming Patterns, Revolutionary
Architectures
Carol McDonald
@caroljmcdonald
Agenda
Streams Core Components
Patterns
•  Event Sourcing
•  Duality of Streams and Databases
•  Command Query Responsibility Segregation (CQRS)
•  Polyglot Persistence, Multiple Materialized Views
•  Turning the Database Upside Down
Real World Examples
•  Retail Monolith to Microservice
•  Healthcare Exchange
What’s a Stream?
Producers -> Events_Stream -> Consumers
A stream is an unbounded sequence of events carried
from a set of producers to a set of consumers.
Events
What is Streaming Data? Got Some Examples?
Data Collection
Devices
Smart Machinery Phones and Tablets Home Automation
RFID Systems Digital Signage Security Systems Medical Devices
Why Streams?
Trigger Events:
•  Stock Prices
•  User Activity
•  Sensor Data
Many Big Data sources are Event Oriented
Event Data -> Stream (Topics) -> Real-Time Analytics
Analyze Data
What if you need to analyze data as it arrives?
It was hot
at 6:05
yesterday!
Batch Processing
Analyze
6:01 P.M.: 72°
6:02 P.M.: 75°
6:03 P.M.: 77°
6:04 P.M.: 85°
6:05 P.M.: 90°
6:06 P.M.: 85°
6:07 P.M.: 77°
6:08 P.M.: 75°
Event Processing with Streams
6:05 P.M.: 90°
Topic
Stream
Temperature
Turn on the air
conditioning!
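The contrast between batch and event processing can be sketched in a few lines. This is a minimal illustration, not MapR code: the threshold value and the function name `react` are assumptions chosen to match the 90° reading on the slide.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical threshold; the slide only shows the 90° reading triggering an action.
THRESHOLD_F = 90

def react(readings: Iterable[Tuple[str, int]]) -> Iterator[str]:
    """Yield an action as soon as a reading reaches the threshold,
    instead of waiting for a batch job to analyze the whole file."""
    for time, temp in readings:
        if temp >= THRESHOLD_F:
            yield f"{time}: {temp}° -> turn on the air conditioning"

readings = [("6:01 P.M.", 72), ("6:02 P.M.", 75), ("6:03 P.M.", 77),
            ("6:04 P.M.", 85), ("6:05 P.M.", 90), ("6:06 P.M.", 85)]
actions = list(react(readings))
```

Because `react` is a generator, the action is produced when the 6:05 event arrives rather than after the full day of readings has been collected.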
Organize Data
What if you need to organize data as it arrives?
Integrating Many Data Sources and Applications
Sources
(Producers)
Applications
(Consumers)
Unorganized, Complicated, and Tightly Coupled.
Organize Data into Topics with MapR Streams
Topics Organize Events into Categories and Decouple Producers from Consumers
Consumers
MapR Cluster
Topic: Pressure
Topic: Temperature
Topic: Warnings
Consumers
Consumers
Kafka API Kafka API
Process High Volume of Data
What if you need to process a high volume of data as it arrives?
What if BP had detected problems before the oil hit the water?
•  1M samples/sec
•  High performance at
scale is necessary!
Traditional Message Queue
Huge performance hit:
•  Lots of disk I/O
Scalable Messaging with MapR Streams
Server 1
Partition1: Topic - Pressure
Partition1: Topic - Temperature
Partition1: Topic - Warning
Server 2
Partition2: Topic - Pressure
Partition2: Topic - Temperature
Partition2: Topic - Warning
Server 3
Partition3: Topic - Pressure
Partition3: Topic - Temperature
Partition3: Topic - Warning
Topics are
partitioned for
throughput and
scalability
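One common way to spread a topic over partitions is to hash a message key. The sketch below assumes a CRC32 hash and hypothetical sensor keys; the hashing used by a real Kafka API client may differ.

```python
import zlib

NUM_PARTITIONS = 3  # matches the three servers shown on the slide

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# Each partition is an independent append-only list.
partitions = {p: [] for p in range(NUM_PARTITIONS)}
for sensor_id, value in [("sensor-a", 101), ("sensor-b", 99), ("sensor-a", 102)]:
    partitions[partition_for(sensor_id)].append((sensor_id, value))
```

Messages with the same key always land in the same partition, so their relative order is preserved while different keys can be written in parallel.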
Producers are load
balanced between partitions
Kafka API
Consumers
Consumers
Consumers
Consumer groups can read in parallel
Kafka API
Partition is like a Queue
Consumers
MapR Cluster
Topic: Admission / Server 1
Topic: Admission / Server 2
Topic: Admission / Server 3
Consumers
Consumers
Partition
1
New Messages are
appended to the end
Partition
2
Partition
3
6 5 4 3 2 1
3 2 1
5 4 3 2 1
Producers
Producers
Producers
New
Message
6 5 4 3 2 1
Old
Message
Events are delivered in the order they are received, like a queue
Within a partition, messages are delivered in the order they are received
MapR Cluster
6 5 4 3 2 1
Producers
Consumer group
Read cursors
Consumer group
Unlike a queue, events are persisted even after they’re delivered
Messages remain on the partition, available to other consumers
Minimizes non-sequential disk reads and writes
MapR Cluster (1 Server)
Topic: Warning
Partition
1
3 2 1 Unread Events
Get Unread
3 2 1
Client Library
Consumer
Poll
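A partition that keeps its messages after delivery can be modeled as an append-only log plus one read cursor per consumer group. This is a toy in-memory sketch; the class and method names are assumptions, not the Kafka API.

```python
from typing import Dict, List

class Partition:
    """Append-only log; reading never removes messages."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.cursors: Dict[str, int] = {}  # consumer group -> next offset to read

    def append(self, message: str) -> int:
        """Add a message at the end and return its offset."""
        self.log.append(message)
        return len(self.log) - 1

    def poll(self, group: str, max_records: int = 10) -> List[str]:
        """Return unread messages for this group and advance only its cursor."""
        start = self.cursors.get(group, 0)
        batch = self.log[start:start + max_records]
        self.cursors[group] = start + len(batch)
        return batch

p = Partition()
for m in ["1", "2", "3"]:
    p.append(m)
first = p.poll("analytics")   # all three unread events
second = p.poll("analytics")  # nothing new for this group
other = p.poll("audit")       # a different group still sees everything
```

Each group moves at its own pace because the cursor, not the log, records what has been read.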
When Are Messages Deleted?
•  Messages can be persisted forever, or
•  Older messages can be deleted automatically based on time to live
MapR Cluster (1 Server)
Partition 1: 6 5 4 3 2 1
Older
message
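Time-to-live retention can be sketched as a filter over timestamped messages. The function name `expire` and the timestamps are hypothetical; an infinite TTL models "persisted forever".

```python
from typing import List, Tuple

def expire(log: List[Tuple[float, str]], now: float,
           ttl_seconds: float) -> List[Tuple[float, str]]:
    """Keep only messages younger than ttl_seconds.
    Passing float('inf') keeps every message."""
    return [(ts, msg) for ts, msg in log if now - ts < ttl_seconds]

log = [(0.0, "1"), (50.0, "2"), (90.0, "3")]  # (timestamp in seconds, message)
kept = expire(log, now=100.0, ttl_seconds=60.0)
forever = expire(log, now=100.0, ttl_seconds=float("inf"))
```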
Processing Same Message for Different Purposes
Consumers
Consumers
Consumers
Producers
Producers
Producers
MapR-FS
Kafka API Kafka API
Partition Fault Tolerance
Message Recovery
What if you need to recover messages in case of server failure?
Partitions are Replicated for Fault Tolerance
Producer
Producer
Server 2 Partition2: Topic - Warning
Producer
Server 1 Partition1: Topic - Warning
Server 3 Partition3: Topic - Warning
Server 2
Server 3
Server 1
Server 3
Server 1
Server 2
Partition1: Warning
Partition2: Warning Replica
Partition3: Warning Replica
Partition1: Warning Replica
Partition3: Warning Replica
Partition1: Warning Replica
Partition2: Warning Replica
Partition3: Warning
Producer
Producer
Producer
Server 1
Server 2
Server 3
Security Investigation &
Event Management
Operational
Intelligence
Real-time Analytics
Partition2: Warning
Partitions are Replicated for Fault Tolerance
Streams and High Availability
Real-time Access
What if you need real-time access to live data distributed across multiple clusters
and multiple data centers?
Streams and Replication
Streams:
•  can be replicated worldwide
Topic: A
Topic: B
Topic: C
Topic: A
Topic: B
Topic: C
Replicating to
another
cluster
Streams:
•  high availability
•  disaster recovery
Streams and Replication
Topic: A
Topic: B
Topic: C
Fail Over
Patterns
Patterns
Batch
Architecture
mins - hrs
Streaming
Architecture
ms - secs
Event Sourcing
Updates
Imagine each event as a change to an entry in a database.
Change log (4 3 2 1): queue of all deposit and withdrawal events
1: WillO : Deposit : 100.00
2: BradA : Deposit : 50.00
3: BradA : Withdraw : 30.00
4: WillO : Withdraw : 20.00
Table: current account balances
Account Id    Balance
WillO         80.00
BradA         20.00
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
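The relationship between the change log and the balance table can be checked by replaying the events. This is a minimal sketch using the four events from the slide; the function name `replay` is an assumption.

```python
from decimal import Decimal
from typing import Dict, Iterable, Tuple

Event = Tuple[int, str, str, Decimal]  # (sequence, account id, kind, amount)

CHANGE_LOG = [
    (1, "WillO", "Deposit", Decimal("100.00")),
    (2, "BradA", "Deposit", Decimal("50.00")),
    (3, "BradA", "Withdraw", Decimal("30.00")),
    (4, "WillO", "Withdraw", Decimal("20.00")),
]

def replay(events: Iterable[Event]) -> Dict[str, Decimal]:
    """Fold the change log, in order, into a table of current balances."""
    balances: Dict[str, Decimal] = {}
    for _, account, kind, amount in events:
        delta = amount if kind == "Deposit" else -amount
        balances[account] = balances.get(account, Decimal("0")) + delta
    return balances

balances = replay(CHANGE_LOG)
balances_after_two = replay(CHANGE_LOG[:2])  # state at an earlier point in the log
```

The table can always be rebuilt from the log, but the log cannot be rebuilt from the table, since the table keeps only the latest value per account.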
Replication
Change Log
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
3 2 1 3 2 1
3 2 1
Duality of Streams and Tables
Master:
Append writes
Slave:
Apply writes in order
Which Makes a Better System of Record?
Which of these can be used to reconstruct the other?
1: WillO : Deposit : 100.00
2: BradA : Deposit : 50.00
3: BradA : Withdraw : 30.00
4: WillO : Withdraw: 20.00
Account Id Balance
WillO 80.00
BradA 20.00
Change Log
3 2 1
Rewind: Reprocessing Events
MapR Cluster
Producers
6 5 4 3 2 1
Reprocess from
oldest
Consumer
Create new view, Index, cache
Rewind: Reprocessing Events
MapR Cluster
Producers
6 5 4 3 2 1
To Newest
Consumer new view
Read from
new view
Event Sourcing, Command Query Responsibility Segregation:
Turning the Database Upside Down
Events -> Updates -> ???
Key-Val, Document, Graph, Wide Column, Time Series, Relational
What Else Do I Use My Stream For?
Lineage - “how did BradA’s balance get so low?”
Auditing - “who deposited/withdrew from BradA’s account?”
History – to see the status of the accounts last year
Integrity - “can I trust this data hasn’t been tampered with?”
•  Yup - Streams are immutable
0: WillO : Deposit : 100.00
1: BradA : Deposit : 50.00
2: BradA : Withdraw : 30.00
3: WillO : Withdraw: 20.00
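Lineage and history questions reduce to simple scans over the immutable stream. The sketch below uses the four events above; the function names `lineage` and `balance_as_of` are hypothetical.

```python
from decimal import Decimal
from typing import List, Tuple

Event = Tuple[int, str, str, Decimal]  # (offset, account id, kind, amount)

STREAM: List[Event] = [
    (0, "WillO", "Deposit", Decimal("100.00")),
    (1, "BradA", "Deposit", Decimal("50.00")),
    (2, "BradA", "Withdraw", Decimal("30.00")),
    (3, "WillO", "Withdraw", Decimal("20.00")),
]

def lineage(account: str) -> List[Event]:
    """Lineage/auditing: every event that touched the account, in order."""
    return [e for e in STREAM if e[1] == account]

def balance_as_of(account: str, offset: int) -> Decimal:
    """History: the balance after replaying events up to and including offset."""
    total = Decimal("0")
    for off, acct, kind, amount in STREAM:
        if off > offset:
            break
        if acct == account:
            total += amount if kind == "Deposit" else -amount
    return total
```

For example, `lineage("BradA")` returns the deposit and the withdrawal that explain why the balance is low, and `balance_as_of("BradA", 1)` gives the balance before the withdrawal.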
What Do I Need For This to Work?
Infinitely persisted events
A way to query your persisted stream data
An integrated security model across the stream and databases
Examples with Patterns
Breaking Up Online Shopping Item Ratings into Microservices
Concurrency
bottleneck
Separate Write from Read using CQRS
Command Query Responsibility Segregation
Separate the Rate Item write “command”
from the Get Item Ratings read “query” using event sourcing
{
  "itemid": "sku124",
  "rating": "4",
  "userid": "cmcdonald",
  "comment": "works well"
}
{
  "itemid": "sku124",
  "pname": "bluetooth earbud",
  "ratings": [
    {
      "rating": "4",
      "userid": "cmcdonald",
      "comment": "works well"
    },
    {
      "rating": "1",
      "userid": "diego",
      "comment": "hated it"
    }
  ]
}
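How the write-side rating events become the read-side item document can be sketched as a small projection. The function name `apply_rating` and the `product_names` lookup are assumptions; the field names and values come from the two JSON documents above.

```python
from typing import Dict

def apply_rating(view: Dict[str, dict], event: dict,
                 product_names: Dict[str, str]) -> None:
    """Fold one 'rate item' event into the per-item read model."""
    item = view.setdefault(event["itemid"], {
        "itemid": event["itemid"],
        "pname": product_names.get(event["itemid"], ""),
        "ratings": [],
    })
    item["ratings"].append(
        {k: event[k] for k in ("rating", "userid", "comment")})

# Write side: events appended to the stream by the Rate Item command.
events = [
    {"itemid": "sku124", "rating": "4", "userid": "cmcdonald", "comment": "works well"},
    {"itemid": "sku124", "rating": "1", "userid": "diego", "comment": "hated it"},
]
product_names = {"sku124": "bluetooth earbud"}  # hypothetical product catalog

# Read side: the materialized view served by Get Item Ratings.
view: Dict[str, dict] = {}
for e in events:
    apply_rating(view, e, product_names)
```

The query path reads `view["sku124"]` directly and never touches the write path, which is the point of separating the two models.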
NoSQL Scaling Fast Reads and Writes
Design your schema so that the data that is read together is
stored together
Event Sourcing: New Uses of Data
Add new Services like Recommendations
Fraud Detection
Point of Sale -> Data Center: Is the Transaction Fraud?
•  Lots of requests
•  Need answer within ~50-100 milliseconds
Data
Center
Point of Sale
Location, time, card#
Fraud yes/no?
Traditional Solution
POS
1..n
Fraud
detector
Last card
use
1.  Look up last card use
2.  Compute the card velocity:
•  Subtract last location, time from
current location, time
3.  Update last card use
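The card velocity step can be sketched as distance divided by elapsed time. This is an illustration under stated assumptions: locations are taken to be latitude/longitude pairs, distance is computed with the haversine formula, and the `MAX_KMH` cutoff is a hypothetical value rather than anything from the slides.

```python
import math
from dataclasses import dataclass

@dataclass
class CardUse:
    lat: float
    lon: float
    epoch_seconds: float

EARTH_RADIUS_KM = 6371.0
MAX_KMH = 1000.0  # hypothetical cutoff, roughly faster than a commercial flight

def distance_km(a: CardUse, b: CardUse) -> float:
    """Great-circle distance between two uses (haversine formula)."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlambda = math.radians(b.lon - a.lon)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def card_velocity_kmh(last: CardUse, current: CardUse) -> float:
    """Speed implied by travelling from the last use to the current one."""
    hours = (current.epoch_seconds - last.epoch_seconds) / 3600.0
    if hours <= 0:
        return math.inf  # simultaneous or out-of-order uses are treated as suspicious
    return distance_km(last, current) / hours

def is_suspicious(last: CardUse, current: CardUse) -> bool:
    return card_velocity_kmh(last, current) > MAX_KMH
```

A card used in New York and then in London ten minutes later implies a speed far above the cutoff, while two uses at the same terminal an hour apart imply a speed of zero.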
What Happens Next?
POS
1..n
Fraud
detector
Last card
use
POS
1..n
Fraud
detector
POS
1..n
Fraud
detector
1.  Read last card use
2.  Compute the card velocity
3.  Update last card use
Service Isolation: Separate Read from Write
POS
1..n
Fraud
detector
Last card
use
Updater
card activity
Read
Read last card use
Separate Read Model from the Write Model:
Command Query Responsibility Segregation
POS
1..n
Fraud
detector
Last card
use
Updater
card activity
Read
Event last card use
Write last card use
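The split between the read model and the write model can be sketched with an in-memory list standing in for the card activity topic. All names here are hypothetical; the point is that the fraud detector only reads the table and publishes events, while the updater is the only writer.

```python
from typing import Dict, List, Optional

card_activity: List[dict] = []        # stands in for the "card activity" topic
last_card_use: Dict[str, dict] = {}   # stands in for the "Last card use" table

def fraud_detector(event: dict) -> Optional[dict]:
    """Read side: look up the previous use, then publish the new event."""
    previous = last_card_use.get(event["card"])
    card_activity.append(event)       # publish to the topic, never write the table
    return previous

def updater(from_offset: int) -> int:
    """Write side: apply events from the topic to the table, in order.
    Returns the next offset to resume from."""
    for event in card_activity[from_offset:]:
        last_card_use[event["card"]] = event
    return len(card_activity)

first_lookup = fraud_detector({"card": "c1", "location": "NYC", "time": 0})
offset = updater(0)
second_lookup = fraud_detector({"card": "c1", "location": "LA", "time": 60})
```

Because the table has a single writer, any number of fraud detectors can run in parallel against it without contending on updates.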
Event Sourcing: New Uses of Data
Processing Same Message for Different Views
POS
1..n
Fraud
detector
Last card
use
Updater
Card
location
history
Other
card activity
Scaling Through Isolation
POS
1..n
Last card
use
Updater
POS
1..n
Last card
use
Updater
card activity
Fraud
detector
Fraud
detector
Multiple fraud detectors can use the same message queue
Lessons
De-coupling and isolation are key
Propagate events, not table updates
Real World Solution
Use Case: Streaming System of Record for Healthcare
Objective:
•  Build a flexible, secure
healthcare exchange
Records Analysis
Applications
Challenges:
•  Many different data models
•  Security and privacy issues
•  HIPAA compliance
Records
ALLOY Health:
Exchange State HIE
Clinical Data Viewer
Reporting and Analytics
Clinical Data
Financial Data
Provider
Organizations
This is a PAIN!
COMPLIANCE
SECURITY CONTROLS
COMPLIANCE
FEATURES
PRIVACY
PCI DSS 3.0
21 CFR Part 11
SSAE16 / SOC2
HIPAA/HITECH
WHY NOW?
2014 FQ4 profit
$ -440 M
Total Cost Estimate
$ -12 B
Why Now? The Relational database is not the only tool
Patient 1234: patient_id 1234, Name Jon Smith, Age 50
Patient 999: patient_id 999, Name Jonathan Smith, DOB Jun 1965
Provider 86: provider_id 86, Name Dr. Nora Paige, Specialty Diabetes
Prescription 9876: rx_id 9876, Name Sitagliptin, Dosage 325mg
Relationships: Visited, Prescribed, WasPrescribed
Context and Relationships
WHY NOW? Mind the Gap
Streaming System of Record for Healthcare
Components: Records, Applications, API, pre-processor
Stream Topic (6 5 4 3 2 1): Immutable Log, the Streaming System of Record
Consumer workflows build the Materialized Views: Search, Graph DB, JSON, HBase
Microservices
Immutable Log
Components: Raw Data, App, API, pre-processor, Document Log (MapR-FS), CEP
Workflows build materialized views: Key/Value (MapR-DB), Search Engine, Graph (ArangoDB), Time Series (OpenTSDB)
Microservices, Apps
The Promised Land: Data Lineage, Audit Logging, Compliance Auditor smiley faces
Solution
Design/architecture solved some
•  Streams
•  Data Lineage/System of Record
•  Kappa Architecture (Kreps/Kleppmann)
MapR solved others
•  Unified Security
•  Replication DC to DC
•  Converge Kafka/HBase/Hadoop to one cluster
•  Multi-tenancy (lots of topics, for lots of tenants)
Real World Solution
Challenge: Major Latency from Batch File Transfer
20-30 Minutes
Regional Datacenter
File Server -> Producer (Java) -> Topic -> Consumer (Java) -> Index (Elasticsearch) -> Dashboard (Kibana)
Producer (Java):
•  Monitoring directory
•  Parsing CSV files
•  Publishing messages to topic
Consumer (Java):
•  Parsing master data
•  Subscribing topic
•  Join tables
•  Aggregation
Filtering config
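The producer and consumer steps in this pipeline can be sketched end to end in memory. The slide names Java for both roles; Python is used here only for brevity, and the CSV columns, master data and function names are all hypothetical.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical file contents and master data; the slides do not show the real schema.
CSV_TEXT = "device_id,metric,value\nd1,cpu,40\nd2,cpu,60\nd1,cpu,50\n"
MASTER = {"d1": "tokyo", "d2": "osaka"}  # device_id -> region

def produce(csv_text: str) -> List[dict]:
    """Producer: parse a CSV file into one message per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def consume(messages: List[dict],
            master: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Consumer: join each message with master data, then average per (region, metric)."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for m in messages:
        region = master.get(m["device_id"], "unknown")
        acc = sums[(region, m["metric"])]
        acc[0] += float(m["value"])
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

topic = produce(CSV_TEXT)          # stands in for publishing to the topic
averages = consume(topic, MASTER)  # stands in for the indexed aggregates
```

Processing each row as it is published is what removes the 20-30 minute wait for a batch file transfer.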
Streams and Replication
Streams:
Topic: A
Topic: B
Topic: C
Topic: A
Topic: B
Topic: C
Replicating to
another
cluster
Central Data Center
Ad-hoc
analysis
Other Data
Sources
Real-time
analysis
Reporting
Streaming
Stream
Topic
Replicating
Regional Data Centers
Stream
Topic
Stream
Topic
Performance
and other
monitoring
related data.
Aggregation of data across all regional data centers
Stream Processing
Building a Complete Data Architecture
MapR File System
(MapR-FS)
MapR Converged Data Platform
MapR Database
(MapR-DB)
MapR Streams
Sources/Apps Bulk Processing
To Learn More:
•  Streaming Architecture ebook
•  https://mapr.com/streaming-architecture-using-apache-kafka-mapr-streams/
MapR Blog
• https://www.mapr.com/blog/
To Learn More:
•  End to End Application for Monitoring Uber Data using Spark ML
•  https://mapr.com/blog/monitoring-real-time-uber-data-using-spark-machine-learning-streaming-and-kafka-api-part-1/
…helping you put data technology to work
●  Find answers
●  Ask technical questions
●  Join on-demand training course discussions
●  Follow release announcements
●  Share and vote on product ideas
●  Find Meetup and event listings
Connect with fellow Apache
Hadoop and Spark professionals
community.mapr.com
To Learn More:
•  MapR Free ODT http://learn.mapr.com/
Q&A
ENGAGE WITH US

More Related Content

What's hot

Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...Carol McDonald
 
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...Carol McDonald
 
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Carol McDonald
 
Apache Spark Machine Learning Decision Trees
Apache Spark Machine Learning Decision TreesApache Spark Machine Learning Decision Trees
Apache Spark Machine Learning Decision TreesCarol McDonald
 
Predicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine LearningPredicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine LearningCarol McDonald
 
Analysis of Popular Uber Locations using Apache APIs: Spark Machine Learning...
Analysis of Popular Uber Locations using Apache APIs:  Spark Machine Learning...Analysis of Popular Uber Locations using Apache APIs:  Spark Machine Learning...
Analysis of Popular Uber Locations using Apache APIs: Spark Machine Learning...Carol McDonald
 
Streaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka APIStreaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka APICarol McDonald
 
Analyzing Flight Delays with Apache Spark, DataFrames, GraphFrames, and MapR-DB
Analyzing Flight Delays with Apache Spark, DataFrames, GraphFrames, and MapR-DBAnalyzing Flight Delays with Apache Spark, DataFrames, GraphFrames, and MapR-DB
Analyzing Flight Delays with Apache Spark, DataFrames, GraphFrames, and MapR-DBCarol McDonald
 
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...MapR Technologies
 
Bringing Structure, Scalability, and Services to Cloud-Scale Storage
Bringing Structure, Scalability, and Services to Cloud-Scale StorageBringing Structure, Scalability, and Services to Cloud-Scale Storage
Bringing Structure, Scalability, and Services to Cloud-Scale StorageMapR Technologies
 
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...MapR Technologies
 
Streaming healthcare Data pipeline using Apache APIs: Kafka and Spark with Ma...
Streaming healthcare Data pipeline using Apache APIs: Kafka and Spark with Ma...Streaming healthcare Data pipeline using Apache APIs: Kafka and Spark with Ma...
Streaming healthcare Data pipeline using Apache APIs: Kafka and Spark with Ma...Carol McDonald
 
ML Workshop 2: Machine Learning Model Comparison & Evaluation
ML Workshop 2: Machine Learning Model Comparison & EvaluationML Workshop 2: Machine Learning Model Comparison & Evaluation
ML Workshop 2: Machine Learning Model Comparison & EvaluationMapR Technologies
 
Spark and MapR Streams: A Motivating Example
Spark and MapR Streams: A Motivating ExampleSpark and MapR Streams: A Motivating Example
Spark and MapR Streams: A Motivating ExampleIan Downard
 
Build a Time Series Application with Apache Spark and Apache HBase
Build a Time Series Application with Apache Spark and Apache  HBaseBuild a Time Series Application with Apache Spark and Apache  HBase
Build a Time Series Application with Apache Spark and Apache HBaseCarol McDonald
 
Free Code Friday - Machine Learning with Apache Spark
Free Code Friday - Machine Learning with Apache SparkFree Code Friday - Machine Learning with Apache Spark
Free Code Friday - Machine Learning with Apache SparkMapR Technologies
 
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor DataState of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor DataMathieu Dumoulin
 
NoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DBNoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DBMapR Technologies
 
MapR Streams and MapR Converged Data Platform
MapR Streams and MapR Converged Data PlatformMapR Streams and MapR Converged Data Platform
MapR Streams and MapR Converged Data PlatformMapR Technologies
 

What's hot (20)

Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...
 
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...
 
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
 
Apache Spark Machine Learning Decision Trees
Apache Spark Machine Learning Decision TreesApache Spark Machine Learning Decision Trees
Apache Spark Machine Learning Decision Trees
 
Predicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine LearningPredicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine Learning
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 
Analysis of Popular Uber Locations using Apache APIs: Spark Machine Learning...
Analysis of Popular Uber Locations using Apache APIs:  Spark Machine Learning...Analysis of Popular Uber Locations using Apache APIs:  Spark Machine Learning...
Analysis of Popular Uber Locations using Apache APIs: Spark Machine Learning...
 
Streaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka APIStreaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka API
 
Analyzing Flight Delays with Apache Spark, DataFrames, GraphFrames, and MapR-DB
Analyzing Flight Delays with Apache Spark, DataFrames, GraphFrames, and MapR-DBAnalyzing Flight Delays with Apache Spark, DataFrames, GraphFrames, and MapR-DB
Analyzing Flight Delays with Apache Spark, DataFrames, GraphFrames, and MapR-DB
 
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
 
Bringing Structure, Scalability, and Services to Cloud-Scale Storage
Bringing Structure, Scalability, and Services to Cloud-Scale StorageBringing Structure, Scalability, and Services to Cloud-Scale Storage
Bringing Structure, Scalability, and Services to Cloud-Scale Storage
 
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
 
Streaming healthcare Data pipeline using Apache APIs: Kafka and Spark with Ma...
Streaming healthcare Data pipeline using Apache APIs: Kafka and Spark with Ma...Streaming healthcare Data pipeline using Apache APIs: Kafka and Spark with Ma...
Streaming healthcare Data pipeline using Apache APIs: Kafka and Spark with Ma...
 
ML Workshop 2: Machine Learning Model Comparison & Evaluation
ML Workshop 2: Machine Learning Model Comparison & EvaluationML Workshop 2: Machine Learning Model Comparison & Evaluation
ML Workshop 2: Machine Learning Model Comparison & Evaluation
 
Spark and MapR Streams: A Motivating Example
Spark and MapR Streams: A Motivating ExampleSpark and MapR Streams: A Motivating Example
Spark and MapR Streams: A Motivating Example
 
Build a Time Series Application with Apache Spark and Apache HBase
Build a Time Series Application with Apache Spark and Apache  HBaseBuild a Time Series Application with Apache Spark and Apache  HBase
Build a Time Series Application with Apache Spark and Apache HBase
 
Free Code Friday - Machine Learning with Apache Spark
Free Code Friday - Machine Learning with Apache SparkFree Code Friday - Machine Learning with Apache Spark
Free Code Friday - Machine Learning with Apache Spark
 
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor DataState of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
 
NoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DBNoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DB
 
MapR Streams and MapR Converged Data Platform
MapR Streams and MapR Converged Data PlatformMapR Streams and MapR Converged Data Platform
MapR Streams and MapR Converged Data Platform
 

Similar to Streaming patterns revolutionary architectures

Geo-Distributed Big Data and Analytics
Geo-Distributed Big Data and AnalyticsGeo-Distributed Big Data and Analytics
Geo-Distributed Big Data and AnalyticsMapR Technologies
 
Streaming Architecture including Rendezvous for Machine Learning
Streaming Architecture including Rendezvous for Machine LearningStreaming Architecture including Rendezvous for Machine Learning
Streaming Architecture including Rendezvous for Machine LearningTed Dunning
 
MapR Edge : Act Locally Learn Globally
MapR Edge : Act Locally Learn GloballyMapR Edge : Act Locally Learn Globally
MapR Edge : Act Locally Learn Globallyridhav
 
MapR Product Update - Spring 2017
MapR Product Update - Spring 2017MapR Product Update - Spring 2017
MapR Product Update - Spring 2017MapR Technologies
 
Predictive Maintenance Using Recurrent Neural Networks
Predictive Maintenance Using Recurrent Neural NetworksPredictive Maintenance Using Recurrent Neural Networks
Predictive Maintenance Using Recurrent Neural NetworksJustin Brandenburg
 
How Spark is Enabling the New Wave of Converged Applications
How Spark is Enabling  the New Wave of Converged ApplicationsHow Spark is Enabling  the New Wave of Converged Applications
How Spark is Enabling the New Wave of Converged ApplicationsMapR Technologies
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainMapR Technologies
 
Progress for big data in Kubernetes
Progress for big data in KubernetesProgress for big data in Kubernetes
Progress for big data in KubernetesTed Dunning
 
Converged and Containerized Distributed Deep Learning With TensorFlow and Kub...
Converged and Containerized Distributed Deep Learning With TensorFlow and Kub...Converged and Containerized Distributed Deep Learning With TensorFlow and Kub...
Converged and Containerized Distributed Deep Learning With TensorFlow and Kub...Mathieu Dumoulin
 
Episode 4: Operating Kubernetes at Scale with DC/OS
Episode 4: Operating Kubernetes at Scale with DC/OSEpisode 4: Operating Kubernetes at Scale with DC/OS
Episode 4: Operating Kubernetes at Scale with DC/OSMesosphere Inc.
 
Container and Kubernetes without limits
Container and Kubernetes without limitsContainer and Kubernetes without limits
Container and Kubernetes without limitsAntje Barth
 
An Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data PlatformAn Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data PlatformMapR Technologies
 
Map r seattle streams meetup oct 2016
Map r seattle streams meetup   oct 2016Map r seattle streams meetup   oct 2016
Map r seattle streams meetup oct 2016Nitin Kumar
 
Big Data LDN 2017: How to leverage the cloud for Business Solutions
Big Data LDN 2017: How to leverage the cloud for Business SolutionsBig Data LDN 2017: How to leverage the cloud for Business Solutions
Big Data LDN 2017: How to leverage the cloud for Business SolutionsMatt Stubbs
 
MapR-DB – The First In-Hadoop Document Database
MapR-DB – The First In-Hadoop Document DatabaseMapR-DB – The First In-Hadoop Document Database
MapR-DB – The First In-Hadoop Document DatabaseMapR Technologies
 
How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications MapR Technologies
 
Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...DataWorks Summit
 
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloudJeff Hung
 
Why Stream? Advantages of Streaming Architecture #StrataData SJ 2017 presenta...
Why Stream? Advantages of Streaming Architecture #StrataData SJ 2017 presenta...Why Stream? Advantages of Streaming Architecture #StrataData SJ 2017 presenta...
Why Stream? Advantages of Streaming Architecture #StrataData SJ 2017 presenta...Ellen Friedman
 
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...MapR Technologies
 


Streaming patterns revolutionary architectures

  • 1. © 2017 MapR Technologies Streaming Patterns, Revolutionary Architectures Carol McDonald @caroljmcdonald
  • 2. Agenda: Streams Core Components. Patterns: • Event Sourcing • Duality of Streams and Databases • Command Query Responsibility Separation • Polyglot Persistence, Multiple Materialized Views • Turning the Database Upside Down. Real World Examples: • Retail Monolith to Microservice • Healthcare Exchange
  • 3. What’s a Stream? A stream is an unbounded sequence of events carried from a set of producers to a set of consumers. [Diagram: Producers → Events_Stream (Events) → Consumers]
  • 4. What is Streaming Data? Got Some Examples? Data Collection Devices: Smart Machinery, Phones and Tablets, Home Automation, RFID Systems, Digital Signage, Security Systems, Medical Devices
  • 5. Why Streams? Many Big Data sources are Event Oriented. Trigger Events: • Stock Prices • User Activity • Sensor Data. [Diagram: Event Data, Streams with Topics, Real-Time Analytics]
  • 6. Analyze Data: What if you need to analyze data as it arrives?
  • 7. Batch Processing: Analyze. 6:01 P.M.: 72°, 6:02 P.M.: 75°, 6:03 P.M.: 77°, 6:04 P.M.: 85°, 6:05 P.M.: 90°, 6:06 P.M.: 85°, 6:07 P.M.: 77°, 6:08 P.M.: 75°. Result: 90°. "It was hot at 6:05 yesterday!"
  • 8. Event Processing with Streams. [Diagram: Stream, Topic: Temperature, event 6:05 P.M.: 90°] "Turn on the air conditioning!"
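
The contrast between slides 7 and 8 can be sketched in a few lines of Python. This is only an illustration, not MapR Streams code: the readings reuse the temperatures from slide 7, while the 88° threshold and the function names `batch_peak` and `stream_alerts` are assumptions made for the example.

```python
from typing import Iterable, Iterator, List, Tuple

Reading = Tuple[str, int]  # (time of day, temperature in degrees)

# Temperatures taken from slide 7.
READINGS: List[Reading] = [
    ("6:01", 72), ("6:02", 75), ("6:03", 77), ("6:04", 85),
    ("6:05", 90), ("6:06", 85), ("6:07", 77), ("6:08", 75),
]


def batch_peak(readings: List[Reading]) -> Reading:
    """Batch style: wait until all readings are stored, then analyze them."""
    return max(readings, key=lambda r: r[1])


def stream_alerts(readings: Iterable[Reading], threshold: int = 88) -> Iterator[str]:
    """Event style: react to each reading as it arrives.

    The threshold is a hypothetical value chosen for this sketch.
    """
    for time, temp in readings:
        if temp >= threshold:
            yield f"{time} P.M.: {temp} degrees, turn on the air conditioning!"


if __name__ == "__main__":
    print("Batch (after the fact):", batch_peak(READINGS))
    for alert in stream_alerts(iter(READINGS)):
        print("Stream (as it happens):", alert)
```

The batch function can only answer once the whole data set is available; the generator emits its alert at the moment the 90° reading is consumed.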
  • 9. Organize Data: What if you need to organize data as it arrives?
  • 10. Integrating Many Data Sources and Applications. Sources (Producers), Applications (Consumers): Unorganized, Complicated, and Tightly Coupled.
  • 11. Organize Data into Topics with MapR Streams: Topics Organize Events into Categories and Decouple Producers from Consumers. [Diagram: MapR Cluster with Topic: Pressure, Topic: Temperature, Topic: Warnings; Consumers; Kafka API]
  • 12. Process High Volume of Data: What if you need to process a high volume of data as it arrives?
  • 13. What if BP had detected problems before the oil hit the water? • 1M samples/sec • High performance at scale is necessary!
  • 14. Traditional Message Queue. Huge performance hit: • Lots of disk I/O
  • 15. Scalable Messaging with MapR Streams: Topics are partitioned for throughput and scalability. Server 1: Partition1 of Topic - Pressure, Topic - Temperature, Topic - Warning. Server 2: Partition2 of Topic - Pressure, Topic - Temperature, Topic - Warning. Server 3: Partition3 of Topic - Pressure, Topic - Temperature, Topic - Warning.
  • 16. Scalable Messaging with MapR Streams: Producers are load balanced between partitions. [Diagram: Kafka API; Partition1, Partition2, Partition3 of Topic - Pressure, Topic - Temperature, Topic - Warning]
  • 17. Scalable Messaging with MapR Streams: Consumer groups can read in parallel. [Diagram: Kafka API; Partition1, Partition2, Partition3 of Topic - Pressure, Topic - Temperature, Topic - Warning; Consumers]
  • 18. Partition is like a Queue: New messages are appended to the end. [Diagram: Producers; MapR Cluster with Topic: Admission on Server 1, Server 2, Server 3; Partition 1, Partition 2, Partition 3 holding messages 6 5 4 3 2 1, 3 2 1, and 5 4 3 2 1; Consumers; New Message 6 5 4 3 2 1 Old Message]
  • 19. Events are delivered in the order they are received, like a queue. [Diagram: Producers; MapR Cluster (6 5 4 3 2 1); Read cursors; two Consumer groups]
  • 20. Unlike a queue, events are persisted even after they’re delivered: Messages remain on the partition, available to other consumers. Minimizes non-sequential disk reads and writes. [Diagram: MapR Cluster (1 Server), Topic: Warning, Partition 1 (3 2 1); Consumer, Poll, Client Library, Get Unread, Unread Events 3 2 1]
  • 21. When Are Messages Deleted? • Messages can be persisted forever • Or older messages can be deleted automatically based on time to live. [Diagram: MapR Cluster (1 Server), Partition 1 (6 5 4 3 2 1), Older message]
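
The behaviour described on slides 15 to 21 (partitioned topics, append-only partitions, one read cursor per consumer group, messages that survive delivery, and time-to-live expiry) can be modelled with a small in-memory sketch. It is a toy model, not the MapR Streams or Kafka API; the class names, the key-based partitioner, and the explicit `now` timestamps are all assumptions made for illustration.

```python
import time
from typing import Any, Dict, List, Optional, Tuple


class Partition:
    """Append-only log: reading a message does not remove it."""

    def __init__(self) -> None:
        self.messages: List[Tuple[float, Any]] = []  # (timestamp, payload)
        self.base_offset = 0  # offset of messages[0]; grows as old messages expire

    def append(self, payload: Any, now: Optional[float] = None) -> int:
        """Add a message at the end and return its offset."""
        ts = time.time() if now is None else now
        self.messages.append((ts, payload))
        return self.base_offset + len(self.messages) - 1

    def read_from(self, offset: int) -> List[Any]:
        """Return all payloads at or after the given offset, in order."""
        start = max(offset, self.base_offset) - self.base_offset
        return [payload for _, payload in self.messages[start:]]

    def expire(self, ttl_seconds: float, now: float) -> None:
        """Drop the oldest messages whose age exceeds the time to live."""
        drop = 0
        while drop < len(self.messages) and now - self.messages[drop][0] > ttl_seconds:
            drop += 1
        del self.messages[:drop]
        self.base_offset += drop


class Topic:
    def __init__(self, num_partitions: int) -> None:
        self.partitions = [Partition() for _ in range(num_partitions)]
        # One read cursor per (consumer group, partition): the next offset to read.
        self.cursors: Dict[Tuple[str, int], int] = {}

    def publish(self, key: str, payload: Any, now: Optional[float] = None) -> int:
        """Append to a partition chosen from the key; returns the partition index."""
        # Hypothetical partitioner: sum of the key's bytes modulo partition count.
        index = sum(key.encode()) % len(self.partitions)
        self.partitions[index].append(payload, now)
        return index

    def poll(self, group: str, partition: int) -> List[Any]:
        """Return messages this group has not read yet and advance its cursor."""
        part = self.partitions[partition]
        offset = self.cursors.get((group, partition), 0)
        unread = part.read_from(offset)
        self.cursors[(group, partition)] = max(offset, part.base_offset) + len(unread)
        return unread


if __name__ == "__main__":
    topic = Topic(num_partitions=3)
    p = topic.publish("sensor-1", "pressure=30", now=0.0)
    topic.publish("sensor-1", "pressure=31", now=10.0)
    print(topic.poll("analytics", p))  # both messages
    print(topic.poll("analytics", p))  # nothing new for this group
    print(topic.poll("alerting", p))   # another group still sees both messages
```

Because each consumer group only moves its own cursor, a second group can read the same messages later, which is the difference from a traditional queue that slide 20 points out.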
  • 22. Processing Same Message for Different Purposes. [Diagram: Producers, Kafka API, MapR-FS, Kafka API, Consumers]
  • 23. Partition Fault Tolerance
  • 24. Message Recovery: What if you need to recover messages in case of server failure?
  • 25. Partitions are Replicated for Fault Tolerance. [Diagram: three Producers; Server 1: Partition1: Topic - Warning (replicas on Server 2, Server 3); Server 2: Partition2: Topic - Warning (replicas on Server 1, Server 3); Server 3: Partition3: Topic - Warning (replicas on Server 1, Server 2)]
  • 26. Partitions are Replicated for Fault Tolerance. [Diagram: three Producers; Server 1: Partition1: Warning, Partition2: Warning Replica, Partition3: Warning Replica; Server 2: Partition1: Warning Replica, Partition2: Warning, Partition3: Warning Replica; Server 3: Partition1: Warning Replica, Partition2: Warning Replica, Partition3: Warning; consumers: Security Investigation & Event Management, Operational Intelligence, Real-time Analytics]
  • 27. Partitions are Replicated for Fault Tolerance. [Same diagram as slide 26]
  • 28. Partitions are Replicated for Fault Tolerance. [Same diagram as slide 26]
  • 29. Streams and High Availability
  • 30. Real-time Access: What if you need real-time access to live data distributed across multiple clusters and multiple data centers?
  • 31. Streams and Replication. Streams: • can be replicated worldwide. [Diagram: Topic: A, Topic: B, Topic: C replicating to another cluster with Topic: A, Topic: B, Topic: C]
  • 32. Streams and Replication. Streams: • high availability • disaster recovery. [Diagram: Topic: A, Topic: B, Topic: C; Fail Over]
  • 33. Patterns
  • 34. Patterns
  • 36. Event Sourcing: Updates. Imagine each event as a change to an entry in a database. Change log (queue of all deposit and withdrawal events): 1: WillO : Deposit : 100.00; 2: BradA : Deposit : 50.00; 3: BradA : Withdraw : 30.00; 4: WillO : Withdraw : 20.00. Current account balances (Account Id, Balance): WillO 80.00; BradA 20.00. Source: https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
  • 37. Duality of Streams and Tables: Replication Change Log (3 2 1). Master: Append writes. Slave: Apply writes in order. Source: https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
  • 38. Which Makes a Better System of Record? Which of these can be used to reconstruct the other? Change Log: 1: WillO : Deposit : 100.00; 2: BradA : Deposit : 50.00; 3: BradA : Withdraw : 30.00; 4: WillO : Withdraw : 20.00. Account balances (Account Id, Balance): WillO 80.00; BradA 20.00.
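
The question on slide 38 can be checked directly: replaying the change log from slide 36 reproduces the balance table, whereas the table alone cannot tell us which deposits and withdrawals produced it. The sketch below is a minimal illustration; the `replay` function and the tuple layout are assumptions, and the event data is copied from the slide.

```python
from decimal import Decimal
from typing import Dict, Iterable, Tuple

Event = Tuple[int, str, str, Decimal]  # (sequence number, account id, action, amount)

# Change log from slide 36.
CHANGE_LOG = [
    (1, "WillO", "Deposit", Decimal("100.00")),
    (2, "BradA", "Deposit", Decimal("50.00")),
    (3, "BradA", "Withdraw", Decimal("30.00")),
    (4, "WillO", "Withdraw", Decimal("20.00")),
]


def replay(events: Iterable[Event]) -> Dict[str, Decimal]:
    """Rebuild the current balance table by applying events in sequence order."""
    balances: Dict[str, Decimal] = {}
    for _, account, action, amount in sorted(events, key=lambda e: e[0]):
        current = balances.get(account, Decimal("0.00"))
        if action == "Deposit":
            balances[account] = current + amount
        elif action == "Withdraw":
            balances[account] = current - amount
        else:
            raise ValueError(f"unknown action: {action}")
    return balances


if __name__ == "__main__":
    for account, balance in replay(CHANGE_LOG).items():
        print(account, balance)
```

Going the other way is not possible: many different event histories end in the same two balances, so the log is the better system of record.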
  • 39. Rewind: Reprocessing Events. [Diagram: Producers; MapR Cluster (6 5 4 3 2 1); Consumer reprocesses from oldest to create a new view, index, or cache]
  • 40. Rewind: Reprocessing Events. [Diagram: Producers; MapR Cluster (6 5 4 3 2 1); Consumer processes to newest into the new view; read from new view]
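
Slides 39 and 40 describe a new consumer that starts at the oldest event, builds a fresh view, and then keeps up with the newest events. A minimal sketch of that idea follows. The `ViewBuilder` class and the choice of view (total withdrawn per account) are hypothetical, and a plain Python list stands in for the persisted stream.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[str, str, float]  # (account id, action, amount)


class ViewBuilder:
    """Consumer that folds events into a view and remembers its read offset."""

    def __init__(self) -> None:
        self.offset = 0  # a new consumer starts at the oldest event
        self.withdrawn: Dict[str, float] = defaultdict(float)

    def catch_up(self, log: List[Event]) -> int:
        """Process every event from the current offset to the newest one."""
        processed = 0
        for account, action, amount in log[self.offset:]:
            if action == "Withdraw":
                self.withdrawn[account] += amount
            processed += 1
        self.offset += processed
        return processed


if __name__ == "__main__":
    log: List[Event] = [
        ("WillO", "Deposit", 100.0),
        ("BradA", "Deposit", 50.0),
        ("BradA", "Withdraw", 30.0),
        ("WillO", "Withdraw", 20.0),
    ]
    builder = ViewBuilder()
    print(builder.catch_up(log), dict(builder.withdrawn))  # replays the full history
    log.append(("BradA", "Withdraw", 5.0))
    print(builder.catch_up(log), dict(builder.withdrawn))  # only the new event
```

Nothing about the existing consumers changes when this view is added, which is what makes the rewind pattern cheap to adopt.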
  • 41. Event Sourcing, Command Query Responsibility Separation: Turning the Database Upside Down. [Diagram: Events, Updates; Key-Val, Document, Graph, Wide Column, Time Series, Relational, ???]
  • 42. What Else Do I Use My Stream For? Lineage - “how did BradA’s balance get so low?” Auditing - “who deposited/withdrew from BradA’s account?” History - to see the status of the accounts last year. Integrity - “can I trust this data hasn’t been tampered with?” • Yup - Streams are immutable. Events: 0: WillO : Deposit : 100.00; 1: BradA : Deposit : 50.00; 2: BradA : Withdraw : 30.00; 3: WillO : Withdraw : 20.00
  • 43. What Do I Need For This to Work? Infinitely persisted events. A way to query your persisted stream data. An integrated security model across the stream and databases.
  • 44. Examples with Patterns
  • 45. Breaking up online shopping item ratings into Microservices: Concurrency bottleneck
  • 46. Separate Write from Read using CQRS (Command Query Responsibility Separation): Separate the Rate Item write “command” from the Get Item Ratings read “query” using event sourcing. Rate Item event: { "itemid": "sku124", "rating": "4", "userid": "cmcdonald", "comment": "works well" } Get Item Ratings document: { "itemid": "sku124", "pname": "bluetooth earbud", "ratings": [ { "rating": "4", "userid": "cmcdonald", "comment": "works well" }, { "rating": "1", "userid": "diego", "comment": "hated it" }] }
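
The two JSON documents on slide 46 suggest how the write and read sides might fit together: the Rate Item command appends an event, and a separate consumer folds events into the per-item document served by Get Item Ratings. The sketch below is one possible reading, not the deck's implementation. `RatingsLog`, `ItemRatingsView`, and the `PRODUCT_NAMES` catalog are hypothetical names, and an in-memory list stands in for the stream.

```python
from typing import Dict, List, Optional

# Hypothetical product catalog; the slide only shows the resulting "pname".
PRODUCT_NAMES = {"sku124": "bluetooth earbud"}


class RatingsLog:
    """Write side: the Rate Item command only appends an event."""

    def __init__(self) -> None:
        self.events: List[dict] = []

    def rate_item(self, itemid: str, rating: str, userid: str, comment: str) -> None:
        self.events.append(
            {"itemid": itemid, "rating": rating, "userid": userid, "comment": comment}
        )


class ItemRatingsView:
    """Read side: a materialized view kept current by consuming the log."""

    def __init__(self) -> None:
        self.offset = 0
        self.items: Dict[str, dict] = {}

    def consume(self, log: RatingsLog) -> None:
        """Apply all events that were appended since the last call."""
        for event in log.events[self.offset:]:
            item = self.items.setdefault(
                event["itemid"],
                {
                    "itemid": event["itemid"],
                    "pname": PRODUCT_NAMES.get(event["itemid"], ""),
                    "ratings": [],
                },
            )
            item["ratings"].append(
                {key: event[key] for key in ("rating", "userid", "comment")}
            )
        self.offset = len(log.events)

    def get_item_ratings(self, itemid: str) -> Optional[dict]:
        """Query: a single lookup, with no joins at read time."""
        return self.items.get(itemid)


if __name__ == "__main__":
    log, view = RatingsLog(), ItemRatingsView()
    log.rate_item("sku124", "4", "cmcdonald", "works well")
    log.rate_item("sku124", "1", "diego", "hated it")
    view.consume(log)
    print(view.get_item_ratings("sku124"))
```

Writers never touch the view and readers never touch the log, so the two sides can be scaled and changed independently.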
  • 47. NoSQL Scaling, Fast Reads and Writes: Design your schema so that the data that is read together is stored together
  • 48. Event Sourcing: New Uses of Data. Add new services like Recommendations
  • 49. Fraud Detection: Point of Sale -> Data Center: is the transaction fraud? • Lots of requests • Need answer within ~50-100 milliseconds. [Diagram: Point of Sale sends location, time, card# to the Data Center; Fraud yes/no?]
  • 50. Traditional Solution. [Diagram: POS 1..n, Fraud detector, Last card use] 1. Look up last card use 2. Compute the card velocity: • Subtract last location, time from current location, time 3. Update last card use
  • 51. What Happens Next? [Diagram: three POS 1..n / Fraud detector pairs sharing one Last card use store] 1. Read last card use 2. Compute the card velocity 3. Update last card use
  • 52. Service Isolation: Separate Read from Write. [Diagram: POS 1..n, Fraud detector, card activity, Updater, Last card use; Read: read last card use]
  • 53. Separate Read Model from the Write Model: Command Query Responsibility Separation. [Diagram: POS 1..n, Fraud detector, card activity, Updater, Last card use; labels: Read last card use, Event, Write last card use]
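
The fraud-detection steps on slides 50 to 53 can be sketched as follows. The fraud detector only reads the last card use and publishes a card activity event; a separate updater is the sole writer of the last-card-use table. Everything specific here is an assumption made for illustration: the class names, the haversine distance, the 900 km/h velocity limit, and the use of a list as the card activity stream.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CardUse:
    card: str
    lat: float
    lon: float
    time_s: float  # seconds since an arbitrary epoch


def distance_km(a: CardUse, b: CardUse) -> float:
    """Great-circle (haversine) distance between two card uses."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def velocity_kmh(last: CardUse, current: CardUse) -> float:
    """Card velocity: distance travelled divided by the time between uses."""
    dist = distance_km(last, current)
    if dist == 0:
        return 0.0
    hours = (current.time_s - last.time_s) / 3600
    return math.inf if hours <= 0 else dist / hours


class FraudDetector:
    """Read side: reads last card use, publishes card activity, never writes the table."""

    def __init__(self, last_use: Dict[str, CardUse], activity: List[CardUse],
                 max_kmh: float = 900.0) -> None:  # hypothetical limit
        self.last_use, self.activity, self.max_kmh = last_use, activity, max_kmh

    def is_fraud(self, current: CardUse) -> bool:
        last = self.last_use.get(current.card)
        self.activity.append(current)  # publish the event for downstream consumers
        return last is not None and velocity_kmh(last, current) > self.max_kmh


class Updater:
    """Write side: the only writer of last card use, driven by card activity events."""

    def __init__(self, last_use: Dict[str, CardUse], activity: List[CardUse]) -> None:
        self.last_use, self.activity, self.offset = last_use, activity, 0

    def run(self) -> None:
        for use in self.activity[self.offset:]:
            self.last_use[use.card] = use
        self.offset = len(self.activity)


if __name__ == "__main__":
    table: Dict[str, CardUse] = {}
    stream: List[CardUse] = []
    detector, updater = FraudDetector(table, stream), Updater(table, stream)
    print(detector.is_fraud(CardUse("1234", 37.77, -122.42, 0)))    # first use
    updater.run()
    print(detector.is_fraud(CardUse("1234", 40.71, -74.01, 600)))   # far away, 10 minutes later
```

Since several detectors can share the same stream and the same read-only table, this layout avoids the write contention pictured on slide 51.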
  • 54. Event Sourcing: New Uses of Data. Processing Same Message for Different Views. [Diagram: POS 1..n, Fraud detector, card activity, Updater, Last card use, Card location history, Other]
  • 55. Scaling Through Isolation: Multiple fraud detectors can use the same message queue. [Diagram: two sets of POS 1..n, Fraud detector, Updater, Last card use sharing card activity]
  • 56. Lessons: De-coupling and isolation are key. Propagate events, not table updates.
  • 57. Real World Solution
  • 58. Use Case: Streaming System of Record for Healthcare. Objective: • Build a flexible, secure healthcare exchange. Challenges: • Many different data models • Security and privacy issues • HIPAA compliance. [Diagram: Records, Analysis, Applications]
  • 59. ALLOY Health: Exchange. State HIE, Clinical Data Viewer, Reporting and Analytics, Clinical Data, Financial Data, Provider Organizations
  • 60. This is a PAIN! COMPLIANCE, SECURITY CONTROLS, COMPLIANCE FEATURES, PRIVACY: PCI DSS 3.0, 21 CFR Part 11, SSAE16 / SOC2, HIPAA/HITECH
  • 61. WHY NOW? 2014 FQ4 profit $ -440 M. Total Cost Estimate $ -12 B
  • 62. Why Now? The Relational database is not the only tool. Context and Relationships: Patient 1234 (patient_id 1234, Name Jon Smith, Age 50); Patient 999 (patient_id 999, Name Jonathan Smith, DOB Jun 1965); Provider 86 (provider_id 86, Name Dr. Nora Paige, Specialty Diabetes); Prescription 9876 (rx_id 9876, Name Sitagliptin, Dosage 325mg). Relationships: Visited, Prescribed, WasPrescribed.
  • 63. WHY NOW? Mind the Gap
  • 64. Streaming System of Record for Healthcare. [Diagram: Records; pre-processor; Streaming System of Record: Stream Topic as Immutable Log (6 5 4 3 2 1); Consumer workflows; Materialized Views: Search, Graph DB, JSON, HBase; Micro Services; API; Applications]
  • 65. The Promised Land. [Diagram: Raw Data; pre-processor; Immutable Log; workflows feeding materialized views: Key/Value (MapR-DB), Search Engine, Graph (ArangoDB), Time Series (OpenTSDB), Document Log (MapR-FS); CEP; log API; micro services; Apps; Compliance Auditor (smiley faces): Data Lineage, Audit Logging]
  • 66. Solution. Design/architecture solved some: • Streams • Data Lineage/System of Record • Kappa Architecture (Kreps/Kleppmann). MapR solved others: • Unified Security • Replication DC to DC • Converge Kafka/HBase/Hadoop to one cluster • Multi-tenancy (lots of topics, for lots of tenants)
  • 67. Real World Solution
  • 68. Challenge: Major Latency from Batch File Transfer: 20-30 Minutes
  • 69. [Diagram: Regional Datacenter: File Server, Producer (Java), Topic, Consumer (Java), Filtering config, Index, Elasticsearch, Kibana, Dashboard] Producer: • Monitoring directory • Parsing CSV files • Publishing messages to topic. Consumer: • Parsing master data • Subscribing to topic • Join tables • Aggregation
  • 70. Streams and Replication. [Diagram: Topic: A, Topic: B, Topic: C replicating to another cluster with Topic: A, Topic: B, Topic: C]
  • 71. [Diagram: Regional Data Centers, each with a Stream Topic carrying performance and other monitoring related data, replicating to the Central Data Center Stream Topic; aggregation of data across all regional data centers; Streaming, Real-time analysis, Ad-hoc analysis, Reporting; Other Data Sources]
  • 72. Building a Complete Data Architecture. MapR Converged Data Platform: MapR File System (MapR-FS), MapR Database (MapR-DB), MapR Streams. Sources/Apps, Stream Processing, Bulk Processing
  • 73. To Learn More: • Streaming Architecture ebook • https://mapr.com/streaming-architecture-using-apache-kafka-mapr-streams/
  • 75. MapR Blog • https://www.mapr.com/blog/
  • 76. To Learn More: • End to End Application for Monitoring Uber Data using Spark ML • https://mapr.com/blog/monitoring-real-time-uber-data-using-spark-machine-learning-streaming-and-kafka-api-part-1/
  • 77. …helping you put data technology to work ● Find answers ● Ask technical questions ● Join on-demand training course discussions ● Follow release announcements ● Share and vote on product ideas ● Find Meetup and event listings. Connect with fellow Apache Hadoop and Spark professionals: community.mapr.com
  • 78. To Learn More: • MapR Free ODT http://learn.mapr.com/
  • 79. Q&A: Engage with us