HOW TO APPLY BIG DATA ANALYTICS
AND MACHINE LEARNING TO
REAL TIME PROCESSING
Kai Wähner
kwaehner@tibco.com
@KaiWaehner
www.kai-waehner.de
LinkedIn / Xing: Please connect!
Digital Transformation - Physical and Digital Worlds are Merging
© Copyright 2000-2016 TIBCO Software Inc.
Apply Big Data Analytics to Real Time Processing
Analyse and Act on Critical Business Moments
Key Take-Aways
• Insights are hidden in Historical Data on Big Data Platforms
• Machine Learning and Big Data Analytics find these Insights by building Analytics Models
• Event Processing uses these Models (without Rebuilding) to take Action in Real Time
Agenda
1) Machine Learning and Big Data Analytics
2) Analysis of Historical Data
3) Real Time Processing
4) Live Demo
Machine Learning
Machine learning is a method of data analysis that automates analytical model building.
Using algorithms that iteratively learn from data, machine learning allows computers to
find hidden insights without being explicitly programmed where to look.
http://www.sas.com
10 Examples of Machine Learning
• Spam Detection
• Credit Card Fraud Detection
• Digit Recognition
• Speech Understanding
• Face Detection
• Shape Detection
• Product Recommendation
• Medical Diagnosis
• Stock Trading
• Customer Segmentation
http://machinelearningmastery.com/practical-machine-learning-problems/
• Spam Detection: Given email in an inbox, identify those email messages that are spam and those that are not. Having a model of this problem would allow a program to leave non-spam emails in the inbox and move spam emails to a spam folder. We should all be familiar with this example.
• Credit Card Fraud Detection: Given credit card transactions for a customer in a month, identify those transactions that were made by the customer and those that were not. A program with a model of this decision could refund those transactions that were fraudulent.
• Digit Recognition: Given zip codes handwritten on envelopes, identify the digit for each handwritten character. A model of this problem would allow a computer program to read and understand handwritten zip codes and sort envelopes by geographic region.
• Speech Understanding: Given an utterance from a user, identify the specific request made by the user. A model of this problem would allow a program to understand and make an attempt to fulfil that request. The iPhone with Siri has this capability.
• Face Detection: Given a digital photo album of many hundreds of digital photographs, identify those photos that include a given person. A model of this decision process would allow a program to organize photos by person. Some cameras and software like iPhoto have this capability.
• Product Recommendation: Given a purchase history for a customer and a large inventory of products, identify those products in which that customer will be interested and likely to purchase. A model of this decision process would allow a program to make recommendations to a customer and motivate product purchases. Amazon has this capability. Also think of Facebook and GooglePlus, which recommend users to connect with after you sign up.
• Medical Diagnosis: Given the symptoms exhibited in a patient and a database of anonymized patient records, predict whether the patient is likely to have an illness. A model of this decision problem could be used by a program to provide decision support to medical professionals.
• Stock Trading: Given the current and past price movements for a stock, determine whether the stock should be bought, held or sold. A model of this decision problem could provide decision support to financial analysts.
• Customer Segmentation: Given the pattern of behaviour by a user during a trial period and the past behaviours of all users, identify those users that will convert to the paid version of the product and those that will not. A model of this decision problem would allow a program to trigger customer interventions to persuade the customer to convert early or better engage in the trial.
• Shape Detection: Given a user hand drawing a shape on a touch screen and a database of known shapes, determine which shape the user was trying to draw. A model of this decision would allow a program to show the platonic version of the shape the user drew to make crisp diagrams. The Instaviz iPhone app does this.
Types of Machine Learning Problems
• Classification: Data is labelled, meaning it is assigned a class, for example spam / non-spam or fraud / non-fraud.
• Regression: Data is labelled with a real value (think floating point) rather than a label. Examples that are easy to understand are time series data like the price of a stock over time.
• Clustering: Data is not labelled, but can be divided into groups based on similarity and other measures of natural structure in the data. An example would be organising pictures by faces without names.
• Rule Extraction: Data is used as the basis for the extraction of propositional rules (antecedent/consequent aka if-then). An example is the discovery of the relationship between the purchase of beer and diapers.
http://machinelearningmastery.com/practical-machine-learning-problems/
(not a complete list!)
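To make the rule extraction bullet concrete, here is a minimal sketch that scores the if-then rule "diapers → beer" by support and confidence. The baskets are made-up toy data and `rule_stats` is a hypothetical helper name, not something from the talk or from a library.

```python
# Toy shopping baskets (illustrative data only, not from the talk).
baskets = [
    {"beer", "diapers", "chips"},
    {"diapers", "beer"},
    {"milk", "bread"},
    {"diapers", "milk"},
    {"beer", "chips"},
]


def rule_stats(baskets, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0, 0.0
    with_both = [b for b in with_antecedent if consequent in b]
    # Support: share of all baskets that contain both items.
    support = len(with_both) / len(baskets)
    # Confidence: share of antecedent baskets that also contain the consequent.
    confidence = len(with_both) / len(with_antecedent)
    return support, confidence


support, confidence = rule_stats(baskets, "diapers", "beer")
print(f"diapers -> beer: support={support:.2f}, confidence={confidence:.2f}")
```

A real rule miner (for example an Apriori-style algorithm) searches over many candidate rules; this sketch only evaluates one rule that is given up front.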
Closed Loop for Big Data Analytics
ANALYZE: Analyze data via Data Discovery; uncover patterns, trends, correlations
MODEL: Develop model; deploy into Stream Processing flow
ACT: Automatically monitor real-time transactions; automatically trigger action
Analytics Maturity Model
A good Big Data Analytics platform can provide value to the organization across the full spectrum of use cases
Value to the Organization: from Immediate to Long-Term Competitive Advantage
Analytics Maturity: Measure, Diagnose, Predict, Optimize, Operationalize, Automate
• Self-service Dashboards
• Event Processing
• Predictive and Prescriptive Analytics
2) Analysis of Historical Data
Analytical Pipeline
What is Predictive Analytics?
Data Acquisition
Data Munging / Wrangling / Mash-up
     cust_id  dept  sku    dollar  gift   date
1    104      C     12003   2.40   FALSE  2016-10-17
2    105      A     12005  62.85   FALSE  2016-10-17
3    102      C     12007  69.23   TRUE   2016-10-17
4    104      B     12004   9.33   FALSE  2016-10-18
5    105      C     12010  14.16   TRUE   2016-10-18
6    101      B     12003  90.43   FALSE  2016-10-19
7    103      C     12005  90.97   FALSE  2016-10-19
n    …        …     …      …       …      …

     cust_id  A      B      C      total   # orders  first_date  last_date
1    100      21.76  23.67   0.00   45.43  2         2016-10-19  2016-10-20
2    101       0.01  74.65   0.00   74.66  3         2016-10-19  2016-10-20
3    102       0.00  60.92  50.29  111.21  6         2016-10-17  2016-10-20
4    103       0.00   0.00  52.30   52.30  2         2016-10-19  2016-10-20
5    104      31.34   9.33   2.40   43.06  4         2016-10-17  2016-10-20
6    105      62.85   0.00  56.00  118.85  3         2016-10-17  2016-10-20
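The step from the transaction table to the per-customer table can be sketched as a simple group-and-aggregate. This is a minimal illustration in plain Python, assuming only the seven transaction rows visible on the slide; the slide's aggregate table is built from more rows, so the numbers printed here will not match it. `summarize` is a hypothetical helper name.

```python
# The seven visible transaction rows: (cust_id, dept, sku, dollar, gift, date).
transactions = [
    (104, "C", 12003, 2.40, False, "2016-10-17"),
    (105, "A", 12005, 62.85, False, "2016-10-17"),
    (102, "C", 12007, 69.23, True, "2016-10-17"),
    (104, "B", 12004, 9.33, False, "2016-10-18"),
    (105, "C", 12010, 14.16, True, "2016-10-18"),
    (101, "B", 12003, 90.43, False, "2016-10-19"),
    (103, "C", 12005, 90.97, False, "2016-10-19"),
]


def summarize(rows):
    """Pivot transactions into one record per customer."""
    summary = {}
    for cust_id, dept, _sku, dollar, _gift, date in rows:
        rec = summary.setdefault(cust_id, {
            "A": 0.0, "B": 0.0, "C": 0.0,
            "total": 0.0, "orders": 0,
            "first_date": date, "last_date": date,
        })
        rec[dept] += dollar          # spend per department (A, B, C)
        rec["total"] += dollar       # overall spend
        rec["orders"] += 1           # number of orders
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        rec["first_date"] = min(rec["first_date"], date)
        rec["last_date"] = max(rec["last_date"], date)
    return summary


per_customer = summarize(transactions)
for cust_id in sorted(per_customer):
    print(cust_id, per_customer[cust_id])
```

In practice this kind of transformation is usually done with a data frame library or SQL; the dictionary version is only meant to show what the reshaping does.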
Data Munging - Transformations
Exploratory Data Analysis
Exploratory Data Analysis (EDA) is an approach/philosophy for data analysis that employs a variety of techniques (mostly graphical) to
1. maximize insight into a data set
2. uncover underlying structure
3. extract important variables
4. detect outliers and anomalies
5. test underlying assumptions
6. develop parsimonious models
7. determine optimal factor settings
“The greatest value of a picture is
when it forces us to notice what we
never expected to see”
John W. Tukey, 1977
Visual Analytics - Interactive Brush-Linked
Which picture represents a model?
A model is a simplification of the truth that helps you with decision making.
Model Building
Supervised Models – known, labeled responses
• Regression (for example Linear Regression)
• Classification, i.e. a categorical response (for example Random Forest)
Unsupervised Models – no labeled responses
• Clustering (for example k-means clustering)
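As a minimal sketch of the supervised case, the following fits a one-variable linear regression by ordinary least squares in plain Python. The data points are made up for illustration and `fit_line` is a hypothetical helper name; a data scientist would normally use R, Spark MLlib, or a similar library instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance terms (unnormalized; the factor cancels out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Labeled training data: known responses ys for inputs xs (toy values).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
print(f"y = {slope} * x + {intercept}")

# The fitted model can now predict a response for an unseen input.
prediction = slope * 6 + intercept
```

The fitted slope and intercept are the "model": a small set of numbers that can be stored and applied later without access to the training data.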
Employees who write longer emails earn higher salaries!
Model Improvement
Managers
Staff
Model Validation
How is the IQ of a kid related to the IQ of his / her mum?
What tools do
Data Scientists use?
Data Scientists work with many Tools
• SQL
• Excel
• Python
• R
Source: O’Reilly 2015 Data Science Salary Survey
http://duu86o6n09pv.cloudfront.net/reports/2015-data-science-salary-survey.pdf
Alternatives for Data Scientists
Open Source Closed Source
Tooling
Source Code
(no complete list)
R
R Language
R is well known as the most popular programming language used by data scientists for modeling, and its popularity is still growing. It is developing very rapidly and has a very active community.
R with Revolution Analytics (now Microsoft)
Open Source GPL License
(including its restrictions) http://www.revolutionanalytics.com/webinars/introducing-revolution-r-open-enhanced-open-source-r-distribution-revolution-analytics
TERR - TIBCO’s Enterprise Runtime for R
• TIBCO has rewritten R as a Commercial Compute Engine
• Latest statistics scripting engine: S → S-PLUS® → R → TERR
• Runs R code including CRAN packages
• Engine internals rebuilt from scratch at low-level
• Redesigned data objects, memory management
• High performance + Big Data
• TERR is licensed from TIBCO
• TERR installs (free) with Spotfire Analyst / Desktop + other TIBCO products
• Spotfire Server can manage all TERR / R scripts, artifacts for reuse
• Standalone Developer Edition
• Supported by TIBCO
• No GPL license issues
Which R to use?
http://www.forbes.com/sites/danwoods/2016/01/27/microsofts-revolution-analytics-acquisition-is-the-wrong-way-to-embrace-r/
Apache Spark
General Data-processing Framework
• However, the focus is especially on Analytics (at least these days)
http://fortune.com/2016/09/09/cloudera-spark-mapreduce/
Spark MLlib
MLlib is Spark’s machine learning
(ML) library. Its goal is to make
practical machine learning scalable
and easy.
It consists of common learning
algorithms and utilities, including
classification, regression,
clustering, collaborative filtering,
dimensionality reduction, as well as
lower-level optimization primitives
and higher-level pipeline APIs.
You can even combine the MLlib module with the R language
Why is Spark used for Analytics?
Apache Spark – Focus on Analytics
http://aptuz.com/blog/is-apache-spark-going-to-replace-hadoop/
http://fortune.com/2016/09/09/cloudera-spark-mapreduce/
http://www.ebaytechblog.com/2016/05/28/using-spark-to-ignite-data-analytics/
http://www.forbes.com/sites/paulmiller/2016/06/15/ibm-backs-apache-spark-for-big-data-analytics/
“[IBM’s initiatives] include:
• deepening the integration between Apache Spark and
existing IBM products like the Watson Health Cloud;
• open sourcing IBM’s existing SystemML machine
learning technology;
H2O
An Extensible Open Source Platform for Analytics
• Best of Breed Open Source Technology
• Easy-to-use WebUI and Familiar Interfaces
• Data Agnostic Support for all Common Database
and File Types
• Massively Scalable Big Data Analysis
• Real-time Data Scoring (Nanofast Scoring Engine)
http://www.h2o.ai/
TIBCO Spotfire for Visual Data Discovery
Let the business user leverage historical data to find insights!
TIBCO Spotfire with R / TERR Integration
Let the business user leverage Analytic Models (created by the Data Scientist)!
Example: Customer Churn with Random Forest Algorithm
• A ‘random forest algorithm’ lives behind the ‘refresh model’ button
• It requires no a priori assumptions at all; it just always works
• The business user doesn’t need to know what random forest is to be empowered by it
Select variables
for the model
SaaS Machine Learning
• Managed SaaS service for building ML models and generating predictions
• Integrated into the corresponding cloud ecosystem
• Easy to use, but limited feature set and potential latency issues if combined
with external data or applications
http://docs.aws.amazon.com/machine-learning/latest/dg/tutorial.html
PMML (Predictive Model Markup Language)
• XML-based de facto standard to represent predictive analytic models
• Developed by the Data Mining Group (DMG)
• Easily share models between PMML compliant applications
(e.g. between model creation and deployment for operations)
http://www.ibm.com/developerworks/library/ba-ind-PMML1/
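To illustrate how a PMML file lets a deployment side score with a model it did not build, here is a minimal sketch that reads the coefficients of a regression model from a PMML fragment and applies them. The fragment is deliberately simplified (a complete PMML document also carries a header, data dictionary, and mining schema), the field names and coefficients are made up, and `load_linear_model` and `score` are hypothetical helper names. Real deployments would use a PMML-compliant scoring engine rather than hand-written parsing.

```python
import xml.etree.ElementTree as ET

# Simplified, incomplete PMML fragment with illustrative values only.
PMML_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="1.0">
      <NumericPredictor name="temperature" coefficient="0.5"/>
      <NumericPredictor name="vibration" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

NS = {"pmml": "http://www.dmg.org/PMML-4_2"}


def load_linear_model(pmml_text):
    """Extract (intercept, {field: coefficient}) from the regression table."""
    root = ET.fromstring(pmml_text)
    table = root.find(".//pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, record):
    """Apply the linear model to one input record."""
    intercept, coefficients = model
    return intercept + sum(c * record[name] for name, c in coefficients.items())


model = load_linear_model(PMML_DOC)
result = score(model, {"temperature": 10.0, "vibration": 0.25})
print(result)
```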
3) Real Time Processing
Streaming Analytics
time
1 2 3 4 5 6 7 8 9
Event Streams
• Continuous Queries
• Sliding Windows
• Filter
• Aggregation
• Correlation
• …
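The ideas of a continuous query over a sliding window, with aggregation and a filter, can be sketched in a few lines. This is a minimal illustration in plain Python using the event values 1 to 9 from the timeline above; the window size of 3 and the threshold of 6 are arbitrary assumptions, and `sliding_mean` is a hypothetical helper name, not a StreamBase or framework API.

```python
from collections import deque


def sliding_mean(events, size):
    """Emit the mean of the last `size` events after each new event arrives."""
    window = deque(maxlen=size)  # the oldest event drops out automatically
    for value in events:
        window.append(value)
        yield sum(window) / len(window)


events = range(1, 10)                         # the event stream 1..9
means = list(sliding_mean(events, size=3))    # aggregation per window
alerts = [m for m in means if m > 6]          # filter on the aggregate
print(means)
print(alerts)
```

A stream processing engine evaluates such queries continuously on unbounded input; the generator here mimics that by producing one result per incoming event.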
Operational Intelligence in Action
Actions by Operations
Human decisions in real time informed by up-to-date information
The Challenge: Empower operations staff to see and seize key business moments
Machine-to-Machine Automation
Automated action based on models of history combined with live context and business rules
The Challenge: Create, understand, and deploy algorithms & rules that automate key business reactions
What is Prescriptive Analytics?
Alternatives for Stream Processing
OPEN SOURCE CLOSED SOURCE
PRODUCT
FRAMEWORK
(no complete list!)
Azure Microsoft
Stream Analytics
What Streaming Alternative do you need?
Visual IDE (Dev, Test, Debug)
Simulation (Feed Testing, Test Generation)
Live UI (monitoring, proactive interaction)
Maturity (24/7 support, consulting)
Integration (out-of-the-box: ESB, MDM, etc.)
Library (Java, .NET, Python)
Query Language (often similar to SQL)
Scalability (horizontal and vertical, fail over)
Connectivity (technologies, markets, products)
Operators (Filter, Sort, Aggregate)
Time to Market (Slow to Fast): Streaming Concepts, Streaming Frameworks, Streaming Products
Comparison of Stream Processing Frameworks and Products
Slide Deck from JavaOne 2016:
http://www.kai-waehner.de/blog/2016/10/25/comparison-of-stream-processing-frameworks-and-products/
StreamBase: The Power of Visual Programming
1) Get ideas into market in days or weeks, not months or years
2) Unlock the power of IT and data scientists working together
Live Datamart
• Dynamic aggregation
• Live visualization
• Ad-hoc continuous query
• Alerts
• Action
How to apply analytic models to real time processing without rebuilding them?
Streaming Analytics to operationalize insights and patterns in real time without rebuilding the models
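One way to picture "without rebuilding" is that the model is fitted once on historical data, exported as parameters, and then only applied to each incoming event. The sketch below assumes a logistic regression whose coefficients were exported as JSON; the coefficient values, the field names `motor_temp` and `vibration`, and the helper `make_scorer` are all made up for illustration and are not taken from the talk or from any TIBCO API.

```python
import json
import math

# Offline side: parameters of an already fitted model, exported once.
# (Illustrative numbers; a real export would come from R, Spark MLlib, etc.)
exported = json.dumps({
    "intercept": -4.0,
    "coefficients": {"motor_temp": 0.03, "vibration": 1.5},
})


def make_scorer(model_json):
    """Load exported parameters and return a function that scores one event."""
    model = json.loads(model_json)
    intercept = model["intercept"]
    coefficients = model["coefficients"]

    def score(event):
        z = intercept + sum(w * event[name] for name, w in coefficients.items())
        return 1.0 / (1.0 + math.exp(-z))  # logistic function: failure probability

    return score


# Streaming side: the model is applied per event, never refitted.
score = make_scorer(exported)
stream = [
    {"motor_temp": 80.0, "vibration": 0.2},
    {"motor_temp": 120.0, "vibration": 2.0},
]
flags = [score(event) > 0.5 for event in stream]
print(flags)
```

The same separation holds whether the exchange format is JSON, PMML, or an R script executed by the streaming engine: training happens offline, scoring happens in the event flow.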
Stream Processing with models from: H2O, Open Source R, TERR, Spark MLlib, MATLAB, SAS, PMML
Real Time Closed Loop: Understand – Anticipate – Act
TIBCO StreamBase + R / TERR
TIBCO StreamBase + H2O
TIBCO StreamBase + PMML
Real World Application - Customer Churn
4) Live Demo
“An outage on one well can cost $10M per hour. We have 20-100 outages per year.”
- Drilling operations VP, major oil company
Insight to Action – Closing the Loop
BIG DATA AT REST
FAST DATA IN MOTION
Data Monitoring
• Motor temperature
• Motor vibration
• Current
• Intake pressure
• Intake temperature
• Flow
Pump Components
• Electrical power cable
• Pump
• Intake
• Protector
• ESP motor
• Pump monitoring unit
Live Surveillance of Equipment
Voltage
Temperature
Vibration
Device
history
Temporal analytic: “If vibration spike is followed by temp spike then voltage spike [within 4 hours] then flag high severity alert.”
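The temporal rule above can be sketched as a small state machine over a time-ordered stream of spike events. This is a simplified illustration, assuming that "within 4 hours" means the whole sequence must complete within 4 hours of the vibration spike and that events belong to a single device; the event names and `detect_alerts` are hypothetical, and a real event processing engine offers far richer pattern matching.

```python
from datetime import datetime, timedelta

SEQUENCE = ("vibration", "temperature", "voltage")  # required spike order
WINDOW = timedelta(hours=4)                         # assumed: whole sequence


def detect_alerts(events):
    """events: (timestamp, kind) tuples sorted by timestamp.

    Returns (start, end) pairs for each completed spike sequence.
    """
    alerts = []
    start = None  # time of the vibration spike that opened the current match
    stage = 0     # index into SEQUENCE of the next expected spike
    for ts, kind in events:
        # Discard a partial match once its time window has expired.
        if stage > 0 and ts - start > WINDOW:
            start, stage = None, 0
        if kind == SEQUENCE[stage]:
            if stage == 0:
                start = ts
            stage += 1
            if stage == len(SEQUENCE):
                alerts.append((start, ts))  # high severity alert
                start, stage = None, 0
        elif kind == SEQUENCE[0]:
            # A new vibration spike restarts the match from this point.
            start, stage = ts, 1
    return alerts


t0 = datetime(2016, 10, 17, 8, 0)
events = [
    (t0, "vibration"),
    (t0 + timedelta(hours=1), "temperature"),
    (t0 + timedelta(hours=3), "voltage"),       # completes within 4 hours
    (t0 + timedelta(hours=10), "vibration"),
    (t0 + timedelta(hours=11), "temperature"),
    (t0 + timedelta(hours=15), "voltage"),      # too late: 5 hours after start
]
alerts = detect_alerts(events)
print(alerts)
```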
Predictive Analytics (Fault Management)
Predictive Maintenance (architecture diagram)
Inputs: sensor data, transactions, message bus, machine data, social data
Streaming Analytics: aggregate, correlate, rules, stream processing, analytics, continuous query processing
Operational Analytics: operations Live UI for live monitoring, alerts, manual action, escalation
Historical Analysis: data discovery and analytics by BI and data scientists on cleansed data and history (Spark, Big Data)
Integration: Enterprise Service Bus, integration bus, API, Event Server; ERP, MDM, DB, WMS, SOA; data storage, internal data
Machine Data (Sensors, Weather Data, …)
Find Insights (Sensor Behaviour, Hardware Issues, …)
Take Action (Stop Machine, Send Mechanic, …)
ERP System (Transaction History, Production Volume)
Complete Big Data Architecture (architecture diagram with the same components: operational analytics, streaming analytics, historical analysis with Spark and Big Data, Enterprise Service Bus)
Leading Indicators Pump Failure
• Find Leading Indicators
• Backtest Rules / Models
• Push Rules / Models to StreamBase
Create a Model
Real Time Analytics
Trend Analysis
Combination of Rules
CUSUM Analysis
Statistical Analysis
Statistical Process Control
Machine Learning
• Location Change
– Variable moves up or down
• Slope Change
– Variable changes trend
• Variance Change
– Variable becomes more/less volatile
• Process Threshold
– Shewhart control chart
• Failure Model
y (0/1) = f (X, b) + e; f = logistic regression, trees, svm, nnet, ...
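As an illustration of the CUSUM analysis used to detect a location change, here is a minimal one-sided (upper) CUSUM sketch. The readings, target, slack, and threshold values are made-up assumptions and `cusum_upper` is a hypothetical helper name; the rules in the talk are built and backtested in Spotfire, not with this code.

```python
def cusum_upper(values, target, slack, threshold):
    """Return the index of the first upward-shift alarm, or None.

    The cumulative sum grows only when a reading exceeds target + slack,
    and is reset to zero whenever it would become negative.
    """
    s = 0.0
    for i, x in enumerate(values):
        s = max(0.0, s + (x - target - slack))
        if s > threshold:
            return i
    return None


# Toy sensor readings: stable around 10, then shifting up to about 11.
readings = [10.0, 10.2, 9.9, 10.1, 11.0, 11.2, 11.1, 11.3]
alarm_index = cusum_upper(readings, target=10.0, slack=0.5, threshold=1.5)
print(alarm_index)
```

Compared with a fixed threshold on single readings (as in a Shewhart control chart), the cumulative sum reacts to small but persistent shifts that no single reading would flag.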
Upon event trigger, populate Spotfire RCA template; email responsible engineer
Put model into Action
1. Rules / models pushed from Spotfire
2. Data streams into StreamBase
3. Data evaluated in real-time
4. Spotfire RCA on trigger
Other notifications available
Live view on streaming data
StreamBase – from Big Data to Fast Data
TIBCO StreamBase – TERR Adapter
Live View of the Situation + Proactive Actions
Responsible engineer clicks URL to launch Spotfire Root Cause Analysis; diagnose issue
Compare Live Data with Historical Data to make Human Decision
TIBCO Spotfire + StreamBase + TERR + Live Datamart
Live Demo
Key Take-Aways
• Insights are hidden in Historical Data on Big Data Platforms
• Machine Learning and Big Data Analytics find these Insights by building Analytics Models
• Event Processing uses these Models (without Rebuilding) to take Action in Real Time
Questions? Please contact me!
Kai Wähner
kwaehner@tibco.com
@KaiWaehner
www.kai-waehner.de
LinkedIn / Xing: Please connect!

More Related Content

What's hot

Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...
Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...
Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...Sabri Skhiri
 
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemStrata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemShirshanka Das
 
Next-Generation BPM - How to create intelligent Business Processes thanks to ...
Next-Generation BPM - How to create intelligent Business Processes thanks to ...Next-Generation BPM - How to create intelligent Business Processes thanks to ...
Next-Generation BPM - How to create intelligent Business Processes thanks to ...Kai Wähner
 
Oracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream ProcessingOracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream ProcessingGuido Schmutz
 
Lessons from building a stream-first metadata platform | Shirshanka Das, Stealth
Lessons from building a stream-first metadata platform | Shirshanka Das, StealthLessons from building a stream-first metadata platform | Shirshanka Das, Stealth
Lessons from building a stream-first metadata platform | Shirshanka Das, StealthHostedbyConfluent
 
Big Data beyond Apache Hadoop - How to integrate ALL your Data
Big Data beyond Apache Hadoop - How to integrate ALL your DataBig Data beyond Apache Hadoop - How to integrate ALL your Data
Big Data beyond Apache Hadoop - How to integrate ALL your DataKai Wähner
 
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...Kai Wähner
 
Data Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine LearningData Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine LearningKai Wähner
 
Real Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and SystemsReal Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and SystemsArun Kejariwal
 
R, Spark, Tensorflow, H20.ai Applied to Streaming Analytics
R, Spark, Tensorflow, H20.ai Applied to Streaming AnalyticsR, Spark, Tensorflow, H20.ai Applied to Streaming Analytics
R, Spark, Tensorflow, H20.ai Applied to Streaming AnalyticsKai Wähner
 
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013Kai Wähner
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Shirshanka Das
 
Mastering MapReduce: MapReduce for Big Data Management and Analysis
Mastering MapReduce: MapReduce for Big Data Management and AnalysisMastering MapReduce: MapReduce for Big Data Management and Analysis
Mastering MapReduce: MapReduce for Big Data Management and AnalysisTeradata Aster
 
[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics
[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics
[Webinar] Measure Twice, Build Once: Real-Time Predictive AnalyticsInfochimps, a CSC Big Data Business
 
Meetup 27/6/2018: AIOPS om de uitdagingen van een slimme stad te ondersteunen
Meetup 27/6/2018: AIOPS om de uitdagingen van een slimme stad te ondersteunenMeetup 27/6/2018: AIOPS om de uitdagingen van een slimme stad te ondersteunen
Meetup 27/6/2018: AIOPS om de uitdagingen van een slimme stad te ondersteunenDigipolis Antwerpen
 
Data analysis trend 2015 2016 v071
Data analysis trend 2015 2016 v071Data analysis trend 2015 2016 v071
Data analysis trend 2015 2016 v071Chun Myung Kyu
 
The Scout24 Data Platform (A Technical Deep Dive)
The Scout24 Data Platform (A Technical Deep Dive)The Scout24 Data Platform (A Technical Deep Dive)
The Scout24 Data Platform (A Technical Deep Dive)RaffaelDzikowski
 
Predictive maintenance withsensors_in_utilities_
Predictive maintenance withsensors_in_utilities_Predictive maintenance withsensors_in_utilities_
Predictive maintenance withsensors_in_utilities_Tina Zhang
 

What's hot (19)

Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...
Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...
Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...
 
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemStrata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
 
Next-Generation BPM - How to create intelligent Business Processes thanks to ...
Next-Generation BPM - How to create intelligent Business Processes thanks to ...Next-Generation BPM - How to create intelligent Business Processes thanks to ...
Next-Generation BPM - How to create intelligent Business Processes thanks to ...
 
Oracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream ProcessingOracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream Processing
 
Lessons from building a stream-first metadata platform | Shirshanka Das, Stealth
Lessons from building a stream-first metadata platform | Shirshanka Das, StealthLessons from building a stream-first metadata platform | Shirshanka Das, Stealth
Lessons from building a stream-first metadata platform | Shirshanka Das, Stealth
 
Big Data beyond Apache Hadoop - How to integrate ALL your Data
Big Data beyond Apache Hadoop - How to integrate ALL your DataBig Data beyond Apache Hadoop - How to integrate ALL your Data
Big Data beyond Apache Hadoop - How to integrate ALL your Data
 
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 

Recently uploaded (20)

"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

How to Apply Machine Learning with R, H2O, Apache Spark MLlib or PMML to Real Time Streaming Analytics

  • 1. HOW TO APPLY BIG DATA ANALYTICS AND MACHINE LEARNING TO REAL TIME PROCESSING. Kai Wähner, kwaehner@tibco.com, @KaiWaehner, www.kai-waehner.de. LinkedIn / Xing → Please connect!
  • 2. Digital Transformation - Physical and Digital Worlds are Merging © Copyright 2000-2016 TIBCO Software Inc.
  • 3. Apply Big Data Analytics to Real Time Processing
  • 4. Analyse and Act on Critical Business Moments
  • 5. Key Take-Aways
      • Insights are hidden in Historical Data on Big Data Platforms
      • Machine Learning and Big Data Analytics find these Insights by building Analytics Models
      • Event Processing uses these Models (without Rebuilding) to take Action in Real Time
  • 6. Agenda: 1) Machine Learning and Big Data Analytics 2) Analysis of Historical Data 3) Real Time Processing 4) Live Demo
  • 7. Agenda: 1) Machine Learning and Big Data Analytics 2) Analysis of Historical Data 3) Real Time Processing 4) Live Demo
  • 8. Machine Learning: Machine learning is a method of data analysis that automates analytical model building. Using algorithms that iteratively learn from data, machine learning allows computers to find hidden insights without being explicitly programmed where to look. Source: http://www.sas.com
  • 9. 10 Examples of Machine Learning
      • Spam Detection
      • Credit Card Fraud Detection
      • Digit Recognition
      • Speech Understanding
      • Face Detection
      • Shape Detection
      • Product Recommendation
      • Medical Diagnosis
      • Stock Trading
      • Customer Segmentation
      Source: http://machinelearningmastery.com/practical-machine-learning-problems/
  • 10. 10 Examples of Machine Learning
      • Spam Detection: Given email in an inbox, identify those email messages that are spam and those that are not. Having a model of this problem would allow a program to leave non-spam emails in the inbox and move spam emails to a spam folder. We should all be familiar with this example.
      • Credit Card Fraud Detection: Given credit card transactions for a customer in a month, identify those transactions that were made by the customer and those that were not. A program with a model of this decision could refund those transactions that were fraudulent.
      • Digit Recognition: Given zip codes handwritten on envelopes, identify the digit for each handwritten character. A model of this problem would allow a computer program to read and understand handwritten zip codes and sort envelopes by geographic region.
      • Speech Understanding: Given an utterance from a user, identify the specific request made by the user. A model of this problem would allow a program to understand and make an attempt to fulfil that request. The iPhone with Siri has this capability.
      • Face Detection: Given a digital photo album of many hundreds of digital photographs, identify those photos that include a given person. A model of this decision process would allow a program to organize photos by person. Some cameras and software like iPhoto have this capability.
      Source: http://machinelearningmastery.com/practical-machine-learning-problems/
  • 11. 10 Examples of Machine Learning
      • Product Recommendation: Given a purchase history for a customer and a large inventory of products, identify those products in which that customer will be interested and likely to purchase. A model of this decision process would allow a program to make recommendations to a customer and motivate product purchases. Amazon has this capability. Also think of Facebook and GooglePlus, which recommend users to connect with after you sign up.
      • Medical Diagnosis: Given the symptoms exhibited in a patient and a database of anonymized patient records, predict whether the patient is likely to have an illness. A model of this decision problem could be used by a program to provide decision support to medical professionals.
      • Stock Trading: Given the current and past price movements for a stock, determine whether the stock should be bought, held or sold. A model of this decision problem could provide decision support to financial analysts.
      • Customer Segmentation: Given the pattern of behaviour by a user during a trial period and the past behaviours of all users, identify those users that will convert to the paid version of the product and those that will not. A model of this decision problem would allow a program to trigger customer interventions to persuade the customer to convert early or better engage in the trial.
      • Shape Detection: Given a user hand drawing a shape on a touch screen and a database of known shapes, determine which shape the user was trying to draw. A model of this decision would allow a program to show the platonic version of the shape the user drew to make crisp diagrams. The Instaviz iPhone app does this.
      Source: http://machinelearningmastery.com/practical-machine-learning-problems/
  • 12. Types of Machine Learning Problems (not a complete list)
      • Classification: Data is labelled, meaning it is assigned a class, for example spam / non-spam or fraud / non-fraud.
      • Regression: Data is labelled with a real value (think floating point) rather than a class. Examples that are easy to understand are time series data like the price of a stock over time.
      • Clustering: Data is not labelled, but can be divided into groups based on similarity and other measures of natural structure in the data. An example would be organising pictures by faces without names.
      • Rule Extraction: Data is used as the basis for the extraction of propositional rules (antecedent/consequent, aka if-then). An example is the discovery of the relationship between the purchase of beer and diapers.
      Source: http://machinelearningmastery.com/practical-machine-learning-problems/
  • 13. Closed Loop for Big Data Analytics
      • ANALYZE: Analyze data via Data Discovery; uncover patterns, trends, correlations
      • MODEL: Develop model; deploy into Stream Processing flow
      • ACT: Automatically monitor real-time transactions; automatically trigger action
  • 14. Analytics Maturity Model: A good Big Data Analytics platform can provide value to the organization across the full spectrum of use cases. [Figure: value to the organization (immediate to long-term competitive advantage) against analytics maturity (Measure, Diagnose, Predict, Optimize, Operationalize, Automate), covering Self-service Dashboards, Event Processing, and Predictive and Prescriptive Analytics.]
  • 15. Analytics Maturity Model [same figure; callout label: Analytics]
  • 16. Analytics Maturity Model [same figure; callout labels: Self-service Dashboards, Event Processing, Analytics]
  • 17. Agenda: 1) Machine Learning and Big Data Analytics 2) Analysis of Historical Data 3) Real Time Processing 4) Live Demo
  • 18. Analytical Pipeline
  • 19. Analytics Maturity Model [figure repeated; callout labels: Self-service Dashboards, Event Processing, Analytics]
  • 20. What is Predictive Analytics?
  • 21. Analytical Pipeline
  • 22. Data Acquisition
  • 23. Analytical Pipeline
  • 24. Data Munging / Wrangling / Mash-up
  • 25. Data Munging - Transformations

      Input (one row per purchase):

           cust_id  dept  sku    dollar  gift   date
        1  104      C     12003   2.40   FALSE  2016-10-17
        2  105      A     12005  62.85   FALSE  2016-10-17
        3  102      C     12007  69.23   TRUE   2016-10-17
        4  104      B     12004   9.33   FALSE  2016-10-18
        5  105      C     12010  14.16   TRUE   2016-10-18
        6  101      B     12003  90.43   FALSE  2016-10-19
        7  103      C     12005  90.97   FALSE  2016-10-19
        n  …        …     …      …       …      …

      Output (one row per customer):

           cust_id  A      B      C      total   # orders  first_date  last_date
        1  100      21.76  23.67   0.00   45.43  2         2016-10-19  2016-10-20
        2  101       0.01  74.65   0.00   74.66  3         2016-10-19  2016-10-20
        3  102       0.00  60.92  50.29  111.21  6         2016-10-17  2016-10-20
        4  103       0.00   0.00  52.30   52.30  2         2016-10-19  2016-10-20
        5  104      31.34   9.33   2.40   43.06  4         2016-10-17  2016-10-20
        6  105      62.85   0.00  56.00  118.85  3         2016-10-17  2016-10-20
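
The transformation on this slide turns one row per purchase into one row per customer. A minimal sketch of that step in plain Python follows; it is not the tooling used in the talk. The function name `summarize` and the tuple layout are assumptions made for illustration, and only the seven rows visible on the slide are used, so the result will not match the slide's summary table, which was evidently computed from more data.

```python
# The seven visible input rows: (cust_id, dept, sku, dollar, gift, date).
transactions = [
    (104, "C", 12003, 2.40, False, "2016-10-17"),
    (105, "A", 12005, 62.85, False, "2016-10-17"),
    (102, "C", 12007, 69.23, True, "2016-10-17"),
    (104, "B", 12004, 9.33, False, "2016-10-18"),
    (105, "C", 12010, 14.16, True, "2016-10-18"),
    (101, "B", 12003, 90.43, False, "2016-10-19"),
    (103, "C", 12005, 90.97, False, "2016-10-19"),
]


def summarize(rows):
    """Aggregate purchase rows into one summary record per customer."""
    summary = {}
    for cust_id, dept, _sku, dollar, _gift, date in rows:
        record = summary.setdefault(cust_id, {
            "A": 0.0, "B": 0.0, "C": 0.0,   # spend per department
            "total": 0.0, "orders": 0,
            "first_date": date, "last_date": date,
        })
        record[dept] += dollar
        record["total"] += dollar
        record["orders"] += 1
        # ISO-formatted date strings compare correctly as plain strings.
        record["first_date"] = min(record["first_date"], date)
        record["last_date"] = max(record["last_date"], date)
    return summary


per_customer = summarize(transactions)
for cust_id in sorted(per_customer):
    print(cust_id, per_customer[cust_id])
```

The same reshaping is a group-by plus pivot in most data tools; the explicit loop is used here only to keep the sketch dependency-free.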
  • 26. Analytical Pipeline
  • 27. Exploratory Data Analysis
  • 28. Exploratory Data Analysis: Exploratory Data Analysis (EDA) is an approach/philosophy for data analysis that employs a variety of techniques (mostly graphical) to
      1. maximize insight into a data set
      2. uncover underlying structure
      3. extract important variables
      4. detect outliers and anomalies
      5. test underlying assumptions
      6. develop parsimonious models
      7. determine optimal factor settings
  • 29. Exploratory Data Analysis: “The greatest value of a picture is when it forces us to notice what we never expected to see” (John W. Tukey, 1977)
  • 30. Visual Analytics - Interactive Brush-Linked
  • 31. Analytics Maturity Model [figure repeated; callout label: Analytics]
  • 32. What is Predictive Analytics?
  • 33. Analytical Pipeline
  • 34. Which picture represents a model? A model is a simplification of the truth that helps you with decision making.
  • 35. Model Building
      • Supervised Models – known, labeled responses: Regression (for example Linear Regression); Categorical (for example Random Forest)
      • Unsupervised Models – no labeled responses: Clustering (for example k-means clustering)
  • 36. Model Building
  • 37. Model Building: Employees who write longer emails earn higher salaries!
  • 38. Model Improvement
  • 39. Model Improvement [figure labels: Managers, Staff]
  • 40. Analytical Pipeline
  • 41. Model Validation: How is the IQ of a kid related to the IQ of his / her mum?
  • 42. What tools do Data Scientists use?
  • 43. Data Scientists work with many Tools
      • SQL
      • Excel
      • Python
      • R
      Source: O’Reilly 2015 Data Science Salary Survey, http://duu86o6n09pv.cloudfront.net/reports/2015-data-science-salary-survey.pdf
  • 44. Alternatives for Data Scientists [figure: offerings arranged by Open Source vs. Closed Source and Tooling vs. Source Code; surviving label: R] (not a complete list)
  • 45. R Language: R is widely regarded as the most popular programming language among data scientists for modeling, and its popularity is still growing. It is developing very rapidly and has a very active community.
  • 46. R with Revolution Analytics (now Microsoft): Open Source, GPL License (including its restrictions). http://www.revolutionanalytics.com/webinars/introducing-revolution-r-open-enhanced-open-source-r-distribution-revolution-analytics
  • 47. TERR - TIBCO’s Enterprise Runtime for R
      • TIBCO has rewritten R as a Commercial Compute Engine
      • Latest statistics scripting engine: S → S-PLUS® → R → TERR
      • Runs R code including CRAN packages
      • Engine internals rebuilt from scratch at low-level
      • Redesigned data objects, memory management
      • High performance + Big Data
      • TERR is licensed from TIBCO
      • TERR installs (free) with Spotfire Analyst / Desktop + other TIBCO products
      • Spotfire Server can manage all TERR / R scripts, artifacts for reuse
      • Standalone Developer Edition
      • Supported by TIBCO
      • No GPL license issues
  • 48. Which R to use? http://www.forbes.com/sites/danwoods/2016/01/27/microsofts-revolution-analytics-acquisition-is-the-wrong-way-to-embrace-r/
  • 49. Apache Spark: General Data-processing Framework → However, focus is especially on Analytics (at least these days). http://fortune.com/2016/09/09/cloudera-spark-mapreduce/
  • 50. Spark MLlib: MLlib is Spark’s machine learning (ML) library. Its goal is to make practical machine learning scalable and easy. It consists of common learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, dimensionality reduction, as well as lower-level optimization primitives and higher-level pipeline APIs. You can even combine the MLlib module with the R language.
  • 51. Why is Spark used for Analytics?
  • 52. Apache Spark – Focus on Analytics http://aptuz.com/blog/is-apache-spark-going-to-replace-hadoop/ http://fortune.com/2016/09/09/cloudera-spark-mapreduce/ http://www.ebaytechblog.com/2016/05/28/using-spark-to-ignite-data-analytics/ http://www.forbes.com/sites/paulmiller/2016/06/15/ibm-backs-apache-spark-for-big-data-analytics/ “[IBM’s initiatives] include: • deepening the integration between Apache Spark and existing IBM products like the Watson Health Cloud; • open sourcing IBM’s existing SystemML machine learning technology;
  • 53. H2O: An Extensible Open Source Platform for Analytics
      • Best of Breed Open Source Technology
      • Easy-to-use WebUI and Familiar Interfaces
      • Data Agnostic Support for all Common Database and File Types
      • Massively Scalable Big Data Analysis
      • Real-time Data Scoring (Nanofast Scoring Engine)
      http://www.h2o.ai/
  • 54. TIBCO Spotfire for Visual Data Discovery: Let the business user leverage historical data to find insights!
  • 55. TIBCO Spotfire with R / TERR Integration: Let the business user leverage Analytic Models (created by the Data Scientist)! Example: Customer Churn with Random Forest Algorithm
      • Behind the ‘refresh model’ button lives a ‘random forest algorithm’
      • It requires no a priori assumptions at all; it just always works
      • The business user doesn’t need to know what random forest is to be empowered by it
      • Select variables for the model
  • 56. SaaS Machine Learning
      • Managed SaaS service for building ML models and generating predictions
      • Integrated into the corresponding cloud ecosystem
      • Easy to use, but limited feature set and potential latency issues if combined with external data or applications
      http://docs.aws.amazon.com/machine-learning/latest/dg/tutorial.html
  • 57. PMML (Predictive Model Markup Language)
      • XML-based de facto standard to represent predictive analytic models
      • Developed by the Data Mining Group (DMG)
      • Easily share models between PMML compliant applications (e.g. between model creation and deployment for operations)
      http://www.ibm.com/developerworks/library/ba-ind-PMML1/
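
To make the "share models between applications" point concrete, here is a hedged sketch of a consumer reading a small hand-written PMML regression model and scoring an event with it. The field names (`vibration`, `temperature`, `risk`) and the coefficients are invented for illustration. The parser handles only a linear `RegressionTable`; a real deployment would use a full PMML scoring engine rather than this fragment.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal PMML document (linear regression only).
PMML_DOC = """
<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <Header description="Hypothetical example model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="vibration" optype="continuous" dataType="double"/>
    <DataField name="temperature" optype="continuous" dataType="double"/>
    <DataField name="risk" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="vibration"/>
      <MiningField name="temperature"/>
      <MiningField name="risk" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="0.5">
      <NumericPredictor name="vibration" coefficient="2.0"/>
      <NumericPredictor name="temperature" coefficient="0.1"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
""".strip()

NS = {"pmml": "http://www.dmg.org/PMML-4_2"}


def load_linear_model(pmml_text):
    """Extract (intercept, {field: coefficient}) from a PMML regression table."""
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, event):
    """Apply the linear model to one event (a dict of field values)."""
    intercept, coefficients = model
    return intercept + sum(c * event[name] for name, c in coefficients.items())


model = load_linear_model(PMML_DOC)          # loaded once, at deployment time
prediction = score(model, {"vibration": 1.5, "temperature": 80.0})
print(prediction)
```

The point of the sketch is the separation of concerns: the model is built elsewhere, exported as XML, and the scoring side only reads parameters, which is what lets a stream processor use a model without rebuilding it.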
  • 58. Agenda: 1) Machine Learning and Big Data Analytics 2) Analysis of Historical Data 3) Real Time Processing 4) Live Demo
  • 59. Analytics Maturity Model [figure repeated; callout labels: Self-service Dashboards, Event Processing, Analytics]
  • 60. Streaming Analytics [figure: event streams on a time axis from 1 to 9]
      • Continuous Queries
      • Sliding Windows
      • Filter
      • Aggregation
      • Correlation
      • …
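
The operators listed on this slide can be illustrated with a small, dependency-free sketch: a time-based sliding window that aggregates (mean) and a continuous query that filters on the aggregate. This is only a conceptual illustration, not how any particular streaming product implements it; the class and function names, the window length and the threshold are all assumptions.

```python
from collections import deque


class SlidingWindowAverage:
    """Keep the events of the last `window_seconds` and report their mean."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()   # (timestamp, value), oldest first
        self.total = 0.0

    def push(self, timestamp, value):
        self.events.append((timestamp, value))
        self.total += value
        # Evict events that have slid out of the window.
        while self.events[0][0] <= timestamp - self.window_seconds:
            _, old_value = self.events.popleft()
            self.total -= old_value
        return self.total / len(self.events)


def continuous_query(stream, window_seconds, threshold):
    """Yield (timestamp, mean) whenever the windowed mean exceeds the threshold."""
    window = SlidingWindowAverage(window_seconds)
    for timestamp, value in stream:
        mean = window.push(timestamp, value)
        if mean > threshold:            # the filter step
            yield timestamp, mean


# Hypothetical sensor readings: (timestamp in seconds, value).
readings = [(1, 10.0), (2, 12.0), (3, 30.0), (4, 32.0), (5, 11.0), (9, 10.0)]
alerts = list(continuous_query(readings, window_seconds=3, threshold=20.0))
print(alerts)
```

Unlike a database query, the query here is registered once and evaluated on every arriving event, which is the essence of a continuous query.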
  • 61. Operational Intelligence in Action
      • Actions by Operations: Human decisions in real time informed by up to date information. The Challenge: Empower operations staff to see and seize key business moments
      • Machine-to-Machine Automation: Automated action based on models of history combined with live context and business rules. The Challenge: Create, understand, and deploy algorithms & rules that automate key business reactions
  • 62. What is Prescriptive Analytics?
  • 63. Alternatives for Stream Processing [figure: offerings arranged by Open Source vs. Closed Source and Product vs. Framework; surviving label: Microsoft Azure Stream Analytics] (not a complete list)
  • 64. What Streaming Alternative do you need? [Figure: Streaming Concepts, Streaming Frameworks and Streaming Products compared by time to market, from slow to fast.] Capabilities shown:
      • Visual IDE (Dev, Test, Debug)
      • Simulation (Feed Testing, Test Generation)
      • Live UI (monitoring, proactive interaction)
      • Maturity (24/7 support, consulting)
      • Integration (out-of-the-box: ESB, MDM, etc.)
      • Library (Java, .NET, Python)
      • Query Language (often similar to SQL)
      • Scalability (horizontal and vertical, fail over)
      • Connectivity (technologies, markets, products)
      • Operators (Filter, Sort, Aggregate)
  • 65. Comparison of Stream Processing Frameworks and Products. Slide Deck from JavaOne 2016: http://www.kai-waehner.de/blog/2016/10/25/comparison-of-stream-processing-frameworks-and-products/
  • 66. StreamBase: The Power of Visual Programming. 1) Get ideas into market in days or weeks, not months or years 2) Unlock the power of IT and data scientists working together
  • 67. Live Datamart: Dynamic aggregation, Live visualization, Ad-hoc continuous query, Alerts, Action
  • 68. How to apply analytic models to real time processing without rebuilding them?
  • 69. Streaming Analytics to operationalize insights and patterns in real time without rebuilding the models. [Figure: Stream Processing applying models from H2O, Open Source R, TERR, Spark MLlib, MATLAB, SAS, PMML.] Real Time Closed Loop: Understand – Anticipate – Act
  • 70. TIBCO StreamBase + R / TERR
  • 73. Real World Application - Customer Churn
  • 74. Agenda: 1) Machine Learning and Big Data Analytics 2) Analysis of Historical Data 3) Real Time Processing 4) Live Demo
  • 75. “An outage on one well can cost $10M per hour. We have 20-100 outages per year.” (Drilling operations VP, major oil company)
  • 76. Insight to Action – Closing the Loop: BIG DATA AT REST, FAST DATA IN MOTION
  • 77. Live Surveillance of Equipment
      • Data Monitoring: Motor temperature, Motor vibration, Current, Intake pressure, Intake temperature, Flow
      • Pump Components: Electrical power cable, Pump, Intake, Protector, ESP motor, Pump monitoring unit
  • 78. Predictive Analytics (Fault Management). Inputs: Voltage, Temperature, Vibration, Device history. Temporal analytic: “If vibration spike is followed by temp spike then voltage spike [within 4 hours] then flag high severity alert.”
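
The temporal rule quoted on this slide can be sketched as a small sequence matcher. The sketch assumes that spike detection has already happened upstream and that events arrive ordered by time as `(time_in_hours, sensor_name)` pairs; the event format, the names and the reading of "within 4 hours" as measured from the vibration spike are all assumptions, not the rule syntax of any product.

```python
WINDOW_HOURS = 4.0
SEQUENCE = ("vibration", "temp", "voltage")   # required order of spikes


def detect_alerts(spikes, window_hours=WINDOW_HOURS):
    """Return (start_time, end_time) for each completed spike sequence.

    A sequence counts only if the final spike occurs within `window_hours`
    of the vibration spike that started it.
    """
    partial = []   # in-progress matches: (start_time, index of next expected step)
    alerts = []
    for time, sensor in spikes:
        # Drop partial matches whose time window has expired.
        partial = [(start, step) for start, step in partial
                   if time - start <= window_hours]
        advanced = []
        for start, step in partial:
            if sensor == SEQUENCE[step]:
                if step + 1 == len(SEQUENCE):
                    alerts.append((start, time))      # full pattern seen
                else:
                    advanced.append((start, step + 1))
            else:
                advanced.append((start, step))        # keep waiting
        partial = advanced
        if sensor == SEQUENCE[0]:
            partial.append((time, 1))                 # a new match begins
    return alerts


# Hypothetical spike events for one device.
spikes = [
    (0.0, "vibration"), (1.5, "temp"), (3.0, "voltage"),      # completes in 3 h
    (10.0, "vibration"), (11.0, "temp"), (15.5, "voltage"),   # too slow: 5.5 h
]
alerts = detect_alerts(spikes)
print(alerts)
```

Only the first sequence raises a high-severity alert; the second contains the same three spikes but spread over more than four hours.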
  • 79. Predictive Maintenance [architecture figure. Inputs: sensor data, transactions, message bus, machine data, social data. Operational Analytics: Streaming Analytics (stream processing, aggregate, rules, correlate, continuous query processing), Live Monitoring, Operations Live UI, alerts, action, manual action and escalation. Historical Analysis: Data Scientists, BI, Data Discovery and Analytics on cleansed data and history, with Spark / Big Data storage. Integration: Enterprise Service Bus, API, Event Server, internal data from ERP, MDM, DB, WMS, SOA.] Callouts: Machine Data (Sensors, Weather Data, …); ERP System (Transaction History, Production Volume); Find Insights (Sensor Behaviour, Hardware Issues, …); Take Action (Stop Machine, Send Mechanic, …)
  • 80. Complete Big Data Architecture [same architecture figure as slide 79, without the predictive maintenance callouts]
  • 82. Create a Model: Find Leading Indicators; Backtest Rules / Models; Push Rules / Models to StreamBase
  • 83. Real Time Analytics: Trend Analysis, Combination of Rules, CUSUM Analysis, Statistical Analysis, Statistical Process Control, Machine Learning
      • Location Change – Variable moves up or down
      • Slope Change – Variable changes trend
      • Variance Change – Variable becomes more/less volatile
      • Process Threshold – Shewhart control chart
      • Failure Model – y (0/1) = f(X, b) + e; f = logistic regression, trees, svm, nnet, ...
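
Two of the techniques named on this slide can be contrasted in a few lines: a Shewhart control chart flags single points outside fixed limits, while a CUSUM accumulates small deviations and so picks up a sustained location change. The sketch below uses the textbook one-sided tabular CUSUM; the readings, target, sigma, slack and threshold are made-up values, not parameters from the talk.

```python
def shewhart_violations(values, mean, sigma, k=3.0):
    """Indices of points outside the control limits mean ± k * sigma."""
    return [i for i, x in enumerate(values) if abs(x - mean) > k * sigma]


def cusum_upper(values, target, slack, threshold):
    """One-sided (upper) tabular CUSUM.

    Accumulates deviations above target + slack, never dropping below zero,
    and returns the index of the first alarm, or None if there is none.
    """
    s = 0.0
    for i, x in enumerate(values):
        s = max(0.0, s + (x - target - slack))
        if s > threshold:
            return i
    return None


# Hypothetical sensor readings: the level shifts from about 50 to about 51.2.
readings = [50.1, 49.8, 50.3, 49.9, 50.0, 51.2, 51.0, 51.3, 51.1, 51.4]
TARGET, SIGMA = 50.0, 0.5

outliers = shewhart_violations(readings, TARGET, SIGMA)      # 3-sigma limits
alarm_at = cusum_upper(readings, TARGET, slack=0.5 * SIGMA,  # common choice: sigma/2
                       threshold=5 * SIGMA)                  # common choice: 5 sigma
print(outliers, alarm_at)
```

With these numbers no single reading breaches the 3-sigma limits, yet the CUSUM raises an alarm three readings after the shift, which is why the two checks are often run side by side.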
  • 84. Put model into Action: Upon event trigger, populate Spotfire RCA template; email responsible engineer
  • 85. StreamBase – from Big Data to Fast Data
      1. Rules / models pushed from Spotfire
      2. Data streams into StreamBase
      3. Data evaluated in real-time
      4. Spotfire RCA on trigger
      Other notifications available. Live view on streaming data.
  • 86. TIBCO StreamBase – TERR Adapter
  • 87. Live View of the Situation + Proactive Actions
  • 88. Compare Live Data with Historical Data to make Human Decision: Responsible engineer clicks URL to launch Spotfire Root Cause Analysis; diagnose issue
  • 89. Live Demo: TIBCO Spotfire + StreamBase + TERR + Live Datamart
  • 90. Key Take-Aways
      • Insights are hidden in Historical Data on Big Data Platforms
      • Machine Learning and Big Data Analytics find these Insights by building Analytics Models
      • Event Processing uses these Models (without Rebuilding) to take Action in Real Time
  • 91. Questions? Please contact me! Kai Wähner, kwaehner@tibco.com, @KaiWaehner, www.kai-waehner.de. LinkedIn / Xing → Please connect!