© 2018 KNIME AG. All Rights Reserved.
Practicing Data Science
A Collection of Case Studies
Rosaria.Silipo@knime.com
@KNIME
Strata New York, September 24, 2019
A few Words about me
• I am Rosaria Silipo
• Principal Data Scientist at KNIME
• At least 20 years analyzing data
• Generally interesting projects become Case Studies
• 23 case studies collected in a book
• Almost 24
A Classic Data Science Project
It always starts with some data …
Data Preparation
• Data Manipulation
• Data Blending
• Missing Values Handling
• Feature Generation
• Dimensionality Reduction
• Feature Selection
• Outlier Removal
• Normalization
• Partitioning
• …
Model Training
• Model Training
• Bag of Models
• Model Selection
• Ensemble Models
• Own Ensemble Model
• External Models
• Import Existing Models
• Model Factory
• …
Model Optimization
• Parameter Tuning
• Parameter Optimization
• Regularization
• Model Size
• No. Iterations
• …
Model Testing
• Performance Measures
• Accuracy
• ROC Curve
• Cross-Validation
• …
Deployment
• Files & DBs
• Dashboards
• REST API
• SQL Code Export
• Reporting
• …
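Three of the steps listed above (normalization, partitioning, and accuracy) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names, the 80/20 split, and the fixed seed are hypothetical choices, not taken from the KNIME workflows in the talk.

```python
import random


def min_max_normalize(values):
    """Rescale a list of numbers to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def partition(rows, train_fraction=0.8, seed=42):
    """Shuffle rows reproducibly and split them into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)
```

In a real project the normalization parameters (minimum and maximum) would be computed on the training set only and then reused on the test set, so that no information leaks from test to training.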
Customer Intelligence: Churn Prediction
Churn Prediction: The Problem
CRM System
Data about your customer
• Demographics
• Behavior
• Revenues
Model
• Churn Prediction
• Upselling Likelihood
• Product Propensity / NBO (Next Best Offer)
• Campaign Management
• Customer Segmentation
• …
Churn Prediction: The Training Workflow
Churn Prediction: The Deployment Workflow
YouTube: “Building a basic Model for Churn Prediction with KNIME” https://www.youtube.com/watch?v=RHsO10q7e2Y
EXAMPLES Server: 50_Applications/18_Churn_Prediction
Demand Prediction (Taxi)
Demand Prediction: The Problem
How many taxis do I need in NYC on Wednesday at 12:00?
How many customers?
How many kW?
How many diapers?
Demand Prediction: The Training Workflow
Training set
Test set
R² = 0.81
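For reference, the coefficient of determination reported above is computed as R² = 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the actual values from their mean. A minimal sketch follows; the demand figures in the usage note are made-up numbers for illustration, not the taxi data from the talk.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Sum of squared residuals between actual and predicted values.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares around the mean of the actual values.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

With hypothetical actual demand [10, 20, 30, 40] and predictions [12, 18, 33, 37], SS_res is 26 and SS_tot is 500, so R² is 1 - 26/500 = 0.948. A value of 1.0 means perfect predictions; a value of 0 means the model does no better than always predicting the mean.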
Demand Prediction: Deployment
On Wednesday at 12:00 we need 13k taxis
Automated Machine Learning
Interaction Points
Business analysts will simply access the KNIME WebPortal from any web browser.
Automated Machine Learning: Just how much?
https://towardsdatascience.com/automated-machine-learning-just-how-much-7330fd4f882e
Fraud/Anomaly Detection
Fraud Detection: The Problem
Transactions → Model → Prediction
• Trx 1: Good
• Trx 2: Good
• Trx 3: Good
• Trx 4: Fraud
• Trx 5: Good
• Trx 6: Good
• …
Fraud Detection: without Fraud Examples – Auto-encoder
• Trained with backpropagation on just “normal” transactions
• Distance = difference between the input transaction and its reconstruction by the auto-encoder
• If distance > threshold => possible fraud
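The decision rule can be sketched as below. Several things here are assumptions rather than details from the slides: the distance is taken to be Euclidean, the threshold value is arbitrary, and `toy_reconstruct` is a hypothetical stand-in for a trained auto-encoder (it simply clips values to the range it was "trained" on).

```python
import math


def reconstruction_distance(original, reconstructed):
    """Euclidean distance between an input vector and its reconstruction."""
    return math.sqrt(sum((o - r) ** 2 for o, r in zip(original, reconstructed)))


def flag_possible_fraud(transactions, reconstruct, threshold):
    """Return one boolean per transaction: True if distance > threshold."""
    flags = []
    for trx in transactions:
        distance = reconstruction_distance(trx, reconstruct(trx))
        flags.append(distance > threshold)
    return flags


def toy_reconstruct(trx):
    """Hypothetical stand-in for a trained auto-encoder.

    It reproduces values inside [0, 1] exactly and clips everything else,
    mimicking a network that only learned to rebuild "normal" inputs.
    """
    return [min(max(v, 0.0), 1.0) for v in trx]
```

Calling `flag_possible_fraud([[0.2, 0.5], [0.9, 3.0]], toy_reconstruct, threshold=0.5)` flags only the second transaction, because its reconstruction differs from the input by a distance of 2.0. In practice the threshold is usually tuned on held-out data to balance missed frauds against false alarms.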
Fraud Detection: without Fraud Examples
Fraud Detection deployed
Suspicious Transaction
Creative AI
Creative AI: The Problems
• Free Text Generation
– Simulating a writing style
– Writing in different languages
– Providing an answer in a specific style
– Generating Candidates for Product Names
• Image Neuro-Styling
– Picasso
– Botero
– Matisse
– Manet
– …
Deep Learning LSTM Network
Input: one-hot encoded character (h - o - u - s - e)
Output: character probabilities (o - u - s - e - <space>)
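The input and output encoding of such a character-level network can be sketched without any deep learning library. The helper names below are hypothetical, and the probability vector in the usage note is invented for illustration; a real network would produce it through a softmax output layer.

```python
def build_vocabulary(text):
    """Sorted list of the distinct characters in the text."""
    return sorted(set(text))


def one_hot(char, vocabulary):
    """Vector with a 1 at the position of char and 0 elsewhere."""
    return [1 if c == char else 0 for c in vocabulary]


def next_char_pairs(text):
    """Pair each character with the character that follows it."""
    return list(zip(text[:-1], text[1:]))


def most_likely_char(probabilities, vocabulary):
    """Pick the character with the highest predicted probability."""
    best = max(range(len(vocabulary)), key=lambda i: probabilities[i])
    return vocabulary[best]
```

For the text "house " the vocabulary is [' ', 'e', 'h', 'o', 's', 'u'], so 'h' is encoded as [0, 0, 1, 0, 0, 0], and the training pairs are h→o, o→u, u→s, s→e, e→<space>. Given a made-up output such as [0.1, 0.05, 0.05, 0.6, 0.1, 0.1], the most likely next character is 'o'. Text generators often sample from the probabilities instead of always taking the maximum, which makes the output less repetitive.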
Creative AI: The Training Workflow
Creative AI: The Deployment Workflow
Creative AI: Deployment and Results
Yo!
This post is about generating free text
with a deep learning network
particularly it is about Brick X6,
Phey, cabe,
make you feel soom the way (I smoke good!)
I probably make (What?)
More money in six months,
Than what's in your papa's safe (I'm serious)
Look like I robbed a bank (Okay Okay)
I set it off like Queen Latifah
'Cause I'm living single I'm feeling cautious
I ain't scream when they served a subpoena
(Can't go back to jail)
I heard that he a leader
(Who pood, what to be f*****' up
The baugerout Black alro Black X6,
Phantom White X6 looks like a panda
Goin' out like I'm Montana
Hundred killers, hundred hammers Black X6,
Phantom White X6, panda
Pockets swole, Danny
Sellin' bar, candy
Man I'm the macho like Randy
The choppa go Oscar for Grammy
B**** n**** pull up ya panty
Hope you killas understand me
Hey Panda, Panda Panda,
Panda, Panda, Panda, Panda
I got broads in Atlanta
Twistin' dope, lean, and the Fanta
Credit cards and the scammers
Hittin' off licks in the bando
This License refers to version of the GNU General Public
License. Copyright also means copyright-bick,
Remade me any thing to his sword
To his salt and most hidden loose to be so for sings, but not in
a libutt of his matter than that shall be sure as will be soldye
As master compary, do not live in traitor.
Bless thy five wits!
-Kent
O pity!
Sir, where is the patience now,
That this is so far from the sea and some bidings to dismantle
So many folds of save and honest.
-Brabantio
I must not think the Turk of Cassio in the strange metting the
cribles of a charmer be the reviling of libe to say
That I can deceive him to the best advantage,
In her prophetic fairs of a little to presently at your powers;
whereof I thank you, sir.
-Albany
Gloucester, I will prove upancy of his sport and first accuriors
and guard and talking on the white.
-King Lear
Where are the thief?
Thou shalt never have the captains at the letter
To the Moor and thing we have not the better shall be sure as
worth if he be anger—
-Regan
I pray you, have a countend more than think to do a proclaim’d
there of my heart, Hot
The words save, honest, thief, master, traitor, and deceive
seem to fit the context. Notice also that the dialogue sprouting
from the start text of the license agreement interestingly
involves mainly minor, less tragic characters from the plays.
Caro amico ti scrivo così mi distraggo un po'
E siccome sei molto lontano più forte ti scriverò.
Da quella prima folla strana, che aveva preso il suo
nome, e di correre alla casa di don Abbondio, con un
viso bene di non poterci andar la casa del padre
Cristoforo, e gli disse che s'avvicinava all'uscio, e si
mise a sparse di corsa, e di stare a sé, verso la strada
di servizio, chiesto le parole che gli andavan
dall'altra stanza, e con la sua condizione de'
cappuccini, e di consigli ricerche di confidenza delle
gride, nel suo passaggio, se non pensava con una
certa ripugnanza a casa sua, che andavano a
scomparire in un campo di buone ragioni che
avevan potuto raccogliere i suoi pensieri, e di sopra
non senza interrogare, che la sua avventura aveva
fatto predicare, e con la forza d'un fatto come
fuggitive che aveva preso il suo nome, e di correre
alla casa di don Abbondio, con un cappuccino di
quella sorte, con un certo sospiro, alzando le sue
finestre, e le diede un'occhiata in carrozza. Si
vendano a metter nelle mani di chi era stato a
sedere sur una strada così fatta con le braccia in
Image Neuro-Styling
Rosaria Neuro-Styling
Panel labels: original, Picasso, Matisse, Caravaggio, Botero, Renoir, Notre Dame, Tapestry, Manet, Magritte, Van Gogh, Artemisia
Image Neuro-Styling: Workflow
Workflow annotations: From Web Browser, Python code, To Web Browser
KNIME Fall Summit 2019
November 5 – 8 at AT&T Executive Education and Conference Center, Austin, Texas
• Tuesday & Wednesday: One-day courses
• Thursday & Friday: Summit sessions
Use the Promo Code STRATA-NY-2019 by October 1 for the Early Bird Discount!
Register at knime.com/fall-summit2019
Free Book as a Thank You
Free copy of the e-book “Practicing Data Science. A Collection of Case Studies” from KNIME Press
https://www.knime.com/knimepress
with this code: STRATA-NY-2019
Expiration date: Dec/31/2019 - 23:59
KNIME Hub (https://hub.knime.com)
You can find KNIMers here!
• KNIME (www.knime.com)
• BLOG for news, tips and tricks (www.knime.com/blog)
• FORUM for questions and answers (tech.knime.com/forum)
• EXAMPLES SERVER for example workflows
• E-LEARNING COURSE (https://www.knime.com/knime-introductory-course)
• KNIME TV channel on YouTube
• KNIME on Twitter: @KNIME
• KNIME on Facebook: https://www.facebook.com/KNIMEanalytics
The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by KNIME AG under license from KNIME GmbH,
and are registered in the United States. KNIME® is also registered in Germany.

 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Best Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfBest Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdf
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentation
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 

Practicing Data Science: A Collection of Case Studies

  • 1. © 2018 KNIME AG. All Rights Reserved. Practicing Data Science: A Collection of Case Studies. Rosaria.Silipo@knime.com, @KNIME. Strata New York, September 24, 2019
  • 2. A few Words about me
      • I am Rosaria Silipo
      • Principal Data Scientist at KNIME
      • At least 20 years analyzing data
      • Generally interesting projects become Case Studies
      • 23 case studies collected in a book
      • Almost 24
  • 3. A Classic Data Science Project: it always starts with some data …
      • Data Preparation: Data Manipulation, Data Blending, Missing Values Handling, Feature Generation, Dimensionality Reduction, Feature Selection, Outlier Removal, Normalization, Partitioning, …
      • Model Training: Model Training, Bag of Models, Model Selection, Ensemble Models, Own Ensemble Model, External Models, Import Existing Models, Model Factory, …
      • Model Optimization: Parameter Tuning, Parameter Optimization, Regularization, Model Size, No. Iterations, …
      • Model Testing: Performance Measures, Accuracy, ROC Curve, Cross-Validation, …
      • Deployment: Files & DBs, Dashboards, REST API, SQL Code Export, Reporting, …
  • 4. Customer Intelligence: Churn Prediction
  • 5. Churn Prediction: The Problem
      • CRM System: data about your customer
          – Demographics
          – Behavior
          – Revenues
      • Model
          – Churn Prediction
          – Upselling Likelihood
          – Product Propensity / NBO
          – Campaign Management
          – Customer Segmentation
          – …
  • 6. Churn Prediction: The Training Workflow
  • 7. Churn Prediction: The Deployment Workflow
      • YouTube: “Building a basic Model for Churn Prediction with KNIME” https://www.youtube.com/watch?v=RHsO10q7e2Y
      • EXAMPLES Server: 50_Applications/18_Churn_Prediction
  • 8. Demand Prediction (Taxi)
  • 9. Demand Prediction: The Problem
      • How many taxis do I need in NYC on Wednesday at 12:00?
      • How many customers?
      • How many kW?
      • How many diapers?
  • 10. Demand Prediction: The Training Workflow
      • Training set, test set
      • R² = 0.81
  • 11. Demand Prediction: Deployment
      • On Wednesday at 12:00 we need 13k taxis
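
The R² reported on slide 10 is the coefficient of determination measured on the test set. The slides do not show how the score was computed, so the following is only a minimal sketch of the standard formula, 1 − SS_res / SS_tot. The function name `r_squared` and the taxi counts are made up for illustration and are not taken from the talk.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equally long")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: how far the predictions are from the truth.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: how far the truth is from its own mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly taxi-trip counts on a held-out test set (made-up numbers).
test_actual = [12000, 13500, 9000, 15000]
test_predicted = [11500, 13000, 10000, 14000]
score = r_squared(test_actual, test_predicted)
print(f"R^2 on the test set: {score:.2f}")
```

A score of 1 means the predictions match the test values exactly, while a score of 0 means the model does no better than always predicting the mean demand.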
  • 12. Automated Machine Learning
  • 13. Interaction Points: business analysts simply access the KNIME WebPortal from any web browser.
  • 14. Automated Machine Learning: Just how much? https://towardsdatascience.com/automated-machine-learning-just-how-much-7330fd4f882e
  • 15. Fraud/Anomaly Detection
  • 16. Fraud Detection: The Problem
      • Transactions → Model → label
      • Trx 1 → Good, Trx 2 → Good, Trx 3 → Good, Trx 4 → Fraud, Trx 5 → Good, Trx 6 → Good, …
  • 17. Fraud Detection: without Fraud Examples – Auto-encoder
      • Trained with Back-Propagation on just “normal” transactions
      • If distance > threshold => possible fraud
  • 18. Fraud Detection: without Fraud Examples
  • 19. Fraud Detection deployed: Suspicious Transaction
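
The rule on slide 17 (train an auto-encoder on normal transactions only, then flag any transaction whose reconstruction distance exceeds a threshold) can be sketched in a few lines. The talk does not show the network architecture, so this is a toy stand-in under stated assumptions: a linear auto-encoder with one hidden unit and tied weights, two made-up features, and an arbitrary threshold of 0.1. None of these choices come from the slides.

```python
import math


def encode_decode(weights, x):
    """Tied-weight linear auto-encoder: hidden = w.x, reconstruction = hidden * w."""
    hidden = sum(w * xi for w, xi in zip(weights, x))
    return [hidden * w for w in weights]


def reconstruction_distance(weights, x):
    """Euclidean distance between a transaction and its reconstruction."""
    recon = encode_decode(weights, x)
    return math.sqrt(sum((r - xi) ** 2 for r, xi in zip(recon, x)))


def train(normal_rows, epochs=200, learning_rate=0.05):
    """Gradient descent on 0.5 * ||reconstruction - x||^2 (back-propagation by hand)."""
    weights = [0.5, 0.5]  # fixed starting point so the run is reproducible
    for _ in range(epochs):
        for x in normal_rows:
            hidden = sum(w * xi for w, xi in zip(weights, x))
            error = [hidden * w - xi for w, xi in zip(weights, x)]
            error_dot_w = sum(e * w for e, w in zip(error, weights))
            # The weights appear in both encoder and decoder, hence two gradient terms.
            weights = [w - learning_rate * (e * hidden + error_dot_w * xi)
                       for w, e, xi in zip(weights, error, x)]
    return weights


# Hypothetical "normal" transactions: the second feature is always twice the first.
normal_transactions = [(t / 10, 2 * t / 10) for t in range(-10, 11)]
weights = train(normal_transactions)

THRESHOLD = 0.1  # assumed value; in practice it would be tuned on validation data


def is_possible_fraud(x):
    """Slide 17 rule: distance > threshold => possible fraud."""
    return reconstruction_distance(weights, x) > THRESHOLD


print(is_possible_fraud((0.5, 1.0)))   # follows the normal pattern
print(is_possible_fraud((1.0, -2.0)))  # breaks the normal pattern
```

Because the network only ever sees normal transactions, it learns to reconstruct those well and reconstructs anything that breaks the pattern poorly, which is why no labeled fraud examples are needed.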
  • 20. Creative AI
  • 21. Creative AI: The Problems
      • Free Text Generation
          – Simulating a writing style
          – Writing in different languages
          – Providing an answer in a specific style
          – Generating Candidates for Product Names
      • Image Neuro-Styling
          – Picasso
          – Botero
          – Matisse
          – Manet
          – …
  • 22. Deep Learning LSTM Network
      • [Figure: LSTM network. Input: one-hot encoded characters h - o - u - s - e. Output: character probabilities for o - u - s - e - <space>]
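
Slide 22 describes the data that flows in and out of the character-level network: each input character is one-hot encoded, and the output is a probability for every character in the vocabulary. The sketch below illustrates only that encoding and decoding step, using the word from the slide. The LSTM itself is omitted, and the probability vector is a made-up stand-in for a network output. All function names are hypothetical.

```python
def build_vocabulary(text):
    """Sorted list of the distinct characters in the training text."""
    return sorted(set(text))


def one_hot(char, vocabulary):
    """Vector of zeros with a single 1 at the position of the character."""
    vector = [0] * len(vocabulary)
    vector[vocabulary.index(char)] = 1
    return vector


def training_pairs(text):
    """(input character, next character) pairs, as in h->o, o->u, u->s, ..."""
    return [(text[i], text[i + 1]) for i in range(len(text) - 1)]


def most_likely_char(probabilities, vocabulary):
    """Decode an output vector by picking the character with the highest probability."""
    best_index = max(range(len(vocabulary)), key=lambda i: probabilities[i])
    return vocabulary[best_index]


text = "house "
vocabulary = build_vocabulary(text)
pairs = training_pairs(text)

for current_char, next_char in pairs:
    print(repr(current_char), one_hot(current_char, vocabulary), "->", repr(next_char))

# Made-up probabilities standing in for the network output; one entry per vocabulary character.
fake_output = [0.10, 0.60, 0.05, 0.05, 0.10, 0.10]
print(repr(most_likely_char(fake_output, vocabulary)))
```

When generating text, the next character is often sampled from the probabilities rather than always taking the most likely one, which keeps the output from becoming repetitive.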
  • 23. Creative AI: The Training Workflow
  • 24. Creative AI: The Deployment Workflow
  • 25. Creative AI: Deployment and Results

    Yo! This post is about generating free text with a deep learning network particularly it is about Brick X6, Phey, cabe, make you feel soom the way (I smoke good!) I probably make (What?) More money in six months, Than what's in your papa's safe (I'm serious) Look like I robbed a bank (Okay Okay) I set it off like Queen Latifah 'Cause I'm living single I'm feeling cautious I ain't scream when they served a subpoena (Can't go back to jail) I heard that he a leader (Who pood, what to be f*****' up The baugerout Black alro Black X6, Phantom White X6 looks like a panda Goin' out like I'm Montana Hundred killers, hundred hammers Black X6, Phantom White X6, panda Pockets swole, Danny Sellin' bar, candy Man I'm the macho like Randy The choppa go Oscar for Grammy B**** n**** pull up ya panty Hope you killas understand me Hey Panda, Panda Panda, Panda, Panda, Panda, Panda I got broads in Atlanta Twistin' dope, lean, and the Fanta Credit cards and the scammers Hittin' off licks in the bando

    This License refers to version of the GNU General Public License. Copyright also means copyright-bick, Remade me any thing to his sword To his salt and most hidden loose to be so for sings, but not in a libutt of his matter than that shall be sure as will be soldye As master compary, do not live in traitor. Bless thy five wits! -Kent O pity! Sir, where is the patience now, That this is so far from the sea and some bidings to dismantle So many folds of save and honest. -Brabantio I must not think the Turk of Cassio in the strange metting the cribles of a charmer be the reviling of libe to say That I can deceive him to the best advantage, In her prophetic fairs of a little to presently at your powers; whereof I thank you, sir. -Albany Gloucester, I will prove upancy of his sport and first accuriors and guard and talking on the white. -King Lear Where are the thief? Thou shalt never have the captains at the letter To the Moor and thing we have not the better shall be sure as worth if he be anger— -Regan I pray you, have a countend more than think to do a proclaim’d there of my heart, Hot

    The words save, honest, thief, master, traitor, and deceive seem to fit the context. Notice also that the dialogue sprouting from the start text of the license agreement interestingly involves mainly minor, less tragic characters from the plays.

    Caro amico ti scrivo così mi distraggo un po' E siccome sei molto lontano più forte ti scriverò. Da quella prima folla strana, che aveva preso il suo nome, e di correre alla casa di don Abbondio, con un viso bene di non poterci andar la casa del padre Cristoforo, e gli disse che s'avvicinava all'uscio, e si mise a sparse di corsa, e di stare a sé, verso la strada di servizio, chiesto le parole che gli andavan dall'altra stanza, e con la sua condizione de' cappuccini, e di consigli ricerche di confidenza delle gride, nel suo passaggio, se non pensava con una certa ripugnanza a casa sua, che andavano a scomparire in un campo di buone ragioni che avevan potuto raccogliere i suoi pensieri, e di sopra non senza interrogare, che la sua avventura aveva fatto predicare, e con la forza d'un fatto come fuggitive che aveva preso il suo nome, e di correre alla casa di don Abbondio, con un cappuccino di quella sorte, con un certo sospiro, alzando le sue finestre, e le diede un'occhiata in carrozza. Si vendano a metter nelle mani di chi era stato a sedere sur una strada così fatta con le braccia in
  • 26. Image Neuro-Styling
  • 27. Rosaria Neuro-Styling
      • [Figure panels: original Picasso Matisse Caravaggio Botero Renoir Notre Dame Tapestry Manet Magritte Van Gogh Artemisia]
  • 28. Image Neuro-Styling: Workflow
      • [Figure labels: Python code; To Web Browser; From Web Browser]
  • 29. KNIME Fall Summit 2019: November 5 – 8 at AT&T Executive Education and Conference Center, Austin, Texas
      • Tuesday & Wednesday: One-day courses
      • Thursday & Friday: Summit sessions
      • Use the Promo Code STRATA-NY-2019 by October 1 for Early Bird Discount!
      • Register at knime.com/fall-summit2019
  • 30. Free Book as a Thank You
      • Free copy of the e-book “Practicing Data Science. A Collection of Case Studies” from KNIME Press: https://www.knime.com/knimepress
      • Code: STRATA-NY-2019
      • Expiration date: Dec/31/2019 - 23:59
  • 31. KNIME Hub (https://hub.knime.com)
  • 32. You can find KNIMers here!
      • KNIME (www.knime.com)
      • BLOG for news, tips and tricks (www.knime.com/blog)
      • FORUM for questions and answers (tech.knime.com/forum)
      • EXAMPLES SERVER for example workflows
      • E-LEARNING COURSE (https://www.knime.com/knime-introductory-course)
      • KNIME TV channel
      • @KNIME
      • Facebook: https://www.facebook.com/KNIMEanalytics
  • 33. The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by KNIME AG under license from KNIME GmbH, and are registered in the United States. KNIME® is also registered in Germany.