More Related Content Similar to KNIME Data Science Learnathon: From Raw Data To Deployment - Paris - November 2018 (20) More from KNIMESlides (7) KNIME Data Science Learnathon: From Raw Data To Deployment - Paris - November 20181. © 2018 KNIME AG. All Right Reserved.
From Raw Data to Deployment
KNIMEr: rosaria.silipo@knime.com
KNIMEr: maarit.widmann@knime.com
KNIMEr: jerome.treboux@hevs.ch
@KNIME
2. © 2018 KNIME AG. All Rights Reserved.
Do you recognize this?
2
https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining
3. © 2018 KNIME AG. All Rights Reserved.
Let’s unroll it!
It always starts
with some data …
3
Data
Preparation
Model
Training
Model
Optimization
Deployment
Data Manipulation
Data Blending
Missing Values Handling
Feature Generation
Dimensionality Reduction
Feature Selection
Outlier Removal
Normalization
Partitioning
…
Model Training
Bag of Models
Model Selection
Ensemble Models
Own Ensemble Model
External Models
Import Existing Models
Model Factory
…
Parameter Tuning
Parameter Optimization
Regularization
Model Size
No. Iterations
…
Performance Measures
Accuracy
ROC Curve
Cross-Validation
…
Files & DBs
Dashboards
REST API
SQL Code Export
Reporting
…
Model
Evaluation
4. © 2018 KNIME AG. All Rights Reserved.
The many Lives of a Dataset
4
Data
Preparation
Model
Training
Model
Optimization
Model
Evaluation
Deployment
Partitioning:
• Training Set
• Validation Set
• Test Set
Training Set Validation Set Test Set New Data from Real
World Applications
Original Data
Set with Past
Observations
5. © 2018 KNIME AG. All Rights Reserved.
Data Exploration
• Sometimes in between Data Access and Data
Preparation there is a Data Exploration phase
• The Data Exploration phase is useful to get to
know the data
• KNIME offers a few visualization nodes to build
dashboards to explore the data
5
6. © 2018 KNIME AG. All Rights Reserved.
What about Big Data?
• Big Data serves Scalability
• The whole Analytics Process is no different on
Big Data
• You need:
– a Big Data Platform
– The KNIME Big Data (Spark & Hive) Extension
6
7. © 2018 KNIME AG. All Rights Reserved.
One Example for Every Need
The KNIME EXAMPLES Server
7
50_Applications
8. © 2018 KNIME AG. All Rights Reserved.
Classification Problem & Data Set
• Airline Dataset: http://stat-computing.org/dataexpo/2009/the-data.html
• Smaller dataset (Jan 2007) (AirlineDataset.table)
• Challenge:
Predict Departure Delays
If on original airline dataset, only flights from airport ORD
Output Class = “delay” if depdelay > 15min
otherwise “no delay”
Available features: date, dep time, arr time, carrier, destination, cancelled, …
8
9. © 2018 KNIME AG. All Rights Reserved.
Challenges
• Group 1. Data Access and Data Preparation
• Group 2. ML Model Training
• Group 3. Model Deployment
• Import file Learnathon_2018.knar into your workspace
9
10. © 2018 KNIME AG. All Rights Reserved.
Group 1. Data Access and Data Preparation
10
11. © 2018 KNIME AG. All Rights Reserved.
Group 2. Model Training & Optimization
11
12. © 2018 KNIME AG. All Rights Reserved.
Group 3. Deployment
12
• Deployment Options – Multiple challenges:
– Workflow deployment to KNIME Server
– Remote/Scheduled execution from KNIME
Server
– KNIME RESTful Web Services
– Build a Composite Interactive Dashboard and
make it available on KNIME Web Portal
– Generate a report with BIRT
– Write Prediction Results to a Database
13. © 2018 KNIME AG. All Rights Reserved.
KNIME Spring Summit 2019
March 18 – 22 at bcc Berlin Congress Center, Berlin
• Monday & Tuesday: One-day courses
• Wednesday & Thursday: Summit sessions
• Friday: Workshops
Use the code
LEARNATHON
for 10% off tickets!
Register at
knime.com/spring-summit2019
14. © 2018 KNIME AG. All Rights Reserved.
KNIME Beginner’s Luck
Free Copy of KNIME Beginner’s Luck Book from KNIME Press
https://www.knime.com/knimepress
with this code: PARIS-1118
15
15. © 2018 KNIME AG. All Rights Reserved.
You can find KNIMers here!
16
• KNIME (www.knime.com)
• BLOG for news, tips and tricks(www.knime.com/blog)
• FORUM for questions and answers (tech.knime.com/forum)
• EXAMPLE SERVER for example workflows
• LEARNING HUB (www.knime.com/learning-hub)
• KNIME TV channel on
• KNIME on @KNIME
• KNIME on https://www.facebook.com/KNIMEanalytics
• On
16. © 2018 KNIME AG. All Rights Reserved. 17
The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by KNIME.com AG under license from KNIME GmbH,
and are registered in the United States. KNIME® is also registered in Germany.
Thank You!