SlideShare a Scribd company logo
1 of 16
Download to read offline
© 2018 KNIME AG. All Right Reserved.
From Raw Data to Deployment
KNIMEr: rosaria.silipo@knime.com
KNIMEr: maarit.widmann@knime.com
KNIMEr: jerome.treboux@hevs.ch
@KNIME
© 2018 KNIME AG. All Rights Reserved.
Do you recognize this?
2
https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining
© 2018 KNIME AG. All Rights Reserved.
Let’s unroll it!
It always starts
with some data …
3
Data
Preparation
Model
Training
Model
Optimization
Deployment
Data Manipulation
Data Blending
Missing Values Handling
Feature Generation
Dimensionality Reduction
Feature Selection
Outlier Removal
Normalization
Partitioning
…
Model Training
Bag of Models
Model Selection
Ensemble Models
Own Ensemble Model
External Models
Import Existing Models
Model Factory
…
Parameter Tuning
Parameter Optimization
Regularization
Model Size
No. Iterations
…
Performance Measures
Accuracy
ROC Curve
Cross-Validation
…
Files & DBs
Dashboards
REST API
SQL Code Export
Reporting
…
Model
Evaluation
© 2018 KNIME AG. All Rights Reserved.
The many Lives of a Dataset
4
Data
Preparation
Model
Training
Model
Optimization
Model
Evaluation
Deployment
Partitioning:
• Training Set
• Validation Set
• Test Set
Training Set Validation Set Test Set New Data from Real
World Applications
Original Data
Set with Past
Observations
© 2018 KNIME AG. All Rights Reserved.
Data Exploration
• Sometimes in between Data Access and Data
Preparation there is a Data Exploration phase
• The Data Exploration phase is useful to get to
know the data
• KNIME offers a few visualization nodes to build
dashboards to explore the data
5
© 2018 KNIME AG. All Rights Reserved.
What about Big Data?
• Big Data serves Scalability
• The whole Analytics Process is no different on
Big Data
• You need:
– a Big Data Platform
– The KNIME Big Data (Spark & Hive) Extension
6
© 2018 KNIME AG. All Rights Reserved.
One Example for Every Need
The KNIME EXAMPLES Server
7
50_Applications
© 2018 KNIME AG. All Rights Reserved.
Classification Problem & Data Set
• Airline Dataset: http://stat-computing.org/dataexpo/2009/the-data.html
• Smaller dataset (Jan 2007) (AirlineDataset.table)
• Challenge:
Predict Departure Delays
If on original airline dataset, only flights from airport ORD
Output Class = “delay” if depdelay > 15min
otherwise “no delay”
Available features: date, dep time, arr time, carrier, destination, cancelled, …
8
© 2018 KNIME AG. All Rights Reserved.
Challenges
• Group 1. Data Access and Data Preparation
• Group 2. ML Model Training
• Group 3. Model Deployment
• Import file Learnathon_2018.knar into your workspace
9
© 2018 KNIME AG. All Rights Reserved.
Group 1. Data Access and Data Preparation
10
© 2018 KNIME AG. All Rights Reserved.
Group 2. Model Training & Optimization
11
© 2018 KNIME AG. All Rights Reserved.
Group 3. Deployment
12
• Deployment Options – Multiple challenges:
– Workflow deployment to KNIME Server
– Remote/Scheduled execution from KNIME
Server
– KNIME RESTful Web Services
– Build a Composite Interactive Dashboard and
make it available on KNIME Web Portal
– Generate a report with BIRT
– Write Prediction Results to a Database
© 2018 KNIME AG. All Rights Reserved.
KNIME Spring Summit 2019
March 18 – 22 at bcc Berlin Congress Center, Berlin
• Monday & Tuesday: One-day courses
• Wednesday & Thursday: Summit sessions
• Friday: Workshops
Use the code
LEARNATHON
for 10% off tickets!
Register at
knime.com/spring-summit2019
© 2018 KNIME AG. All Rights Reserved.
KNIME Beginner’s Luck
Free Copy of KNIME Beginner’s Luck Book from KNIME Press
https://www.knime.com/knimepress
with this code: PARIS-1118
15
© 2018 KNIME AG. All Rights Reserved.
You can find KNIMers here!
16
• KNIME (www.knime.com)
• BLOG for news, tips and tricks(www.knime.com/blog)
• FORUM for questions and answers (tech.knime.com/forum)
• EXAMPLE SERVER for example workflows
• LEARNING HUB (www.knime.com/learning-hub)
• KNIME TV channel on
• KNIME on @KNIME
• KNIME on https://www.facebook.com/KNIMEanalytics
• On
© 2018 KNIME AG. All Rights Reserved. 17
The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by KNIME.com AG under license from KNIME GmbH,
and are registered in the United States. KNIME® is also registered in Germany.
Thank You!

More Related Content

What's hot

What's hot (20)

Heterogeneous Data Mining with Spark
Heterogeneous Data Mining with SparkHeterogeneous Data Mining with Spark
Heterogeneous Data Mining with Spark
 
Sharing and Deploying Data Science with KNIME Server
Sharing and Deploying Data Science with KNIME ServerSharing and Deploying Data Science with KNIME Server
Sharing and Deploying Data Science with KNIME Server
 
Chemistry Data Basics with KNIME Analytics Platform
Chemistry Data Basics with KNIME Analytics PlatformChemistry Data Basics with KNIME Analytics Platform
Chemistry Data Basics with KNIME Analytics Platform
 
From Raw Data to Deployment
From Raw Data to DeploymentFrom Raw Data to Deployment
From Raw Data to Deployment
 
Machine learning basic course with KNIME analytics platform
Machine learning basic course with KNIME analytics platformMachine learning basic course with KNIME analytics platform
Machine learning basic course with KNIME analytics platform
 
Just add Imagination
Just add ImaginationJust add Imagination
Just add Imagination
 
What's New in KNIME Analytics Platform 4.1
What's New in KNIME Analytics Platform 4.1What's New in KNIME Analytics Platform 4.1
What's New in KNIME Analytics Platform 4.1
 
Credit Card Fraud Detection Tutorial - KNIME Meetup Berlin 2020
Credit Card Fraud Detection Tutorial - KNIME Meetup Berlin 2020Credit Card Fraud Detection Tutorial - KNIME Meetup Berlin 2020
Credit Card Fraud Detection Tutorial - KNIME Meetup Berlin 2020
 
Advanced analytics for the Internet of Things. Restocking Rental Bike Stations
Advanced analytics for the Internet of Things. Restocking Rental Bike StationsAdvanced analytics for the Internet of Things. Restocking Rental Bike Stations
Advanced analytics for the Internet of Things. Restocking Rental Bike Stations
 
Codeless Deep Learning for Language Modeling and Image Classification
Codeless Deep Learning for Language Modeling and Image ClassificationCodeless Deep Learning for Language Modeling and Image Classification
Codeless Deep Learning for Language Modeling and Image Classification
 
Automating Inferences out of Financial Data
Automating Inferences out of Financial DataAutomating Inferences out of Financial Data
Automating Inferences out of Financial Data
 
AWS reInvent 2019 Trip Report
AWS reInvent 2019 Trip ReportAWS reInvent 2019 Trip Report
AWS reInvent 2019 Trip Report
 
Twitter analytics in Bluemix
Twitter analytics in BluemixTwitter analytics in Bluemix
Twitter analytics in Bluemix
 
Is it harder to find a taxi when it is raining?
Is it harder to find a taxi when it is raining? Is it harder to find a taxi when it is raining?
Is it harder to find a taxi when it is raining?
 
Daho.am meetup kubernetes evolution @abi
Daho.am meetup   kubernetes evolution @abiDaho.am meetup   kubernetes evolution @abi
Daho.am meetup kubernetes evolution @abi
 
The Race To Better Datacenters - Tailormade Colocation by Globalways AG
The Race To Better Datacenters - Tailormade Colocation by Globalways AGThe Race To Better Datacenters - Tailormade Colocation by Globalways AG
The Race To Better Datacenters - Tailormade Colocation by Globalways AG
 
#AI + #Cloud = #DigitalTransformation
#AI + #Cloud = #DigitalTransformation#AI + #Cloud = #DigitalTransformation
#AI + #Cloud = #DigitalTransformation
 
Big Data LDN 2017: Your flight is boarding now!
Big Data LDN 2017: Your flight is boarding now!Big Data LDN 2017: Your flight is boarding now!
Big Data LDN 2017: Your flight is boarding now!
 
Flink Forward Berlin 2018: Henri Heiskanen - "How to keep our flock happy wit...
Flink Forward Berlin 2018: Henri Heiskanen - "How to keep our flock happy wit...Flink Forward Berlin 2018: Henri Heiskanen - "How to keep our flock happy wit...
Flink Forward Berlin 2018: Henri Heiskanen - "How to keep our flock happy wit...
 
Flink Forward Berlin 2018: Tobias Lindener - "Approximate standing queries on...
Flink Forward Berlin 2018: Tobias Lindener - "Approximate standing queries on...Flink Forward Berlin 2018: Tobias Lindener - "Approximate standing queries on...
Flink Forward Berlin 2018: Tobias Lindener - "Approximate standing queries on...
 

Similar to KNIME Data Science Learnathon: From Raw Data To Deployment - Paris - November 2018

How Do You Build and Validate 1500 Models and What Can You Learn from Them?
How Do You Build and Validate 1500 Models and What Can You Learn from Them? How Do You Build and Validate 1500 Models and What Can You Learn from Them?
How Do You Build and Validate 1500 Models and What Can You Learn from Them?
Greg Landrum
 
H2O Machine Learning with KNIME Analytics Platform - Christian Dietz - H2O AI...
H2O Machine Learning with KNIME Analytics Platform - Christian Dietz - H2O AI...H2O Machine Learning with KNIME Analytics Platform - Christian Dietz - H2O AI...
H2O Machine Learning with KNIME Analytics Platform - Christian Dietz - H2O AI...
Sri Ambati
 
Knime & bioinformatics
Knime & bioinformaticsKnime & bioinformatics
Knime & bioinformatics
BioinformaticsInstitute
 
Machine Learning with Apache Spark
Machine Learning with Apache SparkMachine Learning with Apache Spark
Machine Learning with Apache Spark
IBM Cloud Data Services
 

Similar to KNIME Data Science Learnathon: From Raw Data To Deployment - Paris - November 2018 (20)

From raw data to deployment
From raw data to deployment From raw data to deployment
From raw data to deployment
 
Inteligencia artificial, open source e IBM Call for Code
Inteligencia artificial, open source e IBM Call for CodeInteligencia artificial, open source e IBM Call for Code
Inteligencia artificial, open source e IBM Call for Code
 
How Do You Build and Validate 1500 Models and What Can You Learn from Them?
How Do You Build and Validate 1500 Models and What Can You Learn from Them? How Do You Build and Validate 1500 Models and What Can You Learn from Them?
How Do You Build and Validate 1500 Models and What Can You Learn from Them?
 
H2O Machine Learning with KNIME Analytics Platform - Christian Dietz - H2O AI...
H2O Machine Learning with KNIME Analytics Platform - Christian Dietz - H2O AI...H2O Machine Learning with KNIME Analytics Platform - Christian Dietz - H2O AI...
H2O Machine Learning with KNIME Analytics Platform - Christian Dietz - H2O AI...
 
KNIME Software Overview
KNIME Software OverviewKNIME Software Overview
KNIME Software Overview
 
Processing malaria HTS results using KNIME: a tutorial
Processing malaria HTS results using KNIME: a tutorialProcessing malaria HTS results using KNIME: a tutorial
Processing malaria HTS results using KNIME: a tutorial
 
How to build containerized architectures for deep learning - Data Festival 20...
How to build containerized architectures for deep learning - Data Festival 20...How to build containerized architectures for deep learning - Data Festival 20...
How to build containerized architectures for deep learning - Data Festival 20...
 
Capitalizing on cloud 4.3.18
Capitalizing on cloud 4.3.18Capitalizing on cloud 4.3.18
Capitalizing on cloud 4.3.18
 
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
 
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...
 
Improving Adobe Experience Cloud Services Dependability with Machine Learning
Improving Adobe Experience Cloud Services Dependability with Machine LearningImproving Adobe Experience Cloud Services Dependability with Machine Learning
Improving Adobe Experience Cloud Services Dependability with Machine Learning
 
Open Source AI - News and examples
Open Source AI - News and examplesOpen Source AI - News and examples
Open Source AI - News and examples
 
DutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive SectorDutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive Sector
 
Take the Bias out of Big Data Insights With Augmented Analytics
Take the Bias out of Big Data Insights With Augmented AnalyticsTake the Bias out of Big Data Insights With Augmented Analytics
Take the Bias out of Big Data Insights With Augmented Analytics
 
Knime & bioinformatics
Knime & bioinformaticsKnime & bioinformatics
Knime & bioinformatics
 
Moving from Artisanal to Industrial Machine Learning
Moving from Artisanal to Industrial Machine LearningMoving from Artisanal to Industrial Machine Learning
Moving from Artisanal to Industrial Machine Learning
 
Interactive and reproducible data analysis with the open-source KNIME Analyti...
Interactive and reproducible data analysis with the open-source KNIME Analyti...Interactive and reproducible data analysis with the open-source KNIME Analyti...
Interactive and reproducible data analysis with the open-source KNIME Analyti...
 
Your Flight is Boarding Now!
Your Flight is Boarding Now!Your Flight is Boarding Now!
Your Flight is Boarding Now!
 
Machine Learning with Apache Spark
Machine Learning with Apache SparkMachine Learning with Apache Spark
Machine Learning with Apache Spark
 
Introduction to Machine Learning on IBM Power Systems
Introduction to Machine Learning on IBM Power SystemsIntroduction to Machine Learning on IBM Power Systems
Introduction to Machine Learning on IBM Power Systems
 

More from KNIMESlides

More from KNIMESlides (7)

Credit Card Fraud Detection Tutorial
Credit Card Fraud Detection TutorialCredit Card Fraud Detection Tutorial
Credit Card Fraud Detection Tutorial
 
Practicing Data Science: A Collection of Case Studies
Practicing Data Science: A Collection of Case StudiesPracticing Data Science: A Collection of Case Studies
Practicing Data Science: A Collection of Case Studies
 
What's New in KNIME Analytics Platform 4.0 and KNIME Server 4.9
What's New in KNIME Analytics Platform 4.0 and KNIME Server 4.9What's New in KNIME Analytics Platform 4.0 and KNIME Server 4.9
What's New in KNIME Analytics Platform 4.0 and KNIME Server 4.9
 
Sentiment Analysis with Deep Learning, Machine Learning or Lexicon based
Sentiment Analysis with Deep Learning, Machine Learning or Lexicon basedSentiment Analysis with Deep Learning, Machine Learning or Lexicon based
Sentiment Analysis with Deep Learning, Machine Learning or Lexicon based
 
Knime customer intelligence on social media: Text Analytics vs. Network Mining
Knime customer intelligence on social media: Text Analytics vs. Network MiningKnime customer intelligence on social media: Text Analytics vs. Network Mining
Knime customer intelligence on social media: Text Analytics vs. Network Mining
 
Text Processing with KNIME
Text Processing with KNIMEText Processing with KNIME
Text Processing with KNIME
 
Big Data with KNIME is as easy as 1, 2, 3, ...4!
Big Data with KNIME is as easy as 1, 2, 3, ...4!Big Data with KNIME is as easy as 1, 2, 3, ...4!
Big Data with KNIME is as easy as 1, 2, 3, ...4!
 

Recently uploaded

%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
masabamasaba
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
masabamasaba
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
masabamasaba
 

Recently uploaded (20)

%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
WSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - KanchanaWSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - Kanchana
 
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 
WSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AIWSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AI
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 

KNIME Data Science Learnathon: From Raw Data To Deployment - Paris - November 2018

  • 1. © 2018 KNIME AG. All Right Reserved. From Raw Data to Deployment KNIMEr: rosaria.silipo@knime.com KNIMEr: maarit.widmann@knime.com KNIMEr: jerome.treboux@hevs.ch @KNIME
  • 2. © 2018 KNIME AG. All Rights Reserved. Do you recognize this? 2 https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining
  • 3. © 2018 KNIME AG. All Rights Reserved. Let’s unroll it! It always starts with some data … 3 Data Preparation Model Training Model Optimization Deployment Data Manipulation Data Blending Missing Values Handling Feature Generation Dimensionality Reduction Feature Selection Outlier Removal Normalization Partitioning … Model Training Bag of Models Model Selection Ensemble Models Own Ensemble Model External Models Import Existing Models Model Factory … Parameter Tuning Parameter Optimization Regularization Model Size No. Iterations … Performance Measures Accuracy ROC Curve Cross-Validation … Files & DBs Dashboards REST API SQL Code Export Reporting … Model Evaluation
  • 4. © 2018 KNIME AG. All Rights Reserved. The many Lives of a Dataset 4 Data Preparation Model Training Model Optimization Model Evaluation Deployment Partitioning: • Training Set • Validation Set • Test Set Training Set Validation Set Test Set New Data from Real World Applications Original Data Set with Past Observations
  • 5. © 2018 KNIME AG. All Rights Reserved. Data Exploration • Sometimes in between Data Access and Data Preparation there is a Data Exploration phase • The Data Exploration phase is useful to get to know the data • KNIME offers a few visualization nodes to build dashboards to explore the data 5
  • 6. © 2018 KNIME AG. All Rights Reserved. What about Big Data? • Big Data serves Scalability • The whole Analytics Process is no different on Big Data • You need: – a Big Data Platform – The KNIME Big Data (Spark & Hive) Extension 6
  • 7. © 2018 KNIME AG. All Rights Reserved. One Example for Every Need The KNIME EXAMPLES Server 7 50_Applications
  • 8. © 2018 KNIME AG. All Rights Reserved. Classification Problem & Data Set • Airline Dataset: http://stat-computing.org/dataexpo/2009/the-data.html • Smaller dataset (Jan 2007) (AirlineDataset.table) • Challenge: Predict Departure Delays If on original airline dataset, only flights from airport ORD Output Class = “delay” if depdelay > 15min otherwise “no delay” Available features: date, dep time, arr time, carrier, destination, cancelled, … 8
  • 9. © 2018 KNIME AG. All Rights Reserved. Challenges • Group 1. Data Access and Data Preparation • Group 2. ML Model Training • Group 3. Model Deployment • Import file Learnathon_2018.knar into your workspace 9
  • 10. © 2018 KNIME AG. All Rights Reserved. Group 1. Data Access and Data Preparation 10
  • 11. © 2018 KNIME AG. All Rights Reserved. Group 2. Model Training & Optimization 11
  • 12. © 2018 KNIME AG. All Rights Reserved. Group 3. Deployment 12 • Deployment Options – Multiple challenges: – Workflow deployment to KNIME Server – Remote/Scheduled execution from KNIME Server – KNIME RESTful Web Services – Build a Composite Interactive Dashboard and make it available on KNIME Web Portal – Generate a report with BIRT – Write Prediction Results to a Database
  • 13. © 2018 KNIME AG. All Rights Reserved. KNIME Spring Summit 2019 March 18 – 22 at bcc Berlin Congress Center, Berlin • Monday & Tuesday: One-day courses • Wednesday & Thursday: Summit sessions • Friday: Workshops Use the code LEARNATHON for 10% off tickets! Register at knime.com/spring-summit2019
  • 14. © 2018 KNIME AG. All Rights Reserved. KNIME Beginner’s Luck Free Copy of KNIME Beginner’s Luck Book from KNIME Press https://www.knime.com/knimepress with this code: PARIS-1118 15
  • 15. © 2018 KNIME AG. All Rights Reserved. You can find KNIMers here! 16 • KNIME (www.knime.com) • BLOG for news, tips and tricks(www.knime.com/blog) • FORUM for questions and answers (tech.knime.com/forum) • EXAMPLE SERVER for example workflows • LEARNING HUB (www.knime.com/learning-hub) • KNIME TV channel on • KNIME on @KNIME • KNIME on https://www.facebook.com/KNIMEanalytics • On
  • 16. © 2018 KNIME AG. All Rights Reserved. 17 The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by KNIME.com AG under license from KNIME GmbH, and are registered in the United States. KNIME® is also registered in Germany. Thank You!