SlideShare a Scribd company logo
1 of 21
Machine Learning
101
Fred Verheul
What we won’t cover…
• Deep learning / Neural Networks
• Specifics of ML-algorithms
• Tools / Libraries / Code
• SAP Products, like HANA / Predictive Analytics / Vora / …
• Ethics, algorithmic transparency & fairness
• Hardware
2
Examples: Recommender systems
3
Examples, continued…
4
SPAM-
filtering
Handwriting
recognition
ML in the news: Deepmind’s AlphaGo
5
6
Machine Learning
"Field of study that gives computers the ability to learn
without being explicitly programmed” (Arthur Samuel, 1959)
7
What is Machine Learning?
8
Computer
Computer
Traditional Programming
Machine Learning
Data
Data
Program
Output
Program
Output
Sweet spot for Machine Learning
• It’s impossible to write down the rules in code:
• Too many rules
• Too many factors influencing the rules
• Too finely tuned
• We just don’t know the rules (image recognition)
• Lots of labeled data (examples) available (e.g. historical data)
9
Basic Machine Learning ‘workflow’
10
Feature
Vectors
Training
data
Labels
Machine
Learning
Algorithm
Feature
Vectors
New data Prediction
Training Phase
Operational Phase
Predictive
Model
Training Phase in more detail
11
Raw data
Data
preparation Feature
Vectors
Training
Data
Test
data
Model Building
(by ML
algorithm)
Model
Evaluation
Predictive
Model
Feedback loop
data cleansing
data transformation
normalization
feature extraction
aka
‘learning’
CRISP-DM: data mining process
12
ML
important
ML
important
Examples of ML tasks
Supervised learning
Regression 
target is numeric
Classification 
target is categorical
13
Unsupervised learning
Clustering
Dimensionality
reduction
Modeling: so many algorithms…
14
ML Algorithms: by Representation
Collection of candidate models/programs, aka hypothesis space
15
Decision trees
Instance-based
Neural networks
Model ensembles
ML Algorithms: by Evaluation
Evaluation: Quality measure for a model
16
Regression
Example metric: Root Mean Squared Error
RMSE =
Binary classification: confusion matrix
Accuracy: 8 + 971 -> 97,9%
Example: medical test
for a disease
Positive Negative
P
True
positives
TP
False
Negatives
FN
N
False
positives
FP
True
Negatives
TN
True
Class
Predicted class
Accuracy: Better evaluation metrics:
• Precision: 8 / (8 + 19)
• Recall: 8 / (8 + 2)
Optimization: how the algorithm ‘learns’, depends on representation and
evaluation
ML Algorithms: by Optimization
17
Greedy Search,
ex. of
combinatorial
optimization
Gradient Descent (or in general: Convex Optimization)
Linear Programming (or in general:
Constrained/Nonlinear Optimization)
Training error vs test error
18
Data Science for Business
• Focuses more on general principles
than specific algorithms
• Not math-heavy, does contain some
math
• O’Reilly link:
http://shop.oreilly.com/product/063692
0028918.do
• Book website: http://data-science-for-
biz.com/DSB/Home.html
19
Take-aways
• Goal of ML: generalize from training data (not optimization!!)
• Part of ‘Data Mining Process’, not a goal in and of itself
• No magic! Just some clever algorithms…
• Increasingly important non-technical aspects:
• Ethics
• Algorithmic transparency
20
Thank You
www.soapeople.com
info@soapeople.com
@SOAPEOPLE
Fred Verheul
Big Data Consultant
+31 6 3919 2986
fred.verheul@soapeople.com
@fredverheul

More Related Content

What's hot

The Evolution of AutoML
The Evolution of AutoMLThe Evolution of AutoML
The Evolution of AutoMLNing Jiang
 
ETL & Machine Learning
ETL & Machine LearningETL & Machine Learning
ETL & Machine LearningLuthfi Hariz
 
Visualising the world of competitive programming with Python (Codeforces)
Visualising the world of competitive programming with Python (Codeforces)Visualising the world of competitive programming with Python (Codeforces)
Visualising the world of competitive programming with Python (Codeforces)Anuj Menta
 
How is research conducted in my field
How is research conducted in my fieldHow is research conducted in my field
How is research conducted in my fieldCristian Klein
 
H2O World - Ensembles with Erin LeDell
H2O World - Ensembles with Erin LeDellH2O World - Ensembles with Erin LeDell
H2O World - Ensembles with Erin LeDellSri Ambati
 
Microsoft Introduction to Automated Machine Learning
Microsoft Introduction to Automated Machine LearningMicrosoft Introduction to Automated Machine Learning
Microsoft Introduction to Automated Machine LearningSetu Chokshi
 
AzureML – zero to hero
AzureML – zero to heroAzureML – zero to hero
AzureML – zero to heroGovind Kanshi
 
AutoML - The Future of AI
AutoML - The Future of AIAutoML - The Future of AI
AutoML - The Future of AINing Jiang
 
Ideas spracklen-final
Ideas spracklen-finalIdeas spracklen-final
Ideas spracklen-finalsupportlogic
 
Pipeline oriented data analytics
Pipeline oriented data analyticsPipeline oriented data analytics
Pipeline oriented data analyticsBorys Biletskyy
 
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...Sri Ambati
 
Part 3 Machine Learnning
Part 3 Machine LearnningPart 3 Machine Learnning
Part 3 Machine LearnningMohamed Essam
 
The Quest for an Open Source Data Science Platform
 The Quest for an Open Source Data Science Platform The Quest for an Open Source Data Science Platform
The Quest for an Open Source Data Science PlatformQAware GmbH
 
Robust and declarative machine learning pipelines for predictive buying at Ba...
Robust and declarative machine learning pipelines for predictive buying at Ba...Robust and declarative machine learning pipelines for predictive buying at Ba...
Robust and declarative machine learning pipelines for predictive buying at Ba...Gianmario Spacagna
 
Genetic Algorithm Projects Research Ideas
Genetic Algorithm Projects Research IdeasGenetic Algorithm Projects Research Ideas
Genetic Algorithm Projects Research IdeasMatlab Simulation
 

What's hot (16)

The Evolution of AutoML
The Evolution of AutoMLThe Evolution of AutoML
The Evolution of AutoML
 
ETL & Machine Learning
ETL & Machine LearningETL & Machine Learning
ETL & Machine Learning
 
Visualising the world of competitive programming with Python (Codeforces)
Visualising the world of competitive programming with Python (Codeforces)Visualising the world of competitive programming with Python (Codeforces)
Visualising the world of competitive programming with Python (Codeforces)
 
How is research conducted in my field
How is research conducted in my fieldHow is research conducted in my field
How is research conducted in my field
 
OpenML NeurIPS2018
OpenML NeurIPS2018OpenML NeurIPS2018
OpenML NeurIPS2018
 
H2O World - Ensembles with Erin LeDell
H2O World - Ensembles with Erin LeDellH2O World - Ensembles with Erin LeDell
H2O World - Ensembles with Erin LeDell
 
Microsoft Introduction to Automated Machine Learning
Microsoft Introduction to Automated Machine LearningMicrosoft Introduction to Automated Machine Learning
Microsoft Introduction to Automated Machine Learning
 
AzureML – zero to hero
AzureML – zero to heroAzureML – zero to hero
AzureML – zero to hero
 
AutoML - The Future of AI
AutoML - The Future of AIAutoML - The Future of AI
AutoML - The Future of AI
 
Ideas spracklen-final
Ideas spracklen-finalIdeas spracklen-final
Ideas spracklen-final
 
Pipeline oriented data analytics
Pipeline oriented data analyticsPipeline oriented data analytics
Pipeline oriented data analytics
 
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
Driver vs Driverless AI - Mark Landry, Competitive Data Scientist and Product...
 
Part 3 Machine Learnning
Part 3 Machine LearnningPart 3 Machine Learnning
Part 3 Machine Learnning
 
The Quest for an Open Source Data Science Platform
 The Quest for an Open Source Data Science Platform The Quest for an Open Source Data Science Platform
The Quest for an Open Source Data Science Platform
 
Robust and declarative machine learning pipelines for predictive buying at Ba...
Robust and declarative machine learning pipelines for predictive buying at Ba...Robust and declarative machine learning pipelines for predictive buying at Ba...
Robust and declarative machine learning pipelines for predictive buying at Ba...
 
Genetic Algorithm Projects Research Ideas
Genetic Algorithm Projects Research IdeasGenetic Algorithm Projects Research Ideas
Genetic Algorithm Projects Research Ideas
 

Viewers also liked

Qué cambiarías en la educación en
Qué cambiarías en la educación enQué cambiarías en la educación en
Qué cambiarías en la educación enMaricarmen Rodríguez
 
Animales en peligro de extinción
Animales en peligro de extinciónAnimales en peligro de extinción
Animales en peligro de extincióncawi_007_0909
 
Tbjee Syllabi 2019 - Tripura Jee
Tbjee Syllabi 2019 - Tripura Jee Tbjee Syllabi 2019 - Tripura Jee
Tbjee Syllabi 2019 - Tripura Jee Abhinandan singh
 
Diseño sistema de sonido hifi con reduccion de sonido
Diseño sistema de sonido hifi con reduccion de sonidoDiseño sistema de sonido hifi con reduccion de sonido
Diseño sistema de sonido hifi con reduccion de sonidoBrenda Reina
 
El Correo Electronico
El Correo ElectronicoEl Correo Electronico
El Correo ElectronicoRomer Crespo
 
Platanitoren besoa
Platanitoren besoaPlatanitoren besoa
Platanitoren besoajokinuki
 
Pga 15 16 ceip san agustín definitiva
Pga 15 16 ceip san agustín definitivaPga 15 16 ceip san agustín definitiva
Pga 15 16 ceip san agustín definitiva02001433
 
SAP HANA SPS10- Predictive Analysis Library and Application Function Modeler
SAP HANA SPS10- Predictive Analysis Library and Application Function ModelerSAP HANA SPS10- Predictive Analysis Library and Application Function Modeler
SAP HANA SPS10- Predictive Analysis Library and Application Function ModelerSAP Technology
 
What's New in SAP HANA SPS 11 Predictive
What's New in SAP HANA SPS 11 PredictiveWhat's New in SAP HANA SPS 11 Predictive
What's New in SAP HANA SPS 11 PredictiveSAP Technology
 
Sap Executive Keynote Dr. Wieland Schreiner, EVP - SAP AG
Sap Executive Keynote   Dr. Wieland Schreiner, EVP - SAP AGSap Executive Keynote   Dr. Wieland Schreiner, EVP - SAP AG
Sap Executive Keynote Dr. Wieland Schreiner, EVP - SAP AGINDUSCommunity
 
Machine Learning, hype or hit?
Machine Learning, hype or hit?Machine Learning, hype or hit?
Machine Learning, hype or hit?fredverheul
 
SAP Marketing Runs Hybris Marketing By Andreas Starke
SAP Marketing Runs Hybris Marketing By Andreas StarkeSAP Marketing Runs Hybris Marketing By Andreas Starke
SAP Marketing Runs Hybris Marketing By Andreas StarkeMarTech Conference
 
Programa anual 2016_pfrh
Programa anual 2016_pfrhPrograma anual 2016_pfrh
Programa anual 2016_pfrhNancy Ale Tapia
 
Real-Time Supply Chain Analytics with Machine Learning, Kafka, and Spark
Real-Time Supply Chain Analytics with Machine Learning, Kafka, and SparkReal-Time Supply Chain Analytics with Machine Learning, Kafka, and Spark
Real-Time Supply Chain Analytics with Machine Learning, Kafka, and SparkSingleStore
 
#asksap Analytics Innovations Community Call - Take Action in 2017 with Innov...
#asksap Analytics Innovations Community Call - Take Action in 2017 with Innov...#asksap Analytics Innovations Community Call - Take Action in 2017 with Innov...
#asksap Analytics Innovations Community Call - Take Action in 2017 with Innov...SAP Analytics
 
Big Data Analytics for the Industrial Internet of Things
Big Data Analytics for the Industrial Internet of ThingsBig Data Analytics for the Industrial Internet of Things
Big Data Analytics for the Industrial Internet of ThingsAnthony Chen
 

Viewers also liked (18)

Pasifloras
PasiflorasPasifloras
Pasifloras
 
Qué cambiarías en la educación en
Qué cambiarías en la educación enQué cambiarías en la educación en
Qué cambiarías en la educación en
 
Animales en peligro de extinción
Animales en peligro de extinciónAnimales en peligro de extinción
Animales en peligro de extinción
 
Tbjee Syllabi 2019 - Tripura Jee
Tbjee Syllabi 2019 - Tripura Jee Tbjee Syllabi 2019 - Tripura Jee
Tbjee Syllabi 2019 - Tripura Jee
 
Diseño sistema de sonido hifi con reduccion de sonido
Diseño sistema de sonido hifi con reduccion de sonidoDiseño sistema de sonido hifi con reduccion de sonido
Diseño sistema de sonido hifi con reduccion de sonido
 
El Correo Electronico
El Correo ElectronicoEl Correo Electronico
El Correo Electronico
 
Platanitoren besoa
Platanitoren besoaPlatanitoren besoa
Platanitoren besoa
 
Pga 15 16 ceip san agustín definitiva
Pga 15 16 ceip san agustín definitivaPga 15 16 ceip san agustín definitiva
Pga 15 16 ceip san agustín definitiva
 
SAP HANA SPS10- Predictive Analysis Library and Application Function Modeler
SAP HANA SPS10- Predictive Analysis Library and Application Function ModelerSAP HANA SPS10- Predictive Analysis Library and Application Function Modeler
SAP HANA SPS10- Predictive Analysis Library and Application Function Modeler
 
What's New in SAP HANA SPS 11 Predictive
What's New in SAP HANA SPS 11 PredictiveWhat's New in SAP HANA SPS 11 Predictive
What's New in SAP HANA SPS 11 Predictive
 
Brood en vis
Brood en visBrood en vis
Brood en vis
 
Sap Executive Keynote Dr. Wieland Schreiner, EVP - SAP AG
Sap Executive Keynote   Dr. Wieland Schreiner, EVP - SAP AGSap Executive Keynote   Dr. Wieland Schreiner, EVP - SAP AG
Sap Executive Keynote Dr. Wieland Schreiner, EVP - SAP AG
 
Machine Learning, hype or hit?
Machine Learning, hype or hit?Machine Learning, hype or hit?
Machine Learning, hype or hit?
 
SAP Marketing Runs Hybris Marketing By Andreas Starke
SAP Marketing Runs Hybris Marketing By Andreas StarkeSAP Marketing Runs Hybris Marketing By Andreas Starke
SAP Marketing Runs Hybris Marketing By Andreas Starke
 
Programa anual 2016_pfrh
Programa anual 2016_pfrhPrograma anual 2016_pfrh
Programa anual 2016_pfrh
 
Real-Time Supply Chain Analytics with Machine Learning, Kafka, and Spark
Real-Time Supply Chain Analytics with Machine Learning, Kafka, and SparkReal-Time Supply Chain Analytics with Machine Learning, Kafka, and Spark
Real-Time Supply Chain Analytics with Machine Learning, Kafka, and Spark
 
#asksap Analytics Innovations Community Call - Take Action in 2017 with Innov...
#asksap Analytics Innovations Community Call - Take Action in 2017 with Innov...#asksap Analytics Innovations Community Call - Take Action in 2017 with Innov...
#asksap Analytics Innovations Community Call - Take Action in 2017 with Innov...
 
Big Data Analytics for the Industrial Internet of Things
Big Data Analytics for the Industrial Internet of ThingsBig Data Analytics for the Industrial Internet of Things
Big Data Analytics for the Industrial Internet of Things
 

Similar to Machine learning 101 sit hvr

Data analytcis-first-steps
Data analytcis-first-stepsData analytcis-first-steps
Data analytcis-first-stepsShesha R
 
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...Intel® Software
 
1. Demystifying ML.pdf
1. Demystifying ML.pdf1. Demystifying ML.pdf
1. Demystifying ML.pdfJyoti Yadav
 
Machine Learning in NutShell
Machine Learning in NutShellMachine Learning in NutShell
Machine Learning in NutShellAshwin Shiv
 
artificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdfartificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdftt4765690
 
Machine Learning Contents.pptx
Machine Learning Contents.pptxMachine Learning Contents.pptx
Machine Learning Contents.pptxNaveenkushwaha18
 
Machine Learning for automated diagnosis of distributed ...AE
Machine Learning for automated diagnosis of distributed ...AEMachine Learning for automated diagnosis of distributed ...AE
Machine Learning for automated diagnosis of distributed ...AEbutest
 
Machine Learning - Lecture2.pptx
Machine Learning - Lecture2.pptxMachine Learning - Lecture2.pptx
Machine Learning - Lecture2.pptxNsitTech
 
It’s all about me_ From big data models to personalized experience Presentation
It’s all about me_ From big data models to personalized experience PresentationIt’s all about me_ From big data models to personalized experience Presentation
It’s all about me_ From big data models to personalized experience PresentationYao H. Morin, Ph.D.
 
MachineLearning Seminar PPT.pptx
MachineLearning Seminar PPT.pptxMachineLearning Seminar PPT.pptx
MachineLearning Seminar PPT.pptxAmanDixit74
 
Week 2 Sentiment Analysis Using Machine Learning
Week 2 Sentiment Analysis Using Machine Learning Week 2 Sentiment Analysis Using Machine Learning
Week 2 Sentiment Analysis Using Machine Learning SARCCOM
 
Drifting Away: Testing ML Models in Production
Drifting Away: Testing ML Models in ProductionDrifting Away: Testing ML Models in Production
Drifting Away: Testing ML Models in ProductionDatabricks
 
Barga Data Science lecture 2
Barga Data Science lecture 2Barga Data Science lecture 2
Barga Data Science lecture 2Roger Barga
 
An introduction to machine learning and statistics
An introduction to machine learning and statisticsAn introduction to machine learning and statistics
An introduction to machine learning and statisticsSpotle.ai
 
Machine Learning an Research Overview
Machine Learning an Research OverviewMachine Learning an Research Overview
Machine Learning an Research OverviewKathirvel Ayyaswamy
 

Similar to Machine learning 101 sit hvr (20)

Data analytcis-first-steps
Data analytcis-first-stepsData analytcis-first-steps
Data analytcis-first-steps
 
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...
Data Analytics, Machine Learning, and HPC in Today’s Changing Application Env...
 
Machine learning
Machine learningMachine learning
Machine learning
 
1. Demystifying ML.pdf
1. Demystifying ML.pdf1. Demystifying ML.pdf
1. Demystifying ML.pdf
 
Machine Learning in NutShell
Machine Learning in NutShellMachine Learning in NutShell
Machine Learning in NutShell
 
artificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdfartificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdf
 
Machine Learning Contents.pptx
Machine Learning Contents.pptxMachine Learning Contents.pptx
Machine Learning Contents.pptx
 
Machine Learning for automated diagnosis of distributed ...AE
Machine Learning for automated diagnosis of distributed ...AEMachine Learning for automated diagnosis of distributed ...AE
Machine Learning for automated diagnosis of distributed ...AE
 
ML_Module_1.pdf
ML_Module_1.pdfML_Module_1.pdf
ML_Module_1.pdf
 
Machine Learning - Lecture2.pptx
Machine Learning - Lecture2.pptxMachine Learning - Lecture2.pptx
Machine Learning - Lecture2.pptx
 
It’s all about me_ From big data models to personalized experience Presentation
It’s all about me_ From big data models to personalized experience PresentationIt’s all about me_ From big data models to personalized experience Presentation
It’s all about me_ From big data models to personalized experience Presentation
 
MachineLearning Seminar PPT.pptx
MachineLearning Seminar PPT.pptxMachineLearning Seminar PPT.pptx
MachineLearning Seminar PPT.pptx
 
Week 2 Sentiment Analysis Using Machine Learning
Week 2 Sentiment Analysis Using Machine Learning Week 2 Sentiment Analysis Using Machine Learning
Week 2 Sentiment Analysis Using Machine Learning
 
Machine_Learning.pptx
Machine_Learning.pptxMachine_Learning.pptx
Machine_Learning.pptx
 
Drifting Away: Testing ML Models in Production
Drifting Away: Testing ML Models in ProductionDrifting Away: Testing ML Models in Production
Drifting Away: Testing ML Models in Production
 
machine learning
machine learningmachine learning
machine learning
 
Barga Data Science lecture 2
Barga Data Science lecture 2Barga Data Science lecture 2
Barga Data Science lecture 2
 
Machine learning
Machine learning Machine learning
Machine learning
 
An introduction to machine learning and statistics
An introduction to machine learning and statisticsAn introduction to machine learning and statistics
An introduction to machine learning and statistics
 
Machine Learning an Research Overview
Machine Learning an Research OverviewMachine Learning an Research Overview
Machine Learning an Research Overview
 

Recently uploaded

SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfkalichargn70th171
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 

Recently uploaded (20)

SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
Odoo Development Company in India | Devintelle Consulting Service
Odoo Development Company in India | Devintelle Consulting ServiceOdoo Development Company in India | Devintelle Consulting Service
Odoo Development Company in India | Devintelle Consulting Service
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Advantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your BusinessAdvantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your Business
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 

Machine learning 101 sit hvr

  • 2. What we won’t cover… • Deep learning / Neural Networks • Specifics of ML-algorithms • Tools / Libraries / Code • SAP Products, like HANA / Predictive Analytics / Vora / … • Ethics, algorithmic transparency & fairness • Hardware 2
  • 5. ML in the news: Deepmind’s AlphaGo 5
  • 6. 6
  • 7. Machine Learning "Field of study that gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959) 7
  • 8. What is Machine Learning? 8 Computer Computer Traditional Programming Machine Learning Data Data Program Output Program Output
  • 9. Sweet spot for Machine Learning • It’s impossible to write down the rules in code: • Too many rules • Too many factors influencing the rules • Too finely tuned • We just don’t know the rules (image recognition) • Lots of labeled data (examples) available (e.g. historical data) 9
  • 10. Basic Machine Learning ‘workflow’ 10 Feature Vectors Training data Labels Machine Learning Algorithm Feature Vectors New data Prediction Training Phase Operational Phase Predictive Model
  • 11. Training Phase in more detail 11 Raw data Data preparation Feature Vectors Training Data Test data Model Building (by ML algorithm) Model Evaluation Predictive Model Feedback loop data cleansing data transformation normalization feature extraction aka ‘learning’
  • 12. CRISP-DM: data mining process 12 ML important ML important
  • 13. Examples of ML tasks Supervised learning Regression  target is numeric Classification  target is categorical 13 Unsupervised learning Clustering Dimensionality reduction
  • 14. Modeling: so many algorithms… 14
  • 15. ML Algorithms: by Representation Collection of candidate models/programs, aka hypothesis space 15 Decision trees Instance-based Neural networks Model ensembles
  • 16. ML Algorithms: by Evaluation Evaluation: Quality measure for a model 16 Regression Example metric: Root Mean Squared Error RMSE = Binary classification: confusion matrix Accuracy: 8 + 971 -> 97,9% Example: medical test for a disease Positive Negative P True positives TP False Negatives FN N False positives FP True Negatives TN True Class Predicted class Accuracy: Better evaluation metrics: • Precision: 8 / (8 + 19) • Recall: 8 / (8 + 2)
  • 17. Optimization: how the algorithm ‘learns’, depends on representation and evaluation ML Algorithms: by Optimization 17 Greedy Search, ex. of combinatorial optimization Gradient Descent (or in general: Convex Optimization) Linear Programming (or in general: Constrained/Nonlinear Optimization)
  • 18. Training error vs test error 18
  • 19. Data Science for Business • Focuses more on general principles than specific algorithms • Not math-heavy, does contain some math • O’Reilly link: http://shop.oreilly.com/product/063692 0028918.do • Book website: http://data-science-for- biz.com/DSB/Home.html 19
  • 20. Take-aways • Goal of ML: generalize from training data (not optimization!!) • Part of ‘Data Mining Process’, not a goal in and of itself • No magic! Just some clever algorithms… • Increasingly important non-technical aspects: • Ethics • Algorithmic transparency 20
  • 21. Thank You www.soapeople.com info@soapeople.com @SOAPEOPLE Fred Verheul Big Data Consultant +31 6 3919 2986 fred.verheul@soapeople.com @fredverheul

Editor's Notes

  1. Source for images: http://www.havlena.net/en/machine-learning/machine-learning-what-is-it-where-to-learn-about-it/
  2. Go (DeepMind’s AlphaGo). How it works: https://www.tastehit.com/blog/google-deepmind-alphago-how-it-works/ Go is very different to Chess (DeepBlue 1996). Chess works with a game tree + sophisticated evaluation function. Go is too complex, and there are no good evaluation functions, because Go positions are harder to evaluate. Enter Monte Carlo Tree Search: simulation. Exploration/exploitation trade-off! No Go-knowledge required!
  3. This diagram is attributed to Pedro Domingos who used it in his Coursera Machine Learning course in 2012.
  4. Source: https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining
  5. Sources: Regression - http://gerardnico.com/wiki/data_mining/linear_regression Classification - ?? Clustering - https://en.wikipedia.org/wiki/Cluster_analysis Dimensionality reduction: http://www.sthda.com/english/wiki/factoextra-r-package-easy-multivariate-data-analyses-and-elegant-visualization
  6. Source: http://machinelearningmastery.com/
  7. Sources: Decision Tree - https://en.wikipedia.org/wiki/Decision_tree_learning Instance-based - https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm Neural Networks - https://en.wikipedia.org/wiki/Artificial_neural_network Ensembles - https://www.analyticsvidhya.com/blog/2015/09/questions-ensemble-modeling/
  8. Sources: Greedy Search - https://en.wikipedia.org/wiki/Greedy_algorithm Gradient Descent - ?? Linear Programming - http://courses.wccnet.edu/~palay/math181/linearprogramming.htm
  9. Source: https://onlinecourses.science.psu.edu/stat857/node/160