Machine learning 101 sit hvr


fredverheul

Basic introduction to machine learning as presented at SAP Inside Track Hannover


What we won’t cover…
• Deep learning / Neural Networks
• Specifics of ML-algorithms
• Tools / Libraries / Code
• SAP Products, like HANA / Predictive Analytics / Vora / …
• Ethics, algorithmic transparency & fairness
• Hardware

Examples, continued…
• Spam filtering
• Handwriting recognition

Machine Learning
“Field of study that gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959)

What is Machine Learning?
• Traditional Programming: Data + Program → Computer → Output
• Machine Learning: Data + Output → Computer → Program
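The contrast above can be sketched in a few lines of Python. This is an illustration only: the spam rule, the link counts and the function names are made up for the example and do not come from the slides. In the first function a person writes the rule; in the second, the rule (a threshold) is derived from data plus known outputs.

```python
# Traditional programming: a human writes the program (the rule).
def is_spam_handwritten(num_links):
    return num_links >= 3  # cut-off chosen by hand (hypothetical)


# Machine learning: data + known outputs go in, a "program" (here a threshold) comes out.
def learn_threshold(values, labels):
    """Pick the cut-off that classifies the most training examples correctly."""
    best_threshold, best_correct = None, -1
    for threshold in sorted(set(values)):
        correct = sum((v >= threshold) == label for v, label in zip(values, labels))
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


# Hypothetical training data: number of links per e-mail and whether it was spam.
link_counts = [0, 1, 1, 2, 5, 6, 7, 9]
is_spam = [False, False, False, False, True, True, True, True]

threshold = learn_threshold(link_counts, is_spam)


def is_spam_learned(num_links):
    return num_links >= threshold
```

With different training data the learned threshold changes, without anyone editing the code.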

Sweet spot for Machine Learning
• It’s impossible to write down the rules in code:
• Too many rules
• Too many factors influencing the rules
• Too finely tuned
• We just don’t know the rules (image recognition)
• Lots of labeled data (examples) available (e.g. historical data)

Basic Machine Learning ‘workflow’
• Training Phase: Training data → Feature Vectors + Labels → Machine Learning Algorithm → Predictive Model
• Operational Phase: New data → Feature Vectors → Predictive Model → Prediction
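The two phases can be sketched as two functions: one that builds a model from feature vectors and labels, and one that applies the model to new data. The nearest-centroid classifier, the data and the names `train` and `predict` are assumptions chosen for brevity, not something the slides prescribe.

```python
from collections import defaultdict
from math import dist


def train(feature_vectors, labels):
    """Training phase: return a model, here one centroid (mean vector) per label."""
    grouped = defaultdict(list)
    for vector, label in zip(feature_vectors, labels):
        grouped[label].append(vector)
    return {
        label: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(model, feature_vector):
    """Operational phase: return the label whose centroid is closest."""
    return min(model, key=lambda label: dist(model[label], feature_vector))


# Hypothetical training data: two features per example.
training_vectors = [(1.0, 1.0), (1.5, 2.0), (8.0, 9.0), (9.0, 8.0)]
training_labels = ["ham", "ham", "spam", "spam"]

model = train(training_vectors, training_labels)  # training phase
prediction = predict(model, (8.5, 8.5))           # operational phase, new data
```

The model is the only thing that travels from the training phase to the operational phase; the training data itself is no longer needed at prediction time.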

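Training Phase in more detail
• Raw data → Data preparation → Feature Vectors
• Data preparation: data cleansing, data transformation, normalization, feature extraction
• Feature Vectors → Training Data + Test data
• Training Data → Model Building (by ML algorithm), aka ‘learning’ → Predictive Model
• Test data → Model Evaluation
• Feedback loop: Model Evaluation feeds back into the earlier steps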
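A minimal sketch of the training phase, assuming a single numeric feature and a very simple threshold model. The raw values, the labels, the 25% test fraction and all function names are hypothetical. It covers normalization, the split into training and test data, model building and model evaluation.

```python
import random


def min_max_normalize(values):
    """Data preparation: rescale values to the range 0..1."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold back a fraction of them as test data."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def build_model(training_rows):
    """'Learning': place a threshold halfway between the two class means."""
    positives = [x for x, label in training_rows if label]
    negatives = [x for x, label in training_rows if not label]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2


def evaluate(threshold, test_rows):
    """Model evaluation: fraction of test rows classified correctly."""
    hits = sum((x >= threshold) == label for x, label in test_rows)
    return hits / len(test_rows)


raw = [12, 15, 11, 14, 13, 52, 55, 51, 54, 53, 16, 56]  # made-up raw data
labels = [value > 30 for value in raw]                   # made-up labels
features = min_max_normalize(raw)

training_rows, test_rows = train_test_split(list(zip(features, labels)))
model = build_model(training_rows)
accuracy = evaluate(model, test_rows)
```

The model is evaluated only on rows it has not seen during model building. If the score is unsatisfactory, the feedback loop means going back to data preparation or model building and trying again.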

CRISP-DM: data mining process
[Diagram: CRISP-DM process cycle, with two phases marked ‘ML important’]

Examples of ML tasks
Supervised learning
• Regression → target is numeric
• Classification → target is categorical
Unsupervised learning
• Clustering
• Dimensionality reduction
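As an illustration of a regression task, where the target is numeric, here is a small least-squares line fit. The house sizes and prices are invented for the example, and the closed-form fit is just one possible way to do regression.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Hypothetical data: size in square metres, price in thousands.
sizes = [50, 60, 80, 100]
prices = [150, 180, 240, 300]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 70 + intercept  # numeric prediction for an unseen size
```

A classification task would look the same from the outside, except that the predicted value is a category such as "spam" or "ham" instead of a number.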

ML Algorithms: by Representation
Collection of candidate models/programs, aka hypothesis space
• Decision trees
• Instance-based
• Neural networks
• Model ensembles
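To make one of these representations concrete, here is a sketch of an instance-based model: k-nearest neighbours. The "model" is simply the stored examples, and a prediction is a majority vote among the closest ones. The data points, the labels and the choice of k = 3 are assumptions for the example.

```python
from collections import Counter
from math import dist


def knn_predict(examples, query, k=3):
    """Return the majority label among the k stored examples closest to the query."""
    nearest = sorted(examples, key=lambda example: dist(example[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labelled examples: (feature vector, label).
examples = [
    ((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
    ((8, 8), "B"), ((9, 8), "B"), ((8, 9), "B"),
]

label_near_a = knn_predict(examples, (2, 2))
label_near_b = knn_predict(examples, (7, 7))
```

A decision tree or a neural network would represent the same kind of decision very differently, which is why the choice of representation determines which models can be learned at all.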

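ML Algorithms: by Evaluation
Evaluation: Quality measure for a model
Regression
• Example metric: Root Mean Squared Error
• $RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$, where $y_i$ is the true value and $\hat{y}_i$ the predicted value for example $i$
Binary classification: confusion matrix
Example: medical test for a disease
• True class P, predicted Positive: True positives (TP) = 8
• True class P, predicted Negative: False negatives (FN) = 2
• True class N, predicted Positive: False positives (FP) = 19
• True class N, predicted Negative: True negatives (TN) = 971
Accuracy: (8 + 971) / 1000 = 97.9%
Better evaluation metrics:
• Precision: 8 / (8 + 19)
• Recall: 8 / (8 + 2)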
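The metrics on this slide can be computed directly. The counts below are read off the slide's own formulas (TP = 8, FP = 19, FN = 2, TN = 971); the `rmse` helper and its sample values are added here for illustration only.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root Mean Squared Error between true and predicted numeric values."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Confusion-matrix counts from the medical test example.
tp, fn, fp, tn = 8, 2, 19, 971

accuracy = (tp + tn) / (tp + fn + fp + tn)  # share of all cases classified correctly
precision = tp / (tp + fp)                  # share of positive predictions that are right
recall = tp / (tp + fn)                     # share of actual positives that are found

# Made-up regression values, only to show the call.
error = rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

Accuracy comes out at 97.9% even though fewer than a third of the positive test results are correct, which is why precision and recall say more when one class is rare.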

ML Algorithms: by Optimization
Optimization: how the algorithm ‘learns’, depends on representation and evaluation
• Greedy Search, an example of combinatorial optimization
• Gradient Descent (or in general: Convex Optimization)
• Linear Programming (or in general: Constrained/Nonlinear Optimization)
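A minimal sketch of gradient descent, assuming the simplest possible setup: the representation is a line through the origin (y = w * x), the evaluation is the mean squared error, and the optimizer repeatedly steps against the gradient. The data, the learning rate and the number of steps are arbitrary choices for the example.

```python
def mse_gradient(w, xs, ys):
    """Derivative of the mean squared error of y = w * x with respect to w."""
    return 2 * sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, learning_rate=0.05, steps=100):
    """Start at w = 0 and repeatedly move a small step downhill."""
    w = 0.0
    for _ in range(steps):
        w -= learning_rate * mse_gradient(w, xs, ys)
    return w


# Hypothetical data generated by y = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w = gradient_descent(xs, ys)
```

Swapping the representation or the evaluation metric changes the gradient, and possibly the whole optimization method, which is the dependency the slide points out.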

Data Science for Business
• Focuses more on general principles than specific algorithms
• Not math-heavy, but does contain some math
• O’Reilly link: http://shop.oreilly.com/product/0636920028918.do
• Book website: http://data-science-for-biz.com/DSB/Home.html

Take-aways
• Goal of ML: generalize from training data (not optimization!!)
• Part of ‘Data Mining Process’, not a goal in and of itself
• No magic! Just some clever algorithms…
• Increasingly important non-technical aspects:
• Ethics
• Algorithmic transparency

Thank You
www.soapeople.com
info@soapeople.com
@SOAPEOPLE
Fred Verheul
Big Data Consultant
+31 6 3919 2986
fred.verheul@soapeople.com
@fredverheul


Machine learning 101 sit hvr

1. Machine Learning 101 Fred Verheul


3. Examples: Recommender systems


5. ML in the news: DeepMind’s AlphaGo


14. Modeling: so many algorithms…


18. Training error vs test error


Editor's Notes

Source for images: http://www.havlena.net/en/machine-learning/machine-learning-what-is-it-where-to-learn-about-it/
Go (DeepMind’s AlphaGo). How it works: https://www.tastehit.com/blog/google-deepmind-alphago-how-it-works/ Go is very different to Chess (DeepBlue 1996). Chess works with a game tree + sophisticated evaluation function. Go is too complex, and there are no good evaluation functions, because Go positions are harder to evaluate. Enter Monte Carlo Tree Search: simulation. Exploration/exploitation trade-off! No Go-knowledge required!
This diagram is attributed to Pedro Domingos who used it in his Coursera Machine Learning course in 2012.
Source: https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining
Sources: Regression - http://gerardnico.com/wiki/data_mining/linear_regression Classification - ?? Clustering - https://en.wikipedia.org/wiki/Cluster_analysis Dimensionality reduction: http://www.sthda.com/english/wiki/factoextra-r-package-easy-multivariate-data-analyses-and-elegant-visualization
Source: http://machinelearningmastery.com/
Sources: Decision Tree - https://en.wikipedia.org/wiki/Decision_tree_learning Instance-based - https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm Neural Networks - https://en.wikipedia.org/wiki/Artificial_neural_network Ensembles - https://www.analyticsvidhya.com/blog/2015/09/questions-ensemble-modeling/
Sources: Greedy Search - https://en.wikipedia.org/wiki/Greedy_algorithm Gradient Descent - ?? Linear Programming - http://courses.wccnet.edu/~palay/math181/linearprogramming.htm
Source: https://onlinecourses.science.psu.edu/stat857/node/160
