SlideShare a Scribd company logo
1 of 46
Download to read offline
Prof. Pier Luca Lanzi
Classification
Data Mining andText Mining (UIC 583 @ Politecnico di Milano)
Prof. Pier Luca Lanzi
In classification, we have data that have been labeled according to a certain concept
(good/bad, positive/negative, etc.) identified by a supervisor who actually labeled the data
Prof. Pier Luca Lanzi
The goal of classification is to compute a model that can discriminate between examples that have
been labeled differently, e.g., the points above the planes are green and the one below are red
Prof. Pier Luca Lanzi
The model is then used to label previously unseen points.
?
?
?
?
Prof. Pier Luca Lanzi
As for regression, we have to compute a
model using known data that will perform
well on unknown data.
Prof. Pier Luca Lanzi
Prof. Pier Luca Lanzi
Contact Lenses Data 7
NoneReducedYesHypermetropePre-presbyopic
NoneNormalYesHypermetropePre-presbyopic
NoneReducedNoMyopePresbyopic
NoneNormalNoMyopePresbyopic
NoneReducedYesMyopePresbyopic
HardNormalYesMyopePresbyopic
NoneReducedNoHypermetropePresbyopic
SoftNormalNoHypermetropePresbyopic
NoneReducedYesHypermetropePresbyopic
NoneNormalYesHypermetropePresbyopic
SoftNormalNoHypermetropePre-presbyopic
NoneReducedNoHypermetropePre-presbyopic
HardNormalYesMyopePre-presbyopic
NoneReducedYesMyopePre-presbyopic
SoftNormalNoMyopePre-presbyopic
NoneReducedNoMyopePre-presbyopic
hardNormalYesHypermetropeYoung
NoneReducedYesHypermetropeYoung
SoftNormalNoHypermetropeYoung
NoneReducedNoHypermetropeYoung
HardNormalYesMyopeYoung
NoneReducedYesMyopeYoung
SoftNormalNoMyopeYoung
NoneReducedNoMyopeYoung
Recommended lensesTear production rateAstigmatismSpectacle prescriptionAge
Prof. Pier Luca Lanzi
Rows are typically called
examples or data points
Columns are typically called
attributes, variables or features
The target variable to be
predicted is usually called “class”
Prof. Pier Luca Lanzi
A Model for the Contact Lenses Data 9
If tear production rate = reduced then recommendation = none
If age = young and astigmatic = no
and tear production rate = normal then recommendation = soft
If age = pre-presbyopic and astigmatic = no
and tear production rate = normal then recommendation = soft
If age = presbyopic and spectacle prescription = myope
and astigmatic = no then recommendation = none
If spectacle prescription = hypermetrope and astigmatic = no
and tear production rate = normal then recommendation = soft
If spectacle prescription = myope and astigmatic = yes
and tear production rate = normal then recommendation = hard
If age young and astigmatic = yes
and tear production rate = normal then recommendation = hard
If age = pre-presbyopic
and spectacle prescription = hypermetrope
and astigmatic = yes then recommendation = none
If age = presbyopic and spectacle prescription = hypermetrope
and astigmatic = yes then recommendation = none
Prof. Pier Luca Lanzi
Classification
The target to predict is a label (the class variable)
(good/bad, none/soft/hard, etc.)
Prediction/Regression
The target to predict is numerical
10
Prof. Pier Luca Lanzi
classification = model building + model usage
Prof. Pier Luca Lanzi
Classification: Model Construction 14
Classification
Algorithm
IF rank = ‘professor’
OR years > 6
THEN tenured = ‘yes’
name rank years tenured
Mike Assistant Prof 3 no
Mary Assistant Prof 7 yes
Bill Professor 2 yes
Jim Associate Prof 7 yes
Dave Assistant Prof 6 no
Anne Associate Prof 3 no
Training
Data
Classifier
(Model)
Prof. Pier Luca Lanzi
Classification: Model Usage 15
tenured = yes
name rank years tenured
Tom Assistant Prof 2 no
Merlisa Associate Prof 7 no
George Professor 5 yes
Joseph Assistant Prof 7 yes
Test
Data
Classifier
(Model)
Unseen Data
Jeff, Professor, 4
Prof. Pier Luca Lanzi
Evaluating Classification Methods
• Accuracy
§Classifier accuracy in predicting the correct the class labels
• Speed
§Time to construct the model (training time)
§Time to use the model to label unseen data
• Other Criteria
§Robustness in handling noise
§Scalability
§Interpretability
§…
16
Prof. Pier Luca Lanzi
Logistic Regression
20
Prof. Pier Luca Lanzi
To build a model to label the green and red data points we can compute an hyperplane
Score(x) that separates the examples labeled green from the one labeled red
Prof. Pier Luca Lanzi
To predict the labels we check whether the value of Score(x0, x1) is
positive or negative and we label the examples accordingly
Score(x0,x1) = 0
Score(x0,x1) > 0
Score(x0,x1) < 0
Prof. Pier Luca Lanzi
Intuition Behind Linear Classifiers
(Two Classes)
• We define a score function similar to the one used in linear
regression,
• The label is determined by the sign of the score value,
23
Prof. Pier Luca Lanzi
Logistic Regression
• Well-known and widely used statistical classification method
• Instead of computing the label, it computes the probability of
assigning a class to an example, that is,
• For this purpose, logistic regression assumes that,
• By making the score computation explicit and using h to identify
all the feature transformation hj we get,
24
Prof. Pier Luca Lanzi
25
Prof. Pier Luca Lanzi
Logistic Regression
• Logistic Regression search for the weight vector that corresponds
to the highest likelihood
• For this purpose, it performs a gradient ascent on the log
likelihood function
• Which updates weight j using,
26
Prof. Pier Luca Lanzi
Classification using Logistic
Regression
• To classify an example X, we select the class Y that maximizes
the probability P(Y=y | X) or check
or
• By taking the natural logarithm of both sides, we obtain a linear
classification rule that assign the label Y=-1 if
• Y=+1 otherwise
27
Prof. Pier Luca Lanzi
Overfitting & Regularization
28
Prof. Pier Luca Lanzi
L1 & L2 Regularization
• Logistic regression can use L1 and L2 regularization to limit
overfitting and produce sparse solution
• Like it happened with regression, overfitting is often associated to
large weights so to limit overfitting we penalize large weights
Total cost = Measure of Fit - Magnitude of Coefficients
• Regularization
§L1 uses the sum of absolute values, which penalizes large
weights and at the same time promotes sparse solutions
§L2 uses the sum of squares, which penalizes large weights
29
Prof. Pier Luca Lanzi
L1 Regularization
L2 Regularization
30
Prof. Pier Luca Lanzi
If ⍺ is zero, then we have no regularization
If ⍺ tends to infinity,
the solution is a zero weight vector
⍺ must balance fit and the weight magnitude
Prof. Pier Luca Lanzi
Example
32
Prof. Pier Luca Lanzi
Run the Logistic Regression Python
notebooks provided on Beep
Prof. Pier Luca Lanzi
Safe Loans Prediction
• Given data downloaded from the Lending Club Corporation
website regarding safe and risky loans, build a model to predict
whether a loan should be given
• Model evaluation using 10-fold crossvalidation
§Simple Logistic Regression μ=0.63 σ=0.05
§Logistic Regression (L1) μ=0.63 σ=0.07
§Logistic Regression (L2) μ=0.64 σ=0.05
34
Prof. Pier Luca Lanzi
Weights of the simple logistic model, the L1 regularized model, and the L2 regularized model.
35
Prof. Pier Luca Lanzi
From now on we are going to use k-fold
crossvalidation a lot to score models
k-fold crossvalidation generates
k separate models
Which one should be deployed?
Prof. Pier Luca Lanzi
K-fold crossvalidation provides an evaluation
of model performance on unknown data
Its output it’s the evaluation, not the model!
The model should be obtained by running
the algorithm on the entire dataset!
Prof. Pier Luca Lanzi
Multiclass Classification
38
Prof. Pier Luca Lanzi
Logistic regression assumes that
there are only two class values (e.g., +1/-1)
What if we have more?
(e.g., none, soft, hard or Setosa/Versicolour/Virginica)
39
Prof. Pier Luca Lanzi
One Versus the Rest Evaluation
• For each class, it creates one classifier that predicts the target
class against all the others
• For instance, given three classes A, B, C, it computes three
models: one that predicts A against B and C, one that predicts B
against A and C, and one that predicts C against A and B
• Then, given an example, all the three classifiers are applied and
the label with the highest probability is returned
• Alternative approaches include the minimization of loss based on
the multinomial loss fit across the entire probability distribution
40
Prof. Pier Luca Lanzi
One Versus Rest multiclass model using the Iris dataset
41
Prof. Pier Luca Lanzi
Multinomial multiclass model using the Iris dataset
Prof. Pier Luca Lanzi
Run the Logistic Regression Python notebooks
on multiclass classification available on Beep
Prof. Pier Luca Lanzi
Categorical Attributes
44
Prof. Pier Luca Lanzi
So far we applied Logistic Regression
(and regression) to numerical variables
What if the variables are categorical?
Prof. Pier Luca Lanzi
The Weather Dataset
Outlook Temp Humidity Windy Play
Sunny Hot High False No
Sunny Hot	 High	 True No
Overcast	 Hot		 High False Yes
Rainy Mild High False Yes
Rainy Cool Normal False Yes
Rainy Cool Normal True No
Overcast Cool Normal True Yes
Sunny Mild High False No
Sunny Cool Normal False Yes
Rainy Mild Normal False Yes
Sunny Mild Normal True Yes
Overcast Mild High True Yes
Overcast Hot Normal False Yes
Rainy Mild High True No
46
Prof. Pier Luca Lanzi
One Hot Encoding
Map each categorical attribute with
n values into n binary 0/1 variables
Each one describing one
specific attribute values
For example, attribute Outlook is replaced by three
binary variables Sunny, Overcast, and Rainy
Prof. Pier Luca Lanzi
One Hot Encoding for the Outlook attribute.
overcast rainy sunny
0 0 1
0 0 1
1 0 0
0 1 0
0 1 0
0 1 0
1 0 0
0 0 1
0 0 1
0 1 0
0 0 1
1 0 0
1 0 0
0 1 0
Prof. Pier Luca Lanzi
Run the Logistic Regression Python notebooks
on one hot encoding available on Beep
Prof. Pier Luca Lanzi
Summary
50
Prof. Pier Luca Lanzi
Classification
• The goal of classification is to build a model of a target concept (the class
attribute) available in the data
• The target concept is typically a symbolic label (good/bad, yes/no,
none/soft/hard, etc.)
• Linear classifiers are one of the simplest of classification algorithm that
computes a model represented as a hyperplane that separates examples of the
two classes
• Logistic regression is a classification algorithm that computes the probability to
classify an example with a certain label
• It works similarly to linear regression and can be coupled with L1 and L2
regularization to limit overfitting
• It assumes that the class has only two values but it can be extended to
problems involving more than two classes
• Like linear regression it assumes that the variables are numerical, if they are
not, categorical variables must be mapped into numerical values
• One of the typical approaches is to use one-hot-encoding
51

More Related Content

What's hot

DMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethodsDMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethodsPier Luca Lanzi
 
DMTM Lecture 08 Classification rules
DMTM Lecture 08 Classification rulesDMTM Lecture 08 Classification rules
DMTM Lecture 08 Classification rulesPier Luca Lanzi
 
DMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringDMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringPier Luca Lanzi
 
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other MethodsDMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other MethodsPier Luca Lanzi
 
DMTM 2015 - 08 Representative-Based Clustering
DMTM 2015 - 08 Representative-Based ClusteringDMTM 2015 - 08 Representative-Based Clustering
DMTM 2015 - 08 Representative-Based ClusteringPier Luca Lanzi
 
DMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification ModelsDMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification ModelsPier Luca Lanzi
 
DMTM Lecture 05 Data representation
DMTM Lecture 05 Data representationDMTM Lecture 05 Data representation
DMTM Lecture 05 Data representationPier Luca Lanzi
 
DMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clusteringDMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clusteringPier Luca Lanzi
 
DMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision treesDMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision treesPier Luca Lanzi
 
DMTM 2015 - 04 Data Exploration
DMTM 2015 - 04 Data ExplorationDMTM 2015 - 04 Data Exploration
DMTM 2015 - 04 Data ExplorationPier Luca Lanzi
 
DMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical ClusteringDMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical ClusteringPier Luca Lanzi
 
DMTM 2015 - 12 Classification Rules
DMTM 2015 - 12 Classification RulesDMTM 2015 - 12 Classification Rules
DMTM 2015 - 12 Classification RulesPier Luca Lanzi
 
DMTM 2015 - 10 Introduction to Classification
DMTM 2015 - 10 Introduction to ClassificationDMTM 2015 - 10 Introduction to Classification
DMTM 2015 - 10 Introduction to ClassificationPier Luca Lanzi
 
DMTM 2015 - 03 Data Representation
DMTM 2015 - 03 Data RepresentationDMTM 2015 - 03 Data Representation
DMTM 2015 - 03 Data RepresentationPier Luca Lanzi
 
DMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph miningDMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph miningPier Luca Lanzi
 
DMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based ClusteringDMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based ClusteringPier Luca Lanzi
 
DMTM 2015 - 16 Data Preparation
DMTM 2015 - 16 Data PreparationDMTM 2015 - 16 Data Preparation
DMTM 2015 - 16 Data PreparationPier Luca Lanzi
 
DMTM Lecture 19 Data exploration
DMTM Lecture 19 Data explorationDMTM Lecture 19 Data exploration
DMTM Lecture 19 Data explorationPier Luca Lanzi
 
DMTM Lecture 17 Text mining
DMTM Lecture 17 Text miningDMTM Lecture 17 Text mining
DMTM Lecture 17 Text miningPier Luca Lanzi
 
DMTM Lecture 16 Association rules
DMTM Lecture 16 Association rulesDMTM Lecture 16 Association rules
DMTM Lecture 16 Association rulesPier Luca Lanzi
 

What's hot (20)

DMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethodsDMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethods
 
DMTM Lecture 08 Classification rules
DMTM Lecture 08 Classification rulesDMTM Lecture 08 Classification rules
DMTM Lecture 08 Classification rules
 
DMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringDMTM Lecture 11 Clustering
DMTM Lecture 11 Clustering
 
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other MethodsDMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
 
DMTM 2015 - 08 Representative-Based Clustering
DMTM 2015 - 08 Representative-Based ClusteringDMTM 2015 - 08 Representative-Based Clustering
DMTM 2015 - 08 Representative-Based Clustering
 
DMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification ModelsDMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification Models
 
DMTM Lecture 05 Data representation
DMTM Lecture 05 Data representationDMTM Lecture 05 Data representation
DMTM Lecture 05 Data representation
 
DMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clusteringDMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clustering
 
DMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision treesDMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision trees
 
DMTM 2015 - 04 Data Exploration
DMTM 2015 - 04 Data ExplorationDMTM 2015 - 04 Data Exploration
DMTM 2015 - 04 Data Exploration
 
DMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical ClusteringDMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical Clustering
 
DMTM 2015 - 12 Classification Rules
DMTM 2015 - 12 Classification RulesDMTM 2015 - 12 Classification Rules
DMTM 2015 - 12 Classification Rules
 
DMTM 2015 - 10 Introduction to Classification
DMTM 2015 - 10 Introduction to ClassificationDMTM 2015 - 10 Introduction to Classification
DMTM 2015 - 10 Introduction to Classification
 
DMTM 2015 - 03 Data Representation
DMTM 2015 - 03 Data RepresentationDMTM 2015 - 03 Data Representation
DMTM 2015 - 03 Data Representation
 
DMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph miningDMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph mining
 
DMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based ClusteringDMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based Clustering
 
DMTM 2015 - 16 Data Preparation
DMTM 2015 - 16 Data PreparationDMTM 2015 - 16 Data Preparation
DMTM 2015 - 16 Data Preparation
 
DMTM Lecture 19 Data exploration
DMTM Lecture 19 Data explorationDMTM Lecture 19 Data exploration
DMTM Lecture 19 Data exploration
 
DMTM Lecture 17 Text mining
DMTM Lecture 17 Text miningDMTM Lecture 17 Text mining
DMTM Lecture 17 Text mining
 
DMTM Lecture 16 Association rules
DMTM Lecture 16 Association rulesDMTM Lecture 16 Association rules
DMTM Lecture 16 Association rules
 

Similar to DMTM Lecture 04 Classification

GECCO 2014 - Learning Classifier System Tutorial
GECCO 2014 - Learning Classifier System TutorialGECCO 2014 - Learning Classifier System Tutorial
GECCO 2014 - Learning Classifier System TutorialPier Luca Lanzi
 
GECCO-2014 Learning Classifier Systems: A Gentle Introduction
GECCO-2014 Learning Classifier Systems: A Gentle IntroductionGECCO-2014 Learning Classifier Systems: A Gentle Introduction
GECCO-2014 Learning Classifier Systems: A Gentle IntroductionPier Luca Lanzi
 
Machine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesMachine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesPier Luca Lanzi
 
Introduction to Bayesian Analysis in Python
Introduction to Bayesian Analysis in PythonIntroduction to Bayesian Analysis in Python
Introduction to Bayesian Analysis in PythonPeadar Coyle
 
Replicable Evaluation of Recommender Systems
Replicable Evaluation of Recommender SystemsReplicable Evaluation of Recommender Systems
Replicable Evaluation of Recommender SystemsAlejandro Bellogin
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptxHadrian7
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Ian Morgan
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Bayes Nets meetup London
 
Evolving Rules to Solve Problems: The Learning Classifier Systems Way
Evolving Rules to Solve Problems: The Learning Classifier Systems WayEvolving Rules to Solve Problems: The Learning Classifier Systems Way
Evolving Rules to Solve Problems: The Learning Classifier Systems WayPier Luca Lanzi
 
Building useful models for imbalanced datasets (without resampling)
Building useful models for imbalanced datasets (without resampling)Building useful models for imbalanced datasets (without resampling)
Building useful models for imbalanced datasets (without resampling)Greg Landrum
 
AI -learning and machine learning.pptx
AI  -learning and machine learning.pptxAI  -learning and machine learning.pptx
AI -learning and machine learning.pptxGaytriDhingra1
 
ML Study Jams - Session 3.pptx
ML Study Jams - Session 3.pptxML Study Jams - Session 3.pptx
ML Study Jams - Session 3.pptxMayankChadha14
 
Intro to Deep Reinforcement Learning
Intro to Deep Reinforcement LearningIntro to Deep Reinforcement Learning
Intro to Deep Reinforcement LearningKhaled Saleh
 
EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...
EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...
EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...Yun Huang
 
HOP-Rec_RecSys18
HOP-Rec_RecSys18HOP-Rec_RecSys18
HOP-Rec_RecSys18Matt Yang
 
An introduction to reinforcement learning
An introduction to reinforcement learningAn introduction to reinforcement learning
An introduction to reinforcement learningSubrat Panda, PhD
 
anintroductiontoreinforcementlearning-180912151720.pdf
anintroductiontoreinforcementlearning-180912151720.pdfanintroductiontoreinforcementlearning-180912151720.pdf
anintroductiontoreinforcementlearning-180912151720.pdfssuseradaf5f
 

Similar to DMTM Lecture 04 Classification (20)

GECCO 2014 - Learning Classifier System Tutorial
GECCO 2014 - Learning Classifier System TutorialGECCO 2014 - Learning Classifier System Tutorial
GECCO 2014 - Learning Classifier System Tutorial
 
GECCO-2014 Learning Classifier Systems: A Gentle Introduction
GECCO-2014 Learning Classifier Systems: A Gentle IntroductionGECCO-2014 Learning Classifier Systems: A Gentle Introduction
GECCO-2014 Learning Classifier Systems: A Gentle Introduction
 
Machine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesMachine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers Ensembles
 
Introduction to Bayesian Analysis in Python
Introduction to Bayesian Analysis in PythonIntroduction to Bayesian Analysis in Python
Introduction to Bayesian Analysis in Python
 
Replicable Evaluation of Recommender Systems
Replicable Evaluation of Recommender SystemsReplicable Evaluation of Recommender Systems
Replicable Evaluation of Recommender Systems
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptx
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
 
Evolving Rules to Solve Problems: The Learning Classifier Systems Way
Evolving Rules to Solve Problems: The Learning Classifier Systems WayEvolving Rules to Solve Problems: The Learning Classifier Systems Way
Evolving Rules to Solve Problems: The Learning Classifier Systems Way
 
Building useful models for imbalanced datasets (without resampling)
Building useful models for imbalanced datasets (without resampling)Building useful models for imbalanced datasets (without resampling)
Building useful models for imbalanced datasets (without resampling)
 
Machine Learning
Machine Learning Machine Learning
Machine Learning
 
AI -learning and machine learning.pptx
AI  -learning and machine learning.pptxAI  -learning and machine learning.pptx
AI -learning and machine learning.pptx
 
ML Study Jams - Session 3.pptx
ML Study Jams - Session 3.pptxML Study Jams - Session 3.pptx
ML Study Jams - Session 3.pptx
 
Intro to Deep Reinforcement Learning
Intro to Deep Reinforcement LearningIntro to Deep Reinforcement Learning
Intro to Deep Reinforcement Learning
 
EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...
EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...
EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...
 
Hierarchical Object Detection with Deep Reinforcement Learning
Hierarchical Object Detection with Deep Reinforcement LearningHierarchical Object Detection with Deep Reinforcement Learning
Hierarchical Object Detection with Deep Reinforcement Learning
 
HOP-Rec_RecSys18
HOP-Rec_RecSys18HOP-Rec_RecSys18
HOP-Rec_RecSys18
 
An introduction to reinforcement learning
An introduction to reinforcement learningAn introduction to reinforcement learning
An introduction to reinforcement learning
 
anintroductiontoreinforcementlearning-180912151720.pdf
anintroductiontoreinforcementlearning-180912151720.pdfanintroductiontoreinforcementlearning-180912151720.pdf
anintroductiontoreinforcementlearning-180912151720.pdf
 

More from Pier Luca Lanzi

11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i Videogiochi11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i VideogiochiPier Luca Lanzi
 
Breve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei VideogiochiBreve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei VideogiochiPier Luca Lanzi
 
Global Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning WelcomeGlobal Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning WelcomePier Luca Lanzi
 
Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018Pier Luca Lanzi
 
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...Pier Luca Lanzi
 
GGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di aperturaGGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di aperturaPier Luca Lanzi
 
Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018Pier Luca Lanzi
 
DMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clusteringDMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clusteringPier Luca Lanzi
 
DMTM Lecture 01 Introduction
DMTM Lecture 01 IntroductionDMTM Lecture 01 Introduction
DMTM Lecture 01 IntroductionPier Luca Lanzi
 
DMTM Lecture 02 Data mining
DMTM Lecture 02 Data miningDMTM Lecture 02 Data mining
DMTM Lecture 02 Data miningPier Luca Lanzi
 
VDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipelineVDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipelinePier Luca Lanzi
 
VDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with UnityVDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with UnityPier Luca Lanzi
 
VDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generationVDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generationPier Luca Lanzi
 
VDP2016 - Lecture 13 Data driven game design
VDP2016 - Lecture 13 Data driven game designVDP2016 - Lecture 13 Data driven game design
VDP2016 - Lecture 13 Data driven game designPier Luca Lanzi
 

More from Pier Luca Lanzi (14)

11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i Videogiochi11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i Videogiochi
 
Breve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei VideogiochiBreve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei Videogiochi
 
Global Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning WelcomeGlobal Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning Welcome
 
Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018
 
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
 
GGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di aperturaGGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di apertura
 
Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018
 
DMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clusteringDMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clustering
 
DMTM Lecture 01 Introduction
DMTM Lecture 01 IntroductionDMTM Lecture 01 Introduction
DMTM Lecture 01 Introduction
 
DMTM Lecture 02 Data mining
DMTM Lecture 02 Data miningDMTM Lecture 02 Data mining
DMTM Lecture 02 Data mining
 
VDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipelineVDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipeline
 
VDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with UnityVDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with Unity
 
VDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generationVDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generation
 
VDP2016 - Lecture 13 Data driven game design
VDP2016 - Lecture 13 Data driven game designVDP2016 - Lecture 13 Data driven game design
VDP2016 - Lecture 13 Data driven game design
 

Recently uploaded

Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWQuiz Club NITW
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
Using Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea DevelopmentUsing Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea Developmentchesterberbo7
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataBabyAnnMotar
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseCeline George
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...Nguyen Thanh Tu Collection
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxSayali Powar
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...DhatriParmar
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research DiscourseAnita GoswamiGiri
 
Multi Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleMulti Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleCeline George
 
4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptxmary850239
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptxmary850239
 
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...DhatriParmar
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxkarenfajardo43
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQuiz Club NITW
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptxDhatriParmar
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17Celine George
 
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxDIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxMichelleTuguinay1
 

Recently uploaded (20)

Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITW
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
Using Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea DevelopmentUsing Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea Development
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped data
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 Database
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research Discourse
 
Multi Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleMulti Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP Module
 
4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx
 
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17
 
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxDIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
 

DMTM Lecture 04 Classification

  • 1. Prof. Pier Luca Lanzi Classification Data Mining andText Mining (UIC 583 @ Politecnico di Milano)
  • 2. Prof. Pier Luca Lanzi In classification, we have data that have been labeled according to a certain concept (good/bad, positive/negative, etc.) identified by a supervisor who actually labeled the data
  • 3. Prof. Pier Luca Lanzi The goal of classification is to compute a model that can discriminate between examples that have been labeled differently, e.g., the points above the planes are green and the one below are red
  • 4. Prof. Pier Luca Lanzi The model is then used to label previously unseen points. ? ? ? ?
  • 5. Prof. Pier Luca Lanzi As for regression, we have to compute a model using known data that will perform well on unknown data.
  • 7. Prof. Pier Luca Lanzi Contact Lenses Data 7 NoneReducedYesHypermetropePre-presbyopic NoneNormalYesHypermetropePre-presbyopic NoneReducedNoMyopePresbyopic NoneNormalNoMyopePresbyopic NoneReducedYesMyopePresbyopic HardNormalYesMyopePresbyopic NoneReducedNoHypermetropePresbyopic SoftNormalNoHypermetropePresbyopic NoneReducedYesHypermetropePresbyopic NoneNormalYesHypermetropePresbyopic SoftNormalNoHypermetropePre-presbyopic NoneReducedNoHypermetropePre-presbyopic HardNormalYesMyopePre-presbyopic NoneReducedYesMyopePre-presbyopic SoftNormalNoMyopePre-presbyopic NoneReducedNoMyopePre-presbyopic hardNormalYesHypermetropeYoung NoneReducedYesHypermetropeYoung SoftNormalNoHypermetropeYoung NoneReducedNoHypermetropeYoung HardNormalYesMyopeYoung NoneReducedYesMyopeYoung SoftNormalNoMyopeYoung NoneReducedNoMyopeYoung Recommended lensesTear production rateAstigmatismSpectacle prescriptionAge
  • 8. Prof. Pier Luca Lanzi Rows are typically called examples or data points Columns are typically called attributes, variables or features The target variable to be predicted is usually called “class”
  • 9. Prof. Pier Luca Lanzi A Model for the Contact Lenses Data 9 If tear production rate = reduced then recommendation = none If age = young and astigmatic = no and tear production rate = normal then recommendation = soft If age = pre-presbyopic and astigmatic = no and tear production rate = normal then recommendation = soft If age = presbyopic and spectacle prescription = myope and astigmatic = no then recommendation = none If spectacle prescription = hypermetrope and astigmatic = no and tear production rate = normal then recommendation = soft If spectacle prescription = myope and astigmatic = yes and tear production rate = normal then recommendation = hard If age young and astigmatic = yes and tear production rate = normal then recommendation = hard If age = pre-presbyopic and spectacle prescription = hypermetrope and astigmatic = yes then recommendation = none If age = presbyopic and spectacle prescription = hypermetrope and astigmatic = yes then recommendation = none
  • 10. Prof. Pier Luca Lanzi Classification The target to predict is a label (the class variable) (good/bad, none/soft/hard, etc.) Prediction/Regression The target to predict is numerical 10
  • 11. Prof. Pier Luca Lanzi classification = model building + model usage
  • 12. Prof. Pier Luca Lanzi Classification: Model Construction 14 Classification Algorithm IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’ name rank years tenured Mike Assistant Prof 3 no Mary Assistant Prof 7 yes Bill Professor 2 yes Jim Associate Prof 7 yes Dave Assistant Prof 6 no Anne Associate Prof 3 no Training Data Classifier (Model)
  • 13. Prof. Pier Luca Lanzi Classification: Model Usage 15 tenured = yes name rank years tenured Tom Assistant Prof 2 no Merlisa Associate Prof 7 no George Professor 5 yes Joseph Assistant Prof 7 yes Test Data Classifier (Model) Unseen Data Jeff, Professor, 4
  • 14. Prof. Pier Luca Lanzi Evaluating Classification Methods • Accuracy §Classifier accuracy in predicting the correct the class labels • Speed §Time to construct the model (training time) §Time to use the model to label unseen data • Other Criteria §Robustness in handling noise §Scalability §Interpretability §… 16
  • 15. Prof. Pier Luca Lanzi Logistic Regression 20
  • 16. Prof. Pier Luca Lanzi To build a model to label the green and red data points we can compute an hyperplane Score(x) that separates the examples labeled green from the one labeled red
  • 17. Prof. Pier Luca Lanzi To predict the labels we check whether the value of Score(x0, x1) is positive or negative and we label the examples accordingly Score(x0,x1) = 0 Score(x0,x1) > 0 Score(x0,x1) < 0
  • 18. Prof. Pier Luca Lanzi Intuition Behind Linear Classifiers (Two Classes) • We define a score function similar to the one used in linear regression, • The label is determined by the sign of the score value, 23
  • 19. Prof. Pier Luca Lanzi Logistic Regression • Well-known and widely used statistical classification method • Instead of computing the label, it computes the probability of assigning a class to an example, that is, • For this purpose, logistic regression assumes that, • By making the score computation explicit and using h to identify all the feature transformation hj we get, 24
  • 20. Prof. Pier Luca Lanzi 25
  • 21. Prof. Pier Luca Lanzi Logistic Regression • Logistic Regression search for the weight vector that corresponds to the highest likelihood • For this purpose, it performs a gradient ascent on the log likelihood function • Which updates weight j using, 26
  • 22. Prof. Pier Luca Lanzi Classification using Logistic Regression • To classify an example X, we select the class Y that maximizes the probability P(Y=y | X) or check or • By taking the natural logarithm of both sides, we obtain a linear classification rule that assign the label Y=-1 if • Y=+1 otherwise 27
  • 23. Prof. Pier Luca Lanzi Overfitting & Regularization 28
  • 24. Prof. Pier Luca Lanzi L1 & L2 Regularization • Logistic regression can use L1 and L2 regularization to limit overfitting and produce sparse solution • Like it happened with regression, overfitting is often associated to large weights so to limit overfitting we penalize large weights Total cost = Measure of Fit - Magnitude of Coefficients • Regularization §L1 uses the sum of absolute values, which penalizes large weights and at the same time promotes sparse solutions §L2 uses the sum of squares, which penalizes large weights 29
  • 25. Prof. Pier Luca Lanzi L1 Regularization L2 Regularization 30
  • 26. Prof. Pier Luca Lanzi If ⍺ is zero, then we have no regularization If ⍺ tends to infinity, the solution is a zero weight vector ⍺ must balance fit and the weight magnitude
  • 27. Prof. Pier Luca Lanzi Example 32
  • 28. Prof. Pier Luca Lanzi Run the Logistic Regression Python notebooks provided on Beep
  • 29. Prof. Pier Luca Lanzi Safe Loans Prediction • Given data downloaded from the Lending Club Corporation website regarding safe and risky loans, build a model to predict whether a loan should be given • Model evaluation using 10-fold crossvalidation §Simple Logistic Regression μ=0.63 σ=0.05 §Logistic Regression (L1) μ=0.63 σ=0.07 §Logistic Regression (L2) μ=0.64 σ=0.05 34
  • 30. Prof. Pier Luca Lanzi Weights of the simple logistic model, the L1 regularized model, and the L2 regularized model. 35
  • 31. Prof. Pier Luca Lanzi From now on we are going to use k-fold crossvalidation a lot to score models k-fold crossvalidation generates k separate models Which one should be deployed?
  • 32. Prof. Pier Luca Lanzi K-fold crossvalidation provides an evaluation of model performance on unknown data Its output it’s the evaluation, not the model! The model should be obtained by running the algorithm on the entire dataset!
  • 33. Prof. Pier Luca Lanzi Multiclass Classification 38
  • 34. Prof. Pier Luca Lanzi Logistic regression assumes that there are only two class values (e.g., +1/-1) What if we have more? (e.g., none, soft, hard or Setosa/Versicolour/Virginica) 39
  • 35. Prof. Pier Luca Lanzi One Versus the Rest Evaluation • For each class, it creates one classifier that predicts the target class against all the others • For instance, given three classes A, B, C, it computes three models: one that predicts A against B and C, one that predicts B against A and C, and one that predicts C against A and B • Then, given an example, all the three classifiers are applied and the label with the highest probability is returned • Alternative approaches include the minimization of loss based on the multinomial loss fit across the entire probability distribution 40
  • 36. Prof. Pier Luca Lanzi One Versus Rest multiclass model using the Iris dataset 41
  • 37. Prof. Pier Luca Lanzi Multinomial multiclass model using the Iris dataset
  • 38. Prof. Pier Luca Lanzi Run the Logistic Regression Python notebooks on multiclass classification available on Beep
  • 39. Prof. Pier Luca Lanzi Categorical Attributes 44
  • 40. Prof. Pier Luca Lanzi So far we applied Logistic Regression (and regression) to numerical variables What if the variables are categorical?
  • 41. Prof. Pier Luca Lanzi The Weather Dataset Outlook Temp Humidity Windy Play Sunny Hot High False No Sunny Hot High True No Overcast Hot High False Yes Rainy Mild High False Yes Rainy Cool Normal False Yes Rainy Cool Normal True No Overcast Cool Normal True Yes Sunny Mild High False No Sunny Cool Normal False Yes Rainy Mild Normal False Yes Sunny Mild Normal True Yes Overcast Mild High True Yes Overcast Hot Normal False Yes Rainy Mild High True No 46
  • 42. Prof. Pier Luca Lanzi One Hot Encoding Map each categorical attribute with n values into n binary 0/1 variables Each one describing one specific attribute values For example, attribute Outlook is replaced by three binary variables Sunny, Overcast, and Rainy
  • 43. Prof. Pier Luca Lanzi One Hot Encoding for the Outlook attribute. overcast rainy sunny 0 0 1 0 0 1 1 0 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 1 0 0 1 0 1 0 0 0 1 1 0 0 1 0 0 0 1 0
  • 44. Prof. Pier Luca Lanzi Run the Logistic Regression Python notebooks on one hot encoding available on Beep
  • 45. Prof. Pier Luca Lanzi Summary 50
  • 46. Prof. Pier Luca Lanzi Classification • The goal of classification is to build a model of a target concept (the class attribute) available in the data • The target concept is typically a symbolic label (good/bad, yes/no, none/soft/hard, etc.) • Linear classifiers are one of the simplest of classification algorithm that computes a model represented as a hyperplane that separates examples of the two classes • Logistic regression is a classification algorithm that computes the probability to classify an example with a certain label • It works similarly to linear regression and can be coupled with L1 and L2 regularization to limit overfitting • It assumes that the class has only two values but it can be extended to problems involving more than two classes • Like linear regression it assumes that the variables are numerical, if they are not, categorical variables must be mapped into numerical values • One of the typical approaches is to use one-hot-encoding 51