SlideShare a Scribd company logo
1 of 31
Download to read offline
Prof. Pier Luca Lanzi
Classification: Introduction
Data Mining andText Mining (UIC 583 @ Politecnico di Milano)
Prof. Pier Luca Lanzi
What is An Apple? 2
Prof. Pier Luca Lanzi
Are These Apples?
Prof. Pier Luca Lanzi
Prof. Pier Luca Lanzi
Contact Lenses Data 5
NoneReducedYesHypermetropePre-presbyopic
NoneNormalYesHypermetropePre-presbyopic
NoneReducedNoMyopePresbyopic
NoneNormalNoMyopePresbyopic
NoneReducedYesMyopePresbyopic
HardNormalYesMyopePresbyopic
NoneReducedNoHypermetropePresbyopic
SoftNormalNoHypermetropePresbyopic
NoneReducedYesHypermetropePresbyopic
NoneNormalYesHypermetropePresbyopic
SoftNormalNoHypermetropePre-presbyopic
NoneReducedNoHypermetropePre-presbyopic
HardNormalYesMyopePre-presbyopic
NoneReducedYesMyopePre-presbyopic
SoftNormalNoMyopePre-presbyopic
NoneReducedNoMyopePre-presbyopic
hardNormalYesHypermetropeYoung
NoneReducedYesHypermetropeYoung
SoftNormalNoHypermetropeYoung
NoneReducedNoHypermetropeYoung
HardNormalYesMyopeYoung
NoneReducedYesMyopeYoung
SoftNormalNoMyopeYoung
NoneReducedNoMyopeYoung
Recommended lensesTear production rateAstigmatismSpectacle prescriptionAge
Prof. Pier Luca Lanzi
A Model for the Contact Lenses Data 6
If tear production rate = reduced then recommendation = none
If age = young and astigmatic = no
and tear production rate = normal then recommendation = soft
If age = pre-presbyopic and astigmatic = no
and tear production rate = normal then recommendation = soft
If age = presbyopic and spectacle prescription = myope
and astigmatic = no then recommendation = none
If spectacle prescription = hypermetrope and astigmatic = no
and tear production rate = normal then recommendation = soft
If spectacle prescription = myope and astigmatic = yes
and tear production rate = normal then recommendation = hard
If age young and astigmatic = yes
and tear production rate = normal then recommendation = hard
If age = pre-presbyopic
and spectacle prescription = hypermetrope
and astigmatic = yes then recommendation = none
If age = presbyopic and spectacle prescription = hypermetrope
and astigmatic = yes then recommendation = none
Prof. Pier Luca Lanzi
CPU Performance Data 7
0
0
32
128
CHMAX
0
0
8
16
CHMIN
Channels PerformanceCache
(Kb)
Main memory
(Kb)
Cycle time
(ns)
45040001000480209
67328000512480208
…
26932320008000292
19825660002561251
PRPCACHMMAXMMINMYCT
PRP = -55.9 + 0.0489 MYCT + 0.0153 MMIN + 0.0056 MMAX
+ 0.6410 CACH - 0.2700 CHMIN + 1.480 CHMAX
Prof. Pier Luca Lanzi
Classification vs. Prediction
•  Classification
§ Predicts categorical class labels (discrete or nominal)
§ Classifies data (constructs a model) based on the training set
and the values (class labels) in a classifying attribute and uses it
in classifying new data
•  Prediction
§ Models continuous-valued functions, i.e., predicts unknown or
missing values
•  Applications
§ Credit approval
§ Target marketing
§ Medical diagnosis
§ Fraud detection
8
Prof. Pier Luca Lanzi
classification = model building + model usage
Prof. Pier Luca Lanzi
What is classification?
•  Classification is a two-step Process
•  Model construction
§ Given a set of data representing examples of 
a target concept, build a model to “explain” the concept
•  Model usage
§ The classification model is used for classifying 
future or unknown cases
§ Estimate accuracy of the model
10
Prof. Pier Luca Lanzi
Classification: Model Construction 11
Classification
Algorithm
IF rank = ‘professor’
OR years  6
THEN tenured = ‘yes’
name rank years tenured
Mike Assistant Prof 3 no
Mary Assistant Prof 7 yes
Bill Professor 2 yes
Jim Associate Prof 7 yes
Dave Assistant Prof 6 no
Anne Associate Prof 3 no
Training
Data
Classifier
(Model)
Prof. Pier Luca Lanzi
Classification: Model Usage 12
tenured = yes
name rank years tenured
Tom Assistant Prof 2 no
Merlisa Associate Prof 7 no
George Professor 5 yes
Joseph Assistant Prof 7 yes
Test
Data
Classifier
(Model)
Unseen Data
Jeff, Professor, 4
Prof. Pier Luca Lanzi
Evaluating Classification Methods
•  Accuracy
§ classifier accuracy: predicting class label
§ predictor accuracy: guessing value of predicted attributes
•  Speed
§ time to construct the model (training time)
§ time to use the model (classification/prediction time)
•  Other Criteria
§ Robustness: handling noise and missing values
§ Scalability: efficiency in disk-resident databases
§ Interpretability: understanding and insight provided
§ Other measures, e.g., goodness of rules, such as decision tree size
or compactness of classification rules
13
Prof. Pier Luca Lanzi
Example
Prof. Pier Luca Lanzi
The Weather Dataset:
Building the Model
Outlook Temp Humidity Windy Play
Sunny Hot High True No
Overcast Hot High False Yes
Rainy Cool Normal False Yes
Rainy Cool Normal True No
Overcast Cool Normal True Yes
Sunny Cool Normal False Yes
Sunny Mild Normal True Yes
Overcast Mild High True Yes
Overcast Hot Normal False Yes
Rainy Mild High True No
15
•  Write one rule like “if A=v1 then X, else if A=v2 thenY, …” to
predict whether the player is going to play or not
•  A is an attribute; vi are attribute values; X,Y are class labels
Prof. Pier Luca Lanzi
The Weather Dataset: 
Testing the Model
Outlook Temp Humidity Windy Play
Sunny Hot High False No
Rainy Mild High False Yes
Sunny Mild High False No
Rainy Mild Normal False Yes
16
Prof. Pier Luca Lanzi
Examples of Models
•  if outlook = sunny then no (3 / 2) 
if outlook = overcast then yes (0 / 4) 
if outlook = rainy then yes (2 / 3) 

correct: 10 out of 14 training examples
•  if outlook = sunny then yes (1 / 2)
if outlook = overcast then yes (0 / 4) 
if outlook = rainy then no (2 / 1) 

correct: 8 out of 10 training examples
17
Prof. Pier Luca Lanzi
The Machine Learning Perspective
Prof. Pier Luca Lanzi
The Machine Learning Perspective 
•  Classification algorithms are methods of supervised Learning
•  The experience E consists of a set of examples of a target
concept that have been prepared by a supervisor
•  The task T consists of finding an hypothesis that accurately
explains the target concept
•  The performance P depends on how accurately the hypothesis h
explains the examples in E
19
Prof. Pier Luca Lanzi
The Machine Learning Perspective
•  Let us define the problem domain as the set of instance X 
(for instance, X contains different different fruits)
•  We define a concept over X as a function c which maps
elements of X into a range D or c:X→ D
•  The range D represents the type of concept analyzed
•  For instance, c: X → {isApple, notAnApple}
20
Prof. Pier Luca Lanzi
The Machine Learning Perspective
•  Experience E is a set of x,d pairs, with x∈X and d∈D.
•  The task T consists of finding an hypothesis h to explain E:
•  ∀x∈X h(x)=c(x)
•  The set H of all the possible hypotheses h that can be used to
explain c it is called the hypothesis space
•  The goodness of an hypothesis h can be evaluated as the
percentage of examples that are correctly explained by h

P(h) = | {x| x∈X e h(x)=c(x)}| / |X|
21
Prof. Pier Luca Lanzi
Examples
•  Concept Learning
when D={0,1}
•  Supervised classification 
when D consists of a finite number of labels
•  Prediction
when D is a subset of Rn
22
Prof. Pier Luca Lanzi
The Machine Learning Perspective 
on Classification
•  Supervised learning algorithms, given the examples in E, search
the hypotheses space H for the hypothesis h that best explains
the examples in E
•  Learning is viewed as a search in the hypotheses space
23
Prof. Pier Luca Lanzi
Searching for Hypotheses
•  The type of hypothesis required influences the search algorithm
•  The more complex the representation 
the more complex the search algorithm
•  Many algorithms assume that it is possible to define a partial
ordering over the hypothesis space
•  The hypothesis space can be searched using either a general to
specific or a specific-to-general strategy
24
Prof. Pier Luca Lanzi
Exploring the Hypothesis Space
•  General to Specific
§ Start with the most general hypothesis and then go on
through specialization steps
•  Specific to General
§ Start with the set of the most specific hypothesis and
then go on through generalization steps
25
Prof. Pier Luca Lanzi
Inductive Bias
•  Set of assumptions that together with the training data deductively justify the
classification assigned by the learner to future instances
•  There can be a number of hypotheses consistent with training data
•  Each learning algorithm has an inductive bias that imposes a preference on the
space of all possible hypotheses
26
Prof. Pier Luca Lanzi
Types of Inductive Bias
•  Syntactic Bias
§ Depends on the language used to represent hypotheses
•  Semantic Bias
§ Depends on the heuristics used to filter hypotheses
•  Preference Bias
§ Depends on the ability to rank and compare hypotheses
•  Restriction Bias
§ Depends on the ability to restrict the search space
27
Prof. Pier Luca Lanzi
Why Are We Looking for h?
Prof. Pier Luca Lanzi
Inductive Learning Hypothesis
•  Any hypothesis (h) found to approximate the target function (c) over a
sufficiently large set of training examples will also approximate the
target function (c) well over other unobserved examples.
•  Training
§ The hypothesis h is developed to explain the examples in ETrain
•  Testing
§ The hypothesis h is evaluated (verified) with respect to the
previously unseen examples in ETest
•  The underlying hypothesis
§ If h explains ETrain then it can also be used to explain other unseen
examples in ETest (not previously used to develop h)
29
Prof. Pier Luca Lanzi
Generalization and Overfitting
•  Generalization
§ When h explains “well” both ETrain and ETest we say that h is
general and that the method used to develop h has
adequately generalized
•  Overfitting
§ When h explains ETrain but not ETest we say that the method
used to develop h has overfitted
§ We have overfitting when the hypothesis h explains ETrain too
accurately so that h is not general enough to be applied
outside ETrain
30
Prof. Pier Luca Lanzi
What are the general issues
for classification in Machine Learning?
•  Type of training experience
§ Direct or indirect?
§ Supervised or not?
•  Type of target function and performance
•  Type of search algorithm
•  Type of representation of the solution
•  Type of inductive bias
31

More Related Content

What's hot

DMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based ClusteringDMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based ClusteringPier Luca Lanzi
 
DMTM Lecture 15 Clustering evaluation
DMTM Lecture 15 Clustering evaluationDMTM Lecture 15 Clustering evaluation
DMTM Lecture 15 Clustering evaluationPier Luca Lanzi
 
DMTM 2015 - 15 Classification Ensembles
DMTM 2015 - 15 Classification EnsemblesDMTM 2015 - 15 Classification Ensembles
DMTM 2015 - 15 Classification EnsemblesPier Luca Lanzi
 
DMTM Lecture 04 Classification
DMTM Lecture 04 ClassificationDMTM Lecture 04 Classification
DMTM Lecture 04 ClassificationPier Luca Lanzi
 
DMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringDMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringPier Luca Lanzi
 
DMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph miningDMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph miningPier Luca Lanzi
 
DMTM Lecture 13 Representative based clustering
DMTM Lecture 13 Representative based clusteringDMTM Lecture 13 Representative based clustering
DMTM Lecture 13 Representative based clusteringPier Luca Lanzi
 
DMTM Lecture 03 Regression
DMTM Lecture 03 RegressionDMTM Lecture 03 Regression
DMTM Lecture 03 RegressionPier Luca Lanzi
 
DMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethodsDMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethodsPier Luca Lanzi
 
DMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification ModelsDMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification ModelsPier Luca Lanzi
 
DMTM Lecture 10 Classification ensembles
DMTM Lecture 10 Classification ensemblesDMTM Lecture 10 Classification ensembles
DMTM Lecture 10 Classification ensemblesPier Luca Lanzi
 
DMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical ClusteringDMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical ClusteringPier Luca Lanzi
 
DMTM Lecture 05 Data representation
DMTM Lecture 05 Data representationDMTM Lecture 05 Data representation
DMTM Lecture 05 Data representationPier Luca Lanzi
 
DMTM 2015 - 08 Representative-Based Clustering
DMTM 2015 - 08 Representative-Based ClusteringDMTM 2015 - 08 Representative-Based Clustering
DMTM 2015 - 08 Representative-Based ClusteringPier Luca Lanzi
 
DMTM Lecture 06 Classification evaluation
DMTM Lecture 06 Classification evaluationDMTM Lecture 06 Classification evaluation
DMTM Lecture 06 Classification evaluationPier Luca Lanzi
 
DMTM Lecture 16 Association rules
DMTM Lecture 16 Association rulesDMTM Lecture 16 Association rules
DMTM Lecture 16 Association rulesPier Luca Lanzi
 
DMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision treesDMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision treesPier Luca Lanzi
 
DMTM Lecture 19 Data exploration
DMTM Lecture 19 Data explorationDMTM Lecture 19 Data exploration
DMTM Lecture 19 Data explorationPier Luca Lanzi
 
DMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clusteringDMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clusteringPier Luca Lanzi
 
DMTM Lecture 08 Classification rules
DMTM Lecture 08 Classification rulesDMTM Lecture 08 Classification rules
DMTM Lecture 08 Classification rulesPier Luca Lanzi
 

What's hot (20)

DMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based ClusteringDMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based Clustering
 
DMTM Lecture 15 Clustering evaluation
DMTM Lecture 15 Clustering evaluationDMTM Lecture 15 Clustering evaluation
DMTM Lecture 15 Clustering evaluation
 
DMTM 2015 - 15 Classification Ensembles
DMTM 2015 - 15 Classification EnsemblesDMTM 2015 - 15 Classification Ensembles
DMTM 2015 - 15 Classification Ensembles
 
DMTM Lecture 04 Classification
DMTM Lecture 04 ClassificationDMTM Lecture 04 Classification
DMTM Lecture 04 Classification
 
DMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringDMTM Lecture 11 Clustering
DMTM Lecture 11 Clustering
 
DMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph miningDMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph mining
 
DMTM Lecture 13 Representative based clustering
DMTM Lecture 13 Representative based clusteringDMTM Lecture 13 Representative based clustering
DMTM Lecture 13 Representative based clustering
 
DMTM Lecture 03 Regression
DMTM Lecture 03 RegressionDMTM Lecture 03 Regression
DMTM Lecture 03 Regression
 
DMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethodsDMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethods
 
DMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification ModelsDMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification Models
 
DMTM Lecture 10 Classification ensembles
DMTM Lecture 10 Classification ensemblesDMTM Lecture 10 Classification ensembles
DMTM Lecture 10 Classification ensembles
 
DMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical ClusteringDMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical Clustering
 
DMTM Lecture 05 Data representation
DMTM Lecture 05 Data representationDMTM Lecture 05 Data representation
DMTM Lecture 05 Data representation
 
DMTM 2015 - 08 Representative-Based Clustering
DMTM 2015 - 08 Representative-Based ClusteringDMTM 2015 - 08 Representative-Based Clustering
DMTM 2015 - 08 Representative-Based Clustering
 
DMTM Lecture 06 Classification evaluation
DMTM Lecture 06 Classification evaluationDMTM Lecture 06 Classification evaluation
DMTM Lecture 06 Classification evaluation
 
DMTM Lecture 16 Association rules
DMTM Lecture 16 Association rulesDMTM Lecture 16 Association rules
DMTM Lecture 16 Association rules
 
DMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision treesDMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision trees
 
DMTM Lecture 19 Data exploration
DMTM Lecture 19 Data explorationDMTM Lecture 19 Data exploration
DMTM Lecture 19 Data exploration
 
DMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clusteringDMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clustering
 
DMTM Lecture 08 Classification rules
DMTM Lecture 08 Classification rulesDMTM Lecture 08 Classification rules
DMTM Lecture 08 Classification rules
 

Viewers also liked

DMTM 2015 - 11 Decision Trees
DMTM 2015 - 11 Decision TreesDMTM 2015 - 11 Decision Trees
DMTM 2015 - 11 Decision TreesPier Luca Lanzi
 
DMTM 2015 - 12 Classification Rules
DMTM 2015 - 12 Classification RulesDMTM 2015 - 12 Classification Rules
DMTM 2015 - 12 Classification RulesPier Luca Lanzi
 
DMTM 2015 - 19 Graph Mining
DMTM 2015 - 19 Graph MiningDMTM 2015 - 19 Graph Mining
DMTM 2015 - 19 Graph MiningPier Luca Lanzi
 
Focus Junior - 14 Maggio 2016
Focus Junior - 14 Maggio 2016Focus Junior - 14 Maggio 2016
Focus Junior - 14 Maggio 2016Pier Luca Lanzi
 
DMTM 2015 - 01 Course Introduction
DMTM 2015 - 01 Course IntroductionDMTM 2015 - 01 Course Introduction
DMTM 2015 - 01 Course IntroductionPier Luca Lanzi
 
DMTM 2015 - 02 Data Mining
DMTM 2015 - 02 Data MiningDMTM 2015 - 02 Data Mining
DMTM 2015 - 02 Data MiningPier Luca Lanzi
 
DMTM 2015 - 18 Text Mining Part 2
DMTM 2015 - 18 Text Mining Part 2DMTM 2015 - 18 Text Mining Part 2
DMTM 2015 - 18 Text Mining Part 2Pier Luca Lanzi
 
DMTM 2015 - 05 Association Rules
DMTM 2015 - 05 Association RulesDMTM 2015 - 05 Association Rules
DMTM 2015 - 05 Association RulesPier Luca Lanzi
 
Machine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesMachine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesPier Luca Lanzi
 
Idea Generation and Conceptualization
Idea Generation and ConceptualizationIdea Generation and Conceptualization
Idea Generation and ConceptualizationPier Luca Lanzi
 
Introduction to Procedural Content Generation - Codemotion 29 Novembre 2014
Introduction to Procedural Content Generation - Codemotion 29 Novembre 2014Introduction to Procedural Content Generation - Codemotion 29 Novembre 2014
Introduction to Procedural Content Generation - Codemotion 29 Novembre 2014Pier Luca Lanzi
 
Working with Formal Elements
Working with Formal ElementsWorking with Formal Elements
Working with Formal ElementsPier Luca Lanzi
 
Elements for the Theory of Fun
Elements for the Theory of FunElements for the Theory of Fun
Elements for the Theory of FunPier Luca Lanzi
 
Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)Marina Santini
 
Fitness Inheritance in Evolutionary and
Fitness Inheritance in Evolutionary andFitness Inheritance in Evolutionary and
Fitness Inheritance in Evolutionary andPier Luca Lanzi
 

Viewers also liked (19)

DMTM 2015 - 11 Decision Trees
DMTM 2015 - 11 Decision TreesDMTM 2015 - 11 Decision Trees
DMTM 2015 - 11 Decision Trees
 
DMTM 2015 - 12 Classification Rules
DMTM 2015 - 12 Classification RulesDMTM 2015 - 12 Classification Rules
DMTM 2015 - 12 Classification Rules
 
Course Introduction
Course IntroductionCourse Introduction
Course Introduction
 
DMTM 2015 - 19 Graph Mining
DMTM 2015 - 19 Graph MiningDMTM 2015 - 19 Graph Mining
DMTM 2015 - 19 Graph Mining
 
Focus Junior - 14 Maggio 2016
Focus Junior - 14 Maggio 2016Focus Junior - 14 Maggio 2016
Focus Junior - 14 Maggio 2016
 
DMTM 2015 - 01 Course Introduction
DMTM 2015 - 01 Course IntroductionDMTM 2015 - 01 Course Introduction
DMTM 2015 - 01 Course Introduction
 
DMTM 2015 - 02 Data Mining
DMTM 2015 - 02 Data MiningDMTM 2015 - 02 Data Mining
DMTM 2015 - 02 Data Mining
 
Course Organization
Course OrganizationCourse Organization
Course Organization
 
DMTM 2015 - 18 Text Mining Part 2
DMTM 2015 - 18 Text Mining Part 2DMTM 2015 - 18 Text Mining Part 2
DMTM 2015 - 18 Text Mining Part 2
 
DMTM 2015 - 05 Association Rules
DMTM 2015 - 05 Association RulesDMTM 2015 - 05 Association Rules
DMTM 2015 - 05 Association Rules
 
Machine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesMachine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification Rules
 
Idea Generation and Conceptualization
Idea Generation and ConceptualizationIdea Generation and Conceptualization
Idea Generation and Conceptualization
 
Introduction to Procedural Content Generation - Codemotion 29 Novembre 2014
Introduction to Procedural Content Generation - Codemotion 29 Novembre 2014Introduction to Procedural Content Generation - Codemotion 29 Novembre 2014
Introduction to Procedural Content Generation - Codemotion 29 Novembre 2014
 
Working with Formal Elements
Working with Formal ElementsWorking with Formal Elements
Working with Formal Elements
 
The Structure of Games
The Structure of GamesThe Structure of Games
The Structure of Games
 
Elements for the Theory of Fun
Elements for the Theory of FunElements for the Theory of Fun
Elements for the Theory of Fun
 
The Design Document
The Design DocumentThe Design Document
The Design Document
 
Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)
 
Fitness Inheritance in Evolutionary and
Fitness Inheritance in Evolutionary andFitness Inheritance in Evolutionary and
Fitness Inheritance in Evolutionary and
 

Similar to DMTM 2015 - 10 Introduction to Classification

Machine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesMachine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesPier Luca Lanzi
 
AI -learning and machine learning.pptx
AI  -learning and machine learning.pptxAI  -learning and machine learning.pptx
AI -learning and machine learning.pptxGaytriDhingra1
 
The T.A.P.E. system for effective corporate training
The T.A.P.E. system for effective corporate trainingThe T.A.P.E. system for effective corporate training
The T.A.P.E. system for effective corporate trainingAndy Clark
 
Teaching Kids Programming using the Intentional Method
Teaching Kids Programming using the Intentional MethodTeaching Kids Programming using the Intentional Method
Teaching Kids Programming using the Intentional MethodLynn Langit
 
Improving Communications With Soft Skill And Dialogue Simulations
Improving Communications With Soft Skill And Dialogue SimulationsImproving Communications With Soft Skill And Dialogue Simulations
Improving Communications With Soft Skill And Dialogue SimulationsEnspire Learning
 
Learn from Example and Learn Probabilistic Model
Learn from Example and Learn Probabilistic ModelLearn from Example and Learn Probabilistic Model
Learn from Example and Learn Probabilistic ModelJunya Tanaka
 
Teacher toolkit Pycon UK Sept 2018
Teacher toolkit Pycon UK Sept 2018Teacher toolkit Pycon UK Sept 2018
Teacher toolkit Pycon UK Sept 2018Sue Sentance
 
ML-DecisionTrees.ppt
ML-DecisionTrees.pptML-DecisionTrees.ppt
ML-DecisionTrees.pptssuserec53e73
 
ML-DecisionTrees.ppt
ML-DecisionTrees.pptML-DecisionTrees.ppt
ML-DecisionTrees.pptssuserec53e73
 
ML-DecisionTrees.ppt
ML-DecisionTrees.pptML-DecisionTrees.ppt
ML-DecisionTrees.pptssuser71fb63
 
Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement Learningbutest
 
Blooms_Abridged_Revised_Taxonomy.ppt
Blooms_Abridged_Revised_Taxonomy.pptBlooms_Abridged_Revised_Taxonomy.ppt
Blooms_Abridged_Revised_Taxonomy.pptMuhammadZiaSheikh
 
ARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptxARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptxRuchitaMaaran
 

Similar to DMTM 2015 - 10 Introduction to Classification (20)

Machine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesMachine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers Ensembles
 
Machine Learning
Machine Learning Machine Learning
Machine Learning
 
AI -learning and machine learning.pptx
AI  -learning and machine learning.pptxAI  -learning and machine learning.pptx
AI -learning and machine learning.pptx
 
The T.A.P.E. system for effective corporate training
The T.A.P.E. system for effective corporate trainingThe T.A.P.E. system for effective corporate training
The T.A.P.E. system for effective corporate training
 
Lecture-2.pdf
Lecture-2.pdfLecture-2.pdf
Lecture-2.pdf
 
S10
S10S10
S10
 
S10
S10S10
S10
 
Teaching Kids Programming using the Intentional Method
Teaching Kids Programming using the Intentional MethodTeaching Kids Programming using the Intentional Method
Teaching Kids Programming using the Intentional Method
 
Improving Communications With Soft Skill And Dialogue Simulations
Improving Communications With Soft Skill And Dialogue SimulationsImproving Communications With Soft Skill And Dialogue Simulations
Improving Communications With Soft Skill And Dialogue Simulations
 
Human-Centric Machine Learning
Human-Centric Machine LearningHuman-Centric Machine Learning
Human-Centric Machine Learning
 
Learn from Example and Learn Probabilistic Model
Learn from Example and Learn Probabilistic ModelLearn from Example and Learn Probabilistic Model
Learn from Example and Learn Probabilistic Model
 
Action research workshop
Action research workshopAction research workshop
Action research workshop
 
Teacher toolkit Pycon UK Sept 2018
Teacher toolkit Pycon UK Sept 2018Teacher toolkit Pycon UK Sept 2018
Teacher toolkit Pycon UK Sept 2018
 
M18 learning
M18 learningM18 learning
M18 learning
 
ML-DecisionTrees.ppt
ML-DecisionTrees.pptML-DecisionTrees.ppt
ML-DecisionTrees.ppt
 
ML-DecisionTrees.ppt
ML-DecisionTrees.pptML-DecisionTrees.ppt
ML-DecisionTrees.ppt
 
ML-DecisionTrees.ppt
ML-DecisionTrees.pptML-DecisionTrees.ppt
ML-DecisionTrees.ppt
 
Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement Learning
 
Blooms_Abridged_Revised_Taxonomy.ppt
Blooms_Abridged_Revised_Taxonomy.pptBlooms_Abridged_Revised_Taxonomy.ppt
Blooms_Abridged_Revised_Taxonomy.ppt
 
ARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptxARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptx
 

More from Pier Luca Lanzi

11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i Videogiochi11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i VideogiochiPier Luca Lanzi
 
Breve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei VideogiochiBreve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei VideogiochiPier Luca Lanzi
 
Global Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning WelcomeGlobal Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning WelcomePier Luca Lanzi
 
Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018Pier Luca Lanzi
 
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...Pier Luca Lanzi
 
GGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di aperturaGGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di aperturaPier Luca Lanzi
 
Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018Pier Luca Lanzi
 
DMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clusteringDMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clusteringPier Luca Lanzi
 
DMTM Lecture 01 Introduction
DMTM Lecture 01 IntroductionDMTM Lecture 01 Introduction
DMTM Lecture 01 IntroductionPier Luca Lanzi
 
DMTM Lecture 02 Data mining
DMTM Lecture 02 Data miningDMTM Lecture 02 Data mining
DMTM Lecture 02 Data miningPier Luca Lanzi
 
VDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipelineVDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipelinePier Luca Lanzi
 
VDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with UnityVDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with UnityPier Luca Lanzi
 
VDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generationVDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generationPier Luca Lanzi
 

More from Pier Luca Lanzi (13)

11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i Videogiochi11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i Videogiochi
 
Breve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei VideogiochiBreve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei Videogiochi
 
Global Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning WelcomeGlobal Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning Welcome
 
Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018
 
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
 
GGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di aperturaGGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di apertura
 
Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018
 
DMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clusteringDMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clustering
 
DMTM Lecture 01 Introduction
DMTM Lecture 01 IntroductionDMTM Lecture 01 Introduction
DMTM Lecture 01 Introduction
 
DMTM Lecture 02 Data mining
DMTM Lecture 02 Data miningDMTM Lecture 02 Data mining
DMTM Lecture 02 Data mining
 
VDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipelineVDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipeline
 
VDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with UnityVDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with Unity
 
VDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generationVDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generation
 

Recently uploaded

Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxPoojaSen20
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxAshokKarra1
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxMaryGraceBautista27
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 

Recently uploaded (20)

Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptx
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 

DMTM 2015 - 10 Introduction to Classification

  • 1. Prof. Pier Luca Lanzi Classification: Introduction Data Mining andText Mining (UIC 583 @ Politecnico di Milano)
  • 2. Prof. Pier Luca Lanzi What is An Apple? 2
  • 3. Prof. Pier Luca Lanzi Are These Apples?
  • 5. Prof. Pier Luca Lanzi Contact Lenses Data 5 NoneReducedYesHypermetropePre-presbyopic NoneNormalYesHypermetropePre-presbyopic NoneReducedNoMyopePresbyopic NoneNormalNoMyopePresbyopic NoneReducedYesMyopePresbyopic HardNormalYesMyopePresbyopic NoneReducedNoHypermetropePresbyopic SoftNormalNoHypermetropePresbyopic NoneReducedYesHypermetropePresbyopic NoneNormalYesHypermetropePresbyopic SoftNormalNoHypermetropePre-presbyopic NoneReducedNoHypermetropePre-presbyopic HardNormalYesMyopePre-presbyopic NoneReducedYesMyopePre-presbyopic SoftNormalNoMyopePre-presbyopic NoneReducedNoMyopePre-presbyopic hardNormalYesHypermetropeYoung NoneReducedYesHypermetropeYoung SoftNormalNoHypermetropeYoung NoneReducedNoHypermetropeYoung HardNormalYesMyopeYoung NoneReducedYesMyopeYoung SoftNormalNoMyopeYoung NoneReducedNoMyopeYoung Recommended lensesTear production rateAstigmatismSpectacle prescriptionAge
  • 6. Prof. Pier Luca Lanzi A Model for the Contact Lenses Data 6 If tear production rate = reduced then recommendation = none If age = young and astigmatic = no and tear production rate = normal then recommendation = soft If age = pre-presbyopic and astigmatic = no and tear production rate = normal then recommendation = soft If age = presbyopic and spectacle prescription = myope and astigmatic = no then recommendation = none If spectacle prescription = hypermetrope and astigmatic = no and tear production rate = normal then recommendation = soft If spectacle prescription = myope and astigmatic = yes and tear production rate = normal then recommendation = hard If age young and astigmatic = yes and tear production rate = normal then recommendation = hard If age = pre-presbyopic and spectacle prescription = hypermetrope and astigmatic = yes then recommendation = none If age = presbyopic and spectacle prescription = hypermetrope and astigmatic = yes then recommendation = none
  • 7. Prof. Pier Luca Lanzi CPU Performance Data 7 0 0 32 128 CHMAX 0 0 8 16 CHMIN Channels PerformanceCache (Kb) Main memory (Kb) Cycle time (ns) 45040001000480209 67328000512480208 … 26932320008000292 19825660002561251 PRPCACHMMAXMMINMYCT PRP = -55.9 + 0.0489 MYCT + 0.0153 MMIN + 0.0056 MMAX + 0.6410 CACH - 0.2700 CHMIN + 1.480 CHMAX
  • 8. Prof. Pier Luca Lanzi Classification vs. Prediction •  Classification § Predicts categorical class labels (discrete or nominal) § Classifies data (constructs a model) based on the training set and the values (class labels) in a classifying attribute and uses it in classifying new data •  Prediction § Models continuous-valued functions, i.e., predicts unknown or missing values •  Applications § Credit approval § Target marketing § Medical diagnosis § Fraud detection 8
  • 9. Prof. Pier Luca Lanzi classification = model building + model usage
  • 10. Prof. Pier Luca Lanzi What is classification? •  Classification is a two-step Process •  Model construction § Given a set of data representing examples of a target concept, build a model to “explain” the concept •  Model usage § The classification model is used for classifying future or unknown cases § Estimate accuracy of the model 10
  • 11. Prof. Pier Luca Lanzi Classification: Model Construction 11 Classification Algorithm IF rank = ‘professor’ OR years 6 THEN tenured = ‘yes’ name rank years tenured Mike Assistant Prof 3 no Mary Assistant Prof 7 yes Bill Professor 2 yes Jim Associate Prof 7 yes Dave Assistant Prof 6 no Anne Associate Prof 3 no Training Data Classifier (Model)
  • 12. Prof. Pier Luca Lanzi Classification: Model Usage 12 tenured = yes name rank years tenured Tom Assistant Prof 2 no Merlisa Associate Prof 7 no George Professor 5 yes Joseph Assistant Prof 7 yes Test Data Classifier (Model) Unseen Data Jeff, Professor, 4
  • 13. Prof. Pier Luca Lanzi Evaluating Classification Methods •  Accuracy § classifier accuracy: predicting class label § predictor accuracy: guessing value of predicted attributes •  Speed § time to construct the model (training time) § time to use the model (classification/prediction time) •  Other Criteria § Robustness: handling noise and missing values § Scalability: efficiency in disk-resident databases § Interpretability: understanding and insight provided § Other measures, e.g., goodness of rules, such as decision tree size or compactness of classification rules 13
  • 14. Prof. Pier Luca Lanzi Example
  • 15. Prof. Pier Luca Lanzi The Weather Dataset: Building the Model Outlook Temp Humidity Windy Play Sunny Hot High True No Overcast Hot High False Yes Rainy Cool Normal False Yes Rainy Cool Normal True No Overcast Cool Normal True Yes Sunny Cool Normal False Yes Sunny Mild Normal True Yes Overcast Mild High True Yes Overcast Hot Normal False Yes Rainy Mild High True No 15 •  Write one rule like “if A=v1 then X, else if A=v2 thenY, …” to predict whether the player is going to play or not •  A is an attribute; vi are attribute values; X,Y are class labels
  • 16. Prof. Pier Luca Lanzi The Weather Dataset: Testing the Model Outlook Temp Humidity Windy Play Sunny Hot High False No Rainy Mild High False Yes Sunny Mild High False No Rainy Mild Normal False Yes 16
  • 17. Prof. Pier Luca Lanzi Examples of Models •  if outlook = sunny then no (3 / 2) if outlook = overcast then yes (0 / 4) if outlook = rainy then yes (2 / 3) correct: 10 out of 14 training examples •  if outlook = sunny then yes (1 / 2) if outlook = overcast then yes (0 / 4) if outlook = rainy then no (2 / 1) correct: 8 out of 10 training examples 17
  • 18. Prof. Pier Luca Lanzi The Machine Learning Perspective
  • 19. Prof. Pier Luca Lanzi The Machine Learning Perspective •  Classification algorithms are methods of supervised Learning •  The experience E consists of a set of examples of a target concept that have been prepared by a supervisor •  The task T consists of finding an hypothesis that accurately explains the target concept •  The performance P depends on how accurately the hypothesis h explains the examples in E 19
  • 20. Prof. Pier Luca Lanzi The Machine Learning Perspective •  Let us define the problem domain as the set of instance X (for instance, X contains different different fruits) •  We define a concept over X as a function c which maps elements of X into a range D or c:X→ D •  The range D represents the type of concept analyzed •  For instance, c: X → {isApple, notAnApple} 20
  • 21. Prof. Pier Luca Lanzi The Machine Learning Perspective •  Experience E is a set of x,d pairs, with x∈X and d∈D. •  The task T consists of finding an hypothesis h to explain E: •  ∀x∈X h(x)=c(x) •  The set H of all the possible hypotheses h that can be used to explain c it is called the hypothesis space •  The goodness of an hypothesis h can be evaluated as the percentage of examples that are correctly explained by h P(h) = | {x| x∈X e h(x)=c(x)}| / |X| 21
  • 22. Prof. Pier Luca Lanzi Examples •  Concept Learning when D={0,1} •  Supervised classification when D consists of a finite number of labels •  Prediction when D is a subset of Rn 22
  • 23. Prof. Pier Luca Lanzi The Machine Learning Perspective on Classification •  Supervised learning algorithms, given the examples in E, search the hypotheses space H for the hypothesis h that best explains the examples in E •  Learning is viewed as a search in the hypotheses space 23
  • 24. Prof. Pier Luca Lanzi Searching for Hypotheses •  The type of hypothesis required influences the search algorithm •  The more complex the representation the more complex the search algorithm •  Many algorithms assume that it is possible to define a partial ordering over the hypothesis space •  The hypothesis space can be searched using either a general to specific or a specific-to-general strategy 24
  • 25. Prof. Pier Luca Lanzi Exploring the Hypothesis Space •  General to Specific § Start with the most general hypothesis and then go on through specialization steps •  Specific to General § Start with the set of the most specific hypothesis and then go on through generalization steps 25
  • 26. Prof. Pier Luca Lanzi Inductive Bias •  Set of assumptions that together with the training data deductively justify the classification assigned by the learner to future instances •  There can be a number of hypotheses consistent with training data •  Each learning algorithm has an inductive bias that imposes a preference on the space of all possible hypotheses 26
  • 27. Prof. Pier Luca Lanzi Types of Inductive Bias •  Syntactic Bias § Depends on the language used to represent hypotheses •  Semantic Bias § Depends on the heuristics used to filter hypotheses •  Preference Bias § Depends on the ability to rank and compare hypotheses •  Restriction Bias § Depends on the ability to restrict the search space 27
  • 28. Prof. Pier Luca Lanzi Why Are We Looking for h?
  • 29. Prof. Pier Luca Lanzi Inductive Learning Hypothesis •  Any hypothesis (h) found to approximate the target function (c) over a sufficiently large set of training examples will also approximate the target function (c) well over other unobserved examples. •  Training § The hypothesis h is developed to explain the examples in ETrain •  Testing § The hypothesis h is evaluated (verified) with respect to the previously unseen examples in ETest •  The underlying hypothesis § If h explains ETrain then it can also be used to explain other unseen examples in ETest (not previously used to develop h) 29
  • 30. Prof. Pier Luca Lanzi Generalization and Overfitting •  Generalization § When h explains “well” both ETrain and ETest we say that h is general and that the method used to develop h has adequately generalized •  Overfitting § When h explains ETrain but not ETest we say that the method used to develop h has overfitted § We have overfitting when the hypothesis h explains ETrain too accurately so that h is not general enough to be applied outside ETrain 30
  • 31. Prof. Pier Luca Lanzi What are the general issues for classification in Machine Learning? •  Type of training experience § Direct or indirect? § Supervised or not? •  Type of target function and performance •  Type of search algorithm •  Type of representation of the solution •  Type of inductive bias 31