Prof. Pier Luca Lanzi
Classification: Rule Induction
Data Mining and Text Mining (UIC 583 @ Politecnico di Milano)
Prof. Pier Luca Lanzi
The Weather Dataset
Outlook Temp Humidity Windy Play
Sunny Hot High False No
Sunny Hot High True No
Overcast Hot High False Yes
Rainy Mild High False Yes
Rainy Cool Normal False Yes
Rainy Cool Normal True No
Overcast Cool Normal True Yes
Sunny Mild High False No
Sunny Cool Normal False Yes
Rainy Mild Normal False Yes
Sunny Mild Normal True Yes
Overcast Mild High True Yes
Overcast Hot Normal False Yes
Rainy Mild High True No
2
Prof. Pier Luca Lanzi
A Rule Set to Classify the Data
•  IF (humidity = high) and (outlook = sunny)
THEN play=no (3.0/0.0)
•  IF (outlook = rainy) and (windy = TRUE)
THEN play=no (2.0/0.0)
•  OTHERWISE play=yes (9.0/0.0)
•  Confusion Matrix
§ yes no -- classified as
§ 7 2 | yes
§ 3 2 | no
3
Prof. Pier Luca Lanzi
Let’s Check the First Rule
Outlook Temp Humidity Windy Play
Sunny Hot High False No
Sunny Hot High True No
Overcast Hot High False Yes
Rainy Mild High False Yes
Rainy Cool Normal False Yes
Rainy Cool Normal True No
Overcast Cool Normal True Yes
Sunny Mild High False No
Sunny Cool Normal False Yes
Rainy Mild Normal False Yes
Sunny Mild Normal True Yes
Overcast Mild High True Yes
Overcast Hot Normal False Yes
Rainy Mild High True No
4
IF (humidity = high) and (outlook = sunny) THEN play=no (3.0/0.0)
Prof. Pier Luca Lanzi
Then, The Second Rule
Outlook Temp Humidity Windy Play
Sunny Hot High False No
Sunny Hot High True No
Overcast Hot High False Yes
Rainy Mild High False Yes
Rainy Cool Normal False Yes
Rainy Cool Normal True No
Overcast Cool Normal True Yes
Sunny Mild High False No
Sunny Cool Normal False Yes
Rainy Mild Normal False Yes
Sunny Mild Normal True Yes
Overcast Mild High True Yes
Overcast Hot Normal False Yes
Rainy Mild High True No
5
IF (outlook = rainy) and (windy = TRUE) THEN play=no (2.0/0.0)
Prof. Pier Luca Lanzi
Finally, the Third Rule
Outlook Temp Humidity Windy Play
Sunny Hot High False No
Sunny Hot High True No
Overcast Hot High False Yes
Rainy Mild High False Yes
Rainy Cool Normal False Yes
Rainy Cool Normal True No
Overcast Cool Normal True Yes
Sunny Mild High False No
Sunny Cool Normal False Yes
Rainy Mild Normal False Yes
Sunny Mild Normal True Yes
Overcast Mild High True Yes
Overcast Hot Normal False Yes
Rainy Mild High True No
6
OTHERWISE play=yes (9.0/0.0)
Prof. Pier Luca Lanzi
A Simpler Solution
•  IF (outlook = sunny) THEN play IS no
ELSE IF (outlook = overcast) THEN play IS yes
ELSE IF (outlook = rainy) THEN play IS yes
(6/14 instances correct)
•  Confusion Matrix
§ yes no -- classified as
§ 4 5 | yes
§ 3 2 | no
7
Prof. Pier Luca Lanzi
What is a Classification Rule?
Prof. Pier Luca Lanzi
What Is a Classification Rule?
Why Rules?
•  Classification rules are IF-THEN rules
§ The IF part states a condition over the data
§ The THEN part includes a class label
•  What types of conditions?
§ Propositional, with attribute-value comparisons
§ First-order Horn clauses, with variables
•  Why rules? Because sets of IF-THEN rules are one of the most
expressive and most human-readable representations for hypotheses
9
Prof. Pier Luca Lanzi
Coverage and Accuracy
•  IF (humidity = high) and (outlook = sunny)
THEN play=no (3.0/0.0)
•  ncovers = number of examples covered by the rule
•  ncorrect = number of examples correctly classified by the rule
•  coverage(R) = ncovers / |training data set|
•  accuracy(R) = ncorrect / ncovers
10
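As a concrete illustration (not part of the original slides), here is a minimal Python sketch that computes these two quantities for the first rule on the weather dataset; the tuple encoding of the dataset is my own:

```python
# Minimal sketch (assumed encoding): coverage and accuracy of
# IF (humidity = high) and (outlook = sunny) THEN play = no
weather = [  # (outlook, temp, humidity, windy, play)
    ("sunny", "hot", "high", False, "no"),
    ("sunny", "hot", "high", True, "no"),
    ("overcast", "hot", "high", False, "yes"),
    ("rainy", "mild", "high", False, "yes"),
    ("rainy", "cool", "normal", False, "yes"),
    ("rainy", "cool", "normal", True, "no"),
    ("overcast", "cool", "normal", True, "yes"),
    ("sunny", "mild", "high", False, "no"),
    ("sunny", "cool", "normal", False, "yes"),
    ("rainy", "mild", "normal", False, "yes"),
    ("sunny", "mild", "normal", True, "yes"),
    ("overcast", "mild", "high", True, "yes"),
    ("overcast", "hot", "normal", False, "yes"),
    ("rainy", "mild", "high", True, "no"),
]

def antecedent(row):
    # IF part: humidity = high AND outlook = sunny
    return row[2] == "high" and row[0] == "sunny"

covered = [row for row in weather if antecedent(row)]
correct = [row for row in covered if row[4] == "no"]
print("coverage =", len(covered), "/", len(weather))   # 3/14
print("accuracy =", len(correct), "/", len(covered))   # 3/3
```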
Prof. Pier Luca Lanzi
Conflict Resolution
•  If more than one rule is triggered, we need conflict resolution
•  Size ordering: assign the highest priority to the triggering rule
that has the “toughest” requirement (i.e., with the most attribute
tests)
•  Class-based ordering: decreasing order of prevalence or
misclassification cost per class
•  Rule-based ordering (decision list): rules are organized into one
long priority list, according to some measure of rule quality or by
experts
11
Prof. Pier Luca Lanzi
Two Approaches for Rule Learning
•  Direct Methods
§ Directly learn the rules from the training data
•  Indirect Methods
§ Learn decision tree, then convert to rules
§ Learn neural networks, then extract rules
12
Prof. Pier Luca Lanzi
One Rule
Prof. Pier Luca Lanzi
Inferring Rudimentary Rules
•  OneRule (1R) learns a simple rule involving one attribute
§ Assumes nominal attributes
§ The rule tests all the values of one particular attribute
•  Basic version
§ One branch for each value
§ Each branch assigns most frequent class
§ Error rate: proportion of instances that don’t belong to the
majority class of their corresponding branch
§ Choose attribute with lowest error rate
§ “missing” is treated as a separate value
14
Prof. Pier Luca Lanzi
For each attribute,
  For each value of the attribute, make a rule as follows:
    count how often each class appears
    find the most frequent class
    make the rule assign that class to this attribute-value
  Calculate the error rate of the rules
Choose the rules with the smallest error rate
Pseudo-Code for OneRule 15
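A compact Python rendering of this pseudocode might look as follows; it assumes nominal attributes and rows stored as dictionaries, a layout I introduce here for illustration:

```python
from collections import Counter, defaultdict

def one_rule(data, attributes, target="play"):
    """1R sketch: for each attribute, build value -> majority-class rules,
    then keep the attribute whose rules make the fewest errors."""
    best = None
    for attr in attributes:
        # count how often each class appears for each value of attr
        by_value = defaultdict(Counter)
        for row in data:
            by_value[row[attr]][row[target]] += 1
        # the rule assigns the most frequent class to each value
        rules = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        # error rate: instances outside the majority class of their branch
        errors = sum(sum(c.values()) - max(c.values())
                     for c in by_value.values())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best  # (attribute, {value: class}, total errors)

# On the weather dataset this selects outlook
# (Sunny -> No, Overcast -> Yes, Rainy -> Yes) with 4/14 errors.
```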
Prof. Pier Luca Lanzi
Evaluating the Weather Attributes 16

Attribute   Rules             Errors   Total errors
Outlook     Sunny → No        2/5      4/14
            Overcast → Yes    0/4
            Rainy → Yes       2/5
Temp        Hot → No*         2/4      5/14
            Mild → Yes        2/6
            Cool → Yes        1/4
Humidity    High → No         3/7      4/14
            Normal → Yes      1/7
Windy       False → Yes       2/8      5/14
            True → No*        3/6
* indicates a tie

Outlook Temp Humidity Windy Play
Sunny Hot High False No
Sunny Hot High True No
Overcast Hot High False Yes
Rainy Mild High False Yes
Rainy Cool Normal False Yes
Rainy Cool Normal True No
Overcast Cool Normal True Yes
Sunny Mild High False No
Sunny Cool Normal False Yes
Rainy Mild Normal False Yes
Sunny Mild Normal True Yes
Overcast Mild High True Yes
Overcast Hot Normal False Yes
Rainy Mild High True No
Prof. Pier Luca Lanzi
OneRule and Numerical Attributes
•  Applies simple supervised discretization
•  Sort instances according to attribute’s values
•  Place breakpoints where class changes (majority class)
•  This procedure, however, is very sensitive to noise, since a single
example with an incorrect class label may produce a separate
interval, which is likely to lead to overfitting
•  In the case of temperature:
17
64 65 68 69 70 71 72 72 75 75 80 81 83 85
Yes | No | Yes Yes Yes | No No Yes | Yes Yes | No | Yes Yes | No
Prof. Pier Luca Lanzi
OneRule and Numerical Attributes
•  To limit overfitting, enforce a minimum number of instances of the
majority class per interval
•  For instance, in the case of temperature, if we set the
minimum number of majority class instances to 3, we have
18
64 65 68 69 70 71 72 72 75 75 80 81 83 85
Yes | No | Yes Yes Yes | No No Yes | Yes Yes | No | Yes Yes | No

join the intervals to get at least 3 examples of the majority class:
64 65 68 69 70 71 72 72 75 75 80 81 83 85
Yes No Yes Yes Yes | No No Yes Yes Yes | No Yes Yes No

join the adjacent intervals with the same majority class:
64 65 68 69 70 71 72 72 75 75 80 81 83 85
Yes No Yes Yes Yes No No Yes Yes Yes | No Yes Yes No
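Below is a Python sketch of this discretization under my reading of the two merging steps (the interval-closing rule and the tie-breaking convention are assumptions):

```python
from collections import Counter

def majority(interval):
    # most frequent class in an interval of (value, class) pairs
    return Counter(label for _, label in interval).most_common(1)[0][0]

def discretize_1r(pairs, min_bucket=3):
    """Close an interval once its majority class has min_bucket examples
    and the incoming example belongs to a different class; then merge
    adjacent intervals that share the same majority class."""
    intervals, current = [], []
    for value, label in pairs:
        if current:
            maj, n = Counter(l for _, l in current).most_common(1)[0]
            if n >= min_bucket and label != maj:
                intervals.append(current)
                current = []
        current.append((value, label))
    intervals.append(current)
    merged = [intervals[0]]
    for interval in intervals[1:]:
        if majority(interval) == majority(merged[-1]):
            merged[-1].extend(interval)   # same majority class: join
        else:
            merged.append(interval)
    return merged

temps = [(64, "Yes"), (65, "No"), (68, "Yes"), (69, "Yes"), (70, "Yes"),
         (71, "No"), (72, "No"), (72, "Yes"), (75, "Yes"), (75, "Yes"),
         (80, "No"), (81, "Yes"), (83, "Yes"), (85, "No")]
for interval in discretize_1r(temps):
    print(interval[0][0], "..", interval[-1][0], "->", majority(interval))
# 64 .. 75 -> Yes and 80 .. 85 -> No, i.e., a breakpoint at 77.5
# (the tie in the right interval breaks toward the first label seen)
```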
Prof. Pier Luca Lanzi
OneRule Applied to the Numerical Version of
the Weather Dataset
19

Attribute     Rules                     Errors   Total errors
Outlook       Sunny → No                2/5      4/14
              Overcast → Yes            0/4
              Rainy → Yes               2/5
Temperature   ≤ 77.5 → Yes              3/10     5/14
              > 77.5 → No*              2/4
Humidity      ≤ 82.5 → Yes              1/7      3/14
              > 82.5 and ≤ 95.5 → No    2/6
              > 95.5 → Yes              0/1
Windy         False → Yes               2/8      5/14
              True → No*                3/6
Prof. Pier Luca Lanzi
Sequential Covering
Prof. Pier Luca Lanzi
Sequential Covering Algorithms
•  Consider the set E of positive and negative examples
•  Repeat
§ Learn one rule with high accuracy, any coverage
§ Remove positive examples covered by this rule
•  Until all the positive examples are covered
21
Prof. Pier Luca Lanzi
Basic Sequential Covering Algorithm
procedure Covering(Examples, Classifier)
input: a set of positive and negative examples for class c
  // rule set is initially empty
  Classifier = {}
  while PositiveExamples(Examples) != {}
    // find the best rule possible
    Rule = FindBestRule(Examples)
    // check if we need more rules
    if Stop(Examples, Rule, Classifier) break
    // remove covered examples and update the model
    Examples = Examples \ Cover(Rule, Examples)
    Classifier = Classifier ∪ {Rule}
  endwhile
  // post-process the rules (sort them, simplify them, etc.)
  Classifier = PostProcessing(Classifier)
output: Classifier
22
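A direct Python transcription of this skeleton is sketched below; find_best_rule, covers, stop, and is_positive are deliberately left abstract, mirroring the pseudocode rather than any specific system:

```python
def sequential_covering(examples, find_best_rule, covers, stop, is_positive):
    """Skeleton of sequential covering: greedily learn one rule at a time
    and remove the examples it covers."""
    classifier = []
    while any(is_positive(e) for e in examples):
        rule = find_best_rule(examples)        # learn one rule
        if stop(examples, rule, classifier):   # no useful rule left
            break
        # remove covered examples and update the model
        examples = [e for e in examples if not covers(rule, e)]
        classifier.append(rule)
    # post-processing (sorting, simplification) omitted in this sketch
    return classifier
```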
Prof. Pier Luca Lanzi
Finding the Best Rule Possible 23
[Figure: general-to-specific beam search for one rule. The most general
rule “IF THEN Play=yes” is specialized by adding one condition at a time,
e.g., “IF Wind=No THEN Play=yes”, “IF Humidity=Normal THEN Play=yes”,
“IF Humidity=High THEN Play=yes”, “IF Wind=yes THEN Play=yes”; the most
promising candidate is specialized further, e.g., “IF Humidity=Normal AND
Wind=yes THEN Play=yes”, “IF Humidity=Normal AND Wind=No THEN Play=yes”,
“IF Humidity=Normal AND Outlook=Rainy THEN Play=yes”. Each candidate is
scored by its accuracy, e.g., P = 5/10 = 0.5, P = 6/8 = 0.75,
P = 6/7 = 0.86, P = 3/7 = 0.43.]
Prof. Pier Luca Lanzi
Another Viewpoint 24
[Figure: a two-class scatter plot of instances labeled a and b over
attributes x and y. A first split at x = 1.2 and a second split at
y = 2.6 progressively separate the instances of class a.]

If x > 1.2 then class = a
If x > 1.2 and y > 2.6 then class = a
If true then class = a
Prof. Pier Luca Lanzi
And Another Viewpoint 25
[Figure: sequential covering in four panels: (i) the original data;
(ii) step 1; (iii) step 2, where rule R1 covers part of the positive
examples; (iv) step 3, where rule R2 covers further positive examples.]
Prof. Pier Luca Lanzi
Learning Just One Rule
LearnOneRule(Attributes, Examples, k)
  init BH to the most general hypothesis
  init CH to {BH}
  while CH not empty do
    // generate NCH, the next, more specific, candidate hypotheses
    Generate the specializations NCH of the hypotheses in CH
    // check all the NCH for a hypothesis that
    // improves the performance of BH
    Update BH
    // keep only the k best candidates for the next iteration
    Update CH with the k best NCH
  endwhile
  return a rule “IF BH THEN prediction”
26
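To make the search concrete, here is a greedy sketch with beam width k = 1: candidate conditions are attribute = value tests scored by rule accuracy. All names and the dictionary layout are mine; note also that the worked example below breaks accuracy ties by coverage, which this sketch omits:

```python
def learn_one_rule(data, attributes, target_class, target="class"):
    """Greedily add the attribute=value test that most improves the
    accuracy of the rule IF conditions THEN target_class."""
    def stats(conds):
        covered = [r for r in data if all(r[a] == v for a, v in conds)]
        pos = sum(1 for r in covered if r[target] == target_class)
        return covered, (pos / len(covered) if covered else 0.0)

    conditions = []
    covered, score = stats(conditions)
    while True:
        used = dict(conditions)
        candidates = {(a, r[a]) for r in covered for a in attributes
                      if a not in used}
        best = None
        for cand in candidates:
            _, s = stats(conditions + [cand])
            if best is None or s > best[1]:
                best = (cand, s)
        if best is None or best[1] <= score:   # no improvement: stop
            break
        conditions.append(best[0])
        covered, score = stats(conditions)
    return conditions, score
```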
Prof. Pier Luca Lanzi
An Example Using Contact Lens Data 27
Age             Spectacle prescription  Astigmatism  Tear production rate  Recommended lenses
Young           Myope                   No           Reduced               None
Young           Myope                   No           Normal                Soft
Young           Myope                   Yes          Reduced               None
Young           Myope                   Yes          Normal                Hard
Young           Hypermetrope            No           Reduced               None
Young           Hypermetrope            No           Normal                Soft
Young           Hypermetrope            Yes          Reduced               None
Young           Hypermetrope            Yes          Normal                Hard
Pre-presbyopic  Myope                   No           Reduced               None
Pre-presbyopic  Myope                   No           Normal                Soft
Pre-presbyopic  Myope                   Yes          Reduced               None
Pre-presbyopic  Myope                   Yes          Normal                Hard
Pre-presbyopic  Hypermetrope            No           Reduced               None
Pre-presbyopic  Hypermetrope            No           Normal                Soft
Pre-presbyopic  Hypermetrope            Yes          Reduced               None
Pre-presbyopic  Hypermetrope            Yes          Normal                None
Presbyopic      Myope                   No           Reduced               None
Presbyopic      Myope                   No           Normal                None
Presbyopic      Myope                   Yes          Reduced               None
Presbyopic      Myope                   Yes          Normal                Hard
Presbyopic      Hypermetrope            No           Reduced               None
Presbyopic      Hypermetrope            No           Normal                Soft
Presbyopic      Hypermetrope            Yes          Reduced               None
Presbyopic      Hypermetrope            Yes          Normal                None
Prof. Pier Luca Lanzi
First Step: the Most General Rule
•  Rule we seek:

If ?
then recommendation = hard

•  Possible tests:

Age = Young                            2/8
Age = Pre-presbyopic                   1/8
Age = Presbyopic                       1/8
Spectacle prescription = Myope         3/12
Spectacle prescription = Hypermetrope  1/12
Astigmatism = no                       0/12
Astigmatism = yes                      4/12
Tear production rate = Reduced         0/12
Tear production rate = Normal          4/12

28
Prof. Pier Luca Lanzi
Adding the First Clause
•  Rule with best test added:

If astigmatism = yes
then recommendation = hard

•  Instances covered by the modified rule:

Age             Spectacle prescription  Astigmatism  Tear production rate  Recommended lenses
Young           Myope                   Yes          Reduced               None
Young           Myope                   Yes          Normal                Hard
Young           Hypermetrope            Yes          Reduced               None
Young           Hypermetrope            Yes          Normal                Hard
Pre-presbyopic  Myope                   Yes          Reduced               None
Pre-presbyopic  Myope                   Yes          Normal                Hard
Pre-presbyopic  Hypermetrope            Yes          Reduced               None
Pre-presbyopic  Hypermetrope            Yes          Normal                None
Presbyopic      Myope                   Yes          Reduced               None
Presbyopic      Myope                   Yes          Normal                Hard
Presbyopic      Hypermetrope            Yes          Reduced               None
Presbyopic      Hypermetrope            Yes          Normal                None

29
Prof. Pier Luca Lanzi
Extending the First Rule
•  Current state:

If astigmatism = yes
and ?
then recommendation = hard

•  Possible tests:

Age = Young                            2/4
Age = Pre-presbyopic                   1/4
Age = Presbyopic                       1/4
Spectacle prescription = Myope         3/6
Spectacle prescription = Hypermetrope  1/6
Tear production rate = Reduced         0/6
Tear production rate = Normal          4/6

30
Prof. Pier Luca Lanzi
The Second Rule
•  Rule with best test added:

If astigmatism = yes
and tear production rate = normal
then recommendation = hard

•  Instances covered by the modified rule:

Age             Spectacle prescription  Astigmatism  Tear production rate  Recommended lenses
Young           Myope                   Yes          Normal                Hard
Young           Hypermetrope            Yes          Normal                Hard
Pre-presbyopic  Myope                   Yes          Normal                Hard
Pre-presbyopic  Hypermetrope            Yes          Normal                None
Presbyopic      Myope                   Yes          Normal                Hard
Presbyopic      Hypermetrope            Yes          Normal                None

31
Prof. Pier Luca Lanzi
Adding the Third Clause
•  Current state:

If astigmatism = yes
and tear production rate = normal
and ?
then recommendation = hard

•  Possible tests:

Age = Young                            2/2
Age = Pre-presbyopic                   1/2
Age = Presbyopic                       1/2
Spectacle prescription = Myope         3/3
Spectacle prescription = Hypermetrope  1/3

•  Tie between the first and the fourth test:
we choose the one with greater coverage

32
Prof. Pier Luca Lanzi
The Final Result
•  Final rule:

If astigmatism = yes
and tear production rate = normal
and spectacle prescription = myope
then recommendation = hard

•  Second rule for recommending “hard lenses”
(built from the instances not covered by the first rule):

If age = young and astigmatism = yes
and tear production rate = normal
then recommendation = hard

•  These two rules cover all the “hard lenses” instances
•  The process is then repeated with the other two classes

33
Prof. Pier Luca Lanzi
Testing for the Best Rule
•  Measure 1: Accuracy (p/t)
§ t is the total number of instances covered by the rule
§ p is the number of these that are positive
§ Produces rules that do not cover negative instances,
as quickly as possible
§ May produce rules with very small coverage
(special cases or noise?)
•  Measure 2: Information gain, p (log(p/t) – log(P/T))
§ P and T are the positive and total numbers of instances before
the new condition was added
§ Information gain emphasizes positive rather than negative
instances
•  These measures interact with the pruning mechanism used
34
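A small sketch contrasting the two measures (my own formulation, using base-2 logarithms and assuming p > 0):

```python
import math

def accuracy(p, t):
    # p positives among the t instances covered by the candidate rule
    return p / t

def information_gain(p, t, P, T):
    # p * (log(p/t) - log(P/T)); P and T refer to the rule before
    # the new condition was added
    return p * (math.log2(p / t) - math.log2(P / T))

# adding a condition that shrinks coverage from 10 instances (6 positive)
# to 7 instances (6 positive):
print(accuracy(6, 7))                  # 0.857...
print(information_gain(6, 7, 6, 10))   # ~3.09 bits
```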
Prof. Pier Luca Lanzi
Eliminating Instances
•  Why do we need to eliminate instances?
§ Otherwise, the next rule is identical to the previous rule
•  Why do we remove positive instances?
§ To ensure that the next rule is different
•  Why do we remove negative instances?
§ To prevent underestimating the accuracy of the rule
§ Compare rules R2 and R3 in the following diagram
35
Prof. Pier Luca Lanzi
Eliminating Instances 36
[Figure: a two-class scatter plot (class = + and class = -) with three
rule regions R1, R2, and R3. R1 covers mostly positive examples; R2 and
R3 overlap the region already covered by R1, showing how the accuracy of
R2 and R3 would be misjudged if the instances covered by R1 were not
removed.]
Prof. Pier Luca Lanzi
Missing Values and Numeric Attributes
•  Missing values usually fail the test
•  Covering algorithm must either
§ Use other tests to separate out positive instances
§ Leave them uncovered until later in the process
•  In some cases it is better to treat “missing” as a separate value
•  Numeric attributes are treated as in decision trees
37
Prof. Pier Luca Lanzi
Stopping Criterion and Rule Pruning
•  The process usually stops when there is no significant
improvement by adding the new rule
•  Rule pruning is similar to post-pruning of decision trees
•  Reduced Error Pruning:
§ Remove one of the conjuncts in the rule
§ Compare error rate on validation set
§ If error improves, prune the conjunct
38
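A minimal sketch of reduced-error pruning for a single rule, assuming the rule is a list of conjuncts and error_rate evaluates a candidate rule on a held-out validation set (both assumptions are mine):

```python
def prune_rule(conditions, error_rate):
    """Greedily drop the conjunct whose removal most improves the
    validation error, until no removal helps."""
    best_error = error_rate(conditions)
    improved = True
    while improved and conditions:
        improved = False
        for i in range(len(conditions)):
            candidate = conditions[:i] + conditions[i + 1:]
            error = error_rate(candidate)
            if error < best_error:     # pruning this conjunct helps
                conditions, best_error = candidate, error
                improved = True
                break
    return conditions
```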
Prof. Pier Luca Lanzi
Rules vs. Trees
•  Rule sets can be more readable
•  Decision trees suffer from replicated subtrees
•  Rule sets are collections of local models, whereas trees represent
one model over the whole domain
•  The covering algorithm concentrates on one class at a time,
whereas a decision tree learner takes all classes into account
39
Prof. Pier Luca Lanzi
Mining Association Rules for Classification
Prof. Pier Luca Lanzi
Mining Association Rules for Classification
(the CBA algorithm)
•  Association rule mining assumes that the data consist of a set of
transactions. Thus, the typical tabular representation of the data used
in classification must be mapped into such a format.
•  Association rule mining is then applied to the new dataset and
the search is focused on association rules whose consequent
identifies a class label,

X ⇒ ci (where ci is a class label)

•  The association rules are pruned using the pessimistic error-based
method used in C4.5
•  Finally, the rules are sorted to build the final classifier.
41
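The mapping to transactions can be sketched in a couple of lines: every attribute = value pair becomes an item (the encoding shown is an illustrative assumption, not prescribed by CBA):

```python
# one tabular instance becomes one transaction of attribute=value items
row = {"outlook": "sunny", "humidity": "high", "windy": "FALSE", "play": "no"}
transaction = {f"{attr}={value}" for attr, value in row.items()}
# {'outlook=sunny', 'humidity=high', 'windy=FALSE', 'play=no'}
# rule mining then looks for rules of the form X => play=no or X => play=yes
```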
Prof. Pier Luca Lanzi
Indirect Methods
Prof. Pier Luca Lanzi
Indirect Methods
Rule Set
r1: (P=No, Q=No) => -
r2: (P=No, Q=Yes) => +
r3: (P=Yes, R=No) => +
r4: (P=Yes, R=Yes, Q=No) => -
r5: (P=Yes, R=Yes, Q=Yes) => +

[Figure: the equivalent decision tree. The root tests P; if P=No, a node
tests Q (No => -, Yes => +); if P=Yes, a node tests R (No => +; if R=Yes,
a further node tests Q: No => -, Yes => +).]

43
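The tree-to-rules conversion is a simple enumeration of root-to-leaf paths: each path becomes one rule. A sketch, assuming a tree encoded as nested (attribute, {value: subtree}) tuples with class labels at the leaves:

```python
def tree_to_rules(node, conditions=()):
    """Each root-to-leaf path of the tree yields one classification rule."""
    if not isinstance(node, tuple):    # leaf: emit one rule
        return [(list(conditions), node)]
    attribute, branches = node
    rules = []
    for value, child in branches.items():
        rules.extend(tree_to_rules(child, conditions + ((attribute, value),)))
    return rules

# the tree on this slide
tree = ("P", {"No": ("Q", {"No": "-", "Yes": "+"}),
              "Yes": ("R", {"No": "+",
                            "Yes": ("Q", {"No": "-", "Yes": "+"})})})
for conds, label in tree_to_rules(tree):
    print(" and ".join(f"{a}={v}" for a, v in conds), "=>", label)
# reproduces rules r1 .. r5 above
```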
Prof. Pier Luca Lanzi
Example
Prof. Pier Luca Lanzi
Prof. Pier Luca Lanzi
JRip Model
(humidity = high) and (outlook = sunny) => play=no (3.0/0.0)
(outlook = rainy) and (windy = TRUE) => play=no (2.0/0.0)
=> play=yes (9.0/0.0)
46
Prof. Pier Luca Lanzi
One Rule Model
outlook:
overcast -> yes
rainy -> yes
sunny -> no
(10/14 instances correct)
47
Prof. Pier Luca Lanzi
CBA Model
outlook=overcast => play=yes
humidity=normal windy=FALSE => play=yes
outlook=rainy windy=FALSE => play=yes
outlook=sunny humidity=high => play=no
outlook=rainy windy=TRUE => play=no
(default class is the majority class)
48
Prof. Pier Luca Lanzi
Summary
Prof. Pier Luca Lanzi
Summary
•  Advantages of Rule-Based Classifiers
§ As highly expressive as decision trees
§ Easy to interpret
§ Easy to generate
§ Can classify new instances rapidly
§ Performance comparable to decision trees
•  Two approaches: direct and indirect methods
50
Prof. Pier Luca Lanzi
Summary
•  Direct methods typically apply a sequential covering approach
§ Grow a single rule
§ Remove the instances covered by the rule
§ Prune the rule (if necessary)
§ Add the rule to the current rule set
§ Repeat
•  Other approaches exist
§ Specific-to-general exploration (RISE)
§ Post-processing of neural networks,
association rules, decision trees, etc.
51
Prof. Pier Luca Lanzi
Homework
•  Generate the rule set for the Weather dataset by repeatedly
applying the procedure to learn one rule until no improvement
can be produced or the covered examples are too few
•  Check the problems provided in the previous exams and apply
both OneRule and Sequential Covering to generate the first rule.
Then, check the result with one of the implementations available
in Weka
52
More Related Content

What's hot

DMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethodsDMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethodsPier Luca Lanzi
 
DMTM 2015 - 06 Introduction to Clustering
DMTM 2015 - 06 Introduction to ClusteringDMTM 2015 - 06 Introduction to Clustering
DMTM 2015 - 06 Introduction to ClusteringPier Luca Lanzi
 
DMTM Lecture 10 Classification ensembles
DMTM Lecture 10 Classification ensemblesDMTM Lecture 10 Classification ensembles
DMTM Lecture 10 Classification ensemblesPier Luca Lanzi
 
DMTM Lecture 15 Clustering evaluation
DMTM Lecture 15 Clustering evaluationDMTM Lecture 15 Clustering evaluation
DMTM Lecture 15 Clustering evaluationPier Luca Lanzi
 
DMTM Lecture 03 Regression
DMTM Lecture 03 RegressionDMTM Lecture 03 Regression
DMTM Lecture 03 RegressionPier Luca Lanzi
 
DMTM Lecture 20 Data preparation
DMTM Lecture 20 Data preparationDMTM Lecture 20 Data preparation
DMTM Lecture 20 Data preparationPier Luca Lanzi
 
2013-1 Machine Learning Lecture 06 - Lucila Ohno-Machado - Ensemble Methods
2013-1 Machine Learning Lecture 06 - Lucila Ohno-Machado - Ensemble Methods2013-1 Machine Learning Lecture 06 - Lucila Ohno-Machado - Ensemble Methods
2013-1 Machine Learning Lecture 06 - Lucila Ohno-Machado - Ensemble MethodsDongseo University
 
DMTM 2015 - 10 Introduction to Classification
DMTM 2015 - 10 Introduction to ClassificationDMTM 2015 - 10 Introduction to Classification
DMTM 2015 - 10 Introduction to ClassificationPier Luca Lanzi
 
DMTM 2015 - 04 Data Exploration
DMTM 2015 - 04 Data ExplorationDMTM 2015 - 04 Data Exploration
DMTM 2015 - 04 Data ExplorationPier Luca Lanzi
 
DMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clusteringDMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clusteringPier Luca Lanzi
 
H2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
H2O World - Top 10 Deep Learning Tips & Tricks - Arno CandelH2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
H2O World - Top 10 Deep Learning Tips & Tricks - Arno CandelSri Ambati
 
Ensemble Learning Featuring the Netflix Prize Competition and ...
Ensemble Learning Featuring the Netflix Prize Competition and ...Ensemble Learning Featuring the Netflix Prize Competition and ...
Ensemble Learning Featuring the Netflix Prize Competition and ...butest
 
Mixed Effects Models - Empirical Logit
Mixed Effects Models - Empirical LogitMixed Effects Models - Empirical Logit
Mixed Effects Models - Empirical LogitScott Fraundorf
 
Mixed Effects Models - Centering and Transformations
Mixed Effects Models - Centering and TransformationsMixed Effects Models - Centering and Transformations
Mixed Effects Models - Centering and TransformationsScott Fraundorf
 

What's hot (14)

DMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethodsDMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethods
 
DMTM 2015 - 06 Introduction to Clustering
DMTM 2015 - 06 Introduction to ClusteringDMTM 2015 - 06 Introduction to Clustering
DMTM 2015 - 06 Introduction to Clustering
 
DMTM Lecture 10 Classification ensembles
DMTM Lecture 10 Classification ensemblesDMTM Lecture 10 Classification ensembles
DMTM Lecture 10 Classification ensembles
 
DMTM Lecture 15 Clustering evaluation
DMTM Lecture 15 Clustering evaluationDMTM Lecture 15 Clustering evaluation
DMTM Lecture 15 Clustering evaluation
 
DMTM Lecture 03 Regression
DMTM Lecture 03 RegressionDMTM Lecture 03 Regression
DMTM Lecture 03 Regression
 
DMTM Lecture 20 Data preparation
DMTM Lecture 20 Data preparationDMTM Lecture 20 Data preparation
DMTM Lecture 20 Data preparation
 
2013-1 Machine Learning Lecture 06 - Lucila Ohno-Machado - Ensemble Methods
2013-1 Machine Learning Lecture 06 - Lucila Ohno-Machado - Ensemble Methods2013-1 Machine Learning Lecture 06 - Lucila Ohno-Machado - Ensemble Methods
2013-1 Machine Learning Lecture 06 - Lucila Ohno-Machado - Ensemble Methods
 
DMTM 2015 - 10 Introduction to Classification
DMTM 2015 - 10 Introduction to ClassificationDMTM 2015 - 10 Introduction to Classification
DMTM 2015 - 10 Introduction to Classification
 
DMTM 2015 - 04 Data Exploration
DMTM 2015 - 04 Data ExplorationDMTM 2015 - 04 Data Exploration
DMTM 2015 - 04 Data Exploration
 
DMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clusteringDMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clustering
 
H2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
H2O World - Top 10 Deep Learning Tips & Tricks - Arno CandelH2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
H2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
 
Ensemble Learning Featuring the Netflix Prize Competition and ...
Ensemble Learning Featuring the Netflix Prize Competition and ...Ensemble Learning Featuring the Netflix Prize Competition and ...
Ensemble Learning Featuring the Netflix Prize Competition and ...
 
Mixed Effects Models - Empirical Logit
Mixed Effects Models - Empirical LogitMixed Effects Models - Empirical Logit
Mixed Effects Models - Empirical Logit
 
Mixed Effects Models - Centering and Transformations
Mixed Effects Models - Centering and TransformationsMixed Effects Models - Centering and Transformations
Mixed Effects Models - Centering and Transformations
 

Viewers also liked

Focus Junior - 14 Maggio 2016
Focus Junior - 14 Maggio 2016Focus Junior - 14 Maggio 2016
Focus Junior - 14 Maggio 2016Pier Luca Lanzi
 
DMTM 2015 - 17 Text Mining Part 1
DMTM 2015 - 17 Text Mining Part 1DMTM 2015 - 17 Text Mining Part 1
DMTM 2015 - 17 Text Mining Part 1Pier Luca Lanzi
 
DMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical ClusteringDMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical ClusteringPier Luca Lanzi
 
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other MethodsDMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other MethodsPier Luca Lanzi
 
DMTM 2015 - 16 Data Preparation
DMTM 2015 - 16 Data PreparationDMTM 2015 - 16 Data Preparation
DMTM 2015 - 16 Data PreparationPier Luca Lanzi
 
DMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based ClusteringDMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based ClusteringPier Luca Lanzi
 
DMTM 2015 - 19 Graph Mining
DMTM 2015 - 19 Graph MiningDMTM 2015 - 19 Graph Mining
DMTM 2015 - 19 Graph MiningPier Luca Lanzi
 
Machine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesMachine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesPier Luca Lanzi
 
Codemotion Milan - November 21 2015
Codemotion Milan - November 21 2015Codemotion Milan - November 21 2015
Codemotion Milan - November 21 2015Pier Luca Lanzi
 
DMTM 2015 - 03 Data Representation
DMTM 2015 - 03 Data RepresentationDMTM 2015 - 03 Data Representation
DMTM 2015 - 03 Data RepresentationPier Luca Lanzi
 
DMTM 2015 - 01 Course Introduction
DMTM 2015 - 01 Course IntroductionDMTM 2015 - 01 Course Introduction
DMTM 2015 - 01 Course IntroductionPier Luca Lanzi
 
DMTM 2015 - 02 Data Mining
DMTM 2015 - 02 Data MiningDMTM 2015 - 02 Data Mining
DMTM 2015 - 02 Data MiningPier Luca Lanzi
 
Idea Generation and Conceptualization
Idea Generation and ConceptualizationIdea Generation and Conceptualization
Idea Generation and ConceptualizationPier Luca Lanzi
 
DMTM 2015 - 18 Text Mining Part 2
DMTM 2015 - 18 Text Mining Part 2DMTM 2015 - 18 Text Mining Part 2
DMTM 2015 - 18 Text Mining Part 2Pier Luca Lanzi
 
DMTM 2015 - 05 Association Rules
DMTM 2015 - 05 Association RulesDMTM 2015 - 05 Association Rules
DMTM 2015 - 05 Association RulesPier Luca Lanzi
 
Elements for the Theory of Fun
Elements for the Theory of FunElements for the Theory of Fun
Elements for the Theory of FunPier Luca Lanzi
 

Viewers also liked (20)

Focus Junior - 14 Maggio 2016
Focus Junior - 14 Maggio 2016Focus Junior - 14 Maggio 2016
Focus Junior - 14 Maggio 2016
 
DMTM 2015 - 17 Text Mining Part 1
DMTM 2015 - 17 Text Mining Part 1DMTM 2015 - 17 Text Mining Part 1
DMTM 2015 - 17 Text Mining Part 1
 
DMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical ClusteringDMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical Clustering
 
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other MethodsDMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
 
DMTM 2015 - 16 Data Preparation
DMTM 2015 - 16 Data PreparationDMTM 2015 - 16 Data Preparation
DMTM 2015 - 16 Data Preparation
 
DMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based ClusteringDMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based Clustering
 
Course Introduction
Course IntroductionCourse Introduction
Course Introduction
 
DMTM 2015 - 19 Graph Mining
DMTM 2015 - 19 Graph MiningDMTM 2015 - 19 Graph Mining
DMTM 2015 - 19 Graph Mining
 
Machine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesMachine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification Rules
 
Codemotion Milan - November 21 2015
Codemotion Milan - November 21 2015Codemotion Milan - November 21 2015
Codemotion Milan - November 21 2015
 
DMTM 2015 - 03 Data Representation
DMTM 2015 - 03 Data RepresentationDMTM 2015 - 03 Data Representation
DMTM 2015 - 03 Data Representation
 
DMTM 2015 - 01 Course Introduction
DMTM 2015 - 01 Course IntroductionDMTM 2015 - 01 Course Introduction
DMTM 2015 - 01 Course Introduction
 
DMTM 2015 - 02 Data Mining
DMTM 2015 - 02 Data MiningDMTM 2015 - 02 Data Mining
DMTM 2015 - 02 Data Mining
 
Course Organization
Course OrganizationCourse Organization
Course Organization
 
Idea Generation and Conceptualization
Idea Generation and ConceptualizationIdea Generation and Conceptualization
Idea Generation and Conceptualization
 
DMTM 2015 - 18 Text Mining Part 2
DMTM 2015 - 18 Text Mining Part 2DMTM 2015 - 18 Text Mining Part 2
DMTM 2015 - 18 Text Mining Part 2
 
DMTM 2015 - 05 Association Rules
DMTM 2015 - 05 Association RulesDMTM 2015 - 05 Association Rules
DMTM 2015 - 05 Association Rules
 
Game Mechanics
Game MechanicsGame Mechanics
Game Mechanics
 
Elements for the Theory of Fun
Elements for the Theory of FunElements for the Theory of Fun
Elements for the Theory of Fun
 
The Design Document
The Design DocumentThe Design Document
The Design Document
 

Similar to DMTM 2015 - 12 Classification Rules

One R (1R) Algorithm
One R (1R) AlgorithmOne R (1R) Algorithm
One R (1R) AlgorithmMLCollab
 
DMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision treesDMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision treesPier Luca Lanzi
 
3.Classification.ppt
3.Classification.ppt3.Classification.ppt
3.Classification.pptKhalilDaiyob1
 
Drools Planner Chtijug 2010
Drools Planner Chtijug 2010Drools Planner Chtijug 2010
Drools Planner Chtijug 2010Nicolas Heron
 
lec06_Classification_NaiveBayes_RuleBased.pptx
lec06_Classification_NaiveBayes_RuleBased.pptxlec06_Classification_NaiveBayes_RuleBased.pptx
lec06_Classification_NaiveBayes_RuleBased.pptxMmine1
 
Shift-Left Testing: QA in a DevOps World by David Laulusa
Shift-Left Testing: QA in a DevOps World by David LaulusaShift-Left Testing: QA in a DevOps World by David Laulusa
Shift-Left Testing: QA in a DevOps World by David LaulusaQA or the Highway
 
Decision tree.10.11
Decision tree.10.11Decision tree.10.11
Decision tree.10.11okeee
 
Lect9 Decision tree
Lect9 Decision treeLect9 Decision tree
Lect9 Decision treehktripathy
 
Next Generation “Treatment Learning” (finding the diamonds in the dust)
Next Generation “Treatment Learning” (finding the diamonds in the dust)Next Generation “Treatment Learning” (finding the diamonds in the dust)
Next Generation “Treatment Learning” (finding the diamonds in the dust)CS, NcState
 
Test-Driven Development
 Test-Driven Development  Test-Driven Development
Test-Driven Development Amir Assad
 
Random forest sgv_ai_talk_oct_2_2018
Random forest sgv_ai_talk_oct_2_2018Random forest sgv_ai_talk_oct_2_2018
Random forest sgv_ai_talk_oct_2_2018digitalzombie
 
mod_02_intro_ml.ppt
mod_02_intro_ml.pptmod_02_intro_ml.ppt
mod_02_intro_ml.pptbutest
 
Test suite minimization
Test suite minimizationTest suite minimization
Test suite minimizationGaurav Saxena
 
Learning to Search Henry Kautz
Learning to Search Henry KautzLearning to Search Henry Kautz
Learning to Search Henry Kautzbutest
 

Similar to DMTM 2015 - 12 Classification Rules (20)

One R (1R) Algorithm
One R (1R) AlgorithmOne R (1R) Algorithm
One R (1R) Algorithm
 
DMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision treesDMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision trees
 
3.Classification.ppt
3.Classification.ppt3.Classification.ppt
3.Classification.ppt
 
Drools Planner Chtijug 2010
Drools Planner Chtijug 2010Drools Planner Chtijug 2010
Drools Planner Chtijug 2010
 
Scientific method ii
Scientific method iiScientific method ii
Scientific method ii
 
lec06_Classification_NaiveBayes_RuleBased.pptx
lec06_Classification_NaiveBayes_RuleBased.pptxlec06_Classification_NaiveBayes_RuleBased.pptx
lec06_Classification_NaiveBayes_RuleBased.pptx
 
AI IMPORTANT QUESTION
AI IMPORTANT QUESTIONAI IMPORTANT QUESTION
AI IMPORTANT QUESTION
 
205_April_22.pptx
205_April_22.pptx205_April_22.pptx
205_April_22.pptx
 
Shift-Left Testing: QA in a DevOps World by David Laulusa
Shift-Left Testing: QA in a DevOps World by David LaulusaShift-Left Testing: QA in a DevOps World by David Laulusa
Shift-Left Testing: QA in a DevOps World by David Laulusa
 
Decision tree.10.11
Decision tree.10.11Decision tree.10.11
Decision tree.10.11
 
Lect9 Decision tree
Lect9 Decision treeLect9 Decision tree
Lect9 Decision tree
 
Next Generation “Treatment Learning” (finding the diamonds in the dust)
Next Generation “Treatment Learning” (finding the diamonds in the dust)Next Generation “Treatment Learning” (finding the diamonds in the dust)
Next Generation “Treatment Learning” (finding the diamonds in the dust)
 
lecture3.pdf
lecture3.pdflecture3.pdf
lecture3.pdf
 
Test-Driven Development
 Test-Driven Development  Test-Driven Development
Test-Driven Development
 
Random forest sgv_ai_talk_oct_2_2018
Random forest sgv_ai_talk_oct_2_2018Random forest sgv_ai_talk_oct_2_2018
Random forest sgv_ai_talk_oct_2_2018
 
Logic.ppt
Logic.pptLogic.ppt
Logic.ppt
 
Machine Learning and Data Mining
Machine Learning and Data MiningMachine Learning and Data Mining
Machine Learning and Data Mining
 
mod_02_intro_ml.ppt
mod_02_intro_ml.pptmod_02_intro_ml.ppt
mod_02_intro_ml.ppt
 
Test suite minimization
Test suite minimizationTest suite minimization
Test suite minimization
 
Learning to Search Henry Kautz
Learning to Search Henry KautzLearning to Search Henry Kautz
Learning to Search Henry Kautz
 

More from Pier Luca Lanzi

11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i Videogiochi11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i VideogiochiPier Luca Lanzi
 
Breve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei VideogiochiBreve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei VideogiochiPier Luca Lanzi
 
Global Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning WelcomeGlobal Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning WelcomePier Luca Lanzi
 
Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018Pier Luca Lanzi
 
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...Pier Luca Lanzi
 
GGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di aperturaGGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di aperturaPier Luca Lanzi
 
Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018Pier Luca Lanzi
 
DMTM Lecture 19 Data exploration
DMTM Lecture 19 Data explorationDMTM Lecture 19 Data exploration
DMTM Lecture 19 Data explorationPier Luca Lanzi
 
DMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph miningDMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph miningPier Luca Lanzi
 
DMTM Lecture 17 Text mining
DMTM Lecture 17 Text miningDMTM Lecture 17 Text mining
DMTM Lecture 17 Text miningPier Luca Lanzi
 
DMTM Lecture 16 Association rules
DMTM Lecture 16 Association rulesDMTM Lecture 16 Association rules
DMTM Lecture 16 Association rulesPier Luca Lanzi
 
DMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clusteringDMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clusteringPier Luca Lanzi
 
DMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringDMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringPier Luca Lanzi
 
DMTM Lecture 05 Data representation
DMTM Lecture 05 Data representationDMTM Lecture 05 Data representation
DMTM Lecture 05 Data representationPier Luca Lanzi
 
DMTM Lecture 01 Introduction
DMTM Lecture 01 IntroductionDMTM Lecture 01 Introduction
DMTM Lecture 01 IntroductionPier Luca Lanzi
 
DMTM Lecture 02 Data mining
DMTM Lecture 02 Data miningDMTM Lecture 02 Data mining
DMTM Lecture 02 Data miningPier Luca Lanzi
 
VDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipelineVDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipelinePier Luca Lanzi
 
VDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with UnityVDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with UnityPier Luca Lanzi
 
VDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generationVDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generationPier Luca Lanzi
 

More from Pier Luca Lanzi (19)

11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i Videogiochi11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i Videogiochi
 
Breve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei VideogiochiBreve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei Videogiochi
 
Global Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning WelcomeGlobal Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning Welcome
 
Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018
 
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
 
GGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di aperturaGGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di apertura
 
Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018
 
DMTM Lecture 19 Data exploration
DMTM Lecture 19 Data explorationDMTM Lecture 19 Data exploration
DMTM Lecture 19 Data exploration
 
DMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph miningDMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph mining
 
DMTM Lecture 17 Text mining
DMTM Lecture 17 Text miningDMTM Lecture 17 Text mining
DMTM Lecture 17 Text mining
 
DMTM Lecture 16 Association rules
DMTM Lecture 16 Association rulesDMTM Lecture 16 Association rules
DMTM Lecture 16 Association rules
 
DMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clusteringDMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clustering
 
DMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringDMTM Lecture 11 Clustering
DMTM Lecture 11 Clustering
 
DMTM Lecture 05 Data representation
DMTM Lecture 05 Data representationDMTM Lecture 05 Data representation
DMTM Lecture 05 Data representation
 
DMTM Lecture 01 Introduction
DMTM Lecture 01 IntroductionDMTM Lecture 01 Introduction
DMTM Lecture 01 Introduction
 
DMTM Lecture 02 Data mining
DMTM Lecture 02 Data miningDMTM Lecture 02 Data mining
DMTM Lecture 02 Data mining
 
VDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipelineVDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipeline
 
VDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with UnityVDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with Unity
 
VDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generationVDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generation
 

Recently uploaded

Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
FILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinoFILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinojohnmickonozaleda
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)cama23
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYKayeClaireEstoconing
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptxmary850239
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 

Recently uploaded (20)

Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
FILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinoFILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipino
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 

DMTM 2015 - 12 Classification Rules

  • 1. Prof. Pier Luca Lanzi Classification: Rule Induction Data Mining andText Mining (UIC 583 @ Politecnico di Milano)
  • 2. Prof. Pier Luca Lanzi The Weather Dataset Outlook Temp Humidity Windy Play Sunny Hot High False No Sunny Hot High True No Overcast Hot High False Yes Rainy Mild High False Yes Rainy Cool Normal False Yes Rainy Cool Normal True No Overcast Cool Normal True Yes Sunny Mild High False No Sunny Cool Normal False Yes Rainy Mild Normal False Yes Sunny Mild Normal True Yes Overcast Mild High True Yes Overcast Hot Normal False Yes Rainy Mild High True No 2
  • 3. Prof. Pier Luca Lanzi A Rule Set to Classify the Data •  IF (humidity = high) and (outlook = sunny) THEN play=no (3.0/0.0) •  IF (outlook = rainy) and (windy = TRUE) THEN play=no (2.0/0.0) •  OTHERWISE play=yes (9.0/0.0) •  Confusion Matrix § yes no -- classified as § 7 2 | yes § 3 2 | no 3
  • 4. Prof. Pier Luca Lanzi Let’s Check the First Rule Outlook Temp Humidity Windy Play Sunny Hot High False No Sunny Hot High True No Overcast Hot High False Yes Rainy Mild High False Yes Rainy Cool Normal False Yes Rainy Cool Normal True No Overcast Cool Normal True Yes Sunny Mild High False No Sunny Cool Normal False Yes Rainy Mild Normal False Yes Sunny Mild Normal True Yes Overcast Mild High True Yes Overcast Hot Normal False Yes Rainy Mild High True No 4 IF (humidity = high) and (outlook = sunny) THEN play=no (3.0/0.0)
  • 5. Prof. Pier Luca Lanzi Then, The Second Rule Outlook Temp Humidity Windy Play Sunny Hot High False No Sunny Hot High True No Overcast Hot High False Yes Rainy Mild High False Yes Rainy Cool Normal False Yes Rainy Cool Normal True No Overcast Cool Normal True Yes Sunny Mild High False No Sunny Cool Normal False Yes Rainy Mild Normal False Yes Sunny Mild Normal True Yes Overcast Mild High True Yes Overcast Hot Normal False Yes Rainy Mild High True No 5 IF (outlook = rainy) and (windy = TRUE) THEN play=no (2.0/0.0)
  • 6. Prof. Pier Luca Lanzi Finally, the Third Rule Outlook Temp Humidity Windy Play Sunny Hot High False No Sunny Hot High True No Overcast Hot High False Yes Rainy Mild High False Yes Rainy Cool Normal False Yes Rainy Cool Normal True No Overcast Cool Normal True Yes Sunny Mild High False No Sunny Cool Normal False Yes Rainy Mild Normal False Yes Sunny Mild Normal True Yes Overcast Mild High True Yes Overcast Hot Normal False Yes Rainy Mild High True No 6 IF (outlook = rainy) and (windy = TRUE) THEN play=no (2.0/0.0)
  • 7. Prof. Pier Luca Lanzi A Simpler Solution •  IF (outlook = sunny) THEN play IS no ELSE IF (outlook = overcast) THEN play IS yes ELSE IF (outlook = rainy) THEN play IS yes (6/14 instances correct) •  Confusion Matrix § yes no -- classified as § 4 5 | yes § 3 2 | no 7
  • 8. Prof. Pier Luca Lanzi What is a Classification Rule?
  • 9. Prof. Pier Luca Lanzi What Is A Classification Rules? Why Rules? •  They are IF-THEN rules § The IF part states a condition over the data § The THEN part includes a class label •  What types of conditions? § Propositional, with attribute-value comparisons § First order Horn clauses, with variables •  Why rules? Because they are one of the most expressive and most human readable representation for hypotheses is sets of IF- THEN rules 9
  • 10. Prof. Pier Luca Lanzi Coverage and Accuracy •  IF (humidity = high) and (outlook = sunny) THEN play=no (3.0/0.0) •  ncovers = number of examples covered by the rule •  ncorrect = number of examples correctly classified by the rule •  coverage(R) = ncovers /size of the |training data set •  accuracy(R) = ncorrect /ncovers 10
  • 11. Prof. Pier Luca Lanzi Conflict Resolution •  If more than one rule is triggered, we need conflict resolution •  Size ordering: assign the highest priority to the triggering rules that has the “toughest” requirement (i.e., with the most attribute test) •  Class-based ordering: decreasing order of prevalence or misclassification cost per class •  Rule-based ordering (decision list): rules are organized into one long priority list, according to some measure of rule quality or by experts 11
  • 12. Prof. Pier Luca Lanzi Two Approaches for Rule Learning •  Direct Methods § Directly learn the rules from the training data •  Indirect Methods § Learn decision tree, then convert to rules § Learn neural networks, then extract rules 12
  • 13. Prof. Pier Luca Lanzi One Rule
  • 14. Prof. Pier Luca Lanzi Inferring Rudimentary Rules •  OneRule (1R) learns a simple rule involving one attribute § Assumes nominal attributes § The rule that all the values of one particular attribute •  Basic version § One branch for each value § Each branch assigns most frequent class § Error rate: proportion of instances that don’t belong to the majority class of their corresponding branch § Choose attribute with lowest error rate § “missing” is treated as a separate value 14
  • 15. Prof. Pier Luca Lanzi For each attribute, For each value of the attribute, make a rule as follows: count how often each class appears find the most frequent class make the rule assign that class to this attribute-value Calculate the error rate of the rules Choose the rules with the smallest error rate Pseudo-Code for OneRule 15
  • 16. Prof. Pier Luca Lanzi Evaluating the Weather Attributes 16 3/6True → No* 5/142/8False →YesWindy 1/7Normal →Yes 4/143/7High → NoHumidity 5/14 4/14 Total errors 1/4Cool → Yes 2/6Mild → Yes 2/4Hot → No*Temp 2/5Rainy →Yes 0/4Overcast →Yes 2/5Sunny → NoOutlook ErrorsRulesAttribute NoTrueHighMildRainy YesFalseNormalHotOvercast YesTrueHighMildOvercast YesTrueNormalMildSunny YesFalseNormalMildRainy YesFalseNormalCoolSunny NoFalseHighMildSunny YesTrueNormalCoolOvercast NoTrueNormalCoolRainy YesFalseNormalCoolRainy YesFalseHighMildRainy YesFalseHighHotOvercast NoTrueHighHotSunny NoFalseHighHotSunny PlayWindyHumidityTempOutlook * indicates a tie
  • 17. Prof. Pier Luca Lanzi OneRule and Numerical Attributes •  Applies simple supervised discretization •  Sort instances according to attribute’s values •  Place breakpoints where class changes (majority class) •  This procedure is however very sensitive to noise since one example with an incorrect class label may produce a separate interval. This is likely to lead to overfitting. •  In the case of the temperature, 17 64 65 68 69 70 71 72 72 75 75 80 81 83 85 Yes | No | Yes Yes Yes | No No Yes | Yes Yes | No | Yes Yes | No
  • 18. Prof. Pier Luca Lanzi OneRule and Numerical Attributes •  To limit overfitting, enforce minimum number of instances in majority class per interval. •  For instance, in the case of the temperature, if we set the minimum number of majority class instances to 3, we have 18 64 65 68 69 70 71 72 72 75 75 80 81 83 85 Yes | No | Yes Yes Yes | No No Yes | Yes Yes | No | Yes Yes | No 64 65 68 69 70 71 72 72 75 75 80 81 83 85 Yes | No | Yes Yes Yes | No No Yes | Yes Yes | No | Yes Yes | No 64 65 68 69 70 71 72 72 75 75 80 81 83 85 Yes No Yes Yes Yes | No No Yes Yes Yes | No Yes Yes No join the intervals to get at least 3 examples join the intervals with the same majority class
  • 19. Prof. Pier Luca Lanzi OneRule Applied to the Numerical Version of the Weather Dataset 19 0/1 95.5 →Yes 3/6True → No* 5/142/8False →YesWindy 2/6 82.5 and ≤ 95.5 → No 3/141/7≤ 82.5 → YesHumidity 5/14 4/14 Total errors 2/4 77.5 → No* 3/10≤ 77.5 →YesTemperature 2/5Rainy →Yes 0/4Overcast →Yes 2/5Sunny → NoOutlook ErrorsRulesAttribute
  • 20. Prof. Pier Luca Lanzi Sequential Covering
  • 21. Prof. Pier Luca Lanzi Sequential Covering Algorithms •  Consider the set E of positive and negative examples •  Repeat § Learn one rule with high accuracy, any coverage § Remove positive examples covered by this rule •  Until all the examples are covered 21
  • 22. Prof. Pier Luca Lanzi Basic Sequential Covering Algorithm procedure Covering (Examples, Classifier) input: a set of positive and negative examples for class c // rule set is initially empty classifier = {} while PositiveExamples(Examples)!={} // find the best rule possible Rule = FindBestRule(Examples) // check if we need more rules if Stop(Examples, Rule, Classifier) breakwhile // remove covered examples and update the model Examples = ExamplesCover(Rule,Examples) Classifier = Classifier U {Rule} Endwhile // post-process the rules (sort them, simplify them, etc.) Classifier = PostProcessing(Classifier) output: Classifier 22
  • 23. Prof. Pier Luca Lanzi Finding the Best Rule Possible 23 IF THEN Play=yes IF Wind=No THEN Play=yes IF Humidity=Normal THEN Play=yes IF Humidity=High THEN Play=yes IF … THEN … IF Wind=yes THEN Play=yes IF Humidity=Normal AND Wind=yes THEN Play=yes IF Humidity=Normal AND Wind=No THEN Play=yes IF Humidity=Normal AND Outlook=Rainy THEN Play=yes P=5/10 = 0.5 P=6/8=0.75 P=6/7=0.86 P=3/7=0.43
  • 24. Prof. Pier Luca Lanzi Another Viewpoint 24 y x a b b b b b b b b b b b b b b a a a a a y a b b b b b b b b b b b b b b a a aa a x 1đ2 y a b b b b b b b b b b b b b b a a aa a x 1đ2 2đ6 If x 1.2 then class = a If x 1.2 and y 2.6 then class = a If true then class = a
• 25. Prof. Pier Luca Lanzi And Another Viewpoint 25
(figure, four panels: (i) Original Data; (ii) Step 1; (iii) Step 2, rule R1 learned; (iv) Step 3, rules R1 and R2 learned)
• 26. Prof. Pier Luca Lanzi Learning Just One Rule 26
LearnOneRule(Attributes, Examples, k)
  init BH to the most general hypothesis
  init CH to {BH}
  while CH not empty do
    generate NCH, the more specific candidate hypotheses of CH
    // check all the NCH for a hypothesis that improves the performance of BH
    update BH
    update CH with the k best NCH
  endwhile
  return the rule "IF BH THEN prediction"
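A compact Python sketch of this beam search (an illustration under my own encoding: examples are dicts with a "class" key, a hypothesis is a frozenset of attribute=value tests, and precision plays the role of the performance measure):

def learn_one_rule(examples, attributes, target, k=3):
    def precision(conds):
        covered = [e for e in examples if all(e[a] == v for a, v in conds)]
        return sum(e["class"] == target for e in covered) / len(covered) if covered else 0.0

    best, beam = frozenset(), [frozenset()]  # BH and CH: start from the empty rule
    while beam:
        # NCH: specialize every hypothesis in the beam with one extra test
        candidates = {h | {(a, e[a])} for h in beam for e in examples
                      for a in attributes if a not in dict(h)}
        if not candidates:
            break
        ranked = sorted(candidates, key=precision, reverse=True)
        if precision(ranked[0]) <= precision(best):
            break  # no specialization improves BH: stop
        best, beam = ranked[0], ranked[:k]  # update BH, keep the k best NCH
    return best, precision(best)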
• 27. Prof. Pier Luca Lanzi An Example Using Contact Lens Data 27
Age             Spectacle prescription  Astigmatism  Tear production rate  Recommended lenses
Young           Myope                   No           Reduced               None
Young           Myope                   No           Normal                Soft
Young           Myope                   Yes          Reduced               None
Young           Myope                   Yes          Normal                Hard
Young           Hypermetrope            No           Reduced               None
Young           Hypermetrope            No           Normal                Soft
Young           Hypermetrope            Yes          Reduced               None
Young           Hypermetrope            Yes          Normal                Hard
Pre-presbyopic  Myope                   No           Reduced               None
Pre-presbyopic  Myope                   No           Normal                Soft
Pre-presbyopic  Myope                   Yes          Reduced               None
Pre-presbyopic  Myope                   Yes          Normal                Hard
Pre-presbyopic  Hypermetrope            No           Reduced               None
Pre-presbyopic  Hypermetrope            No           Normal                Soft
Pre-presbyopic  Hypermetrope            Yes          Reduced               None
Pre-presbyopic  Hypermetrope            Yes          Normal                None
Presbyopic      Myope                   No           Reduced               None
Presbyopic      Myope                   No           Normal                None
Presbyopic      Myope                   Yes          Reduced               None
Presbyopic      Myope                   Yes          Normal                Hard
Presbyopic      Hypermetrope            No           Reduced               None
Presbyopic      Hypermetrope            No           Normal                Soft
Presbyopic      Hypermetrope            Yes          Reduced               None
Presbyopic      Hypermetrope            Yes          Normal                None
• 28. Prof. Pier Luca Lanzi First Step: the Most General Rule 28
•  Rule we seek: If ? then recommendation = hard
•  Possible tests:
Age = Young                              2/8
Age = Pre-presbyopic                     1/8
Age = Presbyopic                         1/8
Spectacle prescription = Myope           3/12
Spectacle prescription = Hypermetrope    1/12
Astigmatism = no                         0/12
Astigmatism = yes                        4/12
Tear production rate = Reduced           0/12
Tear production rate = Normal            4/12
• 29. Prof. Pier Luca Lanzi Adding the First Clause 29
•  Rule with the best test added: If astigmatism = yes then recommendation = hard
•  Instances covered by the modified rule:
Age             Spectacle prescription  Astigmatism  Tear production rate  Recommended lenses
Young           Myope                   Yes          Reduced               None
Young           Myope                   Yes          Normal                Hard
Young           Hypermetrope            Yes          Reduced               None
Young           Hypermetrope            Yes          Normal                Hard
Pre-presbyopic  Myope                   Yes          Reduced               None
Pre-presbyopic  Myope                   Yes          Normal                Hard
Pre-presbyopic  Hypermetrope            Yes          Reduced               None
Pre-presbyopic  Hypermetrope            Yes          Normal                None
Presbyopic      Myope                   Yes          Reduced               None
Presbyopic      Myope                   Yes          Normal                Hard
Presbyopic      Hypermetrope            Yes          Reduced               None
Presbyopic      Hypermetrope            Yes          Normal                None
• 30. Prof. Pier Luca Lanzi Extending the First Rule 30
•  Current state: If astigmatism = yes and ? then recommendation = hard
•  Possible tests:
Age = Young                              2/4
Age = Pre-presbyopic                     1/4
Age = Presbyopic                         1/4
Spectacle prescription = Myope           3/6
Spectacle prescription = Hypermetrope    1/6
Tear production rate = Reduced           0/6
Tear production rate = Normal            4/6
• 31. Prof. Pier Luca Lanzi The Second Rule 31
•  Rule with the best test added: If astigmatism = yes and tear production rate = normal then recommendation = hard
•  Instances covered by the modified rule:
Age             Spectacle prescription  Astigmatism  Tear production rate  Recommended lenses
Young           Myope                   Yes          Normal                Hard
Young           Hypermetrope            Yes          Normal                Hard
Pre-presbyopic  Myope                   Yes          Normal                Hard
Pre-presbyopic  Hypermetrope            Yes          Normal                None
Presbyopic      Myope                   Yes          Normal                Hard
Presbyopic      Hypermetrope            Yes          Normal                None
• 32. Prof. Pier Luca Lanzi Adding the Third Clause 32
•  Current state: If astigmatism = yes and tear production rate = normal and ? then recommendation = hard
•  Possible tests:
Age = Young                              2/2
Age = Pre-presbyopic                     1/2
Age = Presbyopic                         1/2
Spectacle prescription = Myope           3/3
Spectacle prescription = Hypermetrope    1/3
•  Tie between the first and the fourth test: both are 100% accurate, so we choose the one with the greater coverage (spectacle prescription = myope)
• 33. Prof. Pier Luca Lanzi The Final Result 33
•  Final rule: If astigmatism = yes and tear production rate = normal and spectacle prescription = myope then recommendation = hard
•  Second rule for recommending "hard lenses" (built from the instances not covered by the first rule): If age = young and astigmatism = yes and tear production rate = normal then recommendation = hard
•  These two rules cover all the "hard lenses"
•  The process is then repeated with the other two classes
• 34. Prof. Pier Luca Lanzi Testing for the Best Rule •  Measure 1: Accuracy (p/t) § t is the total number of instances covered by the rule, p the number of these that are positive § Produces rules that do not cover negative instances, as quickly as possible § May produce rules with very small coverage (special cases or noise?) •  Measure 2: Information gain p * (log(p/t) - log(P/T)) § P and T are the positive and total numbers of instances before the new condition was added § Information gain emphasizes positive rather than negative instances •  These measures interact with the pruning mechanism used 34
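In Python the two measures are one-liners; as a worked example on the Weather dataset, refining the empty rule (9 positives out of 14) with humidity = normal (6 positives out of 7) gives:

import math

def accuracy(p, t):
    return p / t  # fraction of covered instances that are positive

def information_gain(p, t, P, T):
    # p * (log(p/t) - log(P/T)): the gain is weighted by the positives kept
    return p * (math.log(p / t) - math.log(P / T))

print(accuracy(6, 7))                 # 0.857...
print(information_gain(6, 7, 9, 14))  # about 1.73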
• 35. Prof. Pier Luca Lanzi Eliminating Instances •  Why do we need to eliminate instances? § Otherwise, the next rule learned would be identical to the previous one •  Why do we remove positive instances? § To ensure that the next rule is different •  Why do we remove negative instances? § To avoid underestimating the accuracy of the rules learned next § Compare rules R2 and R3 in the following diagram 35
• 36. Prof. Pier Luca Lanzi Eliminating Instances 36
(figure: positive (+) and negative (-) instances in the plane, with three candidate rule regions R1, R2, and R3)
  • 37. Prof. Pier Luca Lanzi Missing Values and Numeric Attributes •  Missing values usually fail the test •  Covering algorithm must either § Use other tests to separate out positive instances § Leave them uncovered until later in the process •  In some cases it is better to treat “missing” as a separate value •  Numeric attributes are treated as in decision trees 37
  • 38. Prof. Pier Luca Lanzi Stopping Criterion and Rule Pruning •  The process usually stops when there is no significant improvement by adding the new rule •  Rule pruning is similar to post-pruning of decision trees •  Reduced Error Pruning: § Remove one of the conjuncts in the rule § Compare error rate on validation set § If error improves, prune the conjunct 38
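A minimal sketch of reduced-error pruning for one rule (assumptions: a rule is a list of conjuncts, error_rate(rule, data) is a helper that evaluates a candidate on the validation set, and pruning on ties is a design choice that favors shorter rules):

def reduced_error_pruning(rule, validation, error_rate):
    improved = True
    while improved and len(rule) > 1:
        improved = False
        base = error_rate(rule, validation)
        for i in range(len(rule)):
            candidate = rule[:i] + rule[i + 1:]  # rule with one conjunct removed
            if error_rate(candidate, validation) <= base:
                rule, improved = candidate, True  # keep the shorter rule
                break
    return rule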
• 39. Prof. Pier Luca Lanzi Rules vs. Trees •  Rule sets can be more readable •  Decision trees suffer from replicated subtrees •  Rule sets are collections of local models, trees represent a model over the whole domain •  The covering algorithm concentrates on one class at a time, whereas a decision tree learner takes all classes into account 39
  • 40. Prof. Pier Luca Lanzi Mining Association Rules for Classification
• 41. Prof. Pier Luca Lanzi Mining Association Rules for Classification (the CBA algorithm) •  Association rule mining assumes that the data consist of a set of transactions, so the typical tabular representation used in classification must first be mapped into that format •  Association rule mining is then applied to the new dataset, and the search is focused on association rules whose right-hand side is a class label, i.e., rules of the form X ⇒ ci where ci is a class label •  The association rules are pruned using the pessimistic error-based method of C4.5 •  Finally, the rules are sorted to build the final classifier 41
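The mapping into transactions is straightforward; a minimal sketch (the function name is illustrative):

def to_transaction(row):
    # every attribute=value pair, including the class, becomes an item
    return {f"{attr}={value}" for attr, value in row.items()}

print(to_transaction({"outlook": "sunny", "humidity": "high", "play": "no"}))
# {'outlook=sunny', 'humidity=high', 'play=no'}
# CBA then keeps only the rules whose consequent is a class item, e.g.
# {outlook=sunny, humidity=high} => play=no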
  • 42. Prof. Pier Luca Lanzi Indirect Methods
• 43. Prof. Pier Luca Lanzi Indirect Methods 43
(figure: a decision tree with P at the root and Q, R below; one rule is generated for each root-to-leaf path)
Rule Set:
r1: (P=No, Q=No) => -
r2: (P=No, Q=Yes) => +
r3: (P=Yes, R=No) => +
r4: (P=Yes, R=Yes, Q=No) => -
r5: (P=Yes, R=Yes, Q=Yes) => +
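A small Python sketch of this tree-to-rules conversion (the nested-tuple tree encoding is my own; it reproduces rules r1 to r5 above):

def tree_to_rules(node, conditions=()):
    # a node is either a class label (leaf) or (attribute, {value: child})
    if not isinstance(node, tuple):
        return [(conditions, node)]
    attribute, children = node
    rules = []
    for value, child in children.items():
        rules += tree_to_rules(child, conditions + ((attribute, value),))
    return rules

tree = ("P", {"No":  ("Q", {"No": "-", "Yes": "+"}),
              "Yes": ("R", {"No": "+",
                            "Yes": ("Q", {"No": "-", "Yes": "+"})})})
for conds, label in tree_to_rules(tree):
    print(conds, "=>", label)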
  • 44. Prof. Pier Luca Lanzi Example
• 46. Prof. Pier Luca Lanzi JRip Model 46
(humidity = high) and (outlook = sunny) => play=no (3.0/0.0)
(outlook = rainy) and (windy = TRUE) => play=no (2.0/0.0)
 => play=yes (9.0/0.0)
• 47. Prof. Pier Luca Lanzi One Rule Model 47
outlook:
  overcast -> yes
  rainy    -> yes
  sunny    -> no
(10/14 instances correct)
• 48. Prof. Pier Luca Lanzi CBA Model 48
outlook=overcast => play=yes
humidity=normal windy=FALSE => play=yes
outlook=rainy windy=FALSE => play=yes
outlook=sunny humidity=high => play=no
outlook=rainy windy=TRUE => play=no
(default class is the majority class)
  • 49. Prof. Pier Luca Lanzi Summary
  • 50. Prof. Pier Luca Lanzi Summary •  Advantages of Rule-Based Classifiers § As highly expressive as decision trees § Easy to interpret § Easy to generate § Can classify new instances rapidly § Performance comparable to decision trees •  Two approaches: direct and indirect methods 50
• 51. Prof. Pier Luca Lanzi Summary •  Direct methods typically apply the sequential covering approach § Grow a single rule § Remove the instances covered by the rule § Prune the rule (if necessary) § Add the rule to the current rule set § Repeat •  Other approaches exist § Specific-to-general exploration (RISE) § Post-processing of neural networks, association rules, decision trees, etc. 51
  • 52. Prof. Pier Luca Lanzi Homework •  Generate the rule set for the Weather dataset by repeatedly applying the procedure to learn one rule until no improvement can be produced or the covered examples are too few •  Check the problems provided in the previous exams and apply both OneRule and Sequential Covering to generate the first rule. Then, check the result with one of the implementations available in Weka 52