17. Example 1 (root node, branches, leaf nodes)
  Outlook
    Sunny → Temperature
      Hot → Sell
      Mild → Holiday Season
        Yes → Sell
        No → Don't Sell
      Cold → Don't Sell
    Overcast → Holiday Season
      Yes → Temperature
        Hot → Don't Sell
        Cold → Sell
        Mild → Don't Sell
      No → Don't Sell
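A tree like the one on this slide can be encoded and executed directly; a minimal sketch, assuming a nested-dict encoding (the `tree` layout and the `classify` helper are illustrative, not from the slides):

```python
# Hypothetical encoding of the Example 1 tree as nested dicts:
# an inner node maps an attribute name to {branch value: subtree};
# a leaf is a plain string ("Sell" / "Don't Sell").
tree = {"Outlook": {
    "Sunny": {"Temperature": {
        "Hot": "Sell",
        "Mild": {"Holiday Season": {"Yes": "Sell", "No": "Don't Sell"}},
        "Cold": "Don't Sell",
    }},
    "Overcast": {"Holiday Season": {
        "Yes": {"Temperature": {"Hot": "Don't Sell", "Cold": "Sell", "Mild": "Don't Sell"}},
        "No": "Don't Sell",
    }},
}}

def classify(node, example):
    """Walk the tree, following the example's attribute values, until a leaf."""
    while isinstance(node, dict):
        attribute, branches = next(iter(node.items()))
        node = branches[example[attribute]]
    return node

print(classify(tree, {"Outlook": "Sunny", "Temperature": "Hot"}))  # Sell
```

Classification is just a walk from the root to a leaf, testing one attribute per level.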
19. Example 2: Draw a tree based upon the following data:
  Item  X      Y      Class
  1     False  False  +
  2     False  True   +
  3     True   False  -
  4     True   True   -
20. Example 2: Tree 1
  X
    True → {3,4}: Negative
    False → {1,2}: Positive
21. Example 2: Tree 2
  Y
    True → {2,4} → X
      True → {4}: Negative
      False → {2}: Positive
    False → {1,3} → X
      True → {3}: Negative
      False → {1}: Positive
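Information gain, the criterion ID3 uses, makes the comparison between the two candidate trees precise; a minimal sketch over the four items, assuming the X/Y/Class readings from the Example 2 table (the `entropy` and `gain` helpers are illustrative):

```python
from math import log2

# The four items from the Example 2 table, as (X, Y, Class) tuples.
data = [
    (False, False, "+"),  # item 1
    (False, True,  "+"),  # item 2
    (True,  False, "-"),  # item 3
    (True,  True,  "-"),  # item 4
]

def entropy(rows):
    """Shannon entropy of the class distribution in `rows`."""
    n = len(rows)
    counts = {}
    for *_, cls in rows:
        counts[cls] = counts.get(cls, 0) + 1
    return -sum(c / n * log2(c / n) for c in counts.values())

def gain(rows, col):
    """Information gain from splitting `rows` on attribute column `col`."""
    n = len(rows)
    split = {}
    for row in rows:
        split.setdefault(row[col], []).append(row)
    return entropy(rows) - sum(len(s) / n * entropy(s) for s in split.values())

print(gain(data, 0))  # splitting on X: 1.0 (perfect separation)
print(gain(data, 1))  # splitting on Y: 0.0 (no information)
```

Splitting on X separates the classes completely, which is why the one-test tree is preferred over the two-level tree rooted at Y.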
38. Final Tree (Callan 2003:243)
  Transport (branch values A, P, G):
    {7,12,16,19,20}: Positive
    {2,4,5,9,10,13,14,18} → Industrial Estate
      Y → {5,9,14}: Positive
      N → {2,4,10,13,18}: Negative
    {1,3,6,8,11,15,17} → Housing Estate (branch values L, M, S, N):
      {11,17} → Industrial Estate
        Y → {17}: Negative
        N → {11}: Positive
      {1,3,15} → University
        Y → {15}: Negative
        N → {1,3}: Positive
      {8}: Negative
      {6}: Negative
42. Rules Example (Callan 2003:243) — the final tree again, read as one rule per root-to-leaf path:
  Transport (branch values A, P, G):
    {7,12,16,19,20}: Positive
    {2,4,5,9,10,13,14,18} → Industrial Estate
      Y → {5,9,14}: Positive
      N → {2,4,10,13,18}: Negative
    {1,3,6,8,11,15,17} → Housing Estate (branch values L, M, S, N):
      {11,17} → Industrial Estate
        Y → {17}: Negative
        N → {11}: Positive
      {1,3,15} → University
        Y → {15}: Negative
        N → {1,3}: Positive
      {8}: Negative
      {6}: Negative
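Each root-to-leaf path of a decision tree yields one IF-THEN rule; a minimal sketch, assuming a nested-dict encoding of the Callan tree (the `rules` helper and the value-to-branch pairings for A/P/G and L/M/S/N are illustrative assumptions, not taken from the slides):

```python
# Hypothetical nested-dict encoding of the final Transport tree
# (Callan 2003:243); the A/P/G and L/M/S/N pairings are assumed.
tree = {"Transport": {
    "G": "Positive",
    "P": {"Industrial Estate": {"Y": "Positive", "N": "Negative"}},
    "A": {"Housing Estate": {
        "L": {"Industrial Estate": {"Y": "Negative", "N": "Positive"}},
        "M": {"University": {"Y": "Negative", "N": "Positive"}},
        "S": "Negative",
        "N": "Negative",
    }},
}}

def rules(node, conditions=()):
    """Yield (conditions, class) pairs, one per root-to-leaf path."""
    if isinstance(node, str):
        yield list(conditions), node
        return
    attribute, branches = next(iter(node.items()))
    for value, subtree in branches.items():
        yield from rules(subtree, conditions + ((attribute, value),))

for conds, cls in rules(tree):
    clause = " AND ".join(f"{a} = {v}" for a, v in conds)
    print(f"IF {clause} THEN {cls}")
```

This prints one rule per leaf, e.g. `IF Transport = G THEN Positive`; rule extraction is simply an enumeration of the paths.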
Editor's Notes
Which class boundary is better?
Least squares: linear model; given a set of points, fit a line/boundary by minimising the residual sum of squares.
Decision trees: our focus.
Support vector machines: non-linear boundaries represented as linear boundaries in a transformed (high-dimensional) space.
Boosting: combine weak classifiers, reweighting the training data at each round.
Neural networks: biologically inspired supervised and unsupervised techniques.
K-means: given a set of points, partition them into k clusters, assigning each point to the nearest cluster centroid.
Genetic algorithms: evolutionary techniques that evolve solutions using a fitness function, breeding, and mutation.
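The least-squares note can be made concrete; a minimal sketch of closed-form simple linear regression in pure Python (the `fit_line` helper and sample data are illustrative, not from the slides):

```python
# Closed-form least squares for a line y = a*x + b:
# chooses a and b to minimise the residual sum of squares.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Slope: covariance of x,y over variance of x; intercept from the means.
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]          # lies exactly on y = 2x + 1
a, b = fit_line(xs, ys)
print(a, b)                # 2.0 1.0
```

Unlike a decision tree's axis-parallel splits, this produces a single straight boundary/line, which is the trade-off the "which class boundary is better?" question points at.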