
Kato Mivule: An Overview of Adaptive Boosting – AdaBoost



  1. An Overview of Adaptive Boosting – AdaBoost. Presented by Kato Mivule. Dr. Manohar Mareboyana, Professor. Data Mining, Spring 2013. Computer Science Department, Bowie State University.
  2. OUTLINE • Introduction • How AdaBoost Works • The Experiment • Results • Conclusion and Discussion
  3. Adaptive Boosting – AdaBoost • Adaptive Boosting (AdaBoost) was proposed by Freund and Schapire (1995). • AdaBoost is a machine learning ensemble method that combines weak learners over several iterations to generate a new classifier with improved performance. • AdaBoost is adaptive in that at each iteration a new weak learner is added and the sample weights are fine-tuned, with priority given to data points misclassified in prior iterations. • AdaBoost is relatively resistant to over-fitting but sensitive to noise and outliers.
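The reweighting step described on this slide can be sketched in Python (a minimal illustration with ±1 labels, not RapidMiner's implementation; the function name is ours):

```python
import numpy as np

def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost reweighting step: compute the weak learner's
    weighted error, its vote weight alpha, and the updated sample
    weights (misclassified points gain weight)."""
    miss = (y_true != y_pred)                       # boolean mask of errors
    err = np.sum(weights[miss]) / np.sum(weights)   # weighted error rate
    alpha = 0.5 * np.log((1 - err) / err)           # learner's vote weight
    new_w = weights * np.exp(alpha * np.where(miss, 1.0, -1.0))
    return alpha, new_w / new_w.sum()               # renormalize to sum to 1

# toy example: 4 equally weighted samples, one misclassified
w = np.full(4, 0.25)
alpha, w2 = adaboost_round(w, np.array([1, 1, -1, -1]),
                              np.array([1, 1, -1, 1]))
```

After the update the single misclassified point carries half the total weight, so the next weak learner is forced to focus on it.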
  4. AdaBoost Fit Ensemble (figure): an overview of the AdaBoost Fit Ensemble procedure.
  5.–8. (Figure-only slides.)
  9. How AdaBoost Works – Weak Learners • Decision Stump • For this overview, we choose the Decision Stump as our weak learner. • A Decision Stump is a decision tree with only a single split. • The resulting tree can be used to classify unseen (untrained) instances. • Each leaf node holds a class label. • A non-leaf node is a decision node.
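A decision stump of the kind described on this slide can be sketched as an exhaustive threshold search over a single feature (a hypothetical minimal implementation; real stumps also search across features):

```python
import numpy as np

def fit_stump(x, y):
    """Fit a one-split decision stump on one sorted feature: try every
    midpoint threshold and keep the split with the fewest
    misclassifications, predicting one label per side."""
    best = (None, None, None, len(y) + 1)   # threshold, left label, right label, errors
    for t in (x[:-1] + x[1:]) / 2.0:        # candidate thresholds between values
        for left, right in ((0, 1), (1, 0)):
            pred = np.where(x <= t, left, right)
            errs = int(np.sum(pred != y))
            if errs < best[3]:
                best = (t, left, right, errs)
    return best

# toy data: a clean gap between the two classes
x = np.array([1.0, 2.0, 3.0, 8.0, 9.0])
y = np.array([0, 0, 0, 1, 1])
threshold, left, right, errors = fit_stump(x, y)
```

On this toy data the stump finds the midpoint threshold 5.5 and classifies every point correctly.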
  10. How AdaBoost Works – Weak Learners • How a Decision Stump chooses the best attribute to split on: • Information gain: the attribute with the highest information gain (lowest remaining entropy) is chosen. • Gain ratio. • Gini index.
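Information gain for a candidate split is the parent entropy minus the size-weighted entropy of the child groups; a small illustrative sketch (the function names are ours):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(labels, groups):
    """Information gain of a split: parent entropy minus the
    size-weighted entropy of the child groups."""
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups)

# a perfect split recovers the full parent entropy (1 bit here)
gain = info_gain([0, 0, 1, 1], [[0, 0], [1, 1]])
```

A split that leaves both children mixed, e.g. `[[0, 1], [0, 1]]`, yields zero gain, which is why the stump prefers the attribute whose split purifies the children most.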
  11. AdaBoost – The Experiment • For illustration purposes, we utilized RapidMiner's AdaBoost operator. • We used the UCI Breast Cancer Wisconsin dataset with 643 data points. • We employed 10-fold cross-validation.
  12. AdaBoost – The Experiment • We used RapidMiner's Decision Stump as our weak learner.
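A rough equivalent of this setup can be reproduced with scikit-learn (a sketch under an assumption: scikit-learn bundles the related Breast Cancer Wisconsin Diagnostic dataset, not the Original one used here, so the numbers will differ from the slides):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# AdaBoost over decision stumps (trees of depth 1), evaluated with
# 10-fold cross-validation, mirroring the slide's setup; three
# estimators to match the three embedded models shown later.
X, y = load_breast_cancer(return_X_y=True)
stump = DecisionTreeClassifier(max_depth=1)
boosted = AdaBoostClassifier(stump, n_estimators=3)
scores = cross_val_score(boosted, X, y, cv=10)
mean_accuracy = scores.mean()
```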
  13. AdaBoost – Results. The following AdaBoost model (prediction model for label Class, 3 inner models) was generated:
      Embedded model #0 (weight: 2.582):
        Uniformity of Cell > 3.500: 4 {2=11, 4=202}
        Uniformity of Cell ≤ 3.500: 2 {2=433, 4=37}
      Embedded model #1 (weight: 1.352):
        Uniformity of Cell Shape > 1.500: 4 {2=100, 4=237}
        Uniformity of Cell Shape ≤ 1.500: 2 {2=344, 4=2}
      Embedded model #2 (weight: 1.016):
        Clump Thickness > 8.500: 4 {2=0, 4=83}
        Clump Thickness ≤ 8.500: 2 {2=444, 4=156}
  14. AdaBoost – Results • AdaBoost using Decision Stumps: classification accuracy of 93.12%. • Decision Stump without AdaBoost: classification accuracy of 92.97%.
  15. AdaBoost – Results • AdaBoost confusion matrix: classification accuracy of 93.12%. • Decision Stump confusion matrix: classification accuracy of 92.97%.
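A confusion matrix and the accuracy derived from it can be computed directly (a small sketch using the dataset's 2 = benign / 4 = malignant class coding; the helper name is ours):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, labels=(2, 4)):
    """2x2 confusion matrix: rows are true classes, columns are
    predicted classes, in the order given by `labels`."""
    m = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        m[labels.index(t), labels.index(p)] += 1
    return m

# toy predictions: one benign case misclassified as malignant
cm = confusion_matrix([2, 2, 4, 4, 2], [2, 4, 4, 4, 2])
accuracy = cm.trace() / cm.sum()   # correct predictions sit on the diagonal
```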
  16. AdaBoost – Results. The Receiver Operating Characteristic (ROC): • The ROC plots the false positive rate on the X-axis (1 − specificity), the probability of predicting target = 1 when its true value is 0. • The true positive rate is on the Y-axis (sensitivity), the probability of predicting target = 1 when its true value is 1. • In the ideal case the curve rises quickly toward the top-left, indicating that the model makes correct predictions. Area Under the Curve (AUC): • The AUC is the probability that the classifier ranks a randomly chosen positive instance higher than a randomly chosen negative instance. • The AUC summarizes the total performance of the classifier; a higher AUC indicates better performance. • An AUC of 0.50 indicates random performance; an AUC of 1.00 indicates perfect performance. The ROC/AUC plot for AdaBoost has an AUC of 0.975; the ROC/AUC plot for Decision Stump has an AUC of 0.911.
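The rank interpretation of AUC described on this slide can be computed directly from classifier scores, without plotting the curve (a minimal sketch; the function name is ours):

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen positive
    instance is scored higher than a randomly chosen negative
    instance (ties count as half a win)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# one positive is out-ranked by one negative: 3 of 4 pairs ordered correctly
auc = roc_auc([0.9, 0.6, 0.2, 0.1], [1, 0, 1, 0])
```

When every positive outscores every negative the function returns 1.00, the "perfect performance" case on the slide.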
  17. CONCLUSION • As shown in these preliminary results, AdaBoost performs better than a lone Decision Stump. • However, much of AdaBoost's success depends on fine-tuning the parameters of the classifier and on the choice of weak learner.
  18. References
      1. Y. Freund and R. E. Schapire, "A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting," Journal of Computer and System Sciences, vol. 55, no. 1, pp. 119-139, Aug. 1997.
      2. T. G. Dietterich, "Ensemble Methods in Machine Learning," Lecture Notes in Computer Science, vol. 1857, pp. 1-15, 2000.
      3. K. Mivule, C. Turner, and S.-Y. Ji, "Towards a Differential Privacy and Utility Preserving Machine Learning Classifier," Procedia Computer Science, vol. 12, pp. 176-181, 2012.
      4. T. Fawcett, "An Introduction to ROC Analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861-874, 2006.
      5. K. Bache and M. Lichman, "Breast Cancer Wisconsin (Original) Data Set - UCI Machine Learning Repository," University of California, School of Information and Computer Science, Irvine, CA, 2013.
      6. MATLAB, "AdaBoost - MATLAB," online, accessed May 3, 2013.
      7. MATLAB, "Ensemble Methods :: Nonparametric Supervised Learning (Statistics Toolbox™)," online, accessed May 3, 2013.
      8. ROC Charts, "Model Evaluation – Classification," online, accessed May 3, 2013. Available: http://chem-
      9. MedCalc, "ROC Curve Analysis in MedCalc," online, accessed May 3, 2013. Available: curves.php
  19. THANK YOU. Contact: kmivule at gmail dot com