Support Vector Machines and
Associative Classifiers
SVM—Support Vector Machines
 A new classification method for both linear and nonlinear data
 It uses a nonlinear mapping to transform the original training data
into a higher dimension
 With the new dimension, it searches for the linear optimal separating
hyperplane (i.e., “decision boundary”)
 With an appropriate nonlinear mapping to a sufficiently high
dimension, data from two classes can always be separated by a
hyperplane
 SVM finds this hyperplane using support vectors (“essential” training
tuples) and margins (defined by the support vectors)
SVM—History and Applications
 Vapnik and colleagues (1992)—groundwork from Vapnik &
Chervonenkis’ statistical learning theory in 1960s
 Features: training can be slow, but accuracy is high owing to the
ability to model complex nonlinear decision boundaries
(margin maximization)
 Used for both classification and prediction
 Applications:
 handwritten digit recognition, object recognition, speaker
identification, benchmarking time-series prediction tests
SVM—General Philosophy
 Data set D = {(X1, y1), (X2, y2), …, (Xn, yn)}
 Xi – training tuples
 yi – class labels
 Linearly Separable
 Best separating line – minimum
classification error
 Plane / Hyperplane
 Maximal Margin Hyperplane (MMH)
 Larger margin – more accurate at classifying
future samples
 Gives largest separation among classes
 Sides of margin are parallel to hyperplane
 Margin: distance from MMH to the closest training tuple
of either class
SVM—General Philosophy
[Figure: two separating hyperplanes, one with a small margin and one with a large margin; the support vectors are the training tuples lying on the margin sides]
SVM—Linearly Separable
 A separating hyperplane can be written as
W · X + b = 0
where W={w1, w2, …, wn} is a weight vector and b a scalar (bias)
 For 2-D it can be written as
w0 + w1 x1 + w2 x2 = 0
 The hyperplanes defining the sides of the margin:
H1: w0 + w1 x1 + w2 x2 ≥ 1 for yi = +1, and
H2: w0 + w1 x1 + w2 x2 ≤ – 1 for yi = –1
 Any training tuples that fall on hyperplanes H1 or H2 (i.e., the sides defining the
margin) are support vectors
 Size of the maximal margin is 2 / ||W||
 ||W|| – Euclidean norm of W: ||W|| = (w1² + w2² + … + wn²)^1/2
 This becomes a constrained (convex) quadratic optimization problem
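The margin formula above can be checked with a short sketch; the weight vector and bias below are assumed illustrative values, not the output of an actual optimization:

```python
import math

# A weight vector W and bias b define the hyperplane W·X + b = 0;
# the margin sides W·X + b = +1 and W·X + b = -1 lie 2 / ||W|| apart.
W = [3.0, 4.0]   # assumed weight vector
b = -6.0         # assumed bias

norm_W = math.sqrt(sum(w * w for w in W))   # Euclidean norm ||W||
margin = 2.0 / norm_W                       # maximal margin width

print(norm_W)   # 5.0
print(margin)   # 0.4
```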
High Dimensional Data
 The complexity of the trained classifier is characterized by the number of support
vectors rather than the dimensionality of the data
 The support vectors are the essential or critical training examples —they lie closest
to the decision boundary (MMH)
 If all other training examples are removed and the training is repeated, the same
separating hyperplane would be found
 The number of support vectors found can be used to compute an (upper) bound on the
expected error rate of the SVM classifier, which is independent of the data
dimensionality
 Thus, an SVM with a small number of support vectors can have good generalization,
even when the dimensionality of the data is high
 Classification using SVM
 Maximal margin hyperplane: d(X^T) = ∑i=1..l yi αi (Xi · X^T) + b0
 Xi – support vectors (l of them), X^T – test tuple, αi and b0 – parameters obtained from the optimization
 Class depends on the sign of the result
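A minimal sketch of this decision function; the support vectors, labels, multipliers, and bias are made-up illustrative values, not the result of solving the quadratic program:

```python
# d(X^T) = sum_i y_i * alpha_i * (X_i · X^T) + b0; class = sign of d
support_vectors = [[1.0, 1.0], [3.0, 3.0]]  # the X_i (assumed)
y = [-1.0, 1.0]                             # class labels y_i
alpha = [0.5, 0.5]                          # Lagrangian multipliers (assumed)
b0 = -2.0                                   # bias (assumed)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def decide(x_test):
    d = sum(y[i] * alpha[i] * dot(support_vectors[i], x_test)
            for i in range(len(support_vectors))) + b0
    return 1 if d > 0 else -1   # predicted class is the sign of d

print(decide([4.0, 4.0]))   # 1
print(decide([0.0, 0.0]))   # -1
```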
SVM—Linearly Inseparable
 Transform the original input data into a higher dimensional space
 Search for a linear separating hyperplane in the new space
 A 3-D vector X = (x1, x2, x3) is mapped into a 6-D space using
φ1(X) = x1, φ2(X) = x2, φ3(X) = x3, φ4(X) = (x1)², φ5(X) = x1x2, φ6(X) = x1x3
 Decision hyperplane: d(Z) = w1x1 + w2x2 + w3x3 + w4(x1)² + w5x1x2 + w6x1x3 + b,
which is linear in the new variables zi = φi(X):
d(Z) = w1z1 + w2z2 + w3z3 + w4z4 + w5z5 + w6z6 + b
 Issues: choosing the non-linear mapping, and the expensive computation of the
dot products Xi · X^T in the transformed space
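The 3-D to 6-D mapping on this slide can be written out directly; the weights used to evaluate d(Z) below are assumed for illustration only:

```python
# Explicit feature map φ(X) = (x1, x2, x3, x1², x1·x2, x1·x3)
def phi(x1, x2, x3):
    return [x1, x2, x3, x1 * x1, x1 * x2, x1 * x3]

Z = phi(2.0, 3.0, 1.0)
print(Z)   # [2.0, 3.0, 1.0, 4.0, 6.0, 2.0]

# In the new space, d(Z) = w1 z1 + ... + w6 z6 + b is linear.
w = [1.0, -1.0, 0.5, 0.25, 0.0, 0.0]   # assumed weights
b = -1.0                               # assumed bias
d = sum(wi * zi for wi, zi in zip(w, Z)) + b
print(d)   # -0.5
```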
SVM—Kernel functions
 Instead of computing the dot product on the transformed data
tuples, it is mathematically equivalent to apply a kernel
function K(Xi, Xj) to the original data, i.e., K(Xi, Xj) = Φ(Xi) · Φ(Xj)
 Typical Kernel Functions
 Polynomial kernel: K(Xi, Xj) = (Xi · Xj + 1)^h
 Sigmoid kernel: K(Xi, Xj) = tanh(κ Xi · Xj − δ)
 Gaussian radial basis function kernel:
K(Xi, Xj) = e^(−||Xi − Xj||² / 2σ²)
 SVM can also be used for classifying multiple (> 2) classes and for
regression analysis (with additional user parameters)
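The three kernels above can be computed directly on the original tuples; the parameter values h, κ, δ, and σ are free choices, set here only for illustration:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(xi, xj, h=2):
    return (dot(xi, xj) + 1) ** h               # (Xi·Xj + 1)^h

def sigmoid_kernel(xi, xj, kappa=0.1, delta=0.0):
    return math.tanh(kappa * dot(xi, xj) - delta)

def rbf_kernel(xi, xj, sigma=1.0):
    sq = sum((a - b) ** 2 for a, b in zip(xi, xj))   # ||Xi − Xj||²
    return math.exp(-sq / (2 * sigma ** 2))

a, b = [1.0, 2.0], [2.0, 1.0]
print(poly_kernel(a, b))   # 25
print(rbf_kernel(a, b))    # e^-1 ≈ 0.3679
```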
SVM vs. Neural Network
 SVM
 Relatively new concept
 Deterministic algorithm
 Nice Generalization
properties
 Hard to learn – learned in
batch mode using quadratic
programming techniques
 Using kernels can learn very
complex functions
 Neural Network
 Relatively old
 Nondeterministic algorithm
 Generalizes well but doesn’t
have strong mathematical
foundation
 Can easily be learned in
incremental fashion
 To learn complex functions—
use multilayer perceptron
Associative Classification
 Associative classification
 Association rules are generated and analyzed for use in classification
 Search for strong associations between frequent patterns (conjunctions of attribute-
value pairs) and class labels
 Classification: Based on evaluating a set of rules in the form of
p1 ∧ p2 ∧ … ∧ pl → “Aclass = C” (conf, sup)
 Effectiveness
 It explores highly confident associations among multiple attributes
 Overcomes some constraints introduced by decision-tree induction, which considers
only one attribute at a time
 Associative classification has been found to be more accurate than some traditional
classification methods, such as C4.5
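A small sketch of how such rules classify a tuple: each rule pairs an antecedent (attribute-value pairs) with a class label and its (confidence, support). The rule set and attribute names below are hypothetical illustration, not mined from real data:

```python
# Rules: (antecedent, class label, (confidence, support))
rules = [
    ({"outlook": "sunny", "humidity": "high"}, "no",  (0.90, 0.20)),
    ({"outlook": "overcast"},                  "yes", (0.95, 0.25)),
    ({"wind": "weak"},                         "yes", (0.70, 0.40)),
]

def classify(tuple_, rules):
    # Try rules in decreasing (confidence, support) order — the
    # CBA-style precedence — and fire the first one that matches.
    for cond, label, _ in sorted(rules, key=lambda r: r[2], reverse=True):
        if all(tuple_.get(attr) == val for attr, val in cond.items()):
            return label
    return None  # a default class would normally go here

print(classify({"outlook": "overcast", "wind": "weak"}, rules))   # yes
print(classify({"outlook": "sunny", "humidity": "high"}, rules))  # no
```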
Typical Associative Classification Methods
 CBA (Classification By Association)
 Mine possible rules in the form of
 Cond-set (a set of attribute-value pairs) → class label
 Iterative Approach similar to Apriori
 Build classifier: Organize rules according to decreasing precedence based on
confidence and then support
 CMAR (Classification based on Multiple Association Rules)
 Uses variant of FP-Growth
 Mined rules are stored and retrieved using tree structure
 Rule Pruning is carried out – Specialized rules with low confidence are pruned
 Classification: Statistical analysis on multiple rules
 CPAR (Classification based on Predictive Association Rules)
 Generation of predictive rules (FOIL-like analysis)
 High efficiency, accuracy similar to CMAR
 Uses best-k rules of each group to predict class label