Naïve Bayes Classifier
Definition
In machine learning, Naive Bayes classifiers are a family of simple probabilistic classifiers based on
applying Bayes' theorem with strong (naive) independence assumptions between the features.
Potential Use Cases
Naive Bayes is one of the most basic text classification techniques, with various applications:
• Email Spam Detection
• Language Detection
• Sentiment Detection
• Personal email sorting
• Document categorization
Advantages
• Uses only simple probability and Bayes’ theorem
• Computationally efficient
Basic Probability Theory
Events and Event Probability
An “event” is a set of outcomes (a subset of all possible outcomes) with a probability attached. So
when flipping a fair coin, one of two events can happen: tail or head. Each of them has a
probability of 50%. The slide illustrates this with Venn diagrams.
[Figure: Venn diagrams of the events of flipping a coin and the events of rain and cloud formation]
Event Relationship
• Two events are disjoint (exclusive) if they can’t happen at the same time (a single coin flip cannot
yield a tail and a head at the same time). For Bayes classification, we are not concerned with
disjoint events.
• Two events are independent if they can happen at the same time, but the occurrence of one
event does not make the occurrence of the other more or less probable. For example, the second
coin flip you make is not affected by the outcome of the first coin flip.
• Two events are dependent if the outcome of one affects the other. In the rain and cloud example,
it clearly cannot rain without a cloud formation. Likewise, in a horse race, some horses perform
better on rainy days.
Conditional Probability and Independence
Two events are said to be independent if the result of the second event is not affected by the
result of the first event. The joint probability is the product of the probabilities of the
individual events.
Two events are said to be dependent if the result of the second event is affected by the result of
the first event. The joint probability is the product of the probability of the first event and the
conditional probability of the second event given the first event.
Chain Rule for Computing Joint Probability
For dependent events:
$$P(A, B) = P(A) \cdot P(B \mid A)$$
For independent events:
$$P(A, B) = P(A) \cdot P(B)$$
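The two forms of the chain rule can be checked by brute-force enumeration. The sketch below is illustrative only: it assumes two fair coin flips, and the helper names `prob` and `cond_prob` are hypothetical, not from the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin flips, all four outcomes equally likely.
outcomes = list(product("HT", repeat=2))

def prob(event):
    """P(event) under the uniform distribution; event is a predicate on outcomes."""
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))

def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)

def first_heads(o):
    return o[0] == "H"

def second_heads(o):
    return o[1] == "H"

def any_heads(o):
    return "H" in o

# Independent events: the joint probability is the plain product.
joint_indep = prob(lambda o: first_heads(o) and second_heads(o))
print(joint_indep == prob(first_heads) * prob(second_heads))

# Dependent events: the second factor must be the conditional probability.
joint_dep = prob(lambda o: first_heads(o) and any_heads(o))
print(joint_dep == prob(first_heads) * cond_prob(any_heads, first_heads))
print(joint_dep == prob(first_heads) * prob(any_heads))  # plain product is wrong here
```

Exact fractions are used so the equalities hold without floating point tolerance.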
Conditional Probability and Bayes Theorem
Conditional Probability:
$$P(c, X) = P(c \mid X) \cdot P(X) = P(X \mid c) \cdot P(c)$$
Bayes Theorem:
$$P(c \mid X) = \frac{P(X \mid c) \cdot P(c)}{P(X)}$$
• Posterior Probability P(c|X): probability of instance X being in class c. This is what we are
trying to compute.
• Likelihood P(X|c): probability of generating instance X given class c. Being in class c causes
you to have feature X with some probability.
• Class Prior Probability P(c): probability of occurrence of class c. This is just how frequent
class c is in our database.
• Predictor Prior Probability P(X): probability of instance X occurring. It is ignored because it
is constant across classes.
Bayes Theorem Example
Let’s take one example, with the following statistics:
• 30 emails out of a total of 74 are spam messages
• 51 emails out of those 74 contain the word “penis”
• 20 emails containing the word “penis” have been marked as spam
So the question is: what is the probability that the latest received email is a
spam message, given that it contains the word “penis”?
These two events are clearly dependent, which is why you must use the simple
form of the Bayes Theorem:
$$P(\text{spam} \mid \text{penis}) = \frac{P(\text{penis} \mid \text{spam}) \cdot P(\text{spam})}{P(\text{penis})} = \frac{\frac{20}{30} \cdot \frac{30}{74}}{\frac{51}{74}} = \frac{20}{51} \approx 0.39$$
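The arithmetic for this example can be reproduced in a few lines. This is a minimal sketch using only the counts given on the slide; the variable names are illustrative.

```python
from fractions import Fraction

# Counts taken from the example.
total = 74            # all emails
spam = 30             # emails marked as spam
with_word = 51        # emails containing the word
spam_with_word = 20   # emails containing the word that are spam

p_spam = Fraction(spam, total)                     # class prior P(spam)
p_word = Fraction(with_word, total)                # predictor prior P(word)
p_word_given_spam = Fraction(spam_with_word, spam) # likelihood P(word | spam)

# Bayes' theorem: P(spam | word) = P(word | spam) * P(spam) / P(word)
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(p_spam_given_word, float(p_spam_given_word))
```

The result reduces to 20/51, which is simply the share of emails containing the word that are spam, as expected for a single feature.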
Naïve Bayes Approach
For a single feature, applying Bayes' theorem is simple, but it becomes more
complex when handling more features.
Let us complicate the problem above by adding to it:
• 25 emails out of the total contain the word “viagra”
• 24 emails out of those have been marked as spam
So what is the probability that an email is spam, given that it contains both “viagra” and “penis”?
$$P(\text{spam} \mid \text{penis}, \text{viagra}) = \frac{P(\text{penis}, \text{viagra} \mid \text{spam}) \cdot P(\text{spam})}{P(\text{penis}, \text{viagra})}$$
To simplify it, a strong (naïve) independence assumption between features is applied.
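Under the naïve assumption the two-word question can be answered from the per-word counts alone. The sketch below assumes that "24 emails out of those" means 24 of the 25 "viagra" emails are spam, and it factorizes both the likelihood and the evidence term P(X) into per-word products, as the deck does later. The variable names are illustrative.

```python
from fractions import Fraction

total, spam = 74, 30

# Per-word statistics from the example (word1 and word2 are the two example words).
p_spam = Fraction(spam, total)
p_word1 = Fraction(51, total)            # P(word1)
p_word2 = Fraction(25, total)            # P(word2)
p_word1_given_spam = Fraction(20, spam)  # P(word1 | spam)
p_word2_given_spam = Fraction(24, spam)  # P(word2 | spam)

# Naive assumption: treat the words as independent, so the joint likelihood
# and the joint evidence both factorize into per-word terms.
numerator = p_word1_given_spam * p_word2_given_spam * p_spam
evidence = p_word1 * p_word2

p_spam_given_both = numerator / evidence
print(p_spam_given_both, float(p_spam_given_both))
```

Because the independence assumption is only an approximation, the value is an estimate rather than an exact conditional frequency. In general it is not even guaranteed to stay below 1 when the evidence term is factorized this way.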
Naïve Bayes Classifier
Learning
1. Compute the class prior table, which contains all P(c).
2. Compute the likelihood table, which contains P(Xi|c) for all possible
combinations of Xi and c.
Scoring
1. Given a test instance X with K features, compute the posterior probability of every class c:
$$P(c \mid X) \propto P(c) \cdot \prod_{i=1}^{K} P(X_i \mid c)$$
2. Compare all P(c|X) and assign the instance X to the class c* which has the
maximum posterior probability.
The constant term
$$P(X) = \prod_{i=1}^{K} P(X_i)$$
is ignored because it won’t affect the comparison across different posterior
probabilities.
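The learning and scoring steps can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names `train`, `score` and `predict` and the toy data set are hypothetical, features are assumed categorical, and no smoothing is applied.

```python
from collections import Counter, defaultdict

def train(samples):
    """Learning: build the class prior table P(c) and likelihood table P(Xi=v | c)."""
    class_counts = Counter(label for _, label in samples)
    priors = {c: n / len(samples) for c, n in class_counts.items()}
    # value_counts[(c, i)][v] = number of examples of class c with feature i == v
    value_counts = defaultdict(Counter)
    for features, label in samples:
        for i, v in enumerate(features):
            value_counts[(label, i)][v] += 1
    likelihoods = {
        (c, i): {v: n / class_counts[c] for v, n in counter.items()}
        for (c, i), counter in value_counts.items()
    }
    return priors, likelihoods

def score(x, priors, likelihoods):
    """Scoring: unnormalized posterior P(c) * prod_i P(x_i | c) for every class c."""
    scores = {}
    for c, prior in priors.items():
        s = prior
        for i, v in enumerate(x):
            s *= likelihoods[(c, i)].get(v, 0.0)  # unseen value gives 0 (no smoothing)
        scores[c] = s
    return scores

def predict(x, priors, likelihoods):
    """Return the class c* with the maximum (unnormalized) posterior."""
    scores = score(x, priors, likelihoods)
    return max(scores, key=scores.get)

# Hypothetical toy data: (contains_offer, contains_meeting) -> label
data = [
    ((1, 0), "spam"), ((1, 0), "spam"), ((1, 1), "spam"),
    ((0, 1), "ham"), ((0, 1), "ham"), ((1, 1), "ham"), ((0, 0), "ham"),
]
priors, likelihoods = train(data)
print(predict((1, 0), priors, likelihoods))
print(predict((0, 1), priors, likelihoods))
```

Training is a single counting pass over the data, which is where the method's computational efficiency comes from.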
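To avoid floating point underflow, we often need an optimization on the formula: take logarithms,
so that the product of many small probabilities becomes a sum. Up to a constant that is the same
for every class,
$$\log P(c \mid X) \approx \log P(c) + \sum_{i=1}^{K} \log P(X_i \mid c)$$
and the predicted class is
$$c^{*} = \arg\max_{c} \left[ \log P(c) + \sum_{i=1}^{K} \log P(X_i \mid c) \right]$$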
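The underflow problem is easy to reproduce. The numbers below are hypothetical (2000 features, each with likelihood 0.01) and are chosen only so that the direct product falls below the smallest representable double.

```python
import math

# Hypothetical class with prior 0.5 and 2000 feature likelihoods of 0.01 each.
prior = 0.5
feature_likelihoods = [0.01] * 2000

# Direct product: the true value (about 5e-4001) is far below the smallest
# positive double, so the result underflows to exactly 0.0.
product_score = prior
for p in feature_likelihoods:
    product_score *= p

# Sum of logs: stays finite, and since log is monotonic the class ranking
# given by the product is preserved.
log_score = math.log(prior) + sum(math.log(p) for p in feature_likelihoods)

print(product_score)
print(log_score)
```

Once every class's product has underflowed to 0.0 the classes can no longer be compared, whereas the log scores remain distinguishable.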
Handling Insufficient Data
Problem
Both prior and conditional probabilities must be estimated from training data and are
therefore subject to error. If we have only a few training instances, then the
direct probability computation can give the extreme values 0 or 1.
Example
Suppose we try to predict whether a patient has an allergy based on
whether he has a cough. So we need to estimate P(allergy|cough). If
all patients in the training data have a cough, then P(cough=true|allergy)=1 and
P(cough=false|allergy)=1-P(cough=true|allergy)=0. Then we have
$$P(\text{allergy} \mid \text{cough}=\text{false}) \propto P(\text{cough}=\text{false} \mid \text{allergy}) \cdot P(\text{allergy}) = 0$$
• This means that no non-coughing person can have an allergy, which is
not true.
• The error arises because the training data contain no observations of non-coughing patients.
Solution
We need to smooth the estimates of conditional probabilities to eliminate zeros.
Laplace Smoothing
Assume a binary attribute Xi. Direct estimate:
$$P(X_i = 0 \mid c) = \frac{n_{c,i,0}}{n_c}, \qquad P(X_i = 1 \mid c) = \frac{n_{c,i,1}}{n_c}$$
Laplace estimate:
$$P(X_i = 0 \mid c) = \frac{n_{c,i,0} + 1}{n_c + 2}, \qquad P(X_i = 1 \mid c) = \frac{n_{c,i,1} + 1}{n_c + 2}$$
This is equivalent to a prior observation of one example of class c where Xi=0 and one
where Xi=1.
Generalized Laplace estimate:
$$P(X_i = v \mid c) = \frac{n_{c,i,v} + 1}{n_c + s_i}$$
• n_{c,i,v}: number of examples in c where Xi=v
• n_c: number of examples in c
• s_i: number of possible values for Xi
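The generalized Laplace estimate is a one-line function. The sketch below applies it to the cough and allergy setting with hypothetical counts (10 allergy patients, all coughing); the function name `laplace_estimate` is illustrative.

```python
from fractions import Fraction

def laplace_estimate(n_civ, n_c, s_i):
    """Generalized Laplace estimate of P(Xi = v | c): (n_civ + 1) / (n_c + s_i).

    n_civ: number of examples in class c where Xi = v
    n_c:   number of examples in class c
    s_i:   number of possible values of Xi
    """
    return Fraction(n_civ + 1, n_c + s_i)

# Hypothetical counts: 10 allergy patients, all of them with a cough.
n_allergy = 10
p_cough_true = laplace_estimate(10, n_allergy, 2)   # direct estimate would be 1
p_cough_false = laplace_estimate(0, n_allergy, 2)   # direct estimate would be 0

print(p_cough_true, p_cough_false)
```

The smoothed estimates still sum to 1 over the values of Xi, and the zero that would otherwise wipe out the whole product is replaced by a small positive probability.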
Comments on Naïve Bayes Classifier
• It generally works well despite the blanket independence assumption
• Experiments show that it is quite competitive with other methods on
standard datasets
• Even when the independence assumptions are violated and the probability estimates
are inaccurate, the method may still find the maximum probability category
• The hypothesis is constructed directly from parameter estimates derived from the
training data; no search is involved
• The hypothesis is not guaranteed to fit the training data

SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 

Naive Bayes Classifier

  • 2. Naïve Bayes Classifier
    Definition: In machine learning, Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features.
    Potential Use Cases: It is one of the most basic text classification techniques, with various applications:
      • Email spam detection
      • Language detection
      • Sentiment detection
      • Personal email sorting
      • Document categorization
    Advantages:
      • Uses only simple probability and Bayes’ theorem
      • Computational efficiency
  • 3. Basic Probability Theory
    Events and Event Probability: An “event” is a set of outcomes (a subset of all possible outcomes) with a probability attached. When flipping a coin, one of 2 events can happen: tail or head. Each of them has a probability of 50%.
    [Figure: Venn diagrams of the events of flipping a coin and of the events of rain and cloud formation]
    Event Relationship:
      • 2 events are disjoint (exclusive) if they can’t happen at the same time (a single coin flip cannot yield a tail and a head at the same time). For Bayes classification, we are not concerned with disjoint events.
      • 2 events are independent when they can happen at the same time, but the occurrence of one event does not make the occurrence of the other more or less probable. For example, the second coin flip you make is not affected by the outcome of the first coin flip.
      • 2 events are dependent if the outcome of one affects the other. In the rain and cloud formation example, it clearly cannot rain without a cloud formation. Also, in a horse race, some horses perform better on rainy days.
  • 4. Conditional Probability and Independence
    Two events are said to be independent if the result of the second event is not affected by the result of the first event. The joint probability is the product of the probabilities of the individual events.
    Two events are said to be dependent if the result of the second event is affected by the result of the first event. The joint probability is the product of the probability of the first event and the conditional probability of the second event given the first.
    Chain Rule for Computing Joint Probability:
      • For dependent events: $P(A, B) = P(A) \cdot P(B \mid A)$
      • For independent events: $P(A, B) = P(A) \cdot P(B)$
  • 5. Conditional Probability and Bayes Theorem
    Conditional Probability: $P(c, X) = P(c \mid X) \cdot P(X) = P(X \mid c) \cdot P(c)$
    Bayes Theorem: $P(c \mid X) = \frac{P(X \mid c) \cdot P(c)}{P(X)}$
      • Posterior Probability $P(c \mid X)$ (this is what we are trying to compute): probability of instance X being in class c
      • Likelihood $P(X \mid c)$ (being in class c causes you to have feature X with some probability): probability of generating instance X given class c
      • Class Prior Probability $P(c)$ (this is just how frequent class c is in our database): probability of occurrence of class c
      • Predictor Prior Probability $P(X)$ (ignored because it is constant): probability of instance X occurring
  • 6. Bayes Theorem Example
    Let’s take one example. We have the following stats:
      • 30 emails out of a total of 74 are spam messages
      • 51 emails out of those 74 contain the word “penis”
      • 20 emails containing the word “penis” have been marked as spam
    So the question is: what is the probability that the latest received email is a spam message, given that it contains the word “penis”? These 2 events are clearly dependent, which is why you must use the simple form of the Bayes Theorem:
    $P(\text{spam} \mid \text{penis}) = \frac{P(\text{penis} \mid \text{spam}) \cdot P(\text{spam})}{P(\text{penis})} = \frac{(20/30) \cdot (30/74)}{51/74} = \frac{20}{51} \approx 0.39$
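  • 7. Naïve Bayes Approach
    Let us complicate the problem above by adding to it:
      • 25 emails out of the total contain the word “viagra”
      • 24 emails out of those have been marked as spam
    So what’s the probability that an email is spam, given that it contains both “viagra” and “penis”?
    For a single feature, applying Bayes theorem is simple, but it becomes more complex when handling more features. For example:
    $P(\text{spam} \mid \text{penis}, \text{viagra}) = \frac{P(\text{penis}, \text{viagra} \mid \text{spam}) \cdot P(\text{spam})}{P(\text{penis}, \text{viagra})}$
    To simplify it, a strong (naïve) independence assumption between features is applied, so that the joint terms factor into per-word terms:
    $P(\text{spam} \mid \text{penis}, \text{viagra}) \approx \frac{P(\text{penis} \mid \text{spam}) \cdot P(\text{viagra} \mid \text{spam}) \cdot P(\text{spam})}{P(\text{penis}) \cdot P(\text{viagra})}$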
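The two-word question can be worked through numerically. The sketch below uses only the email counts stated on the slides; the variable names are illustrative, and the non-spam counts are derived by subtraction. It shows two ways of handling the denominator: approximating it by the product of the word marginals (as the later slide's constant term does), and normalising over the two classes, which is a common alternative and is an assumption here rather than something the slides state.

```python
from fractions import Fraction

# Counts taken from the slides; names are illustrative.
TOTAL = 74        # all emails
SPAM = 30         # emails marked as spam
WORD_A = 51       # emails containing the first word
WORD_A_SPAM = 20  # of those, marked as spam
WORD_B = 25       # emails containing "viagra"
WORD_B_SPAM = 24  # of those, marked as spam

HAM = TOTAL - SPAM  # non-spam emails, derived by subtraction

p_spam = Fraction(SPAM, TOTAL)
p_ham = Fraction(HAM, TOTAL)
p_a_given_spam = Fraction(WORD_A_SPAM, SPAM)
p_b_given_spam = Fraction(WORD_B_SPAM, SPAM)
p_a_given_ham = Fraction(WORD_A - WORD_A_SPAM, HAM)
p_b_given_ham = Fraction(WORD_B - WORD_B_SPAM, HAM)

# Single-feature case: plain Bayes theorem.
single = p_a_given_spam * p_spam / Fraction(WORD_A, TOTAL)

# Naive numerators: prior times the product of per-word likelihoods.
spam_score = p_a_given_spam * p_b_given_spam * p_spam
ham_score = p_a_given_ham * p_b_given_ham * p_ham

# Option 1: approximate P(X) by the product of the word marginals.
evidence = Fraction(WORD_A, TOTAL) * Fraction(WORD_B, TOTAL)
posterior_marginals = spam_score / evidence

# Option 2 (assumption): normalise over the two classes instead.
posterior_normalised = spam_score / (spam_score + ham_score)

print(float(single))
print(float(posterior_marginals))
print(float(posterior_normalised))
```

Both options rank spam well above non-spam; they differ in the reported probability because the product of marginals is only an approximation of the true evidence term.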
  • 8. Naïve Bayes Classifier
    Learning
      1. Compute the class prior table, which contains all P(c)
      2. Compute the likelihood table, which contains P(X_i|c) for all possible combinations of X_i and c
    Scoring
      1. Given a test instance X, compute the posterior probability of every class c:
         $P(c \mid X) \propto P(c) \prod_{i=1}^{K} P(X_i \mid c)$
         The constant term $P(X) = \prod_{i=1}^{K} P(X_i)$ is ignored because it won’t affect the comparison across different posterior probabilities.
      2. Compare all P(c|X) and assign the instance X to the class c* which has the maximum posterior probability.
    To avoid floating-point underflow, the product is usually replaced by a sum of logarithms. Taking logarithms and again dropping the constant term:
    $\log P(c \mid X) \approx \log P(c) + \sum_{i=1}^{K} \log P(X_i \mid c)$
    $c^{*} = \arg\max_{c} \left[ \log P(c) + \sum_{i=1}^{K} \log P(X_i \mid c) \right]$
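The learning and scoring steps can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `train` and `predict` and the toy dataset are hypothetical, features are assumed to be categorical, and add-one (Laplace) smoothing is applied so that the logarithm is always defined.

```python
import math
from collections import Counter, defaultdict

def train(rows, labels):
    """Learning: build the class prior table P(c) and the likelihood table P(x_i | c)."""
    class_counts = Counter(labels)
    priors = {c: n / len(labels) for c, n in class_counts.items()}
    counts = defaultdict(Counter)  # counts[(i, c)][v]: examples of class c with feature i == v
    values = defaultdict(set)      # values[i]: all values seen for feature i
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            counts[(i, c)][v] += 1
            values[i].add(v)
    likelihoods = {}
    for i, vals in values.items():
        for c, n_c in class_counts.items():
            for v in vals:
                # Add-one smoothing (an assumption of this sketch) avoids log(0).
                likelihoods[(i, v, c)] = (counts[(i, c)][v] + 1) / (n_c + len(vals))
    return priors, likelihoods

def predict(x, priors, likelihoods):
    """Scoring: return the class with the largest log prior plus summed log likelihoods."""
    scores = {
        c: math.log(p) + sum(math.log(likelihoods[(i, v, c)]) for i, v in enumerate(x))
        for c, p in priors.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical toy data: each row is (contains_offer, contains_link).
rows = [(1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0)]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

priors, likelihoods = train(rows, labels)
label_a, scores_a = predict((1, 1), priors, likelihoods)
label_b, scores_b = predict((0, 0), priors, likelihoods)
print(label_a, label_b)
```

Summing logarithms gives the same ranking as multiplying probabilities, because the logarithm is monotonic, while keeping the scores in a numerically safe range when there are many features. A feature value never seen during training would raise a `KeyError` in this sketch.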
  • 9. Handling Insufficient Data Problem
    Both prior and conditional probabilities must be estimated from training data and are therefore subject to error. With only a few training instances, direct probability computation can yield the extreme values 0 or 1.
    Example: Suppose we try to predict whether a patient has an allergy based on whether the patient has a cough, so we need to estimate P(allergy|cough). If all patients in the training data have a cough, then P(cough=true|allergy) = 1 and P(cough=false|allergy) = 1 - P(cough=true|allergy) = 0. Then we have
    $P(\text{allergy} \mid \text{cough}=\text{false}) \propto P(\text{cough}=\text{false} \mid \text{allergy}) \cdot P(\text{allergy}) = 0$
      • This means that no non-coughing person can have an allergy, which is not true.
      • The error arises because the training data contain no observations of non-coughing patients.
    Solution: We need to smooth the estimates of the conditional probabilities to eliminate zeros.
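The zero-probability effect is easy to reproduce. The sketch below uses a small, made-up patient list in which every patient coughs, as in the example; the data and the helper name `direct_estimate` are hypothetical.

```python
# Hypothetical training set: (has_cough, has_allergy) for each patient.
# Every patient coughs, as in the example on the slide.
patients = [(True, True), (True, True), (True, False), (True, True), (True, False)]

def direct_estimate(cough_value, allergy_value):
    """Relative-frequency estimate of P(cough = cough_value | allergy = allergy_value)."""
    in_class = [cough for cough, allergy in patients if allergy == allergy_value]
    return sum(1 for cough in in_class if cough == cough_value) / len(in_class)

p_allergy = sum(1 for _, allergy in patients if allergy) / len(patients)
p_no_cough_given_allergy = direct_estimate(False, True)

# Unnormalised posterior score for "allergy" given a non-coughing patient.
score = p_no_cough_given_allergy * p_allergy
print(p_allergy, p_no_cough_given_allergy, score)
```

Because the score is a product, a single zero factor forces it to zero no matter how strongly any other evidence points to the class.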
  • 10. Laplace Smoothing
    Assume a binary attribute X_i.
    Direct estimate: $P(X_i = 0 \mid c) = \frac{n_{c,i,0}}{n_c}$, $P(X_i = 1 \mid c) = \frac{n_{c,i,1}}{n_c}$
    Laplace estimate: $P(X_i = 0 \mid c) = \frac{n_{c,i,0} + 1}{n_c + 2}$, $P(X_i = 1 \mid c) = \frac{n_{c,i,1} + 1}{n_c + 2}$
    This is equivalent to a prior observation of one example of class c where X_i = 0 and one where X_i = 1.
    Generalized Laplace estimate: $P(X_i = v \mid c) = \frac{n_{c,i,v} + 1}{n_c + s_i}$
      • n_{c,i,v}: number of examples in c where X_i = v
      • n_c: number of examples in c
      • s_i: number of possible values for X_i
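A short sketch of the generalized Laplace estimate, applied to the cough example. The count of 10 allergy patients is a made-up number for illustration, and the function name `laplace_estimate` is hypothetical.

```python
from fractions import Fraction

def laplace_estimate(n_civ, n_c, s_i):
    """Generalized Laplace estimate of P(X_i = v | c).

    n_civ: number of examples in class c where X_i = v
    n_c:   number of examples in class c
    s_i:   number of possible values for X_i
    """
    return Fraction(n_civ + 1, n_c + s_i)

# Hypothetical counts: 10 allergy patients, all of them coughing.
n_allergy = 10
direct_no_cough = Fraction(0, n_allergy)                 # direct estimate: exactly 0
smoothed_no_cough = laplace_estimate(0, n_allergy, 2)    # (0 + 1) / (10 + 2)
smoothed_cough = laplace_estimate(10, n_allergy, 2)      # (10 + 1) / (10 + 2)

print(direct_no_cough, smoothed_no_cough, smoothed_cough)
```

The smoothed estimates still sum to 1 across the values of the attribute, and the correction shrinks as the number of training examples in the class grows.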
  • 11. Comments on Naïve Bayes Classifier
      • It generally works well despite the blanket independence assumption
      • Experiments show that it is quite competitive with other methods on standard datasets
      • Even when the independence assumptions are violated and the probability estimates are inaccurate, the method may still find the maximum-probability category
      • The hypothesis is constructed directly from parameter estimates derived from the training data, with no search
      • The hypothesis is not guaranteed to fit the training data