Overview of Machine Learning
Raymond J. Mooney
Department of Computer Sciences
University of Texas at Austin
What is Learning?
Classification Examples
Other Tasks
How is Performance Measured?
Training Experience
Types of Direct Supervision
Categorization
Learning for Categorization
Sample Category Learning Problem

Example  Size   Color  Shape     Category
1        small  red    circle    positive
2        large  red    circle    positive
3        small  red    triangle  negative
4        large  blue   circle    negative
General Learning Issues
Learning as Search
Types of Bias
Generalization
Over-Fitting
Learning Approaches

Approach                                Representation         Search Method
Rule Induction                          Rules                  Greedy set covering
Decision tree induction                 Decision trees         Greedy divide & conquer
Neural Network                          Artificial neural net  Gradient descent
Instance/Case-based (Nearest Neighbor)  Stored instances       Memorize, then find closest match
Bayes Net                               Bayesian network       Maximum likelihood / EM
Hidden Markov Model                     HMM                    EM (forward-backward)
Probabilistic Grammar                   PCFG                   EM (inside-outside)
More Learning Approaches

Approach                      Representation     Search Method
Maximum Entropy (MaxEnt)      Exponential model  Generalized/Improved Iterative Scaling
Support Vector Machine (SVM)  Hyperplane         Quadratic optimization
Prototype                     Average instance   Averaging
Inductive Logic Programming   Prolog program     Greedy set covering
Evolutionary computation      Rules/neural nets  Genetic algorithm
Text Categorization
Relevance Feedback Architecture

Query String -> IR System (over the document corpus) -> Ranked Documents (1. Doc1, 2. Doc2, 3. Doc3, ...)
User feedback on the rankings -> Query Reformulation -> Revised Query -> IR System -> Reranked Documents (1. Doc2, 2. Doc4, 3. Doc5, ...)
Using Relevance Feedback (Rocchio)
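The update rule on this slide did not survive extraction; as a reconstruction, the standard textbook form of the Rocchio query-reformulation rule (not necessarily the exact variant shown here), with tunable weights α, β, γ and relevant/non-relevant document sets D_r and D_n, is:

$$\vec{q}_{new} \;=\; \alpha\,\vec{q} \;+\; \frac{\beta}{|D_r|}\sum_{\vec{d}\in D_r}\vec{d} \;-\; \frac{\gamma}{|D_n|}\sum_{\vec{d}\in D_n}\vec{d}$$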
Illustration of Rocchio Text Categorization
Rocchio Text Categorization Algorithm (Training)

Assume the set of categories is {c1, c2, ..., cn}
For i from 1 to n, let pi = <0, 0, ..., 0>   (initialize prototype vectors)
For each training example <x, c(x)> in D:
    Let d be the frequency-normalized TF/IDF term vector for doc x
    Let i = j such that cj = c(x)
    Let pi = pi + d   (sum all the document vectors in ci to get pi)
Rocchio Text Categorization Algorithm (Test)

Given test document x
Let d be the TF/IDF weighted term vector for x
Let m = -2   (initialize maximum cosSim)
For i from 1 to n:   (compute similarity to each prototype vector)
    Let s = cosSim(d, pi)
    if s > m:
        let m = s
        let r = ci   (update most similar class prototype)
Return class r
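A minimal Python sketch of the two Rocchio procedures above. It assumes documents arrive as precomputed TF-IDF vectors stored in plain dicts of {term: weight}; the helper names are illustrative, not from the slides.

```python
import math
from collections import defaultdict

def cos_sim(u, v):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def train_rocchio(examples):
    """examples: iterable of (tfidf_vector, category).
    Sums the document vectors of each category into its prototype."""
    prototypes = defaultdict(lambda: defaultdict(float))
    for vec, cat in examples:
        for term, weight in vec.items():        # p_i = p_i + d
            prototypes[cat][term] += weight
    return prototypes

def classify_rocchio(vec, prototypes):
    """Return the category whose prototype vector is most cosine-similar."""
    return max(prototypes, key=lambda c: cos_sim(vec, prototypes[c]))
```

Once the prototypes are trained, classifying a test document costs one cosine comparison per category.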
Rocchio Properties
Rocchio Time Complexity
Nearest-Neighbor Learning Algorithm
K Nearest-Neighbor
Similarity Metrics
3 Nearest Neighbor Illustration (Euclidean Distance)
K Nearest Neighbor for Text

Training:
For each training example <x, c(x)> in D:
    Compute the corresponding TF-IDF vector, dx, for document x

Test instance y:
Compute TF-IDF vector d for document y
For each <x, c(x)> in D:
    Let sx = cosSim(d, dx)
Sort examples x in D by decreasing value of sx
Let N be the first k examples in D   (get most similar neighbors)
Return the majority class of examples in N
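A minimal sketch of the kNN test procedure above, reusing the cos_sim helper and dict-based TF-IDF vectors from the Rocchio sketch earlier. The training-set layout, a list of (vector, category) pairs, is an assumption.

```python
from collections import Counter

def knn_classify(test_vec, training, k=3):
    """training: list of (tfidf_vector, category) pairs."""
    # Rank all training examples by cosine similarity to the test doc.
    ranked = sorted(training,
                    key=lambda ex: cos_sim(test_vec, ex[0]),
                    reverse=True)
    neighbors = ranked[:k]                 # the k most similar examples
    # Return the majority class among the k nearest neighbors.
    return Counter(cat for _, cat in neighbors).most_common(1)[0][0]
```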
Illustration of 3 Nearest Neighbor for Text
Rocchio Anomaly
3 Nearest Neighbor Comparison
Nearest Neighbor Time Complexity
Nearest Neighbor with Inverted Index
Bayesian Methods
Conditional Probability

P(A | B) = P(A ∧ B) / P(B): the probability of A given that B holds.
Independence

A and B are independent iff P(A ∧ B) = P(A)·P(B), or equivalently P(A | B) = P(A); these two constraints are logically equivalent.
Bayes Theorem

P(H | E) = P(E | H) · P(H) / P(E)
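The slide's derivation was lost to extraction; its two "def. cond. prob." steps reconstruct as:

$$P(H \mid E) = \frac{P(H \wedge E)}{P(E)} \quad \text{(def. cond. prob.)} \qquad P(E \mid H) = \frac{P(H \wedge E)}{P(H)} \quad \text{(def. cond. prob.)}$$

The second equation gives $P(H \wedge E) = P(E \mid H)\,P(H)$; substituting into the first yields

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}. \qquad \blacksquare$$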
Bayesian Categorization
Bayesian Categorization (cont.)
Naïve Bayesian Categorization
Naïve Bayes Example

Prob           Well  Cold  Allergy
P(ci)          0.9   0.05  0.05
P(sneeze|ci)   0.1   0.9   0.9
P(cough|ci)    0.1   0.8   0.7
P(fever|ci)    0.01  0.7   0.4
Naïve Bayes Example (cont.)

E = {sneeze, cough, fever}

Probability    Well  Cold  Allergy
P(ci)          0.9   0.05  0.05
P(sneeze|ci)   0.1   0.9   0.9
P(cough|ci)    0.1   0.8   0.7
P(fever|ci)    0.01  0.7   0.4
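A quick computation of the unnormalized posteriors P(ci)·P(E | ci) for the evidence set as printed, under the naive assumption that the three symptoms are independent given the category. Dividing by P(E) would give true posteriors but does not change the argmax.

```python
# Data taken directly from the table above.
priors = {"Well": 0.9, "Cold": 0.05, "Allergy": 0.05}
cond = {
    "Well":    {"sneeze": 0.1, "cough": 0.1, "fever": 0.01},
    "Cold":    {"sneeze": 0.9, "cough": 0.8, "fever": 0.7},
    "Allergy": {"sneeze": 0.9, "cough": 0.7, "fever": 0.4},
}
evidence = ["sneeze", "cough", "fever"]

for c in priors:
    score = priors[c]
    for e in evidence:
        score *= cond[c][e]        # P(c) * prod of P(e | c)
    print(c, score)
# Well    0.9  * 0.1 * 0.1 * 0.01 = 9e-05
# Cold    0.05 * 0.9 * 0.8 * 0.7  = 0.0252
# Allergy 0.05 * 0.9 * 0.7 * 0.4  = 0.0126
# Cold has the highest unnormalized posterior for this evidence.
```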
Estimating Probabilities
Smoothing
Naïve Bayes for Text
Text Naïve Bayes Algorithm (Train)

Let V be the vocabulary of all words in the documents in D
For each category ci in C:
    Let Di be the subset of documents in D in category ci
    P(ci) = |Di| / |D|
    Let Ti be the concatenation of all the documents in Di
    Let ni be the total number of word occurrences in Ti
    For each word wj in V:
        Let nij be the number of occurrences of wj in Ti
        Let P(wj | ci) = (nij + 1) / (ni + |V|)
Text Naïve Bayes Algorithm (Test)

Given a test document X
Let n be the number of word occurrences in X
Return the category:

    argmax over ci in C of  P(ci) · Π (j = 1 to n) P(aj | ci)

where aj is the word occurring in the jth position in X
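A minimal sketch of the two naive Bayes algorithms above. Documents are lists of tokens, and the docs_by_cat layout is an assumption; words outside the vocabulary V are skipped at test time.

```python
from collections import Counter

def train_nb(docs_by_cat):
    """docs_by_cat: {category: [list_of_words, ...]}."""
    vocab = {w for docs in docs_by_cat.values() for doc in docs for w in doc}
    total_docs = sum(len(docs) for docs in docs_by_cat.values())
    priors, cond = {}, {}
    for c, docs in docs_by_cat.items():
        priors[c] = len(docs) / total_docs          # P(c_i) = |D_i| / |D|
        counts = Counter(w for doc in docs for w in doc)  # counts in T_i
        n_i = sum(counts.values())                  # word occurrences in T_i
        # Laplace smoothing: P(w_j | c_i) = (n_ij + 1) / (n_i + |V|)
        cond[c] = {w: (counts[w] + 1) / (n_i + len(vocab)) for w in vocab}
    return priors, cond, vocab

def classify_nb(doc, priors, cond, vocab):
    """Return argmax_ci P(c_i) * prod_j P(a_j | c_i)."""
    def score(c):
        s = priors[c]
        for w in doc:
            if w in vocab:
                s *= cond[c][w]
        return s
    return max(priors, key=score)
```

For realistic document lengths the raw product underflows; the log-space fix appears under Underflow Prevention below.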
Naïve Bayes Time Complexity
Underflow Prevention
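Multiplying many small probabilities underflows double precision, so scores are accumulated as sums of logs instead; since log is monotonic, the argmax is unchanged. A log-space variant of the score() function from the sketch above:

```python
import math

def log_score(doc, prior, cond_c, vocab):
    s = math.log(prior)
    for w in doc:
        if w in vocab:
            s += math.log(cond_c[w])   # log(a*b) = log(a) + log(b)
    return s
```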
Naïve Bayes Posterior Probabilities
Evaluating Categorization
N-Fold Cross-Validation
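A minimal sketch of N-fold cross-validation. The train(examples) -> model and test(model, examples) -> accuracy functions are hypothetical stand-ins for any of the learners in these slides, and the strided split is a simple unstratified variant.

```python
def cross_validate(examples, n_folds, train, test):
    folds = [examples[i::n_folds] for i in range(n_folds)]  # disjoint folds
    accuracies = []
    for i in range(n_folds):
        held_out = folds[i]                                 # test segment
        training = [ex for j, fold in enumerate(folds)
                    if j != i for ex in fold]               # other N-1 folds
        accuracies.append(test(train(training), held_out))
    return sum(accuracies) / n_folds       # average accuracy over N trials
```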
Learning Curves
N-Fold Learning Curves
Sample Document Corpus
Sample Learning Curve (Yahoo Science Data)
Clustering
Clustering Example
Hierarchical Clustering

Example taxonomy:
animal
  vertebrate: fish, reptile, amphibian, mammal
  invertebrate: worm, insect, crustacean
Agglomerative vs. Divisive Clustering
Direct Clustering Method
Hierarchical Agglomerative Clustering (HAC)
HAC Algorithm

Start with all instances in their own cluster.
Until there is only one cluster:
    Among the current clusters, determine the two clusters, ci and cj, that are most similar.
    Replace ci and cj with a single cluster ci ∪ cj
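A minimal sketch of the HAC loop above. Clusters are frozensets of instance indices, and sim(ci, cj, instances) is a stand-in for whichever cluster-similarity function is chosen (single link, complete link, or group average, per the following slides).

```python
from itertools import combinations

def hac(instances, sim):
    clusters = [frozenset([i]) for i in range(len(instances))]
    merges = []                        # records the dendrogram, bottom-up
    while len(clusters) > 1:
        # Determine the two most similar current clusters...
        ci, cj = max(combinations(clusters, 2),
                     key=lambda pair: sim(pair[0], pair[1], instances))
        # ...and replace them with their union.
        clusters.remove(ci)
        clusters.remove(cj)
        clusters.append(ci | cj)
        merges.append((ci, cj))
    return merges
```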
Cluster Similarity
Single Link Agglomerative Clustering
Single Link Example
Complete Link Agglomerative Clustering
Complete Link Example
Computational Complexity
Computing Cluster Similarity
Group Average Agglomerative Clustering
Computing Group Average Similarity
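The formulas on these two slides were lost; the standard group-average definition, and the sum-vector trick for computing it efficiently (assuming cosine similarity over unit-length vectors), are:

$$\mathrm{sim}(c_i, c_j) \;=\; \frac{1}{|c|\,(|c|-1)} \sum_{\vec{x} \in c} \;\sum_{\substack{\vec{y} \in c \\ \vec{y} \ne \vec{x}}} \mathrm{sim}(\vec{x}, \vec{y}), \qquad c = c_i \cup c_j$$

Maintaining the sum vector $\vec{s}(c) = \sum_{\vec{x} \in c} \vec{x}$ for each cluster, $\vec{s}(c)\cdot\vec{s}(c)$ equals the sum of all pairwise dot products plus the $|c|$ self-similarities, so

$$\mathrm{sim}(c_i, c_j) \;=\; \frac{\vec{s}(c) \cdot \vec{s}(c) \;-\; |c|}{|c|\,(|c|-1)}$$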
Non-Hierarchical Clustering
K-Means
Distance Metrics
K-Means Algorithm

Let d be the distance measure between instances.
Select k random instances {s1, s2, ..., sk} as seeds.
Until clustering converges (or another stopping criterion is met):
    For each instance xi:
        Assign xi to the cluster cj such that d(xi, sj) is minimal.
    (Update the seeds to the centroid of each cluster)
    For each cluster cj:
        sj = μ(cj)
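A minimal sketch of the K-Means loop above, for instances represented as tuples of floats under Euclidean distance.

```python
import math
import random

def centroid(cluster):
    """Mean of a non-empty list of equal-length tuples."""
    return tuple(sum(dim) / len(cluster) for dim in zip(*cluster))

def kmeans(instances, k, max_iters=100):
    seeds = random.sample(instances, k)     # k random instances as seeds
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for x in instances:                 # assign x to the nearest seed
            j = min(range(k), key=lambda idx: math.dist(x, seeds[idx]))
            clusters[j].append(x)
        # Update each seed to its cluster centroid: s_j = mu(c_j).
        new_seeds = [centroid(c) if c else seeds[j]
                     for j, c in enumerate(clusters)]
        if new_seeds == seeds:              # converged: seeds are stable
            break
        seeds = new_seeds
    return clusters, seeds
```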
K-Means Example (K=2): pick seeds; reassign clusters; compute centroids; reassign clusters; compute centroids; reassign clusters; converged!
Time Complexity
Seed Choice
Buckshot Algorithm
Text Clustering
Soft Clustering
Expectation Maximization (EM)
EM Algorithm
Learning from Probabilistically Labeled Data
Naïve Bayes EM

Randomly assign examples probabilistic category labels.
Use standard naïve Bayes training to learn a probabilistic model with parameters θ from the labeled data.
Until convergence, or until the maximum number of iterations is reached:
    E-Step: Use the naïve Bayes model θ to compute P(ci | E) for each category and example, and re-label each example using these probability values as soft category labels.
    M-Step: Use standard naïve Bayes training to re-estimate the parameters θ using these new probabilistic category labels.
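A minimal sketch of the EM loop above. soft_train and posterior are hypothetical stand-ins: soft_train must generalize the naive Bayes estimates to fractional (soft) counts, and posterior(model, c, x) must return P(c | x) under the current model.

```python
import random

def nb_em(examples, categories, soft_train, posterior, max_iters=20):
    # Randomly assign probabilistic category labels (one random
    # distribution over the categories per example).
    labels = []
    for _ in examples:
        w = [random.random() for _ in categories]
        labels.append([v / sum(w) for v in w])
    model = soft_train(examples, labels, categories)   # initial training
    for _ in range(max_iters):
        # E-step: re-label each example with P(c_i | E) as soft labels.
        labels = [[posterior(model, c, x) for c in categories]
                  for x in examples]
        # M-step: re-estimate parameters from the new soft labels.
        model = soft_train(examples, labels, categories)
    return model
```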
Semi-Supervised Learning
Semi-Supervised Example
Semi-Supervision Results
Active Learning
Weak Supervision
Prior Knowledge
Learning to Learn