Computer vision: models, learning and inference
Chapter 20
Models for visual words

Please send errata to s.prince@cs.ucl.ac.uk
Visual words

• Most models treat data as continuous
• Likelihood based on normal distribution
• Visual words = discrete representation of
  image
• Likelihood based on categorical distribution
• Useful for difficult tasks such as scene
  recognition and object recognition

         Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   2
Motivation: scene recognition




Structure

•   Computing visual words
•   Bag of words model
•   Latent Dirichlet allocation
•   Single author-topic model
•   Constellation model
•   Scene model
•   Applications

Computing dictionary of visual words

1. For every one of the I training images, select a
   set of J_i spatial locations.
     •   Interest points
     •   Regular grid
2. Compute a descriptor at each spatial location in
   each image
3. Cluster all of these descriptor vectors into K
   groups using a method such as the K-Means
   algorithm
4. The means of the K clusters are used as the K
   prototype vectors in the dictionary.
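The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's implementation: the function name `build_dictionary`, the random initialisation, the fixed iteration count, and the toy 2-D "descriptors" are all assumptions made for the example. Real descriptors would come from steps 1 and 2.

```python
import numpy as np

def build_dictionary(descriptors, K, n_iters=20, seed=0):
    """Cluster descriptors (N x D array) into K groups with plain K-means.

    Returns the K cluster means, which serve as the prototype vectors.
    """
    rng = np.random.default_rng(seed)
    # Initialise the means with K distinct descriptors chosen at random.
    means = descriptors[rng.choice(len(descriptors), size=K, replace=False)].copy()
    for _ in range(n_iters):
        # Assignment step: squared distance from every descriptor to every mean.
        d2 = ((descriptors[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: each mean becomes the average of its assigned descriptors.
        for k in range(K):
            members = descriptors[labels == k]
            if len(members) > 0:  # leave an empty cluster's mean unchanged
                means[k] = members.mean(axis=0)
    return means

# Toy stand-in for real descriptors (hypothetical data): two 2-D blobs.
rng = np.random.default_rng(1)
toy_descriptors = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.1, size=(50, 2)),
])
dictionary = build_dictionary(toy_descriptors, K=2)
```

In practice a library K-means with a better initialisation would probably be preferred; the loop is written out here only to make the assignment and update steps visible.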
Encoding images as visual words
1. Select a set of J spatial locations in the image using the same
   method as for the dictionary
2. Compute the descriptor at each of the J spatial locations.
3. Compare each descriptor to the set of K prototype
   descriptors in the dictionary
4. Assign a discrete index to this location that corresponds to
   the index of the closest word in the dictionary.

End result: each of the J locations is described by a discrete
feature index together with its x and y position.
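As a rough sketch of the encoding steps, the nearest-prototype assignment might look like the following. The function name `encode_image` and the toy dictionary, descriptors and positions are hypothetical values chosen for illustration, and squared Euclidean distance is assumed as the measure of closeness.

```python
import numpy as np

def encode_image(descriptors, positions, dictionary):
    """Map each descriptor to the index of its nearest dictionary prototype.

    descriptors: J x D array, positions: J x 2 array of (x, y),
    dictionary: K x D array of prototype vectors.
    Returns a list of (word_index, x, y) tuples, one per location.
    """
    # Squared Euclidean distance from every descriptor to every prototype.
    d2 = ((descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    return [(int(f), float(x), float(y)) for f, (x, y) in zip(words, positions)]

# Hypothetical toy values, for illustration only.
dictionary = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]])
descriptors = np.array([[0.1, -0.1], [3.8, 0.2], [0.9, 1.2]])
positions = np.array([[10.0, 20.0], [35.0, 5.0], [50.0, 60.0]])
encoded = encode_image(descriptors, positions, dictionary)
# encoded == [(0, 10.0, 20.0), (2, 35.0, 5.0), (1, 50.0, 60.0)]
```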
Bag of words model
Key idea:

• Abandon all spatial information
• Just represent image by relative frequency
  (histogram) of words from dictionary
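A minimal sketch of the histogram representation, assuming the image has already been encoded as an array of word indices in the range 0 to K-1 (the function name `word_histogram` and the toy indices are illustrative):

```python
import numpy as np

def word_histogram(word_indices, K):
    """Count how often each of the K dictionary words occurs in one image."""
    counts = np.bincount(word_indices, minlength=K)
    return counts, counts / counts.sum()

# Six visual words from one image; the positions are deliberately discarded.
words = np.array([0, 2, 2, 1, 2, 0])
counts, freqs = word_histogram(words, K=4)
# counts == [2, 1, 3, 0]
```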

Bag of words




Bag of words: learning and inference
Learning (MAP solution):




Inference:
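The equations for this slide did not survive in the transcript. As a hedged sketch of what MAP learning and inference typically look like for a categorical likelihood, the code below assumes a symmetric Dirichlet prior with parameter `alpha` over each class's word probabilities and a known prior over classes. The function names and toy counts are illustrative, not taken from the slides.

```python
import numpy as np

def learn_word_probs(T, w, N, alpha=2.0):
    """MAP estimate of per-class word probabilities.

    T: I x K array of word counts per training image.
    w: length-I array of class labels in 0..N-1.
    Assumes a symmetric Dirichlet prior with parameter alpha > 1.
    """
    K = T.shape[1]
    lam = np.zeros((N, K))
    for n in range(N):
        class_counts = T[w == n].sum(axis=0)
        lam[n] = (class_counts + alpha - 1.0) / (class_counts.sum() + K * alpha - K)
    return lam

def class_posterior(t, lam, prior):
    """Posterior over classes for a test image with word counts t (length K)."""
    log_post = t @ np.log(lam).T + np.log(prior)  # log likelihood + log prior
    log_post -= log_post.max()                    # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Hypothetical training histograms for two classes over K = 3 words.
T = np.array([[5, 1, 0], [4, 2, 0], [0, 1, 5], [1, 0, 6]])
w = np.array([0, 0, 1, 1])
lam = learn_word_probs(T, w, N=2)
post = class_posterior(np.array([3, 1, 0]), lam, prior=np.array([0.5, 0.5]))
```

Working in log space avoids underflow when an image contains many words, since the likelihood is a product of many probabilities below one.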




Bag of words for object recognition




Problems with bag of words




Latent Dirichlet allocation
• Describes relative frequency of visual words in a
  single image (no world term)
• Words not generated independently (connected by
  hidden variable)
• Analogy to text documents
   – Each image contains mixture of several topics (parts)
   – Each topic induces a distribution over words
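The text analogy can be made concrete with a small ancestral-sampling sketch: draw the image's part proportions, then a part for each word, then the word itself from that part's distribution. The names `sample_lda_image`, `alpha` and `lam`, and the toy word distributions, are assumptions for the example.

```python
import numpy as np

def sample_lda_image(J, alpha, lam, rng):
    """Ancestral sampling of one image's visual words under LDA.

    alpha: length-M Dirichlet parameter over parts (topics).
    lam:   M x K array; row m is part m's categorical distribution over words.
    """
    M, K = lam.shape
    pi = rng.dirichlet(alpha)             # image-specific part proportions
    parts = rng.choice(M, size=J, p=pi)   # hidden part label for each word
    words = np.array([rng.choice(K, p=lam[m]) for m in parts])
    return pi, parts, words

rng = np.random.default_rng(0)
# Two hypothetical parts with mostly disjoint word preferences.
lam = np.array([[0.7, 0.2, 0.1, 0.0],
                [0.0, 0.1, 0.2, 0.7]])
pi, parts, words = sample_lda_image(J=30, alpha=np.ones(2), lam=lam, rng=rng)
```

The words are independent only once the proportions `pi` are known; with `pi` hidden, every word in the image carries information about the others, which is the coupling the second bullet refers to.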




Latent Dirichlet allocation
Generative equations




Marginal distribution over features




Conjugate priors over parameters
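The three sets of equations named on this slide were not preserved. In the standard LDA formulation they can be written roughly as follows; the notation is an assumption: p_ij is the part label of the j-th word in image i, f_ij is the word index, pi_i holds the part proportions of image i, lambda_m is the word distribution of part m, and alpha and beta are Dirichlet parameters.

```latex
\begin{align}
Pr(p_{ij}) &= \mathrm{Cat}_{p_{ij}}[\boldsymbol{\pi}_i] \\
Pr(f_{ij} \mid p_{ij}) &= \mathrm{Cat}_{f_{ij}}[\boldsymbol{\lambda}_{p_{ij}}] \\
Pr(f_{ij}) &= \sum_{m=1}^{M} Pr(f_{ij} \mid p_{ij}=m)\,Pr(p_{ij}=m)
            = \sum_{m=1}^{M} \pi_{im}\,\lambda_{m f_{ij}} \\
Pr(\boldsymbol{\pi}_i) &= \mathrm{Dir}_{\boldsymbol{\pi}_i}[\alpha], \qquad
Pr(\boldsymbol{\lambda}_m) = \mathrm{Dir}_{\boldsymbol{\lambda}_m}[\beta]
\end{align}
```

The first two lines are the generative equations, the third is the marginal distribution over a single feature (a mixture of categorical distributions), and the fourth gives the conjugate priors.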



Learning LDA model
• Part labels p are hidden variables
• If we knew them then it would be easy to estimate the
  parameters




• How about the EM algorithm? Unfortunately, the parts within an
  image are not independent, so the posterior over part labels
  does not factorize

Learning
Strategy:

1. Write an expression for posterior distribution
   over part labels
2. Draw samples from posterior using MCMC
3. Use samples to estimate parameters




1. Posterior over part labels

The denominator is intractable, but the two terms in the
numerator can be computed in closed form




2. Draw samples from posterior
Gibbs sampling: fix all part labels except one and sample
from its conditional distribution




This can be computed in closed form
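The slide's conditional distribution is missing from the transcript. The sketch below uses the standard collapsed Gibbs update for LDA, assuming symmetric Dirichlet parameters `alpha` and `beta`; the function names and the toy word lists are hypothetical. Each label is removed from the count tables, resampled from its conditional given all the other labels, and added back.

```python
import numpy as np

def init_counts(words, parts, M, K):
    """Count tables implied by the current part labels."""
    n_mk = np.zeros((M, K))            # times word k is assigned to part m
    n_im = np.zeros((len(words), M))   # times part m is used in image i
    for i in range(len(words)):
        for k, m in zip(words[i], parts[i]):
            n_mk[m, k] += 1
            n_im[i, m] += 1
    return n_mk, n_im

def gibbs_sweep(words, parts, n_mk, n_im, alpha, beta, rng):
    """Resample every part label once, holding all the others fixed."""
    M, K = n_mk.shape
    for i in range(len(words)):
        for j in range(len(words[i])):
            k, m_old = words[i][j], parts[i][j]
            # Remove the current label from the counts.
            n_mk[m_old, k] -= 1
            n_im[i, m_old] -= 1
            # Conditional over the M parts, up to a constant factor.
            p = (n_mk[:, k] + beta) / (n_mk.sum(axis=1) + K * beta) * (n_im[i] + alpha)
            m_new = rng.choice(M, p=p / p.sum())
            # Record the new label and add it back into the counts.
            parts[i][j] = m_new
            n_mk[m_new, k] += 1
            n_im[i, m_new] += 1

rng = np.random.default_rng(0)
M, K = 2, 4
# Three tiny hypothetical "images", each a list of visual word indices.
words = [np.array([0, 0, 1, 0, 1]), np.array([2, 3, 3, 2, 3]), np.array([0, 1, 2, 3, 0])]
parts = [rng.integers(M, size=len(w)) for w in words]
n_mk, n_im = init_counts(words, parts, M, K)
for _ in range(50):
    gibbs_sweep(words, parts, n_mk, n_im, alpha=1.0, beta=1.0, rng=rng)
```

The per-image normalising term is the same for every part and is therefore dropped before normalising `p`.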




3. Use samples to estimate parameters

The samples substitute for the true part labels in the update
equations
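A minimal sketch of this last step, assuming one sample of the part labels and symmetric Dirichlet parameters `alpha` and `beta` (names and toy labels are illustrative): the sampled labels are simply counted as if they were observed, and the smoothed counts give the word and part probabilities.

```python
import numpy as np

def estimate_parameters(words, parts, M, K, alpha=1.0, beta=1.0):
    """Estimate word and part probabilities from one sample of the part labels."""
    I = len(words)
    n_mk = np.zeros((M, K))   # times word k is assigned to part m
    n_im = np.zeros((I, M))   # times part m is used in image i
    for i in range(I):
        for k, m in zip(words[i], parts[i]):
            n_mk[m, k] += 1
            n_im[i, m] += 1
    lam = (n_mk + beta) / (n_mk.sum(axis=1, keepdims=True) + K * beta)
    pi = (n_im + alpha) / (n_im.sum(axis=1, keepdims=True) + M * alpha)
    return lam, pi

# Hypothetical word indices and sampled part labels for two images.
words = [[0, 0, 1], [2, 3, 3]]
parts = [[0, 0, 0], [1, 1, 0]]
lam, pi = estimate_parameters(words, parts, M=2, K=4)
# lam[0] == [3/8, 2/8, 1/8, 2/8], pi[1] == [2/5, 3/5]
```

Averaging the estimates over several well-separated samples would reduce the noise from any single sample.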




Single author-topic model




Learning
1. Posterior over part labels



Likelihood same as before, prior becomes




2. Draw samples from posterior




3. Use samples to estimate parameters




Inference
Likelihood that words in this image are due to
category n




Compute posterior over categories




Constellation model




Learning
1. Posterior over part labels



Prior same as before, likelihood becomes




2. Draw samples from posterior




3. Use samples to estimate parameters



 Part and word probabilities as before
Inference
Likelihood that words in this image are due to
category n




Compute posterior over categories




Scene model




Video Google




Action recognition




Spatio-temporal bag of words model 91.8% classification



Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 

20 cv mil_models_for_words

  • 1. Computer vision: models, learning and inference
      Chapter 20: Models for visual words
      Please send errata to s.prince@cs.ucl.ac.uk
  • 2. Visual words
      • Most models treat data as continuous
      • Likelihood based on normal distribution
      • Visual words = discrete representation of image
      • Likelihood based on categorical distribution
      • Useful for difficult tasks such as scene recognition and object recognition
      Computer vision: models, learning and inference. ©2011 Simon J.D. Prince
  • 3. Motivation: scene recognition
  • 4. Structure
      • Computing visual words
      • Bag of words model
      • Latent Dirichlet allocation
      • Single author-topic model
      • Constellation model
      • Scene model
      • Applications
  • 5. Computing dictionary of visual words
      1. For every one of the I training images, select a set of J_i spatial locations.
          • Interest points
          • Regular grid
      2. Compute a descriptor at each spatial location in each image.
      3. Cluster all of these descriptor vectors into K groups using a method such as the K-means algorithm.
      4. The means of the K clusters are used as the K prototype vectors in the dictionary.
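The clustering in steps 3 and 4 of slide 5 can be sketched as follows. This is a minimal plain K-means written for illustration, not the author's implementation: the function names, the random initialisation, and the fixed iteration count are all assumptions, and descriptors are taken to be equal-length lists of numbers.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_index(vector, prototypes):
    """Index of the prototype closest to `vector`."""
    return min(range(len(prototypes)),
               key=lambda k: squared_distance(vector, prototypes[k]))

def build_dictionary(descriptors, num_words, num_iterations=20, seed=0):
    """Cluster descriptors with plain K-means and return the K cluster means."""
    rng = random.Random(seed)
    # Initialise the prototypes with K randomly chosen descriptors (an assumption).
    prototypes = [list(d) for d in rng.sample(descriptors, num_words)]
    for _ in range(num_iterations):
        # Assignment step: group descriptors by their closest prototype.
        clusters = [[] for _ in range(num_words)]
        for d in descriptors:
            clusters[nearest_index(d, prototypes)].append(d)
        # Update step: move each prototype to the mean of its cluster.
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous prototype
                dim = len(members[0])
                prototypes[k] = [sum(m[i] for m in members) / len(members)
                                 for i in range(dim)]
    return prototypes

# Usage: descriptors pooled from all training images, K = 2 words.
pooled = [[0, 0], [1, 1], [2, 2], [100, 100], [101, 101], [102, 102]]
dictionary = build_dictionary(pooled, num_words=2)
```

In practice the descriptors from all I images are pooled into one list before clustering, as step 3 describes.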
  • 6. Encoding images as visual words
      1. Select a set of J spatial locations in the image using the same method as for the dictionary.
      2. Compute the descriptor at each of the J spatial locations.
      3. Compare each descriptor to the set of K prototype descriptors in the dictionary.
      4. Assign a discrete index to this location that corresponds to the index of the closest word in the dictionary.
      End result: discrete feature index; x and y position
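Steps 3 and 4 of slide 6 amount to a nearest-prototype lookup. The sketch below assumes the descriptors have already been computed and that "closest" means smallest Euclidean distance; `encode_image` and its argument names are hypothetical.

```python
def encode_image(locations, descriptors, dictionary):
    """Return one (word_index, x, y) triple per spatial location."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    encoded = []
    for (x, y), descriptor in zip(locations, descriptors):
        # Index of the dictionary prototype closest to this descriptor.
        word = min(range(len(dictionary)),
                   key=lambda k: squared_distance(descriptor, dictionary[k]))
        encoded.append((word, x, y))
    return encoded

# Usage: two locations encoded against a two-word dictionary.
dictionary = [[0.0, 0.0], [10.0, 10.0]]
encoded = encode_image([(5, 7), (20, 3)], [[9, 11], [1, -1]], dictionary)
```

Each triple pairs the discrete feature index with the x and y position named in the slide's end result.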
  • 7. Structure
      • Computing visual words
      • Bag of words model
      • Latent Dirichlet allocation
      • Single author-topic model
      • Constellation model
      • Scene model
      • Applications
  • 8. Bag of words model
      Key idea:
      • Abandon all spatial information
      • Just represent image by relative frequency (histogram) of words from dictionary
  • 9. Bag of words
  • 10. Structure
      • Learning (MAP solution)
      • Inference
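The equations for slides 8 to 10 are not preserved in this transcript, so the following is only a sketch of how a bag of words classifier is commonly set up: one categorical distribution over words per class, learned by MAP under a symmetric Dirichlet prior, with Bayes' rule for inference. The function names, the prior parameter `alpha`, and the assumption `alpha > 1` (which keeps every probability nonzero) are illustrative choices rather than the slides' own notation.

```python
import math

def word_histogram(words, num_words):
    """Count how often each dictionary word occurs in one image."""
    counts = [0] * num_words
    for w in words:
        counts[w] += 1
    return counts

def learn_map(images, labels, num_classes, num_words, alpha=2.0):
    """MAP categorical parameters per class under a symmetric Dirichlet prior."""
    lam = []
    for n in range(num_classes):
        totals = [0] * num_words
        for words, label in zip(images, labels):
            if label == n:
                for k, c in enumerate(word_histogram(words, num_words)):
                    totals[k] += c
        denom = sum(totals) + num_words * (alpha - 1)
        lam.append([(t + alpha - 1) / denom for t in totals])
    return lam

def class_posterior(words, lam, prior):
    """Posterior over classes for one image, computed in the log domain."""
    counts = word_histogram(words, len(lam[0]))
    log_post = [math.log(prior[n])
                + sum(c * math.log(lam[n][k]) for k, c in enumerate(counts))
                for n in range(len(lam))]
    top = max(log_post)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in log_post]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Usage: two training images, one per class, three dictionary words.
lam = learn_map([[0, 0, 1], [2, 2, 1]], [0, 1], num_classes=2, num_words=3)
posterior = class_posterior([0, 0], lam, prior=[0.5, 0.5])
```

Working in the log domain avoids underflow when an image contains many words.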
  • 11. Bag of words for object recognition
  • 12. Problems with bag of words
  • 13. Structure
      • Computing visual words
      • Bag of words model
      • Latent Dirichlet allocation
      • Single author-topic model
      • Constellation model
      • Scene model
      • Applications
  • 14. Latent Dirichlet allocation
      • Describes relative frequency of visual words in a single image (no world term)
      • Words not generated independently (connected by hidden variable)
      • Analogy to text documents
          – Each image contains mixture of several topics (parts)
          – Each topic induces a distribution over words
  • 15. Latent Dirichlet allocation
  • 16. Latent Dirichlet allocation
      • Generative equations
      • Marginal distribution over features
      • Conjugate priors over parameters
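The generative equations of slide 16 are not captured in the transcript. As a hedged illustration, the standard LDA generative process can be sampled as below: each part has a categorical distribution over words drawn from a Dirichlet prior, each image has a categorical distribution over parts drawn from another Dirichlet prior, and every word is generated by first drawing a part. The names `alpha`, `beta`, and `sample_lda_corpus` are assumptions.

```python
import random

def sample_dirichlet(concentration, size, rng):
    """Draw from a symmetric Dirichlet by normalising gamma variates."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]

def sample_lda_corpus(num_images, words_per_image, num_parts, num_words,
                      alpha=1.0, beta=0.5, seed=0):
    """Sample part labels and visual words from the LDA generative process."""
    rng = random.Random(seed)
    # One word distribution per part, drawn from the Dirichlet prior.
    lam = [sample_dirichlet(beta, num_words, rng) for _ in range(num_parts)]
    corpus = []
    for _ in range(num_images):
        # Per-image mixture over parts, drawn from its own Dirichlet prior.
        pi = sample_dirichlet(alpha, num_parts, rng)
        parts = rng.choices(range(num_parts), weights=pi, k=words_per_image)
        words = [rng.choices(range(num_words), weights=lam[p], k=1)[0]
                 for p in parts]
        corpus.append((parts, words))
    return lam, corpus

# Usage: 3 images of 10 words each, 2 parts, 5 dictionary words.
lam, corpus = sample_lda_corpus(3, 10, num_parts=2, num_words=5)
```

Because the part labels share the per-image mixture, the words in one image are not independent, which is the point made on slide 14.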
  • 17. Latent Dirichlet allocation
  • 18. Learning LDA model
      • Part labels p are hidden variables
      • If we knew them, it would be easy to estimate the parameters
      • How about the EM algorithm? Unfortunately, parts within an image are not independent
  • 19. Latent Dirichlet allocation
  • 20. Learning
      Strategy:
      1. Write an expression for posterior distribution over part labels
      2. Draw samples from posterior using MCMC
      3. Use samples to estimate parameters
  • 21. 1. Posterior over part labels
      • Denominator intractable
      • Can compute two terms in numerator in closed form
  • 22. 2. Draw samples from posterior
      • Gibbs sampling: fix all part labels except one and sample from conditional distribution
      • This can be computed in closed form
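The Gibbs step on slide 22 can be sketched with the widely used collapsed Gibbs sampler for LDA, in which the conditional for one part label depends only on counts of the other labels. The slide's own closed-form expression is not preserved here, so treating it as this standard conditional is an assumption, as are the function name and the hyperparameters `alpha` and `beta`.

```python
import random

def gibbs_sample_parts(images, num_parts, num_words,
                       alpha=1.0, beta=0.5, num_sweeps=50, seed=0):
    """Resample every part label in turn, holding all the others fixed."""
    rng = random.Random(seed)
    labels = [[rng.randrange(num_parts) for _ in words] for words in images]
    image_part = [[0] * num_parts for _ in images]            # labels per image
    part_word = [[0] * num_words for _ in range(num_parts)]   # words per part
    part_total = [0] * num_parts
    for i, words in enumerate(images):
        for j, f in enumerate(words):
            m = labels[i][j]
            image_part[i][m] += 1
            part_word[m][f] += 1
            part_total[m] += 1
    for _ in range(num_sweeps):
        for i, words in enumerate(images):
            for j, f in enumerate(words):
                # Remove the current label from the counts.
                m = labels[i][j]
                image_part[i][m] -= 1
                part_word[m][f] -= 1
                part_total[m] -= 1
                # Conditional over this label given all the other labels.
                weights = [(image_part[i][q] + alpha)
                           * (part_word[q][f] + beta)
                           / (part_total[q] + num_words * beta)
                           for q in range(num_parts)]
                m = rng.choices(range(num_parts), weights=weights, k=1)[0]
                # Store the new label and restore the counts.
                labels[i][j] = m
                image_part[i][m] += 1
                part_word[m][f] += 1
                part_total[m] += 1
    return labels, image_part, part_word

# Usage: two small images encoded as lists of word indices.
images = [[0, 0, 1, 1], [2, 3, 3, 2]]
labels, image_part, part_word = gibbs_sample_parts(images, 2, 4)
```

Only the final sweep is returned here; a fuller treatment would discard early sweeps as burn-in and keep several spaced samples.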
  • 23. 3. Use samples to estimate parameters
      • Samples substitute in for real part labels in update equations
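Once sampled part labels stand in for the hidden ones, parameter estimation reduces to smoothed counting. The update equations themselves are missing from the transcript, so the sketch below uses Dirichlet-smoothed relative frequencies as an assumed form; `estimate_parameters`, `alpha`, and `beta` are hypothetical names.

```python
def estimate_parameters(images, labels, num_parts, num_words,
                        alpha=1.0, beta=0.5):
    """Estimate per-image part mixtures and per-part word distributions."""
    pi = []
    part_word = [[0] * num_words for _ in range(num_parts)]
    for words, parts in zip(images, labels):
        counts = [0] * num_parts
        for f, m in zip(words, parts):
            counts[m] += 1        # how often part m appears in this image
            part_word[m][f] += 1  # how often part m generated word f
        total = len(words) + num_parts * alpha
        pi.append([(c + alpha) / total for c in counts])
    lam = [[(c + beta) / (sum(row) + num_words * beta) for c in row]
           for row in part_word]
    return pi, lam

# Usage: one image of three words with known (sampled) part labels.
pi, lam = estimate_parameters([[0, 0, 1]], [[0, 0, 1]], 2, 2,
                              alpha=1.0, beta=1.0)
```

With several Gibbs samples available, the same counts can be averaged across samples before normalising.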
  • 24. Structure
      • Computing visual words
      • Bag of words model
      • Latent Dirichlet allocation
      • Single author-topic model
      • Constellation model
      • Scene model
      • Applications
  • 25. Single author-topic model
  • 26. Single author-topic model
  • 27. Learning
      1. Posterior over part labels
      • Likelihood same as before, prior becomes
  • 28. Learning
      2. Draw samples from posterior
      3. Use samples to estimate parameters
  • 29. Inference
      • Likelihood that words in this image are due to category n
      • Compute posterior over categories
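The two inference steps on slide 29 can be sketched as below. The sketch assumes that, given a category, each word's part label is drawn independently from that category's part distribution, so the hidden part can be summed out word by word before applying Bayes' rule. The names `pi` (one part distribution per category), `lam` (one word distribution per part), and `category_posterior` are illustrative, and all probabilities involved are assumed to be nonzero.

```python
import math

def category_posterior(words, pi, lam, prior):
    """Posterior over categories for one image, marginalising over parts."""
    log_post = []
    for n, prior_n in enumerate(prior):
        # Likelihood of each word under category n: sum over its hidden part.
        log_like = sum(
            math.log(sum(pi[n][m] * lam[m][f] for m in range(len(lam))))
            for f in words)
        log_post.append(math.log(prior_n) + log_like)
    top = max(log_post)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in log_post]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Usage: two categories, two parts, two dictionary words.
pi = [[1.0, 0.0], [0.0, 1.0]]
lam = [[0.9, 0.1], [0.1, 0.9]]
posterior = category_posterior([0], pi, lam, prior=[0.5, 0.5])
```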
  • 30. Structure
      • Computing visual words
      • Bag of words model
      • Latent Dirichlet allocation
      • Single author-topic model
      • Constellation model
      • Scene model
      • Applications
  • 31. Problems with bag of words
  • 32. Constellation model
  • 33. Constellation model
  • 34. Learning
      1. Posterior over part labels
      • Prior same as before, likelihood becomes
  • 35. Learning
      2. Draw samples from posterior
      3. Use samples to estimate parameters
      • Part and word probabilities as before
  • 36. Inference
      • Likelihood that words in this image are due to category n
      • Compute posterior over categories
  • 37. Learning
  • 38. Structure
      • Computing visual words
      • Bag of words model
      • Latent Dirichlet allocation
      • Single author-topic model
      • Constellation model
      • Scene model
      • Applications
  • 39. Problems with bag of words
  • 40. Scene model
  • 41. Scene model
  • 42. Structure
      • Computing visual words
      • Bag of words model
      • Latent Dirichlet allocation
      • Single author-topic model
      • Constellation model
      • Scene model
      • Applications
  • 43. Video Google
  • 44. Action recognition
      • Spatio-temporal bag of words model
      • 91.8% classification
  • 45. Action recognition