Maximum Entropy Model LING 572 Fei Xia 02/07-02/09/06
History
Outline
Reference papers
Modeling
The basic idea
Setting
Features in POS tagging (Ratnaparkhi, 1996) context (a.k.a. history) allowable classes
Maximum Entropy
Ex1: Coin-flip example (Klein & Manning 2003) [Figure: entropy H plotted against p1; constraint p1 = 0.3]
Coin-flip example (cont) [Figure: entropy H over p1 and p2; constraints p1 + p2 = 1.0 and p1 = 0.3]
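The coin-flip example can be checked numerically. The sketch below is a minimal illustration, not the slides' own code: it scans p1 on a grid under the single constraint p1 + p2 = 1 and then evaluates the entropy at the point forced by the extra constraint p1 = 0.3. The grid resolution and the use of base-2 logarithms are assumptions made for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Only constraint: p1 + p2 = 1. Scan p1 on a grid and keep the maximizer.
grid = [i / 100 for i in range(101)]
best_p1 = max(grid, key=lambda p1: entropy([p1, 1 - p1]))

# Adding the constraint p1 = 0.3 leaves exactly one feasible distribution.
constrained = [0.3, 0.7]

print("unconstrained maximizer:", best_p1, entropy([best_p1, 1 - best_p1]))
print("with p1 = 0.3:", constrained, entropy(constrained))
```

With only the normalization constraint the fair coin wins; each additional constraint can only shrink the feasible set, so the attainable entropy can only stay the same or drop.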
Ex2: An MT example (Berger et al., 1996) Possible translations for the word “in”: Constraint: Intuitive answer:
An MT example (cont) Constraints: Intuitive answer:
An MT example (cont) Constraints: Intuitive answer:  ??
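The values on these slides did not survive extraction. The sketch below assumes the numbers from the running example in Berger et al. (1996): five candidate translations of "in" (dans, en, à, au cours de, pendant), the constraint that the probabilities sum to 1, and then the additional constraint p(dans) + p(en) = 3/10. Under those assumptions the intuitive answer spreads each block's mass evenly; the helper name `uniform_within_blocks` is hypothetical.

```python
import math

WORDS = ["dans", "en", "à", "au cours de", "pendant"]

def entropy(dist):
    """Entropy (in nats) of a dict mapping outcome -> probability."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def uniform_within_blocks(blocks):
    """blocks: list of (words, total_mass); spread each block's mass evenly."""
    dist = {}
    for words, mass in blocks:
        for w in words:
            dist[w] = mass / len(words)
    return dist

# Constraint 1 only: probabilities sum to 1, so every translation gets 1/5.
p_a = uniform_within_blocks([(WORDS, 1.0)])

# Constraints 1 and 2: p(dans) + p(en) = 3/10, so 3/20 each and 7/30 for the rest.
p_b = uniform_within_blocks([(["dans", "en"], 0.3),
                             (["à", "au cours de", "pendant"], 0.7)])

# Another feasible split of the same 3/10 has lower entropy.
p_skewed = dict(p_b, dans=0.2, en=0.1)
print(entropy(p_b), entropy(p_skewed))
```

Once a third constraint overlaps the second (in the paper, p(dans) + p(à) = 1/2), the blocks are no longer disjoint and this shortcut stops working, which is why the slide's intuitive answer is "??" and a numerical procedure is needed.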
Ex3: POS tagging (Klein and Manning, 2003)
Ex3 (cont)
Ex4: overlapping features (Klein and Manning, 2003)
Modeling the problem
Features
Some notations: finite training sample of events S; observed probability of x in S; the model p's probability of x; the jth feature f_j; model expectation of f_j; observed expectation of f_j (empirical count of f_j)
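To make the notation concrete, here is a minimal sketch that computes the observed probability of each event in a sample and the observed and model expectations of one binary feature. The toy sample, the feature `f_the_dt`, and the uniform stand-in model are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy sample S of events x = (word, tag).
S = [("the", "DT"), ("dog", "NN"), ("the", "DT"),
     ("runs", "VBZ"), ("dog", "NN"), ("the", "DT")]

def observed_prob(sample):
    """Relative frequency of each distinct event in the sample."""
    counts = Counter(sample)
    return {x: c / len(sample) for x, c in counts.items()}

def expectation(prob, feature):
    """Expectation of a feature under a distribution over events."""
    return sum(p * feature(x) for x, p in prob.items())

def f_the_dt(x):
    """Binary feature: fires when the word is 'the' and the tag is DT."""
    word, tag = x
    return 1 if word == "the" and tag == "DT" else 0

p_tilde = observed_prob(S)
observed = expectation(p_tilde, f_the_dt)

# A stand-in model: uniform over the distinct events seen in S.
uniform_model = {x: 1 / len(p_tilde) for x in p_tilde}
model_expectation = expectation(uniform_model, f_the_dt)

print(observed, model_expectation)
```

The gap between the two expectations is what the constraints are meant to close: a model is admissible only if its expectation of each feature matches the observed one.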
Constraints
Training data → observed events
Restating the problem The task: find p* s.t.   where Objective function:  -H(p) Constraints:  Add a feature
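The formulas on this slide were lost in extraction. A standard way to write the problem, following the formulation in Ratnaparkhi (1997) and assuming k features f_1, ..., f_k, is sketched below; the slide's exact notation may have differed.

```latex
\begin{aligned}
p^{*} &= \arg\max_{p \in P} H(p), \\
H(p) &= -\sum_{x} p(x) \log p(x), \\
P &= \{\, p \mid E_{p} f_j = E_{\tilde{p}} f_j,\ j = 1, \dots, k \,\}.
\end{aligned}
```

Maximizing H(p) is the same as minimizing the objective -H(p), which is the form used when the constraints are handled with multipliers.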
Questions
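What is the form of p*? (Ratnaparkhi, 1997) Theorem: if $p^* \in P \cap Q$, where P is the set of models that satisfy the constraints and Q is the set of models of exponential (log-linear) form, then $p^* = \arg\max_{p \in P} H(p)$. Furthermore, p* is unique.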
Using Lagrangian multipliers Minimize A(p):
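The slide's expression for A(p) did not survive extraction. One standard way to set it up, assuming multipliers λ_j for the k feature constraints and μ for normalization (these symbol names are assumptions), is:

```latex
\begin{aligned}
A(p) &= -H(p) - \sum_{j=1}^{k} \lambda_j \bigl(E_{p} f_j - E_{\tilde{p}} f_j\bigr)
        - \mu \Bigl(\sum_{x} p(x) - 1\Bigr), \\
\frac{\partial A}{\partial p(x)} &= \log p(x) + 1 - \sum_{j=1}^{k} \lambda_j f_j(x) - \mu = 0, \\
p(x) &= \exp\Bigl(\sum_{j=1}^{k} \lambda_j f_j(x) + \mu - 1\Bigr)
      = \frac{1}{Z} \exp\Bigl(\sum_{j=1}^{k} \lambda_j f_j(x)\Bigr), \\
Z &= \sum_{x} \exp\Bigl(\sum_{j=1}^{k} \lambda_j f_j(x)\Bigr).
\end{aligned}
```

Setting the derivative to zero is what produces the exponential (log-linear) form; the normalization constraint fixes Z.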
Two equivalent forms
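Relation to Maximum Likelihood The log-likelihood of the empirical distribution $\tilde{p}$ as predicted by a model q is defined as $L(q) = \sum_x \tilde{p}(x) \log q(x)$. Theorem: if $p^* \in P \cap Q$ then $p^* = \arg\max_{q \in Q} L(q)$. Furthermore, p* is unique.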
Summary (so far) Goal: find p* in P, which maximizes H(p). It can be proved that when p* exists it is unique. The model p* in P with maximum entropy is the model in Q that maximizes the likelihood of the training sample.
Summary (cont)
Parameter estimation
Algorithms
GIS: setup Let $C = \max_x \sum_{j=1}^{k} f_j(x)$. Add a new feature $f_{k+1}$: $f_{k+1}(x) = C - \sum_{j=1}^{k} f_j(x)$
GIS algorithm
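The steps of the algorithm were lost from this slide. The following is a minimal sketch of Generalized Iterative Scaling as it is usually described, over a small, fully enumerable event space: add the correction feature so that feature sums equal a constant C, then repeatedly update each weight by (1/C) log(observed expectation / model expectation). The function name `gis`, the toy empirical distribution, and the two features are hypothetical; real taggers cannot enumerate all events this way.

```python
import math

def gis(events, p_tilde, features, iterations=500):
    """Fit p(x) proportional to exp(sum_j lambda_j f_j(x)) by GIS.

    events: list of all outcomes x; p_tilde: dict x -> observed probability;
    features: list of functions x -> non-negative number.
    """
    # Setup: C bounds the feature sum; the correction feature makes it constant.
    C = max(sum(f(x) for f in features) for x in events)
    feats = list(features) + [lambda x: C - sum(f(x) for f in features)]
    observed = [sum(p_tilde.get(x, 0.0) * f(x) for x in events) for f in feats]
    lambdas = [0.0] * len(feats)

    def model():
        scores = [math.exp(sum(l * f(x) for l, f in zip(lambdas, feats)))
                  for x in events]
        z = sum(scores)
        return [s / z for s in scores]

    for _ in range(iterations):
        p = model()
        expected = [sum(pi * f(x) for pi, x in zip(p, events)) for f in feats]
        for j in range(len(feats)):
            # Features with a zero expectation are skipped in this sketch.
            if observed[j] > 0 and expected[j] > 0:
                lambdas[j] += math.log(observed[j] / expected[j]) / C
    return dict(zip(events, model()))

# Hypothetical data: empirical expectations are 0.3 for f1 and 0.5 for f2.
WORDS = ["dans", "en", "à", "au cours de", "pendant"]
p_tilde = {"dans": 0.2, "en": 0.1, "à": 0.3, "au cours de": 0.2, "pendant": 0.2}

def f1(x):
    return 1 if x in ("dans", "en") else 0

def f2(x):
    return 1 if x in ("dans", "à") else 0

p = gis(WORDS, p_tilde, [f1, f2])
print(p)
```

The fitted model matches the two feature expectations without copying the empirical distribution: outcomes that no feature distinguishes (here "au cours de" and "pendant") receive equal probability.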
Approximation for calculating feature expectation
Properties of GIS
IIS algorithm
Calculating $\delta_j$: if $\sum_j f_j(x)$ is constant for all x, then GIS is the same as IIS; otherwise $\delta_j$ must be calculated numerically.
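As a rough sketch of the numerical case, the IIS update for one feature is commonly written as the solution δ of sum_x p(x) f_j(x) exp(δ f#(x)) = observed expectation of f_j, where f#(x) is the sum of all feature values at x. The code below solves that equation with Newton's method; the function name `iis_delta`, the symbol f#, and the toy numbers are assumptions, and the slide's own notation was lost.

```python
import math

def iis_delta(p, events, f_j, f_sharp, target, steps=50):
    """Solve sum_x p(x) f_j(x) exp(delta * f_sharp(x)) = target by Newton's method."""
    delta = 0.0
    for _ in range(steps):
        g = sum(p[x] * f_j(x) * math.exp(delta * f_sharp(x)) for x in events) - target
        dg = sum(p[x] * f_j(x) * f_sharp(x) * math.exp(delta * f_sharp(x))
                 for x in events)
        if dg == 0 or abs(g) < 1e-12:
            break
        delta -= g / dg
    return delta

# Hypothetical toy model and feature.
events = ["a", "b", "c"]
p = {"a": 0.5, "b": 0.3, "c": 0.2}

def f_j(x):
    return 1 if x == "a" else 0

# Constant feature sum (f_sharp = 2 everywhere): Newton should agree with
# the closed-form GIS step (1/C) * log(target / E_p[f_j]).
delta = iis_delta(p, events, f_j, lambda x: 2, target=0.4)
gis_step = math.log(0.4 / 0.5) / 2
print(delta, gis_step)
```

When the feature sum varies across events there is no closed form, and the same Newton loop is simply run with the real f#.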
Feature selection
Feature selection
Notation The gain in the log-likelihood of the training data: After adding a feature: With the feature set S:
Feature selection algorithm (Berger et al., 1996) → Problem: too expensive
Approximating gains (Berger et al., 1996)
Training a MaxEnt Model
Case study
POS tagging (Ratnaparkhi, 1996)
Using a MaxEnt Model
Modeling
Training step 1: define feature templates (table columns: history h_i, tag t_i)
Step 2: Create feature set
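Steps 1 and 2 can be sketched as follows: instantiate a handful of templates at every position of a tagged corpus, count the resulting (history predicate, tag) pairs, and keep those that occur often enough. The templates here are loosely modeled on the word and previous-tag contexts used in Ratnaparkhi (1996) and are not the full set; the padding symbols `BOS`/`EOS`, the function names, and the cutoff value are hypothetical.

```python
from collections import Counter

def instantiate(words, tags, i):
    """Return (contextual predicate, tag) pairs for position i."""
    prev_tag = tags[i - 1] if i >= 1 else "BOS"
    prev2_tag = tags[i - 2] if i >= 2 else "BOS"
    prev_word = words[i - 1] if i >= 1 else "BOS"
    next_word = words[i + 1] if i + 1 < len(words) else "EOS"
    predicates = [
        f"w_i={words[i]}",
        f"w_i-1={prev_word}",
        f"w_i+1={next_word}",
        f"t_i-1={prev_tag}",
        f"t_i-2,t_i-1={prev2_tag},{prev_tag}",
    ]
    return [(pred, tags[i]) for pred in predicates]

def build_feature_set(tagged_sentences, cutoff):
    """Keep features whose count in the training data reaches the cutoff."""
    counts = Counter()
    for sentence in tagged_sentences:
        words = [w for w, _ in sentence]
        tags = [t for _, t in sentence]
        for i in range(len(words)):
            counts.update(instantiate(words, tags, i))
    return {feat for feat, c in counts.items() if c >= cutoff}

corpus = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]
feature_set = build_feature_set(corpus, cutoff=2)
print(sorted(feature_set))
```

A count cutoff discards features seen too rarely to estimate reliably; each surviving pair becomes one binary feature f_j(h, t) with its own weight.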
Step 3: determine the feature weights
Decoding: Beam search
Beam search
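A minimal sketch of beam-search decoding for a tagger, assuming the trained model is exposed as a function `prob(words, i, history_tags, tag)` that returns p(tag | history). That interface, the beam size, and the toy scoring table are hypothetical stand-ins, and the toy scores are not normalized probabilities.

```python
import math

def beam_search(words, tagset, prob, beam_size=3):
    """Return the highest-scoring tag sequence found with a fixed-width beam."""
    beam = [(0.0, [])]  # (log-probability, tags chosen so far)
    for i in range(len(words)):
        candidates = []
        for logp, tags in beam:
            for tag in tagset:
                p = prob(words, i, tags, tag)
                if p > 0:
                    candidates.append((logp + math.log(p), tags + [tag]))
        # Keep only the best beam_size partial sequences.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]
    return beam[0][1] if beam else []

def toy_prob(words, i, history, tag):
    """Toy stand-in for the model: a lexical table plus a repeat-tag penalty."""
    table = {
        "the": {"DT": 0.9, "NN": 0.1},
        "dog": {"NN": 0.8, "VB": 0.2},
        "barks": {"VB": 0.7, "NN": 0.3},
    }
    p = table[words[i]].get(tag, 0.0)
    if history and history[-1] == tag:
        p *= 0.5
    return p

best = beam_search(["the", "dog", "barks"], ["DT", "NN", "VB"], toy_prob)
print(best)
```

Unlike Viterbi search, the beam keeps only a fixed number of partial sequences per position, so it is not guaranteed to find the best sequence, but it lets features look at arbitrarily long tag histories.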
Viterbi search
Decoding (cont)
Experiment results
Comparison with other learners
MaxEnt Summary
Additional slides
Ex4 (cont) ??
