Maximizing the Representation Gap
between In-domain & OOD Examples
Jay Nandy, Wynne Hsu, Mong Li Lee
National University of Singapore
{jaynandy,whsu,leeml}@comp.nus.edu.sg
ICML workshop on Uncertainty & Robustness in Deep Learning, 2020
Predictive Uncertainty of DNNs
• Data or aleatoric uncertainty:
• Arises from the natural complexities of the underlying distribution, such as class overlap, label noise, and homoscedastic or heteroscedastic noise
• Distributional uncertainty:
• Distributional mismatch between the training and test examples during inference
• Model or epistemic uncertainty:
• Uncertainty in estimating the network parameters, given the training data
• Reducible given enough training data
[Figure: an in-domain example with data (aleatoric) uncertainty, and an out-of-distribution (OOD) example that leads to distributional uncertainty]
[Gal, 2016; Candela et al., 2009]
Contributions
• Motivation:
• In the presence of high data uncertainty among multiple classes, existing OOD detectors, including DPN (Malinin & Gales, 2018), tend to produce similar representations for in-domain and OOD examples.
• This compromises OOD detection performance.
• Proposed solution:
• Maximize the representation gap between in-domain and OOD examples
• A different representation for the distributional uncertainty of OOD examples
• A novel loss function for the DPN framework
• Experimental results:
• The proposed approach consistently outperforms existing OOD detectors by addressing this issue.
Existing Approaches: Non-Bayesian
• Representation of predictive uncertainty:
• Sharp categorical posterior for in-domain examples
• Flat categorical posterior for out-of-domain (OOD) examples
• Limitations:
• Cannot robustly determine the source of uncertainty
• In particular, high data uncertainty among multiple classes leads to the same representation for both in-domain and OOD examples.
[Figure panels: In-Domain Confident Prediction; In-Domain Misclassification; Out-of-Domain (OOD) Examples]
[Hendrycks et al., 2019b, Lee et al., 2018]
Existing Approaches: Bayesian
• Bayesian neural networks assume a prior distribution over the network parameters
• Approximate inference requires estimating the true posterior over the model parameters
• Sample model parameters using MCMC, Deep Ensembles, etc.
• Limitations:
• Computationally expensive to produce the ensemble
• Difficult to control the desired behavior
In-Domain Confident Prediction:
• Ensemble of predictions in one corner of the simplex.
In-Domain Misclassification:
• Ensemble of predictions in the middle of the simplex.
OOD Examples:
• Ensemble of predictions scattered over the simplex.
[Gal and Ghahramani, 2016; Lakshminarayanan et al., 2017]
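The three ensemble behaviours listed on this slide can be made concrete with a small numerical sketch. The code below is illustrative only: the member predictions are made-up numbers for a 3-class problem, and the function name `ensemble_uncertainty` is hypothetical. It computes the entropy of the averaged prediction, the average entropy of the members, and their difference (the mutual information), which is the decomposition the deck returns to under "Uncertainty Measures".

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def ensemble_uncertainty(member_probs):
    """Return (total, expected_data, mutual_information) for an ensemble.

    member_probs: list of categorical predictions, one per ensemble member.
    """
    n = len(member_probs)
    k = len(member_probs[0])
    mean = [sum(p[c] for p in member_probs) / n for c in range(k)]
    total = entropy(mean)                                       # entropy of the averaged prediction
    expected_data = sum(entropy(p) for p in member_probs) / n   # average member entropy
    return total, expected_data, total - expected_data

# Made-up member predictions for the three cases on the slide.
confident = [[0.98, 0.01, 0.01]] * 3                            # one corner of the simplex
misclassified = [[0.34, 0.33, 0.33]] * 3                        # middle of the simplex
ood = [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05], [0.05, 0.05, 0.90]]  # scattered

for name, preds in [("confident", confident), ("misclassified", misclassified), ("ood", ood)]:
    total, data, mi = ensemble_uncertainty(preds)
    print(f"{name}: total={total:.3f} expected_data={data:.3f} MI={mi:.3f}")
```

Only the scattered ensemble yields a large mutual information; the other two cases differ in total entropy but have members that agree with each other.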
Dirichlet Prior Network (Existing)
• Parameterizes a prior Dirichlet distribution over the categorical posteriors on a simplex
• Objective: efficiently emulate the behavior of Bayesian (ensemble) approaches
• Confident prediction (in-domain examples): sharp Dirichlet in one corner → uni-modal categorical
• Misclassification (in-domain examples): sharp Dirichlet in the middle → multi-modal categorical
• OOD examples: flat Dirichlet → uniform categorical over all class labels
[Malinin & Gales, 2018; 2019]
Proposed Representation for OOD
• Limitation (high data uncertainty)
• In-domain examples with high data uncertainty among multiple classes lead to a flatter Dirichlet distribution
• Can be observed for classification tasks with a large number of classes
• This often leads to representations that are indistinguishable from those of OOD examples
• This compromises OOD detection performance
[Figure: desired vs. actual Dirichlet representations for confident predictions (in-domain), misclassifications (in-domain), and OOD examples]
[see detailed analysis in our paper]
Proposed Representation for OOD
• Maximize the representation gap between OOD examples and in-domain examples
• For OOD examples, use a sharp multi-modal Dirichlet with densities uniformly distributed at each corner, instead of a flat Dirichlet
[Figure: desired, actual, existing, and proposed Dirichlet representations for confident predictions (in-domain), misclassifications (in-domain), and OOD examples]
Proposed Loss function
We propose a novel loss function to separately model the mean and precision of the output Dirichlet distribution:
• Mean: cross-entropy loss with soft-max activation
• Precision: a novel explicit precision regularization function
This provides better control over the desired representation.
We show that the existing reverse KL-divergence (RKL) loss cannot produce this representation
[see more detailed analysis in our paper]
Proposed Loss function
• A neural network with soft-max activation can be viewed as a DPN.
• The concentration parameters of the Dirichlet are given by the exponential of the logits
The categorical posterior is given by the mean of the output Dirichlet, i.e. the soft-max of the logits.
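The relation between logits, concentration parameters, and the categorical posterior can be written out explicitly. The notation here is an assumption, since the slide's formula did not survive extraction: $z_c(x)$ denotes the logit for class $c$, $\alpha_c$ the concentration parameters, $\alpha_0$ the precision, $\mu$ a point on the simplex, and $K$ the number of classes.

```latex
\alpha_c = \exp\big(z_c(x)\big), \qquad
\alpha_0 = \sum_{k=1}^{K} \alpha_k, \qquad
p(\omega_c \mid x)
  = \mathbb{E}_{\mathrm{Dir}(\mu \mid \alpha)}\left[\mu_c\right]
  = \frac{\alpha_c}{\alpha_0}
  = \frac{\exp\big(z_c(x)\big)}{\sum_{k=1}^{K} \exp\big(z_k(x)\big)}
```

The last equality is the soft-max, which is why a standard soft-max classifier can be read as a DPN.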
Dirichlet distributions with different
concentration parameter values
Sharp uni-modal Dirichlet:
• Large precision value
• Large concentration value for the correct class.
Flat Dirichlet distribution:
• Small precision values.
• Equal concentration values > 1
Sharp multi-modal Dirichlet, uniform at all corners:
• Small precision value.
• Equal concentration values < 1
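As a rough illustration of the three shapes above, the sketch below draws samples from Dirichlet distributions with hand-picked concentration values and reports how close a typical sample sits to a corner of the simplex. The concentration values, the helper names, and the "mean max component" statistic are all choices made for this example, not taken from the slides.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one point on the simplex from Dir(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def mean_max_component(alpha, n_samples=2000, seed=0):
    """Average of the largest coordinate over samples; values near 1 mean 'near a corner'."""
    rng = random.Random(seed)
    return sum(max(sample_dirichlet(alpha, rng)) for _ in range(n_samples)) / n_samples

# Illustrative concentration values for a 3-class problem (assumptions).
settings = {
    "sharp uni-modal (one large value)": [50.0, 1.0, 1.0],
    "near-flat (equal values above 1)": [1.5, 1.5, 1.5],
    "sharp multi-modal (equal values below 1)": [0.2, 0.2, 0.2],
}

for name, alpha in settings.items():
    print(f"{name}: precision={sum(alpha):.1f}, "
          f"mean max component={mean_max_component(alpha):.3f}")
```

With equal concentration values below 1, individual samples land near the corners even though the mean of the distribution is still uniform, which is the property the proposed OOD representation relies on.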
Proposed Loss function
In-Domain Examples
• Objective: model the mean position (standard cross-entropy loss) + model the precision values (proposed regularizer, a bounded approximation of the precision)
• Maximum concentration value for the correct class
Proposed Loss function
OOD Examples
• Objective: model the mean position (standard cross-entropy loss) + model the precision values (proposed regularizer)
• Standard CE loss w.r.t. the uniform distribution → equal probability for all classes
• Proposed representation for OOD
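The two-part objective on these slides can be sketched in a few lines. The slide text does not spell out the regularizer, so the code below makes explicit assumptions: the "bounded approximation of the precision" is taken to be the mean sigmoid of the logits, and the weights `lam_in` and `lam_out` (positive for in-domain, negative for OOD) are hypothetical values chosen for illustration. Treat this as one plausible reading rather than the authors' exact formulation.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def precision_surrogate(logits):
    """ASSUMED bounded stand-in for the precision: mean sigmoid of the logits, in (0, 1)."""
    return sum(sigmoid(z) for z in logits) / len(logits)

def in_domain_loss(logits, label, lam_in=0.5):
    """Cross-entropy on the true label, minus a reward for a large precision surrogate."""
    cross_entropy = -log_softmax(logits)[label]
    return cross_entropy - lam_in * precision_surrogate(logits)

def ood_loss(logits, lam_out=-0.5):
    """Cross-entropy w.r.t. the uniform distribution; a negative weight penalizes precision."""
    log_probs = log_softmax(logits)
    ce_uniform = -sum(log_probs) / len(logits)
    return ce_uniform - lam_out * precision_surrogate(logits)

# In-domain: a large logit for the correct class lowers the loss.
print(in_domain_loss([8.0, 0.0, 0.0], label=0), in_domain_loss([2.0, 0.0, 0.0], label=0))
# OOD: equal logits give a uniform mean either way; negative logits
# (concentrations exp(z) < 1) are preferred over positive ones.
print(ood_loss([-4.0, -4.0, -4.0]), ood_loss([4.0, 4.0, 4.0]))
```

Under these assumptions the cross-entropy term fixes the mean of the Dirichlet while the sign of the regularizer weight decides whether the concentrations are pushed up (in-domain) or below 1 (OOD).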
Uncertainty Measures
Total Uncertainty Measure
• High maxP score: confident prediction
• Low maxP score: in-domain misclassification or OOD?
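The ambiguity of a low maxP score is easy to reproduce numerically. In the sketch below the logits are made-up values for a 5-class problem, and `max_prob` is a hypothetical helper name. An in-domain input whose classes overlap heavily and an OOD input both end up with a maximum soft-max probability close to 1/K.

```python
import math

def softmax(logits):
    """Numerically stable soft-max."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def max_prob(logits):
    """maxP score: the largest soft-max probability."""
    return max(softmax(logits))

# Made-up logits for a 5-class problem.
confident_logits = [6.0, 0.0, 0.0, 0.0, 0.0]      # in-domain, confident
overlap_logits = [2.0, 2.0, 1.9, 2.0, 1.9]        # in-domain, heavy class overlap
ood_logits = [0.0, 0.0, 0.0, 0.0, 0.0]            # OOD, flat prediction

for name, logits in [("confident", confident_logits),
                     ("in-domain overlap", overlap_logits),
                     ("ood", ood_logits)]:
    print(f"{name}: maxP={max_prob(logits):.3f}")
```

A threshold on maxP separates the confident case from the other two, but it cannot tell the overlapping in-domain input from the OOD one.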
Distributional Uncertainty Measure
                Confident Pred.
First Term      Low
Second Term     Low
MI (Overall)    Low (~0)
Distributional Uncertainty Measure
Given prob. mass is
concentrated
High
Low
High
Distributional uncertainty measure (the proposed representation maximizes the gap):
                In-Domain Example                      OOD Example
                Confident Pred.   Misclassification    (Malinin & Gales)   Proposed
First Term      Low               High                 High                High
Second Term     Low               High                 Average             Low
MI (Overall)    Low (~0)          Low (~0)             Average             High
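For a Dirichlet output, the two terms and their difference have closed forms, so the qualitative Low/Average/High pattern can be checked numerically. The sketch below is a minimal illustration with hand-picked concentration values for a 3-class problem. It takes the first term to be the entropy of the expected categorical and the second term to be the expected entropy under the Dirichlet, and it uses a small hand-rolled digamma approximation (recurrence plus asymptotic series) to stay dependency-free. The function names and example concentrations are assumptions.

```python
import math

def digamma(x):
    """Digamma for x > 0: shift x upward by recurrence, then use an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 * inv - series

def dirichlet_uncertainty(alpha):
    """Return (first_term, second_term, mutual_information) for Dir(alpha)."""
    a0 = sum(alpha)
    mean = [a / a0 for a in alpha]
    first = -sum(m * math.log(m) for m in mean)            # entropy of the expected categorical
    second = -sum(m * (digamma(a + 1.0) - digamma(a0 + 1.0))
                  for m, a in zip(mean, alpha))            # expected entropy under the Dirichlet
    return first, second, first - second

# Hand-picked concentration values (illustrative assumptions).
cases = {
    "confident (sharp, one corner)": [100.0, 1.0, 1.0],
    "misclassification (sharp, middle)": [100.0, 100.0, 100.0],
    "OOD, near-flat": [1.5, 1.5, 1.5],
    "OOD, sharp multi-modal": [0.1, 0.1, 0.1],
}

for name, alpha in cases.items():
    first, second, mi = dirichlet_uncertainty(alpha)
    print(f"{name}: first={first:.3f} second={second:.3f} MI={mi:.3f}")
```

With these values the mutual information is close to zero for both in-domain cases, moderate for the near-flat Dirichlet, and largest for the sharp multi-modal Dirichlet.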
Synthetic Dataset
[Figure panels: In-Domain Training Data]
Larger uncertainty scores for both in-domain examples with class overlap (i.e., data uncertainty) and OOD examples.
Precision as distributional uncertainty measure:
• High scores for in-domain examples
• Low scores for OOD examples
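If the concentration parameters are the exponentials of the logits, as stated earlier in the deck, the precision score can be read directly off the logits. The sketch below computes its logarithm with a log-sum-exp for numerical stability. The example logits and the function name `log_precision` are hypothetical.

```python
import math

def log_precision(logits):
    """log(alpha_0) = log(sum_c exp(z_c)), computed stably via log-sum-exp."""
    m = max(logits)
    return m + math.log(sum(math.exp(z - m) for z in logits))

# Made-up logits: a confident in-domain input, and an OOD input whose
# logits are all negative (concentrations below 1).
in_domain_logits = [9.0, 1.0, 0.5]
ood_logits = [-3.0, -3.0, -3.0]

print("in-domain log-precision:", round(log_precision(in_domain_logits), 3))
print("OOD log-precision:", round(log_precision(ood_logits), 3))
```

A higher score suggests an in-domain input and a lower score an OOD input; any decision threshold would have to be chosen on held-out data.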
Benchmark Vision Datasets
Conclusion
• We show that, in the presence of high data uncertainty, existing OOD detection models, including DPN, tend to produce similar representations for in-domain and OOD examples, which compromises OOD detection performance
• We propose to model distributional uncertainty using a multi-modal Dirichlet distribution in the DPN framework (Malinin & Gales, 2018) to maximize the representation gap between in-domain and OOD examples
• Experimental results demonstrate that our proposed technique consistently outperforms other OOD detection models by addressing this issue
Thank You

Maximizing the Representation Gap between In-domain & OOD examples

  • 1. Maximizing the Representation Gap between In-domain & OOD Examples Jay Nandy Wynne Hsu Mong Li Lee National University of Singapore {jaynandy,whsu,leeml}@comp.nus.edu.sg ICML workshop on Uncertainty & Robustness in Deep Learning, 2020
  • 2. Predictive Uncertainty of DNNs  Data or Aleatoric uncertainty:  Arises from the natural complexities of the underlying distribution, such as class overlap, label noise, homoscedastic and heteroscedastic noise  Distributional Uncertainty:  Distributional mismatch between the training and test examples during inference  Model or Epistemic uncertainty  Uncertainty to estimating the network parameters, given training data  Reducible given enough training data In-domain example with Data or Aleatoric uncertainty Out-of-distribution (OOD) example, that leads to distributional uncertainty [Gal, 2016; Candela et al., 2009]
  • 3. Contributions  Motivation:  In presence of high data uncertainty among multiple classes, the existing OOD detectors, including DPN (Malinin & Gales, 2018), tend to produce similar representation for both in-domain and OOD examples.  Leads to compromise the performance for OOD detection  Proposed solution:  Maximize the representation gap between in-domain and OOD examples  A different representation for distributional uncertainty of OOD examples  Propose a novel loss function for DPN framework  Experimental Results:  Consistently outperforms existing OOD detectors by addressing this issue.
  • 4. Existing Approaches: Non-Bayesian • Representation of predictive uncertainty: • Sharp categorical posterior for in-domain examples • Flat categorical posterior for out-of-domain (OOD) examples • Limitations: • Cannot robustly determine the source of uncertainty • In particular, high data uncertainty among multiple class leads to the same representation for both in-domain and OOD examples. In-Domain Misclassification Out-of-Domain (OOD) Examples In-Domain Confident Prediction [Hendrycks et al., 2019b, Lee et al., 2018]
  • 5. Existing Approaches: Bayesian • Bayesian neural networks assumes a prior distribution over the network parameters • Approximation requires to estimate the true posterior of the model parameters • Sample model parameters using MCMC or Deep Ensemble etc. • Limitations: • Computationally expensive to produce the ensemble • Difficult to control this desired behavior In-Domain Confident pred. • Ensemble of prediction in one corner of the simplex. In-Domain Misclassification: • Ensemble of prediction in the middle of the simplex. OOD Examples: • Ensemble of prediction are scattered over the simplex. . . . . . . . . . . . . . . . . . . [Gal and Ghahramani, 2016; Lakshminarayanan et al., 2017]
  • 6. Dirichlet Prior Network (Existing) • Parameterize a prior Dirichlet distribution over the categorical posteriors on a simplex • Objective: efficiently emulate the behavior of Bayesian (ensemble) approaches • Confident prediction (in-domain examples): sharp Dirichlet in one corner → uni-modal categorical. • Misclassification (in-domain examples): sharp Dirichlet in the middle → multi-modal categorical. • OOD examples: flat Dirichlet → uniform categorical over all class labels. [Malinin & Gales, 2018; 2019]
  • 7. Proposed Representation for OOD • Limitation (high data uncertainty): • In-domain examples with high data uncertainty among multiple classes lead to a flatter Dirichlet distribution • This can be observed for classification tasks with a large number of classes • The resulting representation is often indistinguishable from that of OOD examples, which compromises OOD detection performance • Figure panels (desired vs. actual): confident prediction (in-domain examples); misclassification (in-domain examples); OOD examples [see detailed analysis in our paper]
  • 8. Proposed Representation for OOD • Maximize the representation gap of OOD examples from in-domain examples • For OOD examples, use a sharp multi-modal Dirichlet with densities uniformly distributed at each corner, instead of a flat Dirichlet • Figure panels (desired vs. actual, existing vs. proposed): confident prediction (in-domain examples); misclassification (in-domain examples); OOD examples
  • 9. Proposed Loss function • We propose a novel loss function that separately models the mean and the precision of the output Dirichlet distribution: • Mean: cross-entropy loss with soft-max activation • Precision: a novel explicit precision regularization function • This provides better control over the desired representation. • We show that the existing RKL loss cannot produce this representation [see more detailed analysis in our paper] • Figure panels (desired vs. actual, existing vs. proposed): confident prediction (in-domain examples); misclassification (in-domain examples); OOD examples
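  • 10. Proposed Loss function • A neural network with soft-max activation can be viewed as a DPN. • The concentration parameters of the Dirichlet are given by the exponential of the logits. • The categorical posterior is given by the mean of the output Dirichlet: $p(y = c \mid x) = \alpha_c / \alpha_0 = \exp(z_c) / \sum_k \exp(z_k)$, where $z_c$ is the logit of class $c$, $\alpha_c = \exp(z_c)$, and $\alpha_0 = \sum_k \alpha_k$ is the precision.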
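The correspondence between a soft-max network and a DPN can be checked numerically. The sketch below is a minimal illustration in plain Python, not the authors' code: the logit values and function names are hypothetical. It sets the concentration parameters to the exponential of the logits and compares the Dirichlet mean with the soft-max output.

```python
import math

def concentrations(logits):
    """Concentration parameters alpha_c = exp(z_c), as stated on the slide."""
    return [math.exp(z) for z in logits]

def dirichlet_mean(alphas):
    """Mean of a Dirichlet: alpha_c / alpha_0, where alpha_0 is the precision."""
    alpha_0 = sum(alphas)
    return [a / alpha_0 for a in alphas]

def softmax(logits):
    """Numerically stable soft-max over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a 3-class problem (illustrative values only).
logits = [2.0, 0.5, -1.0]
alphas = concentrations(logits)
precision = sum(alphas)        # alpha_0, the sum of the concentrations
mean = dirichlet_mean(alphas)  # categorical posterior under the DPN view
probs = softmax(logits)        # ordinary soft-max output

print("precision:", precision)
print("Dirichlet mean:", mean)
print("soft-max:      ", probs)
```

The two printed vectors agree up to floating-point error, while the precision carries extra information (the overall scale of the logits) that the soft-max output discards.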
  • 11. Dirichlet distributions with different concentration parameter values Sharp uni-modal Dirichlet: • Large precision value • Large concentration value for the correct class. Flat Dirichlet distribution: • Small precision values. • Equal concentration values > 1 Sharp multi-modal Dirichlet, uniform at all corners: • Small precision value. • Equal concentration values < 1
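The effect of the concentration parameters can be seen by evaluating the Dirichlet density at the center of the simplex and near one corner. The sketch below uses only the Python standard library; the specific concentration values are assumptions chosen for illustration, and concentrations all equal to 1 are used for the flat case because they give an exactly uniform density.

```python
import math

def dirichlet_log_pdf(point, alphas):
    """Log-density of a Dirichlet at a point in the interior of the simplex."""
    alpha_0 = sum(alphas)
    log_norm = math.lgamma(alpha_0) - sum(math.lgamma(a) for a in alphas)
    return log_norm + sum((a - 1.0) * math.log(p) for a, p in zip(alphas, point))

center = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]
near_corner = [0.98, 0.01, 0.01]  # close to the corner of class 0

# Illustrative concentration settings (assumed values, not from the slides).
cases = {
    "sharp uni-modal (class 0)": [50.0, 1.0, 1.0],
    "sharp, centered": [30.0, 30.0, 30.0],
    "flat": [1.0, 1.0, 1.0],
    "sharp multi-modal": [0.1, 0.1, 0.1],
}

for name, alphas in cases.items():
    at_center = dirichlet_log_pdf(center, alphas)
    at_corner = dirichlet_log_pdf(near_corner, alphas)
    print(f"{name:26s} precision={sum(alphas):6.1f} "
          f"log p(center)={at_center:8.2f} log p(near corner)={at_corner:8.2f}")
```

With equal concentrations below 1, the density is higher near the corner than at the center, which is the sharp multi-modal shape proposed for OOD examples.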
  • 12. Proposed Loss function: In-Domain Examples • Objective: model the mean position (standard cross-entropy loss) + model the precision values (proposed regularizer, a bounded approximation of the precision) • Goal: maximum concentration value for the correct class
  • 13. Proposed Loss function: OOD Examples • Objective: model the mean position (standard cross-entropy loss w.r.t. the uniform distribution → equal probability for all classes) + model the precision values (proposed regularizer) • Goal: the proposed representation for OOD examples
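The slides do not reproduce the loss equations, so the following is only a sketch of the structure they describe: a cross-entropy term for the mean plus a regularizer on a bounded approximation of the precision. The choice of the mean sigmoid of the logits as that bounded approximation, the names `lam_in` and `lam_out`, and their signs and values are all assumptions made for illustration.

```python
import math

def log_softmax(logits):
    """Numerically stable log soft-max."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_z for z in logits]

def bounded_precision(logits):
    """ASSUMPTION: mean sigmoid of the logits as a bounded precision surrogate."""
    return sum(1.0 / (1.0 + math.exp(-z)) for z in logits) / len(logits)

def in_domain_loss(logits, label, lam_in=0.5):
    """Cross-entropy for the mean, minus a reward for high precision.

    lam_in > 0 (assumed) pushes the logits, and hence the concentrations, up.
    """
    cross_entropy = -log_softmax(logits)[label]
    return cross_entropy - lam_in * bounded_precision(logits)

def ood_loss(logits, lam_out=-0.5):
    """Cross-entropy w.r.t. the uniform distribution, plus a precision term.

    lam_out < 0 (assumed) penalizes high precision, pushing all logits below
    zero so that every concentration exp(z_c) falls below 1.
    """
    k = len(logits)
    ce_uniform = -sum(log_softmax(logits)) / k
    return ce_uniform - lam_out * bounded_precision(logits)

# Same soft-max output, different precision: the regularizer tells them apart.
print(in_domain_loss([8.0, 0.0, 0.0], label=0))    # higher precision
print(in_domain_loss([1.0, -7.0, -7.0], label=0))  # lower precision
print(ood_loss([-5.0, -5.0, -5.0]))                # concentrations below 1
print(ood_loss([5.0, 5.0, 5.0]))                   # concentrations above 1
```

Under these assumptions, both OOD inputs give a uniform categorical mean, but only the one with negative logits (equal concentrations below 1) attains the lower loss, which matches the sharp multi-modal representation described for OOD examples.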
  • 15. Total Uncertainty Measure • High maxP score: confident prediction • Low maxP score: in-domain misclassification or OOD?
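A short sketch of the maxP score, assuming it is the largest soft-max probability, shows why a low score is ambiguous. The logit values are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable soft-max over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def max_probability(logits):
    """maxP: the largest class probability of the categorical posterior."""
    return max(softmax(logits))

confident = max_probability([6.0, 0.0, 0.0])       # one dominant class
overlapping = max_probability([1.0, 1.0, 1.0])     # in-domain class overlap
ood_like = max_probability([-4.0, -4.0, -4.0])     # OOD-style flat output

print(confident, overlapping, ood_like)
```

The last two inputs produce the same maxP score even though their logits, and therefore their precisions, differ, so maxP alone cannot identify the source of the uncertainty.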
  • 16. Distributional Uncertainty Measure • Confident prediction: first term low, second term low, MI (overall) low (~0)
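  • 17. Distributional Uncertainty Measure • In-domain example, confident prediction: first term low, second term low, MI (overall) low (~0) • In-domain example, misclassification: first term high, second term high, MI (overall) low (~0) • OOD example (Malinin & Gales): first term high, second term average, MI (overall) average • OOD example (proposed): first term high, second term low (given that the probability mass is concentrated), MI (overall) high; this maximizes the gap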
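The pattern of low, average, and high values can be reproduced numerically. The sketch below assumes that the first term is the entropy of the expected categorical distribution and the second term is the expected entropy under the Dirichlet, so that MI is their difference, as in Malinin & Gales (2018). The concentration values are illustrative, and `digamma` is a small hand-written helper because the Python standard library does not provide one.

```python
import math

def digamma(x):
    """Digamma via upward recurrence plus the standard asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 * inv - series

def uncertainty_terms(alphas):
    """Return (entropy of the mean, expected entropy, mutual information)."""
    alpha_0 = sum(alphas)
    mean = [a / alpha_0 for a in alphas]
    first = -sum(p * math.log(p) for p in mean)
    second = -sum(p * (digamma(a + 1.0) - digamma(alpha_0 + 1.0))
                  for p, a in zip(mean, alphas))
    return first, second, first - second

# Illustrative concentrations for a 3-class problem (assumed values).
cases = {
    "confident prediction": [100.0, 1.0, 1.0],
    "misclassification": [30.0, 30.0, 30.0],
    "OOD, flat Dirichlet": [1.0, 1.0, 1.0],
    "OOD, sharp multi-modal": [0.1, 0.1, 0.1],
}

for name, alphas in cases.items():
    first, second, mi = uncertainty_terms(alphas)
    print(f"{name:24s} first={first:.3f} second={second:.3f} MI={mi:.3f}")
```

For these values the mutual information is close to zero for both in-domain cases, moderate for the flat Dirichlet, and largest for the sharp multi-modal Dirichlet.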
  • 19. Synthetic Dataset • Figure panel: in-domain training data • Larger uncertainty scores for both in-domain examples with class overlap (i.e., data uncertainty) and OOD examples.
  • 21. Synthetic Dataset In-Domain Training Data Precision as dist. uncertainty measure: • High scores for in-domain examples • Low scores for OOD examples
  • 23. Conclusion • We show that, in the presence of high data uncertainty, existing OOD detection models, including DPN, tend to produce similar representations for in-domain and OOD examples, which compromises OOD detection performance • We propose to model the distributional uncertainty using a multi-modal Dirichlet distribution for DPN (Malinin & Gales, 2018) to maximize the representation gap between in-domain and OOD examples • Experimental results demonstrate that our proposed technique consistently outperforms other OOD detection models by addressing this issue. Thank You