retv-project.eu @ReTV_EU @ReTVproject retv-project retv_project
Subclass deep neural networks: re-enabling neglected classes
in deep network training for multimedia classification
N. Gkalelis, V. Mezaris
CERTH-ITI, Thermi-Thessaloniki, Greece
26th Int. Conf. Multimedia Modeling
Daejeon, Korea, January 2020
Outline
• Problem statement
• Related work
• Identification of neglected classes
• Subclass DNNs
• Experiments
Problem statement
• Deep neural networks (DNNs) have shown remarkable performance in many machine learning problems
• They are being commercially deployed in various domains:
  • Multimedia understanding • Self-driving cars • Edge computing
Image credits: V2Gov; [1]
[1] Chen, J., Ran, X.: Deep Learning With Edge Computing: A Review. Proc. of the IEEE, vol. 107, no. 8 (Aug 2019)
Problem statement
One weakness of DNNs is that some classes get less attention during training, for instance due to:
• Overfitting: the error on the training set becomes negligible although the generalization error is still high
• Put simply, the contributions of the misclassified observations to the updates of the weights associated with these classes cancel out
→ How can we identify these neglected classes?
→ How can we boost the classification performance of neglected classes?
Related work
• The problem of neglected classes is sometimes related to imbalanced class learning, where the distribution of training data across the different classes is skewed
• There are relatively few works studying the class imbalance problem in DNNs (e.g. see [1, 2])
→ As observed in our experiments, neglected classes may also appear with balanced training datasets (e.g. the CIFAR datasets)
→ The identification of classes receiving less attention from the DNN during the training procedure is a relatively unexplored topic
[1] Sarafianos, N., Xu, X., Kakadiaris, I.A.: Deep imbalanced attribute classification using visual attention aggregation. ECCV, Munich, Germany (Sep 2018)
[2] Lin, T., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. IEEE ICCV, Venice, Italy (Oct 2017)
Identification of neglected classes: motivation
• During backpropagation, a strong gradient update gi for the weight vector wi at the output layer is typically produced for classes with a high training error ei
• In this way, the overall network is guided to extract nonlinear features at the different layers, in order to provide a linearly separable subspace at the output layer for these classes
[Figure: the weight vector wi of output node i (associated with class i) is updated to wi' = wi + gi; the length of the gradient update ||gi|| is large, as expected due to the high error at output node i, so the weight vector (as well as the weight vectors in the layers below) is updated in order to reduce the output node error.]
Identification of neglected classes: motivation
• However, it is observed that for certain output nodes the gradient update is close to zero, although the classes associated with these nodes still have a large error
• The result is that the DNN neglects to extract the necessary discriminant features for reducing the error of these classes
• We investigate this case in detail and, based on the analysis, we propose a new measure to identify these so-called neglected classes
[Figure: the length of the gradient update ||gi|| is small despite a large error at output node i; the updated weight vector wi' ≈ wi remains relatively unchanged because ||gi|| is close to zero.]
Identification of neglected classes: background
• Suppose a DNN with a sigmoid (SG) output layer and a cross-entropy (CE) loss
[Figure: the i-th output node of the SG layer (associated with the i-th class), with inputs x1,k, ..., xj,k, ..., xf,k and weights w1,i, ..., wj,i, ..., wf,i]
Notation:
• wi, bi: weight vector and bias term of the i-th output node
• xk: input vector to the i-th node, associated with the k-th training observation
• output of the i-th node, associated with the k-th training observation
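A hedged sketch of the corresponding formulas, assuming the standard one-vs-rest sigmoid/CE formulation (the exact symbols of the slide's equations are not reproduced here): with l_{i,k} the label of the k-th observation w.r.t. class i, c the number of classes and m the number of batch observations,
  y_{i,k} = \sigma(\mathbf{w}_i^\top \mathbf{x}_k + b_i), \qquad \sigma(z) = \frac{1}{1 + e^{-z}},
  \mathcal{L} = -\frac{1}{m} \sum_{k=1}^{m} \sum_{i=1}^{c} \left[ l_{i,k} \log y_{i,k} + (1 - l_{i,k}) \log(1 - y_{i,k}) \right].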
Identification of neglected classes: background
[Equation on the slide: the gradient update of wi, defined in terms of the following notation:]
• label of the k-th observation w.r.t. class i
• learning rate
• input vector to the i-th output node, associated with the k-th training observation in the batch
• output of the i-th node, associated with the k-th training observation
• number of batch observations
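A hedged sketch of this quantity, assuming the standard CE/sigmoid gradient and the update convention wi' = wi + gi from the earlier slide (the slide's exact normalization may differ): with learning rate \eta and m batch observations,
  \mathbf{g}_i = \frac{\eta}{m} \sum_{k=1}^{m} \left( l_{i,k} - y_{i,k} \right) \mathbf{x}_k.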
Identification of neglected classes
[Equation on the slide: the unnormalized gradient update (w.r.t. the i-th node), decomposed into the unnormalized gradient update of the positive class observations and the unnormalized gradient update of the negative class observations]
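A hedged sketch of this decomposition, dropping the learning-rate and batch-size normalization and splitting the batch into the positive (P_i) and negative (N_i) observations of class i:
  \hat{\mathbf{g}}_i = \hat{\mathbf{g}}_i^{+} + \hat{\mathbf{g}}_i^{-}, \qquad \hat{\mathbf{g}}_i^{+} = \sum_{k \in P_i} (1 - y_{i,k}) \mathbf{x}_k, \qquad \hat{\mathbf{g}}_i^{-} = -\sum_{k \in N_i} y_{i,k} \mathbf{x}_k,
so the two terms can partially or fully cancel each other out.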
Identification of neglected classes
[Equations on the slide: the length of the unnormalized gradient update (w.r.t. the i-th node) and the lengths of the unnormalized gradient updates of the positive and negative class observations]
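As an illustration of how these lengths expose the cancellation effect (this is not necessarily the paper's exact neglection score), a class i can be flagged as neglected when
  \|\hat{\mathbf{g}}_i\| \ll \|\hat{\mathbf{g}}_i^{+}\| + \|\hat{\mathbf{g}}_i^{-}\|,
i.e. when the total update is much shorter than its positive and negative components, even though the training error of the class is still large.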
Training of neglected classes: Subclass DNNs
• A subclass-based learning framework is applied to improve the classification performance of the neglected classes
• Subclass-based classification techniques have been used successfully in the shallow learning paradigm [1-5]
• To this end, the neglected classes are augmented and partitioned into subclasses, and the created labeling information is used to build and train a subclass DNN
[1] Hastie, T., Tibshirani, R.: Discriminant analysis by Gaussian mixtures. Journal of the Royal Statistical Society, Series B 58(1) (Jul 1996)
[2] Manli, Z., Martinez, A.M.: Subclass discriminant analysis. IEEE Trans. Pattern Anal. Mach. Intell. 28(8) (Aug 2006)
[3] Escalera, S. et al.: Subclass problem-dependent design for error-correcting output codes. IEEE Trans. Pattern Anal. Mach. Intell. 30(6) (Jun 2008)
[4] You, D., Hamsici, O.C., Martinez, A.M.: Kernel optimization in discriminant analysis. IEEE Trans. Pattern Anal. Mach. Intell. 33(3) (Mar 2011)
[5] Gkalelis, N., Mezaris, V., Kompatsiaris, I., Stathaki, T.: Mixture subclass discriminant analysis link to restricted Gaussian model and other generalizations. IEEE Trans. Neural Netw. Learn. Syst. 24(1) (Jan 2013)
Training of neglected classes: Subclass DNNs
[Equation on the slide, defined in terms of the following notation:]
• number of training observations
• number of classes
• number of subclasses of class i
• label of the k-th observation w.r.t. subclass (i,j)
• output of node (i,j), associated with the k-th training observation
• label of the k-th observation w.r.t. class i
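A hedged sketch of one plausible subclass-level CE loss in this notation (the paper's exact loss may combine the class- and subclass-level labels differently): with N training observations, c classes, H_i subclasses of class i, y_{i,j,k} the output of node (i,j), l_{i,j,k} the subclass label and l_{i,k} the class label of the k-th observation,
  \mathcal{L}_s = -\frac{1}{N} \sum_{k=1}^{N} \sum_{i=1}^{c} \sum_{j=1}^{H_i} \left[ l_{i,j,k} \log y_{i,j,k} + (1 - l_{i,k}) \log(1 - y_{i,j,k}) \right],
so that a positive observation is pushed towards its own subclass node, while a negative observation of class i pushes all H_i subclass nodes of that class down.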
Experiments
Multiclass classification
• CIFAR10 [1]: 10 classes, 32 x 32 color images, 50000 training and 10000 testing
observations
• CIFAR100 [1]: as CIFAR10 but with 100 classes
• SVHN [2]: 10 classes, 32 x 32 color images, 73257 training, 26032 testing and
531131 extra observations
[1] Krizhevsky, A.: Learning multiple layers of features from tiny images. Tech. rep. (2009)
[2] Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning. NIPS Workshop on Deep Learning and Unsupervised Feature Learning (2011)
Experiments
Experimental setup (similar setup to [1])
• Images are normalized to zero mean and unit variance
• Data augmentation is applied as in [1] (cropping, mirroring, cutout, etc.)
• DNNs [2,3]: VGG16, WRN-28-10 (CIFAR-10, -100), WRN-16-8 (SVHN)
• CE loss, minibatch SGD, Nesterov momentum 0.9, batch size 128, weight decay 0.0005, 200 epochs, initial learning rate 0.1, step schedule (learning rate multiplied by 0.1 for CIFAR or 0.2 for SVHN at epochs 60, 120 and 160); see the optimizer/schedule sketch after the references below
[1] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv:1708.04552 (2017)
[2] Zagoruyko, S., Komodakis, N.: Wide residual networks. BMVC, York, UK (Sep 2016)
[3] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. ICLR, San Diego, CA, USA (May 2015)
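A minimal PyTorch-style sketch of the optimizer and learning-rate schedule listed above (the model, data loading and training loop are assumed and omitted; this is an illustration, not the authors' code):

import torch

def make_optimizer_and_scheduler(model, svhn=False):
    # SGD with Nesterov momentum 0.9, weight decay 0.0005 and initial learning rate 0.1
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                                nesterov=True, weight_decay=5e-4)
    # learning rate multiplied by 0.1 (CIFAR) or 0.2 (SVHN) at epochs 60, 120 and 160
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[60, 120, 160], gamma=0.2 if svhn else 0.1)
    return optimizer, scheduler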
Experiments
SDNN training
• The DNN is trained for 30 epochs in order to compute a reliable neglection score for each class
• The classes with the highest neglection score are selected: 2 classes for CIFAR-10 and SVHN, 10 classes for CIFAR-100
• The observations of the selected classes are doubled using the augmentation approach presented in [1]
• The selected classes are partitioned into 2 subclasses using k-means (see the sketch after the reference below)
• The SDNN is trained using the annotations at class and subclass level
[1] Inoue, H.: Data augmentation by pairing samples for images classification. arXiv:1801.02929 (2018)
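A minimal sketch of the k-means partitioning step above, assuming the (already augmented) observations of one selected class are available as a feature matrix; the function name and its parameters are illustrative, not the authors' code:

from sklearn.cluster import KMeans

def partition_class_into_subclasses(X, n_subclasses=2, seed=0):
    # X: (n_samples, n_features) array holding the observations of one neglected class.
    # Returns subclass indices in {0, ..., n_subclasses - 1}; these are used as additional
    # labels (together with the class labels) when training the subclass DNN (SDNN).
    km = KMeans(n_clusters=n_subclasses, random_state=seed, n_init=10)
    return km.fit_predict(X)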
Experiments
Results
• The proposed subclass DNNs (SVGG16, SWRN) provide improved performance over their
conventional variants (VGG16 [1], WRN [2])
• Considering that [2] is among the state-of-the-art approaches, the attained CCR performance
improvement is significant
• Training time overhead is negligible for the medium-size datasets (CIFAR-10, -100) and relatively
small for the larger SVHN; testing time is only a few seconds for all DNNs
           VGG16 [1]        SVGG16           WRN [2]          SWRN
CIFAR-10   93.5% (2.6 h)    94.8% (2.7 h)    96.92% (7.1 h)   97.14% (8.1 h)
CIFAR-100  71.24% (2.6 h)   73.67% (2.6 h)   81.59% (7.1 h)   82.17% (7.5 h)
SVHN       98.16% (29.1 h)  98.35% (34.1 h)  98.7% (33.1 h)   98.81% (42.7 h)
[1] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. ICLR, San Diego, CA, USA (May 2015)
[2] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv:1708.04552 (2017)
Experiments
Multilabel classification
• YouTube-8M (YT8M) [1]: largest public dataset for multilabel video classification
• 3862 classes, 3888919 training, 1112356 evaluation and 1133323 testing videos
• 1024- and 128-dimensional visual and audio feature vectors are provided for each
video
[1] Abu-El-Haija, S., Kothari, N., Lee, J., Natsev, A.P., Toderici, G., Varadarajan, B., Vijayanarasimhan, S.: YouTube-8M: A large-scale video classification benchmark. arXiv:1609.08675 (2016)
Experiments
Experimental setup
• Two experiments: using the visual feature vectors directly; concatenating the visual and audio feature vectors
• L2-normalization is applied to the feature vectors in all cases
• DNN: 1 conv layer with 64 1d-filters, max-pooling, dropout, SG output layer (a hedged model sketch follows the references below)
• CE loss, minibatch SGD, batch size 512, weight decay 0.0005, 5 epochs, initial learning rate 0.001, exponential schedule (learning rate multiplied by 0.95 at every epoch)
• The evaluation metrics of the YT8M challenge [1] are utilized for performance evaluation; GAP@20 is used as the primary evaluation metric
[1] Abu-El-Haija, S., Kothari, N., Lee, J., Natsev, A.P., Toderici, G., Varadarajan, B., Vijayanarasimhan, S.: YouTube-8M: A large-scale video classification benchmark. arXiv:1609.08675 (2016)
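A hedged PyTorch sketch of the video-level DNN described above; the kernel size, pooling size, dropout rate and the ReLU after the convolution are not stated on the slide and are assumptions:

import torch
import torch.nn as nn

class VideoLevelDNN(nn.Module):
    # 1 conv layer with 64 1d-filters, max-pooling, dropout and a sigmoid (SG) output layer,
    # applied to a single video-level feature vector (e.g. 1024-d visual or 1152-d audiovisual).
    def __init__(self, in_dim=1152, n_classes=3862, kernel_size=8, p_drop=0.5):
        super().__init__()
        self.conv = nn.Conv1d(1, 64, kernel_size)       # 64 1-d filters
        self.pool = nn.MaxPool1d(2)
        self.drop = nn.Dropout(p_drop)
        pooled_len = (in_dim - kernel_size + 1) // 2    # length after conv and pooling
        self.fc = nn.Linear(64 * pooled_len, n_classes)

    def forward(self, x):                               # x: (batch, in_dim), L2-normalized
        h = torch.relu(self.conv(x.unsqueeze(1)))       # (batch, 64, in_dim - kernel_size + 1)
        h = self.drop(self.pool(h))                     # (batch, 64, pooled_len)
        return torch.sigmoid(self.fc(h.flatten(1)))     # per-class scores in [0, 1]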
Experiments
SDNN training
• The DNN is trained for 1/3 of an epoch; a neglection score is computed for each class
• The 386 classes with the highest neglection score are selected
• The observations of the selected classes are doubled by applying extrapolation in feature space [1] (see the sketch after the references below)
• An efficient variant of the nearest neighbor-based clustering algorithm of [2,3] is used to partition the neglected classes into subclasses
• The SDNN is trained using the annotations at class and subclass level
[1] DeVries, T., Taylor, G.W.: Dataset augmentation in feature space. ICLR Workshops, Toulon, France (Apr 2017)
[2] Manli, Z., Martinez, A.M.: Subclass discriminant analysis. IEEE Trans. Pattern Anal. Mach. Intell. 28(8), 1274-1286 (Aug 2006)
[3] You, D., Hamsici, O.C., Martinez, A.M.: Kernel optimization in discriminant analysis. IEEE Trans. Pattern Anal. Mach. Intell. 33(3), 631-638 (Mar 2011)
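A minimal numpy sketch of feature-space extrapolation in the spirit of [1], pairing each observation of a neglected class with its nearest same-class neighbour; the extrapolation factor lam and the exact pairing strategy are assumptions:

import numpy as np

def extrapolate_class_features(X, lam=0.5):
    # X: (n_samples, n_features) feature vectors of one neglected class.
    # Each sample is extrapolated away from its nearest neighbour, x' = x + lam * (x - x_nn),
    # doubling the observations of the class.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                           # exclude each sample itself
    nn_idx = d2.argmin(axis=1)                             # nearest-neighbour index per sample
    X_new = X + lam * (X - X[nn_idx])
    return np.vstack([X, X_new])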
Experiments
Results
• The proposed SDNN outperforms the DNN (in terms of GAP@20) by 1% and 1.5% using visual and audiovisual features, respectively; both networks outperform Logistic Regression (LR)
• A performance gain of about 3% is observed when exploiting the audio information, for both networks
              Visual                       Visual + Audio
           LR      DNN     SDNN        LR      DNN     SDNN
Hit@1      82.4%   82.5%   83.2%       82.3%   85.2%   85.7%
PERR       71.9%   72.2%   72.9%       71.8%   75.4%   75.9%
mAP        41.2%   42.3%   45.2%       40.1%   45.6%   47.9%
GAP@20     77.1%   77.6%   78.6%       77%     80.7%   82.2%
Ttr (min)  18.7    59.2    66.2        18.9    60.3    67.1
Experiments
Results
• Comparison with the best single-model results on YT8M
• These methods use frame-level features and build upon stronger, computationally demanding feature descriptors (e.g. Fisher Vectors, VLAD, FVNet, DBoF and others)
• The proposed SDNN performs on par with the top performers in the literature, despite the fact that we utilize only the video-level features provided by YT8M
          [1]       [2]      [3]      [4]       SDNN
GAP@20    82.15%    80.9%    82.5%    82.25%    82.2%
[1] Huang, P., Yuan, Y., Lan, Z., Jiang, L., Hauptmann, A.G.: Video representation learning and latent concept mining for large-scale multi-label video classification. arXiv:1707.01408 (2017)
[2] Na, S., Yu, Y., Lee, S., Kim, J., Kim, G.: Encoding video and label priors for multilabel video classification on the YouTube-8M dataset. CVPR Workshops (2017)
[3] Bober-Irizar, M., Husain, S., Ong, E.J., Bober, M.: Cultivating DNN diversity for large scale video labelling. CVPR Workshops (2017)
[4] Skalic, M., Pekalski, M., Pan, X.E.: Deep learning methods for efficient large scale video labeling. CVPR Workshops (2017)
Summary and next steps
Summary
• A new criterion was suggested to identify neglected classes, i.e. classes that get less attention from the DNN during the training procedure
• Subclass DNNs were proposed, where the identified classes are augmented and partitioned into subclasses, and a new CE loss is applied to effectively exploit the subclass labelling information
• The proposed approach was evaluated successfully in two different problem domains: multiclass classification (CIFAR-10, CIFAR-100, SVHN) and multilabel classification (YT8M)
Next steps
• Further improve the classification performance of the proposed approach, using a mixture-of-experts model to combine the subclass scores at the output layer
• Extend the proposed approach to the semi-supervised paradigm, in order to exploit partially labelled datasets more effectively
Thank you for your attention!
Questions?
Vasileios Mezaris, bmezaris@iti.gr
Code publicly available at:
https://github.com/bmezaris/subclass_deep_neural_networks
This work was supported by the EU's Horizon 2020 research and innovation programme under grant agreement H2020-780656 ReTV