2020/11/29
Ho Seong Lee (hoya012)
Cognex Deep Learning Lab
Research Engineer
PR-290 | Do Adversarially Robust ImageNet Models Transfer Better? 1
Contents
• Introduction
• Related Work
• Experiments
• Analysis & Discussion
• Conclusion
Introduction
Transfer learning is a widely used paradigm in deep learning (arguably the default)
• Models pre-trained on standard datasets (e.g. ImageNet) can be efficiently adapted to downstream tasks.
• Better pre-trained models yield better transfer results, suggesting that initial accuracy is a key aspect of
transfer learning performance.
Reference: “Do Better ImageNet Models Transfer Better?“, 2019 CVPR
Related Works
Transfer learning in various domains
• Medical imaging
• “Comparison of deep transfer learning strategies for digital pathology”, 2018 CVPRW
• Language modeling
• “Senteval: An evaluation toolkit for universal sentence representations”, 2018 arXiv
• Object Detection, Segmentation
• “Faster r-cnn: Towards real-time object detection with region proposal networks”, 2015 NIPS
• “R-fcn: Object detection via region-based fully convolutional networks”, 2016 NIPS
• “Speed/accuracy trade-offs for modern convolutional object detectors”, 2017 CVPR
• “Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and
fully connected crfs”, 2017 TPAMI
Related Works
Transfer Learning with fine-tuning or frozen feature-based methods
• “Analyzing the performance of multilayer neural networks for object recognition”, 2014 ECCV
• “Return of the devil in the details: Delving deep into convolutional nets”, 2014 arXiv
• “Rich feature hierarchies for accurate object detection and semantic segmentation”, 2014 CVPR
• “How transferable are features in deep neural networks?”, 2014 NIPS
• “Factors of transferability for a generic convnet representation”, 2015 TPAMI
• “Bilinear cnn models for fine-grained visual recognition”, 2015 ICCV
• “What makes ImageNet good for transfer learning?”, 2016 arXiv
• “Best practices for fine-tuning visual classifiers to new domains”, 2016 ECCV
→ They show that fine-tuning outperforms frozen feature-based methods
Related Works
Adversarial robustness
• “Towards deep learning models resistant to adversarial attacks”, 2018 ICLR
• “Virtual adversarial training: a regularization method for supervised and semi-supervised learning”,
2018
• “Provably robust deep learning via adversarially trained smoothed classifier”, 2019 NeurIPS
• Many papers have studied the features learned by these robust networks and suggest that they
improve upon those learned by standard networks.
• On the other hand, prior studies have also identified theoretical and empirical tradeoffs between
standard accuracy and adversarial robustness.
Related Works
Adversarial robustness and Transfer learning
• “Adversarially robust transfer learning”, 2019 arXiv
• Transfer learning can increase downstream-task adversarial robustness
• “Adversarially-Trained Deep Nets Transfer Better”, 2020 arXiv
• Investigates the transfer performance of adversarially robust networks. → Very similar work!
• In comparison, the authors of this paper study a larger set of downstream datasets and tasks and
analyze the effects of model accuracy, model width, and data resolution.
Experiments
Motivation: Fixed-Feature Transfer Learning
• We use the source model as a feature extractor for the target dataset, then train a simple (often
linear) model on the resulting features.
Reference: Stanford cs231n lecture note
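The fixed-feature recipe above can be sketched in a few lines of NumPy. This is a toy illustration only: the "pre-trained" backbone is a hypothetical stand-in (a frozen random projection with ReLU), the target dataset is synthetic, and names such as `extract_features` and `train_linear_head` are made up for this sketch rather than taken from the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a frozen pre-trained backbone: a fixed random
# projection followed by ReLU. In practice this would be a ResNet trunk.
W_frozen = rng.normal(size=(20, 64))

def extract_features(x):
    """Frozen feature extractor: its weights are never updated."""
    return np.maximum(x @ W_frozen, 0.0)

def train_linear_head(feats, labels, n_classes, lr=0.05, steps=500):
    """Fit a softmax-regression head on fixed features with gradient descent."""
    W = np.zeros((feats.shape[1], n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(steps):
        logits = feats @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / len(feats)         # dLoss/dlogits
        W -= lr * feats.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

# Synthetic target dataset: two Gaussian blobs in 20 dimensions.
x = np.concatenate([rng.normal(-1.0, 1.0, size=(100, 20)),
                    rng.normal(+1.0, 1.0, size=(100, 20))])
y = np.array([0] * 100 + [1] * 100)

# Step 1: run the frozen extractor once. Step 2: train only the linear head.
feats = extract_features(x)
feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)
W, b = train_linear_head(feats, y, n_classes=2)
train_acc = float(((feats @ W + b).argmax(axis=1) == y).mean())
print(f"training accuracy of the linear head: {train_acc:.2f}")
```

Because the backbone is never updated, the features can be computed once and cached, which is what makes this setting cheap compared with fine-tuning.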
Experiments
How can we improve transfer learning?
• Prior works suggest that accuracy on the source dataset is a strong indicator of performance on
downstream tasks.
• Still, it is unclear if improving ImageNet accuracy is the only way to improve performance.
• After all, the behavior of fixed-feature transfer is governed by models’ learned representations, which
are not fully described by source-dataset accuracy.
• These representations are, in turn, controlled by the priors that we put on them during training:
architectural components, loss functions, augmentations, etc.
Experiments
The adversarial robustness prior
• Adversarial robustness refers to a model’s invariance to small (often imperceptible) perturbations of its
inputs.
• Robustness is typically induced at training time by replacing the standard empirical risk minimization
objective with a robust optimization objective
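For reference, the replacement of the standard objective by the robust one can be written as below. This is the usual adversarial-training formulation, given here as a sketch: θ denotes the model parameters, 𝓛 the loss, D the data distribution, δ the perturbation, and ε the perturbation budget. Measuring the budget in the ℓ2 norm is an assumption of this sketch.

```latex
\min_{\theta} \; \mathbb{E}_{(x,y)\sim D}\big[\mathcal{L}(x, y; \theta)\big]
\quad\longrightarrow\quad
\min_{\theta} \; \mathbb{E}_{(x,y)\sim D}\Big[\max_{\|\delta\|_2 \le \varepsilon} \mathcal{L}(x+\delta, y; \theta)\Big]
```

Setting ε = 0 recovers standard empirical risk minimization, so ε can be read as a knob for how strong the robustness prior is.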
Experiments
Should adversarial robustness help fixed-feature transfer?
• In fact, adversarially robust models are known to be significantly less accurate than their standard
counterparts.
• This suggests that using adversarially robust feature representations should hurt transfer performance.
• On the other hand, recent work has found that the feature representations of robust models carry
several advantages over those of standard models.
• For example, adversarially robust representations typically have better-behaved gradients and thus
facilitate regularization-free feature visualization
Experiments
Experiments – Fixed Feature Transfer Learning
• To resolve these two conflicting hypotheses (robust feature representations should hurt transfer
performance vs. robust feature representations carry several advantages over those of standard
models), the authors use a test bed of 12 standard transfer learning datasets.
• They use four ResNet-based architectures (ResNet-18, ResNet-50, WideResNet-50-x2, WideResNet-50-x4).
• The results indicate that robust networks consistently extract better features for transfer learning than
standard networks.
Experiments
Experiments – Full-Network Fine Tuning
• A more expensive but often better-performing transfer learning method uses the pre-trained model as a
weight initialization rather than as a feature extractor.
• In other words, update all of the weights of the pre-trained model (via gradient descent) to minimize loss
on the target task.
• Many previous works find that for standard models, performance on full-network transfer learning is
highly correlated with performance on fixed-feature transfer learning.
• The hope is that the findings of the previous section (fixed-feature) also carry over to this setting (full-network).
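The difference between the two transfer modes comes down to which weights receive gradient updates. A minimal NumPy sketch on a toy two-layer network follows; the network, the synthetic data, and the `train` helper with its `finetune_backbone` flag are all hypothetical and only meant to make the distinction concrete.

```python
import numpy as np

def train(x, y, W1, w2, finetune_backbone, lr=0.05, steps=200):
    """Train a 2-layer net for binary classification (logistic loss).

    W1 plays the role of the pre-trained backbone, w2 the new task head.
    finetune_backbone=False -> fixed-feature transfer (only w2 is updated).
    finetune_backbone=True  -> full-network fine-tuning (W1 and w2 updated).
    """
    W1, w2 = W1.copy(), w2.copy()
    for _ in range(steps):
        h = np.tanh(x @ W1)                       # backbone features
        p = 1.0 / (1.0 + np.exp(-(h @ w2)))       # head prediction
        g_logit = (p - y) / len(x)                # dLoss/dlogit
        g_w2 = h.T @ g_logit
        if finetune_backbone:
            # Backpropagate through the head and the tanh into the backbone.
            g_h = np.outer(g_logit, w2) * (1.0 - h ** 2)
            W1 -= lr * (x.T @ g_h)
        w2 -= lr * g_w2
    h = np.tanh(x @ W1)
    p = 1.0 / (1.0 + np.exp(-(h @ w2)))
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    return W1, w2, loss

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 10))                    # synthetic target data
y = (x[:, 0] + x[:, 1] > 0).astype(float)
W1_pre = 0.5 * rng.normal(size=(10, 16))          # pretend pre-trained weights
w2_init = np.zeros(16)                            # freshly initialized head

W1_fixed, _, loss_fixed = train(x, y, W1_pre, w2_init, finetune_backbone=False)
W1_ft, _, loss_ft = train(x, y, W1_pre, w2_init, finetune_backbone=True)
print("backbone unchanged (fixed-feature):", np.array_equal(W1_fixed, W1_pre))
print("backbone unchanged (fine-tuning):  ", np.array_equal(W1_ft, W1_pre))
```

In the fine-tuning case the pre-trained weights serve only as the starting point, which is why it costs more: every step backpropagates through the whole network.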
Experiments
Experiments – Full-Network Fine Tuning
• Robust models match or improve on standard models in terms of transfer learning performance.
Experiments
Experiments – Full-Network Fine Tuning
• Also, adversarially robust networks consistently outperform standard networks in Object Detection &
Instance Segmentation
Analysis & Discussion
4.1 ImageNet accuracy and transfer performance
• Take a closer look at the similarities and differences in transfer learning between robust networks and
standard networks.
• Hypothesis: robustness and accuracy have counteracting yet separate effects!
• That is, higher accuracy improves transfer learning for a fixed level of robustness, and higher
robustness improves transfer learning for a fixed level of accuracy
• The results (cf. Figure 5; similar results for full-network transfer in Appendix F) support this hypothesis.
• The previously observed linear relationship between accuracy and transfer performance is often violated
once the robustness aspect comes into play.
Analysis & Discussion
4.1 ImageNet accuracy and transfer performance
• In even more direct support of this hypothesis, the authors find that when the robustness level is held
fixed, the accuracy-transfer correlation observed by prior works for standard models holds for robust models too.
• These findings also indicate that accuracy is not a sufficient measure of feature quality or versatility.
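One way to check a "correlation at fixed robustness level" claim of this kind is to group models by ε and compute the accuracy-transfer correlation within each group. The sketch below uses only the standard library. The numbers in `records` are placeholders invented purely to exercise the code; they are not results from the paper, and the helper names are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / sqrt(vx * vy)

def correlation_by_eps(records):
    """Group (eps, source_acc, transfer_acc) records by eps and correlate."""
    groups = defaultdict(list)
    for eps, src_acc, tgt_acc in records:
        groups[eps].append((src_acc, tgt_acc))
    return {eps: pearson(*zip(*pairs)) for eps, pairs in groups.items()}

# Placeholder values (made up, NOT from the paper): (eps, source acc, transfer acc).
records = [
    (0.0, 70.0, 80.0), (0.0, 76.0, 83.0), (0.0, 78.0, 84.5),
    (3.0, 53.0, 85.0), (3.0, 63.0, 88.0), (3.0, 66.0, 89.0),
]

per_eps = correlation_by_eps(records)
pooled = pearson([r[1] for r in records], [r[2] for r in records])
print("per-eps correlation:", per_eps)
print("pooled correlation: ", pooled)
```

With these placeholder numbers the correlation is strongly positive inside each ε group but negative when the groups are pooled, which is the kind of pattern that pooling across robustness levels would hide.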
Analysis & Discussion
4.2 Robust models improve with width
• Previous works find that although increasing network depth improves transfer performance, increasing
width hurts it.
• The results corroborate this trend for standard networks but indicate that it does not hold for robust
networks, at least in the regime of widths tested.
• As width increases, transfer performance plateaus and decreases for standard models, but continues to
steadily grow for robust models.
Not always!!
Analysis & Discussion
4.3 Optimal robustness levels for downstream tasks
• Although the best robust models often outperform the best standard models, the optimal choice of
robustness parameter ε varies widely between datasets. For example, when transferring to CIFAR-10
and CIFAR-100, the optimal ε values were 3.0 and 1.0, respectively.
• In contrast, smaller values of ε (smaller by an order of magnitude) tend to work better for the rest of the
datasets.
• One possible explanation for this variability in the optimal choice of ε might relate to dataset granularity.
• Although we lack a quantitative notion of granularity (in reality, features are not simply singular pixels),
we consider image resolution as a crude proxy.
Analysis & Discussion
4.3 Optimal robustness levels for downstream tasks
• Since target datasets are scaled to match ImageNet dimensions, each pixel in a low-resolution dataset
(e.g., CIFAR-10) image translates into several pixels in transfer, thus inflating the dataset’s separability.
• The authors attempt to calibrate the granularities of the 12 image classification datasets used in this
work by first downscaling all the images to the size of CIFAR-10 (32×32), and then upscaling them to
ImageNet size once more.
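The downscale-then-upscale calibration can be sketched with plain NumPy. The interpolation choices here (block averaging down, nearest-neighbour up) and the 224×224 "ImageNet size" are assumptions made for illustration; the slides do not say which resampling method the authors used, and `downscale_then_upscale` is a hypothetical helper name.

```python
import numpy as np

def downscale_then_upscale(img, low=32, high=224):
    """Reduce an HxWxC image to low x low, then blow it back up to high x high.

    Assumes H == W == high and high % low == 0. Block averaging and
    nearest-neighbour upsampling are illustrative choices only.
    """
    k = high // low                                   # pixels per low-res cell
    c = img.shape[2]
    # Split each axis into (cell index, offset within cell) and average cells.
    small = img.reshape(low, k, low, k, c).mean(axis=(1, 3))    # (low, low, c)
    # Repeat every low-res pixel k times along both spatial axes.
    return np.repeat(np.repeat(small, k, axis=0), k, axis=1)    # (high, high, c)

rng = np.random.default_rng(0)
img = rng.random((224, 224, 3))                       # synthetic stand-in image
out = downscale_then_upscale(img)
print(out.shape)
```

After this transform each 7×7 block of the output carries a single value per channel, so the image has ImageNet dimensions but only CIFAR-10-level detail.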
Analysis & Discussion
4.3 Optimal robustness levels for downstream tasks
• After controlling for original dataset dimension, the datasets’ ε vs. transfer accuracy curves all
behave almost identically to the CIFAR-10 and CIFAR-100 ones. (Similar results for full-network transfer)
Analysis & Discussion
4.4 Comparing adversarial robustness to texture robustness
• Consider texture-invariant models, i.e., models trained on the texture-randomizing Stylized ImageNet
(SIN) dataset.
Analysis & Discussion
4.4 Comparing adversarial robustness to texture robustness
• Transfer learning from adversarially robust models outperforms transfer learning from texture-invariant
models on all considered datasets.
[Figure: adversarially robust vs. texture-invariant models; panels: Full-network, Fixed-feature]
Conclusion
• Propose using adversarially robust models for transfer learning.
• Compare transfer learning performance of robust and standard models on a suite of 12
classification tasks, object detection, and instance segmentation.
• Find that adversarially robust neural networks consistently match or improve upon the
performance of their standard counterparts, despite having lower ImageNet accuracy.
• Take a closer look at the behavior of adversarially robust networks, and study the interplay
between ImageNet accuracy, model width, robustness, and transfer performance.
Conclusion
• These experiments are easy to try: https://github.com/Microsoft/robust-models-transfer
28

More Related Content

What's hot

YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewLEE HOSEONG
 
Transfer Learning and Fine-tuning Deep Neural Networks
 Transfer Learning and Fine-tuning Deep Neural Networks Transfer Learning and Fine-tuning Deep Neural Networks
Transfer Learning and Fine-tuning Deep Neural NetworksPyData
 
Pelee: a real time object detection system on mobile devices Paper Review
Pelee: a real time object detection system on mobile devices Paper ReviewPelee: a real time object detection system on mobile devices Paper Review
Pelee: a real time object detection system on mobile devices Paper ReviewLEE HOSEONG
 
Efficient de cvpr_2020_paper
Efficient de cvpr_2020_paperEfficient de cvpr_2020_paper
Efficient de cvpr_2020_papershanullah3
 
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...Jinwon Lee
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsJinwon Lee
 
PR-297: Training data-efficient image transformers & distillation through att...
PR-297: Training data-efficient image transformers & distillation through att...PR-297: Training data-efficient image transformers & distillation through att...
PR-297: Training data-efficient image transformers & distillation through att...Jinwon Lee
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer VisionSungjoon Choi
 
“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...
“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...
“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...Edge AI and Vision Alliance
 
PR-270: PP-YOLO: An Effective and Efficient Implementation of Object Detector
PR-270: PP-YOLO: An Effective and Efficient Implementation of Object DetectorPR-270: PP-YOLO: An Effective and Efficient Implementation of Object Detector
PR-270: PP-YOLO: An Effective and Efficient Implementation of Object DetectorJinwon Lee
 
PR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object DetectionPR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object DetectionJinwon Lee
 
[PR12] Inception and Xception - Jaejun Yoo
[PR12] Inception and Xception - Jaejun Yoo[PR12] Inception and Xception - Jaejun Yoo
[PR12] Inception and Xception - Jaejun YooJaeJun Yoo
 
A beginner's guide to Style Transfer and recent trends
A beginner's guide to Style Transfer and recent trendsA beginner's guide to Style Transfer and recent trends
A beginner's guide to Style Transfer and recent trendsJaeJun Yoo
 
Learning deep features for discriminative localization
Learning deep features for discriminative localizationLearning deep features for discriminative localization
Learning deep features for discriminative localization太一郎 遠藤
 
Face Recognition: From Scratch To Hatch
Face Recognition: From Scratch To HatchFace Recognition: From Scratch To Hatch
Face Recognition: From Scratch To HatchEduard Tyantov
 
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio..."Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...Edge AI and Vision Alliance
 
Task Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningTask Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningMLAI2
 
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision TransformersEmerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision TransformersSungchul Kim
 
Reward constrained interactive recommendation with natural language feedback ...
Reward constrained interactive recommendation with natural language feedback ...Reward constrained interactive recommendation with natural language feedback ...
Reward constrained interactive recommendation with natural language feedback ...Jeong-Gwan Lee
 

What's hot (20)

YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection review
 
Transfer Learning and Fine-tuning Deep Neural Networks
 Transfer Learning and Fine-tuning Deep Neural Networks Transfer Learning and Fine-tuning Deep Neural Networks
Transfer Learning and Fine-tuning Deep Neural Networks
 
Pelee: a real time object detection system on mobile devices Paper Review
Pelee: a real time object detection system on mobile devices Paper ReviewPelee: a real time object detection system on mobile devices Paper Review
Pelee: a real time object detection system on mobile devices Paper Review
 
Efficient de cvpr_2020_paper
Efficient de cvpr_2020_paperEfficient de cvpr_2020_paper
Efficient de cvpr_2020_paper
 
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
 
PR-297: Training data-efficient image transformers & distillation through att...
PR-297: Training data-efficient image transformers & distillation through att...PR-297: Training data-efficient image transformers & distillation through att...
PR-297: Training data-efficient image transformers & distillation through att...
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
 
“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...
“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...
“DNN Training Data: How to Know What You Need and How to Get It,” a Presentat...
 
PR-270: PP-YOLO: An Effective and Efficient Implementation of Object Detector
PR-270: PP-YOLO: An Effective and Efficient Implementation of Object DetectorPR-270: PP-YOLO: An Effective and Efficient Implementation of Object Detector
PR-270: PP-YOLO: An Effective and Efficient Implementation of Object Detector
 
CNN Quantization
CNN QuantizationCNN Quantization
CNN Quantization
 
PR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object DetectionPR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object Detection
 
[PR12] Inception and Xception - Jaejun Yoo
[PR12] Inception and Xception - Jaejun Yoo[PR12] Inception and Xception - Jaejun Yoo
[PR12] Inception and Xception - Jaejun Yoo
 
A beginner's guide to Style Transfer and recent trends
A beginner's guide to Style Transfer and recent trendsA beginner's guide to Style Transfer and recent trends
A beginner's guide to Style Transfer and recent trends
 
Learning deep features for discriminative localization
Learning deep features for discriminative localizationLearning deep features for discriminative localization
Learning deep features for discriminative localization
 
Face Recognition: From Scratch To Hatch
Face Recognition: From Scratch To HatchFace Recognition: From Scratch To Hatch
Face Recognition: From Scratch To Hatch
 
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio..."Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
 
Task Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningTask Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive Learning
 
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision TransformersEmerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
 
Reward constrained interactive recommendation with natural language feedback ...
Reward constrained interactive recommendation with natural language feedback ...Reward constrained interactive recommendation with natural language feedback ...
Reward constrained interactive recommendation with natural language feedback ...
 

Similar to do adversarially robust image net models transfer better

How well do self-supervised models transfer.pptx
How well do self-supervised models transfer.pptxHow well do self-supervised models transfer.pptx
How well do self-supervised models transfer.pptxssuserbafbd0
 
Augmix review [cdm]
Augmix review [cdm]Augmix review [cdm]
Augmix review [cdm]Dongmin Choi
 
How useful is self-supervised pretraining for Visual tasks?
How useful is self-supervised pretraining for Visual tasks?How useful is self-supervised pretraining for Visual tasks?
How useful is self-supervised pretraining for Visual tasks?Seunghyun Hwang
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningSanghamitra Deb
 
ResNeSt: Split-Attention Networks
ResNeSt: Split-Attention NetworksResNeSt: Split-Attention Networks
ResNeSt: Split-Attention NetworksSeunghyun Hwang
 
Presentation of master thesis
Presentation of master thesisPresentation of master thesis
Presentation of master thesisSeoung-Ho Choi
 
A Fully Progressive approach to Single image super-resolution
A Fully Progressive approach to Single image super-resolution A Fully Progressive approach to Single image super-resolution
A Fully Progressive approach to Single image super-resolution Mohammed Ashour
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...IEEEFINALYEARSTUDENTPROJECT
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...IEEEMEMTECHSTUDENTSPROJECTS
 
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...IEEEFINALYEARSTUDENTPROJECTS
 
Inception v4 vs Inception Resnet v2.pdf
Inception v4 vs Inception Resnet v2.pdfInception v4 vs Inception Resnet v2.pdf
Inception v4 vs Inception Resnet v2.pdfChauVVan
 
PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural NetworksPR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural NetworksJinwon Lee
 
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondenceParn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondenceNAVER Engineering
 
Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...
Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...
Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...Christopher Sneed, MSDS, PMP, CSPO
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for BeginnersSanghamitra Deb
 
How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...Dongmin Choi
 
Graph convolutional neural networks for web-scale recommender systems.pptx
Graph convolutional neural networks for web-scale recommender systems.pptxGraph convolutional neural networks for web-scale recommender systems.pptx
Graph convolutional neural networks for web-scale recommender systems.pptxssuser2624f71
 
The Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-SystemThe Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-Systeminside-BigData.com
 
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_ReportSaptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_ReportSitakanta Mishra
 

Similar to do adversarially robust image net models transfer better (20)

How well do self-supervised models transfer.pptx
How well do self-supervised models transfer.pptxHow well do self-supervised models transfer.pptx
How well do self-supervised models transfer.pptx
 
Augmix review [cdm]
Augmix review [cdm]Augmix review [cdm]
Augmix review [cdm]
 
How useful is self-supervised pretraining for Visual tasks?
How useful is self-supervised pretraining for Visual tasks?How useful is self-supervised pretraining for Visual tasks?
How useful is self-supervised pretraining for Visual tasks?
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learning
 
ResNeSt: Split-Attention Networks
ResNeSt: Split-Attention NetworksResNeSt: Split-Attention Networks
ResNeSt: Split-Attention Networks
 
Presentation of master thesis
Presentation of master thesisPresentation of master thesis
Presentation of master thesis
 
A Fully Progressive approach to Single image super-resolution
A Fully Progressive approach to Single image super-resolution A Fully Progressive approach to Single image super-resolution
A Fully Progressive approach to Single image super-resolution
 
Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...
Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...
Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
Do Adversarially Robust ImageNet Models Transfer Better?

  • 1. PR-290 | Do Adversarially Robust ImageNet Models Transfer Better? 2020/11/29, Ho Seong Lee (hoya012), Research Engineer, Cognex Deep Learning Lab
  • 2. Contents: Introduction, Related Work, Experiments, Analysis & Discussion, Conclusion
  • 3. Introduction: Transfer learning is a widely used paradigm in deep learning (perhaps even the default).
      • Models pre-trained on standard datasets (e.g. ImageNet) can be efficiently adapted to downstream tasks.
      • Better pre-trained models yield better transfer results, suggesting that initial accuracy is a key aspect of transfer learning performance.
      • Reference: “Do Better ImageNet Models Transfer Better?”, 2019 CVPR
  • 4. Related Works: Transfer learning in various domains
      • Medical imaging
          • “Comparison of deep transfer learning strategies for digital pathology”, 2018 CVPRW
      • Language modeling
          • “Senteval: An evaluation toolkit for universal sentence representations”, 2018 arXiv
      • Object detection, segmentation
          • “Faster r-cnn: Towards real-time object detection with region proposal networks”, 2015 NIPS
          • “R-fcn: Object detection via region-based fully convolutional networks”, 2016 NIPS
          • “Speed/accuracy trade-offs for modern convolutional object detectors”, 2017 CVPR
          • “Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs”, 2017 TPAMI
  • 5. Related Works: Transfer learning with fine-tuning or frozen feature-based methods
      • “Analyzing the performance of multilayer neural networks for object recognition”, 2014 ECCV
      • “Return of the devil in the details: Delving deep into convolutional nets”, 2014 arXiv
      • “Rich feature hierarchies for accurate object detection and semantic segmentation”, 2014 CVPR
      • “How transferable are features in deep neural networks?”, 2014 NIPS
      • “Factors of transferability for a generic convnet representation”, 2015 TPAMI
      • “Bilinear cnn models for fine-grained visual recognition”, 2015 ICCV
      • “What makes ImageNet good for transfer learning?”, 2016 arXiv
      • “Best practices for fine-tuning visual classifiers to new domains”, 2016 ECCV
      → They show that fine-tuning outperforms frozen feature-based methods.
  • 6. Related Works: Adversarial robustness
      • “Towards deep learning models resistant to adversarial attacks”, 2018 ICLR
      • “Virtual adversarial training: a regularization method for supervised and semi-supervised learning”, 2018
      • “Provably robust deep learning via adversarially trained smoothed classifier”, 2019 NeurIPS
      • Many papers have studied the features learned by these robust networks and suggested that they improve upon those learned by standard networks.
      • On the other hand, prior studies have also identified theoretical and empirical tradeoffs between standard accuracy and adversarial robustness.
  • 7. Related Works: Adversarial robustness and transfer learning
      • “Adversarially robust transfer learning”, 2019 arXiv
          • Transfer learning can increase downstream-task adversarial robustness.
      • “Adversarially-Trained Deep Nets Transfer Better”, 2020 arXiv
          • Investigates the transfer performance of adversarially robust networks. → Very similar work!
          • The authors of the present paper study a larger set of downstream datasets and tasks and analyze the effects of model accuracy, model width, and data resolution.
  • 8. Experiments: Motivation, fixed-feature transfer learning
      • The source model is used as a feature extractor for the target dataset, and a simple (often linear) model is then trained on the resulting features.
      • Reference: Stanford CS231n lecture notes
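The fixed-feature recipe described on slide 8 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `frozen_features` is a hypothetical stand-in for a pre-trained backbone, the toy dataset is made up, and the linear model is a logistic regression trained with ordinary gradient descent. None of these names or choices come from the slides.

```python
import math
import random

def frozen_features(x):
    """Hypothetical stand-in for a pre-trained backbone: a fixed, non-trainable feature map."""
    return [x[0], x[1], x[0] * x[1]]

def train_linear_head(features, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression head on frozen features with batch gradient descent."""
    dim = len(features[0])
    w, b = [0.0] * dim, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for f, y in zip(features, labels):
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            for i in range(dim):
                grad_w[i] += (p - y) * f[i]
            grad_b += p - y
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, f):
    """Return the predicted class (0 or 1) for one feature vector."""
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0

# Toy target task (an assumption): the label is 1 when both inputs share a sign.
rng = random.Random(0)
inputs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
labels = [1 if a * b > 0 else 0 for a, b in inputs]

# The backbone is only evaluated, never updated; only the head (w, b) is trained.
features = [frozen_features(x) for x in inputs]
w, b = train_linear_head(features, labels)
accuracy = sum(predict(w, b, f) == y for f, y in zip(features, labels)) / len(labels)
```

The point of the sketch is the division of labour: the quality of the frozen features decides how far a simple linear head can get, which is why the choice of pre-trained model matters so much in this setting.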
  • 9. Experiments: How can we improve transfer learning?
      • Prior works suggest that accuracy on the source dataset is a strong indicator of performance on downstream tasks.
      • Still, it is unclear if improving ImageNet accuracy is the only way to improve performance.
      • After all, the behavior of fixed-feature transfer is governed by models’ learned representations, which are not fully described by source-dataset accuracy.
      • These representations are, in turn, controlled by the priors that we put on them during training (architectural components, loss functions, augmentations, etc.).
  • 10. Experiments: The adversarial robustness prior
      • Adversarial robustness refers to a model’s invariance to small (often imperceptible) perturbations of its inputs.
      • Robustness is typically induced at training time by replacing the standard empirical risk minimization objective with a robust optimization objective.
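The two objectives mentioned on slide 10 are commonly written as below. The slide's own equation did not survive in the transcript, so the notation is an assumption: θ denotes the model parameters, D the training distribution, L the loss, δ the perturbation, and ε the perturbation budget under an ℓp norm (ℓ2 is a common choice).

```latex
% Standard empirical risk minimization (ERM)
\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}} \left[ \mathcal{L}(x, y; \theta) \right]

% Robust optimization: minimize the loss under the worst-case perturbation of norm at most \varepsilon
\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}} \left[ \max_{\|\delta\|_{p} \le \varepsilon} \mathcal{L}(x + \delta, y; \theta) \right]
```

Setting ε = 0 recovers standard training, so ε acts as a knob for how strongly the robustness prior is imposed.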
  • 11. Experiments: Should adversarial robustness help fixed-feature transfer?
      • Adversarially robust models are known to be significantly less accurate than their standard counterparts.
      • This suggests that using adversarially robust feature representations should hurt transfer performance.
      • On the other hand, recent work has found that the feature representations of robust models carry several advantages over those of standard models.
      • For example, adversarially robust representations typically have better-behaved gradients and thus facilitate regularization-free feature visualization.
  • 12. Experiments – Fixed-Feature Transfer Learning
      • Two hypotheses conflict: adversarially robust feature representations should hurt transfer performance, versus feature representations of robust models carry several advantages over those of standard models.
      • To resolve this, use a test bed of 12 standard transfer learning datasets.
      • Use four ResNet-based architectures (ResNet-18, ResNet-50, WideResNet-50-x2, WideResNet-50-x4).
      • The results indicate that robust networks consistently extract better features for transfer learning than standard networks.
  • 13. Experiments – Fixed-Feature Transfer Learning
      • Two hypotheses conflict: adversarially robust feature representations should hurt transfer performance, versus feature representations of robust models carry several advantages over those of standard models.
      • To resolve this, use a test bed of 12 standard transfer learning datasets.
      • Use four ResNet-based architectures (ResNet-18, ResNet-50, WideResNet-50-x2, WideResNet-50-x4).
  • 14. Experiments – Full-Network Fine-Tuning
      • A more expensive but often better-performing transfer learning method uses the pre-trained model as a weight initialization rather than as a feature extractor.
      • In other words, all of the weights of the pre-trained model are updated (via gradient descent) to minimize loss on the target task.
      • Many previous works find that for standard models, performance on full-network transfer learning is highly correlated with performance on fixed-feature transfer learning.
      • The hope is that the findings of the last section (fixed-feature) also carry over to this setting (full-network).
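The difference between the two transfer modes comes down to which weights receive gradient updates. The sketch below is a deliberately tiny illustration, assuming a hypothetical one-weight "backbone" and one-weight "head" on a made-up regression task. It shows the mechanics only and says nothing about which mode performs better in practice.

```python
def train(backbone_w, head_w, data, update_backbone, lr=0.1, steps=1000):
    """Minimise the mean squared error of head_w * (backbone_w * x) against y.

    update_backbone=False -> fixed-feature transfer (only the head is trained).
    update_backbone=True  -> full-network fine-tuning (every weight is trained).
    """
    n = len(data)
    for _ in range(steps):
        grad_backbone = grad_head = 0.0
        for x, y in data:
            feature = backbone_w * x          # output of the "pre-trained" part
            error = head_w * feature - y
            grad_head += 2 * error * feature / n
            grad_backbone += 2 * error * head_w * x / n
        head_w -= lr * grad_head
        if update_backbone:
            backbone_w -= lr * grad_backbone
    return backbone_w, head_w

def mse(backbone_w, head_w, data):
    """Mean squared error of the two-weight model on the data."""
    return sum((head_w * backbone_w * x - y) ** 2 for x, y in data) / len(data)

# Hypothetical target task: y = 3x. The "pre-trained" backbone weight is 0.5.
target_data = [(k / 10, 3 * k / 10) for k in range(-10, 11)]
pretrained_backbone, initial_head = 0.5, 0.0

fixed_backbone, fixed_head = train(
    pretrained_backbone, initial_head, target_data, update_backbone=False)
tuned_backbone, tuned_head = train(
    pretrained_backbone, initial_head, target_data, update_backbone=True)
```

In the fixed-feature run the backbone weight stays at its pre-trained value and the head absorbs the whole adaptation; in the fine-tuning run the pre-trained value is only a starting point and both weights move.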
  • 15. Experiments – Full-Network Fine-Tuning
      • Robust models match or improve on standard models in terms of transfer learning performance.
  • 16. Experiments – Full-Network Fine-Tuning
      • Adversarially robust networks also consistently outperform standard networks in object detection and instance segmentation.
  • 17. Analysis & Discussion: 4.1 ImageNet accuracy and transfer performance
      • Take a closer look at the similarities and differences in transfer learning between robust networks and standard networks.
      • Hypothesis: robustness and accuracy have counteracting yet separate effects!
      • That is, higher accuracy improves transfer learning for a fixed level of robustness, and higher robustness improves transfer learning for a fixed level of accuracy.
      • The results (cf. Figure 5; similar results for full-network transfer in Appendix F) support this hypothesis.
      • The previously observed linear relationship between accuracy and transfer performance is often violated once the robustness aspect comes into play.
  • 18. Analysis & Discussion
  • 19. Analysis & Discussion: 4.1 ImageNet accuracy and transfer performance
      • In even more direct support of the hypothesis: when the robustness level is held fixed, the accuracy-transfer correlation observed by prior works for standard models holds for robust models too.
      • These findings also indicate that accuracy is not a sufficient measure of feature quality or versatility.
  • 20. Analysis & Discussion: 4.2 Robust models improve with width
      • Previous works find that although increasing network depth improves transfer performance, increasing width hurts it.
      • The results corroborate this trend for standard networks but indicate that it does not hold for robust networks, at least in the regime of widths tested.
      • As width increases, transfer performance plateaus and decreases for standard models, but continues to steadily grow for robust models.
      • Slide annotation: Not always!!
  • 21. Analysis & Discussion: 4.2 Robust models improve with width
      • Previous works find that although increasing network depth improves transfer performance, increasing width hurts it.
      • The results corroborate this trend for standard networks but indicate that it does not hold for robust networks, at least in the regime of widths tested.
  • 22. Analysis & Discussion: 4.3 Optimal robustness levels for downstream tasks
      • Although the best robust models often outperform the best standard models, the optimal choice of robustness parameter ε varies widely between datasets. For example, when transferring to CIFAR-10 and CIFAR-100, the optimal ε values were 3.0 and 1.0, respectively.
      • In contrast, smaller values of ε (smaller by an order of magnitude) tend to work better for the rest of the datasets.
      • One possible explanation for this variability in the optimal choice of ε might relate to dataset granularity.
      • Although we lack a quantitative notion of granularity (in reality, features are not simply singular pixels), we consider image resolution as a crude proxy.
  • 23. Analysis & Discussion: 4.3 Optimal robustness levels for downstream tasks
      • Since target datasets are scaled to match ImageNet dimensions, each pixel in a low-resolution dataset (e.g., CIFAR-10) image translates into several pixels in transfer, thus inflating the dataset’s separability.
      • Attempt to calibrate the granularities of the 12 image classification datasets used in this work by first downscaling all the images to the size of CIFAR-10 (32×32), and then upscaling them to ImageNet size once more.
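The downscale-then-upscale calibration on slide 23 can be sketched as follows. The slides do not say which interpolation was used, so the block averaging for downscaling and the nearest-neighbour repetition for upscaling are assumptions, as are the function names. A 4×4 toy image with a factor of 2 stands in for a real image going, for example, from 224×224 to 32×32 and back (a factor of 7, assuming 224 as the ImageNet input size).

```python
def downscale_mean(image, factor):
    """Average non-overlapping factor x factor blocks (assumes divisible dimensions)."""
    height, width = len(image), len(image[0])
    return [
        [
            sum(image[r + dr][c + dc] for dr in range(factor) for dc in range(factor))
            / factor ** 2
            for c in range(0, width, factor)
        ]
        for r in range(0, height, factor)
    ]

def upscale_nearest(image, factor):
    """Repeat every pixel factor times along both axes (nearest neighbour)."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]

def calibrate(image, factor):
    """Downscale then upscale: the output keeps the input size but loses fine detail."""
    return upscale_nearest(downscale_mean(image, factor), factor)

# Toy grayscale image; each 2x2 block is replaced by its mean.
image = [
    [0, 4, 8, 8],
    [4, 0, 8, 8],
    [1, 1, 2, 6],
    [1, 1, 6, 2],
]
calibrated = calibrate(image, 2)
```

After calibration every dataset carries at most as much detail as a 32×32 image, so each one reaches the pre-trained network at a comparable granularity.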
  • 24. Analysis & Discussion: 4.3 Optimal robustness levels for downstream tasks
      • After controlling for original dataset dimension, the datasets’ ε vs. transfer accuracy curves all behave almost identically to the CIFAR-10 and CIFAR-100 ones. (Similar results hold for full-network transfer.)
  • 25. Analysis & Discussion: 4.4 Comparing adversarial robustness to texture robustness
      • Consider texture-invariant models, i.e., models trained on the texture-randomizing Stylized ImageNet (SIN) dataset.
  • 26. Analysis & Discussion: 4.4 Comparing adversarial robustness to texture robustness
      • Transfer learning from adversarially robust models outperforms transfer learning from texture-invariant models on all considered datasets.
      • Figure panels: full-network, fixed-feature.
  • 27. Conclusion
      • Propose using adversarially robust models for transfer learning.
      • Compare transfer learning performance of robust and standard models on a suite of 12 classification tasks, object detection, and instance segmentation.
      • Find that adversarially robust neural networks consistently match or improve upon the performance of their standard counterparts, despite having lower ImageNet accuracy.
      • Take a closer look at the behavior of adversarially robust networks, and study the interplay between ImageNet accuracy, model width, robustness, and transfer performance.
  • 28. Conclusion
      • These experiments are easy to try: https://github.com/Microsoft/robust-models-transfer