SlideShare a Scribd company logo
1 of 30
Download to read offline
Intelligence Machine Vision Lab
Strictly Confidential
2019 CVPR paper overview
SUALAB
Ho Seong Lee
2Type A-3
Contents
• CVPR 2019 Statistics
• 20 paper-one page summary
3Type A-3
CVPR 2019 Statistics
• What is CVPR?
• Conference on Computer Vision and Pattern Recognition (CVPR)
• CVPR was first held in 1983 and has been held annually
• CVPR 2019: June 16th – June 20th in Long Beach, CA
4Type A-3
CVPR 2019 Statistics
• CVPR 2019 statistics
• The total number of papers is increasing every year and this year has increased significantly!
• We can visualize main topic using title of paper and simple python script!
• https://github.com/hoya012/CVPR-Paper-Statistics
28.4% 30%
29.9%
29.6%
25.1%
5Type A-3
CVPR 2019 Statistics
2019 CVPR paper statistics
6Type A-3
CVPR 2018 Statistics
2018 CVPR paper statistics
Compared to 2018 Statistics..
7Type A-3
CVPR 2018 vs CVPR 2019 Statistics
• Most of the top keywords were maintained
• Image, detection, 3d, object, video, segmentation, adversarial, recognition, visual …
• “graph”, “cloud”, “representation” are about twice as frequent
• graph : 15 → 45
• representation: 25 → 48
• cloud: 16 → 35
8Type A-3
Before beginning..
• It does not mean that it is not an interesting article because it is not in the list.
• Since I mainly studied Computer Vision, most papers that I will discuss today are
Computer Vision papers..
• Topics not covered today
• Natural Language Processing
• Reinforcement Learning
• Robotics
• Etc..?
9Type A-3
1. Learning to Synthesize Motion Blur (oral)
• Synthesizing a motion blurred image from a pair of unblurred sequential images
• Motion blur is important in cinematography, and artful photo
• Generate a large-scale synthetic training dataset of motion blurred images
Recommended reference: “Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation”, 2018 CVPR
10Type A-3
2. Semantic Image Synthesis with Spatially-Adaptive Normalization (oral)
• Synthesizing photorealistic images given an input semantic layout
• Spatially-adaptive normalization can keep semantic information
• This model allows user control over both semantic and style as synthesizing images
Demo Code: https://github.com/NVlabs/SPADE
11Type A-3
3. SiCloPe: Silhouette-Based Clothed People (Oral)
• Reconstruct a complete and textured 3D model of a person from a single image
• Use 2D Silhouettes and 3D joints of a body pose to reconstruct 3D mesh
• An effective two-stage 3D shape reconstruction pipeline
• Predicting multi-view 2D silhouettes from single input segmentation
• Deep visual hull based mesh reconstruction technique
Recommended reference: “BodyNet: Volumetric Inference of 3D Human Body Shapes”, 2018 ECCV
12Type A-3
4. Im2Pencil: Controllable Pencil Illustration from Photographs
• Propose controllable photo-to-pencil translation method
• Modeling pencil outline(rough, clean), pencil shading(4 types)
• Create training data pairs from online websites(e.g., Pinterest) and use image filtering techniques
Demo Code: https://github.com/Yijunmaverick/Im2Pencil
13Type A-3
5. End-to-End Time-Lapse Video Synthesis from a Single Outdoor Image
• End-to-end solution to synthesize a time-lapse video from single image
• Use time-lapse videos and image sequences during training
• Use only single image during inference
Input image(single)
14Type A-3
6. StoryGAN: A Sequential Conditional GAN for Story Visualization
• Propose a new task called Story Visualization using GAN
• Sequential conditional GAN based StoryGAN
• Story Encoder – stochastic mapping from story to an low-dimensional embedding vector
• Context Encoder – capture contextual information during sequential image generation
• Two Discriminator – Image Discriminator & Story Discriminator
Context Encoder
15Type A-3
7. Image Super-Resolution by Neural Texture Transfer (oral)
• Improve “RefSR” even when irrelevant reference images are provided
• Traditional Single Image Super-Resolution is extremely challenging (ill-posed problem)
• Reference-based(RefSR) utilizes rich texture from HR references .. but.. only similar Ref images
• Adaptively transferring the texture from Ref Images according to their texture similarity
Recommended reference: “CrossNet: An end-to-end reference-based super resolution network using cross-scale warping”, 2018 ECCV
Similar
Different
16Type A-3
8. DVC: An End-to-end Deep Video Compression Framework (oral)
• Propose the first end-to-end video compression deep model
• Conventional video compression use predictive coding architecture and encode corresponding
motion information and residual information
• Taking advantage of both classical compression and neural network
• Use learning based optical flow estimation
17Type A-3
9. Defense Against Adversarial Images using Web-Scale Nearest-Neighbor Search (oral)
• Defense adversarial attack using Big-data and image manifold
• Assume that adversarial attack move the image away from the image manifold
• A successful defense mechanism should aim to project the images back on the image manifold
• For tens of billions of images, search a nearest-neighbor images (K=50) and use them
• Also propose two novel attack methods to break nearest neighbor defenses
18Type A-3
10. Bag of Tricks for Image Classification with Convolutional Neural Networks
• Examine a collection of some refinements and empirically evaluate their impact
• Improve ResNet-50’s accuracy from 75.3% to 79.29% on ImageNet with some refinements
• Efficient Training
• FP32 with BS=256 → FP16 with BS=1024 with some techniques
• Training Refinements:
• Cosine Learning Rate Decay / Label Smoothing / Knowledge Distillation / Mixup Training
• Transfer from classification to Object Detection, Semantic Segmentation
Linear scaling LR
LR warmup
Zero γ initialization in BN
No bias decay
Result of Efficient Training
Result of Training refinements
ResNet tweaks
19Type A-3
11. Fully Learnable Group Convolution for Acceleration of Deep Neural Networks
• Automatically learn the group structure in training stage with end-to-end manner
• Outperform standard group convolution
• Propose an efficient strategy for index re-ordering
20Type A-3
12. ScratchDet:Exploring to Train Single-Shot Object Detectors from Scratch (oral)
• Explore to train object detectors from scratch robustly
• Almost SOTA detectors are fine-tuned from pretrained CNN (e.g., ImageNet)
• The classification and detection have different degrees of sensitivity to translation
• The architecture is limited by the classification network(backbone) → inconvenience!
• Find that one of the overlooked points is BatchNorm!
Recommended reference: “DSOD: Learning Deeply Supervised Object Detectors from Scratch”, 2017 ICCV
21Type A-3
13. Precise Detection in Densely Packed Scenes
• Propose precise detection in densely packed scenes
• In real-world, there are many applications of object detection (ex, detection and count # of object)
• In densely packed scenes, SOTA detector can’t detect accurately
(1) layer for estimating the Jaccard index (2) a novel EM merging unit (3) release SKU-110K dataset
22Type A-3
14. SIXray: A Large-scale Security Inspection X-ray Benchmark for Prohibited Item Discovery in Overlapping Images
• Present a large-scale dataset and establish a baseline for security inspection X-ray
• Total 1,059,231 X-ray images in which 6 classes of 8,929 prohibited items
• Propose an approach named class-balanced hierarchical refinement(CHR) and class-balanced loss
function
23Type A-3
15. Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regressio
• Address the weaknesses of IoU and introduce generalized version(GIoU)
• Intersection over Union(IoU) is the most popular evaluation metric used in object detection
• But, there is a gap between optimizing distance losses and maximizing IoU
• Introducing generalized IoU as both a new loss and a new metric
24Type A-3
16. Bounding Box Regression with Uncertainty for Accurate Object Detection
• Propose novel bounding box regression loss with uncertainty
• Most of datasets have ambiguities and labeling noise of bounding box coordinate
• Network can learns to predict localization variance for each coordinate
25Type A-3
17. UPSNet: A Unified Panoptic Segmentation Network (oral)
• Propose a unified panoptic segmentation network(UPSNet)
• Semantic segmentation + Instance segmentation = panoptic segmentation
• Semantic Head + Instance Head + Panoptic head → end-to-end manner
Recommended reference: “Panoptic Segmentation”, 2018 arXiv
Countable objects → things
Uncountable objects → stuff
Deformable Conv Mask R-CNN Parameter-free
26Type A-3
18. SFNet: Learning Object-aware Semantic Correspondence (Oral)
• Propose SFNet for semantic correspondence problem
• Propose to use images annotated with binary foreground masks and synthetic geometric
deformations during training
• Manually selecting point correspondences is so expensive!!
• Outperform SOTA on standard benchmarks by a significant margin
27Type A-3
19. Fast Interactive Object Annotation with Curve-GC
• Propose end-to-end fast interactive object annotation tool (Curve-GCN)
• Predict all vertices simultaneously using a Graph Convolutional Network, (→ Polygon-RNN X)
• Human annotator can correct any wrong point and only the neighboring points are affected
Recommended reference: “Efficient interactive annotation of segmentation datasets with polygon-rnn++ ”, 2018 CVPR
Code: https://github.com/fidler-lab/curve-gcn
Correction!
28Type A-3
20. FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stochastic Inference
• Propose image-level WSSS method using stochastic inference (dropout)
• Localization maps(CAM) only focus on the small parts of objects → Problem
• FickleNet allows a single network to generate multiple CAM from a single image
• Does not require any additional training steps and only adds a simple layer
Full Stochastic
Both Training
and Inference
29Type A-3
Related Post..
• In my personal blog, there are similar works
• SIGGRAPH 2018
• NeurIPS 2018
• ICLR 2019
https://hoya012.github.io/
Thank you

More Related Content

What's hot

【技術解説20】 ミニバッチ確率的勾配降下法
【技術解説20】 ミニバッチ確率的勾配降下法【技術解説20】 ミニバッチ確率的勾配降下法
【技術解説20】 ミニバッチ確率的勾配降下法Ruo Ando
 
論文紹介:Deep Mutual Learning
論文紹介:Deep Mutual Learning論文紹介:Deep Mutual Learning
論文紹介:Deep Mutual LearningToru Tamaki
 
[DL輪読会]AutoAugment: LearningAugmentation Strategies from Data & Learning Data...
[DL輪読会]AutoAugment: LearningAugmentation Strategies from Data & Learning Data...[DL輪読会]AutoAugment: LearningAugmentation Strategies from Data & Learning Data...
[DL輪読会]AutoAugment: LearningAugmentation Strategies from Data & Learning Data...Deep Learning JP
 
[DL輪読会]Encoder-Decoder with Atrous Separable Convolution for Semantic Image S...
[DL輪読会]Encoder-Decoder with Atrous Separable Convolution for Semantic Image S...[DL輪読会]Encoder-Decoder with Atrous Separable Convolution for Semantic Image S...
[DL輪読会]Encoder-Decoder with Atrous Separable Convolution for Semantic Image S...Deep Learning JP
 
CycleGANによる異種モダリティ画像生成を用いた股関節MRIの筋骨格セグメンテーション
CycleGANによる異種モダリティ画像生成を用いた股関節MRIの筋骨格セグメンテーションCycleGANによる異種モダリティ画像生成を用いた股関節MRIの筋骨格セグメンテーション
CycleGANによる異種モダリティ画像生成を用いた股関節MRIの筋骨格セグメンテーション奈良先端大 情報科学研究科
 
敵対的生成ネットワーク(GAN)
敵対的生成ネットワーク(GAN)敵対的生成ネットワーク(GAN)
敵対的生成ネットワーク(GAN)cvpaper. challenge
 
「解説資料」Toward Fast and Stabilized GAN Training for High-fidelity Few-shot Imag...
「解説資料」Toward Fast and Stabilized GAN Training for High-fidelity Few-shot Imag...「解説資料」Toward Fast and Stabilized GAN Training for High-fidelity Few-shot Imag...
「解説資料」Toward Fast and Stabilized GAN Training for High-fidelity Few-shot Imag...Takumi Ohkuma
 
SSII2021 [OS2-03] 自己教師あり学習における対照学習の基礎と応用
SSII2021 [OS2-03] 自己教師あり学習における対照学習の基礎と応用SSII2021 [OS2-03] 自己教師あり学習における対照学習の基礎と応用
SSII2021 [OS2-03] 自己教師あり学習における対照学習の基礎と応用SSII
 
帰納バイアスが成立する条件
帰納バイアスが成立する条件帰納バイアスが成立する条件
帰納バイアスが成立する条件Shinobu KINJO
 
[DLHacks]StyleGANとBigGANのStyle mixing, morphing
[DLHacks]StyleGANとBigGANのStyle mixing, morphing[DLHacks]StyleGANとBigGANのStyle mixing, morphing
[DLHacks]StyleGANとBigGANのStyle mixing, morphingDeep Learning JP
 
Introduction to Diffusion Models
Introduction to Diffusion ModelsIntroduction to Diffusion Models
Introduction to Diffusion ModelsSangwoo Mo
 
PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.
PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.
PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.Sunghoon Joo
 
論文紹介「A Perspective View and Survey of Meta-Learning」
論文紹介「A Perspective View and Survey of Meta-Learning」論文紹介「A Perspective View and Survey of Meta-Learning」
論文紹介「A Perspective View and Survey of Meta-Learning」Kota Matsui
 
[DL Hacks]Self-Attention Generative Adversarial Networks
[DL Hacks]Self-Attention Generative Adversarial Networks[DL Hacks]Self-Attention Generative Adversarial Networks
[DL Hacks]Self-Attention Generative Adversarial NetworksDeep Learning JP
 
パンでも分かるVariational Autoencoder
パンでも分かるVariational Autoencoderパンでも分かるVariational Autoencoder
パンでも分かるVariational Autoencoderぱんいち すみもと
 
Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unk...
Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unk...Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unk...
Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unk...Kazuyuki Miyazawa
 
深層ニューラルネットワークの積分表現(Deepを定式化する数学)
深層ニューラルネットワークの積分表現(Deepを定式化する数学)深層ニューラルネットワークの積分表現(Deepを定式化する数学)
深層ニューラルネットワークの積分表現(Deepを定式化する数学)Katsuya Ito
 
SSII2022 [SS2] 少ないデータやラベルを効率的に活用する機械学習技術 〜 足りない情報をどのように補うか?〜
SSII2022 [SS2] 少ないデータやラベルを効率的に活用する機械学習技術 〜 足りない情報をどのように補うか?〜SSII2022 [SS2] 少ないデータやラベルを効率的に活用する機械学習技術 〜 足りない情報をどのように補うか?〜
SSII2022 [SS2] 少ないデータやラベルを効率的に活用する機械学習技術 〜 足りない情報をどのように補うか?〜SSII
 
AI勉強会用スライド
AI勉強会用スライドAI勉強会用スライド
AI勉強会用スライドharmonylab
 

What's hot (20)

【技術解説20】 ミニバッチ確率的勾配降下法
【技術解説20】 ミニバッチ確率的勾配降下法【技術解説20】 ミニバッチ確率的勾配降下法
【技術解説20】 ミニバッチ確率的勾配降下法
 
論文紹介:Deep Mutual Learning
論文紹介:Deep Mutual Learning論文紹介:Deep Mutual Learning
論文紹介:Deep Mutual Learning
 
[DL輪読会]AutoAugment: LearningAugmentation Strategies from Data & Learning Data...
[DL輪読会]AutoAugment: LearningAugmentation Strategies from Data & Learning Data...[DL輪読会]AutoAugment: LearningAugmentation Strategies from Data & Learning Data...
[DL輪読会]AutoAugment: LearningAugmentation Strategies from Data & Learning Data...
 
[DL輪読会]Encoder-Decoder with Atrous Separable Convolution for Semantic Image S...
[DL輪読会]Encoder-Decoder with Atrous Separable Convolution for Semantic Image S...[DL輪読会]Encoder-Decoder with Atrous Separable Convolution for Semantic Image S...
[DL輪読会]Encoder-Decoder with Atrous Separable Convolution for Semantic Image S...
 
ResNetの仕組み
ResNetの仕組みResNetの仕組み
ResNetの仕組み
 
CycleGANによる異種モダリティ画像生成を用いた股関節MRIの筋骨格セグメンテーション
CycleGANによる異種モダリティ画像生成を用いた股関節MRIの筋骨格セグメンテーションCycleGANによる異種モダリティ画像生成を用いた股関節MRIの筋骨格セグメンテーション
CycleGANによる異種モダリティ画像生成を用いた股関節MRIの筋骨格セグメンテーション
 
敵対的生成ネットワーク(GAN)
敵対的生成ネットワーク(GAN)敵対的生成ネットワーク(GAN)
敵対的生成ネットワーク(GAN)
 
「解説資料」Toward Fast and Stabilized GAN Training for High-fidelity Few-shot Imag...
「解説資料」Toward Fast and Stabilized GAN Training for High-fidelity Few-shot Imag...「解説資料」Toward Fast and Stabilized GAN Training for High-fidelity Few-shot Imag...
「解説資料」Toward Fast and Stabilized GAN Training for High-fidelity Few-shot Imag...
 
SSII2021 [OS2-03] 自己教師あり学習における対照学習の基礎と応用
SSII2021 [OS2-03] 自己教師あり学習における対照学習の基礎と応用SSII2021 [OS2-03] 自己教師あり学習における対照学習の基礎と応用
SSII2021 [OS2-03] 自己教師あり学習における対照学習の基礎と応用
 
帰納バイアスが成立する条件
帰納バイアスが成立する条件帰納バイアスが成立する条件
帰納バイアスが成立する条件
 
[DLHacks]StyleGANとBigGANのStyle mixing, morphing
[DLHacks]StyleGANとBigGANのStyle mixing, morphing[DLHacks]StyleGANとBigGANのStyle mixing, morphing
[DLHacks]StyleGANとBigGANのStyle mixing, morphing
 
Introduction to Diffusion Models
Introduction to Diffusion ModelsIntroduction to Diffusion Models
Introduction to Diffusion Models
 
PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.
PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.
PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.
 
論文紹介「A Perspective View and Survey of Meta-Learning」
論文紹介「A Perspective View and Survey of Meta-Learning」論文紹介「A Perspective View and Survey of Meta-Learning」
論文紹介「A Perspective View and Survey of Meta-Learning」
 
[DL Hacks]Self-Attention Generative Adversarial Networks
[DL Hacks]Self-Attention Generative Adversarial Networks[DL Hacks]Self-Attention Generative Adversarial Networks
[DL Hacks]Self-Attention Generative Adversarial Networks
 
パンでも分かるVariational Autoencoder
パンでも分かるVariational Autoencoderパンでも分かるVariational Autoencoder
パンでも分かるVariational Autoencoder
 
Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unk...
Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unk...Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unk...
Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unk...
 
深層ニューラルネットワークの積分表現(Deepを定式化する数学)
深層ニューラルネットワークの積分表現(Deepを定式化する数学)深層ニューラルネットワークの積分表現(Deepを定式化する数学)
深層ニューラルネットワークの積分表現(Deepを定式化する数学)
 
SSII2022 [SS2] 少ないデータやラベルを効率的に活用する機械学習技術 〜 足りない情報をどのように補うか?〜
SSII2022 [SS2] 少ないデータやラベルを効率的に活用する機械学習技術 〜 足りない情報をどのように補うか?〜SSII2022 [SS2] 少ないデータやラベルを効率的に活用する機械学習技術 〜 足りない情報をどのように補うか?〜
SSII2022 [SS2] 少ないデータやラベルを効率的に活用する機械学習技術 〜 足りない情報をどのように補うか?〜
 
AI勉強会用スライド
AI勉強会用スライドAI勉強会用スライド
AI勉強会用スライド
 

Similar to 2019 cvpr paper_overview

TechnicalBackgroundOverview
TechnicalBackgroundOverviewTechnicalBackgroundOverview
TechnicalBackgroundOverviewMotaz El-Saban
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyNUPUR YADAV
 
Unsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingUnsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingYu Huang
 
Real Time Object Dectection using machine learning
Real Time Object Dectection using machine learningReal Time Object Dectection using machine learning
Real Time Object Dectection using machine learningpratik pratyay
 
Computer Vision Landscape : Present and Future
Computer Vision Landscape : Present and FutureComputer Vision Landscape : Present and Future
Computer Vision Landscape : Present and FutureSanghamitra Deb
 
Visual geometry with deep learning
Visual geometry with deep learningVisual geometry with deep learning
Visual geometry with deep learningNAVER Engineering
 
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...NAVER Engineering
 
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...Edge AI and Vision Alliance
 
Dataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsDataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsPetteriTeikariPhD
 
Neural Networks for Machine Learning and Deep Learning
Neural Networks for Machine Learning and Deep LearningNeural Networks for Machine Learning and Deep Learning
Neural Networks for Machine Learning and Deep Learningcomifa7406
 
Moving object detection in complex scene
Moving object detection in complex sceneMoving object detection in complex scene
Moving object detection in complex sceneKumar Mayank
 
The Concurrent Constraint Programming Research Programmes -- Redux
The Concurrent Constraint Programming Research Programmes -- ReduxThe Concurrent Constraint Programming Research Programmes -- Redux
The Concurrent Constraint Programming Research Programmes -- ReduxPierre Schaus
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSGanesan Narayanasamy
 
Camera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IICamera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IIYu Huang
 
Computer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonComputer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonAditya Bhattacharya
 
Content Based Image Retrieval (CBIR)
Content Based Image Retrieval (CBIR)Content Based Image Retrieval (CBIR)
Content Based Image Retrieval (CBIR)Behzad Shomali
 

Similar to 2019 cvpr paper_overview (20)

TechnicalBackgroundOverview
TechnicalBackgroundOverviewTechnicalBackgroundOverview
TechnicalBackgroundOverview
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A survey
 
Unsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingUnsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object tracking
 
slide-171212080528.pptx
slide-171212080528.pptxslide-171212080528.pptx
slide-171212080528.pptx
 
Real Time Object Dectection using machine learning
Real Time Object Dectection using machine learningReal Time Object Dectection using machine learning
Real Time Object Dectection using machine learning
 
Computer Vision Landscape : Present and Future
Computer Vision Landscape : Present and FutureComputer Vision Landscape : Present and Future
Computer Vision Landscape : Present and Future
 
Visual geometry with deep learning
Visual geometry with deep learningVisual geometry with deep learning
Visual geometry with deep learning
 
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
 
Multimedia Mining
Multimedia Mining Multimedia Mining
Multimedia Mining
 
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
 
Dataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsDataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problems
 
Neural Networks for Machine Learning and Deep Learning
Neural Networks for Machine Learning and Deep LearningNeural Networks for Machine Learning and Deep Learning
Neural Networks for Machine Learning and Deep Learning
 
Moving object detection in complex scene
Moving object detection in complex sceneMoving object detection in complex scene
Moving object detection in complex scene
 
The Concurrent Constraint Programming Research Programmes -- Redux
The Concurrent Constraint Programming Research Programmes -- ReduxThe Concurrent Constraint Programming Research Programmes -- Redux
The Concurrent Constraint Programming Research Programmes -- Redux
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUS
 
PointNet
PointNetPointNet
PointNet
 
Camera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IICamera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning II
 
Computer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonComputer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathon
 
Content Based Image Retrieval (CBIR)
Content Based Image Retrieval (CBIR)Content Based Image Retrieval (CBIR)
Content Based Image Retrieval (CBIR)
 
kanimozhi2019.pdf
kanimozhi2019.pdfkanimozhi2019.pdf
kanimozhi2019.pdf
 

More from LEE HOSEONG

Unsupervised anomaly detection using style distillation
Unsupervised anomaly detection using style distillationUnsupervised anomaly detection using style distillation
Unsupervised anomaly detection using style distillationLEE HOSEONG
 
do adversarially robust image net models transfer better
do adversarially robust image net models transfer betterdo adversarially robust image net models transfer better
do adversarially robust image net models transfer betterLEE HOSEONG
 
CNN Architecture A to Z
CNN Architecture A to ZCNN Architecture A to Z
CNN Architecture A to ZLEE HOSEONG
 
carrier of_tricks_for_image_classification
carrier of_tricks_for_image_classificationcarrier of_tricks_for_image_classification
carrier of_tricks_for_image_classificationLEE HOSEONG
 
"The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Gen...
"The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Gen..."The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Gen...
"The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Gen...LEE HOSEONG
 
Mixed Precision Training Review
Mixed Precision Training ReviewMixed Precision Training Review
Mixed Precision Training ReviewLEE HOSEONG
 
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly DetectionMVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly DetectionLEE HOSEONG
 
YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewLEE HOSEONG
 
FixMatch:simplifying semi supervised learning with consistency and confidence
FixMatch:simplifying semi supervised learning with consistency and confidenceFixMatch:simplifying semi supervised learning with consistency and confidence
FixMatch:simplifying semi supervised learning with consistency and confidenceLEE HOSEONG
 
"Revisiting self supervised visual representation learning" Paper Review
"Revisiting self supervised visual representation learning" Paper Review"Revisiting self supervised visual representation learning" Paper Review
"Revisiting self supervised visual representation learning" Paper ReviewLEE HOSEONG
 
Unsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-SupervisionUnsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-SupervisionLEE HOSEONG
 
Human uncertainty makes classification more robust, ICCV 2019 Review
Human uncertainty makes classification more robust, ICCV 2019 ReviewHuman uncertainty makes classification more robust, ICCV 2019 Review
Human uncertainty makes classification more robust, ICCV 2019 ReviewLEE HOSEONG
 
Single Image Super Resolution Overview
Single Image Super Resolution OverviewSingle Image Super Resolution Overview
Single Image Super Resolution OverviewLEE HOSEONG
 
2019 ICLR Best Paper Review
2019 ICLR Best Paper Review2019 ICLR Best Paper Review
2019 ICLR Best Paper ReviewLEE HOSEONG
 
"Google Vizier: A Service for Black-Box Optimization" Paper Review
"Google Vizier: A Service for Black-Box Optimization" Paper Review"Google Vizier: A Service for Black-Box Optimization" Paper Review
"Google Vizier: A Service for Black-Box Optimization" Paper ReviewLEE HOSEONG
 
"Searching for Activation Functions" Paper Review
"Searching for Activation Functions" Paper Review"Searching for Activation Functions" Paper Review
"Searching for Activation Functions" Paper ReviewLEE HOSEONG
 
"Learning transferable architectures for scalable image recognition" Paper Re...
"Learning transferable architectures for scalable image recognition" Paper Re..."Learning transferable architectures for scalable image recognition" Paper Re...
"Learning transferable architectures for scalable image recognition" Paper Re...LEE HOSEONG
 
"Learning From Noisy Large-Scale Datasets With Minimal Supervision" Paper Review
"Learning From Noisy Large-Scale Datasets With Minimal Supervision" Paper Review"Learning From Noisy Large-Scale Datasets With Minimal Supervision" Paper Review
"Learning From Noisy Large-Scale Datasets With Minimal Supervision" Paper ReviewLEE HOSEONG
 
"Dataset and metrics for predicting local visible differences" Paper Review
"Dataset and metrics for predicting local visible differences" Paper Review"Dataset and metrics for predicting local visible differences" Paper Review
"Dataset and metrics for predicting local visible differences" Paper ReviewLEE HOSEONG
 
"From image level to pixel-level labeling with convolutional networks" Paper ...
"From image level to pixel-level labeling with convolutional networks" Paper ..."From image level to pixel-level labeling with convolutional networks" Paper ...
"From image level to pixel-level labeling with convolutional networks" Paper ...LEE HOSEONG
 

More from LEE HOSEONG (20)

Unsupervised anomaly detection using style distillation
Unsupervised anomaly detection using style distillationUnsupervised anomaly detection using style distillation
Unsupervised anomaly detection using style distillation
 
do adversarially robust image net models transfer better
do adversarially robust image net models transfer betterdo adversarially robust image net models transfer better
do adversarially robust image net models transfer better
 
CNN Architecture A to Z
CNN Architecture A to ZCNN Architecture A to Z
CNN Architecture A to Z
 
carrier of_tricks_for_image_classification
carrier of_tricks_for_image_classificationcarrier of_tricks_for_image_classification
carrier of_tricks_for_image_classification
 
"The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Gen...
"The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Gen..."The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Gen...
"The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Gen...
 
Mixed Precision Training Review
Mixed Precision Training ReviewMixed Precision Training Review
Mixed Precision Training Review
 
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly DetectionMVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
 
YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection review
 
FixMatch:simplifying semi supervised learning with consistency and confidence
FixMatch:simplifying semi supervised learning with consistency and confidenceFixMatch:simplifying semi supervised learning with consistency and confidence
FixMatch:simplifying semi supervised learning with consistency and confidence
 
"Revisiting self supervised visual representation learning" Paper Review
"Revisiting self supervised visual representation learning" Paper Review"Revisiting self supervised visual representation learning" Paper Review
"Revisiting self supervised visual representation learning" Paper Review
 
Unsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-SupervisionUnsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-Supervision
 
Human uncertainty makes classification more robust, ICCV 2019 Review
Human uncertainty makes classification more robust, ICCV 2019 ReviewHuman uncertainty makes classification more robust, ICCV 2019 Review
Human uncertainty makes classification more robust, ICCV 2019 Review
 
Single Image Super Resolution Overview
Single Image Super Resolution OverviewSingle Image Super Resolution Overview
Single Image Super Resolution Overview
 
2019 ICLR Best Paper Review
2019 ICLR Best Paper Review2019 ICLR Best Paper Review
2019 ICLR Best Paper Review
 
"Google Vizier: A Service for Black-Box Optimization" Paper Review
"Google Vizier: A Service for Black-Box Optimization" Paper Review"Google Vizier: A Service for Black-Box Optimization" Paper Review
"Google Vizier: A Service for Black-Box Optimization" Paper Review
 
"Searching for Activation Functions" Paper Review
"Searching for Activation Functions" Paper Review"Searching for Activation Functions" Paper Review
"Searching for Activation Functions" Paper Review
 
"Learning transferable architectures for scalable image recognition" Paper Re...
"Learning transferable architectures for scalable image recognition" Paper Re..."Learning transferable architectures for scalable image recognition" Paper Re...
"Learning transferable architectures for scalable image recognition" Paper Re...
 
"Learning From Noisy Large-Scale Datasets With Minimal Supervision" Paper Review
"Learning From Noisy Large-Scale Datasets With Minimal Supervision" Paper Review"Learning From Noisy Large-Scale Datasets With Minimal Supervision" Paper Review
"Learning From Noisy Large-Scale Datasets With Minimal Supervision" Paper Review
 
"Dataset and metrics for predicting local visible differences" Paper Review
"Dataset and metrics for predicting local visible differences" Paper Review"Dataset and metrics for predicting local visible differences" Paper Review
"Dataset and metrics for predicting local visible differences" Paper Review
 
"From image level to pixel-level labeling with convolutional networks" Paper ...
"From image level to pixel-level labeling with convolutional networks" Paper ..."From image level to pixel-level labeling with convolutional networks" Paper ...
"From image level to pixel-level labeling with convolutional networks" Paper ...
 

Recently uploaded

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 

Recently uploaded (20)

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 

2019 cvpr paper_overview

  • 1. Intelligence Machine Vision Lab Strictly Confidential 2019 CVPR paper overview SUALAB Ho Seong Lee
  • 2. 2Type A-3 Contents • CVPR 2019 Statistics • 20 paper-one page summary
  • 3. 3Type A-3 CVPR 2019 Statistics • What is CVPR? • Conference on Computer Vision and Pattern Recognition (CVPR) • CVPR was first held in 1983 and has been held annually • CVPR 2019: June 16th – June 20th in Long Beach, CA
  • 4. 4Type A-3 CVPR 2019 Statistics • CVPR 2019 statistics • The total number of papers is increasing every year and this year has increased significantly! • We can visualize main topic using title of paper and simple python script! • https://github.com/hoya012/CVPR-Paper-Statistics 28.4% 30% 29.9% 29.6% 25.1%
  • 5. 5Type A-3 CVPR 2019 Statistics 2019 CVPR paper statistics
  • 6. 6Type A-3 CVPR 2018 Statistics 2018 CVPR paper statistics Compared to 2018 Statistics..
  • 7. 7Type A-3 CVPR 2018 vs CVPR 2019 Statistics • Most of the top keywords were maintained • Image, detection, 3d, object, video, segmentation, adversarial, recognition, visual … • “graph”, “cloud”, “representation” are about twice as frequent • graph : 15 → 45 • representation: 25 → 48 • cloud: 16 → 35
  • 8. 8Type A-3 Before beginning.. • It does not mean that it is not an interesting article because it is not in the list. • Since I mainly studied Computer Vision, most papers that I will discuss today are Computer Vision papers.. • Topics not covered today • Natural Language Processing • Reinforcement Learning • Robotics • Etc..?
  • 9. 9Type A-3 1. Learning to Synthesize Motion Blur (oral) • Synthesizing a motion blurred image from a pair of unblurred sequential images • Motion blur is important in cinematography, and artful photo • Generate a large-scale synthetic training dataset of motion blurred images Recommended reference: “Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation”, 2018 CVPR
  • 10. 10Type A-3 2. Semantic Image Synthesis with Spatially-Adaptive Normalization (oral) • Synthesizing photorealistic images given an input semantic layout • Spatially-adaptive normalization can keep semantic information • This model allows user control over both semantic and style as synthesizing images Demo Code: https://github.com/NVlabs/SPADE
  • 11. 11Type A-3 3. SiCloPe: Silhouette-Based Clothed People (Oral) • Reconstruct a complete and textured 3D model of a person from a single image • Use 2D Silhouettes and 3D joints of a body pose to reconstruct 3D mesh • An effective two-stage 3D shape reconstruction pipeline • Predicting multi-view 2D silhouettes from single input segmentation • Deep visual hull based mesh reconstruction technique Recommended reference: “BodyNet: Volumetric Inference of 3D Human Body Shapes”, 2018 ECCV
  • 12. 12Type A-3 4. Im2Pencil: Controllable Pencil Illustration from Photographs • Propose controllable photo-to-pencil translation method • Modeling pencil outline(rough, clean), pencil shading(4 types) • Create training data pairs from online websites(e.g., Pinterest) and use image filtering techniques Demo Code: https://github.com/Yijunmaverick/Im2Pencil
  • 13. 13Type A-3 5. End-to-End Time-Lapse Video Synthesis from a Single Outdoor Image • End-to-end solution to synthesize a time-lapse video from single image • Use time-lapse videos and image sequences during training • Use only single image during inference Input image(single)
  • 14. 14Type A-3 6. StoryGAN: A Sequential Conditional GAN for Story Visualization • Propose a new task called Story Visualization using GAN • Sequential conditional GAN based StoryGAN • Story Encoder – stochastic mapping from story to an low-dimensional embedding vector • Context Encoder – capture contextual information during sequential image generation • Two Discriminator – Image Discriminator & Story Discriminator Context Encoder
  • 15. 15Type A-3 7. Image Super-Resolution by Neural Texture Transfer (oral) • Improve “RefSR” even when irrelevant reference images are provided • Traditional Single Image Super-Resolution is extremely challenging (ill-posed problem) • Reference-based(RefSR) utilizes rich texture from HR references .. but.. only similar Ref images • Adaptively transferring the texture from Ref Images according to their texture similarity Recommended reference: “CrossNet: An end-to-end reference-based super resolution network using cross-scale warping”, 2018 ECCV Similar Different
  • 16. 16Type A-3 8. DVC: An End-to-end Deep Video Compression Framework (oral) • Propose the first end-to-end video compression deep model • Conventional video compression use predictive coding architecture and encode corresponding motion information and residual information • Taking advantage of both classical compression and neural network • Use learning based optical flow estimation
  • 17. 17Type A-3 9. Defense Against Adversarial Images using Web-Scale Nearest-Neighbor Search (oral) • Defense adversarial attack using Big-data and image manifold • Assume that adversarial attack move the image away from the image manifold • A successful defense mechanism should aim to project the images back on the image manifold • For tens of billions of images, search a nearest-neighbor images (K=50) and use them • Also propose two novel attack methods to break nearest neighbor defenses
  • 18. 18Type A-3 10. Bag of Tricks for Image Classification with Convolutional Neural Networks • Examine a collection of some refinements and empirically evaluate their impact • Improve ResNet-50’s accuracy from 75.3% to 79.29% on ImageNet with some refinements • Efficient Training • FP32 with BS=256 → FP16 with BS=1024 with some techniques • Training Refinements: • Cosine Learning Rate Decay / Label Smoothing / Knowledge Distillation / Mixup Training • Transfer from classification to Object Detection, Semantic Segmentation Linear scaling LR LR warmup Zero γ initialization in BN No bias decay Result of Efficient Training Result of Training refinements ResNet tweaks
  • 19. 19Type A-3 11. Fully Learnable Group Convolution for Acceleration of Deep Neural Networks • Automatically learn the group structure in training stage with end-to-end manner • Outperform standard group convolution • Propose an efficient strategy for index re-ordering
  • 20. 20Type A-3 12. ScratchDet:Exploring to Train Single-Shot Object Detectors from Scratch (oral) • Explore to train object detectors from scratch robustly • Almost SOTA detectors are fine-tuned from pretrained CNN (e.g., ImageNet) • The classification and detection have different degrees of sensitivity to translation • The architecture is limited by the classification network(backbone) → inconvenience! • Find that one of the overlooked points is BatchNorm! Recommended reference: “DSOD: Learning Deeply Supervised Object Detectors from Scratch”, 2017 ICCV
  • 21. 21Type A-3 13. Precise Detection in Densely Packed Scenes • Propose precise detection in densely packed scenes • In real-world, there are many applications of object detection (ex, detection and count # of object) • In densely packed scenes, SOTA detector can’t detect accurately (1) layer for estimating the Jaccard index (2) a novel EM merging unit (3) release SKU-110K dataset
  • 22. 22Type A-3 14. SIXray: A Large-scale Security Inspection X-ray Benchmark for Prohibited Item Discovery in Overlapping Images • Present a large-scale dataset and establish a baseline for security inspection X-ray • Total 1,059,231 X-ray images in which 6 classes of 8,929 prohibited items • Propose an approach named class-balanced hierarchical refinement(CHR) and class-balanced loss function
  • 23. 23Type A-3 15. Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regressio • Address the weaknesses of IoU and introduce generalized version(GIoU) • Intersection over Union(IoU) is the most popular evaluation metric used in object detection • But, there is a gap between optimizing distance losses and maximizing IoU • Introducing generalized IoU as both a new loss and a new metric
  • 24. 24Type A-3 16. Bounding Box Regression with Uncertainty for Accurate Object Detection • Propose novel bounding box regression loss with uncertainty • Most of datasets have ambiguities and labeling noise of bounding box coordinate • Network can learns to predict localization variance for each coordinate
  • 25. 25Type A-3 17. UPSNet: A Unified Panoptic Segmentation Network (oral) • Propose a unified panoptic segmentation network(UPSNet) • Semantic segmentation + Instance segmentation = panoptic segmentation • Semantic Head + Instance Head + Panoptic head → end-to-end manner Recommended reference: “Panoptic Segmentation”, 2018 arXiv Countable objects → things Uncountable objects → stuff Deformable Conv Mask R-CNN Parameter-free
  • 26. 26Type A-3 18. SFNet: Learning Object-aware Semantic Correspondence (Oral) • Propose SFNet for semantic correspondence problem • Propose to use images annotated with binary foreground masks and synthetic geometric deformations during training • Manually selecting point correspondences is so expensive!! • Outperform SOTA on standard benchmarks by a significant margin
  • 27. 27Type A-3 19. Fast Interactive Object Annotation with Curve-GC • Propose end-to-end fast interactive object annotation tool (Curve-GCN) • Predict all vertices simultaneously using a Graph Convolutional Network, (→ Polygon-RNN X) • Human annotator can correct any wrong point and only the neighboring points are affected Recommended reference: “Efficient interactive annotation of segmentation datasets with polygon-rnn++ ”, 2018 CVPR Code: https://github.com/fidler-lab/curve-gcn Correction!
  • 28. 28Type A-3 20. FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stochastic Inference • Propose image-level WSSS method using stochastic inference (dropout) • Localization maps(CAM) only focus on the small parts of objects → Problem • FickleNet allows a single network to generate multiple CAM from a single image • Does not require any additional training steps and only adds a simple layer Full Stochastic Both Training and Inference
  • 29. 29Type A-3 Related Post.. • In my personal blog, there are similar works • SIGGRAPH 2018 • NeurIPS 2018 • ICLR 2019 https://hoya012.github.io/