Yinyin Liu presents Fast R-CNN, a model for object detection and localization. She shows how to introduce an ROI pooling layer into neon, and how to add the PASCAL VOC dataset so that it interfaces with model training and inference. Lastly, Yinyin runs through a demo of how to apply the trained model to detect new objects.
1. Proprietary and confidential. Do not distribute.
Using neon for object detection and localization
Yinyin Liu, PhD
March 3 2016
2. Outline
Intro to Deep Learning
• From a user’s perspective, how to use neon to solve your problem
• Use the object localization problem as an example to:
  • understand and utilize the neon architecture
  • implement a new model
5. Fast region-based CNN (Fast R-CNN)
Fast R-CNN [Girshick 2015] http://arxiv.org/abs/1504.08083
• Pre-trained ConvNets
• ROI pooling
• Branch architecture
• Cost function that considers both tasks (classification and bounding-box regression)
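As a rough illustration of a cost that covers both tasks, the sketch below follows the multi-task loss described in the Fast R-CNN paper: a log loss on the class plus a smooth L1 loss on the box offsets, with the box term skipped for background ROIs. The function names, the plain-list inputs, and the `lam` weight are assumptions made for this example; this is not neon code.

```python
import math


def smooth_l1(x):
    """Smooth L1 penalty: quadratic near zero, linear for |x| >= 1."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def multitask_loss(class_probs, true_class, box_pred, box_target, lam=1.0):
    """Classification log loss plus a weighted box regression loss.

    class_probs: softmax output, index 0 is assumed to be background.
    box_pred / box_target: the 4 box offsets for the true class.
    """
    cls_loss = -math.log(class_probs[true_class])
    if true_class == 0:
        # Background ROIs have no ground-truth box, so only the class term counts.
        return cls_loss
    loc_loss = sum(smooth_l1(p - t) for p, t in zip(box_pred, box_target))
    return cls_loss + lam * loc_loss
```

Setting `lam` above or below 1.0 would shift the balance between the two tasks.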
7. Building the Fast R-CNN network
Diagram of the components involved; the slide marks some of them as new components that were not in neon.
• model
• Optimizer
• smooth L1 cost
• callbacks
• loading trained VGG layers
• Branch architecture
• Layers: Pooling, ROI pooling, Dropout, Affine, Conv
• data iterator (PASCAL VOC)
  • Input: Image, ROIs
  • Target: class label, box regression, box regression mask
• Object detection metric
8. PASCAL VOC in a dataset container
PASCAL VOC
• Input
  • Image
  • ROIs
• Target
  • class label
  • box regression
  • box regression mask
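To make the three targets concrete, here is a minimal sketch of how they might be built for a single ROI. It assumes the box parameterization from the R-CNN papers (center offsets scaled by ROI size, log ratios of width and height), 4 regression slots per class, and class 0 as background. The function names and the dictionary layout are hypothetical and do not reflect neon's actual dataset container.

```python
import math


def box_regression_target(roi, gt):
    """Offsets (tx, ty, tw, th) that map an ROI onto its ground-truth box.

    roi, gt: boxes as (x1, y1, x2, y2); widths are taken as x2 - x1.
    """
    rw, rh = roi[2] - roi[0], roi[3] - roi[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    rcx, rcy = roi[0] + 0.5 * rw, roi[1] + 0.5 * rh
    gcx, gcy = gt[0] + 0.5 * gw, gt[1] + 0.5 * gh
    return ((gcx - rcx) / rw, (gcy - rcy) / rh,
            math.log(gw / rw), math.log(gh / rh))


def make_targets(roi, gt, label, num_classes):
    """Class label, per-class box targets, and the mask that selects them."""
    bbox = [0.0] * (4 * num_classes)
    mask = [0.0] * (4 * num_classes)
    if label > 0:  # background ROIs (label 0) contribute no box target
        start = 4 * label
        bbox[start:start + 4] = box_regression_target(roi, gt)
        mask[start:start + 4] = [1.0] * 4
    return {"label": label, "bbox": bbox, "mask": mask}
```

The mask lets the box regression cost ignore every slot except the four that belong to the ROI's true class.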
9. ROI pooling layer
• The ROI pooling layer combines the feature map from a layer with the ROIs from the dataset
• Make ROI pooling a container that
  • contains the ConvNet layers
  • interfaces with the dataset directly
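The core operation can be sketched in plain Python: given a feature map and one ROI, split the ROI into a fixed grid of bins and take the max in each bin, so every ROI yields an output of the same size. This is an illustrative, single-channel version under assumed conventions (ROI given in feature-map cells with inclusive corners); it is not neon's implementation, and `roi_max_pool` is a hypothetical name.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool one ROI of a 2D feature map into an out_h x out_w grid.

    feature_map: list of rows; roi: (x1, y1, x2, y2), inclusive cell indices.
    """
    x1, y1, x2, y2 = roi
    roi_h, roi_w = y2 - y1 + 1, x2 - x1 + 1
    pooled = []
    for i in range(out_h):
        # Floor the bin start and ceil the bin end so the bins cover the ROI.
        ys = y1 + (i * roi_h) // out_h
        ye = y1 + ceil_div((i + 1) * roi_h, out_h)
        row = []
        for j in range(out_w):
            xs = x1 + (j * roi_w) // out_w
            xe = x1 + ceil_div((j + 1) * roi_w, out_w)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled
```

Because the function takes both the feature map and the ROI, it mirrors the idea that the layer needs input from the preceding ConvNet layers and from the dataset at the same time.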
10. ROI pooling layer
• Any new layer or container needs to:
  • work as part of the model’s forward and backward propagation process
  • have fprop and bprop functions
• Start from a Python implementation
• Write backend (GPU) support for speed
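A pure-Python prototype of the fprop/bprop pair might look like the toy layer below, which takes the max over one region of a 1D input and routes the gradient back to the winning position. The class name and constructor arguments are hypothetical, and the class does not derive from neon's layer base class; it only illustrates the contract that fprop caches what bprop later needs.

```python
class MaxPoolRegion:
    """Toy layer: max over inputs[start:end], with matching fprop and bprop."""

    def __init__(self, start, end):
        self.start, self.end = start, end
        self.argmax = None    # index of the winning input, cached by fprop
        self.n_inputs = None  # input length, cached so bprop can size its output

    def fprop(self, inputs):
        region = inputs[self.start:self.end]
        offset = max(range(len(region)), key=region.__getitem__)
        self.argmax = self.start + offset
        self.n_inputs = len(inputs)
        return inputs[self.argmax]

    def bprop(self, error):
        # Only the input that produced the max receives the incoming gradient.
        deltas = [0.0] * self.n_inputs
        deltas[self.argmax] = error
        return deltas
```

Once a prototype like this gives correct results, the same logic can be ported to a GPU kernel and checked against the Python version.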
11. Cost and metric
• A new type of cost needs to be derived from the Cost class
• A new type of metric needs to be derived from the Metric class
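The subclassing pattern can be sketched as follows. The `Cost` and `Metric` base classes here are stand-ins written for this example, not neon's actual classes, whose interfaces may differ. `MaskedSmoothL1` and `MeanIoU` are likewise hypothetical names, and mean IoU is a deliberately simple metric, not the PASCAL VOC detection metric.

```python
class Cost:
    """Stand-in base class (hypothetical, not neon's Cost)."""
    def __call__(self, y, t):
        raise NotImplementedError

    def bprop(self, y, t):
        raise NotImplementedError


class Metric:
    """Stand-in base class (hypothetical, not neon's Metric)."""
    def __call__(self, y, t):
        raise NotImplementedError


class MaskedSmoothL1(Cost):
    """Smooth L1 cost applied only where the mask is non-zero."""
    def __init__(self, mask):
        self.mask = mask

    def __call__(self, y, t):
        total = 0.0
        for yi, ti, mi in zip(y, t, self.mask):
            d = abs(yi - ti)
            total += mi * (0.5 * d * d if d < 1.0 else d - 0.5)
        return total

    def bprop(self, y, t):
        # Derivative of smooth L1: d inside (-1, 1), sign(d) outside.
        grads = []
        for yi, ti, mi in zip(y, t, self.mask):
            d = yi - ti
            grads.append(mi * (d if abs(d) < 1.0 else (1.0 if d > 0 else -1.0)))
        return grads


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


class MeanIoU(Metric):
    """Average IoU between predicted and ground-truth boxes."""
    def __call__(self, y, t):
        return sum(iou(a, b) for a, b in zip(y, t)) / len(t)
```

Deriving from a shared base class lets the model and callbacks call any cost or metric through the same interface.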